Today marked OpenAI's inaugural developer conference, sparking a dialogue reminiscent of when AWS, Azure, and GCP dominated headlines with their annual updates—events that were as much a cause for concern as they were for celebration among startups, fearing obsolescence with each new feature. This phenomenon has prompted conversations about a potential parallel in the generative AI industry. However, the landscape has evolved. The tech community has gleaned insights from past experiences with cloud platforms, leading startups to now strategically position themselves in niches that major platforms are less inclined to replicate, akin to PG&E not handling home wiring.
In this vein, concerns about generative AI advancements rendering companies obsolete are more applicable to individual developers than full-fledged startups. Today's event, while stirring both anticipation and apprehension, didn't necessarily affect the same demographic as before. The 'generative AI elite club,' as I often refer to it in my newsletter, is an exclusive circle reserved for companies with recent trillion-dollar market capitalizations, such as Alphabet, Meta, Microsoft, Amazon, Tesla, and Apple. Membership in this club is a prerequisite for significance in the gen AI space, with OpenAI and Anthropic being prime examples of entities endorsed by these giants. Even robust open-source initiatives like Llama require the patronage of these titans, as seen with Meta's involvement.
Thus, I surmise that today's unease about OpenAI's announcements is largely confined to these top-tier players, Microsoft being an exception. It's important to recognize that this competitive fervor—this 'arms race'—holds positive implications for humanity.
Now, let's delve into the specific feature enhancements unveiled today and explore their broader implications.
Context Length Expansion
A pivotal concern for developers immersed in generative AI has been the 'context window length'—a technical constraint that directly impacts their design and development strategies. Recognizing this limitation as a temporary hurdle, the industry has anticipated a breakthrough that would significantly expand this capacity, effectively accommodating more complex workloads. Today, OpenAI has heralded such a milestone. For GPT-4, the context length has been dramatically increased to 128k tokens, an expanse that could encompass the contents of a 300-page book. Meanwhile, for those utilizing GPT-3.5 with fine-tuning capabilities, the limit has now reached 16k tokens, a fourfold increase over the original 4k window. This enhancement marks a substantial leap forward, opening new horizons for generative AI applications.
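The 300-page claim can be sanity-checked with back-of-the-envelope arithmetic. The figures below—roughly 4 characters per token and 1,800 characters per page—are common heuristics for English text, not OpenAI specifications:

```python
# Rough check that a 300-page book fits in a 128k-token window.
# Both constants are assumptions (typical English prose), not specs.

CHARS_PER_TOKEN = 4        # common heuristic for English text
CHARS_PER_PAGE = 1_800     # typical paperback page
CONTEXT_WINDOW = 128_000   # GPT-4 Turbo context length

def pages_that_fit(context_tokens: int) -> int:
    """Approximate number of book pages a context window can hold."""
    return (context_tokens * CHARS_PER_TOKEN) // CHARS_PER_PAGE

print(pages_that_fit(CONTEXT_WINDOW))  # roughly 284 pages
```

Depending on how dense the prose is, that lands right around the 300-page mark.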
Sam referred to the forthcoming advancements as providing more control, but I see them as cross-cutting concerns: integral features that intersect various aspects of functionality and usability, enhancing the core capabilities of the platform.
JSON Mode Enhancement
The transformative adoption of JSON mode heralds a new era in API development. This evolution, as documented in the 55th issue of this newsletter, signifies a pivotal shift with models now generating valid JSON outputs. This breakthrough simplifies the developmental process, enabling smoother integrations and more streamlined workflows.
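A minimal sketch of what this looks like in practice. The `response_format` field and model name follow the DevDay announcement; the API call itself is omitted here, and the reply string stands in for an actual completion:

```python
import json

# Request body with JSON mode enabled. With response_format set to
# json_object, the model is constrained to emit valid JSON.
request_body = {
    "model": "gpt-3.5-turbo-1106",
    "response_format": {"type": "json_object"},
    "messages": [
        {"role": "system",
         "content": "Reply in JSON with keys 'city' and 'country'."},
        {"role": "user", "content": "Where is the Eiffel Tower?"},
    ],
}

# Stand-in for a model reply; JSON mode means json.loads no longer
# needs defensive try/except handling for malformed output.
reply = '{"city": "Paris", "country": "France"}'
data = json.loads(reply)
print(data["city"])  # Paris
```

The practical win is the removal of retry-and-repair logic that developers previously wrapped around free-form model output.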
Function Calling Evolution
In a 2016 TechCrunch article, I discussed the emergence of functions as first-class citizens in cloud-native applications. Fast forward seven years, and treating functions as standalone artifacts is now standard practice. Addressing concerns that the community has veered too fine-grained, with some labeling functions-as-a-service an anti-pattern, I've advocated a balanced, slightly coarse-grained approach. Echoing this sentiment, OpenAI has introduced the capability to invoke multiple functions in parallel, a significant enhancement to function calling in the generative AI sphere.
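A sketch of what handling a parallel-call turn involves: the model can now return several tool calls in one response, and the application dispatches each before replying. The `tool_calls` shape mirrors the Chat Completions response; the weather and time functions are hypothetical stand-ins:

```python
import json

# Two toy functions the model is allowed to call (illustrative only).
def get_weather(city):
    return f"sunny in {city}"

def get_time(city):
    return f"12:00 in {city}"

TOOLS = {"get_weather": get_weather, "get_time": get_time}

# A mocked model turn requesting both functions at once.
tool_calls = [
    {"id": "call_1", "function": {"name": "get_weather",
                                  "arguments": '{"city": "Paris"}'}},
    {"id": "call_2", "function": {"name": "get_time",
                                  "arguments": '{"city": "Paris"}'}},
]

# Dispatch every call and collect results keyed by tool_call_id,
# ready to be sent back to the model as tool messages.
results = []
for call in tool_calls:
    fn = TOOLS[call["function"]["name"]]
    args = json.loads(call["function"]["arguments"])
    results.append({"tool_call_id": call["id"], "content": fn(**args)})

print(results[0]["content"])  # sunny in Paris
```

Before this change, each function call required its own model round-trip; batching them cuts both latency and token cost.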
Reproducible Outputs
The feature I'm most enthused about today is reproducible output. The inherent variability in generative AI results can be a hurdle in fields requiring consistency, such as the iterative processes at Roost.ai, and simple workarounds like lowering the temperature setting fall short. OpenAI's new seed parameter for consistent output is a profound improvement, poised to transform the predictability and reliability of AI-driven results.
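A minimal sketch of how the seed parameter might be used. Determinism is best-effort: identical requests with the same seed should match, but only while the backend is unchanged, which is what the `system_fingerprint` field lets you verify. The values below are illustrative:

```python
# A request pinned for reproducibility: same seed + identical
# parameters should yield the same completion (best-effort).
request_body = {
    "model": "gpt-4-1106-preview",
    "seed": 42,        # fixed seed for repeatable sampling
    "temperature": 0,  # lowering temperature alone is not enough
    "messages": [{"role": "user", "content": "Summarize this test log."}],
}

def same_backend(resp_a: dict, resp_b: dict) -> bool:
    """Two responses are only comparable if the backend fingerprint
    matches; a changed fingerprint explains diverging outputs."""
    return resp_a.get("system_fingerprint") == resp_b.get("system_fingerprint")

print(same_backend({"system_fingerprint": "fp_abc"},
                   {"system_fingerprint": "fp_abc"}))  # True
```

For regression-testing use cases like Roost.ai's, this turns "re-run and hope" into a checkable contract.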
Viewing Log Probabilities in API
The introduction of log probability visibility within the API opens up a new depth dimension for developers. This feature provides granular insights into the model's decision-making, showcasing the probabilities of each generated token. This level of detail is instrumental for fine-tuning, as it allows for a deeper understanding and optimization of model responses. With this tool, developers can refine their approach, ensuring the AI behaves in alignment with their objectives, thereby enhancing the model's accuracy and reliability.
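Since the API reports log probabilities rather than raw probabilities, a small conversion step is useful. The token/logprob pairs below are illustrative, not real API output:

```python
import math

# Per-token log probabilities, as returned when logprobs is enabled
# (illustrative values).
token_logprobs = [("The", -0.05), ("sky", -0.30),
                  ("is", -0.02), ("blue", -1.20)]

# exp() turns a logprob back into a plain probability per token.
for token, lp in token_logprobs:
    print(f"{token!r}: p = {math.exp(lp):.3f}")

# The probability of the whole sequence is the product of token
# probabilities, i.e. the exponential of the summed logprobs.
seq_prob = math.exp(sum(lp for _, lp in token_logprobs))
```

Low per-token probabilities are a useful signal for spotting tokens the model was unsure about, which is exactly where fine-tuning effort pays off.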
Refining Data Integration and Optimization Through Retrieval Augmentation
The evolution of retrieval-augmented generation (RAG) stands at the forefront of data engineering’s transformation, a point I've championed in previous discussions. OpenAI’s latest API advancements capitalize on this evolution, offering the power to weave external data seamlessly into the AI's processing stream as needed. In the era preceding ChatGPT, which we might refer to as 'BC' for shorthand, the drudgery of data preparation—often jestingly termed 'data janitorial work'—dominated the bandwidth of data science projects. Now, in the ascendant 'ChatGPT Era,' the integration and curation of pertinent data have become more streamlined yet equally crucial. This ensures that generative AI applications are not just sophisticated, but also contextually aware, delivering precision-tailored responses with unprecedented accuracy.
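The RAG pattern itself is compact enough to sketch end to end. Production systems use vector embeddings for retrieval; here a toy word-overlap score stands in for similarity so the example runs without any external service, and the documents are invented:

```python
# Toy document store (illustrative content).
DOCS = [
    "Roost.ai generates test cases from source code.",
    "The GPT-4 Turbo context window is 128k tokens.",
    "PG&E is a California utility company.",
]

def retrieve(query: str, docs: list[str]) -> str:
    """Return the document sharing the most words with the query
    (a stand-in for embedding similarity search)."""
    q = set(query.lower().split())
    return max(docs, key=lambda d: len(q & set(d.lower().split())))

def build_prompt(query: str) -> str:
    """Augment the user's question with retrieved context before
    sending it to the model."""
    context = retrieve(query, DOCS)
    return f"Context: {context}\n\nQuestion: {query}"

print(build_prompt("How large is the GPT-4 Turbo context window?"))
```

The announced Retrieval capability effectively moves the `retrieve` step server-side, but the pattern, ground the model in your own data at query time, is unchanged.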
Elevating AI with Fine-Tuning Prowess
As chronicled in editions 75, 76, and 93 of our newsletter, fine-tuning has consistently been the lever pulling generative AI towards new pinnacles of performance. OpenAI's extension of fine-tuning to GPT-3.5 is a testament to its commitment to this pursuit, expanding the contextual breadth to the new 16k token limit. This enhancement is a boon for trailblazers like Roost.ai, who now stand on the cusp of accessing GPT-4's advanced fine-tuning capabilities. Though access remains exclusive and the cost substantial, as Sam indicated, the potential for performance optimization across a variety of applications makes this a noteworthy development. With fine-tuning, the promise of AI that not only understands but adapts to nuanced requirements is becoming an attainable reality, paving the way for innovations that are as efficient as they are intelligent.

Miscellaneous Features
Enhanced Knowledge Base
OpenAI also acknowledged ChatGPT's expanded knowledge base, which now extends to April 2023 and encompasses a wide array of information beyond the engineering realm.
Improved Pricing and Rate Limits
For GPT-4 users, rate limits have seen a welcome increase, doubling the tokens per minute, with the option to negotiate for more. In tandem, OpenAI has significantly reduced the costs of both prompt tokens and completion tokens, leading to a blended price reduction of approximately 2.75 times—though detailed pricing is beyond the scope of this newsletter.
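The 2.75x figure can be reproduced with back-of-the-envelope arithmetic. The per-1k-token prices below follow the announcement (GPT-4 at $0.03 input / $0.06 output, GPT-4 Turbo at $0.01 / $0.03); the 9:1 input-to-output token mix is an assumption chosen to illustrate the blend, not an official ratio:

```python
# Old and new prices in dollars per 1k tokens.
OLD = {"input": 0.03, "output": 0.06}   # GPT-4
NEW = {"input": 0.01, "output": 0.03}   # GPT-4 Turbo

def blended_cut(input_tokens: float, output_tokens: float) -> float:
    """Ratio of old cost to new cost for a given token mix."""
    old = OLD["input"] * input_tokens + OLD["output"] * output_tokens
    new = NEW["input"] * input_tokens + NEW["output"] * output_tokens
    return old / new

# Input tokens got 3x cheaper, output tokens 2x cheaper; a 9:1
# prompt-heavy workload blends to exactly 2.75x.
print(blended_cut(9, 1))  # 2.75
```

The takeaway: the more prompt-heavy your workload (long documents in, short answers out), the closer your savings get to the 3x input-price cut.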
Alleviating Legal Concerns with Copyright Safeguards
In response to one of the most pressing apprehensions faced by generative AI practitioners, OpenAI has taken a decisive step to mitigate legal concerns. The organization has instituted a potent safeguard for its users against the intricate backdrop of copyright law. By committing to bear the legal expenses arising from copyright infringements when using AI-generated content, OpenAI provides a significant layer of security. This move is designed to fortify the confidence of developers and creators, assuring them that their innovative endeavors can proceed with the backing of OpenAI's legal support framework.