Roost.ai blog on Generative AI and Large Language Models

#192 o1's Reasoning: The Mezzanine Level to AGI

Written by Rishi Yadav | October 2024

<< previous edition: agentic discomfort

As we approach our 200th edition, we've chronicled the evolution of generative AI from early language models to advanced techniques like chain-of-thought reasoning. We've ventured into time-dilation and length-contraction grade concepts such as Artificial General Intelligence (AGI) and Artificial Super Intelligence (ASI), and even contemplated singularity at the speed of light. The recent release of OpenAI's o1 has at least nudged us into the supersonic era of AI development. Now is an opportune moment to view these stages through what data analytics enthusiasts might call a single pane of glass.

Generative AI: A Two-Pronged Evolution

Generative AI's landscape is evolving along two critical axes: token prediction and autonomous agency. Recent breakthroughs have dramatically enhanced language prediction and generation, with each release of massive to super-massive models pushing the boundaries. Meta's 405-billion-parameter Llama 3.1 currently stands as the heaviest element in the model periodic table, showcasing the relentless pursuit of scale in predictive capabilities.

Meanwhile, agentic AI is progressing in its own distinct way. The AI community is abuzz with talk of AI agents, with proponents championing the concept as loudly as possible to ensure they don't miss the boat. This clamor reflects the growing recognition of autonomous agency as a crucial frontier in AI development, even as it lags behind the rapid advancements in predictive models.

o1: A Small Step Towards a Giant Leap?

OpenAI recently unveiled the preview version of their latest model, o1, along with its speedier counterpart, o1-mini. The model's most intriguing feature is its enhanced chain-of-thought (CoT) reasoning capabilities, a concept we explored in edition 176.

Having experimented with o1 over the past few weeks, I find myself unimpressed. But this reaction is inconsequential: the trajectory of technological evolution often begins with underwhelming first steps before making quantum leaps forward. While one school of thought advocates waiting for impressive results before release, I firmly believe in the power of public evolution.

Models should be released as soon as they can withstand public scrutiny, allowing them to grow and improve in the crucible of real-world application and feedback. This approach fosters transparency, encourages collaborative improvement, and accelerates the pace of innovation.

Do Look Up: From o1@Mezzanine to AGI@First Floor

In #176, I discussed how models sometimes tend to shoot from the hip, engaging in what we might call Level 1 thinking (or lack thereof). This was a significant issue with ChatGPT, and my attempts to force slower, more deliberate thinking were rarely successful. I was following Dr. Andrew Ng's advice to let LLMs think, but my prompting skills were apparently lacking.

Now, o1 has multi-step reasoning built in, which forces the model to think for at least a few seconds (I've observed pauses of 6-10 seconds). This aspect of CoT reasoning, which we might underestimate, is actually a critical step towards achieving AGI. In fact, it represents the mezzanine floor in the edifice of artificial intelligence before we ascend to the first floor of AGI.
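o1's internal reasoning traces are hidden, so any code can only gesture at the idea. As a minimal sketch (all names here are hypothetical, not OpenAI's API), contrast a "Level 1" one-shot answer with a CoT-style version that records explicit intermediate steps before committing to a result:

```python
# Toy illustration of "shoot from the hip" vs. chain-of-thought style
# reasoning on a simple word problem: start with some apples, eat a few,
# then buy more. Names and structure are illustrative only.

def answer_level1(start: int, eaten: int, bought: int) -> int:
    # Level 1: jump straight to the answer in a single expression.
    return start - eaten + bought

def answer_with_cot(start: int, eaten: int, bought: int) -> tuple[int, list[str]]:
    # CoT style: work through intermediate states explicitly,
    # keeping a trace of each step before the final answer.
    steps = []
    remaining = start - eaten
    steps.append(f"Start with {start}, eat {eaten}: {remaining} remain")
    total = remaining + bought
    steps.append(f"Buy {bought} more: {total} in total")
    return total, steps

if __name__ == "__main__":
    total, trace = answer_with_cot(5, 2, 3)
    for step in trace:
        print(step)
    print("Answer:", total)
```

The point of the sketch is that the intermediate trace, not the arithmetic, is what changes: each step can be inspected (and, in a real model, corrected) before the answer is committed, which is exactly the deliberation that prompting alone struggled to force out of earlier models.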

Summary

The journey from current AI models to AGI is not a single giant leap, but a series of incremental steps. o1's enhanced reasoning capabilities, while not revolutionary on their own, represent a crucial evolutionary stage. As we stand on this mezzanine level of AI development, we're reminded that progress often comes in subtle forms. The true impact of these advancements may only become clear in retrospect, as we continue to build towards the lofty goal of Artificial General Intelligence.