AI in Motion: Mid-Year Review of 2024's Enterprise Innovations
As we pass the halfway mark of 2024, it's time to revisit the 7 big questions that have been fueling discussions among members of our AI Center of Excellence and the product and engineering leaders in our community of AI Pioneers. From the declining cost of intelligence to the spread of AI agents in the enterprise, the AI landscape is evolving rapidly, and it is now time to check in on each of these trends to see where we are heading toward the end of the year:
Question #1: Is this the year we take Google seriously?
Google is Serious, but the Crown Goes to Meta
At the beginning of the year, I asked about Google’s position and influence in the AI market, given the perception it had fallen behind other players. 2024 has certainly been pivotal for Google: the AI team has delivered on its models with Gemini 1.5 Flash & 1.5 Pro, significant price drops for input and output tokens, and this week’s release of new experimental models. While not leading the field on all benchmarks, Google offers the best pricing in the market:
Source: Artificial Analysis
However, we cannot give them the crown. The true breakout leader of 2024, in fact, is Meta. It is hard to overstate how much the Llama releases have fundamentally changed the generative AI landscape, and they are showing no signs of slowing. Meta shared this week that the Llama models are approaching 350 million downloads on Hugging Face – more than a 10x increase from a year ago. Meta’s commitment to open source, and Llama 3.1’s state-of-the-art results on evals, give them the 2024 crown.
Look out for: Llama 4 training has started and the model reportedly will focus on agentic behavior; worth the read from Latent Space: ‘Current models have “a gap of intelligence” when it comes to agentic workflows, as they are unable to plan without the user relying on prompting techniques and loops like ReAct, Chain of Thought, or frameworks like Autogen and Crew. That may be fixed soon?’
Source: Meta AI
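The agentic-workflow gap described above is exactly what loop patterns like ReAct paper over today: the application code, not the model, drives a plan-act-observe cycle. A minimal sketch of such a loop, where `call_llm` is a hypothetical stub standing in for a real model API call and the calculator tool is purely illustrative:

```python
# Minimal ReAct-style agent loop. call_llm is a toy stand-in for a real LLM API;
# a production agent would send the transcript to a model and parse its reply.
import re

def call_llm(prompt: str) -> str:
    # Toy policy: call the tool once, then answer from the observation.
    if "Observation:" in prompt:
        return "Final Answer: 714"
    return "Action: calculator[42 * 17]"

# eval() is fine for a demo; never use it on untrusted model output in production.
TOOLS = {"calculator": lambda expr: str(eval(expr))}

def react_loop(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        reply = call_llm(transcript)
        match = re.match(r"Action: (\w+)\[(.+)\]", reply)
        if match:  # model asked to act: run the tool, append the observation
            tool, arg = match.groups()
            observation = TOOLS[tool](arg)
            transcript += f"\n{reply}\nObservation: {observation}"
        elif reply.startswith("Final Answer:"):
            return reply.removeprefix("Final Answer:").strip()
    return "No answer within step budget"

print(react_loop("What is 42 * 17?"))
```

The point of the Latent Space quote is that future models may internalize this plan-act-observe scaffolding, so the outer loop (and frameworks like Autogen or Crew that generalize it) becomes thinner or unnecessary.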
Question #2: Is this the year Nvidia becomes a cloud provider?
Nvidia = A Cloud Contender No More
In a Bloomberg interview shortly after Q2 ‘24 earnings, NVIDIA's CEO Jensen Huang emphasized that NVIDIA won't become a cloud provider to compete with the hyperscalers. Instead, the company will continue to focus on enabling others to build and operate AI infrastructure by providing them with powerful chips and software. Huang explained that their goal is to support cloud providers and enterprises – not to compete with them. Best elucidated by Jensen himself, an excerpt from the video interview: “Our GPU cloud was designed to be the best version of NVIDIA cloud that’s built within each cloud. NVIDIA DGX cloud is built inside GCP, inside Azure, inside AWS, inside OCI. And so we build our cloud within theirs so that we can implement our best version of our cloud, work with them to make that cloud, that AI infrastructure, as performant, as great TCO as possible, and so that strategy has worked incredibly well.” He went on to explain that they are using DGX Cloud themselves – to help design chips, for their work in self-driving cars and robotics, for the omniverse, and for their “AI Foundry,” through which they make AI models for companies (“We are a foundry for AI like TSMC is a foundry for our chips”).
Look out for: NVIDIA’s focus this year has turned to “Sovereign AI” – AI developed by governments using their own infrastructure and data – which will be a growing theme.
Question #3: Will there still be a GPU shortage?
The GPU Dilemma: Easing Shortages For Inference, Less so for Training
The GPU shortage continues on the training side as NVIDIA ramps up production of its next-generation B100 GPUs. Despite their year-long efforts, supply remains tight. NVIDIA has faced production issues with the Blackwell chips – in this week’s earnings call Jensen shared that Blackwell samples are now shipping to partners & customers (an apparent 2-3mo delay). H100s, at least, are seeing some easing: delivery lead times have improved, and customers can now access H100 clusters on-demand via providers like Lambda. Meanwhile, on the inference side, things look much better: just this week, for example, Cerebras debuted its AI inference service (which they claim is 10-20x faster than H100s).
Look out for: As AI training and inference continue to push full steam ahead, expect a growing focus on data, rather than just hardware (read the full thread!).
Question #4: Will the transformer architecture hit a plateau?
Transformers are Not Going Anywhere
As AI architecture evolves, the Transformer remains a central pillar, but 2024 has seen some innovations. AI21 Labs recently released the new Jamba models, which integrate State-Space Models (SSM) with Transformer layers, creating a hybrid that significantly reduces memory usage while enhancing long-context processing; and earlier in the year the Mamba-2 paper was released after strong community enthusiasm for the original Mamba model (a great technical overview can be read here). Meanwhile, FlexAttention, a new architecture introduced this year, pushes the boundaries of attention mechanisms in high-resolution vision-language models, reducing computational costs while maintaining performance. These advancements, while rooted in Transformer principles, suggest a future where AI models dynamically adapt their processing strategy to achieve greater generalization and efficiency across diverse tasks.
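The economics behind hybrids like Jamba come from their layer ratio: interleave many linear-cost SSM layers per quadratic-cost attention layer, so most of the stack avoids attention's memory blow-up on long contexts. A toy sketch of that interleaving pattern (the ratio parameter here is illustrative, not Jamba's actual configuration):

```python
def hybrid_layer_pattern(depth: int, ssm_per_attn: int = 3) -> list[str]:
    """Jamba-style layer schedule: several linear-cost SSM layers for every
    quadratic-cost attention layer. ssm_per_attn is an illustrative knob, not
    the published Jamba ratio."""
    return [
        "attention" if (i + 1) % (ssm_per_attn + 1) == 0 else "ssm"
        for i in range(depth)
    ]

# An 8-layer toy stack: three SSM layers, one attention layer, repeated.
print(hybrid_layer_pattern(8))
```

The design choice is a trade: attention layers preserve precise token-to-token retrieval, while the SSM layers carry most of the sequence processing at linear cost, which is where the memory savings on long contexts come from.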
What we have not yet seen is a major breakthrough in architecture for language models that is not based on the Transformer – all major models rely on some form of the architecture. The Transformer hasn’t gone anywhere.
Look out for: Transformers still reigning supreme, at least in the short term.
Source: FlexAttention for Efficient High-Resolution Vision-Language Models
Question #5: How rapidly will we see AI “agents”?
2024 = The Rise of AI Agents
This year has marked significant strides in deploying AI agents across various business functions. I’ve written extensively about autonomous agents and their promise in fundamentally changing the business model for software applications. While autonomous agents are not quite widespread yet, many startups are now being created to imagine the AI employees of the future: the SOC Analyst (Dropzone AI), Financial Analyst (Brightwave), Security Engineer (Pixee), Sales (Aomni), Voice Dubbing and Generation (ElevenLabs), Outbound Calls (Bland AI), AI Marketplace/Task Economy (Payman), and many more.
Look out for: As the field becomes more grounded and demands economic value, 2025 will see fewer “moonshot” internal AI projects and more widespread AI agent / vendor adoption – demonstrating ROI will be the gating item for implementation.
Question #6: Who will start making money in AI?
The Usual Suspects + Early Signs of an Expanding Field
The usual giants—Nvidia, OpenAI, and Microsoft—continue to profit from AI: NVIDIA reported >$30B in revenue in its Q2 2024 earnings, OpenAI recently shared that 92% of Fortune 500 companies are using its products and that API usage has doubled since the release of GPT-4o mini, and Microsoft shared a 60% increase in O365 Copilot users this past quarter.
Other players, however, are rapidly monetizing AI, signaling a broadening market impact. Adobe’s AI-powered tools in Creative Cloud have driven subscription growth, while Salesforce’s Einstein AI boosts customer retention through predictive analytics. Intuit’s AI features in TurboTax and QuickBooks enhance user satisfaction and reduce churn (to learn more about Intuit’s AI development process, read our blog on the A3 Paradox alongside Tapas Moturu, Chief Architect at Intuit), and ServiceNow’s AI streamlines IT operations, increasing enterprise adoption. Zoom monetizes AI with features like real-time transcription, and Shopify leverages AI for personalized marketing, boosting conversions. This expansion of AI monetization across diverse sectors highlights the growing importance of AI beyond the early frontrunners.
Look out for: The democratization of AI access through open-source and efficient infrastructure, enabling more companies to unlock AI’s potential and create new revenue streams. The ongoing development of more efficient AI models and infrastructure, such as Nvidia's upcoming B100 chip, promises to further reduce costs and expand access.
Question #7: Can we deploy AI without human “alignment”?
Multimodal Models Change the Game
Ensuring AI systems align with human values remains a paramount challenge in 2024, particularly as models become more autonomous and complex. Techniques like RLHF, RLAIF, and AI 'red teaming' have become standard practices to ensure safety and ethical compliance. These approaches are increasingly integrated into AI development cycles, especially in light of stricter regulatory demands globally. Companies are also exploring new methods, such as using multimodal models to cross-validate outputs across different data types, thereby enhancing the robustness and alignment of AI systems.
Look out for: Multimodal AI audit systems that cross-validate outputs from different AI models (e.g., text, vision, and audio) to detect and correct biases or misalignments more effectively, thereby enhancing trust in AI decisions.
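One minimal way to implement the cross-validation idea above is a majority-vote audit across models: each model (text, vision, audio) emits a label for the same input, and any case without a strict majority is flagged for human review. A hedged sketch, assuming each model's output has already been reduced to a comparable label:

```python
# Toy cross-model audit: flag predictions where the models (text, vision,
# audio, ...) fail to reach a strict majority on the same label.
from collections import Counter

def cross_validate(outputs: dict[str, str]) -> dict:
    """outputs maps a model name (e.g. 'text', 'vision') to its predicted label.
    Returns the majority label and whether the case needs human review."""
    counts = Counter(outputs.values())
    label, votes = counts.most_common(1)[0]
    needs_review = votes * 2 <= len(outputs)  # no strict majority -> escalate
    return {"label": label, "needs_review": needs_review}

# Two of three models agree, so the prediction passes the audit.
print(cross_validate({"text": "cat", "vision": "cat", "audio": "dog"}))
```

Real audit systems would add confidence scores and per-modality reliability weights, but the core pattern is the same: disagreement between modalities is a cheap, model-agnostic misalignment signal.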
What's Next in AI? Join our AI Pioneers Summit!
Every month in AI has felt like a whole year, and the remainder of 2024 will undoubtedly bring further insights and shifts. With so much innovation, we are excited for our upcoming annual AI Pioneers Summit on October 16th. As always, feel free to reach out if you are interested in joining us!