Amazon partners with Hugging Face to enhance AI model efficiency 

Amazon.com Inc. (AMZN.O) announced a new collaboration on Wednesday between its cloud computing division, Amazon Web Services (AWS), and the artificial intelligence startup Hugging Face. This partnership aims to streamline the process of running thousands of AI models on Amazon’s specialized computing chips.

Hugging Face, a startup valued at $4.5 billion, has emerged as a pivotal platform for AI researchers and developers, serving as a primary hub for sharing and experimenting with chatbots and other AI software. The company has garnered support from major tech players, including Amazon, Google (GOOGL.O), and Nvidia (NVDA.O), and its platform is notably popular among developers seeking open-source AI models such as Meta Platforms’ (META.O) Llama 3.

Once developers customize an open-source AI model, their next step is typically to deploy the model within a software application. The new partnership between Amazon and Hugging Face facilitates this deployment on AWS’s custom chip, Inferentia2, which is designed to make running AI models — a process known as inference — more efficient and cost-effective.

Jeff Boudier, head of product and growth at Hugging Face, emphasized the importance of efficiency in AI operations. “One thing that’s very important to us is efficiency—making sure that as many people as possible can run models and that they can run them in the most cost-effective way,” Boudier stated. This partnership aims to democratize access to powerful AI tools by reducing the financial barriers to running complex AI models.

For AWS, this collaboration represents a strategic move to attract more AI developers to its cloud services. Although Nvidia dominates the market for training AI models, AWS is positioning its chips as a superior option for inference, the task of applying trained models to new data. Inference occurs at a far higher frequency than training, with a deployed model potentially being queried thousands of times per hour.

“You train these models maybe once a month. But you may be running inference against them tens of thousands of times an hour. That’s where Inferentia2 really shines,” said Matt Wood, who oversees artificial intelligence products at AWS. Wood highlighted the cost advantages of using Inferentia2 for inference, suggesting that AWS’s chips can operate these models more economically over time.

The partnership between Amazon and Hugging Face underscores a broader trend in the tech industry towards integrating AI more deeply into cloud services. By leveraging Amazon’s custom chips, Hugging Face can offer its users enhanced performance and reduced costs, potentially accelerating innovation and adoption in the AI sector.

This move also signals Amazon’s intent to strengthen its foothold in the AI ecosystem, competing with other cloud service providers by offering specialized hardware solutions tailored for AI workloads. As AI continues to evolve and expand into various industries, collaborations like this are likely to play a crucial role in shaping the future of cloud computing and artificial intelligence.