Meta has released new versions of its renowned open source AI model Llama, including small and medium-sized models light enough to run on edge and mobile devices.
The Llama 3.2 models were showcased at the company's annual Meta Connect event. They support multilingual text generation and vision applications such as image recognition.
“This is our first open source, multimodal model, and it’s going to enable a lot of interesting applications that require visual understanding,” stated Mark Zuckerberg, CEO of Meta.
Llama 3.2 builds on Llama 3.1, the huge open source model Meta released in late July. That model was the largest open source AI model released to date, with 405 billion parameters (parameters are the adjustable variables within an AI model that help it learn patterns from data). Parameter count is a rough indicator of a model's capacity to interpret and generate human-like text.
The new Llama models presented at Meta Connect 2024 are significantly smaller. Meta said it chose to develop smaller models because not all researchers have the computational resources and expertise required to run a model as large as Llama 3.1.
On performance, Meta says the new Llama 3.2 models are competitive with leading systems from Anthropic and OpenAI. The 3B model outperforms Google's Gemma 2 2.6B and Microsoft's Phi 3.5-mini on tasks such as instruction following and content summarisation.
The 90B version, the largest of the models, surpasses both Claude 3 Haiku and GPT-4o-mini on a variety of benchmarks, including MMLU, a widely used test of a model's general knowledge and reasoning.
How to access Llama 3.2 models
The new Llama 3.2 models are open source, so anyone can download and use them to power AI applications. The models can be downloaded directly from llama.com or Hugging Face, a popular platform for hosting open source models. Llama 3.2 models are also available through a number of cloud providers, including Google Cloud, AWS, Nvidia, Microsoft Azure, and Groq, among others.
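For developers who want to try the models locally, the snippet below is a minimal sketch of running the 3B instruction-tuned variant through Hugging Face's transformers library. The model ID, prompt, and generation settings are illustrative; downloading Meta's Llama weights from Hugging Face requires accepting the licence and authenticating with a Hugging Face token, and the example assumes transformers and accelerate are installed.

```python
# Minimal sketch: running a Llama 3.2 text model via the Hugging Face
# transformers pipeline. Model ID and prompt are illustrative; gated access
# to Meta's repositories must be approved on Hugging Face beforehand.
import torch
from transformers import pipeline

model_id = "meta-llama/Llama-3.2-3B-Instruct"  # assumed ID for the 3B instruct model

generator = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.bfloat16,  # half-precision to reduce memory use
    device_map="auto",           # place the model on a GPU if one is available
)

messages = [
    {"role": "user", "content": "Summarise the key features of Llama 3.2 in two sentences."}
]

outputs = generator(messages, max_new_tokens=128)
# The pipeline returns the full conversation; the last message is the model's reply.
print(outputs[0]["generated_text"][-1]["content"])
```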
According to figures published in early September, demand for Meta's Llama models from cloud customers increased tenfold between January and July, and is expected to rise further with the launch of the 3.2 line. Meta partner Together AI is providing free access to the vision version of Llama 3.2 11B on its platform until the end of the year.
Vipul Ved Prakash, founder and CEO of Together AI, stated that the new multimodal models will drive the adoption of open-source AI among developers and organisations.
“We’re thrilled to partner with Meta to offer developers free access to the Llama 3.2 vision model and to be one of the first API providers for Llama Stack,” Prakash noted. “With Together AI's support for Llama models and Llama Stack, developers and enterprises can experiment, build, and scale multimodal applications with the best performance, accuracy, and cost.”
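As a rough illustration of that hosted route, the sketch below queries the Llama 3.2 11B vision model through Together AI's OpenAI-compatible API. The base URL reflects Together AI's documented endpoint, but the exact model identifier and the image URL shown here are assumptions and should be checked against Together AI's model listing.

```python
# Hedged sketch: calling a hosted Llama 3.2 vision model via Together AI's
# OpenAI-compatible endpoint. The model identifier below is an assumption;
# verify it against Together AI's current model catalogue.
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ["TOGETHER_API_KEY"],   # Together AI API key
    base_url="https://api.together.xyz/v1",   # OpenAI-compatible endpoint
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.2-11B-Vision-Instruct-Turbo",  # assumed model ID
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe what is shown in this image."},
                {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},  # placeholder image
            ],
        }
    ],
    max_tokens=256,
)

print(response.choices[0].message.content)
```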