Apple Unveils Ferret: A Pioneering Open Source Multimodal LLM

Apple's latest innovation, Ferret, is a multimodal large language model (LLM) that understands text and images, marking a significant shift towards open-source AI development.

Key Developments in AI: Apple's Open Source Multimodal LLM, Ferret

  • Apple Launches Ferret: Apple quietly released Ferret, a new multimodal large language model (LLM), in October 2023, stepping into open-source AI development. For a company known for closed systems, releasing Ferret as an open-source model is a notable shift in strategy【source】【source】.

  • Understanding Text and Images: Ferret stands out because it can process and understand images as well as text. This capability positions Ferret alongside models from major players such as Google’s Gemini, Meta’s Llama 2, and OpenAI’s GPT series, offering a new tool for AI developers and researchers【source】.

Ferret's Capabilities and Usage

Ferret's design allows it to refer to and ground any element of an image at any granularity. In practice, it can identify and analyze a specific part of an image that a user points to, and it can locate in the image the objects it mentions in its answers. For instance, if a user highlights an animal in a photo and asks what it is, Ferret can determine the species and give a contextual response that draws on other elements in the image【source】.
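To make the idea of "referring" to a region concrete, here is a minimal Python sketch of how a highlighted area might be turned into a text query. It is an illustration only and not Ferret's actual interface: the `Region` class, the `normalize` and `build_referring_prompt` helpers, the 1000-bin coordinate grid, and the prompt wording are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A rectangular image region in pixel coordinates (hypothetical helper)."""
    x1: int
    y1: int
    x2: int
    y2: int


def normalize(region: Region, width: int, height: int, bins: int = 1000) -> tuple:
    """Scale pixel coordinates to integer bins in [0, bins - 1].

    The 1000-bin grid is an assumption chosen for this sketch.
    """
    def scale(value: int, size: int) -> int:
        # Clamp so that a coordinate on the image edge stays inside the grid.
        return min(bins - 1, max(0, int(value / size * bins)))

    return (
        scale(region.x1, width),
        scale(region.y1, height),
        scale(region.x2, width),
        scale(region.y2, height),
    )


def build_referring_prompt(question: str, region: Region, width: int, height: int) -> str:
    """Attach the normalized region to the user's question (illustrative format)."""
    x1, y1, x2, y2 = normalize(region, width, height)
    return f"{question} Region: [{x1}, {y1}, {x2}, {y2}]"


# Usage: the user highlights an animal in a 400x400 photo and asks about it.
highlight = Region(x1=100, y1=50, x2=300, y2=200)
prompt = build_referring_prompt("What is this animal?", highlight, 400, 400)
print(prompt)
```

Normalizing coordinates keeps the query independent of image resolution, which is one plausible reason a model would work with a fixed grid rather than raw pixels.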

The model was developed jointly by Apple and researchers from Columbia University. Ferret's code, along with the Ferret-Bench evaluation benchmark, was released on GitHub in October with little publicity, and AI researchers recognized its significance only later【source】.

Implications and Future Prospects

Ferret was released under a non-commercial license, so it cannot be used commercially in its current state. However, it could still be integrated into future Apple products or services. The release also indicates that Apple is becoming more open about sharing its AI research, in contrast to its traditionally secretive approach【source】.

While Apple is enhancing its AI server capabilities, it still faces challenges in scaling to match services such as ChatGPT. The release of Ferret, which was trained on powerful Nvidia GPUs, demonstrates Apple's commitment to advancing AI technology and suggests it may collaborate with other firms to augment its capabilities【source】.

About the author

Evalest's tech news is crafted by cutting-edge Artificial Intelligence (AI), meticulously fine-tuned and overseen by our elite tech team. Our summarized news articles stand out for their objectivity and simplicity, making complex tech developments accessible to everyone. With a commitment to accuracy and innovation, our AI captures the pulse of the tech world, delivering insights and updates daily. The expertise and dedication of the Evalest team ensure that the content is genuine, relevant, and forward-thinking.