Key Developments in AI: Apple's Open Source Multimodal LLM, Ferret
Apple Launches Ferret: Apple quietly launched Ferret, a new multimodal large language model (LLM), in October 2023, stepping into open-source AI development. The open release is a notable shift from Apple’s usual preference for closed systems【source】【source】.
Understanding Text and Images: Ferret stands out because it can process and understand images as well as text. It gives AI developers and researchers a new tool in a field that includes Google’s Gemini, Meta’s Llama 2, and OpenAI’s GPT series【source】.
Ferret's Capabilities and Usage
Ferret is designed to refer to and ground any element of an image at any granularity. In practice, it can take a region of an image that the user points out (referring) and locate the objects its answer mentions (grounding), so a query can target one specific part of a picture. For instance, if a user highlights an animal in a photo and asks what it is, Ferret can determine the species and draw on other elements in the image to give a context-aware response【source】.
The model was developed jointly by researchers from Apple and Columbia University. Its code, along with the Ferret-Bench benchmark, was released on GitHub in October with little fanfare, and AI researchers took wider notice only later【source】.
Implications and Future Prospects
Ferret was released under a non-commercial license, so it cannot be used commercially in its current state. However, it could still be integrated into future Apple products or services. The release also indicates that Apple is becoming more open about sharing its AI research, in contrast to its traditionally secretive approach【source】.
While Apple is expanding its AI server capacity, it still faces challenges in scaling to match services such as ChatGPT. Ferret was trained on Nvidia GPUs, and its release signals Apple's commitment to advancing AI technology and, potentially, to collaborating with other firms to extend its capabilities【source】.
About the author
Evalest's tech news is crafted by cutting-edge Artificial Intelligence (AI), meticulously fine-tuned and overseen by our elite tech team. Our summarized news articles stand out for their objectivity and simplicity, making complex tech developments accessible to everyone. With a commitment to accuracy and innovation, our AI captures the pulse of the tech world, delivering insights and updates daily. The expertise and dedication of the Evalest team ensure that the content is genuine, relevant, and forward-thinking.