Apple recently published a research paper, "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training", describing a new approach to building artificial intelligence (AI) models. MM1 is pre-trained on a mix of image-caption pairs, interleaved image-text documents, and text-only data so that it can work with both visual and textual information.
Apple states that the new model will improve AI capabilities, including image captioning, visual question answering, and natural language inference, with high accuracy. The company's research concentrated on how different kinds of training data and model configurations can be combined. This approach enables the model to integrate visual and text data, yielding more accurate results.
To advance in the AI field, Apple is reportedly in discussions with Google, an Alphabet Inc. subsidiary, to license Google's Gemini large language models to power new features scheduled to arrive on the iPhone as part of iOS 18, according to Bloomberg.
This potential collaboration would be a significant step for both Apple and Google, and could pull them ahead of OpenAI, which currently leads the AI field. Bloomberg also reported that Apple has held talks about generative AI with OpenAI, the creator of ChatGPT.
Google had previously struck a search deal with Apple under which it agreed to pay $18 billion annually to remain the default search engine across Apple's devices.