World’s first large multimodal model (LMM) on an Android phone

Qualcomm Research

Generative AI and large language models (LLMs) have taken the world by storm, but until recently LLMs have been mostly limited to text inputs. In this MWC 2024 technology demo, we showcase the world’s first large multimodal model (LMM) on an Android phone. LLMs can now see.

Qualcomm AI Research is demonstrating Large Language and Vision Assistant (LLaVA), a 7+ billion parameter LMM that can accept multiple types of data inputs, including text and images, and hold multi-turn conversations with an AI assistant about an image. The LMM runs at a responsive token rate entirely on device, which enhances privacy, reliability, and personalization while reducing cost. Our full-stack AI optimization achieves high performance at low power. LMMs that combine language understanding with visual comprehension enable many use cases, such as identifying and discussing complex visual patterns, objects, and scenes.
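For context, here is a minimal sketch of this kind of text-plus-image, multi-turn interaction. It uses the open-source Hugging Face LLaVA 1.5 7B checkpoint as a stand-in; the model ID, image URL, and prompts are illustrative assumptions, and Qualcomm's on-device runtime and optimizations are not shown.

```python
# Minimal sketch: multi-turn image chat with a LLaVA-style 7B LMM.
# Assumes the open-source checkpoint "llava-hf/llava-1.5-7b-hf" as a
# stand-in; this is NOT the Qualcomm on-device stack from the demo.
import requests
import torch
from PIL import Image
from transformers import AutoProcessor, LlavaForConditionalGeneration

model_id = "llava-hf/llava-1.5-7b-hf"
processor = AutoProcessor.from_pretrained(model_id)
model = LlavaForConditionalGeneration.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Placeholder image URL; substitute any photo you want to discuss.
url = "https://example.com/photo.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Turn 1: the image is passed alongside the text prompt.
prompt = "USER: <image>\nWhat objects are in this picture? ASSISTANT:"
inputs = processor(text=prompt, images=image, return_tensors="pt").to(
    model.device, torch.float16
)
out = model.generate(**inputs, max_new_tokens=128)
reply = processor.decode(out[0], skip_special_tokens=True)
print(reply)

# Turn 2: multi-turn chat works by carrying the earlier turns in the
# prompt; the image is supplied once and referenced by the <image> token.
prompt = reply + "\nUSER: Describe the overall scene. ASSISTANT:"
inputs = processor(text=prompt, images=image, return_tensors="pt").to(
    model.device, torch.float16
)
out = model.generate(**inputs, max_new_tokens=128)
print(processor.decode(out[0], skip_special_tokens=True))
```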

Visit the Qualcomm AI Research website
https://www.qualcomm.com/research/art...

Develop with the Qualcomm AI Stack
https://www.qualcomm.com/products/tec...

Sign up for our newsletter
https://assets.qualcomm.com/mobile-co...