LLaVA - This Open Source Model Can SEE Just like GPT-4-V
16.6K views · 9 months ago
In this video, we look at the newly released LLaVA-1.5-13B, the latest open-source multimodal model that can see images.
LLaVA is a novel end-to-end trained large multimodal model that combines a vision encoder with Vicuna for general-purpose visual and language understanding. It achieves impressive chat capabilities that mimic the multimodal GPT-4 and sets a new state-of-the-art accuracy on Science QA.
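As a rough sketch of how a model like this can be queried locally (this is an assumption, not covered in the video: the `llava-hf/llava-1.5-13b-hf` checkpoint name, the Hugging Face `transformers` classes, and the `USER: <image> ... ASSISTANT:` prompt template all come from the community llava-hf release):

```python
# Sketch: querying LLaVA-1.5-13B through Hugging Face transformers.
# Assumptions (not from this video): the llava-hf community checkpoint
# and the LLaVA-1.5 "USER: <image>\n... ASSISTANT:" conversation template.

def build_llava_prompt(question: str) -> str:
    """Format a question in the (assumed) LLaVA-1.5 conversation template."""
    return f"USER: <image>\n{question} ASSISTANT:"

def describe_image(image_path: str, question: str) -> str:
    # Heavy imports are kept local so the prompt helper stays usable
    # on machines without a GPU or the ~13B model weights downloaded.
    from PIL import Image
    from transformers import AutoProcessor, LlavaForConditionalGeneration

    model_id = "llava-hf/llava-1.5-13b-hf"  # assumed community checkpoint
    processor = AutoProcessor.from_pretrained(model_id)
    model = LlavaForConditionalGeneration.from_pretrained(model_id, device_map="auto")

    inputs = processor(
        text=build_llava_prompt(question),
        images=Image.open(image_path),
        return_tensors="pt",
    ).to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=128)
    return processor.decode(output_ids[0], skip_special_tokens=True)

if __name__ == "__main__":
    print(build_llava_prompt("What is shown in this image?"))
```

The prompt template is the part to double-check against the model card; other LLaVA variants use different conversation formats.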
LET'S CONNECT:
☕ Buy me a Coffee: https://ko-fi.com/promptengineering
🔴 Support my work on Patreon: Patreon.com/PromptEngineering
🦾 Discord: discord
📧 Business Contact: [email protected]
💼 Consulting: https://calendly.com/engineerprompt/c...
LINKS:
Llava-Github: https://llava-vl.github.io/
Llava Demo: https://llava.hliu.cc/
Published on 1402/07/19 (Persian calendar) · 16,635 views