Mistral-7B

Data Science Gems
Mistral 7B is a 7-billion-parameter language model engineered for superior performance and efficiency. Mistral 7B outperforms the best open 13B model (Llama 2) across all evaluated benchmarks, and the best released 34B model (Llama 1) in reasoning, mathematics, and code generation. Mistral 7B leverages grouped-query attention (GQA) for faster inference, coupled with sliding window attention (SWA) to effectively handle sequences of arbitrary length at a reduced inference cost. Mistral 7B-Instruct is a model fine-tuned to follow instructions. It surpasses the Llama 2 13B-Chat model on both human and automated benchmarks. These models are released under the Apache 2.0 license.
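To make the two attention mechanisms mentioned above concrete, here is a minimal PyTorch sketch of grouped-query attention combined with a sliding-window causal mask. It is not Mistral's actual implementation (see the GitHub repo linked below for that); the toy tensor sizes are illustrative assumptions, while the comment notes Mistral 7B's reported configuration (32 query heads, 8 KV heads, head dim 128, window 4096) from the paper.

```python
# Sketch of grouped-query attention (GQA) + sliding-window attention (SWA).
# Not the official Mistral code; shapes in the toy example are made up.

import torch
import torch.nn.functional as F

def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
    """Causal mask where each position attends only to the previous `window` tokens."""
    idx = torch.arange(seq_len)
    diff = idx[:, None] - idx[None, :]        # query position minus key position
    return (diff >= 0) & (diff < window)      # allowed iff 0 <= i - j < window

def grouped_query_attention(q, k, v, window: int):
    """
    q:    (batch, n_q_heads,  seq, head_dim)
    k, v: (batch, n_kv_heads, seq, head_dim), with n_q_heads a multiple of n_kv_heads.
    Each group of query heads shares one key/value head (GQA), and attention
    is restricted to a sliding window of recent tokens (SWA).
    """
    b, n_q, s, d = q.shape
    n_kv = k.shape[1]
    group = n_q // n_kv
    # Repeat K/V heads so every query head has a matching key/value head.
    k = k.repeat_interleave(group, dim=1)
    v = v.repeat_interleave(group, dim=1)

    scores = q @ k.transpose(-2, -1) / d ** 0.5            # (b, n_q, s, s)
    mask = sliding_window_mask(s, window).to(scores.device)
    scores = scores.masked_fill(~mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v                   # (b, n_q, s, d)

# Toy usage (Mistral 7B itself uses 32 query heads, 8 KV heads,
# head_dim 128, and a 4096-token sliding window).
q = torch.randn(1, 8, 16, 32)
k = torch.randn(1, 2, 16, 32)
v = torch.randn(1, 2, 16, 32)
out = grouped_query_attention(q, k, v, window=4)
print(out.shape)  # torch.Size([1, 8, 16, 32])
```

Because keys and values are stored for fewer heads, GQA shrinks the KV cache and speeds up decoding, while the window mask keeps per-token attention cost bounded regardless of sequence length.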

In this video, I will talk about the following: What is the architecture of Mistral-7B? How does Mistral-7B perform?

For more details, please look at https://arxiv.org/abs/2310.06825 and https://mistral.ai/news/announcing-mi... and https://github.com/mistralai/mistral-src

Jiang, Albert Q., Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand et al. "Mistral 7B." arXiv preprint arXiv:2310.06825 (2023).