Breaking Down Meta's Billion Dollar LLM Blueprint [Llama-3.1 Full Breakdown]

bycloud
42.9K views · 3 weeks ago
Try out Poe now and save your $$ on multi-subscriptions! https://quora.1stcollab.com/bycloudai

check out my newsletter:
https://mail.bycloud.ai

Llama-3.1's 92-page paper is an engineering paper that most people wouldn't care much about, but for any AI developer it would be seen as the goldmine paper of LLMs. Why is that? Let's find out what Meta's researchers shared about how such a chungus of a model is trained and optimized.

Llama-3.1 405B
[Paper] https://arxiv.org/abs/2407.21783

This video is supported by the kind Patrons & YouTube Members:
🙏Andrew Lescelius, alex j, Chris LeDoux, Alex Maurice, Miguilim, Deagan, FiFaŁ, Robert Zawiasa, Owen Ingraham, Daddy Wen, Tony Jimenez, Panther Modern, Jake Disco, Demilson Quintao, Penumbraa, Shuhong Chen, Hongbo Men, happi nyuu nyaa, Carol Lo, Mose Sakashita, Miguel, Bandera, Gennaro Schiano, gunwoo, Ravid Freedman, Mert Seftali, Mrityunjay, Richárd Nagyfi, Timo Steiner, Henrik G Sundt, projectAnthony,  Brigham Hall, Kyle Hudson, Kalila, Jef Come, Jvari Williams, Tien Tien, BIll Mangrum, owned, Janne Kytölä, SO, Richárd Nagyfi, Hector, Drexon, Claxvii 177th, Inferencer, Michael Brenner, Akkusativ, Oleg Wock, FantomBloth, Thipok Tham, Clayton Ford

[Discord] discord
[Twitter] bycloudai
[Patreon] bycloud

[Music 1] massobeats - gingersweet
[Music 2] massobeats - lush

[Profile & Banner Art] Twitter: pygm7
[Video Editor] Silas

0:00 Intro
2:52 model architecture
4:50 scaling law
6:39 compute & hardware optimization
10:24 Poe
11:48 Training recipe
17:49 Data mix
Published 3 weeks ago, on 1403/06/06.