Run Llama 2 with 32k Context Length!

Trelis Research
Achieve long context length by using Code Llama to scale to 32k tokens.
- Get up to 16k tokens on a Colab 40 GB GPU
- Get up to 32k tokens on an 80 GB A100 on RunPod (or AWS or Azure)
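
The GPU sizes above can be sanity-checked with a rough KV-cache estimate. This is a minimal sketch (my assumption, not from the video), using standard Llama 2 7B shapes of 32 layers, 32 KV heads and head dim 128 with an fp16 cache; quantised weights and activations come on top:

```python
# Rough KV-cache memory estimate (assumed Llama 2 7B shapes:
# 32 layers, 32 KV heads, head dim 128, fp16 cache = 2 bytes per value).
layers, kv_heads, head_dim, bytes_per_value = 32, 32, 128, 2

for seq_len in (16_384, 32_768):
    # Factor of 2 covers both the key cache and the value cache.
    kv_bytes = 2 * layers * kv_heads * head_dim * seq_len * bytes_per_value
    print(f"{seq_len:>6} tokens -> ~{kv_bytes / 1e9:.1f} GB KV cache")

# Prints ~8.6 GB at 16k and ~17.2 GB at 32k; adding quantised weights and
# activations lands near the 40 GB / 80 GB cards mentioned above.
```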

Tips:
- Use Flash Attention and BetterTransformer
- Use GPTQ quantization
- Use the 13B model for better quality
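
A minimal loading sketch along the lines of those tips (assumptions, not the exact notebook code: the TheBloke/CodeLlama-13B-Instruct-GPTQ repo, recent transformers/optimum/auto-gptq installs, and a working flash-attn build):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/CodeLlama-13B-Instruct-GPTQ"  # assumed GPTQ model repo

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",                        # place the GPTQ weights on the GPU
    attn_implementation="flash_attention_2",  # Flash Attention for long prompts
)
# Alternative to Flash Attention: model.to_bettertransformer() via optimum.

long_document = "..."  # placeholder: paste your long document here
prompt = f"[INST] Summarise the following document:\n{long_document} [/INST]"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200, temperature=0.7, do_sample=True)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```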

Free jupyter notebook: https://github.com/TrelisResearch/cod...

Purchase the PRO notebook: https://buy.stripe.com/fZe14Q5tP0zpaM...
- Allows for saving and re-loading of conversations
- Allows for uploading and analysis of documents
- Works on Google Colab or on a Server, e.g. AWS, Azure, RunPod (affiliate link: https://tinyurl.com/yjxbdc9w)

Trelis ADVANCED Inference Repo:
- Server Setup
- API setup with Runpod
- Function-calling API scripts
Learn more: https://trelis.com/enterprise-server-...

0:00 How to run Llama 2 with longer context length
0:50 Run Llama 2 with 16k context in Google Colab
2:20 How to run a GPTQ model in Colab
3:43 Run Llama 2 7B with 32k context length using RunPod
6:20 Run Llama 2 13B for better performance! 16k context length
8:15 Streaming Llama 2 13B on 16k context length
9:50 Adjusting max token output and temperature
10:20 Streaming Llama 2 13B on 16k context length and 0 temperature
11:25 STREAMING LLAMA 2 13B ON 32k CONTEXT LENGTH!
12:50 PRO NOTEBOOK - Save Chats and Files. Easily adjust context length.
16:40 THEORY BONUS: How to get longer context length?
17:45 How does GPTQ work?
18:00 How does Flash Attention work?
19:45 What is the best model for long context length?
20:20 Which is better: Llama 2, Code Llama, or YaRN?
21:30 Tips for long context lengths