Fine-Tuning LLMs: Best Practices and When to Go Small // Mark Kim-Huang // MLOps Meetup #124

MLOps.community
9.3K views · 1 year ago · MLOps Community Meetup
MLOps Community Meetup #124! Two weeks ago, we talked to Mark Kim-Huang, Co-Founder and Head of AI at Preemo Inc.

//Abstract
With the open-source community releasing foundation models at a blistering pace, there has never been a better time to develop an AI-powered product. In this talk, we walk you through the challenges and state-of-the-art techniques that can help you fine-tune your own LLMs. We also provide guidance on how to determine when a small model would be more appropriate for your use case.

// Bio
Mark is a co-founder and Head of AI at Preemo, a platform that helps companies build custom AI applications by making it extremely easy to fine-tune foundation models and deploy them into production. He has been a tech lead on machine learning teams at Splunk and Box, developing and deploying production systems for streaming analytics, personalization, and forecasting. In a previous life, he was an algorithmic trader at quantitative hedge funds.

// Jobs board
https://mlops.pallet.xyz/jobs

// Related links
Website: https://www.preemo.io/
Presentation slides: https://docs.google.com/presentation/...
LLMs in Production Conference Part 2 (June 15-16) Registration: https://home.mlops.community/home/eve...

---------- ✌️Connect With Us ✌️------------  
Join our Slack community:  https://go.mlops.community/slack
Follow us on Twitter:  @mlopscommunity
Sign up for the next meetup:  https://go.mlops.community/register
Catch all episodes, Feature Store, Machine Learning Monitoring, and Blogs: https://mlops.community/

Connect with Demetrios on LinkedIn: dpbrinkm
Connect with Ben on LinkedIn: ben-epstein
Connect with Mark on LinkedIn: markhng525

Timestamps:
[00:00] Introduction to Mark Kim-Huang
[00:39] Join the LLMs in Production Conference Part 2 on June 15-16!
[01:32] Fine-Tuning LLMs: Best Practices and When to Go Small
[02:31] Model approaches
[05:09] You might think that you could just use OpenAI but only older base models are available
[06:10] Why custom LLMs over closed source models?
[08:24] Small models work well for simple tasks
[09:51] Types of Fine-Tuning
[11:06] Strategies for improving fine-tuning performance
[11:24] Challenges
[13:18] Define your task
[13:26] Task framework
[14:47] Defining task(s)
[15:39] Task clustering diversifies training data and improves out-of-domain performance
[16:45] Prompt engineering
[17:05] Constructing a prompt
[19:23] Synthesize more data
[20:50] Constructing a prompt
[23:09] Increase fine-tuning efficiency with LoRA
[23:51] Naive data parallelism with mixed precision is inefficient
[26:13] Further reading on mixed precision
[26:40] Parameter-efficient fine-tuning with LoRA
[33:12] LoRA Data Parallel with Mixed Precision
[34:44] Summary
[36:20] Q&A
[36:44] Mark's journey to LLMs
[38:47] Task clustering mixing with existing data sets
[40:13] LangChain Auto Evaluator evaluating LLMs
[42:26] Cloud platforms costs
[45:00] Vector database used at Preemo
[46:55] Finding a reasoning path of a model on Prompting
[51:40] When to fine-tune versus prompting with a context window
[53:20] Wrap up
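
The parameter-efficient fine-tuning segment above ([26:40]) centers on LoRA, which freezes the pretrained weights and trains only a low-rank update. A minimal NumPy sketch of that idea follows; the dimensions, rank, and scaling factor are illustrative assumptions, not values from the talk.

```python
import numpy as np

# LoRA sketch: instead of updating a full weight matrix W (d_out x d_in
# parameters), train two small matrices A (r x d_in) and B (d_out x r)
# and compute the adapted weight as W' = W + (alpha / r) * B @ A.
d_in, d_out, r, alpha = 768, 768, 8, 16  # illustrative sizes, not from the talk

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable low-rank factor
B = np.zeros((d_out, r))                    # trainable; zero-init so W' == W at start

def adapted_forward(x):
    # Forward pass with the low-rank update applied alongside the frozen weight.
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# With B zero-initialized, the adapted model matches the base model exactly.
assert np.allclose(adapted_forward(x), W @ x)

full_params = d_out * d_in           # parameters touched by full fine-tuning
lora_params = r * d_in + d_out * r   # parameters trained by LoRA
print(f"full fine-tune params: {full_params:,}")  # 589,824
print(f"LoRA params (r={r}): {lora_params:,}")    # 12,288
```

The parameter count is why LoRA pairs well with the data-parallel, mixed-precision setup discussed at [33:12]: only the small A and B matrices need gradients and optimizer state.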
Published on 1402/03/12 (Jun 2, 2023).