OpenAI Whisper: Robust Speech Recognition via Large-Scale Weak Supervision | Paper and Code

Aleksa Gordić - The AI Epiphany
34.7K views - 2 years ago
❤️ Become The AI Epiphany Patreon ❤️
Patreon: theaiepiphany

👨‍👩‍👧‍👦 Join our Discord community 👨‍👩‍👧‍👦
Discord: discord

In this video I cover Whisper, an ASR system from OpenAI's "Robust Speech Recognition via Large-Scale Weak Supervision" paper.

Trained on a huge multilingual, multitask, weakly supervised dataset, it achieves very high effective robustness and accuracy, closing the gap with the human baseline using only an off-the-shelf transformer.

I walk you through both the paper and the actual code. Let me know whether the code part helped!

▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
✅ Paper: https://cdn.openai.com/papers/whisper...
✅ Code: https://github.com/openai/whisper

✅ Nice explanation of mel spectrograms: Mel Spectrograms Explained Easily
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬

⌚️ Timetable:
00:00:00 Intro
00:02:05 Paper overview
00:07:30 Collecting a large scale weakly supervised dataset
00:13:55 Evaluation metric issues (WER)
00:16:05 Effective robustness
00:18:40 Scaling laws in progress
00:26:30 Decoding is hacky
00:28:30 Code walk-through
00:30:25 Model architecture (diagram vs code)
00:33:30 Transcription task
00:34:10 Loading the audio, mel spectrograms
00:37:50 Language detection
00:45:00 Transcription task continued
00:47:35 Suppressing token logits
00:52:00 Voice activity detection
00:53:35 Decoding and heuristics
01:01:56 Outro

▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
💰 BECOME A PATREON OF THE AI EPIPHANY ❤️

If these videos, GitHub projects, and blogs help you,
consider helping me out by supporting me on Patreon!

The AI Epiphany - Patreon: theaiepiphany
One-time donation - https://www.paypal.com/paypalme/theai...

Huge thank you to these AI Epiphany patreons:
Eli Mahler
Petar Veličković

▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬

💼 LinkedIn - LinkedIn: aleksagordic
🐦 Twitter - Twitter: gordic_aleksa
👨‍👩‍👧‍👦 Discord - Discord: discord

📺 YouTube - theaiepiphany
📚 Medium - Medium: gordicaleksa
💻 GitHub - https://github.com/gordicaleksa
📢 AI Newsletter - https://aiepiphany.substack.com/

▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬

#whisper #openai #asr
Published 2 years ago, on 1401/07/02 (Solar Hijri calendar).
34,731 views