OpenAI Whisper Speaker Diarization.mp3

Start End Text Info
0
Hey, welcome to One Little Coder. OpenAI Whisper is really good in transcribing languages,
CPS: 17 Duration: 5.12s
1
transcribing audios from any languages to English. So you can give an English audio and it can
CPS: 20 Duration: 4.74s
2
transcribe it to English or you can also give any other language it can transcribe and translate
CPS: 20 Duration: 4.68s
3
it to English. But OpenAI Whisper, what it cannot do out of the box is speaker diarization.
CPS: 15 Duration: 5.58s
4
What is a speaker diarization? So if you have got a conversation where two people are talking,
CPS: 22 Duration: 4.16s
5
you want to label those two people individually. Sometimes it is very important
CPS: 14 Duration: 5.92s
6
whenever you're transcribing a podcast or something like that, which is not currently
CPS: 20 Duration: 4.16s
7
possible out of the box using OpenAI Whisper. So that stops a lot of people from using OpenAI Whisper.
CPS: 15 Duration: 6.36s
8
We have got some open source contributions, where we have got an amazing person called
CPS: 19 Duration: 4.42s
9
Dwarkesh Patel. He has created a Google Colab notebook and also a Gradio demo
CPS: 15 Duration: 5s
10
that helps us do speaker diarization using OpenAI Whisper, also an additional model from
CPS: 13 Duration: 6.8s
11
SpeechBrain. So we are going to learn in this video how we can use Dwarkesh Patel's notebook,
CPS: 14 Duration: 6.44s
12
Google Colab notebook to do speaker diarization and also how we can utilize the Gradio application
CPS: 13 Duration: 7.26s
13
so that we don't have to code everything. The first thing that I'm going to show you is
CPS: 20 Duration: 4.36s
14
how you can leverage the Gradio application, which means you just have to drag and drop
CPS: 17 Duration: 5.12s
15
and you have speaker diarization in place. The next thing is I'm going to take you through
CPS: 18 Duration: 4.92s
16
the code so that you have an understanding about what is happening in the code. So that
CPS: 20 Duration: 4.24s
17
gives you the flexibility to tweak if you need to change anything. The first thing that you
CPS: 19 Duration: 4.84s
18
need to do is as you can see this Gradio application that is currently hosted on
CPS: 16 Duration: 4.92s
19
Hugging Face Spaces where you can upload an audio. I just uploaded an audio.
CPS: 18 Duration: 4.06s