Web – OpenAI’s Whisper vs YouTube’s automatic subtitles.

Today I tested a video tutorial that I had hastily recorded in English. I am not a native English speaker, so to check that the spoken content was correct I used two tools: YouTube’s automatic subtitles and OpenAI’s Whisper algorithm. You probably already know that YouTube’s automatic subtitles work. Here’s what you should know about OpenAI’s Whisper:
Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web (see OpenAI’s Whisper webpage).
To run OpenAI’s Whisper on a video, I used one of the tools hosted on the Hugging Face website, more precisely the one created by jeffistyping, also known as Jeff.
This tool can be found on this webpage.
Tools like these let you submit your video content to a neutral reviewer that is based on artificial intelligence rather than on a human.
Processing the video with OpenAI’s Whisper took several attempts and setting changes before I could extract the text.
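If you prefer not to depend on a hosted tool, the same kind of extraction can be sketched locally. This is only a minimal sketch and not what I used for this test: it assumes the open-source openai-whisper Python package and ffmpeg are installed, the model name "base" is just a guess at a reasonable default, and the helper names (to_srt_timestamp, segments_to_srt, transcribe_to_srt) are my own illustrations.

```python
from typing import Iterable, Mapping


def to_srt_timestamp(seconds: float) -> str:
    """Convert seconds to the HH:MM:SS,mmm format used in SRT subtitle files."""
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: Iterable[Mapping]) -> str:
    """Build SRT text from segments that carry 'start', 'end' and 'text' keys."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = to_srt_timestamp(seg["start"])
        end = to_srt_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    return "\n".join(blocks)


def transcribe_to_srt(media_path: str, model_name: str = "base") -> str:
    """Transcribe an audio or video file with Whisper and return SRT text.

    Assumption: the openai-whisper package is installed (pip install
    openai-whisper) and ffmpeg is available on the PATH.
    """
    import whisper  # imported here so the helpers above work without it

    model = whisper.load_model(model_name)
    result = model.transcribe(media_path, language="en")
    return segments_to_srt(result["segments"])
```

Calling transcribe_to_srt("tutorial.mp4") (a hypothetical file name) should return subtitle text that you can save as a .srt file and read next to the YouTube subtitles. Larger models are slower but usually transcribe more accurately.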
You can see how my pronunciation of “add” was understood as “had”.
The result created by OpenAI’s Whisper is this:

You will see in the two images that OpenAI’s Whisper understands my pronunciation better than the YouTube tool, although from a data-processing point of view that is to be expected.
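Comparing two transcripts by eye works for a short video, but the check can also be sketched in a few lines of Python. The example below is a minimal sketch using only the standard library; the function names and the sample sentences are hypothetical, and only the “add” / “had” confusion comes from my test.

```python
import difflib
import re


def words(text: str) -> list[str]:
    """Lowercase the text and split it into simple words."""
    return re.findall(r"[a-z']+", text.lower())


def word_differences(reference: str, candidate: str) -> list[tuple[str, str]]:
    """Return (reference words, candidate words) pairs where the texts differ."""
    ref, cand = words(reference), words(candidate)
    matcher = difflib.SequenceMatcher(a=ref, b=cand, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            diffs.append((" ".join(ref[i1:i2]), " ".join(cand[j1:j2])))
    return diffs


# Hypothetical sentences: what I meant to say vs. what a tool heard.
intended = "I add a new cube"
heard = "I had a new cube"
print(word_differences(intended, heard))
```

Running the same function against the text from each tool gives a quick list of the words each one misheard, which makes it easier to spot pronunciation problems such as “add” and “had”.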
