Transcribe Audio to Text for FREE in Google Colab using OpenAI Whisper (Step-by-step Instructions)

Yigal Baruch

1. First, we’ll need to open a Colab notebook. To do that, visit https://colab.research.google.com/#create=true and Google will generate a new notebook for you. Alternatively, open any folder in your Google Drive, right-click an empty space (as if you were creating a new file) and choose More > Google Colaboratory. A new tab will open with your new notebook. It’s called Untitled.ipynb, but you can rename it to anything you want.

2. Next, we want to make sure our notebook is using a GPU. Google often allocates one by default, but not always. To check, go to Runtime > Change runtime type in the Colab menu. A small window will pop up. Under Hardware accelerator, make sure T4 GPU is selected and click Save.
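If you want to confirm the GPU from inside the notebook, a small check like the one below can help. This is only a sketch: it assumes the nvidia-smi tool is present on GPU runtimes, and the helper name gpu_summary is made up for this example.

```python
import shutil
import subprocess


def gpu_summary():
    """Return one line per visible NVIDIA GPU, or None if no GPU is found."""
    # nvidia-smi ships with the NVIDIA driver; it is absent on CPU-only runtimes.
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    # A non-zero exit code means the tool exists but could not reach a GPU.
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


summary = gpu_summary()
print(summary if summary else "No GPU detected: check Runtime > Change runtime type.")
```

If the cell reports no GPU, change the runtime type as described above and run the cell again.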

3. Now we can install Whisper. (You can also check the install instructions in the official GitHub repository.) To install it, paste the following lines into a cell. To run the commands, click the play button at the left of the cell or press Ctrl + Enter. The install process should take 1-2 minutes.

!pip install git+https://github.com/openai/whisper.git
!sudo apt update && sudo apt install -y ffmpeg
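Once the install cell finishes, you can optionally verify that both pieces are in place before moving on. The snippet below is a minimal sketch using only the Python standard library; check_whisper_setup is a hypothetical helper name, not part of Whisper.

```python
import importlib.util
import shutil


def check_whisper_setup():
    """Report whether the whisper package and the ffmpeg binary are available."""
    return {
        # find_spec returns None when the package is not installed.
        "whisper": importlib.util.find_spec("whisper") is not None,
        # which returns None when the executable is not on the PATH.
        "ffmpeg": shutil.which("ffmpeg") is not None,
    }


for name, ok in check_whisper_setup().items():
    print(f"{name}: {'found' if ok else 'MISSING'}")
```

If either line says MISSING, rerun the install cell and look for errors in its output.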

4. Now, we can upload a file to transcribe it. To do this, open the File Browser at the left of the notebook by pressing the folder icon.

Now you can press the upload file button at the top of the file browser, or just drag and drop a file from your computer and wait for it to finish uploading.
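To double-check that the upload finished, you can list the audio files from a cell. This sketch assumes the file landed in the notebook's current working directory; the extension list and the find_audio_files helper are illustrative choices, not anything Colab or Whisper defines.

```python
from pathlib import Path

# Extensions chosen for illustration; ffmpeg can decode many more formats.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}


def find_audio_files(directory="."):
    """Return (name, size in MB) for each audio file directly inside directory."""
    files = []
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS:
            size_mb = round(path.stat().st_size / 1_000_000, 2)
            files.append((path.name, size_mb))
    return files


for name, size_mb in find_audio_files():
    print(f"{name}: {size_mb} MB")
```

If the size shown is smaller than the file on your computer, the upload is probably still in progress.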

5. Next, we can run Whisper to transcribe the audio file using the following command. Replace Filename.mp3 with the name of the file you uploaded. If this is the first time you’re running Whisper with this model, it will first download the model weights.

!whisper "Filename.mp3" --model medium
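If you would rather work with the transcript inside Python than read it from the command output, Whisper can also be called as a library. The sketch below uses the package's load_model and transcribe calls; the helper names transcribe_file, format_segments and format_timestamp are made up for this example, and the HH:MM:SS layout is just one possible choice.

```python
def format_timestamp(seconds):
    """Convert a number of seconds to HH:MM:SS (fractions are dropped)."""
    total = int(seconds)
    return f"{total // 3600:02d}:{total % 3600 // 60:02d}:{total % 60:02d}"


def format_segments(segments):
    """Turn Whisper segments (dicts with start, end, text) into timestamped lines."""
    lines = []
    for seg in segments:
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        lines.append(f"[{start} - {end}] {seg['text'].strip()}")
    return "\n".join(lines)


def transcribe_file(path, model_name="medium"):
    """Transcribe an audio file and return (full text, timestamped lines)."""
    # Imported here so the helpers above still work before Whisper is installed.
    import whisper

    model = whisper.load_model(model_name)
    result = model.transcribe(path)
    return result["text"], format_segments(result["segments"])
```

For example, text, timed = transcribe_file("Filename.mp3") followed by print(timed) would print one line per segment with its start and end time.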

You can check out all the options you can use in the command-line for Whisper by running !whisper -h in Google Colab.
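As an illustration of a few of those options, the sketch below assembles a command that sets the spoken language and writes subtitles instead of plain text. The flags --language, --output_format and --output_dir are listed in the help output; build_whisper_command is a hypothetical helper and the defaults it uses are assumptions for this example.

```python
import shlex


def build_whisper_command(audio_path, model="medium", language=None,
                          output_format="txt", output_dir="."):
    """Build the argument list for a whisper command-line call."""
    cmd = [
        "whisper", audio_path,
        "--model", model,
        "--output_format", output_format,
        "--output_dir", output_dir,
    ]
    # Leaving the language unset lets Whisper detect it automatically.
    if language:
        cmd += ["--language", language]
    return cmd


# Print the command so it can be pasted into a cell after a leading "!".
print(shlex.join(build_whisper_command("Filename.mp3", language="en", output_format="srt")))
```

The same list can be passed to subprocess.run if you prefer to launch Whisper from Python rather than with a ! command.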
