How to Use the OpenAI Whisper Model to Convert a Batch of Audio Files Into a Text File via Python (Google Colab)

DigNo Ape
2 min read · Mar 5, 2024

A few days ago, a friend shared hundreds of audio files (.mp3) and asked me how to get the transcripts, either manually or with online tools. I thought we could use OpenAI’s Whisper model for this task, so here I demonstrate how I converted the files into text saved in a single document.

Step 1: Store audio files in Google Drive.

Step 2: Create your own API key from OpenAI.

Step 3: Import the packages and mount your drive. Use os.listdir to get the file names in the directory path (folder).

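import os
from openai import OpenAI
from google.colab import drive

# Mount Google Drive so Colab can read the audio files.
drive.mount('/content/drive')

# Replace the placeholders with your own API key and folder path.
client = OpenAI(api_key='Your API Key')
directory_path = 'Where you save your audio files'

# List every file name in the folder.
directory_files = os.listdir(directory_path)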
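
os.listdir returns every entry in the folder, including anything that is not audio. A minimal sketch of a filter is shown here; the helper name select_audio_files and the extension set are assumptions for illustration, not part of the original notebook, so check the OpenAI documentation for the formats currently accepted.

```python
import os

# Assumed set of audio extensions; verify against the current API documentation.
AUDIO_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}

def select_audio_files(file_names):
    """Return only the names with an audio extension, sorted for a stable order."""
    return sorted(
        name for name in file_names
        if os.path.splitext(name)[1].lower() in AUDIO_EXTENSIONS
    )
```

In the notebook this could be applied as directory_files = select_audio_files(os.listdir(directory_path)), so stray files in the folder are never sent to the API.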

Step 4: Loop through the files, transcribe each audio file, and append (‘a’ mode) the file name and transcription.text to Downloaded.txt.

for file in directory_files:
    audio_file_path = os.path.join(directory_path, file)
    print(audio_file_path)
    # Open the audio in binary mode; the with block closes it after the request.
    with open(audio_file_path, "rb") as audio_file:
        transcription = client.audio.transcriptions.create(
            model="whisper-1",
            file=audio_file
        )
    # Append the file name and its transcript to the output document.
    with open('/content/drive/MyDrive/Downloaded.txt', 'a') as writefile:
        writefile.write("\n")
        writefile.write(file)
        writefile.write("\n")
        writefile.write(transcription.text)
        writefile.write("\n")
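
With hundreds of files, a single failed request would stop the loop partway. One possible variation is sketched here: the function transcribe_batch and its transcribe parameter are hypothetical names, and the callable is assumed to take a file name and return the transcript text. Failures are recorded and the loop moves on.

```python
def transcribe_batch(file_names, transcribe, output_path):
    """Append one entry per file to output_path; return the names that failed."""
    failed = []
    for name in file_names:
        try:
            text = transcribe(name)
        except Exception as error:
            # Record the failure and continue with the remaining files.
            print(f"Skipping {name}: {error}")
            failed.append(name)
            continue
        # Same layout as the loop above: blank line, file name, transcript.
        with open(output_path, "a", encoding="utf-8") as writefile:
            writefile.write(f"\n{name}\n{text}\n")
    return failed
```

Here transcribe would wrap the client.audio.transcriptions.create call and return transcription.text. The returned list can be passed back in to retry only the files that failed.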

Notes

I converted around 200 audio files (1–2 minutes each) yesterday; here are the usage and charge for your reference.
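
For a rough sense of scale, the charge for a batch like this can be estimated from the total audio length. The sketch below is only an estimate, not my actual bill: the price of 0.006 USD per audio minute is an assumption based on the published whisper-1 rate and may have changed, and estimate_cost is a hypothetical helper.

```python
# Assumed price per audio minute in USD; confirm against OpenAI's pricing page.
ASSUMED_PRICE_PER_MINUTE = 0.006

def estimate_cost(file_count, minutes_per_file, price_per_minute=ASSUMED_PRICE_PER_MINUTE):
    """Return the estimated charge in USD, rounded to cents."""
    return round(file_count * minutes_per_file * price_per_minute, 2)

# 200 files at 1 to 2 minutes each gives a lower and an upper bound.
low = estimate_cost(200, 1)
high = estimate_cost(200, 2)
print(f"Estimated charge: ${low:.2f} to ${high:.2f}")
```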

Thank you!

