Transcribing audio with less pain
Like so many people I’ve never really liked transcribing audio, for example from interviews or focus groups. It is time-consuming and boring. Of course, you can outsource this but that unfortunately costs money. So I thought: “how can I do this quicker with available services.”
Last year with a colleague I wrote an article on exactly this: using the Youtube auto-captioning feature to more quickly transcribe audio. The quality of Youtube’s voice recognition has improved considerably in the last decade. The paper gives three examples, from interview audio, a classroom recording, and a Chilcott inquiry interview to show how useful this can be for transcribing audio ‘as a first transcript version’. I just posted the pre-publication.
To demonstrate the procedure, I applied it to my recent podcast with TES.
- You first need to get hold of an audio file. I assume you have it from your data collection. Sometimes you can obtain them like using apps in the browser like DownThemAll! (that one is for Firefox),
- Before being able to upload to Youtube, you need to make a video file out of it. For windows, I prefer Movie Maker. Unfortunately this has been discontinued, but you can still find it here. I make a video with an image and the audio as accompanying sound.
- Now this ‘movie’ (actually audio with one image) can be uploaded to Youtube. After a few hours Youtube should have created closed captions for the audio. Ensure that privacy settings are set correctly.
- The captions can be downloaded as text file via multiple tools like DIY captions or downsub. Some software is non-web-browser based, and some can also work with private settings (just as long as you are the ‘owner’ of the file, of course). The result might be a subtitle file, which could further be edited with subtitle software.
- You can see that this version already is pretty good. I think it captures it for around 80%. It took maybe 15 minutes of actual labour and some time for the Youtube captioning to do its work, for a 40 minute audio file. This saves me a lot of time.