Getting Audio File Information With HTML’s File API And Audio Element
I’ve been cranking out features and capabilities on SignalLeaf for just over a month now, and it’s ever so close to being ready for some public beta testing. I do have one alpha tester, though. John Sonmez is slowly moving some of his “Get Up And Code” podcast over to SignalLeaf, and is providing feedback to me along the way. One of the features that he asked for, after uploading his first file, was automatic file information gathering from the .mp3 file. It turns out information such as the file’s content type, size and duration can all be obtained quite easily with modern browsers – no server side code needed. This was a huge relief for me as I’m hosting SignalLeaf on Heroku and did not want to stand up a real server somewhere, to store files and process the audio file.
Basic File Info With The File API
Most of the file information that I need can be gathered with the HTML File API. File name, size and content type, for example, can be pulled from this easily. The only thing you need is an <input type="file"> element.
When the “change” event on the file input fires, you can grab the file information using the .files property of the input element. In this case, I’m using jQuery to capture the “change” event, but I’m grabbing the e.currentTarget directly after that. This gives me the HTML file input element, which has the .files property on it. This property is a FileList, an array-like collection of files, for scenarios where you are selecting multiple files. In this case, there is only a single file selected so I’m grabbing the first item in the list.
Once I have the file object, I can get the .name, .size and .type information and populate the HTML in my document, store these in fields, or do whatever else I need to do with them. But while this information is great, it isn’t enough. I need to know the song duration as well, so that podcasters uploading an mp3 file to SignalLeaf won’t have to manually enter the song duration.
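As a rough illustration of the steps above, here is a minimal sketch in plain JavaScript. It uses addEventListener rather than jQuery to stay self-contained, and the element id "audio-file" and the helper names describeFile and firstSelectedFile are made up for this example.

```javascript
// Pull the basic metadata off a File object (or anything shaped like one).
function describeFile(file) {
  return {
    name: file.name, // file name as selected, e.g. "episode-01.mp3"
    size: file.size, // size in bytes
    type: file.type, // MIME type reported by the browser; "" if it is unknown
  };
}

// Return the first selected file from a file input, or null if none is selected.
function firstSelectedFile(input) {
  return input.files && input.files.length > 0 ? input.files[0] : null;
}

// Browser wiring. Assumes the page contains <input type="file" id="audio-file">;
// that id is hypothetical and only used for this sketch.
if (typeof document !== "undefined") {
  const input = document.getElementById("audio-file");
  if (input) {
    input.addEventListener("change", (e) => {
      const file = firstSelectedFile(e.currentTarget);
      if (file) {
        console.log(describeFile(file));
      }
    });
  }
}
```

Keeping describeFile separate from the event wiring means the same function can populate the page, fill form fields, or feed whatever else needs the values.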
Audio Element And Duration
Once I had that in place, I set up a “canplaythrough” event listener to tell me when the song had been loaded. This event fires when the browser estimates that it has loaded enough of the song to play it all the way through without stopping to buffer. From there, I read the duration and then, using momentjs, I convert the duration from seconds into a more useful “hh:mm:ss” format.
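A minimal sketch of this step might look like the following. It swaps momentjs for a small hand-written formatter so the example has no dependencies, and the names formatDuration, waitForDuration and logDurationOf are invented for illustration rather than taken from the SignalLeaf code.

```javascript
// Format a duration given in seconds as "hh:mm:ss" (fractions are dropped).
function formatDuration(totalSeconds) {
  const s = Math.floor(totalSeconds);
  const pad = (n) => String(n).padStart(2, "0");
  const hours = Math.floor(s / 3600);
  const minutes = Math.floor((s % 3600) / 60);
  const seconds = s % 60;
  return [hours, minutes, seconds].map(pad).join(":");
}

// Resolve with audio.duration (in seconds) once "canplaythrough" fires.
function waitForDuration(audio) {
  return new Promise((resolve, reject) => {
    audio.addEventListener("canplaythrough", () => resolve(audio.duration), { once: true });
    audio.addEventListener("error", () => reject(new Error("audio failed to load")), { once: true });
  });
}

// Browser usage: load a URL into a fresh audio element and log its duration.
function logDurationOf(url) {
  const audio = new Audio();
  const pending = waitForDuration(audio); // attach listeners before setting src
  audio.src = url;
  return pending.then((seconds) => console.log(formatDuration(seconds)));
}
```

For example, formatDuration(372) returns "00:06:12". The listeners are attached before src is assigned so the event cannot be missed.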
This worked well for files hosted somewhere on the web, but I needed to load the file from the local machine of the person using SignalLeaf, before it was uploaded anywhere. It turns out there are a couple of options for this.
First Attempt: FileReader And Data URL
The first thing I tried to do was read the file contents into memory and create a data URL with the FileReader API. I found this article on HTML5 Rocks, and it gave me all the information I needed for this attempt. So I set up a FileReader and built a data URL, which is a base64-encoded version of a binary file.
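Here is a rough sketch of what that first attempt can look like, assuming a File object from a file input and an existing audio element. The helper names readAsDataUrl and loadViaDataUrl are hypothetical, and the Promise wrapper is a convenience added for this example.

```javascript
// Read a File (or Blob) into a base64 data URL using the FileReader API.
function readAsDataUrl(file) {
  return new Promise((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => resolve(reader.result); // "data:audio/mpeg;base64,..."
    reader.onerror = () => reject(reader.error);
    reader.readAsDataURL(file);
  });
}

// Point an audio element at the data URL. This works for tiny clips,
// but the whole file ends up inside one very long string.
function loadViaDataUrl(file, audio) {
  return readAsDataUrl(file).then((dataUrl) => {
    audio.src = dataUrl;
    return audio;
  });
}
```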
My first test was successful! I was able to get the file information that I wanted. But when I tried to use this on files that were more than a few seconds long, I noticed the browser was locking up. The larger the file, the longer it locked up.
It turns out this is a really bad idea for large audio files. Even with an audio file that is only 6 minutes long, my browser locked up for 3 or 4 seconds. Now imagine a podcast episode that is 30 or 40 minutes long. It would likely lock up the browser for 20 or 30 seconds, or even crash the browser. The problem is that using the data URL gives you a base64-encoded version of the file, which is then stuffed into the src attribute of the audio element.
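To get a feel for the size of that string, here is a small back-of-the-envelope sketch. Base64 turns every 3 bytes into 4 characters, so the encoded text is about a third larger than the file. The 50 MB figure is only an illustrative input, and this counts characters; it does not measure how long any particular browser takes.

```javascript
// Number of characters in the padded base64 encoding of `byteCount` bytes.
function base64Length(byteCount) {
  return 4 * Math.ceil(byteCount / 3);
}

const fiftyMegabytes = 50 * 1024 * 1024; // 52,428,800 bytes
console.log(base64Length(fiftyMegabytes)); // 69905068 characters in one string
```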
Fixing It With Object URLs
Shortly after running into this problem and complaining about it on Twitter, Chris Wagner suggested I use URL.createObjectURL instead. I’ve used URL.createObjectURL in the past, but had forgotten about it. The last time I used it was when I was helping to build the Hilo.js sample app for Microsoft Patterns & Practices.
The gist of the URL.createObjectURL function is that it returns a URL that points to a memory location in the browser for an object. This object URL can be used in most places where a URL is supported. I used this to load large image files into memory for that project, and it makes sense to use it for a large audio file as well.
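A minimal sketch of the idea, assuming a File from a file input and an existing audio element (the function name loadViaObjectUrl is made up for this example):

```javascript
// Point an audio element at a local File without building a base64 string.
function loadViaObjectUrl(file, audio) {
  const objectUrl = URL.createObjectURL(file); // a short "blob:..." URL
  audio.src = objectUrl;
  return objectUrl; // hand it back so the caller can revoke it later
}
```

The URL itself stays short no matter how large the file is, because it only refers to the file rather than containing it.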
Now when I select a file, the audio information is parsed and displayed nearly instantly. It doesn’t seem to matter whether I load a 5 MB or 50 MB audio file, either. The browser is pulling the file into memory, and the audio element reads it from there.
One important note: if you’re building a single page application and using createObjectURL, you will also need to know about revokeObjectURL. I spent 2 weeks profiling the Hilo.js app memory usage, to figure out that we were leaking memory everywhere with our use of createObjectURL. Revoking the URL will release the memory, allowing your single page app to clean itself up.
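One way to make the cleanup hard to forget is to revoke the URL in the same function that creates it. The sketch below is an assumption about how that could be structured, not the Hilo.js or SignalLeaf code, and durationOfLocalFile is a hypothetical name. It revokes as soon as the duration has been read, which is fine when the duration is all that is needed.

```javascript
// Read the duration (in seconds) of a local audio File, then release the object URL.
function durationOfLocalFile(file, audio = new Audio()) {
  const objectUrl = URL.createObjectURL(file);
  return new Promise((resolve, reject) => {
    // Release the browser's reference to the file once we are done with it.
    const cleanUp = () => URL.revokeObjectURL(objectUrl);
    audio.addEventListener("canplaythrough", () => {
      cleanUp();
      resolve(audio.duration);
    }, { once: true });
    audio.addEventListener("error", () => {
      cleanUp();
      reject(new Error("could not load audio"));
    }, { once: true });
    audio.src = objectUrl; // set after the listeners are attached
  });
}
```

Because the revoke call sits on both the success and the error path, the URL is released whether or not the file turns out to be playable.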
A Complete Demo
With all that said and done, it’s fairly easy to get the complete set of audio information that I need from an .mp3 file. Here’s a complete demo of the code, which can be found at this JSFiddle.
Find an .mp3 file, or another audio file that the audio element in your browser supports, and select it to see its information.
By the way, if you want to know more about SignalLeaf and how it is simplifying podcast audio hosting, be sure to sign up for the mailing list at the bottom of SignalLeaf.com.