Getting Audio File Information With HTML’s File API And Audio Element

I’ve been cranking out features and capabilities on SignalLeaf for just over a month now, and it’s ever so close to being ready for some public beta testing. I do have one alpha tester, though. John Sonmez is slowly moving some of his “Get Up And Code” podcast over to SignalLeaf, and is providing feedback to me along the way. One of the features that he asked for, after uploading his first file, was automatic file information gathering from the .mp3 file. It turns out information such as the file’s content type, size and duration can all be obtained quite easily with modern browsers – no server side code needed. This was a huge relief for me as I’m hosting SignalLeaf on Heroku and did not want to stand up a real server somewhere, to store files and process the audio file.

Basic File Info With The File API

Most of the file information that I need can be gathered with the HTML File API. File name, size and content type, for example, can be pulled from this easily. The only thing you need is an <input type="file"> element and a little bit of JavaScript (jQuery in this case).

<input type="file" id="file" />

<p>
  <label>File Name:</label>
  <span id="filename"></span>
</p>

<p>
  <label>File Type:</label>
  <span id="filetype"></span>
</p>

<p>
  <label>File Size:</label>
  <span id="filesize"></span>
</p>

$("#file").change(function(e){
    var file = e.currentTarget.files[0];
   
    $("#filename").text(file.name);
    $("#filetype").text(file.type);
    $("#filesize").text(file.size);    
});

When the “change” event on the file input fires, you can grab the file information using the .files property of the input element. In this case, I’m using jQuery to capture the “change” event, but I’m grabbing the e.currentTarget directly after that. This gives me the HTML file input element, which has the .files property on it. This property is a FileList, an array-like list of the selected files, for scenarios where you are selecting multiple files. In this case, there is only a single file selected so I’m grabbing the first item in the list.

Once I have the file object, I can get the .name, .size and .type information and populate the HTML in my document, store these in <input> fields, or do whatever else I need to do with them. But while this information is great, it isn’t enough. I need to know the song duration as well, so that podcasters uploading an mp3 file to SignalLeaf won’t have to manually enter the song duration.
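Since file.size is reported in bytes, it usually helps to format it before showing it. Here is a minimal sketch of how that might look, along with a way to walk every entry in the .files list rather than just the first. The helper names formatBytes and describeFiles are my own for illustration and are not part of the File API.

```javascript
// Hypothetical helper (not part of the File API): turns a byte count,
// such as file.size, into a short human-readable string.
function formatBytes(bytes) {
  var units = ["B", "KB", "MB", "GB", "TB"];
  var value = bytes;
  var i = 0;
  // Divide by 1024 until the value fits the current unit.
  while (value >= 1024 && i < units.length - 1) {
    value = value / 1024;
    i++;
  }
  // Plain number for bytes, one decimal place for larger units.
  return (i === 0 ? String(value) : value.toFixed(1)) + " " + units[i];
}

// Hypothetical helper: copies name, type and size from each entry of a
// FileList (or any array-like of objects with those three properties).
function describeFiles(fileList) {
  return Array.prototype.map.call(fileList, function (file) {
    return {
      name: file.name,
      // file.type is an empty string when the browser cannot determine it.
      type: file.type || "unknown",
      size: formatBytes(file.size)
    };
  });
}
```

In the change handler above, this could be used as $("#filesize").text(formatBytes(file.size)); in place of the raw byte count.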

Audio Element And Duration

After some tweeting and asking if there was a JavaScript library to get song duration in the browser, I found out that the HTML <audio> element has this built in. Once the <audio> element has loaded the file set in its “src” attribute, I can read the .duration of the element, which returns the song duration in seconds.

<audio id="my-audio" src="http://example.com/somefile.mp3"></audio>

<script>
  var myAudio = document.getElementById("my-audio");
  myAudio.duration; // => duration, in seconds (NaN until the metadata has loaded)
</script>

Once I had that in place, I set up a “canplaythrough” event listener to tell me when the song had been loaded. This event fires when the browser estimates that it has loaded enough of the song to play all the way through without stopping to buffer. From there, I read the duration and then, using momentjs, I convert the duration from seconds into a more useful “hh:mm:ss” format.

$("#my-audio").on("canplaythrough", function(e){
    var seconds = e.currentTarget.duration;
    var duration = moment.duration(seconds, "seconds");
    
    var time = "";
    var hours = duration.hours();
    if (hours > 0) { time = hours + ":" ; }
    
    time = time + duration.minutes() + ":" + duration.seconds();
    $("#duration").text(time);
});
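One thing to watch for: the handler above joins the raw minute and second values, so a duration of 3 minutes and 5 seconds would come out as “3:5”. Here is a rough sketch of a zero-padded formatter that does the same job without momentjs. The name formatDuration is hypothetical, and returning an empty string for an unknown duration is just one possible choice.

```javascript
// Hypothetical helper: formats a duration in seconds as "m:ss" or "h:mm:ss".
function formatDuration(totalSeconds) {
  // An <audio> element reports NaN before its metadata has loaded,
  // so anything that is not a finite, non-negative number yields "".
  if (!isFinite(totalSeconds) || totalSeconds < 0) { return ""; }

  var whole = Math.floor(totalSeconds); // drop fractional seconds
  var hours = Math.floor(whole / 3600);
  var minutes = Math.floor((whole % 3600) / 60);
  var seconds = whole % 60;

  // Pads single digits with a leading zero: 5 becomes "05".
  function pad(n) { return n < 10 ? "0" + n : String(n); }

  if (hours > 0) {
    return hours + ":" + pad(minutes) + ":" + pad(seconds);
  }
  return minutes + ":" + pad(seconds);
}
```

Inside the “canplaythrough” handler this would be called as $("#duration").text(formatDuration(e.currentTarget.duration));.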

This worked well for files hosted somewhere on the web, but I needed to load the file from the local machine of the person using SignalLeaf, before it was uploaded anywhere. It turns out there are a couple of options for this.

First Attempt: FileReader And Data URL

The first thing I tried to do was read the file contents into memory and create a Data URL with the FileReader API. I found this article on HTML5 Rocks, and it gave me all the information I needed for this attempt. So I set up a FileReader and built a Data URL – a base64 encoded version of a binary file.

$("#my-audio").on("canplaythrough", function(e){
    var seconds = e.currentTarget.duration;
    var duration = moment.duration(seconds, "seconds");
    
    var time = "";
    var hours = duration.hours();
    if (hours > 0) { time = hours + ":" ; }
    
    time = time + duration.minutes() + ":" + duration.seconds();
    $("#duration").text(time);
});

$("#file").change(function(e){
  var file = e.currentTarget.files[0];

  var reader = new FileReader();
  
  reader.onload = function(e){
    // e.target.result holds the file as a base64 encoded Data URL
    $("#my-audio").prop("src", e.target.result);
  };
  
  reader.readAsDataURL(file);
});

My first test was successful! I was able to get the file information that I wanted. But when I tried to use this on files that were more than a few seconds long, I noticed the browser was locking up. The larger the file, the longer it locked up.

It turns out this is a really bad idea for large audio files. Even with an audio file that is only 6 minutes long, my browser locked up for 3 or 4 seconds. Now imagine a podcast episode that is 30 or 40 minutes long. It would likely lock up the browser for 20 or 30 seconds, or even crash the browser. The problem is that using the Data URL encoded file gives you a base64 encoded version of the file, which is roughly a third larger than the original and is then stuffed into the <audio> tag’s “src” property. You can imagine a browser not liking the 30 or 40 megs worth of data, and having a hard time encoding it and storing the string in this element.
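To put a number on that, here is a small sketch that estimates how long the Data URL string gets for a given file size. The function names are hypothetical. The only fact it relies on is that base64 emits 4 characters for every 3 bytes of input, padded up to a multiple of 4.

```javascript
// Length, in characters, of the base64 text for `byteLength` bytes of input.
// Base64 produces 4 characters per 3 bytes, padded to a multiple of 4.
function base64Length(byteLength) {
  return Math.ceil(byteLength / 3) * 4;
}

// Length of a Data URL of the form "data:<mimeType>;base64,<payload>".
function dataUrlLength(byteLength, mimeType) {
  var prefix = "data:" + mimeType + ";base64,";
  return prefix.length + base64Length(byteLength);
}
```

For a 40 MB file, base64Length(40 * 1024 * 1024) comes to roughly 56 million characters, all of which would have to be built as a single string and assigned to the “src” property.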

Fixing It With Object URLs

Shortly after running into this problem and complaining about it on Twitter, Chris Wagner suggested I use URL.createObjectURL instead. I’ve used URL.createObjectURL in the past, but had forgotten about it. The last time I used it was when I was helping to build the Hilo.js sample app for Microsoft Patterns & Practices.

The gist of the URL.createObjectURL function is that it returns a URL that refers to an object, such as a File or Blob, held by the browser. This object URL can be used in most places where a URL is supported. I used this to load large image files for that project, and it makes sense to use it for a large audio file as well.

$("#file").change(function(e){
    var file = e.currentTarget.files[0];
   
    var objectUrl = URL.createObjectURL(file);
    $("#my-audio").prop("src", objectUrl);
});

Now when I select a file, the audio information is parsed and displayed nearly instantly. It doesn’t seem to matter whether I load a 5 meg or 50 meg audio file, either. The object URL is just a reference to the selected file, so the <audio> element can read what it needs from it without the file first being encoded into one giant string.

One important note: if you’re building a single page application and using createObjectURL, you will also need to know about revokeObjectURL. I spent 2 weeks profiling the Hilo.js app memory usage, to figure out that we were leaking memory everywhere with our use of createObjectURL. Revoking the URL will release the memory, allowing your single page app to clean itself up.
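Here is a minimal sketch of how the create and revoke calls might be paired when all you need is the duration. The readDuration name, its callback signature and the optional urlApi parameter are my own assumptions for illustration. It also listens for “loadedmetadata” rather than “canplaythrough”, since the duration is already available at that point. Note that once the URL is revoked the element can no longer load from it, so this only fits cases where the file does not need to keep playing.

```javascript
// Hypothetical helper: reads the duration of a local audio file through an
// object URL, then revokes the URL so the browser can release it.
// `urlApi` is optional and defaults to the global URL object.
function readDuration(file, audio, onDone, urlApi) {
  urlApi = urlApi || URL;
  var objectUrl = urlApi.createObjectURL(file);

  // Detach both listeners and release the object URL.
  function cleanup() {
    audio.removeEventListener("loadedmetadata", onLoaded);
    audio.removeEventListener("error", onError);
    urlApi.revokeObjectURL(objectUrl);
  }

  function onLoaded() {
    var seconds = audio.duration; // read before cleaning up
    cleanup();
    onDone(null, seconds);
  }

  function onError() {
    cleanup();
    onDone(new Error("Could not load the audio file"));
  }

  audio.addEventListener("loadedmetadata", onLoaded);
  audio.addEventListener("error", onError);
  audio.src = objectUrl; // starts loading the file
}
```

With the markup from this post, it could be called from the file input’s change handler as readDuration(file, document.getElementById("my-audio"), function(err, seconds){ ... });.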

A Complete Demo

With all that said and done, it’s fairly easy to get the complete set of audio information that I need from a .mp3 file. Here’s a complete demo of the code, which can be found at this JSFiddle.

Find a .mp3 file, or other audio file that the <audio> element supports. Once you have selected it with the file chooser, you will see the file name, type, size and duration loaded into the HTML – all thanks to the File API, <audio> element and URL.createObjectURL.

By the way, if you want to know more about SignalLeaf and how it is simplifying podcast audio hosting, be sure to sign up for the mailing list at the bottom of SignalLeaf.com.



About Derick Bailey

Derick Bailey is an entrepreneur, problem solver (and creator? :P ), software developer, screencaster, writer, blogger, speaker and technology leader in central Texas (north of Austin). He runs SignalLeaf.com - the amazingly awesome podcast audio hosting service that everyone should be using, and WatchMeCode.net where he throws down the JavaScript gauntlets to get you up to speed. He has been a professional software developer since the late 90's, and has been writing code since the late 80's. Find me on twitter: @derickbailey, @mutedsolutions, @backbonejsclass Find me on the web: SignalLeaf, WatchMeCode, Kendo UI blog, MarionetteJS, My Github profile, On Google+.
  • Teddy

    Nice post. I updated the fiddle to call URL.revokeObjectURL since you made the point of mentioning it in the article: http://jsfiddle.net/s4P2v/2/

    • http://mutedsolutions.com Derick Bailey

      #facepalm – thanks. :) I updated my fiddle to handle this.

  • http://stackoverflow.com/users/425275/ime-vidas Šime Vidas

    The HTML code is barely visible in the code blocks (dark blue text on black bg; screen is here: https://twitter.com/simevidas/status/383456130631868416 ). Can you change the syntax highlighter theme?

  • Pat Hall

    This is great, I just tried URL.createObjectURL on a 1.3 Gig file, thinking it would crash the browser, but it worked a trick.