Experiments with audio, part II.I

I'm working on a project to try to expose audio spectrum data from Firefox's audio element.  Today I want to give an update on yesterday's progress, since there are pretty pictures, and then ask some questions.

Yesterday I wrote about our first steps to find and extract audio spectrum data from the <audio> element.  At the end of that post I wondered aloud whether the numbers I'd produced were meaningful.

After that post I spent some time working with Al MacDonald and Thomas Saunders, my partners in these experiments, and a number of interesting things happened.  First, what were simply numbers to me proved meaningful to them: after I helped them build Firefox with my changes, they started playing with the data.  Al made a simple audio test case, and Thomas worked the data into a JavaScript-friendly form.  Next, Al analyzed the audio's waveform in Audacity before creating a real-time canvas visualization of the data we'd extracted from the browser.  The results speak for themselves.
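To give a flavor of what that looks like, here's a rough sketch of the kind of canvas drawing such a visualization involves.  The data shape is my assumption (an array of floats normalized to 0..1), and the names drawSpectrum and spectrum-canvas are made up for illustration; this isn't the code Al actually wrote.

    // Rough sketch: draw one frame of spectrum data as vertical bars.
    // Assumes "spectrum" is an array of floats in the range 0..1.
    var canvas = document.getElementById("spectrum-canvas");
    var ctx = canvas.getContext("2d");

    function drawSpectrum(spectrum) {
      ctx.clearRect(0, 0, canvas.width, canvas.height);
      var barWidth = canvas.width / spectrum.length;
      for (var i = 0; i < spectrum.length; i++) {
        var barHeight = spectrum[i] * canvas.height;
        // Bars grow upward from the bottom edge, with a 1px gap.
        ctx.fillRect(i * barWidth, canvas.height - barHeight,
                     barWidth - 1, barHeight);
      }
    }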

Having an early success is encouraging, but as one of my students is fond of saying, success brings you to your next problem.  Knowing that this data is meaningful means I now need to figure out the right way to expose it in the DOM.  When I get this data I'm deep in C++, nowhere near the JS running in the browser.  What I need is a proper API for making this data available.

My choice of words is important: "right way," "proper API."  I could (and probably will, for my next test) just drop-kick the data across the content boundary.  But what's the right way to do this?  I spent some time last night investigating our implementation of DOM events, thinking that maybe I should pass the data out within a custom DOM event.  However, I don't see any existing events that use this model; there doesn't seem to be much data pushed with events at all.  Another option is to dispatch a plain event and then fill a buffer that script can read via a getAudioData() call.  But even once I get this working (if it is the right thing to do), I'll have to worry about keeping that data in sync with the audio the user is actually hearing.  A little bit out of sync is almost as bad as totally random, especially if you're trying to time visualizations or other UI updates to sound.
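To make the two options concrete, here's a sketch of what each might look like from page script, reusing the drawSpectrum() sketch above.  Everything here is hypothetical: the audiowritten event name, the spectrumData property, and the getAudioData() method are placeholders for whatever we end up exposing, not a real Firefox API.

    var audio = document.getElementsByTagName("audio")[0];

    // Option 1: the DOM event itself carries the data.
    audio.addEventListener("audiowritten", function (event) {
      drawSpectrum(event.spectrumData);   // hypothetical payload
    }, false);

    // Option 2: the event is only a signal; script pulls the data
    // from a buffer the browser has filled behind the scenes.
    audio.addEventListener("audiowritten", function (event) {
      drawSpectrum(audio.getAudioData()); // hypothetical accessor
    }, false);

Whichever model wins, each delivery will probably also need to carry some notion of time, such as the sample offset or playback position it corresponds to, or script will have no way to line its drawing up with what the user is hearing.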

So I'm convinced we're on track, and also feeling a bit lost.  I know that "the perfect is the enemy of the good," so I won't halt my work while I settle all these questions.  It's clear to me that I'm going to have to get this wrong before I get it right.  But I'd value input from those closer to our DOM implementation, and from the JS community, on which paths to explore.  Thankfully, Chris Blizzard has started that ball rolling by introducing us to some more people.  For me, the most enjoyable part of experiments like these is the chance to work in community.
