Low Latency & Polyphonic Audio in PhoneGap

UPDATE 10/07/2013: This plugin has been updated to support PhoneGap 3.0 method signatures and command line interface.  You can access the latest at: https://github.com/triceam/LowLatencyAudio


If you have ever tried to develop any kind of application using HTML5 audio that is widely supported, then you have likely pulled all the hair from your head. In its current state, HTML5 Audio is wrought with issues… lack of consistent codec support across browsers & operating systems, no polyphony (a single audio clip can not be played on top of itself), and lack of concurrency (on some of the leading mobile browsers you can only play one audio file at a time, if at all). Even the leading HTML5 games for desktop browsers don’t even use HTML5 audio (they use Flash). Don’t believe me? Just take a look at Angry Birds, Cut the Rope, or Bejeweled in a proxy/resource monitor…

The Problem

You want fast & responsive audio for your mobile applications.   This is especially the case for multimedia intensive and/or gaming applications.

HTML5 audio is not *yet* ready for prime-time. There are some great libraries like SoundManager, which can help you try to use HTML5 audio with a failover to Flash, but you are still limited without polyphony or concurrency. In desktop browsers, Flash fixes these issues, and Flash is still vastly superior to HTML5 for audio programming.

If you are building mobile applications, you can have great audio capabilities by developing apps with AIR. However, what if you aren’t using AIR? In native applications, you can access the underlying audio APIs and have complete control.

If you are developing mobile applications with PhoneGap, you can use the Media class, which works great. If you want polyphony, then you will have to do some work managing audio files for yourself, which can get tricky. You can also write native plugins that integrate with the audio APIs for the native operating systems, which is what i will be covering in this post.

Before continuing further, let’s take a minute to understand what I am talking about when I refer to concurrency, polyphony, and low-latency…

Concurrency

Concurrency in audio programming refers to the ability to play multiple audio resources simultaneously.  HTML5 in most mobile devices does not support this – not in iOS, not in Android.  In fact, HTML5 Audio does not work *at all* in Android 2.x and earlier.  Native APIs do support this, and so does PhoneGap’s Media class, which is based on Android MediaPlayer and iOS AVAudioPlayer.

Polyphony

Producing many sounds simultaneously; many-voiced.

In this case, polyphony is the production of multiple sounds simultaneously (I’m not referring to the concept of polyphany in music theory). In describing concurrency, I refered to the ability to play 2 separate sounds at the same time, where with polyphony I refer to the ability to play the same sound “on top” of itself. There can be multiple “voices” of the same sound. In the most literal of definitions concurrency could be considered a part of polyphony, and polyphony a part of concurrency… Hopefully you get what I’m trying to say. In its current state, HTML5 audio supports neither concurrency or polyphony.  The PhoneGap Media class does not support polyphony, however you can probably manage multiple media instances via javascript to achieve polyphonic behavior – this requires additional work in the JavaScript side of things to juggle resources.

Low Latency

Low latency refers to “human-unnoticeable delays between an input being processed and the corresponding output providing real time characteristics” according to wikipedia.   In this case, I refer to low latency audio, meaning that there is an imperceptible delay between when a sound is triggered, and when it actually plays.   This means that sounds will play when expected, not after a wait.   This means a bouncing ball sound should be heard as you see the ball bouncing on the screen.   Not after it has already bounced.

In HTML5, you can auto-load a sound so that it is ready when you need it, but don’t expect to play more than one at a time.  With the PhoneGap Media class, the audio file isn’t actually requested until you invoke “play”.   This occurs inside “startPlaying” on Android, and “play” on iOS.   What I wanted was a way to preload the audio so that it is immediately ready for use at the time it is needed.

The Solution

PhoneGap makes it really easy to build natively installed applications using a familiar paradim: HTML & JavaScript.   Luckily, PhoneGap also allows you to tie into native code using the native plugin model.   This enables you to write your own native code and expose that code to your PhoneGap application via a JavaScript interface… and that is exactly what I did to enable low-latency, concurrent, and polyphonic audio in a PhoneGap experience.

I created PhoneGap native plugins for Android and iOS that allow you to preload audio, and playback that audio quickly, with a very simple to use API.   I’ll get into details how this works further in the post, but you can get a pretty good idea of what I mean by viewing the following two videos.

The first is a basic “Drum Machine”.  You just tap the pads to play an audio sample.

The second is a simple user interface that allows you to layer lots of complex audio, mimicking scenarios that may occur within a video gaming context.

Assets used in this example from freesound.org.  See README for specific links & attribution.

You may have noticed a slight delay in this second video between the tap and the actual sounds.  This is because I am using “touchStart” events in the first example, and just using a normal <a href=”javascript:foo()”> link in the second.  There is always a delay for “normal” links in all multi-touch devices/environments because there has to be time for the device to detect a gesture event. You can bypass this delay in mobile web browsers by using touch events for all input.

Side Note:  I have also noticed that touch events are slightly slower to be recognized on Android devices than iOS.   My assumption is that this is related to specific device capabilities – this is more noticeable on the Amazon Kindle Fire than the Motorola Atrix.   The delay does not appear to be a delay in the actual audio playback.

How it works

The native plugins expose a very simple API for hooking into native Audio capabilities.   The basic usage is:

  • Preload the audio asset
  • Play the audio asset
  • When done, unload the audio asset to conserve resources

The basic components of a PhoneGap native plugin are:

  • A JavaScript interface
  • Corresponding Native Code classes
You can learn more about getting started with native plugins on the PhoneGap wiki.

Let’s start by examining the native plugin’s JavaScript API.  You can see that it just hands off the JavaScript calls to the native layer via PhoneGap:

[js]
var PGLowLatencyAudio = {

preloadFX: function ( id, assetPath, success, fail) {
return PhoneGap.exec(success, fail, "PGLowLatencyAudio", "preloadFX", [id, assetPath]);
},

preloadAudio: function ( id, assetPath, voices, success, fail) {
return PhoneGap.exec(success, fail, "PGLowLatencyAudio", "preloadAudio", [id, assetPath, voices]);
},

play: function (id, success, fail) {
return PhoneGap.exec(success, fail, "PGLowLatencyAudio", "play", [id]);
},

stop: function (id, success, fail) {
return PhoneGap.exec(success, fail, "PGLowLatencyAudio", "stop", [id]);
},

loop: function (id, success, fail) {
return PhoneGap.exec(success, fail, "PGLowLatencyAudio", "loop", [id]);
},

unload: function (id, success, fail) {
return PhoneGap.exec(success, fail, "PGLowLatencyAudio", "unload", [id]);
}
};[/js]

You would invoke the native functionality by first preloading the audio files BEFORE you need them:

[js]PGLowLatencyAudio.preloadAudio(‘background’, ‘assets/background.mp3’, 1);
PGLowLatencyAudio.preloadFX(‘explosion’, ‘assets/explosion.mp3’);
PGLowLatencyAudio.preloadFX(‘machinegun’, ‘assets/machine gun.mp3’);
PGLowLatencyAudio.preloadFX(‘missilestrike’, ‘assets/missle strike.mp3’);
PGLowLatencyAudio.preloadAudio(‘thunder’, ‘assets/thunder.mp3’, 1);[/js]

When you need to play an effect you just call either the play or loop functions, passing in the unique sound ID:

[js]PGLowLatencyAudio.play(‘background’);
PGLowLatencyAudio.play(‘explosion’);
PGLowLatencyAudio.play(‘machinegun’);[/js]

Next, let’s examine some intricacies of the plugin…   One thing to keep in mind is that I do not have callbacks to the phonegap app once a media asset is loaded.   If you need “loaded” callbacks, you will need to add those yourself.


preloadFX: function ( id, assetPath, success, fail)
params:

id – string unique ID for the audio file
assetPath – the relative path to the audio asset within the www directory
success – success callback function
fail – error/fail callback function

detail:

The preloadFX function loads an audio file into memory.  These are lower-level audio methods and have minimal overhead. These assets should be short (less than 5 seconds). These assets are fully concurrent and polyphonic.

On Android, assets that are loaded using preloadFX are managed/played using the Android SoundPool class. Sound files longer than 5 seconds may have errors including (not playing, clipped content, not looping) – all will fail silently on the device (debug output will be visible if connected to debugger).

On iOS, assets that are loaded using preloadFX are managed/played using System Sound Services from the AudioToolbox framework. Audio loaded using this function is played using AudioServicesPlaySystemSound. These assets should be short, and are not intended to be looped or stopped.


preloadAudio: function ( id, assetPath, voices, success, fail)
params:

id – string unique ID for the audio file
assetPath – the relative path to the audio asset within the www directory
voicesthe number of polyphonic voices available
success – success callback function
fail – error/fail callback function

detail:

The preloadAudio function loads an audio file into memory.  These have more overhead than assets laoded via preloadFX, and can be looped/stopped. By default, there is a single “voice” – only one instance that will be stopped & restarted when you hit play. If there are multiple voices (number greater than 0), it will cycle through voices to play overlapping audio.  You must specify multiple voices to have polyphonic audio – keep in mind, this takes up more device resources.

On Android, assets that are loaded using preloadAudio are managed/played using the Android MediaPlayer.

On iOS, assets that are loaded using preloadAudio are managed/played using AVAudioPlayer.


play: function (id, success, fail)
params:

id – string unique ID for the audio file
success – success callback function
fail – error/fail callback function

detail:

Plays an audio asset.  You only need to pass the audio ID, and the native plugin will determine the type of asset and play it.


loop: function (id, success, fail)
params:

id – string unique ID for the audio file
success – success callback function
fail – error/fail callback function

detail:

Loops an audio asset infinitely.  On iOS, this only works for assets loaded via preloadAudio.  This works for all asset types for Android, however it is recommended to keep usage consistent between platforms.


stop: function (id, success, fail)
params:

id – string unique ID for the audio file
success – success callback function
fail – error/fail callback function

detail:

Stops an audio file.  On iOS, this only works for assets loaded via preloadAudio.  This works for all asset types for Android, however it is recommended to keep usage consistent between platforms.

unload: function (id, success, fail)
params:

id – string unique ID for the audio file
success – success callback function
fail – error/fail callback function

detail:

Unloads an audio file from memory.   DO NOT FORGET THIS!  Otherwise, you will cause memory leaks.


I’m not just doing this for myself, the audio is completely open source for you to take advantage of as well.  You can download the full code, as well as all examples from github at github:

UPDATE 10/07/2013: This plugin has been updated to support PhoneGap 3.0 method signatures and command line interface.  You can access the latest at: https://github.com/triceam/LowLatencyAudio

Enjoy!

219 replies on “Low Latency & Polyphonic Audio in PhoneGap”