OpenSL ES for Android

This article describes the Android native audio APIs based on the Khronos Group OpenSL ES 1.0.1 standard, as of Android API level 9 (Android version 2.3).

OpenSL ES provides a C language interface that is also callable from C++, and exposes features similar to the audio portions of these Android APIs callable from Java programming language code:

As with all of the Android Native Development Kit (NDK), the primary purpose of OpenSL ES for Android is to facilitate the implementation of shared libraries to be called from Java programming language code via Java Native Interface (JNI). The NDK is not intended for writing pure C/C++ applications. That said, OpenSL ES is a full-featured API, and we expect that you should be able to accomplish most of your audio needs using only this API, without up-calls to Java.

Note: though based on OpenSL ES, Android native audio at API level 9 is not a conforming implementation of any OpenSL ES 1.0.1 profile (game, music, or phone). This is because Android does not implement all of the features required by any one of the profiles. Any known cases where Android behaves differently than the specification are described in section "Android extensions" below.

Getting started

Example code

Recommended

Supported and tested example code, usable as a model for your own code, is located in NDK folder platforms/android-9/samples/native-audio/.

Not recommended

The OpenSL ES 1.0.1 specification contains example code in the appendices (see section "References" below for the link to this specification). However, the examples in Appendix B: Sample Code and Appendix C: Use Case Sample Code use features not supported by Android API level 9. Some examples also contain typographical errors, or use APIs that are likely to change. Proceed with caution in referring to these; though the code may be helpful in understanding the full OpenSL ES standard, it should not be used as is with Android.

Adding OpenSL ES to your application source code

OpenSL ES is a C API, but is callable from both C and C++ code.

At a minimum, add the following line to your code:

#include <SLES/OpenSLES.h>
If you use Android extensions, also include these headers:
#include <SLES/OpenSLES_Android.h>
#include <SLES/OpenSLES_AndroidConfiguration.h>

Makefile

Modify your Android.mk as follows:
LOCAL_LDLIBS += -lOpenSLES

Audio content

There are many ways to package audio content for your application, including:
Resources
By placing your audio files into the res/raw/ folder, they can be accessed easily by the associated APIs for Resources. However there is no direct native access to resources, so you will need to write Java programming language code to copy them out before use.
Assets
By placing your audio files into the assets/ folder, they will be directly accessible by the Android native asset manager APIs. See the header files android/asset_manager.h and android/asset_manager_jni.h for more information on these APIs, which are new for API level 9. The example code located in NDK folder platforms/android-9/samples/native-audio/ uses these native asset manager APIs in conjunction with the Android file descriptor data locator.
Network
You can use the URI data locator to play audio content directly from the network. However, be sure to read section "Security and permissions" below.
Local filesystem
The URI data locator supports the file: scheme for local files, provided the files are accessible by the application. Note that the Android security framework restricts file access via the Linux user ID and group ID mechanism.
Recorded
Your application can record audio data from the microphone input, store this content, and then play it back later. The example code uses this method for the "Playback" clip.
Compiled and linked inline
You can link your audio content directly into the shared library, and then play it using an audio player with buffer queue data locator. This is most suitable for short PCM format clips. The example code uses this technique for the "Hello" and "Android" clips. The PCM data was converted to hex strings using a bin2c tool (not supplied).
Synthesis
Your application can synthesize PCM data on the fly and then play it using an audio player with buffer queue data locator. This is a relatively advanced technique, and the details of audio synthesis are beyond the scope of this article.
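Although synthesis in general is out of scope here, the shape of the data is simple. The following is a minimal sketch, not taken from the NDK sample, that fills a buffer of mono 16-bit signed samples with a square wave. The function name synth_square_wave and its parameters are hypothetical; amplitude is assumed to be non-negative.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fills `frames` mono 16-bit signed samples with a
 * square wave. `phase` carries the position within the current cycle across
 * calls, so that consecutive buffers join without a discontinuity. */
void synth_square_wave(int16_t *out, size_t frames, uint32_t sample_rate_hz,
                       uint32_t tone_hz, int16_t amplitude, uint32_t *phase)
{
    if (tone_hz == 0) {
        /* No tone requested: emit silence. */
        for (size_t i = 0; i < frames; ++i) {
            out[i] = 0;
        }
        return;
    }

    /* Samples per cycle; at least 2 so that each half-cycle gets a sample. */
    uint32_t period = sample_rate_hz / tone_hz;
    if (period < 2) {
        period = 2;
    }

    for (size_t i = 0; i < frames; ++i) {
        /* First half of the cycle is high, second half is low. */
        out[i] = (*phase < period / 2) ? amplitude : (int16_t)-amplitude;
        *phase = (*phase + 1) % period;
    }
}
```

Each filled buffer would then be enqueued on the player's buffer queue, with the PCM data format configured to match the sample rate and channel count used here.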
Finding or creating useful audio content for your application is beyond the scope of this article, but see the "References" section below for some suggested web search terms.
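For the "Compiled and linked inline" option above, the bin2c tool is not supplied, but the conversion it performs can be sketched in a few lines. This is an illustration only: the function name bin2c_format, the 12-bytes-per-line layout, and the emitted array declaration are assumptions, and the sample's own data may be laid out differently.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

enum { BYTES_PER_LINE = 12 };

/* Hypothetical helper: formats `len` bytes of `data` as a C array definition
 * named `name` into `dst`. Returns the number of characters written (not
 * counting the terminating NUL), or -1 if an argument is invalid or `dst`
 * is too small. */
int bin2c_format(char *dst, size_t cap, const char *name,
                 const unsigned char *data, size_t len)
{
    if (dst == NULL || name == NULL || strlen(name) == 0 ||
        data == NULL || len == 0) {
        return -1;  /* an empty initializer list is not portable C */
    }

    int n = snprintf(dst, cap, "static const unsigned char %s[] = {\n", name);
    if (n < 0 || (size_t)n >= cap) {
        return -1;
    }
    size_t used = (size_t)n;

    for (size_t i = 0; i < len; ++i) {
        int line_start = (i % BYTES_PER_LINE) == 0;
        int line_end = (i + 1 == len) || ((i + 1) % BYTES_PER_LINE) == 0;
        n = snprintf(dst + used, cap - used, "%s0x%02x%s",
                     line_start ? "    " : "", (unsigned)data[i],
                     line_end ? ",\n" : ", ");
        if (n < 0 || (size_t)n >= cap - used) {
            return -1;
        }
        used += (size_t)n;
    }

    n = snprintf(dst + used, cap - used, "};\n");
    if (n < 0 || (size_t)n >= cap - used) {
        return -1;
    }
    return (int)(used + (size_t)n);
}
```

The resulting text can be saved as a source file and compiled into the shared library. For a clip of realistic size, a version that streams to a file rather than to a fixed buffer would be more practical.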

Note that it is your responsibility to ensure that you are legally permitted to play or record content, and that there may be privacy considerations for recording content.

Debugging

For robustness, we recommend that you examine the SLresult value which is returned by most APIs. Use of assert vs. more advanced error handling logic is a matter of coding style and the particular API; see the Wikipedia article on assert for more information. In the supplied example, we have used assert for "impossible" conditions which would indicate a coding error, and explicit error handling for others which are more likely to occur in production.

Many API errors result in a log entry, in addition to the non-zero result code. These log entries provide additional detail which can be especially useful for the more complex APIs such as Engine::CreateAudioPlayer.

Use adb logcat, the Eclipse ADT plugin LogCat pane, or ddms logcat to see the log.

Supported features from OpenSL ES 1.0.1

This section summarizes available features in this API level. In some cases, there are limitations which are described in the next sub-section.

Global entry points

Supported global entry points:

Objects and interfaces

The following figure indicates objects and interfaces supported by Android's OpenSL ES implementation. A green cell means the feature is supported.

Figure: Supported objects and interfaces

Limitations

This section details limitations with respect to the supported objects and interfaces from the previous section.

Buffer queue data locator

An audio player or recorder with buffer queue data locator supports PCM data format only.

Device data locator

The only supported use of an I/O device data locator is when it is specified as the data source for Engine::CreateAudioRecorder. It should be initialized using these values, as shown in the example:
SLDataLocator_IODevice loc_dev =
  {SL_DATALOCATOR_IODEVICE, SL_IODEVICE_AUDIOINPUT,
  SL_DEFAULTDEVICEID_AUDIOINPUT, NULL};

Dynamic interface management

RemoveInterface and ResumeInterface are not supported.

Effect combinations

It is meaningless to have both environmental reverb and preset reverb on the same output mix.

The platform may ignore effect requests if it estimates that the CPU load would be too high.

Effect send

SetSendLevel supports a single send level per audio player.

Environmental reverb

Environmental reverb does not support the reflectionsDelay, reflectionsLevel, or reverbDelay fields of struct SLEnvironmentalReverbSettings.

MIME data format

The MIME data format can be used with URI data locator only, and only for player (not recorder).

The Android implementation of OpenSL ES requires that mimeType be initialized to either NULL or a valid UTF-8 string, and that containerType be initialized to a valid value. In the absence of other considerations, such as portability to other implementations, or content format which cannot be identified by header, we recommend that you set the mimeType to NULL and containerType to SL_CONTAINERTYPE_UNSPECIFIED.

Supported formats include WAV PCM, WAV alaw, WAV ulaw, MP3, Ogg Vorbis, AAC LC, HE-AACv1 (aacPlus), HE-AACv2 (enhanced aacPlus), and AMR [provided these are supported by the overall platform, and AAC formats must be located within an MP4 container]. MIDI is not supported. WMA is not part of the open source release, and compatibility with Android OpenSL ES has not been verified.

The Android implementation of OpenSL ES does not support direct playback of DRM or encrypted content; if you want to play this, you will need to convert to cleartext in your application before playing, and enforce any DRM restrictions in your application.

Object

Resume, RegisterCallback, AbortAsyncOperation, SetPriority, GetPriority, and SetLossOfControlInterfaces are not supported.

PCM data format

The PCM data format can be used with buffer queues only. Supported PCM playback configurations are 8-bit unsigned or 16-bit signed, mono or stereo, little endian byte ordering, and these sample rates: 8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, or 48000 Hz. For recording, the supported configurations are device-dependent; however, 16000 Hz mono 16-bit signed is usually available.

Note that the field samplesPerSec is actually in units of milliHz, despite the misleading name. To avoid accidentally using the wrong value, you should initialize this field using one of the symbolic constants defined for this purpose, such as SL_SAMPLINGRATE_44_1.
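To make the unit mismatch concrete, the following sketch converts a rate in Hz to the milliHz value that samplesPerSec expects, and rejects rates that are not in the playback list above. The helper names are hypothetical and are not part of OpenSL ES; in application code the symbolic constants remain the safer choice.

```c
#include <stddef.h>
#include <stdint.h>

/* Playback sample rates listed in this article, in Hz. */
static const uint32_t kPlaybackRatesHz[] = {
    8000, 11025, 12000, 16000, 22050, 24000, 32000, 44100, 48000
};

/* Hypothetical helper: returns 1 if hz is one of the listed playback rates. */
int is_listed_playback_rate(uint32_t hz)
{
    size_t count = sizeof(kPlaybackRatesHz) / sizeof(kPlaybackRatesHz[0]);
    for (size_t i = 0; i < count; ++i) {
        if (kPlaybackRatesHz[i] == hz) {
            return 1;
        }
    }
    return 0;
}

/* Hypothetical helper: converts Hz to milliHz, the unit of samplesPerSec.
 * Returns 0 for a rate that is not in the list above. */
uint32_t samples_per_sec_millihz(uint32_t hz)
{
    return is_listed_playback_rate(hz) ? hz * 1000u : 0u;
}
```

For example, samples_per_sec_millihz(44100) yields 44100000, whereas passing 44100 directly as samplesPerSec would describe a rate of 44.1 Hz.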

Playback rate

A single playback rate range from 500 per mille to 2000 per mille inclusive is supported, with property SL_RATEPROP_NOPITCHCORAUDIO.
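Because rates are expressed in per mille (1000 is normal speed), an application that derives a rate from user input may want to clamp it before handing it to the playback rate interface. A minimal sketch follows; the helper name clamp_playback_rate is hypothetical, and the 16-bit signed return type is chosen on the assumption that it matches the per mille type in the OpenSL ES headers.

```c
#include <stdint.h>

/* Rate range stated in this article, in per mille (1000 = normal speed). */
enum { RATE_MIN_PERMILLE = 500, RATE_MAX_PERMILLE = 2000 };

/* Hypothetical helper: clamps a requested rate to the supported range. */
int16_t clamp_playback_rate(int32_t requested_permille)
{
    if (requested_permille < RATE_MIN_PERMILLE) {
        return RATE_MIN_PERMILLE;
    }
    if (requested_permille > RATE_MAX_PERMILLE) {
        return RATE_MAX_PERMILLE;
    }
    return (int16_t)requested_permille;
}
```

With this helper, a request for quarter speed (250) is raised to 500 and a request for triple speed (3000) is lowered to 2000.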

Record

The SL_RECORDEVENT_HEADATLIMIT and SL_RECORDEVENT_HEADMOVING events are not supported.

Seek

SetLoop enables whole file looping. The startPos and endPos parameters are ignored.

URI data locator

The URI data locator can be used with MIME data format only, and only for an audio player (not audio recorder). Supported schemes are http: and file:. A missing scheme defaults to the file: scheme. Other schemes such as https:, ftp:, and content: are not supported. rtsp: is not verified.
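An application that accepts URIs from outside may find it useful to classify them against the rules above before creating a player. The sketch below is one way to do that. The names uri_support and classify_uri are hypothetical, and the case-sensitive comparison of scheme names is an assumption: this article does not say how the platform parses schemes.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef enum { URI_SUPPORTED, URI_UNVERIFIED, URI_UNSUPPORTED } uri_support;

/* Returns the length of a leading "scheme:" prefix including the colon, or 0
 * if the string has no scheme. Scheme syntax follows RFC 3986: a letter
 * followed by letters, digits, '+', '-' or '.'. */
static size_t scheme_length(const char *uri)
{
    if (!isalpha((unsigned char)uri[0])) {
        return 0;
    }
    size_t i = 1;
    while (isalnum((unsigned char)uri[i]) ||
           uri[i] == '+' || uri[i] == '-' || uri[i] == '.') {
        ++i;
    }
    return (uri[i] == ':') ? i + 1 : 0;
}

/* Returns 1 if the first `len` characters of uri are exactly `scheme`. */
static int scheme_is(const char *uri, size_t len, const char *scheme)
{
    return len == strlen(scheme) && strncmp(uri, scheme, len) == 0;
}

/* Hypothetical helper: classifies a URI according to the rules in this
 * section. */
uri_support classify_uri(const char *uri)
{
    size_t len = scheme_length(uri);
    if (len == 0) {
        return URI_SUPPORTED;        /* a missing scheme defaults to file: */
    }
    if (scheme_is(uri, len, "http:") || scheme_is(uri, len, "file:")) {
        return URI_SUPPORTED;
    }
    if (scheme_is(uri, len, "rtsp:")) {
        return URI_UNVERIFIED;
    }
    return URI_UNSUPPORTED;          /* https:, ftp:, content:, and others */
}
```

A check like this does not replace error handling at playback time; a supported scheme can still refer to content that is missing or unreachable.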

Data structures

Android API level 9 supports these OpenSL ES 1.0.1 data structures:

Platform configuration

OpenSL ES for Android is designed for multi-threaded applications, and is thread-safe.

OpenSL ES for Android supports a single engine per application, and up to 32 objects. Available device memory and CPU may further restrict the usable number of objects.

slCreateEngine recognizes, but ignores, these engine options:

Planning for future versions of OpenSL ES

The Android native audio APIs at level 9 are based on Khronos Group OpenSL ES 1.0.1 (see section "References" below). As of the time of this writing, the OpenSL ES working group is preparing a revised version of the standard. The revised version will likely include new features, clarifications, correction of typographical errors, and some incompatibilities. Most of the expected incompatibilities are relatively minor, or are in areas of OpenSL ES not supported by Android API level 9. However, even a small change can be significant for an application developer, so it is important to prepare for this.

The Android team is committed to preserving future API binary compatibility for developers to the extent feasible. It is our intention to continue to support future binary compatibility of the 1.0.1-based API, even as we add support for later versions of the standard. An application developed with this version should work on future versions of the Android platform, provided that you follow the guidelines listed in section "Planning for binary compatibility" below.

Note that future source compatibility will not be a goal. That is, if you upgrade to a newer version of the NDK, you may need to modify your application source code to conform to the new API. We expect that most such changes will be minor; see details below.

Planning for binary compatibility

We recommend that your application follow these guidelines, to improve future binary compatibility:

Planning for source compatibility

As mentioned, source code incompatibilities are expected in the next version of OpenSL ES from Khronos Group. Likely areas of change include:

Any actual source code incompatibilities will be explained in detail when the time comes.

Android extensions

The API for Android extensions is defined in SLES/OpenSLES_Android.h. Consult that file for details on these extensions. Unless otherwise noted, all interfaces are "explicit".

Note that use of these extensions will limit your application's portability to other OpenSL ES implementations. If this is a concern, we advise that you avoid using them, or isolate your use of these with #ifdef etc.

The following figure shows which Android-specific interfaces and data locators are available for each object type.

Figure: Android extensions

Android configuration interface

The Android configuration interface provides a means to set platform-specific parameters for objects. Unlike other OpenSL ES 1.0.1 interfaces, the Android configuration interface is available prior to object realization. This permits the object to be configured and then realized. Header file SLES/OpenSLES_AndroidConfiguration.h documents the available configuration keys and values:

Here is an example code fragment that sets the Android audio stream type on an audio player:
// CreateAudioPlayer and specify SL_IID_ANDROIDCONFIGURATION
// in the required interface ID array. Do not realize player yet.
// ...
SLAndroidConfigurationItf playerConfig;
result = (*playerObject)->GetInterface(playerObject,
    SL_IID_ANDROIDCONFIGURATION, &playerConfig);
assert(SL_RESULT_SUCCESS == result);
SLint32 streamType = SL_ANDROID_STREAM_ALARM;
result = (*playerConfig)->SetConfiguration(playerConfig,
    SL_ANDROID_KEY_STREAM_TYPE, &streamType, sizeof(SLint32));
assert(SL_RESULT_SUCCESS == result);
// ...
// Now realize the player here.
Similar code can be used to configure the preset for an audio recorder.

Android effects interfaces

The Android effect, effect send, and effect capabilities interfaces provide a generic mechanism for an application to query and use device-specific audio effects. A device manufacturer should document any available device-specific audio effects.

Portable applications should use the OpenSL ES 1.0.1 APIs for audio effects instead of the Android effect extensions.

Android file descriptor data locator

The Android file descriptor data locator permits the source for an audio player to be specified as an open file descriptor with read access. The data format must be MIME.

This is especially useful in conjunction with the native asset manager.

Android simple buffer queue data locator and interface

The Android simple buffer queue data locator and interface are identical to the OpenSL ES 1.0.1 buffer queue locator and interface, except that Android simple buffer queues may be used with both audio players and audio recorders, and are limited to PCM data format. [OpenSL ES 1.0.1 buffer queues are for audio players only, and are not restricted to PCM data format.]

For recording, the application should enqueue empty buffers. Upon notification of completion via a registered callback, the filled buffer is available for the application to read.

For playback there is no difference. But for future source code compatibility, we suggest that applications use Android simple buffer queues instead of OpenSL ES 1.0.1 buffer queues.

Dynamic interfaces at object creation

For convenience, the Android implementation of OpenSL ES 1.0.1 permits dynamic interfaces to be specified at object creation time, as an alternative to adding these interfaces after object creation with DynamicInterfaceManagement::AddInterface.

Buffer queue behavior

The OpenSL ES 1.0.1 specification requires that "On transition to the SL_PLAYSTATE_STOPPED state the play cursor is returned to the beginning of the currently playing buffer." The Android implementation does not necessarily conform to this requirement. For Android, it is unspecified whether a transition to SL_PLAYSTATE_STOPPED operates as described, or leaves the play cursor unchanged.

We recommend that you do not rely on either behavior; after a transition to SL_PLAYSTATE_STOPPED, you should explicitly call BufferQueue::Clear. This will place the buffer queue into a known state.

A corollary is that it is unspecified whether buffer queue callbacks are called upon transition to SL_PLAYSTATE_STOPPED or by BufferQueue::Clear. We recommend that you do not rely on either behavior; be prepared to receive a callback in these cases, but also do not depend on receiving one.

It is expected that a future version of OpenSL ES will clarify these issues. However, upgrading to that version would result in source code incompatibilities (see section "Planning for source compatibility" above).

Reporting of extensions

Engine::QueryNumSupportedExtensions, Engine::QuerySupportedExtension, and Engine::IsExtensionSupported report these extensions:

Programming notes

These notes supplement the OpenSL ES 1.0.1 specification, available in the "References" section below.

Objects and interface initialization

Two aspects of the OpenSL ES programming model that may be unfamiliar to new developers are the distinction between objects and interfaces, and the initialization sequence.

Briefly, an OpenSL ES object is similar to the object concept in programming languages such as Java and C++, except an OpenSL ES object is only visible via its associated interfaces. This includes the initial interface for all objects, called SLObjectItf. There is no handle for an object itself, only a handle to the SLObjectItf interface of the object.

An OpenSL ES object is first "created", which returns an SLObjectItf, then "realized". This is similar to the common programming pattern of first constructing an object (which should never fail other than for lack of memory or invalid parameters), and then completing initialization (which may fail due to lack of resources). The realize step gives the implementation a logical place to allocate additional resources if needed.

As part of the API to create an object, an application specifies an array of desired interfaces that it plans to acquire later. Note that this array does not automatically acquire the interfaces; it merely indicates a future intention to acquire them. Interfaces are distinguished as "implicit" or "explicit". An explicit interface must be listed in the array if it will be acquired later. An implicit interface need not be listed in the object create array, but there is no harm in listing it there. OpenSL ES has one more kind of interface called "dynamic", which does not need to be specified in the object create array, and can be added later after the object is created. The Android implementation provides a convenience feature to avoid this complexity; see section "Dynamic interfaces at object creation" above.

After the object is created and realized, the application should acquire interfaces for each feature it needs, using GetInterface on the initial SLObjectItf.

Finally, the object is available for use via its interfaces, though note that some objects require further setup. In particular, an audio player with URI data source needs a bit more preparation in order to detect connection errors. See the next section "Audio player prefetch" for details.

After your application is done with the object, you should explicitly destroy it; see section "Destroy" below.

Audio player prefetch

For an audio player with URI data source, Object::Realize allocates resources but does not connect to the data source (i.e. "prepare") or begin pre-fetching data. These occur once the player state is set to either SL_PLAYSTATE_PAUSED or SL_PLAYSTATE_PLAYING.

Note that some information may still be unknown until relatively late in this sequence. In particular, initially Play::GetDuration will return SL_TIME_UNKNOWN and MuteSolo::GetNumChannels will return zero. These APIs will return the proper values once they are known.

Other properties that are initially unknown include the sample rate and actual media content type based on examining the content's header (as opposed to the application-specified MIME type and container type). These too, are determined later during prepare / prefetch, but there are no APIs to retrieve them.

The prefetch status interface is useful for detecting when all information is available. Or, your application can poll periodically. Note that some information may never be known, for example, the duration of a streaming MP3.

The prefetch status interface is also useful for detecting errors. Register a callback and enable at least the SL_PREFETCHEVENT_FILLLEVELCHANGE and SL_PREFETCHEVENT_STATUSCHANGE events. If both of these events are delivered simultaneously, and PrefetchStatus::GetFillLevel reports a zero level, and PrefetchStatus::GetPrefetchStatus reports SL_PREFETCHSTATUS_UNDERFLOW, then this indicates a non-recoverable error in the data source. This includes the inability to connect to the data source because the local filename does not exist or the network URI is invalid.
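The three-part condition described above can be isolated in a small predicate that a prefetch callback calls after querying the fill level and status. This is a sketch only: the helper name is hypothetical, and the stand-in constant values in the non-Android branch are assumed to match the 1.0.1 header so that the code can also be compiled on a host for testing.

```c
#include <stdint.h>

#ifdef __ANDROID__
#include <SLES/OpenSLES.h>
#else
/* Stand-ins so that this sketch also compiles outside the NDK. The names
 * match <SLES/OpenSLES.h>; the values are assumed to match that header. */
#define SL_PREFETCHEVENT_STATUSCHANGE    ((uint32_t)0x00000001)
#define SL_PREFETCHEVENT_FILLLEVELCHANGE ((uint32_t)0x00000002)
#define SL_PREFETCHSTATUS_UNDERFLOW      ((uint32_t)0x00000001)
#endif

/* Hypothetical helper: returns 1 if the values observed in a prefetch
 * callback indicate a non-recoverable error in the data source.
 *   event               - event mask passed to the callback
 *   fill_level_permille - value reported by PrefetchStatus::GetFillLevel
 *   status              - value reported by PrefetchStatus::GetPrefetchStatus */
int is_unrecoverable_prefetch_error(uint32_t event,
                                    int16_t fill_level_permille,
                                    uint32_t status)
{
    const uint32_t both = SL_PREFETCHEVENT_STATUSCHANGE |
                          SL_PREFETCHEVENT_FILLLEVELCHANGE;
    return (event & both) == both &&
           fill_level_permille == 0 &&
           status == SL_PREFETCHSTATUS_UNDERFLOW;
}
```

Keeping the callback itself this small also fits the advice in section "Callbacks and threads": the handler only records the error, and another thread decides how to report it or retry.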

The next version of OpenSL ES is expected to add more explicit support for handling errors in the data source. However, for future binary compatibility, we intend to continue to support the current method for reporting a non-recoverable error.

In summary, a recommended code sequence is:

Destroy

Be sure to destroy all objects on exit from your application. Objects should be destroyed in reverse order of their creation, as it is not safe to destroy an object that has any dependent objects. For example, destroy in this order: audio players and recorders, output mix, then finally the engine.

OpenSL ES does not support automatic garbage collection or reference counting of interfaces. After you call Object::Destroy, all extant interfaces derived from the associated object become undefined.

The Android OpenSL ES implementation does not detect the incorrect use of such interfaces. Continuing to use such interfaces after the object is destroyed will cause your application to crash or behave in unpredictable ways.

We recommend that you explicitly set both the primary object interface and all associated interfaces to NULL as part of your object destruction sequence, to prevent the accidental misuse of a stale interface handle.

Stereo panning

When Volume::EnableStereoPosition is used to enable stereo panning of a mono source, there is a 3 dB reduction in total sound power level. This is needed to permit the total sound power level to remain constant as the source is panned from one channel to the other. Therefore, don't enable stereo positioning if you don't need it. See the Wikipedia article on audio panning for more information.
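The 3 dB figure can be checked with a short calculation. This is a sketch under two assumptions that this article does not confirm: that the platform uses a sine/cosine constant-power pan law, and that an unpositioned mono source feeds both channels at unity gain. The symbols g_L and g_R (channel gains), \theta (pan angle, from 0 to \pi/2) and P (total power relative to one channel at unity gain) are introduced here for illustration only.

```latex
\begin{align*}
\text{positioning disabled:}\quad & g_L = g_R = 1,
  & P_{\mathrm{off}} &= g_L^2 + g_R^2 = 2 \\
\text{positioning enabled:}\quad & g_L = \cos\theta,\ g_R = \sin\theta,
  & P_{\mathrm{on}} &= \cos^2\theta + \sin^2\theta = 1 \\
& & 10\log_{10}\frac{P_{\mathrm{on}}}{P_{\mathrm{off}}}
  &= 10\log_{10}\tfrac{1}{2} \approx -3.01\ \mathrm{dB}
\end{align*}
```

Under these assumptions the total power is the same at every pan position, and it is half of what the unpositioned source delivers, which is the 3 dB reduction described above.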

Callbacks and threads

Callback handlers are generally called synchronously with respect to the event, that is, at the moment and location where the event is detected by the implementation. But this point is asynchronous with respect to the application. Thus you should use a mutex or other synchronization mechanism to control access to any variables shared between the application and the callback handler. In the example code, such as for buffer queues, we have omitted this synchronization in the interest of simplicity. However, proper mutual exclusion would be critical for any production code.

Callback handlers are called from internal non-application thread(s) which are not attached to the Dalvik virtual machine and thus are ineligible to use JNI. Because these internal threads are critical to the integrity of the OpenSL ES implementation, a callback handler should also not block or perform excessive work. Therefore, if your callback handler needs to use JNI or do anything significant (e.g. beyond an Enqueue or something else simple such as the "Get" family), the handler should instead post an event for another thread to process.
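One way to post an event from a callback handler without blocking is a small lock-free ring buffer. The sketch below uses C11 atomics and assumes exactly one thread posts (the callback) and exactly one thread takes (an application thread). If callbacks for the same queue can arrive on more than one internal thread, a different structure or a lock would be needed. All names here are hypothetical, and the meaning of the 32-bit event value is left to the application.

```c
#include <stdatomic.h>
#include <stdint.h>

enum { EVENT_QUEUE_CAPACITY = 16 };  /* must be a power of two */

typedef struct {
    uint32_t events[EVENT_QUEUE_CAPACITY];
    atomic_uint head;  /* next slot to read; written only by the consumer */
    atomic_uint tail;  /* next slot to write; written only by the producer */
} event_queue;

void event_queue_init(event_queue *q)
{
    atomic_init(&q->head, 0u);
    atomic_init(&q->tail, 0u);
}

/* Called from the callback handler. Never blocks. Returns 1 on success, or
 * 0 if the queue is full and the event was dropped. */
int event_queue_post(event_queue *q, uint32_t event)
{
    unsigned tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    unsigned head = atomic_load_explicit(&q->head, memory_order_acquire);
    if (tail - head == (unsigned)EVENT_QUEUE_CAPACITY) {
        return 0;
    }
    q->events[tail % EVENT_QUEUE_CAPACITY] = event;
    /* Publish the slot only after its contents have been written. */
    atomic_store_explicit(&q->tail, tail + 1u, memory_order_release);
    return 1;
}

/* Called from an application thread. Returns 1 and stores the oldest event
 * in *event, or returns 0 if the queue is empty. */
int event_queue_take(event_queue *q, uint32_t *event)
{
    unsigned head = atomic_load_explicit(&q->head, memory_order_relaxed);
    unsigned tail = atomic_load_explicit(&q->tail, memory_order_acquire);
    if (head == tail) {
        return 0;
    }
    *event = q->events[head % EVENT_QUEUE_CAPACITY];
    /* Release the slot only after its contents have been read. */
    atomic_store_explicit(&q->head, head + 1u, memory_order_release);
    return 1;
}
```

The application thread, which may be attached to the virtual machine, drains the queue and performs any JNI calls or longer work there. How that thread is woken (polling, a pipe, a condition variable) is a separate design choice that this sketch does not cover.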

Note that the converse is safe: a Dalvik application thread which has entered JNI is allowed to directly call OpenSL ES APIs, including those which block. However, blocking calls are not recommended from the main thread, as they may result in the dreaded "Application Not Responding" (ANR).

Performance

As OpenSL ES is a native C API, non-Dalvik application threads which call OpenSL ES have no Dalvik-related overhead such as garbage collection pauses. However, there is no additional performance benefit to the use of OpenSL ES other than this. In particular, use of OpenSL ES does not result in lower audio latency, higher scheduling priority, etc. than what the platform generally provides. On the other hand, as the Android platform and specific device implementations continue to evolve, an OpenSL ES application can expect to benefit from any future system performance improvements.

Security and permissions

Security in Android is enforced at the process level. Java programming language code can't do anything more than native code, nor can native code do anything more than Java code. The only differences between them are which APIs are available that provide functionality the platform promises to support in the future and across different devices.

Applications using OpenSL ES must request whatever permissions they would need for similar non-native APIs. For example, if your application records audio, then it needs the android.permission.RECORD_AUDIO permission. Applications that use audio effects need android.permission.MODIFY_AUDIO_SETTINGS. Applications that play network URI resources need android.permission.INTERNET.

Media content parsers and software codecs run within the context of the Android application that calls OpenSL ES (hardware codecs are abstracted, but are device-dependent). Malformed content designed to exploit parser and codec vulnerabilities is a known attack vector. We recommend that you play media only from trustworthy sources, or that you partition your application such that code that handles media from untrustworthy sources runs in a relatively sandboxed environment. For example you could process media from untrustworthy sources in a separate process. Though both processes would still run under the same UID, this separation does make an attack more difficult.

Platform issues

This section describes known issues in the initial platform release which supports these APIs.

Dynamic interface management

DynamicInterfaceManagement::AddInterface does not work. Instead, specify the interface in the array passed to Create, as shown in the example code for environmental reverb.

References and resources

Google Android:

Khronos Group:

For convenience, we have included a copy of the OpenSL ES 1.0.1 specification with the NDK in docs/opensles/OpenSL_ES_Specification_1.0.1.pdf.

Miscellaneous: