Sub-10ms roundtrip audio latency on Android (superpowered.com)
142 points by vlaskovits on June 16, 2016 | hide | past | favorite | 46 comments



> We updated our findings with the recent improvements in Android M (Marshmallow)

> opening a market of 1.1 billion devices

Minor nit, but considering that Google says Marshmallow is on 10.1% of devices [1], that figure is an order of magnitude too high.

[1] https://developer.android.com/about/dashboards/index.html


"We updated our findings with the recent improvements in Android M (Marshmallow), and recently released a solution for Android’s USB audio and MIDI challenges, opening a market of 1.1 billion devices for pro audio application creators."

Those are two separate things:

1. Updated analysis on the latency with regards to Marshmallow.

2. Superpowered USB Audio and MIDI SDK opens a market of 1.1 billion devices.


A brief skim of the article they link regarding Marshmallow [1] provides this quote:

> As readers may recall, 10ms round-trip audio latency is the threshold that must be met by Android to be considered truly 'pro audio'.

This wasn't possible before Marshmallow (per aforementioned article), and therefore pro-quality audio apps are still only truly possible on 10% of Android devices. So the market is opened up for 100 million devices.

I'm not saying it's not a huge number of devices. But it's not as huge as the blog states.

[1] http://superpowered.com/superpowered-android-media-server


I'm one of the authors of the article, and the CTO of Superpowered. I apologize if the article was confusing.

Marshmallow improves audio latency here and there on some devices, but even Android N has the fundamental audio stack problems we mention in the articles.

There is no Android version delivering <= 10 ms round-trip audio latency.

Audio can be handled by built-in hardware or external (USB) hardware.

External: Android is not able to deliver glitch-free low-latency audio on USB devices. We introduced a solution for this last month; it's one of the links at the top of the article. It opens up a market of 1.1 billion devices (meaning Android 4.4.4 and up) for USB audio on Android.

Built-in: The current article introduces a demo solution for the built-in audio hardware.


It's open to more of them than the 10% that currently have it. Whether the OEM provides a build of Android Marshmallow or not plays a role for most consumers. Quite a few consumers have the option to unlock their device and bypass the OEM to get Android Marshmallow on their device (Cyanogen et al.).


"It can be installed on any Android userdebug build or rooted production build."

"Installing on Samsung Galaxy devices is not recommended"

So, not for apps on devices today.


You're correct. It doesn't seem to be for consumer-oriented products today. But it's a start...

I guess they're targeting some professional audio gear that has an Android-based component in it. Or they hope to sell this to phone/tablet vendors who would like to build a music-oriented product. Or get acquired by Google and have it integrated into Android proper. Or just make Android a viable backend for their cross-platform audio SDK.

Can anyone from Superpowered chime in and shed some light on who or what you're targeting?


Hi there,

We're licensing the Superpowered Media Server to OEMs who want to differentiate their Android build by bringing latency down to pro audio levels.

As we say in the article, we squeeze latency out of user space with our zero-latency SDK, as well as out of service space with our new server.


What is the business case here? A provider of audio software can write an iOS app that will out of the box avoid these latency issues for the variety of iOS hardware out there, or they can take on the task of creating hardware, customizing Android, using your media server and SDK, in order to avoid those latencies on Android.

One path seems easier and cheaper than the other...


If you want to develop an app that depends on low latency such as an interactive audio app or DAW, then on iOS that is entirely doable.

If you want to do the same on Android, or port that same app from iOS to Android, then that is virtually impossible.

From: http://superpowered.com/androidaudiopathlatency

"Many mobile apps that are critically dependent on low latency audio functionality such as some games, synthesizers, DAWs (Digital Audio Workstations), interactive audio apps and virtual instrument apps, and the coming wave of virtual reality apps, all of which thrive on Apple's platform (App Store + iOS devices) --- and generate big revenues for App Store and iOS developers are largely non-existent on Android. Android Audio's 10 Millisecond Problem, a little understood yet extremely difficult technical challenge with enormous ramifications, prevents these sorts of revenue producing apps from performing in an acceptable manner and even being published (!) on Android at this point in time. Startups and developers are unwilling to port and publish otherwise successful iOS apps (with ~10 ms audio latency needs) on Android for fear of degraded audio performance resulting in negative word-of-mouth and a hit to their professional reputation and brand. Consumers lose because have a strong desire to buy such apps on Android, as shown by revenue data on iOS, and currently, are unable to do so. One can appreciate the scale of this problem/opportunity when one takes into account the so-called 'next billion' consumers who will be 'mobile-only'."

No app developer will, as you describe:

"take on the task of creating hardware, customizing Android, using your media server and SDK, in order to avoid those latencies on Android."

But OEMs seeking to differentiate their Android builds will integrate Superpowered Media Server, allowing for low latency functionality.


What's iOS' audio latency?


About 7-12 ms. Actual benchmarks by an audio SDK maker here: http://superpowered.com/latency


And it's been like that for a long time. Last time I did mobile audio programming (2009? 2010?), the Android test devices I had (Galaxy S2?) were operating around 100ms audio latency. Glad to see they've improved!


http://superpowered.com/latency

Generally around 6-8ms.


And how about macOS?


Depends on how good your sound card's drivers are. But I think Mac OS X is capable of buffers as small as 48 samples at a bare minimum, so it also depends on the sample rate you're using. Higher sample rates yield lower latencies but higher CPU usage. At 44.1 kHz, a 48-sample buffer is about 1 ms.
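To make that concrete, here's a quick back-of-the-envelope sketch in Python (the 48-sample figure is the parent's recollection, not a verified spec): the latency contributed by one buffer is simply its size divided by the sample rate.

```python
def buffer_latency_ms(buffer_frames, sample_rate_hz):
    """One-way latency contributed by a single audio buffer, in milliseconds."""
    return buffer_frames / sample_rate_hz * 1000.0

# 48 frames at 44.1 kHz -- the minimum mentioned above
print(round(buffer_latency_ms(48, 44100), 2))  # 1.09
# The same buffer at a higher sample rate drains faster, so lower latency
print(round(buffer_latency_ms(48, 96000), 2))  # 0.5
```

This also shows why the higher-sample-rate/lower-latency trade-off holds: the buffer empties in less wall-clock time, but the CPU has to refill it more often.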


How does this compare to the existing audio tunneling in some versions of Android?

(The best writeup of this I've seen is https://github.com/felixpalmer/android-visualizer/issues/5 which details some of the problems that audio tunneling can cause)


A good first step is probably to read Android's own current situation page on the matter and look at the numbers.

http://source.android.com/devices/audio/latency_measurements...

Basically it has been getting better and better with new versions of Android and better hardware, but it still has room for improvement.


Awesome stuff. Solid to see such a nice side-by-side integration without interfering with the existing media server.


Thanks!


I'd be interested in seeing what the Android devs could do if they included FPGAs in phones.

I think that would be a very fast way to get Android connected to any I/O in any configuration needed.

There also seems to be a growing trend of including ARM cores inside actual FPGAs.


So this is kind of like ASIO drivers for Android? I hope it gains wider applicability, it's about time.


So why is this so hard for Android but seemingly easy for every other OS out there?


The article kind of touches that:

"Ultimately, the Android audio stack has fundamental architectural problems stemming from engineering decisions made at Android’s genesis, well before Google’s acquisition of Android."

Much of Android was designed for simple apps written in Java and running on resource-constrained devices. This is quite far from the reality of today. Perhaps it allowed Android to get to market early and build a thriving app ecosystem, but billions of dollars have been poured into fixing those "mistakes" from the early days.

Audio apps are pretty much the polar opposite of what Android was originally designed for. Audio processing needs to run in native code, or at least without ever invoking the garbage collector or any other unpredictable source of latency.

The iOS and OS X audio stack is completely different and achieves very low latency. Many musicians prefer Apple products because the audio works well out of the box. You might be able to achieve similar performance with Windows or Linux, but it's not as easy.

While Superpowered has made a neat improvement to Android's latency figures, 8...24 ms is not good enough for professional music use (and they don't mention any confidence intervals). For music apps, audio latency must be consistently less than 10 ms, preferably less than 5 ms.

It's an improvement but I don't think it inspires enough confidence that you'd see musicians performing in front of an audience using Android devices any time soon.


Doesn't have to be less than 5ms -- ~10ms and below (like iOS demonstrates) is great.


Short answer: abstraction layers.

Because Android has to be able to run on so many devices there are a bunch of abstraction layers between the app and the hardware, all adding delay.

iOS does not have this issue.


I bet iOS also has some abstraction layers; I doubt you get to poke DAC chip registers directly from userspace. Superpowered Media Server is an abstraction layer too, and so are desktop OS audio APIs.

And yet, they don't suck as badly.


On iOS, you'd at least be running your audio code natively, without garbage collectors or other expensive runtime features. Native apps with low latency weren't what Android was originally intended for.

Apple has the best audio stack out of the box anyway (desktop and mobile). There are very few music apps for Android, a lot of music apps are iOS only (or target a select few Android devices).


Does this mean we can finally have an app that'd invert the input from the microphone and play it back on the loudspeaker effectively performing surrounding ambient noise cancellation?


Ambient noise cancellation is more complicated than that. Consider that as you change your distance from the speaker, you go into and out of phase with the signal; you would effectively be doubling the ambient noise at some locations and cancelling it at others.
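A rough sketch of that phase effect in Python (my own illustration, not from the thread): a unit-amplitude noise tone plus its inverted copy that arrives after some extra path length has peak amplitude 2·|sin(φ/2)|, where φ is the phase offset from the path difference.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s at room temperature

def residual_amplitude(freq_hz, extra_path_m):
    """Peak amplitude of a unit noise tone plus its inverted copy delayed by
    extra_path_m of travel: sin(wt) - sin(w(t - d/c)) peaks at 2*|sin(phi/2)|."""
    phase = 2.0 * math.pi * freq_hz * extra_path_m / SPEED_OF_SOUND
    return abs(2.0 * math.sin(phase / 2.0))

# For a 1 kHz tone: zero path difference cancels perfectly...
print(round(residual_amplitude(1000, 0.0), 3))               # 0.0
# ...but half a wavelength (~17 cm) of extra path doubles the noise instead
print(round(residual_amplitude(1000, 343.0 / 1000 / 2), 3))  # 2.0
```

So moving your head roughly 17 cm relative to the speaker flips a 1 kHz tone from fully cancelled to fully doubled, which is exactly the parent's point.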

Also 10ms response time would definitely be too slow for such an application.

If that worked, you could just use an opamp with a sub-microsecond response time with a cheap mic and speaker for a couple of dollars, and I'm sure China would have flooded the market.

I would give anything for a working ambient noise canceller. I would point it at the leafblowers outside while I sip champagne.


What about for application in a headphone? You wouldn't have issues with changes in distance, as your ears would always be a set distance away from the drivers.

For example, what is the latency on Bose's QuietComfort 25?


You would need a latency of something like 0.01ms or even less.

If I remember right, such a low latency is basically impossible, so noise-cancelling headphones have to try to predict the sound before it happens.

There are big problems simply from the speed of sound: you don't really have time to emit the cancelling sound, and if you move the mic farther from the speaker you don't accurately match the sound and can make things worse.

That's why the headphones use fancy algorithms to try to guess what the sound will be, and that's why good ones cost so much, instead of them being a simple analog circuit.


I would imagine that the physical inertia of the speaker would also be a factor.

Opamps can respond that quickly electrically, but you need more power the faster you want to move the speaker cone.

Say the mic is 5 mm from the speaker. Sound travels at 340 m/s, so 0.005/340 = 0.00001471 s; it takes 14.71 microseconds for the sound wave to travel that 5 mm. That sounds awfully quick to have to reactively move a speaker cone accurately. Obviously speakers can move quickly enough to play the correct frequencies, but I don't know what their latency would be.
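The same arithmetic as a one-liner sketch, just restating the parent's numbers in Python:

```python
SPEED_OF_SOUND = 340.0  # m/s, the figure used above

def travel_time_us(distance_m):
    """Microseconds for a sound wave to cover distance_m."""
    return distance_m / SPEED_OF_SOUND * 1e6

print(round(travel_time_us(0.005), 2))  # 14.71 us for the 5 mm mic-to-speaker gap
```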

I'm sure there are other complications that I'm forgetting but it's definitely harder than you would think to cancel out sounds.


Reflections also make this not work.


can someone explain the practical use case?

- future apps can implement the code for less latency on the current android version?

- future roms can have less latency with this code?

- existent apps can have less latency with a 3rd party app helping?

- I need the loopback dongle?


The practical use case is that you will see commercial Android builds that will have integrated the Superpowered Media Server, allowing Android to become the great music and entertainment platform it should be.


thanks. Great!


Hooray! Congrats!


It often takes over 500ms to get voice through the cellular network. Worrying about 10ms in applications seems excessive.


It's for real-time music/audio apps, like virtual pianos, etc. These can be 100% local, and excessive latency is frustrating to the user.


It's not used for that. It's used for applications that need audio feedback after a trigger. In those cases 14ms+ is already way too much to the point of being unusable. For example: https://www.youtube.com/watch?v=_eE7NtK4jX8


It’s not for phone calls but real-time audio applications, e.g. music creation apps or games.


It will be useful for apps, especially those targeting musicians. The iPhone has had sub-10ms latency for a long time.


Not only is 10ms roundtrip latency critical to the interactive audio type applications mentioned by others -- it is critical for VR too.


About 10 ms is the maximum latency that can be tolerated in music applications. Preferably it should be around 5 ms.

If you're a musician, you can easily notice when the sound doesn't come out right when you press a key on the keyboard or sing into a microphone. 5 ms is not noticeable at all, 10 ms feels a bit off but is tolerable, and above 25 ms it's clearly noticeable and very annoying (i.e. it's no longer a musical instrument).


The latency improvement mentioned is useful for music and VoIP apps.



