This is way better than it used to be (I measured half a second latency in 2012: https://www.jefftk.com/p/android-sound-synthesis-and-latency) but it is still too high to let you use an Android device as a musical instrument. This is something iPhones have always been able to do, and one of my biggest disappointments with Android.
(Disclosure: I work for Google, speaking only for myself)
I agree audio at minimum should have been done differently on Android.
The latency (a different type, I believe) of the Spotify and even YouTube applications is pretty bad compared to an Apple device. Sometimes the audio just fails to respond for half a second or so. In my experience this is not an interface bug.
This data [0] doesn't have the latest models, and it's measuring latency internally, not "touch-to-sound". The ones they tested average around 7-8ms. I've heard of <10ms as the benchmark for "instant" in audio.
Edit: I've been thinking about this and a touch-to-tone latency below 10ms seems impossible, even with the 120Hz digitizer on newer iPhones. I would be very interested in seeing some experimental data as well.
The best I could find was https://developer.apple.com/forums/thread/71301 which suggests a 32ms touch latency. I think that might include some processing, but it doesn't seem much better than the 20ms latency in Android mentioned in the article.
The audio processing latency seems to be single digit on both systems.
In any case, I don't think these differences are big enough to argue that "Android can't be used for instruments".
Yeah, I think the numbers in the article should be good enough for a bunch of music use cases, but remember it used to be like half a second which is obviously unworkable.
Well-financed developers like ROLI already have instruments on Android. I think the incredible instruments available on iOS are also there because the App Store promise actually kind of works for them. They write a great application, upload it to a server, and people buy it for more than $2.99, upfront. And unlike VSTs, no one even pirates it. The general idea/stereotype is that no one wants that on Android. An open source port of VCV Rack would be feasible and wonderful though. That's the only free instrument I have installed on my music iPad.
We should not forget that our brain is great at adjusting to latency. In the old days, when I was excessively playing Quake 3 Team Arena it took me just a couple of minutes to adjust to network latency depending on the server I was connected to.
Professional musicians do this constantly and are much better at that.
What. Our brain is absolutely terrible at adjusting for latency. It can somewhat compensate, but only to an extent, and at severe cost.
There’s this video around that puts people in a VR helmet playing A/V with a delay, even the lowest setting completely destroys any form of efficiency.
Anything below 60Hz in VR has a significantly increased chance of giving motion sickness due to latency.
Playing Super Mario Bros in an emulator is strangely, subtly harder than on the real NES hardware, just because of a slight lag added (I can't recall for the NES, but on the SNES buttons literally bit-flip the RAM, hard to beat that). You can train all you want on an emulator; when you get back to the real thing it's obvious how crippling the lag was.
Try Guitar Hero / Rock Band with various audio / video latency settings. 10ms is obviously impactful, anything higher borders on unplayable, and even 5ms has a subtle impact, increasing missed notes.
As for pro musicians, they absolutely loathe latency. This is why bassists (tempo) / drummers (rhythm) / conductors (tempo) exist: they serve as a single source of truth to sync on. In fact, we're so sensitive to latency that the slightest deviation is unbearable, hence why following them puts you back in sync. Same for effect pedals, amps, feedback... any latency is destructive.
Try playing on Zoom, it’s absolutely impossible (unless you assume latency is fixed, which it is not, and play in a tempo matching latency exactly, and follow the lead but time-warp your play by one beat, which I imagine is possible as a sort of canon but absolutely horrendous) unless you completely ignore the other part and find a common time source to sync upon.
Now you mention Quake 3, and it might just be you were severely misled by several critical innovations that game introduced, like network prediction and frame interpolation, negating most of the practical effects of latency.
Input is not the source of lag on the NES. Input is only read once per frame. Today it is very simple and cheap to improve on old controllers.
Lag comes from the video. If you do not use an FPGA emulator like MiSTer, the video delay is enormous.
I don't understand what you are calling 10ms lag. Have you measured the delay between your computer and your screen? It is way bigger than that, in any computer or NES console(console-TV).
Are you maybe referring to 10ms NETWORK packet delay? All TVs or projectors use buffers, and it takes time for information to travel around serial cables.
Network delay is inconsistent, but electronic delay is not.
> Input is not the source of lag on the NES. Input is only read once per frame.
That’s exactly what I said ;) By virtue of being a different platform, emulation introduces a lag that is just not present on the real hardware (and can’t be lower by virtue of literally being a wire to memory). USB alone often introduces an astoundingly high minimal floor on latency.
> I don't understand what you are calling 10ms lag. Have you measured the delay between your computer and your screen? It is way bigger than that, in any computer or NES console(console-TV).
Yes I have measured it (on all my devices, accurately so) in order to minimise it as much as possible (I can work with but can’t stand latency, even in a terminal).
Rock Band, which I used to be an avid player of, allows you to adjust a relative delay (audio and video separately), timewarping A/V so that all three are perfectly in sync relative to real-world input.
> Network delay is inconsistent, but electronic delay is not.
I agree (and agree to buffers as well), but by and large any modern device is far removed from being just electronic, from USB (USB 2.0 has a 125us polling loop for multiplexing, USB 3 is point to point so can fare much better at 30us, but even then there’s kernel context switch to read and process that data) to HDMI (packet based with loads of multiplexing), many things end up being surprisingly subject to soft-real-time firmware/software before it hits the display itself. Hard real time stuff probably exists because the specs now allow for it but good luck finding that in consumer space where it’s much easier (read: cheaper) to push a best effort implementation.
But my point isn’t about the minutiae of lag source, it’s that the human brain is severely impacted by lag, can adjust to some extent but the cost is paid heavily at multiple levels and with various efficacy crippling consequences; and that Q3 and subsequent games have increasing numbers of facilities to create the illusion that everything happens in sync so that we’re not so impacted by it.
The brain can compensate for lag/latency up to a point. Church organ players often have to deal with >100ms of delay between pressing a key and hearing the note produced through those big pipes.
If you have an app which produces a sound when you tap on the screen, and there's a delay of say 70ms between tapping and sound being produced, it's easily possible to play in time with an external clock source, like a metronome, as long as the delay is fixed.
In the 1968 paper "Response time in man-computer conversational transactions", actions which occurred within 100ms of the human input were perceived as instantaneous. In my experience this holds true with musical instrument apps, although one must first learn how much latency is present in order to compensate for it when synchronising with other clock sources.
Can confirm that many musicians have to play with latency. Pipe organs, heavily processed guitars (delay is an effect that is very commonly used) all are laggy and you have to play ahead of your accompaniment.
True for organists (whom I don't know much about, but read on, as I believe the following applies to them as well).
The fact that the effects are laggy doesn't mean you need to sync on them; you're mostly listening for whether the sound envelope is correct (e.g. adjusting for overshoot/undershoot on a bend). Checking the output of effects only matters at timescales of whole beats/measures in terms of how you have to react, definitely not like keeping up with tempo, which has to be precise across the band.
You don't get delay on drums, and exceedingly rarely on bass, which is what you have to react against efficiently, plus the instrument you're playing provides its own subtle and not-so-subtle haptic feedback. On a guitar, this means the moment you pluck the chord is based on your internal clock, synced by listening to the tempo giver (usually bass).
> Try playing on Zoom, it’s absolutely impossible (unless you assume latency is fixed, which it is not, and play in a tempo matching latency exactly, and follow the lead but time-warp your play by one beat, which I imagine is possible as a sort of canon but absolutely horrendous) unless you completely ignore the other part and find a common time source to sync upon.
Zoom's latency isn't fixed, so that won't work. On the other hand, it is possible to do something like this over the internet by measuring latency and then adjusting for it: www.jefftk.com/p/bucket-brigade-singing
> Anything below 60Hz in VR has a significantly increased chance of giving motion sickness due to latency.
I think there is a subtle difference between fps (avg) and latency. I would say that a 'stable' latency is always better than a high fps.
> Try Guitar Hero / Rock Band with various audio / video latency settings. 10ms is obviously impactful, anything higher borders on unplayable, and even 5ms has a subtle impact, increasing missed notes.
Do you mean that you set an additional latency for both video and audio? Or you delay only the audio by 10ms?
> I think there is a subtle difference between fps (avg) and latency
Yup, I only mentioned 60Hz, because 16.667 ms is more obvious WRT latency but less known.
> Do you mean that you set an additional latency for both video and audio? Or you delay only the audio by 10ms?
No, RB allows you to set up the audio and video latencies, and since the track is fully known, it’s able to apply negative latency to everything but the input (which, being physical user input, is the only thing that can’t be altered).
So you set up input->audio latency and input->video latency values and the game will internally play them ahead, so that they appear exactly in sync in the physical world.
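A tiny sketch of the idea (Kotlin, with made-up names; not the game's actual code): because the note chart is known in advance, every pre-known event can simply be scheduled early by the calibrated amount so it lands in sync with the physical input.

    // Calibration offsets measured by the player, in milliseconds.
    data class Calibration(val audioLatencyMs: Long, val videoLatencyMs: Long)

    // "Negative latency": start audio/video early so they reach the player on the beat.
    fun audioStartTimeMs(noteTimeMs: Long, cal: Calibration) = noteTimeMs - cal.audioLatencyMs
    fun videoStartTimeMs(noteTimeMs: Long, cal: Calibration) = noteTimeMs - cal.videoLatencyMs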
There are 120Hz iPads, so wouldn't that drop to 16ms?
A physical piano with a 1.2" hammer blow distance played at soft velocity (~2.2mph), assuming distance of closest resonating surface to ear is half meter, would have around 35ms latency.
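Rough sanity check of that figure (assuming a constant hammer speed and ~343 m/s for sound in air):

    t_{hammer} \approx \frac{1.2\ \text{in}}{2.2\ \text{mph}} = \frac{0.030\ \text{m}}{0.98\ \text{m/s}} \approx 31\ \text{ms}, \qquad t_{sound} \approx \frac{0.5\ \text{m}}{343\ \text{m/s}} \approx 1.5\ \text{ms}

so roughly 32-33 ms under these assumptions, in the same ballpark as the ~35 ms above.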
Xiaomi has hit 480Hz touch sampling with the Mi 11. It looks possible in both hardware and software; someone just has to manage to perfect both in the same phone.
I think typical newish (2014+) devices have between 5-9ms latency. Having core audio from the start was a huge leg up for iOS for sure. I am mostly platform agnostic unless I am working with audio where I pretty much only use Apple products.
The article and all the responses here express "latency" as a simple scalar number. Is it actually more like a histogram of values than a scalar, and the marketing number is the 90th %ile of that histogram?
My impression is that Android has more latency variance than iOS. If that's true then it's important to be clear what the reported numbers represent.
If you open the audio context you should see consistent latency as long as you keep it open, but each time you open a new audio context you might end up with different latency. You should not see jitter within a single context.
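To make that concrete, here's a minimal Kotlin sketch (function name and attribute choices are mine, not from the article) of opening an output stream that requests Android's low-latency path and reporting its buffer size, which stays fixed for the stream's lifetime; serious audio apps would normally use AAudio/Oboe from the NDK instead.

    import android.content.Context
    import android.media.AudioAttributes
    import android.media.AudioFormat
    import android.media.AudioManager
    import android.media.AudioTrack

    // Query the device's native sample rate / burst size and open an output
    // stream that requests the low-latency ("fast") path. Assumes API 26+.
    fun openLowLatencyTrack(context: Context): AudioTrack {
        val am = context.getSystemService(Context.AUDIO_SERVICE) as AudioManager
        val sampleRate =
            am.getProperty(AudioManager.PROPERTY_OUTPUT_SAMPLE_RATE)?.toInt() ?: 48000
        val framesPerBurst =
            am.getProperty(AudioManager.PROPERTY_OUTPUT_FRAMES_PER_BUFFER)?.toInt() ?: 192

        val track = AudioTrack.Builder()
            .setAudioAttributes(
                AudioAttributes.Builder()
                    .setUsage(AudioAttributes.USAGE_GAME)
                    .setContentType(AudioAttributes.CONTENT_TYPE_SONIFICATION)
                    .build()
            )
            .setAudioFormat(
                AudioFormat.Builder()
                    .setEncoding(AudioFormat.ENCODING_PCM_FLOAT)
                    .setSampleRate(sampleRate)
                    .setChannelMask(AudioFormat.CHANNEL_OUT_STEREO)
                    .build()
            )
            .setPerformanceMode(AudioTrack.PERFORMANCE_MODE_LOW_LATENCY)
            // Two bursts of stereo float samples (4 bytes per sample).
            .setBufferSizeInBytes(framesPerBurst * 2 * 2 * 4)
            .build()

        // Nominal buffer latency for this stream, fixed once it is open.
        val bufferMs = 1000.0 * track.bufferSizeInFrames / sampleRate
        println("buffer = ${track.bufferSizeInFrames} frames (~${"%.1f".format(bufferMs)} ms)")
        return track
    }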
Having used nearly every flagship android device from the HTC Dream up through the Nexus 6p, and then switching to an iphone 6 and never looking back, except watching acquaintances use their android phones.
Also from talking to an engineer who worked on earlier generations of iphones, who said that every team had a latency budget that they could not exceed.
On the 3D side it is similarly disappointing, as Google never put in the effort to provide proper frameworks similar to Apple's kits, and Sceneform, in Google tradition, is now deprecated after VR on Android lost its marketing drive.
If Android can get its audio latency act together there is some real opportunity here to steal market share from Apple. Since the release of the 2nd iPad Pro, Apple has forced audio processing onto the slower low-power CPU core. This means that for the last three years I can run all my iOS synth applications at 96kHz with a low sample buffer size without any dropouts on my first-gen iPad Pro and an iPad mini 3. What I cannot do is get similar performance on last year's iPad Pro. It is absolute insanity that Apple has no workaround to direct audio processing to the high-powered cores for apps made for musicians. Presumably at some point they will allow you to do this with a documented programming call, but it has been three years of bad performance in synth applications due to the need for large sample buffers (i.e. high latency).
I've noticed an increase in measured latencies, but didn't know about that slower CPU core thing.
Sad to see Apple take such a direction - musicians appreciated having proper hardware/software and Apple definitely made money on all the paid content. Why let this arrangement fall by the wayside?
Responsiveness and low input-to-result latency was why I chose iPhone over android 10 years ago even though I prefer a more open environment like android. Sounds like things haven’t changed a whole lot on that front.
Sometimes I wonder if these increased levels of abstraction and even digitization itself (or at least the Internet-ization of things) is a mistake for human-machine interface. Watching drone racing pilots use analog controls and analog video headsets in spite of the potato resolution because WiFi/digital/etc just add too much latency is kind of eye opening and makes you sort of reconsider the last couple decades. (Improvements have been made so digital isn’t so high latency and VR has served to address a lot of these latency concerns, but it’s still kind of interesting...)
Can you really install a custom ROM when it flushes the proprietary firmware for e.g. the camera? There are only a handful of devices that don't do that, afaik.
It used to be possible to dip your toes in with light modifications while still keeping everything working. Now with verified boot chains, safety net and hardware backed verification, you either have everything fully locked down or you go full Richard Stallman and have everything foss.
That's objectively wrong. Less than 1% of apps require verified boot/safetynet/hardware attestation.
I'm running my own Android ROM without cheating Safetynet and I still have access to everything I use using their original unmodified uncheated app: Netflix, Spotify, my banks, all my connected objects (withings, Philips hue, bluetens, neato), or some other random apps.
There are things that don't work, it's true. The only ones I have witnessed are Pokemon Go and NFC payment apps (like Google Pay). This is very far from "going full FOSS RMS-style".
Yes, I'm all against data collection and I'm pro open source, but let's be honest, Android, despite its flaws (of which it has many), has always been the best MAJOR choice in that regard.
Of course you can always pick up a Librem 5 or a Pinephone, but I'd argue it comes with a lot of drawbacks.
For comparison, the Nintendo Switch cardboard piano works by having you press a cardboard key with a highly reflective sticker on the other end, to lift that sticker above a barrier, so it can be seen by the game controller's integrated IR camera, which streams the image over to the switch over bluetooth, which then runs some image analysis to figure out which key sticker just became visible and THEN plays a note and the latency is STILL so much lower than on Android, that you can actually play it like a musical instrument.
The Wiimote has built-in IR blob extraction from the sensor on the device, so I think he is wrong unless the Switch/Joy-Con changed things. In some modes, which the piano may use, it basically only sends sparse location data for the 4 brightest detected light blobs.
It'd be nice if something was done about the extra 100-200ms (!!) of latency that Bluetooth headsets can add[1]. I understand this is largely a manufacturer problem, but I'd love to see Google lead the way with some sort of certification program, or even make their own low-cost BT chipset for 3rd-party headphones to use, to improve the situation.
Of course Android is already worlds better than desktop, where 200ms is seemingly closer to average than an outlier.
It is also kind of cool that perusing that chart, it seems headsets connected to Android have, on average, lower latency than when connected to iOS!
There are low latency bluetooth protocols, if you sort by latency on rtings for aptx-LL you can see some headsets are down below 40ms. Of course, this relies on both the host and the headset having support for the protocol and being engineered well. SBC and aptX just aren't designed for low latency, and low latency is something that can't be hacked into every codec after the fact.
I'll argue that a device meant for phone calls (which is where Bluetooth audio started out!) should aim for low latency by default. I also understand that the CPU/latency/memory trade-offs made over a decade ago are not the same ones we'd make now.
Doesn't change the fact that the same device can have a 3x latency difference between platforms with the same codec.
Google went and developed VP9 as a royalty free codec in the AV space, they should do something similar for Bluetooth. Right now if I'm on a video call with someone and we both have BT headsets, it is very possible that there is a 500ms audio latency!
Latency on calls over cellular networks is already very high (my subjective experience). But we humans are good at working around it when it's just two people - wait for the other person to finish speaking before you speak, and the other person might have a little longer delay before hearing your response.
If there are multiple parties on the call, this breaks down, because if two people want to speak, they both start talking and don't realize it for a little while.
On video calls this also breaks down, because either video and audio get out of sync (like with zoom) or you get odd artifacts in either the video or audio as they are kept in sync.
> I'll argue that a device meant for phone calls (which is where Bluetooth audio started out!) should aim for low latency by default. I also understand that the CPU/latency/memory trade-offs made over a decade ago are not the same ones we'd make now.
It turns out that the device that's meant for phone calls has to, first and foremost, make sure that there are no dropouts, crackles and other buffer underrun artifacts. Latency comes afterwards.
I would be happy if something could be done to prevent audio latency for all devices on my system from permanently increasing if I ever connect a bluetooth audio device.
On Linux, I have to restart pulseaudio after using Bluetooth audio, otherwise retroarch is unplayable.
BTW where are you getting numbers for 200ms latency being average on desktop? My understanding was that latency is typically around 40-50ms on desktop if you are using analog speakers.
I've never understood why Wi-Fi latency is only 2-3ms while Bluetooth is 100-200ms when they use the same frequency band.
Is it intentional to ensure there's plenty of buffer so audio packets can be re-sent when they fail, and the audio is seamless? In other words, is 100-200ms just inherent at that frequency with expected interference?
> I've never understood why Wi-Fi latency is only 2-3ms while Bluetooth is 100-200ms when they use the same frequency band.
You're mixing up two different things. The transport latency of WiFi is 2-3ms, but for Bluetooth you're suddenly talking about transport, codec, receive buffer and playback code latency. That's not an apples-to-apples comparison.
If you want comparable numbers, measure the latency of audio being streamed from one computer to another via PulseAudio and you'll quickly see that it's more than 2-3ms. And there's a reason for that: most people strongly prefer their music to not stutter and to work properly with all kinds of headphones, rather than having absolutely no latency. Most people prefer to shove their phone in a separate room or a back pocket, walk around RF-saturated areas, and still listen to music without interruption.
Hence the manufacturers err on the side of buffering some audio so it's more resilient to connection dropouts. Your wifi, after all, doesn't need to move around busy highways and public transport systems.
I'd assume a combination of larger buffers for seamless error recovery and also for lower power consumption. The receiving side of that bluetooth connection is also typically extremely battery constrained.
To have minimal latency basically everything needs to stay active constantly. CPU is constantly awake sending or receiving data. Radio is constantly awake. Interconnects constantly awake. If you instead send a large buffer, you can sleep between bursts of work. Dropping to lower power states in the interim, and only a small DSP doing the audio output needs to be awake.
Overhead that’s not linear with the buffer size. The cost (in CPU time, etc.) of updating a 480 sample buffer is similar to that of a 64 sample buffer.
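For scale, assuming a 48 kHz stream, the latency a buffer adds is just its length in frames over the sample rate:

    \frac{480}{48000\ \text{Hz}} = 10\ \text{ms} \qquad \text{vs.} \qquad \frac{64}{48000\ \text{Hz}} \approx 1.3\ \text{ms}

so the larger buffer adds roughly 7.5x the latency while the per-callback CPU cost stays about the same.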
The figure of most interest to someone playing an electronic instrument is "tap-to-tone latency". The article indicates a minimum of 43 ms for that (28 round-trip latency - 5 audio input latency + 20 touch latency). That's like ten times what you'd hope for.
Emulators of old retro games also care about audio latency, as the emulated audio is generated on the fly by a Turing machine, which makes it very hard to match visuals to audio if there is inherent audio latency.
The 20 ms touch latency reference is from 2017, maybe this has gotten better in the past 4 years? But maybe not. And per the source, it's possible to get down to 10 ms round-trip latency if you buy the right phone.
A 60Hz touch scan rate (16.67ms) was common in 2017, but now we can easily buy a phone with a 240Hz (4.17ms) touch scan rate. That should be a big improvement.
Andrew Huang recently made a YouTube video [0] that provides some insight from the perspective of a music producer and someone who recently created a music production app. It's not overly technical, but it highlights some of the challenges one might face.
As the article says, there is room for improvement, so I guess it's good to see that the long term goal is 10ms round trip.
That was a strange video. It seemed like he started talking about latency between Android and iOS devices, which makes sense given that this article talks about response time of just under 40 ms like you said. The article says this is "well within the range required for real-time applications" if you define real-time applications as non-pro-audio stuff. If you're playing an instrument and it takes 40 ms for you to hear back the sound, that's Not Good.
But then the video takes a strange turn and starts talking about "Stability" of mac vs PC, and I had to turn it off. It made the argument that "I haven't seen any professional producers use anything except macs" as its main argument.
It would have been nice to hear some more technical information like mentioning core-audio, or something else.
Regarding the article itself: I'm excited that Android is targeting less than 20 ms round-trip latency. I'm hoping that this work can be brought back to the Linux desktop. Messing around with Jack and dealing with xruns is something I wish I could stop doing.
To be a little pedantic: He is not claiming that producers are never using PCs, he is claiming that live audio never runs on PC, because stability is the top priority there. Later he says in the studio you can tolerate a little instability if the other benefits of windows are worth it.
That no producers are using windows PCs is an obviously false claim. There are DAWs that don't even run on mac. There are also live production tools that don't, like Notch, so there must be someone running windows on stage, anyway.
That macs are more stable in day-to-day operation is not really a bold claim though.
> That macs are more stable in day-to-day operation is not really a bold claim though.
It absolutely is. Do you have any data to back up the claim? Anecdotally windows 10's stability seems vastly superior, especially with every release of MacOS seemingly regressing further and further.
I just ran ‘uptime’ on my Mac terminal. 194 days. I pretty much never reboot the thing. It’s just always stable, despite hundreds, maybe thousands, of sleep-wake cycles, usually a dozen apps open at one time, the thing is as stable now as it was when I rebooted it some six months ago.
I used to use windows, for a decade, and certainly never had any experience like that.
The only time I ever reboot Windows 10 is to apply updates, which mac os also forces you to do. Windows doesn't even go down if a driver crashes anymore.
Updates alone put my old macbook pro at lower availability than any of my windows or Linux systems since MacOS manages to have a uniquely slow update system. I don't get how major MacOS updates manages to be several times slower than fresh installs of Windows or Linux.
Linux audio can be real bad but the one thing that could possibly make it worse is taking anything from Android. Android is the only computing platform I have ever used that was unable to play a simple audio file through without the audio dropping out at least once (I had the same experience on multiple devices, 5+ years apart).
I've started to. But I haven't done enough research on how all the other applications I use ( Carla, Ardour, soft synths, etc ) will play along with Pipewire instead of jack.
>"I haven't seen any professional producers use anything except macs" as its main argument.
Yeah that is both totally irrelevant to the topic at hand and also a bit strange. I have only seen PCs in the pro studios I have used but that doesn't make me believe nothing else is used.
Same here. I've even seen an increased number of Linux-based studios, oftentimes running Bitwig or Reaper. I don't doubt that audio latency can be a little high on some workflows, but I've honestly had a less laggy experience with Pulse than Coreaudio. Out of the box, Bitwig has a latency of ~8ms on my (Linux) machine, as opposed to 20ms on my Macbook.
Usually, "an update on [blank]" is a euphemism for said product getting cancelled.
More to the point, one of PulseAudio's claims was that they consumed less CPU than Android's audio pipeline. With the recent hyped PipeWire, I wonder how that compares to Android's Oboe.
That's actually the joke here, which got a chuckle out of me. The idea is that excessively large audio latency itself has gotten cancelled.
I haven't played with it myself (it's been a few years since I was actively involved in Android audio latency), but from what I've seen, AAudio is pretty good, probably close to what the hardware is capable of. I'd very much encourage people to do empirical measurements against, for example, PipeWire (which also looks good). Do keep in mind that constraints are different on mobile, and things like power management can and do get in the way of down-to-the-wire latency.
> I'd very much encourage people to do empirical measurements against, for example, PipeWire (which also looks good).
A rule of thumb from reading linux audio mailing lists but which probably works generally-- completely disregard any claim of measurement of latency that isn't prefixed.
For example, the Google blog labels the Y-axis with "round-trip time in milliseconds." They're clearly and explicitly measuring round-trip latency. They then piggy-back on that well-understood concept to introduce the concept of "tap-to-tone latency" and show measurements for that.
Almost without exception, the times I've read a back-and-forth conversation where the word "latency" is used unprefixed, the commenters might as well be describing the warmth of vinyl recordings. Often they are simply describing an integer they typed into a widget or config file.
On several occasions I've witnessed users insert Jack between a single application and ALSA because "that's what the professionals use." These are knowledgeable users who want as little latency between the system and their ear as possible.
Where was that claim made (for pulseaudio vs Android)?
Hearing any claim that pulseaudio uses less cpu than some alternative makes me blink. For years I could not use pulseaudio while playing games, because it would eat 99% cpu and I only had one cpu core. Things are better now that I have eight cores, but I don't know if that's due to the extra cores or if pulseaudio has improved.
I tried jackd, but the documentation is overwhelming.
I remember ESD and OSS working just fine, but I was young and maybe I was just happy to get audio at all. I grew up with blips and bleeps coming out of the PC speaker.
The improvement is great, but I think the audio professionals jumped ship long ago to Apple, and they're not going back. I know of several music professionals plugging into their Macbooks or iPads as part of the pre-amp, but I don't know if any using Android. Google sat on this for way too long while Apple courted the creative professionals, and for any that have already made the upfront buy-in for Apple hardware, I don't see them taking the risk of switching ecosystems to save a few dollars for performance that still won't match what they're used to.
There will always be a niche that wants to do music production or engineering and can't afford Apple products. Newbies or people living outside the western world, for example.
Just because the pros have merged towards a single brand doesn't mean that the only other smartphone monopoly won't benefit greatly from offering other high quality solutions.
> There will always be a niche that wants to do music production or engineering and can't afford Apple products.
If Google cared about niches, they wouldn't have such a large graveyard of dead products - some of which were profitable and some of which were widely loved.
Features require maintenance the same as products. If this feature is enough to go after entire new categories of customers, aka musicians, this feature effectively defines a new product opportunity for Google. Also Android is already buggy as fuck, so every new feature they add actually is a big deal.
Ever since Google killed Reader I have no faith in them to ever maintain anything that isn't directly selling ads.
This is the reason I don't trust Google. What was the headcount to maintain Google Reader? I'm pretty sure an intern could do it part-time.
Also if it's part of Android, I bet it will unravel a bunch of hidden bugs that haven't been exposed before. Like I said, Android is already buggy as fuck. My Pixel 4a freezes/crashes regularly and Google told me (paraphrasing) "it just does that."
Well first you need a couple of UI people to move the buttons around every once in a while to keep people on their toes. Then you need a few people keeping the backend up with deprecations, a qa person, a few people making sure porn doesn't get into Trends, and a few SRE people to make sure the servers aren't going up in flames.
Perhaps. Though I wonder if every musician afford Macbooks or iPads. These things are not cheap, especially if you consider earnings of non-US based musicians, etc.
This is where Apple's far superior product lifecycle comes in: Buying a four year old Apple product is affordable, still runs the current OS, unlike Androids that old, and has the audio performance down.
This is a valid point, but it’s hard to place the blame on Google/Android. I think the challenge is less of Android itself but a refusal of the OEMs to support newer OS versions. This makes business sense, but sucks for consumers.
I think Google is doing a better job with their Pixel devices. I honestly hope they try to lead the Android industry more with them.
Blame, here, can't be squarely placed on any one entity. But it is shared between both Google and the OEMs. Google created the Android product, and permitted people to sell devices utilizing Android and to brand/advertise their devices with Google's Android branding.
It absolutely does, in part at least, fall on Google to establish how the Android label can be used. Their decision has (apparently) been that it can be used by anyone even if their devices become (practically) obsolete in just a year or two from various perspectives (not in an absolutely obsolete sense, but a lack of many/most/any updates is a crucial element of currency for mobile devices and hastens obsolescence). That's on Google, not the OEMs who take advantage of it.
I'd like to say the OEMs should do better, but the market has clearly convinced them that they can be profitable and sell soon-to-be-obsolete devices.
A four year old Android is years out of support to Google. You get two, maybe three, years from them at best and then security patches until they get bored.
That's an old device. Notice I said now (as of 2017). Google announced 3-year OS updates a year after the Pixel was released (which they specifically stated at launch would get 2 years of updates).
The Pixel 2 got updates for 3 years. I know because I have one sitting next me right now.
Honestly, I don’t care. I was forced to buy a Pixel 1 because the Nexus 5X didn’t get updates anymore, and by the time the Pixel 1 stopped getting updates, I just gave up.
I won't support Android 11 and later in any of my apps. If someone has an issue with that, they're free to send me a device getting the Android Preview releases for the version they want me to support. I treat Android just like iOS now: Want support? buy me a devkit.
In the past it was possible for me to just build AOSP master from source and test with that, but 3.0 was the first release breaking that assumption and after 5.0 the assumption was broken entirely. Now I just won't support it anymore.
Musical instruments aren’t too cheap either. Drummers and guitar players can get good stuff secondhand, but most synthesizers will run you way more than a mid-range MacBook and a copy of Logic. God help you if you need a piano...
The big problem is that there are few great Android tablets (mostly from Samsung, previously Huawei) and a small community buying them. That also means fewer apps get released for Android tablets. I prefer Android, but I chose an iPad because of this situation.
Off topic, but would be nice if they could do something - presumably far simpler - about the lack of fine-grained volume control.
It might be a niche case, but it's really frustrating using the various sleep-related audio apps at night with earphones and not being able to select a minimum volume that is both loud enough to hear but not so loud it keeps you awake (example: Audible with a sleep timer).
I'm sure there are other uses for the finer control as well.
My ears have aged, but when I was younger, I was frustrated that my flip-clock/radio could not go low enough before the volume hit zero. It had an analog volume dial.
I had the impression the problem was physical limitations with the dial. With fully digital controls, it seems like this wouldn't be an issue. But obviously you need more than the handful of steps the volume buttons on the side of the phone give you.
Some vendors, e.g. Samsung, have done some hacks around there, providing more precise audio controls if you drag the audio slider by hand.
I don't know if there are any programmatic system-wide ways to do the same (every app can implement it in their software mixer though). I haven't had great experience with third-party solutions that obviously cannot integrate with the OS at the same level.
That sounds involved but 100% doable using synthesized volume limiting in-app.
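Something along these lines would do it (a hedged Kotlin sketch; the function name and the example gain are made up): scale the PCM samples in the app's own mixer before they reach the output stream, which gives arbitrarily fine steps below the system's minimum volume.

    // Apply a software gain to 16-bit PCM samples before writing them to the
    // output. A gain of 0.02 is roughly -34 dB, far below the quietest step
    // most system volume sliders offer.
    fun applyGain(samples: ShortArray, gain: Float): ShortArray {
        require(gain in 0f..1f) { "gain must be between 0 and 1" }
        return ShortArray(samples.size) { i ->
            (samples[i] * gain).toInt()
                .coerceIn(Short.MIN_VALUE.toInt(), Short.MAX_VALUE.toInt())
                .toShort()
        }
    }

For apps that own their playback (as Audible does), MediaPlayer.setVolume() with a small float achieves a similar effect without touching raw PCM.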
Have you tried pestering Audible? Would probably take a while, but it feels like the kind of thing that would be hard to "un-justify" once you started arguing for it.
That's a nice idea - there are apps that purport (unsuccessfully, in my experience) to work around the issue in software, though I don't know whether they use that approach.
Obviously any major app supplier may be reluctant to invest significant dev time to offer a workaround for an OS limitation (and Audible is just an example), but yes, it's worth raising I suppose, thanks.
I had a quick poke around to see if it's possible to implement something like audio filtering on Android, and found just about exactly that question asked over at https://stackoverflow.com/questions/15385797/hook-simple-aud... TL;DR no, for obvious (albeit annoying) security reasons.
I'm not even sure if it would be straightforward with a rooted device either.
40ms is way too much. To put it in perspective, at 60Hz, that's almost 3 frames of video.
10ms would start to be useful for music, but I wouldn't call it good until it was below 2ms, which is what I run my jackd setup (which requires linux-rt) at.
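For reference, jackd's nominal playback latency is the period size times the number of periods over the sample rate, so 2 ms corresponds to settings along the lines of -p 48 -n 2 at 48 kHz (real hardware adds a little on top):

    \text{latency} = \frac{\text{frames/period} \times \text{periods}}{\text{sample rate}} = \frac{48 \times 2}{48000\ \text{Hz}} = 2\ \text{ms}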
I'm afraid we won't see good latency until Android moves away from Linux. The good news is that this is bound to happen, sooner or later, as Fuchsia exists.
There is a huge difference between the latency you need to feasibly pull off a live performance and the latency you need for high quality audio production.
Why you would be considering doing that on an android though I'm not sure. Though I guess it may be a reinforcing problem, there isn't an ecosystem because the tech isn't there.
The latency that matters is the discrepancy between what you feel with your fingers and what you hear. Your finger nerves respond extremely quickly, and if what you hear is delayed from what you feel by more than ~10ms, it is extremely jarring and basically unusable
Up to now I've been referring to round-trip audio latency. Round-trip latency involves three components in the audio chain: audio input, audio processing and audio output.
I read through the whole article and that's the most detail given about any testing methodology; I don't see any more description of how these numbers are being measured.
My rule of thumb for computer speed measurements and efficiency has always been in terms of CPU instructions; the goal of 10ms, while seeming very tiny, is still roughly ten million instructions on a 1GHz CPU. The other numbers mentioned are proportionally more, which then makes one wonder where all the time is being spent.
Relatedly, I've actually written audio drivers (HD audio for Windows 3.x/9x, long story but I won't digress...) and the output latency measured from the app writing a buffer to the signal appearing at the headphone jack is probably in the dozens to hundreds of microseconds range (thousands of CPU instructions, including virtualisation overhead) at most --- but those OSs have extremely thin abstraction layers, and the call stack between the app and writing to the hardware registers is very shallow.
The fact that newer hardware/software seems to have more audio latency reminds me of this: https://danluu.com/input-lag/
It's a simple test: produce a tone (either on built-in speaker or over wired headphones if using a loopback dongle), measure the time it takes for that tone to reach the audio input.
For devices which I didn't have access to (our team has a limited number of test devices) I used the figures from Superpowered, but with some assumptions/rules:
#1 If AAudio was available on the device I used the measurements from that rather than OpenSL ES
#2 No measurements from custom ROMs
#3 I used the measurements from the latest version of Android which the OEM had released for that device. For example, if a device was originally released with Lollipop but it was possible to upgrade to Marshmallow then I used the figures for Marshmallow.
open recorder
play audio file
loop:
keep listening to input frames for 1-2 seconds
# analyze captured file
look for 2x consecutive appearances of the original audio file
sample := timestamp2 - timestamp1
Q1: Did I get this right? Any hunch about the pros&cons of both approaches?
# downlink/uplink latency
Many of the latency measurements break the latency down between audio downlink and uplink.
Q2: How is this measured?
The android doc discusses using a GPIO to have a zero-latency signal. You swap the signal on a well-known GPIO, and then play a well-known audio file. You then connect the GPIO to a speaker, and measure the latency between the GPIO-fed speaker, and the actual device speaker. As an optimization, you plug both the GPIO and the audio jack that feeds the actual device speaker into an oscilloscope, and get the distance there. Is this how audio downlink latency is measured?
Q3: If this is how downlink latency is measured, how do you implement it in prod devices (no access to the board)? Is this something that can be simulated using the usb-c plug?
# tools
The android doc/some googling points to several tools. I found oboetester, splatency, drrickorang, and google walt. I tested all but the walt one.
The first interesting thing is that oboetest, drrickorang, and walt seem to need an external jack-based dongle.
Q3: Why is this needed? The audio latency approaches mentioned above should only need a speaker and a mic.
The experience has not been great for any of the tools. Operationally, I'd like to script these tests. Instead, I need to click on GUIs. Also, all the tools I tested seem to fail often, so I have to repeat the experiments multiple times. Also, I miss better documentation on exactly what they do.
We do not use the Larsen Effect any more because it was too sensitive to variations in gain. We now use a random encoded bit stream that sounds like a short noise burst. We can get a better correlation peak with that signal.
> Many of the latency measurements break the latency down between
> audio downlink and uplink.
It is very hard to separate the input and output latency without special hardware (like the WALT device).
You can measure combined input+output latency using a loopback test. Input latency tends to be much lower than output latency. This is because, when the full duplex stream is stable, the input buffer is close to empty and the output buffer is close to full. Then, if there is a preemption, the input buffer fills up and the output buffer drains, providing glitch protection.
You can measure touch+output latency using tap-to-tone. The screen touch latency is about 15-30 msec. If you use a hardware MIDI controller (like a keyboard or drum pad) instead of tapping the touch screen then you can get a “touch” latency of about 1 msec (MIDI is a lightweight and low latency protocol). Then you can get a better estimate of just the output latency.
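To illustrate the MIDI route, here is a minimal Kotlin sketch (the class name and callback plumbing are hypothetical; MidiReceiver.onSend is the real Android API, and the sketch assumes one complete 3-byte message per callback) that timestamps incoming note-ons so the gap to the rendered tone can be logged:

    import android.media.midi.MidiReceiver

    // Records a nanosecond timestamp for each note-on so it can later be
    // compared with the time the synth callback actually produced the tone.
    class NoteOnTimer(private val onNoteOn: (noteOnNanos: Long) -> Unit) : MidiReceiver() {
        override fun onSend(msg: ByteArray, offset: Int, count: Int, timestamp: Long) {
            if (count < 3) return
            val status = msg[offset].toInt() and 0xF0
            val velocity = msg[offset + 2].toInt() and 0x7F
            // 0x90 = note-on; a note-on with velocity 0 is really a note-off.
            if (status == 0x90 && velocity > 0) {
                onNoteOn(System.nanoTime())
            }
        }
    }

You would connect it to an opened device's MidiOutputPort via connect().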
> The android doc discusses using a GPIO to have a zero-latency signal.
We don’t normally use that technique because it requires special hardware.
> I found oboetester, splatency, drrickorang, and google walt.
Our group supports OboeTester.
> The first interesting thing is that oboetest, drrickorang, and walt seem to need an external jack-based dongle.
The lowest latency path is usually over the headphone jack (either through 3.5mm or USB dongle). To test this path you do need a "jack-based dongle" aka loopback adapter.
The reason why wired headphones usually give you lower latency is that OEMs often introduce digital signal processing for the speaker to improve the acoustics/quality, which can introduce additional latency. Side note: this is why you will often see "best with headphones" on games and music apps.
That said, OboeTester will work over the speakers and mic in a quiet room. That is because the new random bit technique is more robust.
Thanks Don and Phil for the great answer. Some further comments:
* I find interesting that you are getting better correlations when you add a well-known noise burst instead of a less chaotic signal (e.g. a tone or a chirp). In retrospect, it makes sense
* for the uplink/downlink breakdown, I see in the Usage.md file in oboetester that you're isolating the downlink measurement with the tap-to-tone experiment. For this experiment, the doc suggests using (a) the jack to avoid the speaker-processing extra latency, and (b) a USB-MIDI input device to replace the touch screen latency (15-30 ms).
2 questions here:
* Q1: I assume that, if instead of the jack, you use a usb-c audio adapter accessory mode, there should be no extra latency either, right? (I'm using late pixel phones)
* Q2: which device are you using for the USB-MIDI input?
> * Q1: I assume that, if instead of the jack, you use a usb-c audio adapter accessory mode, there should be no extra latency either, right? (I'm using late pixel phones)
That's correct, by using either the 3.5mm jack, or USB-C adapter you won't incur any additional latency introduced by DSP to improve the speaker acoustics.
However, the USB path typically does have a few ms higher latency than the 3.5mm jack path.
> * Q2: which device are you using for the USB-MIDI input?
We've used a variety of devices and found the latency differences to be negligible. At the moment I test with an old AKAI LPK25.
> I read through the whole article and that's the most detail given about any testing methodology; I don't see any more description of how these numbers are being measured.
The details usually aren’t that interesting. What you can do is record an input signal, pass the signal through to the output, and wire the output back in to another input.
You end up with two copies of the input signal, one delayed by the total round-trip time relative to the other. It’s fairly easy.
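A minimal sketch of that analysis step (Kotlin, brute-force correlation; real tools use a signal designed to correlate well, as mentioned elsewhere in this thread):

    // Estimate the round-trip delay by finding the lag at which the captured
    // signal best correlates with the reference that was played out.
    fun estimateDelayFrames(reference: FloatArray, captured: FloatArray, maxLagFrames: Int): Int {
        var bestLag = 0
        var bestScore = Double.NEGATIVE_INFINITY
        for (lag in 0..maxLagFrames) {
            var score = 0.0
            for (i in reference.indices) {
                val j = i + lag
                if (j >= captured.size) break
                score += reference[i].toDouble() * captured[j]
            }
            if (score > bestScore) {
                bestScore = score
                bestLag = lag
            }
        }
        return bestLag
    }

    // latencyMs = estimateDelayFrames(ref, cap, maxLag) * 1000.0 / sampleRate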
It's certainly interesting because they can have a huge effect on the measurement results. For example, with HDA and I suspect other audio codec systems, what you described can be done with essentially 0 delay by configuring the codec to do the in/out mix itself. If you are measuring output from application buffers to input to application buffers, then how big they are and at what sample rates and bit depths will also affect the latency you see, etc.
It is welcome to hear that Google is working on this because sound latency is critical for visually impaired users using text to speech on their phones. The TTS latency is on top of the sound latency and can be significant, to the point where the UI is unresponsive or falls behind. Any improvements they can make will be greatly appreciated.
It's nice that some things get faster sometimes. Let's celebrate that, instead of whining about all the things that don't get faster, or complaining how they aren't getting faster enough.
> The average latency of the most popular Android phones has dropped to under 40ms, which is well within the range required for real-time applications.
Uh.... It needs to be at least 10 times faster to claim realtime use
You have a point, but my main concern is just the inevitable internet-connected app ecosystem for expensive and cool devices that are going to have a security patch lifecycle of 18-36 months with an expected lifespan of 15 years for second and third owners
(Disclosure: I work for Google, speaking only for myself)