I would like to know

1. How consistent the output is on a given device (is it 100% deterministic, and will it always produce the same fingerprint?)

2. How big the variance is across devices (i.e., how good a fingerprint it is).

The article doesn't really dig into that, which is the most important thing if you want to use it as a fingerprint.

I'm not quite sure why different devices would generate different output, other than differences in floating point computation.
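
For reference, the common approach (used by fingerprinting libraries) is to render a fixed signal through an OfflineAudioContext and reduce the output to a number; the DynamicsCompressorNode's nonlinear floating-point math is where most of the cross-device variation is said to come from. A minimal sketch; the exact node graph and sample count here are my assumptions, not anything from the article:

    // Sketch of the usual OfflineAudioContext fingerprint.
    async function audioFingerprint() {
      // Render 5000 samples at 44.1 kHz entirely offline; nothing is audible.
      const ctx = new OfflineAudioContext(1, 5000, 44100);

      const osc = ctx.createOscillator();
      osc.type = 'triangle';
      osc.frequency.value = 10000;

      // The compressor's nonlinear math is where cross-device
      // floating-point differences tend to show up.
      const comp = ctx.createDynamicsCompressor();
      osc.connect(comp);
      comp.connect(ctx.destination);
      osc.start(0);

      const buf = await ctx.startRendering();
      const samples = buf.getChannelData(0);

      // Collapse the rendered samples into a single scalar.
      let sum = 0;
      for (let i = 0; i < samples.length; i++) sum += Math.abs(samples[i]);
      return sum; // stable per device/browser stack, differs across them
    }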




An AudioContext exposes properties of your machine's audio stack to JavaScript:

https://developer.mozilla.org/en-US/docs/Web/API/AudioContex...

Everything is different for every version of firmware, hardware, driver revision, etc. It's like enumerating fonts: combined with other fingerprinting techniques, it provides a very unique snapshot of you.
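
To give a feel for the surface, here is some of what's readable without any permission prompt (property support varies by browser):

    const ctx = new AudioContext();
    console.log(ctx.sampleRate);     // e.g. 44100 or 48000, from the OS/driver
    console.log(ctx.baseLatency);    // processing latency of the context
    console.log(ctx.outputLatency);  // output-device latency (not in all browsers)
    console.log(ctx.destination.maxChannelCount); // speaker configuration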

The EFF's Panopticlick provides an audio fingerprinting example. There are plugins that will add some entropy to your fingerprint, which also changes over time. That helps a little, but any change you make really makes you stick out like a sore thumb (your audio profile becomes even more unique).


FFS why is this information exposed to a goddamn website?


Details like knowing how much latency to expect can be important for some use cases, such as when you're trying to time things to occur at the same moment for the user.


There’s an obvious solution: don’t allow websites to record or play audio, and don’t expose any of this until permission is granted.

As a major side benefit, websites will stop randomly playing audio without permission.


I would propose an addition for when permission is granted: if (as in the example given) there's a need to time things to occur simultaneously, the app can pass that on to the APIs and the browser makes it happen, but the app gets no feedback about how it was done or even whether it needed to be done at all. Why would it need that? The browser made sure it synced up!

In other words, make it more declarative and less verbose.


I don’t know all the details for this particular issue, but, in general, knowing the latency can be important. Imagine you’re displaying frames and playing sound together and you want them synchronized. You need some way to submit graphics frames and audio so that they arrive at the same time.

The browser could always increase the apparent audio latency by buffering, but that reduces the ability of music apps to perform well.
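
Concretely, with the latency exposed, an app can do something like this (a sketch on my part; outputLatency isn't available in every browser):

    // Schedule audio, then time the matching visual update for when
    // the sound actually reaches the speakers.
    function playSynced(ctx, buffer, drawFrame) {
      const src = ctx.createBufferSource();
      src.buffer = buffer;
      src.connect(ctx.destination);

      const startAt = ctx.currentTime + 0.05; // small scheduling margin
      src.start(startAt);

      // Audio scheduled at startAt becomes audible ~outputLatency later.
      const delayMs = (startAt - ctx.currentTime + (ctx.outputLatency || 0)) * 1000;
      setTimeout(drawFrame, delayMs);
    }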


What I'm suggesting is that the app's code doesn't work out how to submit graphics frames and audio so that they arrive at the same time; instead, it tells the browser how to sync up the graphics frames and audio it receives.

Move responsibility for the syncing to the browser and then the app doesn't need to know anything. In short, I can put it together and send it to you, or I can send you the bits and tell you how they should be put together.
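
Something like this hypothetical shape, where the browser owns the compensation. To be clear, none of these names are a real API; this is purely to illustrate the idea:

    // Hypothetical, for illustration only: the app declares the pairing,
    // the browser applies whatever latency compensation the output path
    // needs, and reports nothing back about how (or whether) it did.
    presentTogether({
      video: frameA,  // hypothetical frame handle
      audio: clipA,   // hypothetical audio clip handle
    });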


Can you explain how this would actually work?

A graphics frame appears on the screen at a specific time. (For VR, it is a definite time, and this is critical. For normal video or games, a little bit of slop, maybe a few ms, is probably okay.)

For audio, humans are sensitive to 10 ms deviations or even less.

Any API that works decently will need to synchronize audio and video, so there needs to be a way for a program to say "this audio sample should play at the same time as this video frame is shown". But an API should also allow programs to react as quickly as possible to user input. And Bluetooth headphones, in particular, have very, very high latency.

So designing an API that performs well without revealing the latency is hard.

I do think it would be good to cleanly separate normal web pages and games, though. For pure content, none of this matters except that video needs to maintain synchronization. But normal content does not need clicks to translate quickly to video changes.


I don't know about you, but I find everything about computers hard; that doesn't mean there aren't better or worse solutions.

The browser is the presentation layer; it needs to know the latency of your headphones (or the system does). Why does the content provider need it? What's wrong with "here is frame A, please play audio A at the same time (while taking into account the latency that only you know about)" as a request?


Because, if the audio latency significantly exceeds the video latency, then the browser can’t do this without delaying the video.


It's simply moving responsibility from one entity to another; there's no technical reason the content provider would be better at syncing the two, just as with any other network communication.


The obvious solution is just to disable JavaScript unless you trust the site.


And then the site totally fails. I'm no fan of JS either, but the number of sites that use JS is now close to 100%. So you either stay paranoid and the web breaks, or you eat the shit sandwich. Pure lunacy.


> And then the site totally fails.

That's OK. Then I have the option to weigh whether the risk is worth it or not. 9 times out of 10 it is not; in rare cases I will allow the JS to execute.


Not viable for the majority of users.

How do you explain to one of these "I'm not good with computers haha" types what a CDN is, and why Taboola's CDN shouldn't be allowed but Akamai's should be, otherwise things Won't Work Right? Even if they're capable of learning [1], why should they care, when all these security measures actively make their life harder?

Blocking audio and other fingerprintable surfaces by default, with a "click here to enable audio / video" and a "remember my choice on this site in future" is the only way it can possibly work, because 99% of users will only ever go for the laziest option. We need to have protections for them that work regardless of (or in spite of) their skill level.

[1] https://www.nngroup.com/articles/computer-skill-levels/


Possibly curated, subscribable whitelists?


I thought that, then looked at OscillatorNode: it allows you to essentially load a waveform into memory and/or generate your own. This means you can run a legit multitrack recorder or a synth from your web browser. I had a requirement for this 10 years ago, but it required Java extensions in the user's browser.
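
For instance, generating your own waveform is a PeriodicWave these days. A quick sketch; the coefficients are arbitrary, and given autoplay policies this needs to run inside a user gesture:

    // Custom waveform on an OscillatorNode via PeriodicWave.
    const ctx = new AudioContext();
    const real = new Float32Array([0, 1, 0.5, 0.25]); // cosine coefficients
    const imag = new Float32Array(real.length);       // sine coefficients
    const wave = ctx.createPeriodicWave(real, imag);

    const osc = ctx.createOscillator();
    osc.setPeriodicWave(wave);
    osc.frequency.value = 220;
    osc.connect(ctx.destination);
    osc.start();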


Not to be overly negative, but that's lunacy. What you described is best served as a native application. Web browsers were supposed to be information viewing programs. Not virtual machines. I know it makes portable applications easy, but smashing a VM into a text viewer is such a horribly broken idea, and it's why we have this security nightmare.


Nothing about this is a virtual machine. It is an API that exposes a lower-level audio interface. I don't need to spawn a virtual machine or mess with any DLLs or figure out libraries, because I can simply load a browser. Nothing says this needs to be on a remote server; it can be a local .html file for all it matters.

But hey, some of us still run around with MS-DOS 6.22 on that new Spectre-ridden, rowhammer-ready i7 because DOS/4GW is hella stable.


> Web browsers were supposed to be information viewing programs.

True if we stopped thinking in 1996. I don't know how to tell you this, but browsers have become application platforms, not document delivery vehicles, over the last 20 years.


There are plenty of sites doing this already, like https://soundtrap.com


Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.

https://news.ycombinator.com/newsguidelines.html


I think this is mistaken. The device API supposedly provides this, but the info you get is always generic due to fingerprinting protection. Also, AudioContext is expected to be active only when created within a user action like a button click. Otherwise it doesn't run (as implemented in Safari).

This is a pain: on the one hand, the browser vendors and the W3C are locking down API capabilities to prevent fingerprinting and timing-based security hacks; on the other, these restrictions interfere with genuine needs to provide audio functionality. For example, you can't determine whether the user has 3 audio devices connected and prompt them to select one for output. So you really can't build desktop quality audio systems with webaudio and siblings.
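
For what it's worth, the user-gesture rule in practice looks roughly like this (a sketch; exact autoplay policies differ per browser, and '#play' is a stand-in element):

    // A context created outside a gesture starts 'suspended' in most
    // browsers; resume it inside a real user action so it produces sound.
    const ctx = new AudioContext();
    document.querySelector('#play').addEventListener('click', async () => {
      if (ctx.state === 'suspended') await ctx.resume();
      // ...now nodes can be started and will actually be audible
    });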


> Also, AudioContext is expected to be active only when created within a user action like a button click.

I really hate the user-action restriction as a security measure, because it's ridiculously easy to bypass[0], particularly on Chrome. It's privacy theater -- annoying for legitimate developers, but easy for malicious actors to get around.

Putting this stuff behind a permission or sandboxing it somehow would be better for both end-users and developers. As it stands, we just get fewer webaudio applications online from legitimate developers, and they all move to native platforms where it's even easier to fingerprint users. And the malicious sites that don't move to native just fingerprint anyway because we're using bad, opaque metrics for consent.

[0]: https://danshumway.com/blog/chrome-autoplay/demo/


>For example, you can't determine whether the user has 3 audio devices connected and prompt them to select one for output.

Frankly, this really should be handled by the browser. JavaScript really shouldn't have any business messing with that.


I have no problem with sites being able to do this, after I’ve given them permission to access the audio stack in the first place, and limited to first-party scripts on the page.


That's reasonable for just output selection. So shouldn't a site be able to send audio to all the devices? ...and different audio streams? ...which is all commonplace in the desktop world.


Sure, but the browser should handle that logic. Even if the website prompts a user to select an audio device it should be handled on the browser's side without feedback to the website.


It's been a little while since I used this, but navigator.mediaDevices.enumerateDevices will give you that list, allowing you to present a menu for the user to choose one; the client code must then use setSinkId to assign the chosen device ID. https://developer.mozilla.org/en-US/docs
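
Roughly like this, if memory serves (setSinkId requires a secure context and support is patchy; '#player' is a stand-in element):

    // List audio outputs and route an <audio> element to one of them.
    async function pickOutput() {
      const devices = await navigator.mediaDevices.enumerateDevices();
      const outputs = devices.filter(d => d.kind === 'audiooutput');
      // ...present `outputs` in a menu; suppose the user picks the first:
      const el = document.querySelector('#player');
      await el.setSinkId(outputs[0].deviceId);
    }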


The IDs you get (both deviceID and groupID) are not user displayable. They're like random numbers. In normal browsing mode, you keep getting the same deviceIDs for the same domain, but random groupIDs for every refresh. You get different deviceIDs for different sites. In private browsing mode, you get both random every time.


Ah, I forgot that the device label is only returned when there is an active stream. And it has zero support in Safari. https://developer.mozilla.org/en-US/docs/Web/API/MediaDevice...

I never knew the device ids were randomized like that, thanks for the info.


> So you really can't build desktop quality audio systems with webaudio and siblings.

Nor should anybody even be trying. We're better off without any of this sort of horseshit.



