
This approach is neat for observability, but it's worth noticing that it essentially quantises all of your samples down to the vertical resolution of your graph. If you somehow introduced a bug that caused an error that was smaller than the step size then these tests wouldn't catch it.

(e.g. if you somehow managed to introduce a constant DC-offset of +0.05, with the shown step size of 0.2, these tests would probably never pick it up, modulo rounding.)
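To illustrate, a quick Python sketch (the step size and sample values are invented to match the numbers above):

    # Binning samples to a 0.2 step hides a +0.05 DC offset entirely.
    step = 0.2

    def to_row(sample):
        """Map a sample to the graph row it would be drawn on."""
        return round(sample / step)

    reference = [0.0, 0.2, 0.4, 0.2, 0.0, -0.2, -0.4, -0.2]
    buggy = [s + 0.05 for s in reference]  # constant DC offset

    # Both waveforms land on the same rows, so a graph-based
    # assertion sees no difference between them.
    assert [to_row(s) for s in reference] == [to_row(s) for s in buggy]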

That said, these tests are great for asserting that specific functionality does broadly what it says on the tin, and making it easy to understand why not if they fail. We'll likely start using this technique at Fourier Audio (shameless plug) as a more observable functionality smoke test to augment finer-grained analytic tests that assert properties of the output waveform samples directly.




It's true that it quantizes (aka bins) the samples, so it isn't right for tests that need to be 100% sample-perfect, at least vertically speaking. I suppose it is a compromise between a few tradeoffs - easy readability just from looking at the code itself (you could do images, but then there's a separate file you have to keep track of, or you're looking at binary data as a float[]) vs strict correctness. The evaluation of these tradeoffs would definitely depend on what you're doing, and in my case, most of the potential bugs are going to relate to horizontal time resolution, not vertical sample depth resolution.

If the precise values of these floats are important in your domain (which they very well may be), a combination of approaches would probably be good! Would love to hear how well this approach works for you guys. Keep me updated :)


I'm not sure it makes sense to separate "vertical" correctness from "horizontal" correctness when it comes to "did the feature behave" though; to extend the example in TFA, if your fade progress went from 0->0.99 but then stopped before it actually reached 1 for some reason, you might find that you still had a (small, but still present) signal on the output, which, if the peak-peak amplitude was < 0.1, the test wouldn't catch.

Obviously, any time you're working with floating-point sample data, the precise values of the floats will almost never be bit-accurate against what your model predicts (sometimes even if that model is a previous run of the same system with the same inputs, as in this case); it's about defining an acceptable deviation. I guess what I'm saying is that for audio software, a peak-peak error of 0.1 equates to a signal at -20 dBFS (ref dBFS @ 1.0), which of course is quite a large amount of error for an audio signal, so perhaps using higher-resolution graphs would be a good idea.
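For reference, the conversion is just 20 * log10 of the amplitude relative to full scale, e.g.:

    import math

    def dbfs(amplitude, full_scale=1.0):
        """Amplitude relative to full scale, in dB."""
        return 20 * math.log10(amplitude / full_scale)

    print(dbfs(0.1))  # -20.0, the figure quoted above
    print(dbfs(0.2))  # about -14.0, a full graph step at the shown resolution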

(Has anyone made a tool to diff sixels yet? /s)


Fair points here. Unfortunately adding more vertical resolution starts to get a little unwieldy to navigate through. Maybe it could start using different characters to multiply the resolution to something sufficiently less forgiving of errors. If it could choose between even 3 chars, for example, it would effectively squash 3 possible values into one line, tripling the resolution.
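A rough sketch of what that could look like (the three characters and the step size are arbitrary choices):

    import math

    STEP = 0.2
    SUBCHARS = ["_", "-", "~"]  # low / middle / high third of a row

    def cell(sample):
        """Return (row index, character) for one sample, packing three
        sub-levels into each graph row."""
        scaled = sample / STEP
        row = math.floor(scaled)
        fraction = scaled - row  # position within the row, 0.0 .. 1.0
        sub = min(int(fraction * len(SUBCHARS)), len(SUBCHARS) - 1)
        return row, SUBCHARS[sub]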


I think more resolution may give you more false positives, which might not be helpful. We've used similar tools for integration testing at work, and the smallest, usually irrelevant, change can bust the reference cases because of the level of detail in the reference, which means going through all the changed tests only to find that everything is still fine.

For this, just thinking about sound, I wonder if you could invert the reference waveform and add it to the test output to see how well it cancels? Then instead of just knowing there was a diff, you could get measurements of the degree of the diff.
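Something along these lines, maybe (just a sketch; the -60 dB threshold is a placeholder):

    import math

    def residual_dbfs(reference, output):
        """Invert the reference, add it to the output under test, and
        report the RMS of what's left in dB relative to full scale."""
        residual = [o - r for o, r in zip(output, reference)]
        rms = math.sqrt(sum(x * x for x in residual) / len(residual))
        return 20 * math.log10(max(rms, 1e-12))  # floor avoids log(0)

    # e.g. demand at least 60 dB of cancellation:
    # assert residual_dbfs(reference, output) < -60.0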


A more accurate and only slightly more complex process for this is to generate numerical text representations of the desired test waveforms and then feed them through sox to get actual wave files. The numerical text representations are likely even easier to generate programmatically than the ascii->audio transformation.


What does a "numerical text representation" of a waveform look like? (Not familiar with audio processing but interested to understand your suggestion.)


Here's a fragment of the representation of a stereo file:

       4.9600227   0.094451904297 -0.014831542969 
       4.9600454   0.089172363281 -0.0092468261719 
        4.960068   0.087493896484 -0.0065612792969 
       4.9600907   0.090179443359 -0.0028686523438 
       4.9601134   0.093963623047 0.0060729980469 
       4.9601361   0.095367431641  0.020538330078 
       4.9601587   0.094299316406  0.035186767578 
       4.9601814    0.09228515625  0.045013427734 
       4.9602041   0.089691162109  0.051422119141 
       4.9602268   0.086059570312  0.058929443359 
Columns are: [time in seconds] [left channel sample] [right channel sample]

This was generated using

      sox somefile.wav somefile.dat
You can reverse that by reversing the argument order above.
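If you want to go the other way programmatically, here's a sketch (file names and tone parameters are invented; the sox builds I've used want the two comment header lines at the top of the .dat):

    import math
    import subprocess

    RATE = 48000
    FREQ = 1000.0   # 1 kHz test tone
    SECONDS = 0.1

    with open("tone.dat", "w") as f:
        f.write(f"; Sample Rate {RATE}\n; Channels 1\n")
        for n in range(int(RATE * SECONDS)):
            t = n / RATE
            sample = 0.5 * math.sin(2 * math.pi * FREQ * t)
            f.write(f"{t:.7f} {sample:.12f}\n")

    subprocess.run(["sox", "tone.dat", "tone.wav"], check=True)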


This has some advantages-- it's numerically precise and can be more flexible, but it has some downsides over the suggested approach.

- The quantization of the graphs is a feature to add some tolerance to the tests. I admit this is a mixed blessing.

- This is a lot more opaque to someone looking at a text file of the test output than what is described in the post.


The opacity of the .dat file is real and deep. But I'd expect the opacity of the go/python/lua/whatever code that generates the .dat to be extremely low, and that's what you'd read.


I was thinking that maybe that lack of precision was a good thing. Makes your tests less fragile.

I agree, though, that you probably want to augment this with some form of assertion about the noise level, to check the smaller high-frequency components.
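Something like this, perhaps (numpy assumed; the cutoff and threshold are placeholders):

    import numpy as np

    def assert_noise_floor(output, reference, rate=48000,
                           cutoff_hz=5000.0, max_dbfs=-60.0):
        """Check that high-frequency residual energy stays below a limit."""
        residual = np.asarray(output) - np.asarray(reference)
        spectrum = np.abs(np.fft.rfft(residual)) / len(residual)
        freqs = np.fft.rfftfreq(len(residual), d=1.0 / rate)
        peak = spectrum[freqs >= cutoff_hz].max()
        peak_db = 20 * np.log10(max(peak, 1e-12))
        assert peak_db < max_dbfs, f"HF residual at {peak_db:.1f} dBFS"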


There's some French guy who made some maths that might help with this, idk ;)

--pb, CTO of Fourier Audio Ltd.


Yeah, maybe an ascii-art waterfall plot is the way to go!


I suppose you could use a patterned dither / sigma-delta to get a slightly bigger chance of finding differences.
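For example (sketch only; the pattern and step size are arbitrary):

    STEP = 0.2
    PATTERN = [0.0, 0.05, 0.1, 0.15]  # repeating sub-step offsets

    def to_row_dithered(samples):
        """Bin samples to graph rows after adding a patterned offset, so a
        sub-step error shifts at least some samples into a different row."""
        return [round((s + PATTERN[i % len(PATTERN)]) / STEP)
                for i, s in enumerate(samples)]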



