How Distortion Works in Music (benmosheron.gitlab.io)
394 points by beeeeeeeeeeeeen on May 19, 2020 | 102 comments



That's cool, but it doesn't really explain _why_ the sound becomes more interesting. It has to do with how the sound is played back: in short, there are no "square" waves. Everything is a combination of sine waves. The more of them you have, the more it's as if extra instruments are playing on top of each other; in music they're called "harmonics". A "squared" wave is a wave with many additional sine waves at different frequencies overlapping each other, so if you were to plot what's actually played back, it wouldn't look square; rather, it would be wavy. This article shows visually how this effect takes place, if you are interested: http://www.jezzamon.com/fourier/
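
For a concrete picture, here's a minimal NumPy sketch of my own (not from the article): a square wave's Fourier recipe is just the odd harmonics at amplitudes 1/k, and summing a few dozen of them already looks and sounds square-ish.

    import numpy as np

    fs = 44100
    t = np.arange(fs) / fs                     # one second of samples
    f0 = 220.0                                 # fundamental frequency in Hz
    square = np.zeros_like(t)
    for k in range(1, 41, 2):                  # odd harmonics only: 1f, 3f, 5f, ...
        square += np.sin(2 * np.pi * k * f0 * t) / k
    square *= 4 / np.pi                        # Fourier-series scaling for a unit square

Each added harmonic sharpens the corners a little more; stop early and you can see the "wavy" plateaus described above.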


> Everything is a combination of sine waves.

No, we hear things as if they were a combination of sinusoidal waves (mostly). That's only one way to represent them; you can just as easily represent them as a series of impulses. Mathematically they're all equivalent models, and none of the models accurately represent acoustic transfer.

You can say we perceive things as if they were a linear combination of sinusoidal waves because we have a series of hair cells in the cochlea. Each bundle of hairs responds to waves within a very narrow frequency range, so that's what we mainly respond to. All of our audio perception is filtered through this process first. Since the hair bundles have a specific audio response, they don't respond to instantaneous frequencies; there has to be a wave of a certain duration for us to perceive a frequency. That's independent of the mathematical fact that shorter wave pulses are increasingly indistinguishable from white noise. Also, all of this is ignoring the other equipment of the ear, like the eardrum and the bone lever[1] that transmits sound to the inner ear.

We have a lot of brain circuitry that picks out specific features, which means we can recognize things like square waves or impulses. I don't know much about this stuff, but it's important for things like cochlear implants. AFAIK it's similar to some optical illusion/visual perception phenomena: things like how we perceive magenta as a single color even though it's just two colors together, or how we perceive orange as a distinct color from brown even though it isn't[2].

> The more you have it's as if there are more instruments playing on top of each other, in music they're called "harmonics".

Symmetrical clipping and square waves both create odd-order harmonics (3x, 5x, 7x, 9x, etc) which are very unnatural. The brain is good at picking this kind of thing out, and while I find it enjoyable (eg chiptunes), it's not really comparable to normal harmonics in music. It's certainly not something you can simplify down to just the component sinusoids. The perception of sounds has way more to do with stuff happening in the brain than in the air.

[1]: https://en.wikipedia.org/wiki/Ossicles#/media/File:Slide1ghe...

[2]: https://en.wikipedia.org/wiki/Shades_of_orange


There's nothing more unnatural about odd-order harmonics than even-order harmonics. The brain picks up on both of them because any harmonic sound is good evidence that some living thing is nearby (non-living things like flowing water and falling rocks typically produce inharmonic sounds). The idea that there is something "unnatural" or inferior about odd-order harmonics is propaganda invented by valve-amplifier enthusiasts, because their amplifiers predominantly produce even-order harmonic distortion. A stopped pipe wind instrument produces odd-order harmonics, and they were invented long before chiptunes.


> Symmetrical clipping and square waves both create odd-order harmonics (3x, 5x, 7x, 9x, etc) which are very unnatural.

What do you mean by "unnatural"? A clarinet has mostly odd harmonics[1], for example. Clarinets don't occur in nature, I suppose, but they're not using electronics or anything either.

[1] https://newt.phys.unsw.edu.au/jw/clarinetacoustics.html#harm...


Even harmonics are probably more rounded and pleasing, less harsh. Compare a trumpet vs a cornet.


Related: I have recently been experimenting with musical effects based on breaking down a signal into any basis function of your choice. You can literally build any signal out of (nearly) any other. It turns out not to be a great audio effect, since in the general case it tends to morph from one sound to the other via white noise in the middle, but with the right basis functions it's not so bad. At the extreme, in the case of sinusoidal basis functions ordered by increasing frequency, you've reinvented the filter sweep.

I plan to share it on the blog at some point: https://omnisplore.wordpress.com/category/music-creative/
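
A rough sketch of the decomposition idea (my own illustration, not the parent's actual code): project the signal onto an arbitrary set of basis functions by least squares, then resynthesize from the coefficients.

    import numpy as np

    n = 256
    t = np.arange(n) / n
    target = np.sin(2 * np.pi * 3 * t)                       # signal to rebuild
    basis = np.stack([np.sign(np.sin(2 * np.pi * k * t + 0.1))
                      for k in range(1, n + 1)], axis=1)     # arbitrary square-ish basis
    coeffs, *_ = np.linalg.lstsq(basis, target, rcond=None)  # best least-squares fit
    rebuilt = basis @ coeffs                                 # ~= target if the basis spans it

If the chosen basis doesn't span the signal space, lstsq still returns the closest approximation, which may be part of why the morph passes through noise in between.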


Sounds cool. I always liked how ZynAddSubFX could use different functions for its additive synthesis.


It is a combination of the actual signal and our perception of it, IMO. Even if the wave were a square, our ear hairs wouldn't pulsate in a square pattern, because of inertia. And as such we wouldn't perceive a perfect square any differently from a "wavy" square, which is why the Nyquist theorem/Fourier transform works. My comment was about understanding why the sound becomes richer just by flattening a wave. It is not intuitive that this would be the case, but in essence it's as if you are _adding_ more frequencies that didn't exist before to the same sound. More frequencies are interpreted by the ear/brain as more instruments, or richer instruments.


>> Everything is a combination of sine waves.

> No, we hear things as if they were a combination of sinusoidal waves (mostly). That's only one way to represent them

Well, but the reason we do our audio processing this way is that the Fourier transform our ear puts out is a high-quality description of the sound. If it didn't work well, we wouldn't do it.

> or how we perceive orange as a distinct color from brown even though it isn't

Why orange and brown? You can say the same thing about red and blue.

Conversely, if orange and brown weren't distinct colors, it wouldn't be possible for us to perceive them differently, and yet we do, very consistently. What exactly are you trying to say by "a distinct color"?


Due to the ear having “hair” in the cochlea, do people with Alopecia have a higher tendency toward deafness? Or do they hear anything differently?


Hair cells don't actually have anything to do with hair. Not sure why you are being downvoted though, the name is a bit confusing. Jim Hudspeth gives a great talk about the neuroscience of hearing (haven't watched this video but saw him in person ~10 years ago): https://youtu.be/hn8N8p9P5gw


> It's certainly not something you can simplify down to just the component sinusoids.

Anything a human can perceive as sound can be simplified down to just the component sine waves.

> You can say we perceive things as if they were a linear combination of sinusoidal waves because we have a series of hair cells in the cochlea.

No. We say this because any waveform can be reproduced by a sufficient number of sine waves at different frequencies and amplitudes. It's not perception, or the physics of the human ear. It's math.[1]

[1] http://astro.pas.rochester.edu/~aquillen/phy103/Lectures/D_F...
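
For sampled signals this is a one-line sanity check (my illustration, not from the linked notes): the FFT decomposes any waveform into sinusoids, and summing them back reproduces it exactly.

    import numpy as np

    x = np.random.randn(1024)             # any sampled waveform at all
    X = np.fft.rfft(x)                    # amplitudes/phases of its component sinusoids
    x_back = np.fft.irfft(X, n=len(x))    # sum those sinusoids back together
    assert np.allclose(x, x_back)         # reconstruction is exact to float precision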


Hah, I graduated from UofR; I was even a TA for Physics 142.

Math does not have anything to say about sums of sine waves being the true form of functions. The wavelet transform is just as physically representative. In fact you can generate any number of Hilbert bases for a given function. They are ALL equivalent.

The Fourier transform has a particular relevance because ears do something similar. Hair cells signal the brain while they are sensing vibrations within their particular frequency range. You could have different ears that signaled when they saw a sharp rise or drop in pressure and worked completely differently, but they would sense square waves just fine. Or you could have ears that directly sense and measure air pressure. Instead, our ears sense that there are vibrations at 1 Hz, 3 Hz, 5 Hz, 7 Hz etc. and the brain interprets that and realizes it is a square wave.

The concept of a real Fourier transform is just as unphysical as a real square wave. A real Fourier transform requires a perfectly defined sound pressure at every instant; because air and sensors have inertia that is not possible. There will always be lag, so any change in frequency is not perfectly transformed or represented from/as a sum of sine waves.


On a related note, this [1] is by far the best video I've seen on the topic of turning digital signals (like square waves) into analog ones. The bandlimiting and timing section shows what a square wave ends up looking like when it's been converted to an analog signal.

[1] https://xiph.org/video/vid2.shtml It's 24 min long and, imo, does an excellent job at explaining the role of digital sampling frequencies, bit-depth, and the misleading "stairstep" representation of digital waveforms.


That's the video I always share with people spouting the same old clichéd nonsense about how a digital signal can't be as good as an analogue one (and other variant clichés). His presentation style can be a tad annoying, but there's no doubting the utter clarity of his points.


A digital signal can't be as "flawed" as an analogue one. Analog synthesizers exploit these flaws to produce richer and more interesting sounds that you just can't get in the digital world, as it's too deterministic, and attempts to randomize behavior come out too "white-noisy" instead of like naturally occurring distortion. There have been some advances using AI, though, to model these flaws organically, which sound promising: https://teddykoker.com/2020/05/deep-learning-for-guitar-effe...


Record the signal, play it back. It's got the same 'flaws', perfectly. Reproduction is what I am referring to, not the _emulation_ of an analogue sound source.

The difference between digital and analogue gear is that the imperfections in analogue gear make them interesting. i.e. instabilities in VCOs, the addition of harmonics etc. I am a big, big fan of old analogue gear and so I am definitely in the camp of 'we're not there yet' with the emulation of analogue gear.

But, the idea that a digital signal can't replicate it, is nonsense, because obviously an analogue signal can be recorded digitally and perfectly played back. This is often the myth that is propagated, that the 'stair steps' somehow mean a loss of fidelity.
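
To make the "no stair steps" point concrete, here's a hedged NumPy sketch of ideal Whittaker-Shannon reconstruction: the DAC's output is a smooth sum of sinc pulses, one per sample, not a staircase. (The finite sample window makes this approximate near the edges.)

    import numpy as np

    fs = 8                                        # toy sample rate, in Hz
    n = np.arange(-32, 33)                        # sample indices
    x = np.sin(2 * np.pi * 1.7 * n / fs)          # a 1.7 Hz tone, below Nyquist (4 Hz)
    t = np.linspace(-2, 2, 1000)                  # fine time grid, in seconds
    recon = sum(x[i] * np.sinc(fs * t - n[i]) for i in range(len(n)))

Plot recon against the true sine and they essentially coincide away from the window edges; the "steps" only exist in naive sample-and-hold visualisations.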

Interestingly, and relevant to this discussion, I am yet to find any digital plugin that manages to do saturation properly. Plugins often seem to be just using random number generators, rather than anything more complex (like the AI models you link to); nothing comes close to my Thermionic Culture Vulture, or any of my other valve-based equipment, at creating that 'fatness'.

I think that the AI approach is interesting, it does make me wonder how far we're willing to go to reproduce this in software, when the real thing is a handful of wires, resistors, capacitors, and transformers. It seems like trying to implement the ARM instruction set in Javascript. How much processing would a simple saturation plugin need? How many could one computer run?


I do not agree. I find modern digital softsynths just as "rich and interesting" as analog, not to mention 1,000,000 times more convenient to work with.

Something like U-He Diva for example can sound as analog as anything analog that I've heard. Then if you get into wavetable or granular, you can get sounds that are impossible to achieve with analog, which are in my opinion very "rich and interesting".

I suspect that the people who don't agree may have heard a digital synth many years ago and are unaware that the technology has improved markedly since then, which isn't surprising because there's a lot of money in it.

Just my opinion, but modern analog emulation sounds just as good as real analog.

Nothing wrong with enjoying the workflow of analog though, although personally I dislike that workflow.


There are plenty of good digital and soft-synths for sure. But the Diva isn't close to a real analogue synth. I have several analogue synths (OB-6, Alpha Juno 2, Juno 106, JX-8P, SH-101, Syncussion SY-1) in my studio (and modular too), and also own the Diva plugin. There really is no comparison. I never use the Diva plugin. If it was as good, I'd use it all the time, because it's much more convenient for recall to have a plugin.

Perhaps, for relatively simple sounds, or sounds that have little movement. But the moment there's any filter movement, it's obvious. The filter really is the key, I think, to when a digital synth outs itself. Often there's a 'digital edge' to filters on digital synths that doesn't come close to the 'smooth destruction' that happens in the analogue realm (really, really hard to describe with words!). In A/B tests between digital repros of classic synths and the classic synth, that's nearly always where I can hear the difference. And it's a major difference, because often the 'sound' of an analogue synth comes from its filter.

If you're looking for that stand out sound for a track, the one everyone goes "I love the track with that noise in", then analogue is where it will come from, the quickest and easiest.

This obsession with replicating the analogue sounds in software is really tedious though. I prefer digital synths that do stuff that the analogue synths can't do, like Omnisphere - nothing touches that, and it's a perfect complement to the real analogue sounds. Let's keep the analogue realm doing what it does best, and then let's advance the possibilities within the digital realm, looking forward rather than back.


I carefully considered whether or not to include Diva by name because I knew someone would shoot it down no matter which VST I named. I'm also a big Omnisphere and Phaseplant fan.

The point I was trying to make is that I feel that digital is definitely good enough to do professional-sounding "analog" performances without people going, "definitely sounds digital" - unless digital is the sound you're trying to achieve.

>If you're looking for that stand out sound for a track, the one everyone goes "I love the track with that noise in", then analogue is where it will come from, the quickest and easiest.

I disagree, but I also recognize that nobody will "win" this discussion, because it's a matter of taste - like asking for a consensus on which tastes better, chicken or fish.

You enjoy your analog synths, and I'll enjoy badly playing my digital synths. I'm more of a modern wavetable fan than analog sound anyway.

We're both music lovers - nobody's forcing anyone to use anything.


> But the Diva isn't close to a real analogue synth

Statements like this need to be prefaced with qualifying disclaimers, like "IMHO". You have to accept that your opinion is completely subjective. I've seen a variety of blind tests over the years in which listeners were unable to differentiate between analogue hardware and digital simulations. I've become increasingly impressed with the quality of virtual synthesisers over the years. At this stage, while I do still often prefer to use my analogue synths over the VSTs I own, that's mostly because the hardware is simply more fun to experiment with.


You can record an analogue synth digitally without losing its characteristics. It's interesting that you can't recreate the sound exactly by generating it in the digital domain.


So you can... assuming you can get exactly the right random bias generated given any possible input - and the bias will depend on both the current input and potentially some recent input. This is significantly harder and more expensive than just going and building the analogue box, so nobody does it.


Nobody does it? There's massive amounts of plugins that aim to do exactly that, along with DSP-based pedals and amps. For analogue synths, the well-designed VSTs are impossible to distinguish from the real thing in an A/B test. E.g. the Repro 5: https://www.youtube.com/watch?v=uFA_B6pP9AA

You're right in the strict sense about analog random bias being hard to reproduce, but those are largely irrelevant when it comes to the important characteristics of the sound. In an A/B test, it is impossible to distinguish a well designed digital audio simulation from a real analog device.


Impressive, both the instrument (Repro5) and the video. Can't believe the A/B voting results looked essentially random; everybody got about half right and half wrong.


> if you were to plot it it wouldn't look square, rather it would be wavy

Could you go into more detail on why you say this? A real-life square wave is square, possibly with some ringing or overshoot on the flanks (but not necessarily).


There are no real life perfect square waves for the same reason there are no ideal diodes, capacitors etc; you cannot have a perfect vertical cliff because that would have infinite slew, and you cannot have perfectly sharp corners because that implies infinite second derivative, etc.

The best you can get is a tight curve into an extremely steep "triangle".

The amplifier driving the speaker has finite slew. The speaker itself is constrained to move at finite speed and cannot accelerate instantly. The air has a finite slew rate - the speed of sound.

Stackexchange has a discussion on trying to determine a maximum frequency in air here: https://physics.stackexchange.com/questions/23418/is-there-a... which leads to all sorts of useful sub-discussions. Attenuation depends on frequency; the higher frequency harmonics are more subject to attenuation in air, as well as reflecting off surfaces and self-interfering.


I think they are referring to something like this: http://1.bp.blogspot.com/-WhH0B8mnGhw/UZhL9ehVa8I/AAAAAAAAAH...


Exactly. What I'm trying to explain (although it's not easy, as it's not very intuitive) is that we get more interesting sounds by clipping a signal's peaks to a square shape because it is identical to _adding_ lots of different frequencies to the same sound; their sum shows up as a flat top in digital audio because there's no more sampling detail at that level. The actual signal does not just stop flat at the square peak: at playback the peaks get signals at different frequencies added to them, so they sound richer. You send a square to the speaker, but the speaker isn't doing a square; it pulsates in a sinusoidal fashion. Nothing is fast or solid enough to produce a flat square. Maybe you could find these harmonics in a factory with steel equipment, but they would still be pulsating at very high frequencies, which is why those "real" sounds are so much richer than a speaker.


The square wave signal will move the speaker in a square pattern, but the actual waves propagating through the air are always sine waves. What you get is a series of harmonics indicative of the square source.


The waves can be mathematically described as a sum of sine waves, but there's no reason why air shouldn't support something that "looks like" a square wave (no more than the speaker does, anyway), within some limitations.

What might happen is that the air disperses the wave, so it gets distorted but still has the same components: https://en.wikipedia.org/wiki/Acoustic_dispersion


But think about it this way: if you have a plate moving back and forth in a square wave pattern at the surface of a pool of water, would the waves on the water be square?


Waves in water are surface gravity waves, which air does not have. Because water is pretty incompressible, high pressure causes a water column to rise against gravity. The rate at which the column falls back down is obviously dictated by gravity. The higher the frequency, the more a pressure wave acts like a real pressure wave and less like a surface gravity wave. Since real pressure waves don't rely on gravity for transmission, they travel far faster, resulting in dispersion[1]. You can still have square waves as long as they are 10-100x slower (longer) than normal waves.

Since air pressure does not typically produce a height difference in the atmosphere, standard sounds in air are not subject to dispersion and act very differently from water. There is obviously still a limit to the rise/fall time of a square wave, when the transition is more like a shockwave front, but that's a very square wave.

[1]: https://en.wikipedia.org/wiki/Dispersion_(water_waves)


No, but that's because surface waves behave differently from sound waves (which are "3D"): https://en.wikipedia.org/wiki/Dispersion_(water_waves) Basically, phase velocity changes with the amplitude and the shallowness of the water.

Yes it's easy to think they behave the same, but they don't.


Instead of flawed thought experiments you could try a quick test for yourself: hold your speaker up to your microphone, play a square wave, record sound, and look at the waveform.


I have no experience whatsoever with sound, but I got curious and did that, and got weird results:

https://imgur.com/HSSdM5t

(below is the original 440Hz square wave; above is the recorded sound)

The recorded waveform looks nothing like the original, and it sounds very different too: the original is much harsher, although the pitch sounds exactly the same, as expected.

Might be my crappy microphone? Or maybe the sound is being filtered somewhere along the way?


That waveform looks pretty good imo. When I did it, the signal was very weak, so the signal-to-noise ratio was bad. There were low frequency impulses from me moving the headphones around and regular hums from other sources.

My best guess is that the large attack showing up there is not an effect of the microphone, DAC, or amp, but of the actual speaker. Good audio measurements are hard to come by thanks to all the snake oil, but square wave measurements are common when there is data. The physical models of transducers aren’t trivial, but in a single broad stroke the answer is “physics” and “spring-mass-damper”.

https://www.innerfidelity.com/images/AKGK701.pdf

The decay is expected, as the pressure around the microphone can only temporarily increase before the pressure wave disperses into the room. As an extreme example, if you turned an entire wall into a speaker and played a square wave, you would be able to make a much more square shaped waveform. There’s no replacement for displacement baby (see: subwoofers). If you want a more square waveform, try using a headphone pushed right up to the mic and going to a higher frequency (like 1 kHz).

Edit: If you want to skip the trip down the rabbit hole: I think the end-all for audio quality are sealed in-ear monitors (IEMs). Low group delay from short distance from transducer to eardrum, and sealed enclosure for good bass response (see: square shaped low frequency square waves).


What does that tell you? Shouldn't you be looking at the Fourier transform of the waveform?


Visualizing the spectrum of a square wave is so easy, it’s used to teach the concept of the frequency domain. A less-square square wave? You likely peeled off some higher harmonics and added some phase noise.

The point of the experiment is to demonstrate that sound waves in air are not at all like water waves and sonic square waves do, in fact, exist.


The sine-ness of the water waves comes more from the mass and "elasticity" of the water, and not from the impulse driving it. If you put the same plate in air (or a speaker cone projecting a square wave) the mass and elasticity of the air permits much more square-like pressure waves.


People focus too much on your water analogy.

But it is true: if there were a plate that could move in a square-wave pattern, it would not mean all the air in front of and behind it would also move that way.

The air compression would move like a sine instead of a square.


Even if the analogy worked, how do you propose to move a plate in a square wave? You would need to instantaneously move it by the wave amplitude.


Electrical square waves aren't really instantaneous either, they just have a very, very fast rise time.


To produce a true square wave, the speaker cone would have to move with infinite speed.


Yeah, it's probably limited by the speed of sound at least. A 10 mm movement at 20,000 Hz implies a supersonic diaphragm, by some really back-of-the-envelope calculation...


Gibbs effect. To sample a signal you need an antialiasing filter in front of the ADC, and band-limiting a square wave is what produces that ringing.


So, that 2nd sound with the grouping of sine waves that looks like it's starting to form a saw wave is actually rather close to the natural signal of an electric guitar.

When installing pickups into one of my guitars several years ago, I decided I wasn't going to be satisfied by adjusting the height of the pole pieces by ear, nor was I going to be satisfied with matching the height of the pole pieces to the radius of the fretboard and then adjusting by ear. No, I had a hackerspace membership and access to a 100 MHz oscilloscope, so I was going to match the pole piece height to the fretboard radius, and then use the oscilloscope to tweak pole piece height until the output of each individual string matched the others in amplitude.

So there I was, with my guitar running into the oscilloscope, doing my damnedest to pick each string with equal force and watching the amplitude on the scope. And I noticed a curious thing. The initial pluck was a squiggly, kinda-saw-wave-shaped thing, which then, as the note decayed, became something more akin to the 2nd graph in the OP, and then slowly turned into something rather like a sine wave as the note faded.

What happens is, upon picking, there are a lot of higher-order harmonics and other high-frequency content; those higher frequencies quickly decay, and soon you're dealing with mostly just the fundamental and its lower-order harmonics.


You just described spectral envelopes.


What's so surprising? The initial pluck is dominated by the short percussive sound of actually plucking the string, which quickly fades away so what's left is the fundamental sine-like wave. As noted below, spectral envelopes.


Somewhat related, my colleague has studied the interplay between distortion harmonics and chordal harmony in metal music:

https://pdfs.semanticscholar.org/b2e2/c20f8fe3ad45d39bc8f562...


This is really interesting!

One way to get a clearer sound with modern high-gain metal tone is to record each note of a chord running through the amp individually. (I can't remember exactly, but I think this avoids "intermodulation" distortion within the amp itself.)


I wonder, has anyone tried making a guitar that outputs the signal from each string's pickup coils separately, and then an amplifier that distorts each signal separately?

edit: Looks like this might be exactly that: https://www.youtube.com/watch?v=9EUbO59OO_Y


Over the years my EE background led me to wonder about a number of different ways you could improve an electric guitar, only to find out later that they've all been done before. There are quite a few nerds who love music and they've been experimenting for decades.


Yes, a pickup that has an output for each string is called a "hexaphonic pickup".

Note that normal guitar pickups have one or two coils (humbucker) that span all six strings. A hexaphonic pickup has 6 or 12 coils for 6 outputs, one (or two) per string.

The first time I recall hearing about this was with In Flames' album Reroute to Remain. That album has pretty fat distortion sounds.


Martin from Wintergatan (of Marble Machine fame) has made a bass guitar with a separate pickup and output for every string.

This lets him do things like mute one string in software when another starts playing, as he won't always have his hands on the bass to be able to mute the non-playing strings.

https://www.youtube.com/watch?v=rvL83-iy-EQ


Wow, thanks for sharing.


The Line6 Variax does this. With their digital modelers, you can set different effects for each string. Pretty wild.


The ARP Avatar could do this in 1977.


Yes, I can confirm you won't get intermodulation and the harmonic mix will be less dense.


This is one reason Iron Maiden sounds so cool, because they have lots of harmonized guitar parts played by multiple guitarists. (loads of other bands too, but Iron Maiden is especially well known for this.) OTOH, so called "power chords" (root + 5th, or root + 4th) also sound pretty cool because of the "intermodulation" distortion. But it's true that "overly complicated" chords can be a tough listen with heavy amounts of distortion.


Intermodulation distortion creates new tones with frequencies at the sums and differences of the frequencies (and integer multiples of the frequencies) of each pair of tones you're distorting. Usually this produces tones that are harmonically unrelated, but in the case of a power chord it produces a harmonic series with a fundamental one octave below the root of the power chord (0.5 times its frequency).

The fundamentals of the notes of the power chord have frequency ratio 1:1.5

Where i, j, k are all integers:

    0.5 * i = 1 * j + 1.5 * k
    0.5 * i = 1 * j - 1.5 * k

No matter what combination of frequencies you select, you always end up with multiples of 0.5. Even if you extend the power chord with the octave (frequency ratios 1:1.5:2) you still only get multiples of 0.5. The same applies if you add the harmonic overtones of the fundamentals. This generation of a new bass note contributes to the "heavy" sound.
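
A quick brute-force check of that claim, in plain Python (my own illustration, not the parent's):

    # Enumerate sum/difference intermodulation products of a 1 : 1.5 power
    # chord and confirm each lands on a multiple of 0.5 (an octave below the root).
    f = 1.0
    products = set()
    for j in range(-4, 5):
        for k in range(-4, 5):
            p = abs(j * f + k * 1.5 * f)
            if p > 0:
                products.add(p)
    assert all((p / 0.5) % 1 == 0 for p in products)
    print(sorted(products))   # 0.5, 1.0, 1.5, 2.0, ... all multiples of 0.5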

But in reality guitar strings have non-zero thickness, which means they produce notes with overtones that aren't perfectly harmonic (they are non-integer multiples of the fundamental). This means you actually won't get perfect multiples of 0.5 of the root of the chord, but it's close enough that the inharmonicity just adds even more "heavy" sound without sounding like noise.


Dan Worrall's videos for FabFilter about distortion and saturation are great too:

https://www.youtube.com/watch?v=erv4lit4aWY

https://www.youtube.com/watch?v=NO2OZ3UTy2k

Actually all the videos of Dan are amazing.


It's interesting to look at some circuits for guitar distortion stomp boxes like the ProCo Rat[1] or Boss DS-1[2], or others on that electrosmash site. Also, guitar cabinet frequency response drops significantly after 10 kHz, so that rounds sharp high-frequency corners off a bit (try listening to the direct signal from a distorted guitar preamp vs. the same signal through various cabinet impulse responses sometime, or hook up an oscilloscope across the output of a guitar multieffects box on various distortion settings). Nary a square wave to be seen.

[1] https://www.electrosmash.com/proco-rat

[2] https://www.electrosmash.com/boss-ds1-analysis


This is very interesting and a great interactive demo. Could you make it so that when you press another sine wave, the other one stops playing? I accidentally had a couple things playing at once.


I'm not a DSP person, but I thought the term "transfer function" was more typically used in reference to Laplace transforms of linear time-invariant dynamical systems. Is it also common to use it for state-space mappings like the ones in this article?


Yes, these are straight lookup tables/mappings, not transfer functions, which use complex math (s-plane to z-plane transforms for DSP) and can implement various filter shapes.

This is a good entry level guide to distortion, but it misses a few points.

The first is that any non-linear mapping will introduce aliasing, which folds back around the Nyquist frequency. The more clipping, the more aliasing you get. In commercial DSP you deal with this by oversampling, applying the distortion, and then filtering and downsampling to remove the aliasing.

If you don't do that, the sound that comes out is nasty and crude with none of the creamy warmth of the real thing.
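
A minimal sketch of that oversample, clip, then downsample chain, assuming SciPy (an illustration of the idea, not production DSP):

    import numpy as np
    from scipy.signal import resample_poly

    def hard_clip(x, drive=10.0):
        return np.clip(drive * x, -1.0, 1.0)

    def clip_oversampled(x, drive=10.0, factor=8):
        up = resample_poly(x, factor, 1)       # upsample 8x; polyphase filter band-limits
        y = hard_clip(up, drive)               # apply the non-linearity at the high rate
        return resample_poly(y, 1, factor)     # low-pass + decimate removes the alias band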

The other is that high-quality amp distortion emulations - Kemper, etc - are more like real transfer function emulations. They're also level-dependent, so the mapping that comes out isn't one-size-fits all and responds to dynamics.

That doesn't mean delay/echo/reverb, it means a network of very short delays characterised by a Volterra series and/or some form of convolution which models the dynamic tone characteristics of a real guitar amp.


Would the non-linear mapping actually produce aliasing though, since it's in the discrete time domain? My understanding is that you only get aliasing when converting from analog to digital. While the discrete time sequence does map to higher frequency components (imaging), I would think this would need to be filtered in the analog domain (the DAC feeding the speaker).


You make a good point. Indeed, a transfer function is usually used to characterise linear systems. However, non-linear systems also have transfer functions; it's just that they are not scale invariant. A transfer function, technically, is just the "function" characterising the relationship between input and output. In a time-invariant linear system this is the ratio of the output spectrum to the input spectrum; it is invariant to the scale of the input, and so can be characterised by a matrix multiplication (i.e. a linear transformation). For non-linear systems such as distortion, you instead have a time-varying linear transfer function, a gain, depending on the amplitude of the input. So overall it is non-linear, and must be characterised by a function (the distortion curve), not by a single linear operation. One of the consequences is that it's a lot harder to go between the time domain and the frequency domain when dealing with non-linear transformations, so frequency analysis becomes more difficult, though not impossible, depending on the system. For instance, Chebyshev distortion can be analysed in the frequency domain.

So, non-linear transfer functions can be seen as arbitrary functions, or can be seen as linear transfer functions parameterised by some parameter that varies non-linearly with the input (eg a gain modulated by a non-linear function of the input amplitude -- distortion). This second view is called "linearization" where you model a non-linear system with a local linear model, based on the assumption that non-linearities are smooth when viewed "close up", and is needed to apply certain linear control methods to non-linear systems.
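
That last point deserves a sketch: Chebyshev polynomials T_n map a full-scale cosine to exactly its n-th harmonic (since T_n(cos θ) = cos nθ), so you can dial in harmonic levels directly. NumPy assumed; my illustration, not the parent's.

    import numpy as np
    from numpy.polynomial.chebyshev import chebval

    t = np.arange(44100) / 44100
    x = np.cos(2 * np.pi * 220 * t)        # full-scale 220 Hz input
    y3 = chebval(x, [0, 0, 0, 1])          # T3(x): a pure 660 Hz cosine
    y = chebval(x, [0, 0.8, 0.15, 0.05])   # mix of T1..T3 sets an exact harmonic recipe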


Many many years ago, as my little sister was getting into rock and roll, I mentioned my theory that one reason American teenagers gravitate towards it, is that the sound of a distorted guitar resembles the sound of an engine, and that cars symbolize freedom and coming-of-age in our society.

"Okay, so what exactly do you mean by 'distorted', anyway? I hear that term thrown around..."

So we sat down in front of dad's stereo, turned it nice and up, and I played her the intro to _Sorrow_: https://www.youtube.com/watch?v=AdKNlGfkyhc

2 minutes and 14 seconds later: "Oh. Okay. I think I get it now."

...

"You got any more of that?"


That's an interesting theory. But wouldn't the same people who like distorted guitar also like the sound of sports cars and motorcycles? I know there's some overlap, or at least many motorcycle owners listen to rock, but among the total population of rock fans are relatively few motorcycle owners.

(Anecdata: committed rock fan since getting Boston's "More than a Feeling" on 45 in the mid 70s, later played in several punk and hard rock bands ... but I can't stand the sound of Harleys or loud engines)


A lot of the basic rhythms and tempos (aside from the ones directly taken from west African music via American blues) were mimicking locomotives / steam engines.

Personally I think distortion does a good job of conveying emotion / passion as it intuitively gives you a sense of “wow they are really pushing stuff beyond its intended limit”.


So, can we think about distortion as a kind of machine harmonization?

And if yes, can we consider another kind of machine harmonization which would be to add notes to create wanted harmonies, based on an incoming note (modulating velocity, and maybe timing, so as not to overpower the original note)?

And if yes, this could be called "MIDI distortion"... Are there MIDI plugins that do this?

I recently started making (very simple) plugins in the Reaper DAW; it would be possible to do such MIDI harmonization in JSFX. I'm wondering if it's been done already.
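
As a sketch of the idea (plain Python, my own; the real MIDI plumbing via JSFX or similar is left out): the k-th harmonic sits 12·log2(k) semitones above the fundamental, so you can approximate harmonics with extra notes rounded to the nearest semitone.

    import math

    def harmonic_notes(note, n_harmonics=5):
        # offsets for harmonics 2f..nf, in semitones above the played note
        offsets = [12 * math.log2(k) for k in range(2, n_harmonics + 1)]
        return [note + round(o) for o in offsets]

    print(harmonic_notes(60))   # C4 -> [72, 79, 84, 88]

The rounding error (e.g. the 3rd harmonic is 19.02 semitones up, not 19) is where the pitch bend mentioned further down would come in.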


For what it's worth, that's kind of how an organ works. The stops route the signal from each key to different banks of pipes that correspond to harmonics of the note.


Ah thank you, that makes a lot of sense!


Within MIDI you are limited to 128 notes. Within this range you could certainly produce some interesting chordal effects (one thought would be octaves and fifths phasing in and out).

And there are many patches out there to do this sort of thing.

But you are far better off processing the audio signal for any kind of interesting modulations.


Not sure; by definition a software synth can produce an audio signal across the totality of the audible range. To control these we can use notes + pitch bend, or some kind of continuous signal such as virtual CV.

So it seems that at least in theory it should be possible to produce any harmonics using just MIDI.

Why would one want to do this? 1/ It's fun. 2/ It should give more control over which harmonics are actually produced, so maybe it's more interesting to hear.

There are many MIDI effects around "harmony" but they tend to help with chord progressions, and not this AFAIK.


Just a note: with MIDI 2.0 we've now gone from 7-bit to 32-bit, which makes a huuuge difference in terms of what can be achieved. No more 128-note limit!


We've been waiting 30+ years for this... At last!


About the wave shaping, for posters having a hard time visualizing what the mapping looks like:

Imagine that you have a (pure) input sine-wave coming up on the y-axis. It then hits some function on the x-y plane, and then gets "reflected" (mapped) onto the x-axis.

This image explains it quite well, and in this case there's a linear transfer characteristic / operating curve, with no distortion of the input signal: https://www.tutorialspoint.com/amplifiers/images/input_cycle...

Now, instead of imagining a perfect linear operating curve, think of what the curve of a diode looks like, and how that would clip the upper half of the input.

Now put another diode for the reverse, and you're also clipping the bottom curve of input - this would be a basic diode clipping circuit, which is the fundamental drive-stage for distortion pedals like tube screamers - you'll often see a pair of diodes going opposite directions.

(In electronics classes, you tend to learn how to NOT generate distortion - or rather, how to amplify stuff with least amount of distortion possible.)
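
As a sketch of that drive stage in code (a common smooth approximation of a back-to-back diode pair, not a circuit simulation; the 0.6 V forward voltage is an assumption):

    import numpy as np

    def diode_pair_clipper(v, vf=0.6):
        # Below the forward voltage the pair barely conducts (roughly unity gain);
        # past it, the diodes shunt the signal and squash the peaks.
        return vf * np.tanh(v / vf)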


TL;DR: Wave shaping is just function composition.

I don't know why a new term is needed, but if you can visualize function composition for simple functions like floor, min, max, you can visualize wave shaping.

If the input signal is f(x), you obtain clipped signal g(x) like this:

    g(_) = min(1, _) ○ max(-1, _) ○ (a * _) ○ f(_)
This can also be written as

    g(x) = min(1, max(-1, a * f(x)))

In the example in the article, T(x) = min(1, max(-1, ax)) is called a transfer function.

I find it easier to understand function composition by breaking the functions down into simple parts, though.

The sonic effects of this are completely unexplained in the article. Other comments here discuss the effects on harmonics; mathematically, the TL;DR is that the Fourier series of g = T ○ f is more interesting (the intuition why can be obtained by looking at the Fourier series for a square wave).

Without getting into math: a clipped signal retains some of the input, and adds more of square wave characteristics. A square wave is the buzz you get by flipping + and - very rapidly, which will force the speaker cone to go all the way in/out very rapidly.

The reason why math above doesn't fully represent the effect is that an actual speaker cone won't be able to teleport to min/max positions instantaneously; and the unfaithfulness of the speaker cabinet creates pleasing tonal characteristics.


Cool article. Pretty straightforward but very cool to see the interactive examples.

Interestingly I found the simple "clipped" and "clipped/boosted" functions by far the most pleasing.

I doubt it's a coincidence that's what most analog FX are doing.

The distinctions between these are usually murky but in guitar FX:

    Boost      - boost
    Overdrive  - boost + soft clipping
    Distortion - boost + hard/soft clipping
    Fuzz       - boost + mostly hard clipping

All the effects are generally doing one or more of the following:

- Boost the signal through a transistor till the transistor saturates/cuts off and clips

- Do the same with an op-amp

- Use diodes for an extra clipping stage

In analog audio the signal is literally an alternating current... the "values" in digital audio correspond to +/- voltage levels. So his transfer functions are trying to mimic what electronic components do when they hit their limits. The transfer functions can also do things that are probably impossible in the analog world.

There's a lot of weird non-linear stuff that seems to bring out harmonics as well.
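
That harmonic content is easy to see numerically; a hedged NumPy sketch of my own (not from the article): clip a sine both ways and read the odd-harmonic levels off the FFT.

    import numpy as np

    fs = 44100
    t = np.arange(fs) / fs                 # exactly one second -> 1 Hz FFT bins
    x = np.sin(2 * np.pi * 220 * t)
    for name, y in [("soft", np.tanh(5 * x)), ("hard", np.clip(5 * x, -1, 1))]:
        mags = 2 * np.abs(np.fft.rfft(y)) / len(y)
        print(name, [round(float(mags[220 * k]), 3) for k in (1, 3, 5, 7)])

Both nonlinearities are symmetric, so only odd harmonics appear; the hard clip's upper harmonics decay more slowly, which is the harsher "fuzz" character.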


One kind of distortion that I loved to apply in my FastTracker II days, and which I don't see used all that often, is to flip the sign.

In short, FT2's sample editor could handle a few common waveform formats, but these were cross-platform days and there were a lot of wave formats. They have different headers and different structures, but many of them ended up being just a series of samples, either 8-bit or 16-bit, at some sample frequency. To deal with this, FT2 added a button that let you reinterpret a waveform as if it were stored signed or unsigned, and one that let you reinterpret it as 8/16 bit. This let it load a huge number of sample formats well enough; you'd fiddle around with those buttons until it sounded right, then you cut the first few milliseconds of noise (the file format header) off the front, and done!

But turns out that the "reinterpret as (un)signed" button (called "Conv" in FT2) is pretty spectacular for nasty distortion too.

Imagine a sine wave. Loaded correctly, it'll look like this:

     +-----------------------------------------------------------------------------+
     |                                                                             |
     |                                                                            X|
     |                  XXXXX                                                   XXX|
     |               X       XX                                                X   |
     |              X          XX                                            XX    |
     |                                                                      X      |
     |           X               XX                                        XX      |
     |                             X                                      X        |
     |         X                   X                                    X          |
     |                              X                                 XX           |
     +-----------------------------------------------------------------------------+
     |      X                        X                              X              |
     |     X                          X                            X               |
     |    X                           X                           X                |
     |                                 X                        XX                 |
     |   X                             XX                      X                   |
     |                                  X                    X                     |
     |  X                                X                 X                       |
     | X                                 XX            X XX                        |
     | X                                  XXXXXX X  XXX                            |
     |                                                                             |
     +-----------------------------------------------------------------------------+

Imagine this was stored as 8-bit signed. Now, if you reinterpret those bytes as unsigned, it'll look like this:

     +-----------------------------------------------------------------------------+
     |      XX                      XX                              XX             |
     |     X X                      X X                            X X             |
     |    X  X                      X X                           X  X             |
     |       X                      X  X                        XX   X             |
     |   X   X                      X  XX                      X     X             |
     |       X                      X   X                    X       X             |
     |  X    X                      X    X                 X         X             |
     | X     X                      X    XX            X XX          X             |
     | X     X                      X     XXXXXX X  XXX              X             |
     |       X                      X                                X             |
     +-----------------------------------------------------------------------------+
     |       X                      X                                X             |
     |       X                      X                                X            X|
     |       X          XXXXX       X                                X          XXX|
     |       X       X       XX     X                                X         X   |
     |       X      X          XX   X                                X       XX    |
     |       X                      X                                X      X      |
     |       X   X               XX X                                X     XX      |
     |       X                     XX                                X    X        |
     |       X X                   XX                                X  X          |
     |       X                      X                                XXX           |
     +-----------------------------------------------------------------------------+
Notice how the top half and the bottom half of the sine waves are swapped.

Now, I don't understand much about DSP and sound and the frequency domain. Better programmers than me have told me that this makes no sense from an audio/music perspective. But it sounds pretty cool; you keep a lot of the essence of the original sound, and add a lot of nasty.
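
In code, the "Conv" trick is just a sign-bit flip; a hedged NumPy sketch (my reconstruction, not FT2's actual implementation):

    import numpy as np

    def conv_distort(x):
        # x: float samples in [-1, 1]
        s8 = np.round(x * 127).astype(np.int64)   # quantise to the signed 8-bit range
        flipped = (s8 % 256) - 128                # read the raw byte as unsigned, re-centre
        return flipped / 127.0                    # same as flipping the sign bit (XOR 0x80)

Positive samples drop to the bottom half and negative ones jump to the top, exactly the swap shown in the ASCII plots above.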

I once made a VST plugin that does this (it lets you slide the waveform up, wrapping over to the bottom; its only parameter is by how much to slide), but I don't think I have it anymore unfortunately.

For context, applied to a random bassdrum using 8bitbubsy's amazing Windows clone of FT2[0]:

- original: http://neuvostoliitto.nl/808_drum_kick_035_21338.wav

- distorted: http://neuvostoliitto.nl/808_drum_kick_fucked.wav

Just sharin' :-)

[0] https://16-bits.org/ft2.php


That's the shape of waves in wavefolding and complex oscillator modules [1] in the modular world. So it makes perfect sense (and sounds very cool).

[1] https://www.youtube.com/watch?v=jOx9z0SPDek
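
For the curious, the two shapes differ like this (a minimal NumPy sketch of mine): clipping flattens anything past the limit, while wavefolding reflects it back into range.

    import numpy as np

    def clip(x, gain=2.0):
        return np.clip(gain * x, -1.0, 1.0)        # flat tops

    def fold(x, gain=2.0):
        y = gain * x
        return np.abs(np.abs(y - 1) % 4 - 2) - 1   # triangle-fold overshoot back down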


Exactly what I saw when I looked at the waveshapes; wavefolding is really cool and powerful. And it's not only done through modulars: my Digitone can do some wavefolding through the offsets feature for the modulators/carrier.


> Now, I don't understand much about DSP and sound and the frequency domain.

Without going into theory, just look at it. It's the addition of a square wave and a sine wave now. And a square wave has lots of harmonics, giving that buzzy sound.


> And a square wave has lots of harmonics

A square wave is effectively 100% distortion. It's an infinite series of sine waves (although the infinite series isn't needed to generate a square wave, just the sum of harmonics up to Nyquist).


Well, no. In any finite range of frequencies, a noise signal has infinitely more spectral components than a square wave has harmonics. In the audible frequency and dynamic ranges, you realistically can’t perceive more than maybe 7 harmonics in a square wave, and it sounds very different from a noise signal.


Not entirely, because the square wave isn't regular. In graphical terms, the space between each time the square goes up and down is irregular. That gives it a totally different (and, notably, more atonal) sound than a normal square wave.

Also, the sine wave got flipped, half, kinda. Just mixing a square wave with something has a totally different effect than this: it'll sound like the original sound and a square wave tone playing together.


I think it's the multiplication of the two waveforms (aka mixing): the spectra are convolved. It's a common technique. Square waves are a fun case to explore here, because you might start finding cases where something you might just consider 180-degree phase jumps (PSK) actually affects the magnitude spectrum (square wave). They don't teach you that in school!


I believe this is called wavefolding - there's a button for it in the demo which gives a very similar picture to yours.


For a really cool and detailed breakdown of almost any audio effect, check out the book DAFX: Digital Audio Effects.


Fantastic article! I loved how the transfer functions are shown, with audio examples! I'm going to use this for writing an effects patch for my synthesizer :)

Speaking of synthesizers, my favorite line was "I’m building a synthesizer in javascript at the moment". I'd love to see this!


That's awesome, thanks!


Great use of interactive visuals/audio to help explain things.

(I look forward to this becoming more common.)


I recall something like this from a sound system engineering course I took in 1990. It was a course for people who wanted the job of setting up speakers and operating the mixing board. (It was supposed to be a course to become a recording studio sound engineer... I'm still pissed.)

The instructor would often mention odd transharmonic distortion. It sounds pleasant to our ears but even transharmonic does not. That's about all we were told about that.

Years later I was reading about brains and somehow I understood transharmonic distortion. The article mentioned something that I now forget but it was clear to me why we like to listen to loud rock music.


It would be nice if both waveshaping graphs were side by side so I don't have to keep scrolling up and down to change the function, and so I can see the waveform at the same time as the transfer function.


Very cool demos to play around with there! I was hoping it would go into the psychology of why distortion works, but this was still awesome.

I've found that the nice thing about distortion, especially in shoegaze music like My Bloody Valentine, is that the noise makes what you hear sort of an "audio illusion" in that you start picking up melodies that maybe weren't really there. It tickles your ears. Ironically, you could say the noise smooths out the wrinkles in the piece.


Distortion is for pretending I meant to clip my recordings at a frequency I didn't notice on studio monitors.

Let's talk about bitcrunching! Now that's fun. And definitely not the other way I cover up poor recording!


This is super cool. I really enjoyed the demos, and the explanation of what distortion actually does to the waveform was very clear.


Thanks!


I enjoyed the "very noisy" setting, actually sounds like most of the radio dial.


why are the final audio samples so terrible/quiet? did no one actually try them out?



