maxcrofts.com

Shepard-Risset Flanger in Web Audio

Before they formally became the supergroup Swedish House Mafia, the trio collaborated with Laidback Luke in 2009 to produce Leave The World Behind. For the uninitiated:

The track sits at 128 beats per minute in E♭ minor, with stripped-back production yet impeccable sound design, the defining feature of the group's first era. Infamously, its kick drum has been sampled in countless tracks by other producers, making it an EDM staple second only to the Pryda snare. But beyond the percussion there is a white-noise riser percolating in the background, driving the tension throughout. The FX stem during the drop looks like:

[Figure: FX stem (bars 129–160). Axes: Time (s) against Frequency (Hz, 50 Hz to 16 kHz); colour scale −60 to 0 dBFS.]

The broadband noise drowns out the pitched movement, so it helps to fold the spectrum onto a musical grid. Instead of plotting raw frequency, we sum the energy into one bin per semitone, then subtract a local noise floor (the median level across the surrounding octave) from every bin.
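The folding step can be sketched roughly as follows. This is a minimal illustration, not the code behind the plots: the function name `semitoneProfile`, the MIDI note range (24 to 120, i.e. C1 to C9), and the choice of six semitones either side as "the surrounding octave" are all assumptions.

```javascript
// Fold a linear-frequency power spectrum onto one bin per semitone, then
// subtract a local floor (the median level across the surrounding octave).
// `power` holds one power value per FFT bin, from DC up to Nyquist.
function semitoneProfile(power, sampleRate, fftSize, loMidi = 24, hiMidi = 120) {
	const nNotes = hiMidi - loMidi + 1;
	const energy = new Float64Array(nNotes);
	for (let k = 1; k < power.length; k++) {
		const freq = (k * sampleRate) / fftSize;
		// Nearest equal-tempered semitone, with A4 = 440 Hz = MIDI 69.
		const midi = Math.round(69 + 12 * Math.log2(freq / 440));
		if (midi >= loMidi && midi <= hiMidi) energy[midi - loMidi] += power[k];
	}
	// Tiny offset avoids log(0) for semitones no FFT bin falls into.
	const db = Array.from(energy, (e) => 10 * Math.log10(e + 1e-12));
	return db.map((level, i) => {
		const lo = Math.max(0, i - 6);
		const hi = Math.min(nNotes, i + 7);
		const neighbours = db.slice(lo, hi).sort((a, b) => a - b);
		const floor = neighbours[Math.floor(neighbours.length / 2)];
		return Math.max(0, level - floor); // dB above the local floor
	});
}
```

Summing linear FFT bins into semitones tilts the noise floor upward (higher notes span more bins), which is one reason the floor is estimated locally rather than as a single global level.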

[Figure: FX stem (bars 129–160), folded onto semitones. Axes: Bar against Note (C1 to C9); colour scale 0 to 10 dB above floor.]

What falls out from this analysis is that the perceived pitch of the riser increases a semitone a bar, with its lowest point around B♭2. Clearly, though, I'm burying the lede. This is a near-textbook example of a Shepard-Risset glissando: an auditory illusion perceived as a constant increase in pitch. The illusion comes from stacking several voices a fixed interval apart and bending them upward together. Each voice fades in at the bottom of the range and out at the top, with the combined signal remaining at a constant amplitude. This leaves no audible "seam" for the ear to catch. What's interesting in this instance, though, is that the Swedes didn't reach for traditional tonal oscillators. Instead, they used white noise fed through a flanger.
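To make the stacking concrete, here is a small sketch of the voice layout at one instant of a rising glissando. The raised-cosine fade, the function name `shepardVoices`, and the default values are assumptions chosen for illustration; the post does not say which window the original uses.

```javascript
// Frequencies and gains for one instant of a rising Shepard-Risset stack.
// `phase` runs 0..1 over one loop; voices sit `spacingSt` semitones apart.
function shepardVoices(phase, { nVoices = 8, spacingSt = 12, fLow = 55 } = {}) {
	const voices = [];
	for (let i = 0; i < nVoices; i++) {
		const pos = (i + phase) / nVoices; // 0 at the bottom of the range, 1 at the top
		const freq = fLow * 2 ** (((i + phase) * spacingSt) / 12);
		// Raised cosine (an assumed window): silent at both ends, loudest mid-range.
		const gain = 0.5 - 0.5 * Math.cos(2 * Math.PI * pos);
		voices.push({ freq, gain });
	}
	return voices;
}
```

With this window the gains always sum to `nVoices / 2`, and at the end of a loop each voice lands exactly where its upper neighbour started, so wrapping `phase` from 1 back to 0 leaves the set of frequencies unchanged.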

Flanging and Pitch

A flanger is built around a very short delay line (i.e. measured in milliseconds) which is then combined with the dry signal that fed it, with optional feedback:

[Figure: A generic flanger. Signal-flow diagram with blocks labelled input, delay and output, plus dry and feedback paths.]

The summation creates a comb filter with evenly spaced peaks and notches set by the length of the delay: where a whole number of periods fits into the delay time the two copies reinforce; where the delayed copy lands half a cycle out of phase they cancel.

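[Figure: comb response. Axes: Frequency (ticks at 1/D to 5/D) against Amplitude (0 to 1); peaks labelled "reinforce", notches labelled "cancel", spacing 1/D.]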
The comb in the frequency domain: summing the signal with a copy delayed by D reinforces every frequency whose period divides the delay and cancels those landing half a cycle off, building peaks and notches spaced 1/D apart.
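For the simplest case, a dry signal plus one delayed copy with no feedback, the response has a closed form: the gain at frequency f is |cos(πfD)| once normalised to a peak of 1. A quick sketch (the function name `combGain` is just for illustration):

```javascript
// Normalised magnitude response of y(t) = (x(t) + x(t - D)) / 2.
// Peaks (gain 1) fall at multiples of 1/D; notches (gain 0) fall halfway between.
function combGain(freqHz, delaySec) {
	return Math.abs(Math.cos(Math.PI * freqHz * delaySec));
}
```

With a 1 ms delay, for example, the peaks land at 1 kHz, 2 kHz, 3 kHz and so on, with notches at 500 Hz, 1.5 kHz, 2.5 kHz.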

By modulating the delay time with an LFO you can vary this interference over time, creating the classic flanged sound.

[Figure: Sweeping delay. Traces labelled signal, delayed and sum.]
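As a rough offline illustration of the sweep, the sketch below runs a feed-forward flanger (no feedback path) over an array of samples, with a sine LFO moving the delay time. The function name `flange` and the default settings are assumptions, and it expects `baseMs >= depthMs` so the delay never goes negative.

```javascript
// Offline feed-forward flanger: mixes each sample with a copy read from
// a delay that a sine LFO sweeps between (baseMs - depthMs) and (baseMs + depthMs).
function flange(input, sampleRate, { baseMs = 3, depthMs = 2, rateHz = 0.25 } = {}) {
	const out = new Float32Array(input.length);
	for (let n = 0; n < input.length; n++) {
		const lfo = Math.sin((2 * Math.PI * rateHz * n) / sampleRate);
		const delay = ((baseMs + depthMs * lfo) / 1000) * sampleRate; // in samples
		const pos = n - delay; // fractional read position
		const i0 = Math.floor(pos);
		const frac = pos - i0;
		// Samples before the start (or past the end) of the input count as silence.
		const a = i0 >= 0 ? input[i0] : 0;
		const b = i0 + 1 >= 0 && i0 + 1 < input.length ? input[i0 + 1] : 0;
		const delayed = a * (1 - frac) + b * frac; // linear interpolation
		out[n] = 0.5 * (input[n] + delayed);
	}
	return out;
}
```

Setting `depthMs` to 0 freezes the delay and reduces this to the static comb described above.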

Point such a filter at white noise and you get tone; white noise carries energy at every frequency, so the comb's teeth always have something to grab. The peaks sit at integer multiples of 1/D, which is a harmonic series, and the ear hears that series as a single tone at the fundamental 1/D.
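Since the perceived pitch is 1/D, picking a note is just a matter of inverting its frequency. A small sketch, assuming equal temperament with A4 = 440 Hz and the usual MIDI numbering (C4 = 60); the helper names are illustrative:

```javascript
// Equal-tempered frequency for a MIDI note number (A4 = 69 = 440 Hz).
const midiToHz = (midi) => 440 * 2 ** ((midi - 69) / 12);

// Delay (in seconds) whose comb fundamental 1/D lands on that note.
const delayForMidi = (midi) => 1 / midiToHz(midi);

const bFlat2 = 46; // B♭2, the low point of the riser in the analysis above
console.log(midiToHz(bFlat2).toFixed(2)); // 116.54 (Hz)
console.log((delayForMidi(bFlat2) * 1000).toFixed(2)); // 8.58 (ms)
```

Raising the pitch by a semitone means shortening the delay by a factor of 2^(−1/12), roughly 5.6%.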

Web Audio

The Web Audio API models sound as a graph. Every processing block (e.g. an oscillator, a filter, or a gain stage) is an AudioNode attached to an AudioContext, and you build your signal chain by calling connect on those nodes down to the context's destination. The graph runs on a dedicated audio thread, so once it is patched the main thread only has to nudge the odd parameter.

const ctx = new AudioContext();

// A 440 Hz saw wave, shaped by a low-pass filter, into a gain stage.
const osc = new OscillatorNode(ctx, { type: "sawtooth", frequency: 440 });
const filter = new BiquadFilterNode(ctx, { type: "lowpass", frequency: 800 });
const gain = new GainNode(ctx, { gain: 0.2 });

osc.connect(filter);
filter.connect(gain);
gain.connect(ctx.destination);

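// Browsers generally require a user gesture before audio can start.
// `playButton` is assumed to be a <button> element already on the page.
playButton.addEventListener(
	"click",
	async () => {
		await ctx.resume(); // the context may start out suspended
		osc.start();
	},
	{ once: true }, // an OscillatorNode can only be started once
);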

Interestingly, the API doesn't come with a built-in noise generator, so we have to build our own. The recommended method is to create an AudioWorklet, whose processor runs off the main thread on the audio rendering thread, much like a web worker.

class WhiteNoiseProcessor extends AudioWorkletProcessor {
	process(inputs, outputs) {
		for (const channel of outputs[0]) {
			for (let i = 0; i < channel.length; i++) {
				channel[i] = Math.random() * 2 - 1;
			}
		}
		return true;
	}
}

registerProcessor("white-noise", WhiteNoiseProcessor);

We can implement the Shepard-Risset flanger in such a processor like so:

class ShepardRissetFlangerProcessor extends AudioWorkletProcessor {
	constructor({ processorOptions: o }) {
		super();
		this.loopS = o.loopS; // re-seed period (s)
		this.spanSt = o.spanSt; // semitones swept per loop = voice spacing
		this.nVoices = o.nVoices;
		this.f0 = o.f0; // lowest comb fundamental (Hz)
		this.fb = o.fb; // feedback / resonance (0..~0.97)

		this.BUF = 8192; // per-voice delay line, >= the longest delay
		this.bufs = Array.from(
			{ length: this.nVoices },
			() => new Float32Array(this.BUF),
		);
		this.w = 0; // shared write head
	}

	process(inputs, outputs) {
		const out = outputs[0][0];
		const inp = inputs[0]?.[0];
		const dTop = sampleRate / this.f0; // longest delay, in samples
		const k = this.spanSt / 12; // octaves swept per loop
		const loopFrames = this.loopS * sampleRate;
		const BUF = this.BUF;

		for (let n = 0; n < out.length; n++) {
			const x = inp ? inp[n] : 0;
			const cyc = ((currentFrame + n) / loopFrames) % 1; // 0..1 over the loop
			let sum = 0;
			for (let i = 0; i < this.nVoices; i++) {
				// voice i rides spanSt semitones above its neighbour; cyc lifts them all
				const D = dTop * Math.pow(2, -(i + cyc) * k);
				const buf = this.bufs[i];
				const r = (this.w - D + BUF) % BUF; // fractional read head
				const i0 = r | 0;
				const frac = r - i0;
				const delayed = buf[i0] * (1 - frac) + buf[(i0 + 1) % BUF] * frac;
				const y = x - this.fb * delayed; // resonant inverted comb
				buf[this.w] = y;
				sum += y;
			}
			out[n] = sum / Math.sqrt(this.nVoices);
			this.w = (this.w + 1) % BUF;
		}
		return true;
	}
}

registerProcessor("shepard-risset-flanger", ShepardRissetFlangerProcessor);
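As a sanity check on where a recursion of the form `y = x - fb * delayed` resonates, its idealised magnitude response can be written down directly. The sketch below ignores the linear interpolation and the moving delay, so treat it as an approximation; the function name `invertedCombGain` is made up for illustration.

```javascript
// Idealised |H(f)| for y[n] = x[n] - fb * y[n - D], with D expressed in seconds.
// H = 1 / (1 + fb * e^(-j 2π f D)), so |H| = 1 / sqrt(1 + 2 fb cos(2π f D) + fb²).
function invertedCombGain(freqHz, delaySec, fb) {
	const c = Math.cos(2 * Math.PI * freqHz * delaySec);
	return 1 / Math.sqrt(1 + 2 * fb * c + fb * fb);
}
```

Under that idealisation the peaks are still spaced 1/D apart, but the negative sign shifts them to odd multiples of 1/(2D), with dips at the multiples of 1/D. With a 10 ms delay and `fb = 0.9`, for instance, the gain is about 10 at 50 Hz and 150 Hz and about 0.53 at 100 Hz, so the lowest resonance may sit an octave below the `f0` passed in. A positive feedback sign would instead put the peaks on the multiples of 1/D.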

Putting it all together:

AudioWorkletNode (white-noise) → AudioWorkletNode (shepard-risset-flanger) → BiquadFilterNode ×8 (post) → AnalyserNode → GainNode → AudioDestinationNode
The final Web Audio chain
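Patching a chain like this by hand gets repetitive, so a tiny helper can do the wiring. This is only a sketch: `chain` is a made-up helper, and in the commented usage the module file names, every `processorOptions` value, and the untouched biquad settings are placeholders rather than the values used here.

```javascript
// Connect nodes in series (a -> b -> c ...) and return the last one.
// Works with anything exposing a Web Audio style connect(destination) method.
function chain(...nodes) {
	for (let i = 0; i < nodes.length - 1; i++) {
		nodes[i].connect(nodes[i + 1]);
	}
	return nodes[nodes.length - 1];
}

// Browser-only usage, not executed here. All names and values are placeholders.
// await ctx.audioWorklet.addModule("white-noise.js");
// await ctx.audioWorklet.addModule("shepard-risset-flanger.js");
// const noise = new AudioWorkletNode(ctx, "white-noise", { numberOfInputs: 0 });
// const flanger = new AudioWorkletNode(ctx, "shepard-risset-flanger", {
// 	outputChannelCount: [1],
// 	processorOptions: { loopS: 1.875, spanSt: 1, nVoices: 8, f0: 116.54, fb: 0.9 },
// });
// const post = Array.from({ length: 8 }, () => new BiquadFilterNode(ctx));
// const analyser = new AnalyserNode(ctx);
// const gain = new GainNode(ctx, { gain: 0.2 });
// chain(noise, flanger, ...post, analyser, gain, ctx.destination);
```

The `processorOptions` object is what arrives in the processor's constructor, which is where the flanger above reads `loopS`, `spanSt`, `nVoices`, `f0` and `fb` from.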