Bad Audio Sandbox

Sound A

Left:

Right:

Sound B

Left:

Right:


ABX test


    MUSHRA Style test

    MUSHRA-like Testing

    ?

    A MUSHRA-inspired test. MUSHRA is a methodology intended to address some of the issues with ABX testing.

    Sound A is the reference. Set Sound B so that the quality difference is clearly audible, but only just.

    Several sounds are created that vary between A and B, as well as an "anchor" sound of reduced quality.

    Rank each sound according to the perceived quality.

    The graph only gives feedback that a sound is playing. Noise is deliberately added to obscure the waveforms, so they will not match the sounds.

    This is only a proof of concept and isn't meant for serious scientific research.

    https://en.wikipedia.org/wiki/MUSHRA

    https://www.itu.int/dms_pubrec/itu-r/rec/bs/R-REC-BS.1534-3-201510-I!!PDF-E.pdf

    Criterion: Closest to A (best quality)

    Test

    Score

    100

    Excellent

    80

    Good

    60

    Fair

    40

    Poor

    20

    Bad

    0

    MUSHRA-like Testing - Results

    ?

    A MUSHRA-inspired test. MUSHRA is a methodology intended to address some of the issues with ABX testing.

    The heat map shows where your scores were; the black dots show the mean score.

    The thin blue line is the line of best fit. The wide band shows the expected line.

    More analysis metrics are being considered.

    https://en.wikipedia.org/wiki/MUSHRA

    https://www.itu.int/dms_pubrec/itu-r/rec/bs/R-REC-BS.1534-3-201510-I!!PDF-E.pdf

    Sound
    P-Value
    Meaning

    *Probability that "The sound is indistinguishable from reference A" (Null Hypothesis)


    Setup

    ?

    Turn on Stereo to investigate how differences between the sounds reaching each ear are perceived.

    Normalise either 'together', which scales A and B by the same amount and keeps their relative levels, or 'individually', which brings them to the same volume. If you are testing phase changes, 'together' keeps all harmonics at the same level. If you are testing distortion, the levels may need to be the same.
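The two normalisation modes can be sketched as follows. This is only an illustration, assuming peak normalisation; the function names `peak` and `normalisePair` are hypothetical and the page's own code may measure level differently.

```javascript
// Largest absolute sample value in a buffer.
function peak(samples) {
  let max = 0;
  for (const s of samples) max = Math.max(max, Math.abs(s));
  return max;
}

// mode "together": one shared gain, so relative levels are preserved.
// mode "individually": each sound is scaled to the same peak (assumed behaviour).
function normalisePair(a, b, mode, target = 1) {
  const pa = peak(a);
  const pb = peak(b);
  const scale = (buf, p) => buf.map((s) => (p > 0 ? (s * target) / p : s));
  if (mode === "together") {
    const shared = Math.max(pa, pb);
    return [scale(a, shared), scale(b, shared)];
  }
  return [scale(a, pa), scale(b, pb)];
}
```

With `a = [0.5, -0.25]` and `b = [0.1, 0.2]`, 'together' leaves B at 40% of A's peak, while 'individually' brings both peaks to 1.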

    You can also save and load setups (patches) to share or repeat later. There are a few example patches to fetch and try out.

    Background info and explanations

    Normalise:

    Additive Tone Synthesis


    50Hz ?
    Frequency of first harmonic (fundamental)
    0Hz ?
    Fine adjust for frequency

    Harmonic Series

    ?
    Additive synthesis of the basic tone, that is, building it from sine waves. The preview shows a single cycle of the wave. The graph shows the harmonics and phases used (this is not an FFT).
    same ?
    Controls the balance between the first harmonic and the others.
    -1 ?
    First slider controls the level and polarity of the harmonics. The second slider controls how much the polarity alternates, full on will alternate polarity for each harmonic added.
    1/n^1.8 ?
    Controls how loud higher harmonics are. For even harmonics, 1/n is typical of sawtooth waves.
    1 ?
    First slider controls the level and polarity of the harmonics. The second slider controls how much the polarity alternates, full on will alternate polarity for each harmonic added.
    1/n^1.8 ?
    Controls how loud higher harmonics are. For odd harmonics, 1/n is typical of square and sawtooth waves, 1/n^2 is like a triangle wave.
    0 ?
    A fixed phase offset applied to all harmonics, adjusting the phase of the waves at t=0. Needed for pulse waves.
    0 ?
    How often the polarity alternates as harmonics are added. Usually set to 2 when the harmonics are to alternate in polarity (e.g. triangle). Equivalent to the duty cycle when set up for a pulse wave with values other than 2.
    0 ?
    Adjusts where the positive, negative and zero values of the alternations fall. When at zero, with Alternation Freq set to 2, the alternations fall on odd harmonics. Change to +/-1 to fall on even harmonics.
    0 ?
    Continue to add harmonics past the Nyquist limit to create deliberate aliasing. The level of each harmonic takes into account the settings in the Additive Filter section. It is as if the ADC came after this filter.
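A minimal sketch of the additive idea described in this section: one cycle is built by summing sine harmonics, with separate odd and even levels and a 1/n^power roll-off. The function name `additiveCycle` and its option names are hypothetical, and the polarity alternation and aliasing controls are left out.

```javascript
// Build one cycle by summing sine harmonics.
// oddLevel / evenLevel: level (and polarity) of the odd and even harmonics.
// power: roll-off exponent, so harmonic n has level 1 / n^power.
// phase: fixed phase offset applied to every harmonic, in radians.
function additiveCycle({
  length = 1024,
  maxHarmonic = 64,
  oddLevel = 1,
  evenLevel = 0,
  power = 1,
  phase = 0,
} = {}) {
  const cycle = new Float64Array(length);
  for (let n = 1; n <= maxHarmonic; n++) {
    const level = (n % 2 === 1 ? oddLevel : evenLevel) / Math.pow(n, power);
    if (level === 0) continue;
    for (let i = 0; i < length; i++) {
      cycle[i] += level * Math.sin((2 * Math.PI * n * i) / length + phase);
    }
  }
  return cycle;
}
```

Odd harmonics only with `power = 1` approximates a square wave, whose plateau sits near π/4 at this scaling.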

    Envelope

    ?
    The envelope applied to the sound. The whole envelope is smoothed by a low pass filter.
    0.005s ?
    Controls the time for the linear attack part of the envelope
    0s ?
    Controls the time held at maximum between attack and decay
    0.4s ?
    Controls the time for the exponential decay part of the envelope to fall to -60 dB
    150 ?
    Controls the cut-off of the low pass filter which smooths the envelope. 0=off, 1000 = very very low cut-off frequency
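The attack, hold and decay stages described above could be sketched like this. The name `envelopeAt` is hypothetical, and the low-pass smoothing stage is deliberately omitted, so this is the unsmoothed shape only.

```javascript
// Envelope gain at time t (seconds): linear attack, hold at 1,
// then exponential decay that reaches -60 dB after `decay` seconds.
function envelopeAt(t, { attack = 0.005, hold = 0, decay = 0.4 } = {}) {
  if (t < 0) return 0;
  if (t < attack) return t / attack;
  if (t < attack + hold) return 1;
  if (decay <= 0) return 0; // no decay stage: cut straight to silence
  const sinceDecay = t - attack - hold;
  // -60 dB is a factor of 10^-3, reached when sinceDecay === decay.
  return Math.pow(10, (-3 * sinceDecay) / decay);
}
```

With the default values shown in this section (0.005 s, 0 s, 0.4 s), the gain is 0.5 halfway through the attack and about 0.001 at 0.405 s.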
    Additive Filter


    ?
    This is an additive simulation of a filter: it just controls the levels of the sine waves being added together. No phase change is introduced. It is intended to help control the frequency content of the sample and to vary it over time using the envelope.
    12db/octave ?
    Slope of filter after cut-off
    20000Hz ?
    Start frequency of filter
    20000Hz ?
    Hold (or end of attack) frequency of filter
    20000Hz ?
    End frequency of filter
    0.005s ?
    Controls the time for the linear (in frequency) attack of filter envelope
    0s ?
    Controls the time filter held at maximum between attack and decay
    0.4s ?
    Controls the time for the linear (in frequency) decay of filter envelope
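As a rough illustration of a filter that only scales harmonic levels, the sketch below applies a Butterworth low-pass magnitude curve to each harmonic and leaves phase untouched. The names `filterGain` and `filteredLevels` are hypothetical, and treating each 6 dB/octave of slope as one filter order is an assumption about how the slope control maps to the curve.

```javascript
// Butterworth low-pass magnitude at `freq` for a given cut-off.
// Each 6 dB/octave of slope is treated as one filter order (assumption).
function filterGain(freq, cutoff, slopeDbPerOctave = 12) {
  const order = slopeDbPerOctave / 6;
  return 1 / Math.sqrt(1 + Math.pow(freq / cutoff, 2 * order));
}

// Scale the level of each harmonic (index 0 = fundamental); phases are untouched.
function filteredLevels(fundamental, levels, cutoff, slopeDbPerOctave = 12) {
  return levels.map(
    (level, i) => level * filterGain(fundamental * (i + 1), cutoff, slopeDbPerOctave)
  );
}
```

At the cut-off the gain is 1/√2 (about -3 dB), and four octaves above a 12 dB/octave cut-off it is about -48 dB.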
    Additive Phase shifts


    ?
    Controls how and by how much the phase of the first harmonic is shifted and also if harmonics are shifted, too.

    Root Phase shift

    0π (0ms) ?
    Phase delay of first harmonic for Sound A. Affects 2nd harmonic according to settings in common section.

    Envelope Mode

    Fixed envelope for all harmonics ?
    Controls whether the envelope of each harmonic is shifted so that it starts at phase=0 of the harmonic, or whether all harmonics start at the same time but with shifted phases.

    Higher harmonic shift

    0 ?
    Controls how much the higher harmonics (if present) are delayed. It is relative to the amount of phase delay on the first harmonic: 0 means no delay and 1 means the same phase delay as the first harmonic.
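The relation between a phase shift and the time shown next to it (for example "0π (0ms)") can be sketched as below. Both function names are hypothetical, and the reading of the higher harmonic shift as a fraction of the root phase angle (rather than of its time delay) is an assumption.

```javascript
// Time delay in milliseconds that corresponds to a phase shift at a frequency.
function phaseToDelayMs(phaseRadians, frequencyHz) {
  return (phaseRadians / (2 * Math.PI * frequencyHz)) * 1000;
}

// Phase applied to each harmonic: the first gets the root phase shift,
// the others get a fraction of it (0 = none, 1 = same phase angle).
function harmonicPhases(rootPhase, higherShift, count) {
  return Array.from({ length: count }, (_, i) =>
    i === 0 ? rootPhase : rootPhase * higherShift
  );
}
```

A shift of π on a 50 Hz fundamental is half a 20 ms cycle, so 10 ms.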
    Samples and Inharmonic Tones

    Samples and Inharmonics

    ?

    Selection of sampled waveforms which can be used instead of or mixed in with the additive waveform.

    The waveform is mixed in after the additive synthesis stage, so all stages below this, including distortion and digital processing, affect the sample too.

    Waveforms were created in Cubase using included samples together with various plugin instruments and effects.

    There is also a range of tones which can be added to investigate intermodulation and other effects.

    0% ?

    Mix of generated Additive Waveform and the specified sample.

    0db ?

    Adjust the level of the sample.

    The final output is normalised according to the normalisation settings to prevent unintended clipping.

    Inharmonics

    ?

    Controls for adding audio-range sine-wave tones to investigate intermodulation distortion.

    The frequency can be unrelated to the main tone or a ratio defined by the different scales.

    The envelope is the same as used for the additive synthesis, even if a sampled waveform is used.

    -1 ?
    Level and frequency of a tone, A, mixed in to the main sound. Frequency set directly.
    -1 ?
    Level and pitch of a tone, B, mixed in to the main sound. The pitch is set in semitones according to Equal temperament scale.
    off ?
    Level and pitch of a tone, B, mixed in to the main sound. The pitch is set in semitones according to Ptolemy's intense diatonic scale.
    off ?

    Amount of noise to add.

    Colour varies from white (0 dB/octave) to pink (-3 dB/octave).

    The level at 1 kHz is about the same for each.

    Uses the main pitch envelope.
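For comparison, the two tuning systems mentioned for the added tones can be sketched as frequency ratios. The helper names are hypothetical. The ratios are the standard ones for Ptolemy's intense diatonic scale, but indexing that scale by degree (0 to 7) is an assumption; the page's own mapping from the semitone control is not shown here.

```javascript
// Just-intonation ratios of Ptolemy's intense diatonic scale, degrees 0..7.
const PTOLEMY_INTENSE_DIATONIC = [1, 9 / 8, 5 / 4, 4 / 3, 3 / 2, 5 / 3, 15 / 8, 2];

// Equal temperament: every semitone is a factor of 2^(1/12).
function equalTemperamentHz(baseHz, semitones) {
  return baseHz * Math.pow(2, semitones / 12);
}

// Just intonation: look the ratio up by scale degree (assumed indexing).
function justHz(baseHz, degree) {
  return baseHz * PTOLEMY_INTENSE_DIATONIC[degree];
}
```

Above a 100 Hz tone, the equal-tempered fifth (7 semitones) is about 149.83 Hz, while the just fifth (degree 4, ratio 3/2) is exactly 150 Hz.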

    Distortion and oversampling

    Distortion

    ?

    A range of non-linear processes which can be applied to the sound, plus controls for the quality of oversampling.

    The preview is of a single cycle of the distorted waveform and the FFT of this cycle.

    You can also add inharmonic tones above to investigate intermodulation distortion etc.

    0% ?

    Turns distortion on and off and controls the overall amount

    Total Harmonic Distortion is measured (using 10 harmonics) for the distortion applied (not including jitter or inharmonics).
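One way to measure THD from a single cycle is sketched below. The function names are hypothetical, and counting "10 harmonics" as harmonics 2 through 10 is an assumption; the page's own measurement may count differently.

```javascript
// Amplitude of harmonic k in a buffer that holds exactly one cycle
// (a single DFT bin, scaled so a unit sine returns 1).
function harmonicAmplitude(cycle, k) {
  let re = 0;
  let im = 0;
  const n = cycle.length;
  for (let i = 0; i < n; i++) {
    const angle = (2 * Math.PI * k * i) / n;
    re += cycle[i] * Math.cos(angle);
    im -= cycle[i] * Math.sin(angle);
  }
  return (2 * Math.hypot(re, im)) / n;
}

// THD as the root-sum-square of harmonics 2..harmonics over the fundamental.
function thd(cycle, harmonics = 10) {
  const fundamental = harmonicAmplitude(cycle, 1);
  let sumSquares = 0;
  for (let k = 2; k <= harmonics; k++) {
    sumSquares += harmonicAmplitude(cycle, k) ** 2;
  }
  return Math.sqrt(sumSquares) / fundamental;
}
```

A sine with a third harmonic at one tenth of its level gives a THD of 0.1, that is 10%.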

    Saturation

    ?

    Common saturation functions.

    Actual level of each controlled by main Distortion level.

    0% ?
    Asymmetric hyperbolic function for even harmonics
    0% ?
    Third order Chebyshev polynomial distortion, generating third harmonic for a sine wave
    0% ?
    Tanh distortion, commonly used in Saturation modelling
    0% ?
    Plain old full hard clipping
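Three of the saturation functions named above have well-known forms, sketched here. The helper names and the `mix` blend are hypothetical; how the page combines each function with the main Distortion level is not specified, and the asymmetric hyperbolic function is left out because its formula is not given.

```javascript
// Third-order Chebyshev polynomial: turns cos(t) into cos(3t),
// so a full-scale sine comes out as pure third harmonic.
const chebyshev3 = (x) => 4 * x * x * x - 3 * x;

// Tanh soft saturation; `drive` scales the input first.
const softClip = (x, drive = 1) => Math.tanh(drive * x);

// Hard clipping at +/- limit.
const hardClip = (x, limit = 1) => Math.max(-limit, Math.min(limit, x));

// Blend between the dry and the shaped signal (assumed mixing rule).
const mix = (dry, shaped, amount) => (1 - amount) * dry + amount * shaped;
```

For example, `mix(x, softClip(x, 2), 0.5)` would blend half of the tanh-shaped signal into the dry one.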

    Speaker - Experimental

    ?

    An experimental and very basic model to provide frequency dependent distortion.

    Uses a Duffing Oscillator model for non-linear speaker forces.

    0% ?
    Controls the mix of speaker model and bypassed signal.
    0% ?
    Speaker mass, second derivative coefficient - acceleration of cone
    0% ?
    Damping - first derivative coefficient - velocity of cone
    0% ?
    Linear coefficient - restorative force of cone - (Hooke's law)
    0% ?
    Non-linear coefficient - cube of displacement - curving of restorative force
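The four coefficients above correspond to the terms of a forced Duffing equation, mass·x'' + damping·x' + linear·x + cubic·x³ = force. A minimal sketch of stepping such a model, assuming a semi-implicit Euler integrator and hypothetical names and units (the page's controls are percentages and its integrator is not stated):

```javascript
// One semi-implicit Euler step of mass*x'' + damping*x' + linear*x + cubic*x^3 = force.
function duffingStep(state, force, dt, { mass = 1, damping = 0.1, linear = 1, cubic = 0 } = {}) {
  const accel =
    (force - damping * state.v - linear * state.x - cubic * state.x ** 3) / mass;
  const v = state.v + accel * dt; // update velocity first...
  const x = state.x + v * dt; // ...then position, using the new velocity
  return { x, v };
}

// Drive the model with an input signal and return the cone displacement.
function runDuffing(input, sampleRate, params = {}) {
  let state = { x: 0, v: 0 };
  const dt = 1 / sampleRate;
  return Array.from(input, (force) => {
    state = duffingStep(state, force, dt, params);
    return state.x;
  });
}
```

With `cubic = 0` the model is a plain damped spring. A positive cubic term stiffens the spring at large displacement, which is what makes the distortion depend on level and frequency.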

    Oversampling

    ?

    Controls for oversampling. Only applied when the Distortion Level is on and only affects the distortion processing.

    Oversampling is only carried out once, not individually for each non-linearity as it is applied so aliasing is still possible.

    x4 ?
    How many times the sample rate is increased for the non-linear processes to be applied
    90db ?
    The amount of cut in the stop-band.
    0.1 x fc ?
    The width of the transition band between pass and stop
    -1 ?
    Level and frequency of an ultrasonic tone added after upsampling but before distortion, to test the audibility of intermodulation artifacts. The tone lies between the sample-rate Nyquist and the oversampled Nyquist, so its range depends on both the sample rate and the oversampling factor.

    No oversampling

    Digital Converters

    Digital conversion

    ?

    Simulation of Jitter and bit depth reduction with dithering.

    See individual sections for more details.

    Graphs are as follows:

    • Jitter: shows how the jitter settings affect the output values of a small (+/-0.1%) linear change. Samples and displays 1000 of these transitions.
    • Dither Linearity: Shows how, on average, intermediate values are transformed to the reduced bit depth. Ideally this should be linear. Again, graph is based on sampling of the system, not theoretical calculations.
    • Dynamic Range: Sampled frequency distribution of averaged noise floor without dithering (red) and with the current dithering settings (green). The levels shown are indications only, the display is intended to show how the controls affect relative levels of noise floor.

    Jitter

    ?

    Simulation of jitter in ADC and/or DAC. Uses 3x oversampling and Lagrange interpolation to get the sample points.

    Difference between ADC and DAC is how the Lagrange interpolation is done:

    ADC keeps the time values fixed and 'jitters' where the sample is taken.

    DAC 'jitters' the time of the samples, but resamples at fixed time points.

    Is there a difference?

    The jitter is normally distributed white noise.

    0% ?
    Simulation of jitter on the ADC, where waveform is undistorted but point where sample is taken moves randomly (normal distribution) around the correct position.
    0% ?
    Simulation of jitter on the DAC, where waveform is distorted as each sample point is shifted randomly (normal distribution) around the correct position.
    0% ?
    Applies a fixed-frequency jitter (37 Hz, a prime number) to the ADC calculations.
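A sketch of the ADC-style jitter idea: the waveform is left alone, but each sample is read at a slightly wrong time using Lagrange interpolation. This uses a 4-point interpolator and skips the 3x oversampling step mentioned above; the function names and the `amount` unit (standard deviation in samples) are assumptions.

```javascript
// Cubic (4-point) Lagrange interpolation at fractional index `pos`.
function lagrange4(samples, pos) {
  const i = Math.floor(pos);
  const t = pos - i; // fractional part, between samples[i] and samples[i + 1]
  const at = (k) => samples[Math.min(samples.length - 1, Math.max(0, k))];
  const ym1 = at(i - 1);
  const y0 = at(i);
  const y1 = at(i + 1);
  const y2 = at(i + 2);
  return (
    (-t * (t - 1) * (t - 2) * ym1) / 6 +
    ((t + 1) * (t - 1) * (t - 2) * y0) / 2 -
    ((t + 1) * t * (t - 2) * y1) / 2 +
    ((t + 1) * t * (t - 1) * y2) / 6
  );
}

// Normally distributed random number (Box-Muller transform).
function gaussian(random = Math.random) {
  const u = 1 - random(); // in (0, 1], avoids log(0)
  const v = random();
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * v);
}

// ADC-style jitter: read each sample at a randomly displaced time.
function adcJitter(samples, amount, random = Math.random) {
  return Array.from(samples, (_, i) =>
    lagrange4(samples, i + amount * gaussian(random))
  );
}
```

With `amount = 0` the output equals the input, which makes a convenient sanity check before adding jitter.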

    Jitter is off

    Quantisation

    ?

    Dither simulation using some common dither types.

    Also includes a bit depth reduction simulation to make quantisation distortion more noticeable.

    Applied after normalisation of the waveforms.

    Clipping and asymmetry of +ve and -ve max values for integers is NOT modelled (code for this is commented out) as the intention is to model the quantisation process and explore dither algorithms, not explore quirks of binary numbers.

    off ?

    Simulates the reduction of bit depth by rounding to the values allowed with the given number of bits.

    Ignores that negative values have one less possible value to allow for the zero value as that fact is kind of irrelevant to the theory behind quantisation and dithering.

    Similarly, doesn't model DAC clipping, either. Code for both was implemented but commented out as it was distracting.

    off ?

    Level of dither added, in LSB (least significant bit) units.

    RMS level (also in lsb units not db) is for comparison.

    As Gaussian is technically unbounded, only rms is shown, but is adjusted to match triangular.

    Triangular ?
    Type of dither noise distribution across the range set by level, Rectangle, Triangular or Gaussian (normal).
    0% ?

    Amount of the added dither subtracted after the bit reduction (Subtractive Dither).

    This is the theoretically best way to add dither but is impractical because a copy of the dither needs to be kept with the output.

    0% ?

    Amount of error feedback.

    Results in box-car noise shaping with a gentle high-pass filter.

    Least sophisticated type of noise shaping but may be interesting.

    0 ?
    Applies a percentage of the noise AFTER the bit depth reduction, so at 100% is just added noise NOT dither. Uses Equal-Power-Law cross-fading and shaping to sound as close as possible to the real noise.
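The quantisation controls above can be sketched together: rounding to a reduced bit depth, triangular dither, optional subtraction of the dither afterwards, and first-order error feedback. All names are hypothetical. As in the text, clipping and integer asymmetry are not modelled, and the exact scaling of the page's level control is an assumption.

```javascript
// Round to the nearest step of a `bits`-deep grid spanning -1..+1 (no clipping).
function quantise(x, bits) {
  const step = 2 / Math.pow(2, bits);
  return Math.round(x / step) * step;
}

// Triangular (TPDF) noise in -1..+1: the difference of two uniform values.
function tpdf(random = Math.random) {
  return random() - random();
}

// Add dither before rounding; optionally subtract a fraction of it afterwards.
function ditheredQuantise(x, bits, { level = 1, subtract = 0, random = Math.random } = {}) {
  const step = 2 / Math.pow(2, bits);
  const dither = level * step * tpdf(random);
  return quantise(x + dither, bits) - subtract * dither;
}

// First-order error feedback: the previous rounding error is subtracted
// from the next input, which pushes the noise towards high frequencies.
function quantiseWithFeedback(samples, bits, feedback = 1) {
  let error = 0;
  return Array.from(samples, (x) => {
    const wanted = x - feedback * error;
    const out = quantise(wanted, bits);
    error = out - wanted;
    return out;
  });
}
```

A constant input of 0.3 LSB rounds to zero on its own, but with full error feedback the output toggles between steps so that its average stays close to the input.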

    Output Attenuation

    ?

    Applied at very end of chain, after normalisation and quantisation processing.

    -1 ?

    Final attenuation of waveforms. Slider scale is adjusted (squared) to allow for fine adjustments of small differences, but also allow big differences if needed. Phase flip is just normal or flip.




    Naughty Filter


    ?

    Two implementations of a narrow bandpass designed to reveal the sound of pre-ringing from subtle to overt.

    One uses a standard (RBJ) Infinite Impulse Response (IIR) peaking filter, which is minimum phase.

    The other uses combined windowed sinc functions to create a similar linear phase Finite Impulse Response (FIR) filter.

    It's a bit crude, particularly the FIR implementation, but sufficient to clearly distinguish pre-ringing.

    ?

    Cut off frequency of the filter.

    ?

    Gain of the eq.

    ?

    Width of affected band.

    ?

    Mix of Minimum phase (IIR) to Linear Phase (FIR) in output. 0% = all IIR, 100% = all FIR.
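For the IIR half, the RBJ "Audio EQ Cookbook" peaking filter has a standard coefficient recipe, sketched below with a helper to check its magnitude response. The function names are hypothetical, the use of Q for the width control is an assumption, and the FIR (windowed sinc) half is not shown.

```javascript
// RBJ cookbook peaking EQ coefficients, normalised so that a0 = 1.
function peakingCoefficients(sampleRate, frequency, gainDb, q) {
  const A = Math.pow(10, gainDb / 40);
  const w0 = (2 * Math.PI * frequency) / sampleRate;
  const alpha = Math.sin(w0) / (2 * q);
  const a0 = 1 + alpha / A;
  return {
    b0: (1 + alpha * A) / a0,
    b1: (-2 * Math.cos(w0)) / a0,
    b2: (1 - alpha * A) / a0,
    a1: (-2 * Math.cos(w0)) / a0,
    a2: (1 - alpha / A) / a0,
  };
}

// Magnitude response of the biquad in dB at a given frequency.
function magnitudeDb(c, sampleRate, frequency) {
  const w = (2 * Math.PI * frequency) / sampleRate;
  const numRe = c.b0 + c.b1 * Math.cos(w) + c.b2 * Math.cos(2 * w);
  const numIm = -(c.b1 * Math.sin(w) + c.b2 * Math.sin(2 * w));
  const denRe = 1 + c.a1 * Math.cos(w) + c.a2 * Math.cos(2 * w);
  const denIm = -(c.a1 * Math.sin(w) + c.a2 * Math.sin(2 * w));
  return 20 * Math.log10(Math.hypot(numRe, numIm) / Math.hypot(denRe, denIm));
}
```

The response reaches the requested gain at the centre frequency and returns to 0 dB far away from it; a higher Q narrows the affected band and lengthens the ringing.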

    Playback FFT

    ?
    FFT as provided by the Web Audio API, including built in smoothing and windowing functions.

    Static Detailed FFT

    ?

    Static FFT, updated automatically when waveforms are updated or when update is tapped.

    Tap the pin icon to keep this section in view.

    It doesn't show sampled waveforms. This is because it is specifically setup to show sharp details for analysis of distortion and other effects.

    To achieve this, it uses the same sound settings as above but tuned to give a whole number of cycles within the 64k sample frame, hence the spectrum is very sharp for harmonics.

    Note: Inharmonic tones are added using a windowing function to keep the spectrum as clean as possible.
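Tuning a frequency so that a whole number of cycles fits the analysis frame can be sketched as below. The function name is hypothetical, and both the 65536-sample reading of "64k" and the 48 kHz sample rate in the example are assumptions.

```javascript
// Nudge a frequency so an exact whole number of cycles fits the FFT frame.
// Every harmonic then lands exactly on a bin, so no windowing is needed.
function tuneToFrame(frequency, sampleRate, frameSize = 65536) {
  const cycles = Math.max(1, Math.round((frequency * frameSize) / sampleRate));
  return { cycles, frequency: (cycles * sampleRate) / frameSize };
}
```

At 48 kHz, a requested 50 Hz becomes 68 cycles per frame, which is 49.8046875 Hz.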

    Null test

    ?
    Sound B is inverted and added to Sound A. The level is then measured and the result is normalised (assuming it is not completely null). This identifies the differences between the sounds.

    See explanation of Null Test

    Notes: Background


    This page was created to help investigate audibility of low frequency phase shifts.

    It generates two sounds with identical harmonic content and envelopes as set by the common controls.

    By default, each sound has the phase of the first harmonic shifted by the specified amount relative to the other harmonics. Higher harmonics can also be phase shifted by a specified fraction of the root phase shift.

    The checkboxes next to each slider allow those parameters to be selected for testing. They then appear in the Sound A and Sound B boxes, allowing them to be set independently for each sound.

    An envelope with linear attack and exponential decay is applied to each harmonic. The envelope is band limited by an adjustable low-pass filter.

    By default, the envelopes of the first (and other if specified) harmonics are delayed by the same amount as the phase delay for that harmonic but this delay can be removed.

    The waves are normalised together so their relative sizes are unchanged. This keeps the tests at optimum loudness, avoiding clipping or sounds that are too quiet.

    Negative phase shifts are there to allow trying to compensate for existing phase shifts in playback systems.

    The code is plain JavaScript, using the browser's own Web Audio API for playback, to allow for easy modification and review of the methodology.

    The waves are generated using additive synthesis to allow for easy and predictable setting of phases and levels.

    The filter is a simulated Butterworth response but with no phase shift. It is intended just for altering harmonic content to try out different sounds. The filter envelope is linear but moves the frequency as if it were an exponential control.

    Notes: Null test


    This null test is a sanity check for the generated waveforms.

    Playback and the image of the null test result use the normalised version. However, the displayed peak level is measured before normalising (unless individual normalising is selected, in which case both sounds are brought to the level of the loudest for the null test).

    The null test should only contain the first harmonic (and second if it is shifted, too) plus a little distortion for short attacks and/or low smoothing due to offset attack starts.

    Changes to harmonics controls should not affect the sound of the null test result (with no second harmonic shift).

    With a significant hold time and no second harmonic shift, an offset of π should and does give 6 dB, and an offset of π/2 should and does give 3 dB. With 2π, the hold section should and does null, but the offset envelopes will cause a difference to appear during attack and decay.
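The expected null-test levels follow from sin(θ) - sin(θ + φ) having a peak of 2·|sin(φ/2)|. The sketch below checks this numerically for a steady sine with no envelope; the function name is hypothetical and the level is relative to a single full-scale sine.

```javascript
// Peak level in dB of (A - B), where A is a unit sine and
// B is the same sine shifted by phaseOffset radians.
function nullTestPeakDb(phaseOffset, length = 4096, cycles = 8) {
  let peak = 0;
  for (let i = 0; i < length; i++) {
    const angle = (2 * Math.PI * cycles * i) / length;
    const a = Math.sin(angle);
    const b = Math.sin(angle + phaseOffset);
    peak = Math.max(peak, Math.abs(a - b)); // B inverted and added to A
  }
  return 20 * Math.log10(peak);
}
```

An offset of π doubles the amplitude (about 6.02 dB), π/2 gives √2 (about 3.01 dB), and 2π cancels down to rounding error.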