What is MQA?

Introduction

MQA is a hierarchical method and set of specifications for recording, archiving, archive recovery and efficient distribution of high-quality audio. Devised by long-term collaborators Bob Stuart and Peter Craven, it has been developed by MQA Ltd.

One axiom is that, in audio, High Resolution can be more accurately defined in the analogue domain in terms of temporal fine structure and lack of modulation noise than by a description in the digital domain, particularly if that description relies on sample rate or bit depth numbers.

Another observation is that the recent pursuit of higher resolution in digital audio, by not going back to first principles, has followed an unstructured and somewhat unscientific approach: a ‘dilution’ rather than a resolution of problems, leading to excessive increases in data rate and a resulting loss of convenience for the end user.

A postulate in MQA is that by combining the statistics of musical signals with modern methods in sampling theory and insights from human neuroscience, we can more effectively convert the analogue music to digital and back to analogue.

A key implementation point is how to bring these insights to bear on current equipment: distribution files must be enjoyable on existing equipment, without compromising the potential to overcome key problems in processing or in the gateways (A/D and D/A), or to innovate in the future.

Because it has a different conceptual frame of reference, MQA is a philosophy more than it is ‘just a codec’.

Background

It is now widely, although not universally, accepted that ‘hi-res’ digital audio, with increased sampling rate or bit-depth, delivers improved sound quality. But it does so at large cost to coding efficiency. A 24-bit/88.2 kHz recording requires three times the data rate of a 16-bit/44.1 kHz alternative, and that ratio increases by further factors of two as sampling rate is doubled again to 176.4 kHz and then to 352.8 kHz, the sampling rate of DXD. While the progressive improvement in sound quality is welcome, it takes a disproportionate toll on data rates and storage capacity. Simply increasing sampling rate also fails to address head-on why it is that 44.1 kHz and 48 kHz sampling rates impose subjective limitations. Instead, sampling rate has become a proxy for resolution.
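The data-rate ratios quoted above can be checked with a few lines of arithmetic. The sketch below assumes uncompressed linear PCM, two channels, and a 24-bit word length at every rate above 44.1 kHz; the function and variable names are illustrative only and do not come from any MQA specification.

```python
def pcm_bit_rate(sample_rate_hz, bit_depth, channels=2):
    """Raw bit rate of uncompressed linear PCM, in bits per second."""
    return sample_rate_hz * bit_depth * channels


# Assumed formats: (label, sample rate in Hz, bits per sample).
FORMATS = [
    ("16-bit/44.1 kHz", 44_100, 16),
    ("24-bit/88.2 kHz", 88_200, 24),
    ("24-bit/176.4 kHz", 176_400, 24),
    ("24-bit/352.8 kHz (DXD rate)", 352_800, 24),
]

# The 16-bit/44.1 kHz format is the reference for every ratio.
baseline = pcm_bit_rate(44_100, 16)

# Ratio of each format's raw data rate to the baseline.
ratios = {
    label: pcm_bit_rate(rate, bits) / baseline
    for label, rate, bits in FORMATS
}

for label, ratio in ratios.items():
    print(f"{label}: {ratio:g}x the 16-bit/44.1 kHz data rate")
```

Under these assumptions the ratios come out as 3, 6 and 12, so each doubling of sampling rate doubles the data rate again. The channel count cancels out of the ratio, so the result is the same for mono or multichannel material.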

Recent hearing research provides support for the long-standing notion that the time-domain performance of anti-alias and reconstruction filters – most especially steep digital linear-phase filters – is responsible for perceptible degradation of sound quality. Recently, direct evidence for the audibility of low-pass filters used in digital audio has been published.

It has been known since at least 1946 that the Fourier time-frequency uncertainty inherent in conventional signal analysis can be ‘beaten’ by human listeners, and by a significant margin. Indeed, recent experimental studies have shown temporal discrimination at least five times better than that uncertainty limit would allow.

These findings accord with the idea that the capabilities of human hearing have been determined by evolutionary requirements, in particular the need to identify sounds as ‘potentially threatening’ or ‘non-threatening’ in the shortest possible time interval, thereby providing the maximum opportunity for fight or flight. While vision plays a part in this too, of course, we cannot see through 360 degrees, around corners, or at light levels as low as some predators can.

In these circumstances in particular, our hearing is the primary sense by which we detect danger, and speed of detection and rapid estimation of direction and range are of the essence. So too is the ability to separate direct sound from short-delay or closely-spaced reflections – which naturally requires the resolution of short time intervals that are independent of the frequency or bandwidth of the source.

Our understanding of natural soundscapes, reverberation, animal vocalisations and speech requires adjustable time/frequency balances which, up until now, have not been adequately accounted for in audio system design.

This all suggests that the time-domain acuity of the human auditory system may have been more important than frequency-domain acuity and explains why its time-frequency uncertainty is so much superior to that of an FFT analyser (and its close relative, the sinc-kernel of digital sampling). Causal signals are key to our achieving this feat; if natural signal waveforms are time-reversed we can no longer outperform the time-frequency uncertainty of Fourier analysis.

Temporal acuity manifests a survival characteristic, one with origins that must reach back to much earlier in the mammalian timeline than the emergence of Homo sapiens.

It would be strange indeed if our remarkable time-domain acuity were irrelevant to the perception of music. In fact there is persuasive evidence that this is not the case: those experimental subjects who have proven most adept at resolving time-frequency uncertainty are musicians, suggesting that time-domain acuity is enhanced – trained – by the process of becoming a musician. So the traditional frequency-domain view of audio system performance is fundamentally at odds with our perception of music. A fresh approach to the specification and design of high fidelity audio encoding and equipment which takes much closer account of system time-domain performance is therefore long overdue.
