Measurements. They can be used in one-upmanship fashion. Magazines and reviews which do them are serious, scientific and objective. Those which don’t, aren’t. Doug Schneider of the SoundStage Network for example says, “I don’t take magazines too seriously that don’t do them.” Can measurements predict performance? Doug who prides himself on taking the best loudspeaker measurements in the industry: “You must have sufficient measurements to predict… in my opinion, no magazine produces nearly enough measurements to predict anything with great accuracy, including us.” If that’s the case, what are measurements good for? Bragging rights? Doug believes that at the very least, measurements reveal whether a product has been competently designed.
Another US publication which conducts and publishes measurements is Stereophile. It’s not at all uncommon to find disparity between their measurements and the opinion of the writer who actually heard the product [Ed – Stereophile’s Croft Acoustics Integrated amplifier review is one recent example]. If bad measurements quite routinely don’t correlate or mean what they seem to, does it imply that good measurements prove or substantiate a good review opinion? Couldn’t one just as easily claim that if (within reason) bad measurements aren’t all that meaningful, neither are good ones (if they measure things which aren’t as important as we think)?
For example, the typical nearfield speaker measurement is done at 1m/1w. That’s to simulate anechoic conditions and to not capture room effects which, obviously, would differ from room to room. The irony is that without room effects, measurements won’t tell us what this speaker will sound like in our room. In fact, the most useless and abstract thing about expensive anechoic speaker measurements from the consumer perspective is that they show premature roll-off in the bass. This isn’t at all representative of what that speaker will actually do in a reflective real-world environment. The anechoic measurement might show a steep roll-off at 100Hz. In room, that speaker might do solid bass to 40Hz. That’s a huge difference. Without proper interpretation, which depends on knowing how to read an anechoic result, the measurement itself remains grossly deceptive.

Amplifier measurements might show total harmonic distortion and intermodulation distortion figures. Here it’s easy to believe that lower must be better. Yet that overlooks how the ear/brain responds to and filters harmonic distortion; how certain kinds of distortion are more benign than others; how amplifier and speaker distortion interact, sum or partially cancel. Our typical hifi measurements don’t show how harmonic distortion shifts with amplitude, i.e. how its distribution changes with SPL. They often don’t show sufficient increments to track how an amp behaves at 1w vs. 10w vs. 100w. It might be a peach if used up to 5 watts but turn into a distortion generator at 20. Distortion measurements at 1kHz tell us little to nothing about the amp’s broadband behaviour.
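To make the point about power increments concrete, here is a minimal Python sketch of how a THD figure is commonly derived from harmonic levels, and how it can look very different at different output powers. The readings are invented purely for illustration and do not come from any real amplifier; the function names, the 8Ω load and the restriction to two harmonics are likewise assumptions.

```python
import math


def thd_percent(fundamental_rms, harmonic_rms):
    """THD as the root-sum-square of the harmonics relative to the fundamental, in percent."""
    if fundamental_rms <= 0:
        raise ValueError("fundamental level must be positive")
    return 100.0 * math.sqrt(sum(h * h for h in harmonic_rms)) / fundamental_rms


def output_power_watts(v_rms, load_ohms):
    """Average power into a purely resistive load: P = V^2 / R."""
    return v_rms ** 2 / load_ohms


# Hypothetical readings, invented for illustration only:
# (fundamental in volts RMS, [2nd harmonic, 3rd harmonic] in volts RMS)
HYPOTHETICAL_READINGS = [
    (2.83, [0.0028, 0.0006]),   # roughly 1 W into 8 ohms, mostly 2nd harmonic
    (6.32, [0.0032, 0.0019]),   # roughly 5 W into 8 ohms
    (12.65, [0.0253, 0.1265]),  # roughly 20 W into 8 ohms, 3rd harmonic dominates
]


def thd_by_power(readings, load_ohms=8.0):
    """Return (power in watts, THD in percent) for each reading."""
    return [(output_power_watts(v, load_ohms), thd_percent(v, h)) for v, h in readings]


if __name__ == "__main__":
    for power, thd in thd_by_power(HYPOTHETICAL_READINGS):
        print(f"{power:6.1f} W  THD {thd:.3f} %")
```

A single THD number also hides the spectrum behind it: in these made-up figures the balance tips from second to third harmonic as power rises, which one published figure at one power level would never reveal.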
How about the effects of voice-coil heating on speaker distortion? Even if one did show high-power measurements, would they include changes over time to illustrate what happens when a given loudspeaker is played loudly over 15 minutes? How does one accurately represent with nearfield measurements the in-room behaviour of an omnipolar speaker like a Duevel, mbl or German Physiks? How about side-firing mid/woofers crossed at 1kHz? To begin with, are static test-tone measurements even remotely representative of a complex music signal? These are just some of the many valid questions one may ask on the subject. For our purposes, the most important one is whether the typical review measurements are all that useful to a potential buyer.
What meaningful things can they tell us? They can, for example, confirm or dispute an amplifier’s power specification. Does it really make 400 watts into 4Ω below 1% THD as claimed? What happens when that amp gets thermally stressed to the max? Will it blow up? Will it trigger protection or blow a fuse? Will a tube amp’s output transformer make 30Hz and 20kHz without roll-off? In speakers, non-linear frequency response like suck-outs and peaks particularly around the crossover points can show nearfield issues which may or may not be fully or partially obscured in the actual listening seat. And so on. There are many things which measurements can confirm or refute.
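As a rough illustration of what confirming a power specification involves, the sketch below works through the arithmetic for the 400 watts into 4Ω example. It assumes a purely resistive load and a sine-wave test signal; the function names and the sample readings in the usage lines are hypothetical.

```python
import math


def required_rms_voltage(power_watts, load_ohms):
    """RMS voltage needed across a resistive load for a given power: V = sqrt(P * R)."""
    return math.sqrt(power_watts * load_ohms)


def meets_power_claim(measured_v_rms, measured_thd_percent,
                      claimed_watts, load_ohms, thd_limit_percent=1.0):
    """True if the measured output reaches the claimed power while THD stays below the limit."""
    delivered_watts = measured_v_rms ** 2 / load_ohms
    return delivered_watts >= claimed_watts and measured_thd_percent < thd_limit_percent


if __name__ == "__main__":
    v = required_rms_voltage(400, 4)             # volts RMS needed for 400 W into 4 ohms
    print(v, v / 4)                              # voltage and the matching RMS current
    print(meets_power_claim(40.0, 0.8, 400, 4))  # hypothetical reading that supports the claim
    print(meets_power_claim(38.0, 0.8, 400, 4))  # hypothetical reading that falls short
```

Under these assumptions the amp must swing 40 volts RMS and deliver 10 amps RMS into the load, which is why such a claim is straightforward to confirm or dispute on a test bench.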
On the manufacturing end, it’s clear that measurements are mandatory. There is parts matching to ensure they’re all within 0.25% or 5% tolerance (whatever that maker’s standard may be); or response matching between a left and right speaker plus matching it to the reference lab model. Quality control without measurements is unthinkable. So is digital design which can’t even be listened to until the signal is in the analogue domain. Sources of noise within a circuit must be tracked down with measurements. Attempts at circuit distortion reduction and PCB layout are tracked with measurements. Speaker enclosure behaviour is quantified by measurements. So is the efficacy of properly engineered isolation products. And so on and so forth ad infinitum.
In fact, it’s only with very advanced simulation software like Comsol that complex behaviour across a variety of interlinked disciplines like mechanical, electrical and acoustical can be modelled to cut down development time, money and endless prototyping. The most engineering-driven, most experienced companies may nail 90% of a product’s design entirely in the virtual domain. Companies like Goldmund write their own simulation software. YG Acoustics have written proprietary software to design their crossovers. Etc.
At what point listening tests kick in differs from maker to maker. At some point, they invariably do. Now listener feedback begins to interact with measurement verification to finalize the remaining 5% or 15%. This can include parts selection where different capacitors or transformers may measure identically but sound very different. Back on the consumer side, Doug is right to say that whilst measurements may predict a lot less about final performance than we’d like them to and do so with far less accuracy than ideal, they can at minimum reveal whether a product was competently or shoddily designed; and additionally provide useful information to those who know how to read them. This sadly excludes the majority of consumers. Making measurements meaningful relies not only on knowing how to interpret them, it requires that one tie together a whole suite of them in an interdisciplinary multidimensional fashion to arrive at a useful bigger picture that accounts for how they interact.
That there remains a big gap between subjective listening impressions and a fuller correlation via more complete measurements is plain. To begin with, closing it requires a far more complete understanding of how the human brain processes sound than we have at present. The hifi industry is far too small to finance the necessary cognitive brain research. Industry at large has little to no interest in furthering that understanding unless it could be monetized far more significantly. On the product side, we often don’t yet know what (else) to measure. The most current examples are USB and Ethernet cables and peripherals for audio use. Many users hear differences. Hardcore IT experts are adamant that if basic specs are in place, nothing else should or could make any difference.
Not that long ago jitters were what happened to students before a test; or to folks about to get married. Today jitter is an acknowledged phenomenon. Yet devices like the Uptone Regen, Audioquest Jitterbug and Schiit Wyrd still represent engineered solutions which address issues whose very existence is just beginning to be acknowledged by the very few. But fast-forward 5 years and “Ethernet flue” will be in everyone’s vocabulary as something even the most affordable DAC addresses in some form.

For now, readers who believe in the usefulness of measurements already have certain sources. With them they enjoy monthly opportunities to correlate reviewer commentary with measurements to learn which ones are most likely to predict anything of usefulness; and which ones merely look good, scientific and serious to add perceived gravitas to said review and its publication. Meanwhile we have popular electronics from manufacturers who state that they have never yet measured anything which correlated with what they hear; we have makers who claim to finalize their products purely on the strength of a suite of ‘perfect’ measurements; we have amps which measure as perfectly as current technology allows yet have listeners claim they sound bad; and we continue to have amps which measure poorly but enjoy plenty of sales and contented listeners.
If we add up all the evidence and what’s between the lines, what does it really say?