* The article covers digital audio from its very foundations, which is unnecessary. Just put a link to Wikipedia.
* The citation format is weird. Instead of using inline hyperlinks, you have to look up references at the end.
* The test only covers MP3. This is 2014... and we know MP3 has a low-pass filter.
* The methodology doesn't mention converting from MP3 back to a lossless format for the ABX test, and it doesn't mention measuring MP3 encoder gain. ABX tests can be colored when people recognize differences between lossy and lossless playback that have nothing to do with the encoding itself: if lossless playback uses less IO and more CPU, then you might hear less clicking from your disk drive and your computer's CPU fan might spin faster. Most ABX tests convert both files back to WAV before testing.
* The spectrum of the residual is not very interesting, because the human ear is not sensitive to phase (or far less sensitive to phase than to amplitude). It would be more interesting to compute the differences between the two magnitude spectra (see the sketch just after this list).
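For the curious, here is a minimal sketch of what I mean by comparing the spectra directly, assuming two time-aligned WAV decodes; the filenames and the mono mixdown are just placeholders, not anything taken from the article:

```python
import numpy as np
import soundfile as sf

# Hypothetical filenames; both files are assumed to be time-aligned WAV decodes.
orig, rate = sf.read("original.wav")
lossy, _ = sf.read("decoded_mp3.wav")

# Mix down to mono and trim to a common length, purely to keep the example short.
if orig.ndim > 1:
    orig = orig.mean(axis=1)
if lossy.ndim > 1:
    lossy = lossy.mean(axis=1)
n = min(len(orig), len(lossy))

window = np.hanning(n)
orig_mag = np.abs(np.fft.rfft(orig[:n] * window))
lossy_mag = np.abs(np.fft.rfft(lossy[:n] * window))

# Difference of the magnitude spectra in dB. Unlike the spectrum of the
# time-domain residual, this ignores phase, which the ear is largely
# insensitive to.
eps = 1e-12
diff_db = 20 * np.log10(orig_mag + eps) - 20 * np.log10(lossy_mag + eps)
freqs = np.fft.rfftfreq(n, d=1.0 / rate)
print(f"largest magnitude difference near {freqs[np.abs(diff_db).argmax()]:.0f} Hz")
```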
If you want a more sound and thorough analysis of audio compression, try the Hydrogen Audio tests:
http://wiki.hydrogenaud.io/index.php?title=Hydrogenaudio_Lis...
Quick summary: At reasonable bit rates, many encoders are sonically transparent. At lower bit rates (64/96 kbit/s), Opus is the most transparent, and Apple's encoder is in second place.
Hey, thanks for the feedback! Some of your criticisms point out things that exist mainly because the piece was originally written for a class assignment. For example, digital audio was not covered in the course, so I added some background to support the detection-theory material, which was covered in the class (and also to make the article more self-contained for readers unfamiliar with that material). Naturally there is nothing groundbreaking in that summary, but I thought I did a decent job of helping the reader with the legwork of understanding the principles. Additionally, the references are in APA format, which was required for the class but, as you point out, is not ideal for the web. I will likely tweak the web version when I have a chance so it's a little easier to navigate.
I totally acknowledge that the residual spectrum is uninteresting; that was admittedly an afterthought. I'm going to check out your Hydrogen Audio link for details on how to improve that section/figure.
As for only using MP3: that was kind of the point. I wanted to pick seemingly low-hanging fruit so that I would have some nontrivial results to work with (i.e., cases where I could hear the difference). If I'd used high-bitrate AAC and it had been entirely transparent to my ears 100% of the time, I wouldn't have been able to say much about false positives, hits, and misses, which were ultimately the meat of the assignment, and thus of the article. The goal, as I say in the "nota bene" at the top, was not to say "Hey, look how unreliable MP3 128 is at being transparent," but rather to use MP3 128 to demonstrate the concepts of transparency and compression perceptibility.
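For a concrete feel for those terms, here's a rough sketch (not the code from the article; the trial counts are made up) of how hits, misses, and false alarms boil down to a single sensitivity number in the standard yes/no detection-theory formulation:

```python
from scipy.stats import norm

# Made-up trial counts for illustration only.
hits, misses = 12, 4                       # lossy trials: difference detected / missed
false_alarms, correct_rejections = 5, 11   # lossless trials: "heard a difference" / correctly passed

hit_rate = hits / (hits + misses)
fa_rate = false_alarms / (false_alarms + correct_rejections)

# d' (sensitivity): separation between the signal and noise distributions,
# computed via the inverse standard-normal CDF of the two rates.
d_prime = norm.ppf(hit_rate) - norm.ppf(fa_rate)
print(f"hit rate {hit_rate:.2f}, false alarm rate {fa_rate:.2f}, d' = {d_prime:.2f}")
```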
As far as converting the MP3 back to a WAV for ABX playback, I hadn't heard of that, but I have to think it would have had little to no effect on my test. I have a solid-state drive, so there was no disk noise, and I did not hear the CPU fan spin up for either type of playback. It is something I'll consider for the future.
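If I do revisit it, the prep step you describe looks small; something like this (assuming ffmpeg is installed; the filenames are placeholders) should put both sides on equal footing so the ABX player does identical decoding work for A, B, and X:

```python
import subprocess

def decode_to_wav(src: str, dst: str) -> None:
    """Decode any input ffmpeg understands to 16-bit PCM WAV."""
    subprocess.run(
        ["ffmpeg", "-y", "-i", src, "-c:a", "pcm_s16le", dst],
        check=True,
    )

# Placeholder filenames: the lossless original and the 128 kbit/s MP3 both
# end up as plain WAV, so playback involves the same IO and CPU work.
decode_to_wav("original.flac", "a_reference.wav")
decode_to_wav("encoded_128.mp3", "b_lossy.wav")
```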
Again, I appreciate your feedback and thanks for taking the time to comment!
Excellent points, klodolph, and I'd add that he used 128K MP3, which is well known not to be as good as CD quality. Most music services use 320K MP3 these days when bandwidth isn't an issue (e.g., over Wi-Fi).
Hey, as I mentioned in my response to klodolph, I used 128 CBR because I knew it would likely not be transparent and I wanted to demonstrate false positives, hits, misses, correct rejections, etc. I wasn't trying to make some authoritative critique of MP3 just because its low settings are indeed low quality.
> [...] it is concluded that phase distortion is audible and it degenerates the perceived sound quality for commonly heard sounds when reproduced through headphones. (The Influence of Phase Distortion on Sound Quality)
I've always thought of there being 5 different levels of lossiness:
* Understandable
* Not noticeably worse "casually" (i.e. unable to distinguish which version it is without having heard both versions first)
* Not noticeably worse
* Not noticeably different
* Exact
This applies to video encoding as well: for example, I start being able to pick up on MPEG artifacts well before I actually start being annoyed by them.
They each have their roles. The question is: which encodings are best suited to which of the domains above?