Surely there was a market for soundfiles that take up less space than WAV before the internet. Why did MP3 and similar compression techniques not come up before 1994? What were the difficulties?
Looking back 20-25 years, not at compression specifically but at music and computers in general, the only marriage of the two with any sort of audio quality was at the very high end of audio production - Synclavier, Fairlight, etc. - dedicated audio production systems costing hundreds of thousands of dollars. And this is in the '80s, so it's not that far back. The overriding concern was audio quality, so I don't think compression was really on anyone's radar. Compression is more for consumers than for production.
Sony had their ATRAC compression scheme that they used for MiniDisc, but that wasn't until the early '90s either.
Back in the '80s and '90s RAM was unbelievably expensive and decent audio would suck the life out of your processor and RAM. Even with today's computers you sometimes get sputtering and freezes and whatnot even when playing compressed audio if you try to do too much, and you still need dedicated processors to ease the strain on your main processor if you're doing any multitrack recording and sound processing. So I think it's only within the last 10-15 years or so that a basic home PC could even handle MP3s.
Those are good points, but audio compression has benefits even without the ability to decode it in real time (there's still a detectable overhead when you play an MP3 today). Imagine if you could send a new song around on 3 or 4 floppies, or even by e-mail.
I think the reason is actually a little circular and nonsensical in hindsight: there was no demand for compression methods to make audio files small because hardly anyone even dreamed of it being practical to have entire albums worth of audio stored on their computer at once. And why didn't they dream of it being practical? Because it wasn't practical! Those files are huge! I remember around 1990 having friends who would trade sound files simply for the sake of having little recorded bits of audio on their computers. A four-second sound of Bart Simpson going "Don't have a cow, dude" was enthralling. Forget about an entire high-quality song.
Also contributing to this was the fact that even if you got all that audio on your computer, there wasn't much to actually do with it. This is before iPods, before nice media player software that could set up playlists and display metadata, before networked digital audio streamers (and before the Wi-Fi and ubiquitous Ethernet that make them possible). It wasn't until the files came down to practical sizes that people even realized the files were useful and these types of applications started showing up. Before that, Windows Media Player could open one file at a time, and you could play it, then you had to go File -> Open and open another one. And I can remember playing sounds on the Mac by dragging them into the System Folder so they'd show up in the Sound control panel, because there was no other built-in way to do it.
"There just wasn't a need" seems to sum it up well. Until not many years ago, "Digital" and "Music" were two unrelated terms to most people. It wasn't until practical portable MP3 players were developed that people had much interest in storing music on computers. CDs were great - they were small, more durable than cassette tapes, sounded better than cassette tapes, what's not to love?
Before then, one of the first digital audio formats was Dolby's AC-1, introduced in 1984 and supplanted by AC-2 in 1989, and AC-3 in 1991 or so. At the time, it was mostly being used as a transmission method - Skywalker Sound started using AC-2 to move high-quality audio from their studios in Marin down to Hollywood on ISDN lines in 1991, and in 1992, the first digital radio studio-to-transmitter link was installed. It wasn't until late 1993 that a single-chip digital decoder was developed - well before the age of simply installing a codec file into your PC-based media player. In 1992, AC-3 was used as the key component of "Dolby Digital" movie soundtracks.
Until then, digital audio involved a large investment of money and of space - the equipment was huge. Before that single-chip design in 1993, the encoding and decoding equipment was a big box, about 19 inches square by five or seven inches tall. With the new single chip, the big box was reduced to two inches tall. Today, it's all done in software on a tiny sliver of a thing such as an iPod Nano that could hide under a cookie.
The original CELP algorithm as simulated in 1983 required 150 seconds to encode 1 second of speech when run on a Cray-1 supercomputer.
I think I remember that a common 16 MHz 68030 couldn't decode an MPEG-2 Layer 2 audio stream in real time, and that was what, 1995?
You were basically stuck with delta-coding, or with dedicated hardware, until recently.
If memory serves, in 1994 our family's desktop computer had half a gigabyte of hard drive space. We didn't even call them gigabytes then. You can't fit all that many albums in that much space. And they would have taken quite a bit of time to download over the 2400 bps modem.
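For scale, the arithmetic works out roughly like this - a back-of-the-envelope sketch assuming a three-minute song stored as uncompressed CD audio (the song length and figures are illustrative, not from the thread):

```python
# Back-of-the-envelope: downloading one uncompressed song over a
# 2400 bps modem. All figures are illustrative assumptions.
bytes_per_second = 44100 * 2 * 2        # CD audio: 44.1 kHz, 16-bit, stereo
song_bytes = 180 * bytes_per_second     # a 3-minute song: ~31.75 MB
modem_bps = 2400                        # modem speed, bits per second
hours = song_bytes * 8 / modem_bps / 3600
print(round(hours, 1))                  # roughly 29.4 hours for one song
```

Even over a later 56k modem, that same song would still take well over an hour, which is part of why compression suddenly mattered once people actually wanted to move music around.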
Some random thoughts from a geezer-in-training whose working knowledge of digital is somewhat dated:
I think the problem with digital audio and computers lay with the storage devices supplying the data. While the CPUs were up to the task, the hard drives were not. The computer-based 8-track audio recorder/editor I used back in the Nineties required an external SCSI drive to keep up with the processor. (Video has a similar history. While it was possible to do all sorts of neat digital video effects in real time in the early Eighties, it was another decade before it was possible to do digital editing with stored data, and that required several hard drives running simultaneously in order to shag the data fast enough.) So maybe the deal is not so much that compression is being used as it is that the memory devices now available are simply faster. (But as I say, it's been a decade since I've been "in the loop" and I haven't the faintest idea what the MP3 encoding scheme may be.)
Not all early digital recorders were huge and expensive. Circa 1985 Sony developed the F1 processor, which could record and play CD-quality audio. The box wasn't much larger than the typical DVD player of recent years, and the retail price was something like $1400. The catch was that it was just a processor - you had to record the data on a VCR, and thus editing was, at best, crude. It was intended for the consumer market, but a lot of public radio stations got into digital recording in the late Eighties by buying an F1 and a Betamax.
That Dolby AC-1 gotpasswords mentions must be the thing I recall reading about at the time but never encountered. As I recall, it achieved compression through a rather unique coding scheme: instead of recording the result of each sample, it computed the difference between the current sample and the previous one, a much smaller number, and recorded that. I've no idea how the thing recovered its bearings if it ever went astray.
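That difference idea reads roughly like this in Python - a minimal sketch of plain delta coding for illustration, not what Dolby's codec actually did internally:

```python
def delta_encode(samples):
    """Store each sample as its difference from the previous sample."""
    deltas, prev = [], 0
    for s in samples:
        deltas.append(s - prev)
        prev = s
    return deltas

def delta_decode(deltas):
    """Recover the samples by summing the differences back up."""
    samples, acc = [], 0
    for d in deltas:
        acc += d
        samples.append(acc)
    return samples

signal = [0, 3, 7, 8, 6, 2, -1]
assert delta_decode(delta_encode(signal)) == signal
```

As for recovering its bearings: a corrupted difference really would throw every later sample off, since the decoder is just a running sum, which is why practical difference coders periodically transmit an absolute sample value to resynchronize.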
The great-granddaddy of commercially successful digital audio devices must be the Allen Digital Computer Organ, first marketed in 1972. How'd they do it? First, of course, the data was coming from ROM chips, so throughput was not a problem. I've no idea what the bit depth was, but since the audio output was "clean" it must have been at least 12-bit. Given the aliasing on the top few notes of the 1' flute stop, the sampling rate must have been on the order of 32 kHz. They used an interesting method to cut their memory needs in half: if you design your waveform so that all the harmonics are in phase with the fundamental, the positive and negative halves of the waveform are mirror images of each other. Thus, you need only store the data for the positive half of the waveform; read the same data backwards, add a minus sign to the numbers, and you get the negative half. (Which was the system's only real drawback - with everything perfectly in phase, the sound tended to be too "pristine".)
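That mirror-image trick falls straight out of the math: a waveform built only from in-phase sine harmonics satisfies x(N - n) = -x(n), so the second half of each period is the first half read backwards with the sign flipped. A small Python sketch (the particular harmonics and period length are made up for illustration, not Allen's actual data):

```python
import math

N = 64                   # samples per waveform period (illustrative)
HARMONICS = (1, 2, 3)    # all harmonics as pure sines, in phase

def x(n):
    """Directly computed waveform: a sum of in-phase sine harmonics."""
    return sum(math.sin(2 * math.pi * k * n / N) / k for k in HARMONICS)

# Store only the first half of the period (plus the midpoint)...
half = [x(n) for n in range(N // 2 + 1)]

# ...and rebuild the rest by reading the stored data backwards and
# negating it: for a sine-only waveform, x(N - n) == -x(n).
full = half + [-half[N - n] for n in range(N // 2 + 1, N)]

# The reconstruction matches the directly computed full period.
direct = [x(n) for n in range(N)]
assert all(abs(a - b) < 1e-9 for a, b in zip(full, direct))
```

Half the ROM for the same waveform - at the cost, as noted, of every harmonic being locked in phase.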
Now you've done it. I just went to Google for a trip down Memory Lane.
The F1 actually came out in 1981. Here's a picture. I kinda understated its size - it had an external power supply.
I'd forgotten about R-DAT, which essentially was an F1 and a VCR in the same package, and about the size of the ubiquitous Marantz cassette machine used by radio news departments.
I don't follow your reasoning. This seems like a motivation for audio compression being needed, not a barrier to its development.
Originally posted by BJMoose
I don't think you quite understood me (and I may well have been vague). I was saying that, while microprocessors have long been up to the task of handling digital audio, the computer mass storage devices of a couple of decades back were too slow to keep up.
Originally posted by McNutty
No doubt it was a motivation to develop compression for audio files. But as I think about the nature of audio (especially music), there simply is not enough "repetition" in the data stream for any compression scheme to work (unless you're digitizing a test tone). If there is an effective compression scheme today, I suspect it's something along the lines of the Dolby system discussed above. But as I say, I'm at least ten years behind the times.
But lossy audio formats do compress data vastly and effectively. It doesn't have much to do with repetition, but with removing sounds that people normally don't hear anyway, thus keeping only the stuff that matters. For example, you can't tell the difference between size-100 font and size-99 font (I really am using it), so I can save space by changing size-99 font to 100 and getting rid of that extra information. Of course, as you make it smaller and smaller, you get rid of bigger and bigger differences, till you get really crappy audio where every "font" is either size 50, size 75, or size 100, and all the subtler variations are gone.
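The font-size analogy maps onto bit depth pretty directly: requantizing 16-bit samples to fewer bits keeps the coarse shape of the signal and throws away the fine gradations. A toy Python sketch (real perceptual codecs like MP3 are far more sophisticated - they decide per frequency band how much precision the ear will miss - but this is the "fewer font sizes" idea in its crudest form):

```python
def requantize(samples, bits):
    """Keep only the top `bits` bits of each signed 16-bit sample,
    zeroing the low-order detail - the audio equivalent of rounding
    every font size to the nearest big step."""
    shift = 16 - bits
    return [(s >> shift) << shift for s in samples]

samples = [12345, -4097, 300, -2]
coarse = requantize(samples, 4)   # only 16 possible levels remain
# No sample moved by a full quantization step (2**12 = 4096).
assert all(abs(c - s) < 2 ** 12 for c, s in zip(coarse, samples))
```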
Originally posted by BJMoose
And audiophiles claim that they can tell the difference between size-99 and size-100 "fonts", so they stick with the original, much larger, information. And maybe some of them can.
[aside]I can tell the difference. - (Okay, mostly because my browser seems to use different font thresholds and displays one as 10px and the other as 9px. :tongue:) But even if you zoom way in so that the pixel quantization doesn't matter, that's a difference in area of 1.99%, which should be quite visible. Line up a row of 10 cm big objects, and anybody should be able to see if one is 1 mm shorter. - Bad example, because it's off by several orders of magnitude. Consider that you can hear a grain of salt falling on a linoleum floor. But probably not when someone is playing a kettledrum at the same time.[/aside]
The truth is, there's lots of entropy in CD-quality audio data. You can't losslessly compress it to less than 30%-70% of its original size without taking something away. It's also true that you can take away quite a lot without doing too much audible damage. And there's the rub: to decide what we can safely take away, we need a sophisticated psychoacoustic model, which requires processing power.
To store audio without lossily compressing it into mushy goop, you either need insane amounts of storage capacity or insane amounts of processing power. We had insane amounts of storage capacity first.
At a time when a single Compact Disc stored more data than the average user's harddisk, games would play uncompressed background music directly from the CD drive, looped via analog cable into your soundcard. For other sound effects, many games from 2001 still used simple 4-bit ADPCM, because processing power was too precious to waste it on complicated audio compression. - Googling around, it seems to take maybe 30 MIPS (million instructions per second) to decode an MP3, that is, almost as much as a full-blown 50-MHz 486 had.
Uncompressed CD audio is 172 KB/s, a more than manageable data rate for all but the oldest hard drives (for comparison, even a floppy did 46 KB/s). The problem was simply that your expensive 20 MB drive was full after 2 minutes! The common processors couldn't help much to reduce this volume either, since they only had a few MHz.
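The arithmetic behind those figures, for anyone who wants to check it:

```python
rate = 44100 * 2 * 2            # 44.1 kHz, 16-bit (2 bytes), stereo
print(rate)                     # 176400 bytes/s
print(round(rate / 1024))       # 172 - i.e. about 172 KiB/s

disk = 20 * 1024 * 1024         # a 20 MB drive of the era
seconds = disk / rate
print(round(seconds))           # 119 - full in just under two minutes
```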
Originally posted by BJMoose
There were CD players, and there were computers. It didn't occur to anyone how (or why) these two worlds could possibly mix.
I'll stand corrected, sort of. The specific system I mentioned earlier had to be able simultaneously to read up to eight files while recording two more. (And I hear ya about drive capacity. The place I then worked for got its first PC in 1985; it had a 10MB hard drive, and, if memory serves, it took us about two years to fill the drive. Ah, the good old days. Now, Microsoft needs more than 10 meg just to break wind. . . .)
In response to Vox: I can imagine a scheme where, if you compressed the hell out of the audio before you digitized it, you might be able to get away with 8-bit sampling instead of 16-bit. It could work for most pop music (FM rock stations have been compressing their audio for decades in order to "sound louder"), but you wouldn't want to try it on a Mahler symphony.
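That "compress before you digitize" idea is essentially companding, and it actually shipped in telephony as μ-law: a logarithmic curve boosts quiet signals before coarse 8-bit quantization, and the decoder applies the inverse curve. A Python sketch of just the curve, using the standard 8-bit telephony constant - this is the continuous formula only, not a full G.711 encoder:

```python
import math

MU = 255.0  # mu-law constant used for 8-bit telephony

def mulaw_compress(x):
    """Logarithmically compress x in [-1, 1]: quiet signals get boosted
    so a coarse 8-bit quantizer wastes fewer levels on loud peaks."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mulaw_expand(y):
    """Inverse curve: undo the compression after quantization."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

# A quiet signal at 1% of full scale uses over 20% of the coded range...
assert mulaw_compress(0.01) > 0.2
# ...and compress/expand round-trips cleanly.
assert abs(mulaw_expand(mulaw_compress(0.3)) - 0.3) < 1e-9
```

For speech this trick gets 8 stored bits to behave like considerably more bits of linear resolution, but as the post says, aggressive dynamics compression that works for pop radio would be much less kind to a Mahler symphony's quiet passages.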