Tuesday, 21 August 2007

MP3

MPEG-1 Audio Layer 3, more commonly referred to as MP3, is an audio encoding format.
It uses a lossy compression algorithm designed to greatly reduce the amount of data required to represent an audio recording while still sounding like a faithful reproduction of the original uncompressed audio to most listeners. It was invented by a team of European engineers at Philips, CCETT (Centre commun d'études de télévision et télécommunications), IRT and the Fraunhofer Society, who worked within the framework of the EUREKA 147 DAB digital radio research program, and it became an ISO/IEC standard in 1991.
MP3 is an audio-specific format. The compression removes parts of the sound that listeners are unlikely to hear. It provides a representation of pulse-code modulation (PCM) encoded audio in much less space than straightforward methods, by using psychoacoustic models to discard components less audible to human hearing and recording the remaining information in an efficient manner. Similar principles are used by JPEG, an image compression format.

Development
The psychoacoustic masking codec was first proposed, apparently independently, in 1979 by Manfred Schroeder et al.[1] in Germany and M. A. Krasner[2] in the United States. Krasner was the first to publish and to produce hardware, but the publication of his results as a relatively obscure Lincoln Laboratory Technical Report did not immediately influence the mainstream of psychoacoustic coder development. Manfred Schroeder was already a well-known and revered figure in the worldwide community of acoustical and electrical engineers, and his paper had immediate influence in European, and specifically German, circles of acoustic and source-coding (audio compression) research. Both Krasner and Schroeder built upon the work of E. F. Zwicker.[3]
The immediate predecessor of MP3, and the first practical implementation in hardware (Krasner's hardware was too cumbersome and slow for practical use), was "Optimum Coding in the Frequency Domain" (OCF),[4] an implementation of a psychoacoustic transform coder based on Motorola 56000 DSP chips. MP3 is directly descended from OCF. MP3 represents the outcome of the collaboration of Karlheinz Brandenburg with the Fraunhofer Society for Integrated Circuits, Erlangen, with relatively minor contributions from the Musicam (MP2) branch of psychoacoustic sub-band coders.
Modern lossy bit compression technologies, including MPEG and MP3, are based on the early work of Prof. Oscar Bonello of the University of Buenos Aires, Argentina.[dubious] He was involved in studio equipment design for broadcast radio automation. At the same time he taught acoustics at the University (he is the author of the "Bonello Criterion" for room acoustics design), with psychoacoustics being his main field of research. In 1983, he started researching the idea of using the Critical Band Masking principle (a property of the ear) to reduce the bit stream needed to encode an audio signal. The masking principle was discovered in 1924 and further developed by Egan-Hake and Richard Ehmer in 1959. In 1987, Bonello's work produced the world's first bit compression system,[dubious] named ECAM, which worked in real time and was implemented in hardware on an IBM PC computer. This plug-in card and the associated control software were demonstrated for the first time in 1988 as a fully working product named Audicom and introduced to the world at the international NAB Radio Exhibition in Atlanta, USA, in 1990. The basic Bonello implementation is now used in MP3 and other systems. Bonello has refused to apply for any patents on this technology.[5][6]
MPEG-1 Audio Layer 2 encoding began as the Digital Audio Broadcast (DAB) project managed by Egon Meier-Engelen of the Deutsche Forschungs- und Versuchsanstalt für Luft- und Raumfahrt (later on called Deutsches Zentrum für Luft- und Raumfahrt, German Aerospace Center) in Germany. This project was financed by the European Union as a part of the EUREKA research program where it was commonly known as EU-147, which ran from 1987 to 1994.
As a doctoral student at Germany's University of Erlangen-Nuremberg, Karlheinz Brandenburg began working on digital music compression in the early 1980s, focusing on how people perceive music. He completed his doctoral work in 1989 and became an assistant professor at Erlangen-Nuremberg. While there, he continued to work on music compression with scientists at the Fraunhofer Society (in 1993 he joined the staff of the Fraunhofer Institute).[7]
In 1991, two proposals were available: Musicam (known as Layer 2) and ASPEC (Adaptive Spectral Perceptual Entropy Coding). The Musicam technique, proposed by Philips (The Netherlands), CCETT (France) and Institut für Rundfunktechnik (Germany), was chosen for its simplicity and error robustness, as well as the low computational power needed to encode high-quality compressed audio. The Musicam format, based on sub-band coding, became the basis of the MPEG Audio compression format (sampling rates, structure of frames, headers, number of samples per frame). Its technology and ideas were fully incorporated into the definition of ISO MPEG Audio Layer I and Layer II and, later, of the Layer III (MP3) format. Under the chairmanship of Professor Mussmann (University of Hannover), the editing of the standard was the responsibility of Leon van de Kerkhof (Layer I) and Gerhard Stoll (Layer II).
A working group consisting of Leon van de Kerkhof (The Netherlands), Gerhard Stoll (Germany), Leonardo Chiariglione (Italy), Yves-François Dehery (France) and Karlheinz Brandenburg (Germany) took ideas from Musicam and ASPEC, added some of their own ideas and created MP3, which was designed to achieve the same quality at 128 kbit/s as MP2 at 192 kbit/s.
All algorithms were approved in 1991 and finalized in 1992 as part of MPEG-1, the first standard suite by MPEG, which resulted in the international standard ISO/IEC 11172-3, published in 1993. Further work on MPEG audio was finalized in 1994 as part of the second suite of MPEG standards, MPEG-2, more formally known as international standard ISO/IEC 13818-3, originally published in 1995.
Compression efficiency of encoders is typically defined by the bit rate, because the compression ratio depends on the bit depth and sampling rate of the input signal. Nevertheless, compression ratios are often published that use the CD parameters as the reference (44.1 kHz, 2 channels at 16 bits per channel, or 2×16 bit). Sometimes the Digital Audio Tape (DAT) SP parameters are used (48 kHz, 2×16 bit). Compression ratios computed against this reference are higher, which demonstrates the problem with using the term compression ratio for lossy encoders.
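The dependence of the compression ratio on the chosen reference can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming 1 kbit = 1000 bits and a 128 kbit/s MP3 as the encoded stream; the function names are illustrative and not part of any standard or library.

```python
def pcm_bit_rate_kbps(sample_rate_hz: int, channels: int, bits_per_sample: int) -> float:
    """Bit rate of uncompressed PCM audio in kbit/s (assuming 1 kbit = 1000 bits)."""
    return sample_rate_hz * channels * bits_per_sample / 1000


def compression_ratio(reference_kbps: float, encoded_kbps: float) -> float:
    """Ratio of the uncompressed reference bit rate to the encoded bit rate."""
    return reference_kbps / encoded_kbps


cd_kbps = pcm_bit_rate_kbps(44_100, 2, 16)   # CD reference: 44.1 kHz, 2x16 bit
dat_kbps = pcm_bit_rate_kbps(48_000, 2, 16)  # DAT SP reference: 48 kHz, 2x16 bit
mp3_kbps = 128                               # assumed MP3 bit rate for the comparison

print(f"CD reference:  {cd_kbps:.1f} kbit/s, ratio {compression_ratio(cd_kbps, mp3_kbps):.1f}:1")
print(f"DAT reference: {dat_kbps:.1f} kbit/s, ratio {compression_ratio(dat_kbps, mp3_kbps):.1f}:1")
```

The same 128 kbit/s stream comes out at roughly 11:1 against the CD reference and 12:1 against the DAT reference, which is why the bit rate is the less ambiguous figure.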
Karlheinz Brandenburg used a CD recording of Suzanne Vega's song "Tom's Diner" to assess the MP3 compression algorithm. This song was chosen because of its softness and simplicity, making it easier to hear imperfections in the compression format during playback. Some jokingly refer to Suzanne Vega as "The mother of MP3". Some more critical audio excerpts (glockenspiel, triangle, accordion, etc.) were taken from the EBU V3/SQAM reference compact disc and have been used by professional sound engineers to assess the subjective quality of the MPEG Audio formats.

Audio quality
When creating an MP3 file, there is a trade-off between the amount of space used and the sound quality of the result. Typically, the creator of the MP3 file (for example, someone ripping a compact disc to this format) is allowed to set a bit rate, which specifies how many kilobits the file may use per second of audio. The lower the bit rate, the lower the audio quality but the smaller the file. Likewise, the higher the bit rate, the higher the quality and the larger the resulting MP3 file.
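The size side of this trade-off follows directly from the definition of bit rate. The sketch below assumes a constant bit rate, 1 kbit = 1000 bits, and a hypothetical four-minute track, and it ignores tags and other metadata; `mp3_size_bytes` is an illustrative helper, not a function from any encoder.

```python
def mp3_size_bytes(bit_rate_kbps: float, duration_s: float) -> float:
    """Approximate audio payload size in bytes for a constant bit rate file."""
    bits = bit_rate_kbps * 1000 * duration_s  # kbit/s -> bits over the whole track
    return bits / 8                           # 8 bits per byte


track_seconds = 240  # assumed four-minute track

for rate in (64, 128, 192, 320):
    size_mb = mp3_size_bytes(rate, track_seconds) / 1_000_000
    print(f"{rate:>3} kbit/s -> {size_mb:.2f} MB")
```

Doubling the bit rate doubles the file size, whereas the gain in perceived quality depends on the encoder and the material.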
As described, MP3 files encoded with a lower bit rate will generally play back at a lower quality. With too low a bit rate, "compression artifacts" (i.e., sounds that were not present in the original recording) may be audible in the reproduction. Some audio is hard to compress because of its randomness and sharp attacks. When this type of audio is compressed, artifacts such as ringing or pre-echo are usually heard. A sample of applause compressed at a relatively low bit rate provides a good example of compression artifacts.
Besides the bit rate of an encoded piece of audio, the quality of MP3 files also depends on the quality of the encoder itself and the difficulty of the signal being encoded. As the MP3 standard allows quite a bit of freedom with encoding algorithms, different encoders may deliver quite different quality, even when targeting similar bit rates. As an example, in a public listening test featuring two different MP3 encoders at about 128 kbit/s,[8] one scored 3.66 on a 1–5 scale, while the other scored only 2.22.
Quality is heavily dependent on the choice of encoder and encoding parameters. While quality around 128 kbit/s was somewhere between annoying and acceptable with older encoders, modern MP3 encoders can provide very good quality at those bit rates[9] (as of January 2006). However, in 1998, MP3 at 128 kbit/s provided only quality equivalent to AAC-LC at 96 kbit/s and MP2 at 192 kbit/s.[10]
The transparency threshold of MP3 can be estimated at about 128 kbit/s with good encoders on typical music, as evidenced by its strong performance in the above test; however, some particularly difficult material can require 192 kbit/s or higher. As with all lossy formats, some samples cannot be encoded transparently for all listeners.
For digital stereophonic sound, this transparency threshold can be greatly reduced by using the joint stereo coding mode, which is based on removing stereo intensity redundancy. This feature reduces the overall bit rate of a stereophonic signal to as low as 96 kbit/s. Unfortunately, although the feature is widely used in MP3 files and in all standardized encoders, no official results for this transparency level were ever published, owing to strong lobbying and opposition from the professional music industry.[citation needed]
The simplest type of MP3 file uses one bit rate for the entire file; this is known as Constant Bit Rate (CBR) encoding. Using a constant bit rate makes encoding simpler and faster. However, it is also possible to create files where the bit rate changes throughout the file. These are known as Variable Bit Rate (VBR) files. The idea behind this is that, in any piece of audio, some parts will be much easier to compress, such as silence or music containing only a few instruments, while others will be more difficult to compress. So, the overall quality of the file may be increased by using a lower bit rate for the less complex passages and a higher one for the more complex parts. With some encoders, it is possible to specify a given quality, and the encoder will vary the bit rate accordingly. Users who know a particular "quality setting" that is transparent to their ears can use this value when encoding all of their music, and need not worry about performing personal listening tests on each piece of music to determine the correct settings.
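To make the CBR/VBR distinction concrete, the sketch below computes the duration-weighted average bit rate of a VBR file. The segment durations and bit rates are invented purely for illustration (a quiet intro, a moderately complex middle, a dense finale); a real encoder chooses the bit rate frame by frame using its psychoacoustic model, which is not modeled here.

```python
def average_bit_rate_kbps(segments: list[tuple[float, float]]) -> float:
    """Duration-weighted average bit rate for (duration_s, bit_rate_kbps) segments."""
    total_kbits = sum(duration * rate for duration, rate in segments)
    total_seconds = sum(duration for duration, _ in segments)
    return total_kbits / total_seconds


# Hypothetical VBR layout: (duration in seconds, bit rate in kbit/s)
vbr_segments = [
    (10, 32),    # near-silent intro: very easy to compress
    (120, 128),  # a few instruments: moderate complexity
    (50, 192),   # dense passage: hardest to compress
]

avg = average_bit_rate_kbps(vbr_segments)
total_seconds = sum(duration for duration, _ in vbr_segments)
size_mb = avg * 1000 * total_seconds / 8 / 1_000_000

print(f"Average bit rate: {avg:.1f} kbit/s over {total_seconds} s ({size_mb:.2f} MB)")
```

A CBR file of the same size would spend this average rate on every passage, including the near-silent intro, leaving fewer bits for the dense passage than the VBR layout gives it.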
In a listening test, MP3 encoders at low bit rates performed significantly worse than those using more modern compression methods (such as AAC). In a 2004 public listening test at 32 kbit/s,[11] the LAME MP3 encoder scored only 1.79/5, behind all modern encoders, with Nero Digital HE AAC scoring 3.30/5.
It is also important to note that perceived quality can be influenced by listening environment (ambient noise), listener attention, and listener training.