Hello, dear MIUI fans! Today you'll learn more about the MP3 file format: how this kind of file is compressed, plus some other facts about it. Many of us remember CD players and CDs. One such disc could fit 20-23 songs at most. With the advent of MP3, a single disc could hold about 100. Today you'll find out how an MP3 file is encoded.

About CDs and CD recording!

The question arises: how is that possible?
But to understand how ingenious MP3 is, you first need to figure out how a regular CD track is encoded. It is done in the simplest way possible: the level of the analog signal is recorded at small intervals, and on a CD that happens 44,100 times per second! In each measurement the signal level is stored as a 16-bit number, which gives about 65,000 gradations. And since a stereo recording is two channels in parallel, it is easy to calculate that one second of such audio takes up about 1,411 kbit (44,100 x 16 x 2 = 1,411,200 bits). MP3, however, can get by with a data stream (bitrate) of 128 kbit per second, which is 11 times less than on a CD.

You may ask: what kind of magic is this? How is it possible to compress an audio recording like that? You can't just throw away 10 out of every 11 samples! These days there are compression formats even cooler than MP3 (OGG, AAC, WMA, etc.), but they rely on much the same principles, so let's look at the most iconic of them!

It turns out that in an MP3 the information is recorded in a completely different way, and the basis of it all is the so-called Fourier transform. What is that? We know that sound is a wave. Look at almost any audio signal and you will see some kind of ugly wave! But the recording is not as scary as it looks: it is actually Beethoven :) Sound is a mixture of waves with different frequencies, and overlapping each other they produce this shape.

The Fourier transform does the following: it lets you work out which frequencies a given audio signal consists of, and how intensely each of them sounds. By the way, this is exactly how Shazam works: it decomposes the recorded sound into frequencies, picks out distinctive notes (characteristic sounds) and compares them with its database.

When an MP3 file is encoded, the audio is first broken into small pieces (the so-called frames, about 0.026 sec each), the Fourier transform is performed on each of them, and then the fun begins! The thing is, our hearing is not perfect, and that is true of everyone!
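Before getting to the hearing tricks, here is a rough Python sketch of the two ideas above: the CD bitrate arithmetic and what "decomposing a signal into frequencies" looks like. It is only an illustration, not how a real encoder is written. The function names (`dft_magnitudes`, `strongest_frequencies`) and the toy signal (a 440 Hz tone mixed with a quieter 1000 Hz tone at a made-up 8,000 Hz sample rate) are my own assumptions, and the transform is the plain textbook one.

```python
import cmath
import math

# CD audio parameters quoted in the post.
SAMPLE_RATE = 44_100       # measurements per second
BITS_PER_SAMPLE = 16       # 2**16 = 65,536 signal levels
CHANNELS = 2               # stereo

cd_bitrate = SAMPLE_RATE * BITS_PER_SAMPLE * CHANNELS   # bits per second
mp3_bitrate = 128_000                                   # 128 kbit per second
compression_ratio = cd_bitrate / mp3_bitrate


def dft_magnitudes(samples):
    """Naive discrete Fourier transform: strength of each frequency bin."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2):            # bins above n/2 mirror the lower ones
        total = sum(
            samples[t] * cmath.exp(-2j * cmath.pi * k * t / n)
            for t in range(n)
        )
        magnitudes.append(abs(total) / n)
    return magnitudes


def strongest_frequencies(samples, sample_rate, count):
    """Return the `count` loudest frequencies in hertz, lowest first."""
    magnitudes = dft_magnitudes(samples)
    bin_width = sample_rate / len(samples)   # hertz covered by one bin
    loudest = sorted(range(len(magnitudes)),
                     key=lambda k: magnitudes[k], reverse=True)[:count]
    return sorted(k * bin_width for k in loudest)


# Toy "ugly wave": two pure tones mixed together (assumed values).
TOY_RATE = 8_000
TOY_LENGTH = 400
signal = [
    math.sin(2 * math.pi * 440 * t / TOY_RATE)
    + 0.5 * math.sin(2 * math.pi * 1000 * t / TOY_RATE)
    for t in range(TOY_LENGTH)
]

found = strongest_frequencies(signal, TOY_RATE, 2)
print("CD bitrate, bits per second:", cd_bitrate)
print("CD to MP3 ratio:", round(compression_ratio, 1))
print("Loudest frequencies in the toy signal, Hz:", found)
```

The mixed wave looks messy when plotted, but the transform recovers the two tones it was built from, which is the whole point of the step described above.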
For example, sensitivity to high frequencies decreases with age, so an MP3 encoder typically cuts off everything above 16,000 hertz right away. Cruel? I agree! But try listening to a 16,000 hertz tone yourself (if you can hear it, you are probably under 30-35; if you can't, you are probably older). This alone reduces the file size a little.

The greatest compression, though, comes from the so-called masking effect. It turns out that if the signal is very loud at some frequency, it can simply drown out the neighboring frequencies and the frequencies that are multiples of it, the so-called harmonics. So we can just remove part of the sound signal and no one will notice! There is another kind of masking, temporal masking (about 0.05 sec): after a very loud signal there is a short period of deafness. So if there is a loud sound, the signal right after it can be removed and no one will notice that either! Temporary deafness can even occur just before a loud sound (think of a motorcycle). This comes from the way our brain processes audio signals, and it is very easy to check! This is the so-called psychoacoustic model in action, which takes the imperfection of our hearing into account.

There are other tricks too. For example, when a stereo file is recorded, the left and right channels are not stored separately. Instead, the encoder stores the sum of the channels divided in half and the difference of the channels. Music tracks are mixed so that most instruments sound the same in both channels, which means there is not much detail in the channel difference. It can therefore be encoded more roughly, which reduces the file size even further.

But that's not all! Now for the most important part.
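Before moving on, the masking and stereo tricks described above can be sketched in Python like this. Big caveat: this is a toy, not the real psychoacoustic model. The function names, the `reach` and `ratio` parameters and all the sample values are made up for illustration, and real encoders use far more elaborate rules.

```python
def apply_toy_masking(magnitudes, reach=2, ratio=0.1):
    """Zero out frequency bins that a much louder nearby bin would drown out.

    Toy rule (an assumption, not the MP3 standard): a bin is dropped when it
    is quieter than `ratio` times the loudest bin within `reach` neighbors.
    """
    kept = []
    for i, value in enumerate(magnitudes):
        lo = max(0, i - reach)
        hi = min(len(magnitudes), i + reach + 1)
        loudest_nearby = max(magnitudes[lo:hi])
        kept.append(0.0 if value < loudest_nearby * ratio else value)
    return kept


def to_mid_side(left, right):
    """Store half the channel sum and the channel difference, as in the post."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Rebuild the left and right channels from mid and side."""
    left = [m + s / 2 for m, s in zip(mid, side)]
    right = [m - s / 2 for m, s in zip(mid, side)]
    return left, right


# Made-up frequency strengths: two loud bins surrounded by quiet ones.
spectrum = [0.01, 0.02, 1.0, 0.05, 0.03, 0.0, 0.4, 0.02]
masked = apply_toy_masking(spectrum)

# Made-up stereo samples where both channels are almost identical.
left = [1000, -2000, 1500, 300]
right = [998, -2004, 1497, 305]
mid, side = to_mid_side(left, right)

# Encode the difference "more roughly": round it to multiples of 4.
STEP = 4
rough_side = [round(s / STEP) * STEP for s in side]
rough_left, rough_right = from_mid_side(mid, rough_side)

print("After masking:", masked)
print("Mid:", mid)
print("Side:", side)
print("Left channel rebuilt from the rough side:", rough_left)
```

Notice how small the side values are compared to the mid values. That is why rounding them so crudely changes the rebuilt channels by at most one unit in this example.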
On the one hand, we deleted the information we don't need, but in fact we just replaced it with identical symbols, for example zeros ("0"). They are still in the file, though, so they have to be encoded in a way that takes up as little space as possible. This is done with the so-called Huffman code. This kind of compression does not cause any additional losses, and it works like this: each symbol in the file (well, more precisely, in the frame) is assigned a code. If the symbol occurs often, its code is short, and if the symbol is rare, its code is long. So if we cut quite a lot out of the file and end up with many zeros, we can encode them with a very small number of bits. As a result, the file will be small!

At the last stage, the Fourier transform coefficients (well, the ones that remain) are written into the frame, and all the frames are glued together into the finished file. Playback works in the reverse order.

As you can see, MP3 encoding involves irreversible losses, because we delete a lot of things. It turns out that we are being thoroughly deceived, but only because we allow it ourselves! After all, our perception of sound is not perfect. And yes, roughly speaking, we can only distinguish about 9% of the audio information (about the 128 out of 1,411 kbit per second that MP3 keeps). And that's fine! After all, that 9% is enough for us to perceive the whole sound picture. And precisely because of this imperfection we can easily hear the person we are talking to in a noisy crowd, we don't go deaf at concerts, and we can download a lot of music from the Internet without wasting precious traffic.

That's the whole algorithm for compressing and encoding an MP3 file. I hope it was interesting. Thanks to the Administrator, SMod and Mod!
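P.S. For the curious, here is a minimal Python sketch of the Huffman idea described above: frequent symbols get short codes, rare ones get long codes. It is the generic textbook construction, not the fixed code tables a real MP3 encoder uses. The names `huffman_codes`, `encode` and `decode` and the sample frame (mostly zeros, as after cutting things out) are my own assumptions.

```python
import heapq
from collections import Counter


def huffman_codes(symbols):
    """Build a prefix code: the more frequent a symbol, the shorter its code."""
    counts = Counter(symbols)
    if len(counts) == 1:                       # a single symbol still needs a bit
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: code so far}).
    heap = [(weight, i, {symbol: ""})
            for i, (symbol, weight) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, codes1 = heapq.heappop(heap)    # the two rarest groups
        w2, _, codes2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in codes1.items()}
        merged.update({s: "1" + c for s, c in codes2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(symbols, codes):
    """Concatenate the code of every symbol into one bit string."""
    return "".join(codes[s] for s in symbols)


def decode(bits, codes):
    """Read bits until they match a code, emit that symbol, repeat."""
    by_code = {code: symbol for symbol, code in codes.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        if current in by_code:
            decoded.append(by_code[current])
            current = ""
    return decoded


# Made-up frame of coefficients after "cutting out": mostly zeros.
frame = [0] * 12 + [3, 3, -1, 5, 0, 0, 3, -1]
codes = huffman_codes(frame)
bits = encode(frame, codes)

print("Codes:", codes)
print("Huffman bits:", len(bits))
print("Fixed 2-bit codes would need:", 2 * len(frame))
```

Decoding the bit string gives back exactly the same frame, so this step really is lossless. All the losses happen earlier, when things are cut out.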