Audio Visual Image Blog: Week 3 - Storing Sound Digitally

This entry will discuss how to store sound waves digitally.

Storing a sound on a computer takes several steps. First, the sound has to be converted into an electrical signal, which a microphone can do. Next, this analog electrical signal needs to be converted into something that a computer can store.

The image below is a simple representation of how sound is stored in digital form.

This conversion is needed because an analog signal is continuous. A computer can only store binary information, which cannot represent continuous data exactly (that would take an infinite number of bits). To store it, the continuous signal must be turned into a discrete form that a computer can store: a digital form.

This is achieved by essentially taking snapshots of the sound wave as represented by the analog electrical signal. Several factors determine how precise those snapshots are, and this section goes over them.

Sample Rate

The sample rate is the number of samples of an analog signal captured per second, expressed in Hz. The greater the sample rate, the more precisely the wave is captured. A higher sample rate also requires more space to store each second of audio. The amount of data per unit of time is known as the bit rate (the number of bits per unit of time, usually per second). Sample rate isn't the only thing that affects bit rate, but I will go over that later.
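As a rough illustration of what sampling means, here is a minimal Python sketch. The function name `sample_sine` and the 440 Hz test tone are my own choices for the example, not part of any particular audio library. It takes evenly spaced snapshots of a sine wave at two sample rates.

```python
import math

def sample_sine(freq_hz, sample_rate_hz, duration_s):
    """Take evenly spaced snapshots of a sine wave with the given frequency."""
    n_samples = round(sample_rate_hz * duration_s)
    return [
        math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
        for n in range(n_samples)
    ]

# The same 10 ms of a 440 Hz tone, captured at two sample rates (assumed values).
coarse = sample_sine(440, 8000, 0.01)
fine = sample_sine(440, 16000, 0.01)

# Doubling the sample rate doubles the number of stored values.
print(len(coarse), len(fine))
```

Only the values in the lists are kept, so whatever the wave did between two snapshots is not recorded.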

In this example, the analog wave (blue) is sampled at a regular rate. Each sample is marked with a red square, representing a value that is stored digitally. This is the only thing that gets stored in the computer; anything that takes place between the points is lost. The green lines give a rough idea of the wave that would be output from the samples taken. As can be seen, a fair amount of precision is lost. For example, the deepest part of the first valley is cut off. Notice how the slightly longer wave is captured more accurately.

In this second example, the sample rate is doubled. While some information is still lost, the resultant wave in green still resembles the original wave much more closely than at the previous sample rate. The cost of course is twice the number of dots, which means twice the amount of data required to store it.

Finally, this example is half the sample rate of the first example. The first half of the sound is almost completely lost, but the longer wave is captured at least in part.

As can be seen, lower-frequency waves can be captured with a lower sample rate. Lower sample rates, however, cannot represent higher frequencies, so capturing a wider range of frequencies requires a higher sample rate. There is in fact a minimum sample rate required to capture a given frequency accurately.

Sample Rate (kHz)	Maximum Frequency (kHz)
8	3.6
11.025	5
22.05	10
32	14.5
44.1	20
48	21.8
64	29.1
88.2	40
96	43.6

Source: http://wiki.audacityteam.org/wiki/Sample_Rates

The above table shows a trend: the sample rate is always just over twice the maximum frequency it can capture accurately. Human hearing ranges from about 20 Hz to 20 kHz, which means that capturing everything humans can hear requires a sample rate of a little over 40 kHz; 44.1 kHz is the common choice. Anything higher than that gives an almost indistinguishable increase in quality for humans, whereas at lower rates the difference becomes noticeable.
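The "just over twice" relationship can be checked with a few lines of Python. This is only a sketch: the rows are a handful copied from the table above, and the helper name `theoretical_limit_khz` is mine. Half the sample rate is the theoretical upper bound (usually called the Nyquist frequency, a term the table's source may or may not use); the table's figures sit a little below it.

```python
# (sample rate in kHz, maximum frequency in kHz), a few rows from the table
rows = [(8, 3.6), (22.05, 10), (48, 21.8), (96, 43.6)]

def theoretical_limit_khz(sample_rate_khz):
    """Half the sample rate: the upper bound on the frequency that can be captured."""
    return sample_rate_khz / 2

for rate, max_freq in rows:
    print(
        f"{rate} kHz: table says {max_freq} kHz, "
        f"half the rate is {theoretical_limit_khz(rate)} kHz, "
        f"ratio {rate / max_freq:.2f}"
    )

# Every ratio comes out slightly above 2.
ratios = [rate / max_freq for rate, max_freq in rows]
```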

Quantization

The amount of data stored per sample is the other factor that affects precision, as well as the size of the resulting file. This is also known as bit depth, word size or resolution.

The value of a sample at any given point is a representation of the level of the analog signal it captured. This is usually an electrical signal that will have some range, say for example from -5 to 5 volts. An analog signal is continuous, however, so the number of possible values it could take within that range is infinite. As a computer cannot store an infinite range of values, each value has to be rounded to one that a computer can store. To do this, the range of -5 to 5 volts is split up into a number of steps. The number of steps that can be represented per sample depends on the number of bits you're willing to dedicate to each sample. The higher the number of bits, the greater the precision, and of course the larger the size of the sample in bits.

This image illustrates the lack of precision that can be caused by coarse quantization. Here, each value that can be stored in a sample is listed along the left side of the chart. The red line represents the samples. The actual analog value of the third red dot could be something like 3.4092.. etc. This can't be stored in the illustrated system, however, so it is instead rounded down to 3. This would audibly alter the sound, so a more accurate capture requires a larger number of steps.
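The rounding described above can be sketched in Python as follows. The function `quantize` and its parameters are hypothetical names for this example, and I am assuming the chart uses one step per volt between -5 and 5 (11 levels in total), which matches 3.4092 being stored as 3.

```python
def quantize(value, v_min, v_max, n_levels):
    """Round value to the nearest of n_levels evenly spaced steps in [v_min, v_max]."""
    step = (v_max - v_min) / (n_levels - 1)
    clamped = min(max(value, v_min), v_max)   # keep the input inside the range
    index = round((clamped - v_min) / step)   # which step is closest
    return v_min + index * step

analog = 3.4092
coarse = quantize(analog, -5.0, 5.0, 11)       # one step per volt (assumed from the chart)
fine = quantize(analog, -5.0, 5.0, 2 ** 16)    # as many steps as 16 bits allow

print(coarse, abs(analog - coarse))   # stored value and rounding error, coarse
print(fine, abs(analog - fine))       # stored value and rounding error, fine
```

With more steps the stored value lands much closer to the original, at the cost of more bits per sample.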

The number of steps it is possible to store is directly related to the number of bits per sample.

8-bit quantization can represent 256 voltage levels.

16-bit quantization can represent 65,536 voltage levels.

24-bit quantization can represent 16,777,216 voltage levels.
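These figures follow from the fact that n bits can represent 2^n distinct values. A small Python check (the helper name `levels` is just for illustration):

```python
def levels(bit_depth):
    """Number of distinct values that bit_depth bits can represent."""
    return 2 ** bit_depth

for bits in (8, 16, 24):
    print(f"{bits}-bit: {levels(bits):,} levels")
```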

Bit Rate

The bit rate is just the number of bits per second that a given piece of digital audio has. It's worked out from the sample rate and the quantization as well as the number of channels (for example, stereo sound has 2 channels).

Bit Rate = sample rate * bit depth * channels

A 44.1 kHz stereo sound with 16-bit quantization therefore has a bit rate of:

44100 * 16 * 2 = 1,411,200 bits per second.
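The same calculation as a small Python function (the name `bit_rate` is my own), which makes it easy to try other combinations such as mono instead of stereo:

```python
def bit_rate(sample_rate_hz, bit_depth, channels):
    """Bits per second of uncompressed audio."""
    return sample_rate_hz * bit_depth * channels

stereo = bit_rate(44100, 16, 2)
mono = bit_rate(44100, 16, 1)

print(stereo)  # 1411200
print(mono)    # 705600
```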

The file size of a piece of audio can be worked out from this if you also add in the duration.

Filesize = sample rate * bit depth * channels * seconds

If the above example were 3 minutes (180 seconds) long, then:

Filesize = 1,411,200 * 180

Filesize = 254,016,000 bits, or 31,752,000 bytes: about 31.8 MB (or around 30 MB if a megabyte is counted as 1,048,576 bytes).
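As a quick check of the arithmetic, here is a sketch in Python. The function name `file_size_bytes` is hypothetical, and it assumes uncompressed audio with no file header or other overhead.

```python
def file_size_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size of uncompressed audio in bytes (8 bits per byte, no header)."""
    bits = sample_rate_hz * bit_depth * channels * seconds
    return bits // 8

size = file_size_bytes(44100, 16, 2, 3 * 60)

print(size)                   # 31752000 bytes
print(size / 1_000_000)       # 31.752 (megabytes of 1,000,000 bytes)
print(size / (1024 * 1024))   # roughly 30.28 (megabytes of 1,048,576 bytes)
```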

Of course this is only valid for uncompressed audio. The higher the level of compression, the lower the bitrate but potentially also the lower the quality of the sound upon playback.

Source: http://www.dolphinmusic.co.uk/article/120-what-does-the-bit-depth-and-sample-rate-refer-to-.html

Wednesday, 17 October 2012
