Would someone with a little more knowledge than me about sound engineering be able to tell me exactly how the concept of bits works in audio? I know that 8bit audio sounds more degraded than 16bit in that there are fewer possible values between the softest and the loudest sample, but I still don’t quite get how this causes bass sounds to have hiss. I’ve been trying to create a simple sine wave to use as a bass sound, but I ran from one problem into the next and eventually ended up trying to create the sound in 24bit, and after that in 32bit, since even the 16bit version seemed to have a lot more high frequencies than I wanted it to. I noticed Renoise doesn’t load actual 32bit sounds, but when I specify 32bit (IEEE float) at the point of creation, it works fine. What exactly is the difference between the two?
I’m synthesising the sounds in Sound Forge, by going Tools->Synthesis->Simple…, and I create a sine wave at 46.25Hz of exactly 4 seconds in length. Then when I look at a spectrum analysis of the resulting sound, I get the following results:
[Spectrum analysis images: 16bit and 32bit (IEEE float)]
Why is it that the 16bit version has the higher peaks (which seem like harmonics to me) as well? Wouldn’t synthesising a simple sound of frequency 46.25Hz result in a sound of only that frequency? Or how exactly does this work? This really got me thinking now because if I place a lowpass equaliser even at 500Hz and with a Q of 1.4, there are still higher frequencies removed from the sound. I’ve also realised that the EQ plugin I’m using (Ultrafunk fxEqualizer) seems to smooth out the sound where it was originally cut off with a click, due to note cuts placed in the tracker cutting the sine wave anywhere other than at a zero crossing (on the DC line). I notice Renoise does this too, in fact, but the click is still really noticeable. So does anyone have any insight on any of this?
1) The 8 bit hiss is produced by approximating the sine function with only 256 possible values (8 bit => 2^8 = 256). The more bits you have, the less audible this approximation is. 32 bit floating point files store non-integer values, unlike 8, 16 and 24 bit files (and 32 bit integer files), which store integers.
2) There are several 32 bit sound formats. The difference between them is mainly about the treatment of the "mantissa" (the fractional part: the mantissa of 2.525 is 525). I don’t know precisely how they are treated, but I can tell you that IEEE (Institute of Electrical and Electronics Engineers) is a standards-making organization (like ISO) which defined one of these standards.
3) The Q factor in a filter sets the shape of the filtering “bell” curve, so you are right: depending on the Q value, you are also filtering nearby frequencies by an amount which depends on the Q and on the attenuation you requested.
4) The (in)harmonics you are seeing in the spectrum view should be caused by the quantization noise due to the approximations I described in 1), which create extra frequencies below and above the base frequency. These can be harmonics (integer multiples of the base frequency) or inharmonics (others).
The smaller the approximation error, the less the noise.
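To put rough numbers on the approximation noise described above, here is a minimal Python sketch. It assumes a 44100 Hz sample rate (the thread never states one) and a simple round-to-nearest quantizer, which may not be exactly what Sound Forge does. It generates the 4 second, 46.25Hz sine, rounds it to 8 and 16 bits, removes the 46.25Hz component again, and measures what is left over at other frequencies.

```python
import math

SAMPLE_RATE = 44100      # assumed; the thread does not state the sample rate
FREQ = 46.25             # Hz, as in the original post
DURATION = 4             # seconds, so the file holds exactly 185 whole cycles

def quantize(x, bits):
    """Round x in [-1, 1] to the nearest step of a signed integer format."""
    scale = 2 ** (bits - 1) - 1              # 127 for 8 bits, 32767 for 16 bits
    return round(x * scale) / scale

def off_fundamental_rms(samples, basis_sin, basis_cos):
    """RMS of whatever is left after removing the 46.25 Hz component."""
    n = len(samples)
    a = 2 / n * sum(s * c for s, c in zip(samples, basis_cos))
    b = 2 / n * sum(s * w for s, w in zip(samples, basis_sin))
    residual = (s - a * c - b * w
                for s, c, w in zip(samples, basis_cos, basis_sin))
    return math.sqrt(sum(r * r for r in residual) / n)

n = SAMPLE_RATE * DURATION
phase = [2 * math.pi * FREQ * i / SAMPLE_RATE for i in range(n)]
basis_sin = [math.sin(p) for p in phase]
basis_cos = [math.cos(p) for p in phase]

results = {}
for label, bits in (("8 bit", 8), ("16 bit", 16), ("float", None)):
    # "float" keeps Python's own floating point values, with no rounding step
    wave = basis_sin if bits is None else [quantize(s, bits) for s in basis_sin]
    results[label] = off_fundamental_rms(wave, basis_sin, basis_cos)
    print(f"{label:>6}: content away from 46.25 Hz (RMS) = {results[label]:.2e}")
```

The unrounded sine should leave essentially nothing behind, while the rounded versions leave a small residue that shrinks as the bit depth grows. That residue is the extra content a spectrum analysis shows above the fundamental.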
About floating point:
the mantissa is a sort of fractional part, but not in the sense you describe.
For 2.525, one valid choice is mantissa m=0.315625 and exponent e=3. Why?
Because floating point numbers are represented as “m * b^e”
where m is the mantissa, b is the base number and e is the exponent.
For binary computers, b is naturally 2, and 0.315625 * 2^3 = 2.525.
The exponent is an integer, and the mantissa is scaled into a fixed
range: IEEE floats use 1 <= m < 2, which stores 2.525 as 1.2625 * 2^1.
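If you want to check the “m * b^e” idea yourself, here is a small Python sketch using only the standard library. `math.frexp` returns a mantissa in the range 0.5 to 1, which is one normalization convention; the second half unpacks the bits of a 32 bit IEEE float, where the mantissa is in the range 1 to 2. The variable names are just for illustration.

```python
import math
import struct

x = 2.525

# Convention 1: math.frexp gives x = m * 2**e with 0.5 <= m < 1
m, e = math.frexp(x)
print("frexp:", m, e)

# Any other split works too, as long as m * 2**e gives back the number
print("0.315625 * 2**3 =", math.ldexp(0.315625, 3))

# Convention 2: the 32 bit IEEE float layout (1 sign bit, 8 exponent bits,
# 23 stored mantissa bits plus an implied leading 1)
bits = struct.unpack(">I", struct.pack(">f", x))[0]
sign = bits >> 31
exponent = ((bits >> 23) & 0xFF) - 127      # remove the exponent bias
significand = 1 + (bits & 0x7FFFFF) / 2 ** 23
print("IEEE single:", sign, significand, exponent)
```

Note that the 32 bit version only approximates 2.525, because 23 stored mantissa bits cannot hold the value exactly.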
About the original question: It-Alien covered the most important points, unless
you need a more detailed and mathematical description, in which case
I suggest you search the net for DSP tutorials etc. or even get a book.
I think Harmony Central has some tutorials, but I don’t remember if there are any
that exactly fit the question.
But what I don’t get is, why are there other frequencies present in the sine wave when I synthesised it at 46.25Hz? Shouldn’t only this frequency be present, so that a lowpass filter at, say, 500Hz doesn’t affect the sound? But it does…so where are these other frequencies coming from?
Other than that, everything does seem a little clearer now. I remember things about the mantissa from algebra class at uni but never had any idea how this sort of thing is used in sound engineering and such. Thanks for all the help so far.
Draw a nice, round sine wave (one period is enough)
in a Cartesian coordinate system, with time on the horizontal axis and amplitude on the vertical one.
To make things simpler, let’s imagine using 2 bits only.
This gives only four levels of amplitude, so draw 4 tickmarks
on the amplitude line. These four values are the only ones
you can store digitally with 2 bits!
Now redraw the sine curve, not as a smooth curve but using only
the amplitudes you’ve marked. The result is a stepped wave, which
is clearly not a good sine wave.
These steps result in what is called “quantization noise”, which is
the extra frequencies introduced by the quantization of the smoothly
changing amplitude values to clear, discrete steps like in your drawing above.
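If you prefer to see the drawing exercise as numbers, here is a rough Python sketch of it. It assumes the four tickmarks are evenly spaced from -1 to 1 and that each point snaps to the nearest tickmark; the helper names `make_levels` and `snap` are made up for this example.

```python
import math

def make_levels(bits):
    """Evenly spaced amplitude tickmarks from -1 to 1 (an assumed layout)."""
    count = 2 ** bits
    return [-1 + 2 * i / (count - 1) for i in range(count)]

def snap(x, levels):
    """Move x to the nearest tickmark."""
    return min(levels, key=lambda level: abs(level - x))

levels = make_levels(2)          # 2 bits -> 4 tickmarks
points = 16                      # points drawn along one period
smooth = [math.sin(2 * math.pi * i / points) for i in range(points)]
stepped = [snap(s, levels) for s in smooth]

print("tickmarks:", [f"{level:+.3f}" for level in levels])
for s, q in zip(smooth, stepped):
    print(f"smooth {s:+.3f} -> stepped {q:+.3f} (error {q - s:+.3f})")
```

The error column is the difference between the stepped wave and the smooth one, and that difference is what you hear as the added noise.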
Using one more bit, you can double the number of tickmarks, resulting
in a smoother and more correct curve. With 8 bits, you have 256 levels
but that’s still not enough, and we can very clearly hear the noise those
steps introduce. With 16 bits, more than 65000 levels, the noise gets difficult to hear.
With 24 or 32 bits, this problem is practically gone.
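As a rough guide to how quickly the steps shrink, here is a short Python sketch of the number of levels per bit depth, together with the textbook signal-to-noise figure for a full-scale sine with ideal, evenly spread rounding error. Real converters and real signals will differ somewhat, so treat the dB values as an idealized estimate.

```python
import math

def level_count(bits):
    """Number of distinct amplitude values an integer format can store."""
    return 2 ** bits

def ideal_snr_db(bits):
    """Idealized SNR of a full-scale sine: power ratio is 1.5 * 4**bits."""
    return 10 * math.log10(1.5 * 4 ** bits)

for bits in (2, 8, 16, 24):
    print(f"{bits:>2} bits: {level_count(bits):>10} levels, "
          f"about {ideal_snr_db(bits):6.2f} dB signal-to-noise")
```

Each extra bit adds about 6 dB. A 32 bit IEEE float carries 24 bits of precision in its mantissa, so for this purpose it behaves more like the 24 bit line than like a 32 bit integer.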