
If you are working with audio data in machine learning and want the model to operate on discrete inputs, the first idea that comes to mind is linear encoding: simply map each floating-point sample between -1 and +1 to an integer from 0 to 255.
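A minimal sketch of that uniform quantization, assuming NumPy and 8-bit output (the function name is just for illustration):

```python
import numpy as np

def linear_encode(samples: np.ndarray, bits: int = 8) -> np.ndarray:
    """Uniformly map float samples in [-1, 1] to integers in [0, 2**bits - 1]."""
    samples = np.clip(samples, -1.0, 1.0)
    levels = 2 ** bits - 1                      # 255 for 8 bits
    return np.round((samples + 1.0) / 2.0 * levels).astype(np.uint8)
```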

However, there’s a much better way to do this: μ-law encoding. You can read about the theoretical details in the linked Wikipedia article, but I wasn’t convinced until I made an empirical comparison myself.
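Roughly, μ-law applies a logarithmic compression to each sample before quantizing, so small amplitudes (where most of the signal’s energy lives) get a larger share of the 256 levels. A sketch of the encoder and decoder, assuming NumPy and μ = 255 as in the standard 8-bit scheme:

```python
import numpy as np

def mu_law_encode(samples: np.ndarray, mu: int = 255) -> np.ndarray:
    """Compress samples in [-1, 1] with the mu-law curve, then quantize to [0, mu]."""
    samples = np.clip(samples, -1.0, 1.0)
    compressed = np.sign(samples) * np.log1p(mu * np.abs(samples)) / np.log1p(mu)
    return np.round((compressed + 1.0) / 2.0 * mu).astype(np.uint8)

def mu_law_decode(codes: np.ndarray, mu: int = 255) -> np.ndarray:
    """Invert the quantization and the mu-law compression back to floats in [-1, 1]."""
    compressed = codes.astype(np.float64) / mu * 2.0 - 1.0
    return np.sign(compressed) * ((1.0 + mu) ** np.abs(compressed) - 1.0) / mu
```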

So, here it is: the same audio file encoded with both the linear and μ-law algorithms. It’s the first 8 seconds of this YouTube video (the YouTube Mix dataset), sampled at 16 kHz.

Linear: [audio clip]
μ-law: [audio clip]

The noise in the linear encoding is much more prominent than in the μ-law version.
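For reference, a rough sketch of how the comparison above could be reproduced, assuming librosa is installed, the two helper functions from the sketches above, and a local copy of the clip (the filename here is hypothetical):

```python
import librosa

# Hypothetical local copy of the first 8 seconds of the YouTube Mix clip.
audio, sr = librosa.load("youtube_mix_clip.wav", sr=16000, duration=8.0)

# Encode with both schemes, then reconstruct to hear the quantization noise.
linear_recon = linear_encode(audio).astype("float64") / 255.0 * 2.0 - 1.0
mu_recon = mu_law_decode(mu_law_encode(audio))
```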