Python – Convert 32-bit Floating Points to 16-bit PCM range

audio python

I have some data generated by the JavaScript HTML5 Web Audio API. It produces a Float32Array, an array of 32-bit floating-point values between -1 and 1. I stream the data to my server using a websocket.

I need to convert the 32-bit floating-point values to the 16-bit PCM range, between -32768 and +32767 (16-bit signed integers). This then allows the data to be used as a WAV file.

I'm having trouble converting. I suspect the answer is to use the struct module, but I can't get the correct formatting.

Best Answer

Here's a sample Python 2.7 program that reads a file containing raw 32-bit floating point audio samples and creates a WAV file containing those samples converted to 16-bit signed integer samples:

import sys
import array
import struct
import wave

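def convert(fin, fout, chunk_size = 1024 * 1024):
    chunk_size *= 4    # convert from samples to bytes (4 bytes per float)

    waveout = wave.open(fout, "wb")
    # mono, 2 bytes per sample, 44100 Hz, uncompressed
    waveout.setparams((1, 2, 44100, 0, "NONE", ""))

    while True:
        raw_floats = fin.read(chunk_size)
        if not raw_floats:
            break
        floats = array.array('f', raw_floats)
        # clamp to [-1, 1], scale to the 16-bit range, and truncate to int,
        # since the struct "h" format expects integers
        samples = [int(max(-1.0, min(1.0, sample)) * 32767)
                   for sample in floats]
        raw_ints = struct.pack("<%dh" % len(samples), *samples)
        waveout.writeframes(raw_ints)

    waveout.close()    # finalize the WAV header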

convert(open(sys.argv[1], "rb"), open(sys.argv[2], "wb"))

The code uses array.array to convert the 32-bit floating point samples to Python floats because it should be a bit faster than struct.unpack. array.array also uses the native machine byte order, just like Float32Array does. The 16-bit integer samples are packed with struct.pack rather than array.array because WAV data must be little endian regardless of the native machine order, and the "<" prefix in the format string guarantees that. The range conversion is handled by simple Python code.
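For data that arrives in memory (for example, one websocket message at a time) rather than from a file, the same conversion can be sketched as a small helper. This is only an illustration, not part of the program above: the name floats_to_pcm16 is hypothetical, and it assumes the incoming bytes are little-endian floats, which is an assumption about the sending machine. It uses struct.unpack with an explicit "<" so the byte order does not depend on the server, and it clamps values outside [-1, 1] before scaling.

```python
import struct

def floats_to_pcm16(raw_floats):
    """Convert raw 32-bit float samples to 16-bit signed PCM bytes.

    Assumes the input is little endian (hypothetical helper, see lead-in).
    """
    count = len(raw_floats) // 4    # 4 bytes per 32-bit float
    # Ignore any trailing bytes that do not form a whole sample.
    floats = struct.unpack("<%df" % count, raw_floats[:count * 4])
    # Clamp to [-1.0, 1.0], scale to the 16-bit range, truncate to int.
    samples = [int(max(-1.0, min(1.0, f)) * 32767) for f in floats]
    # "<" forces little endian output, as WAV requires.
    return struct.pack("<%dh" % count, *samples)

raw = struct.pack("<4f", -1.0, 0.0, 0.5, 2.0)
pcm = floats_to_pcm16(raw)
print(struct.unpack("<4h", pcm))    # (-32767, 0, 16383, 32767)
```

The returned bytes can be passed directly to writeframes on a wave writer. Scaling by 32767 keeps the output symmetric and never produces -32768; that is a design choice, not a requirement of the WAV format.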