Casting an mmapped ByteString to other types

bytestring, haskell, io

I realize this may be a rather heretical question, but I wonder whether I can mmap a file of data via System.IO.Posix.MMap and then cast the resulting ByteString into a strict array of some other type. E.g. if I know that the file contains doubles, can I somehow get this mmapped data into a UArr Double so I can do sumU etc. on it, and have the virtual memory system take care of the IO for me? This is essentially how I deal with multi-GB data sets in my C++ code. Alternative, more idiomatic ways to do this are also appreciated, thanks!

Supreme extra points for ways I can also do multicore processing on the data 🙂 Not that I'm demanding or anything.

Best Answer

I don't think it is safe to do this. A UArr lives in unpinned memory on the Haskell heap, so the GC is free to move it. ByteStrings (including mmapped ones) are ForeignPtrs to pinned memory. They're different kinds of object in the runtime system.

For this to be safe you will need to copy the data, since you're changing the underlying representation from a ForeignPtr to a Haskell value of type 'a'.
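As a rough illustration of the copying approach, here is a minimal sketch that reads each Double out of the ByteString's buffer and sums it, without ever treating the buffer as a Haskell array. The name sumDoubles is hypothetical (it is not part of any library). The sketch assumes the bytes are native-endian 8-byte doubles and suitably aligned, and it silently ignores any trailing bytes that don't make up a whole Double.

```haskell
{-# LANGUAGE BangPatterns #-}
import qualified Data.ByteString as B
import qualified Data.ByteString.Unsafe as BU
import Foreign.Ptr (Ptr, castPtr)
import Foreign.Storable (peekElemOff, sizeOf)

-- | Hypothetical helper: sum the Doubles stored in a ByteString.
-- Each element is copied out of the pinned buffer with peekElemOff,
-- so no Haskell value ever aliases the underlying memory.
-- Assumes native-endian, aligned 8-byte doubles.
sumDoubles :: B.ByteString -> IO Double
sumDoubles bs =
  BU.unsafeUseAsCStringLen bs $ \(p, len) -> do
    let n   = len `div` sizeOf (0 :: Double)  -- number of whole Doubles
        ptr = castPtr p :: Ptr Double
        -- Strict accumulator so the loop runs in constant space.
        go !acc i
          | i >= n    = return acc
          | otherwise = do
              x <- peekElemOff ptr i
              go (acc + x) (i + 1)
    go 0 0
```

With an mmapped file this would be used along the lines of unsafeMMapFile "data.bin" >>= sumDoubles (the file name is made up), so pages are faulted in by the virtual memory system as the loop walks the buffer. The same loop could fill an unboxed array instead of summing, if you really need the data as a Haskell array.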
