Python – HDF5 file created with h5py can’t be opened by h5py

Tags: h5py, hdf5, io, numpy, python

I created an HDF5 file, apparently without any problems, under Ubuntu 12.04 (32-bit), using Anaconda as the Python distribution and working in IPython notebooks. The underlying data are all NumPy arrays. For example:

import numpy as np
import h5py

f = h5py.File('myfile.hdf5','w')

group = f.create_group('a_group')

group.create_dataset(name='matrix', data=np.zeros((10, 10)), chunks=True, compression='gzip')

If I try to open this file from a new IPython notebook, though, I get an error message:

f = h5py.File('myfile.hdf5', "r")

---------------------------------------------------------------------------
IOError                                   Traceback (most recent call last)
<ipython-input-4-b64ac5089cd4> in <module>()
----> 1 f = h5py.File(file_name, "r")

/home/sarah/anaconda/lib/python2.7/site-packages/h5py/_hl/files.pyc in __init__(self, name, mode, driver, libver, userblock_size, **kwds)
    220 
    221             fapl = make_fapl(driver, libver, **kwds)
--> 222             fid = make_fid(name, mode, userblock_size, fapl)
    223 
    224         Group.__init__(self, fid)

/home/sarah/anaconda/lib/python2.7/site-packages/h5py/_hl/files.pyc in make_fid(name, mode, userblock_size, fapl, fcpl)
     77 
     78     if mode == 'r':
---> 79         fid = h5f.open(name, h5f.ACC_RDONLY, fapl=fapl)
     80     elif mode == 'r+':
     81         fid = h5f.open(name, h5f.ACC_RDWR, fapl=fapl)

/home/sarah/anaconda/lib/python2.7/site-packages/h5py/h5f.so in h5py.h5f.open (h5py/h5f.c:1741)()

IOError: Unable to open file (Unable to find a valid file signature)

Can you tell me what that missing file signature is? Did I miss something when I created the file?

Best Answer

Since we resolved the issue in the comments on my question, I'm writing the results out here to mark it as solved.

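The main problem was that I forgot to close the file after creating it, so its contents had presumably not been fully flushed to disk when I tried to open it from the other notebook. There are two simple fixes. Either close the file explicitly: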

import numpy as np
import h5py

f = h5py.File('myfile.hdf5','w')
group = f.create_group('a_group')
group.create_dataset(name='matrix', data=np.zeros((10, 10)), chunks=True, compression='gzip')
f.close()

or, my favourite, use a context manager so that the file is closed automatically:

import numpy as np
import h5py

with h5py.File('myfile.hdf5','w') as f:
    group = f.create_group('a_group')
    group.create_dataset(name='matrix', data=np.zeros((10, 10)), chunks=True, compression='gzip')
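
As for what the "file signature" in the error message refers to: per the HDF5 file format specification, a valid file carries the 8 bytes `\x89HDF\r\n\x1a\n` either at offset 0 or, if a user block is present, at offset 512, 1024, 2048, and so on. The sketch below only illustrates that check using the standard library. The helper name `has_hdf5_signature` is made up for this example and is not part of h5py.

```python
import os

# The 8-byte HDF5 format signature, as given in the HDF5 file format specification.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def has_hdf5_signature(path):
    """Return True if the file at `path` has the HDF5 signature at a valid offset.

    Hypothetical helper for illustration only; not an h5py function.
    """
    size = os.path.getsize(path)
    offset = 0
    with open(path, "rb") as fh:
        while offset + len(HDF5_SIGNATURE) <= size:
            fh.seek(offset)
            if fh.read(len(HDF5_SIGNATURE)) == HDF5_SIGNATURE:
                return True
            # Without a user block the signature sits at offset 0;
            # with one, it may sit at 512, 1024, 2048, ...
            offset = 512 if offset == 0 else offset * 2
    return False
```

Calling `has_hdf5_signature('myfile.hdf5')` on a file that was never closed may well return `False`, which would match the error above. In practice `h5py.is_hdf5('myfile.hdf5')` performs an equivalent check through the HDF5 library itself, so the hand-written version is mainly useful for seeing what is being looked for.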