
Enabling HDF5 Compression

+2 votes

We've been trying to reduce the size of our output data and have considered switching to HDF5 for storage and speed purposes. However, we noticed that HDF5 files were much bigger than the old compressed XML (xml.gz) format we usually use. As an example, the following saves data in both formats:

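import random
from dolfin import *

# Quadratic Lagrange function on a 50x50x50 unit cube mesh
mesh = UnitCubeMesh(50, 50, 50)
V = FunctionSpace(mesh, "CG", 2)
u0 = Function(V)

# Fill the coefficient vector with random values
for i in range(len(u0.vector())):
    u0.vector()[i] = random.random()

# Write the function to HDF5 and close the file so it is flushed to disk
hdf5Out = HDF5File(mpi_comm_world(), 'u.h5', 'w')
hdf5Out.write(u0, "u0")
hdf5Out.close()

# Write the same function to gzip-compressed XML
xmlGzOut = File("u.xml.gz")
xmlGzOut << u0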

In our case, the resulting HDF5 and XML files were 48 MB and 17 MB respectively.
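
For anyone wanting to reproduce the comparison, a small sketch like the following reports the two file sizes. The file names u.h5 and u.xml.gz are the ones used in the script; the helper name size_in_mb is only illustrative.

```python
import os


def size_in_mb(path):
    """Return the size of the file at `path` in megabytes (10**6 bytes)."""
    return os.path.getsize(path) / 1e6


if __name__ == "__main__":
    # File names taken from the script in the question
    for name in ("u.h5", "u.xml.gz"):
        if os.path.exists(name):
            print("%s: %.1f MB" % (name, size_in_mb(name)))
        else:
            print("%s: not found (write the files first)" % name)
```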

Going through the dolfin source, it appears that no compression is turned on by default for HDF5, hence the older compressed XML format winning out. The HDF Group website (http://www.hdfgroup.org/HDF5/faq/compression.html) gives examples on how to do this via H5Pset_filter.
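
To illustrate what an HDF5 compression filter does at the dataset level, here is a rough sketch using h5py rather than dolfin. It assumes h5py and numpy are installed; the function name write_dataset and the file names are made up for the example, and this is not how dolfin writes its files. Note that random values like those in the script above are close to incompressible in binary form, so the gain depends strongly on the data.

```python
import os
import tempfile

import h5py
import numpy as np


def write_dataset(path, data, compress):
    """Write `data` to an HDF5 dataset named "u0" and return the file size in bytes."""
    with h5py.File(path, "w") as f:
        if compress:
            # gzip (deflate) filter at level 4, preceded by the byte-shuffle filter
            f.create_dataset("u0", data=data, compression="gzip",
                             compression_opts=4, shuffle=True)
        else:
            f.create_dataset("u0", data=data)
    return os.path.getsize(path)


if __name__ == "__main__":
    tmpdir = tempfile.mkdtemp()
    # A constant field compresses very well; random data would not
    data = np.zeros(100000)
    plain = write_dataset(os.path.join(tmpdir, "plain.h5"), data, compress=False)
    packed = write_dataset(os.path.join(tmpdir, "packed.h5"), data, compress=True)
    print("uncompressed: %d bytes, gzip: %d bytes" % (plain, packed))
```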

Beyond doing our own manual compression of the resulting HDF5 files, is there any trick to enabling this in FEniCS? Is compressed HDF5 a feature FEniCS would consider adding?

asked Nov 21, 2014 by mjulian FEniCS Novice (170 points)

1 Answer

+2 votes
 
Best answer

Enabling HDF5 compression (for writing) is not really feasible in parallel, and that is the main reason it has not been implemented. In theory, it could be implemented for serial code, but the main point of having HDF5 was parallel usage. You can contribute code to make it work in serial, if you like.

Having said that, it is always possible to compress the HDF5 files after they have been written, using "h5repack", which has various compression options.
Compressed HDF5 files will still be readable in the same way, even in parallel. So that is my preferred option.
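
A minimal sketch of such a post-processing step, assuming the h5repack tool from the HDF5 command-line utilities is on the PATH. The -f SHUF and -f GZIP=<level> filter options belong to h5repack; the function names and the output file name below are only illustrative.

```python
import subprocess


def h5repack_command(src, dst, gzip_level=1, shuffle=True):
    """Build the h5repack argument list for gzip compression of `src` into `dst`."""
    if not 1 <= gzip_level <= 9:
        raise ValueError("gzip_level must be between 1 and 9")
    cmd = ["h5repack"]
    if shuffle:
        # Byte shuffling before gzip often improves the compression ratio
        cmd += ["-f", "SHUF"]
    cmd += ["-f", "GZIP=%d" % gzip_level, src, dst]
    return cmd


def repack(src, dst, **kwargs):
    """Run h5repack; raises CalledProcessError if the tool reports failure."""
    subprocess.check_call(h5repack_command(src, dst, **kwargs))


if __name__ == "__main__":
    # Only print the command here; call repack("u.h5", "u_compressed.h5") to run it
    print(" ".join(h5repack_command("u.h5", "u_compressed.h5")))
```

Run as a script, this prints the equivalent shell command, h5repack -f SHUF -f GZIP=1 u.h5 u_compressed.h5. Higher GZIP levels trade longer repacking time for smaller files.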

answered Nov 21, 2014 by chris_richardson FEniCS Expert (31,740 points)
selected Nov 21, 2014 by mjulian

Thanks, wasn't aware of the h5repack utility! I'll give that a go.

Hello Chris,

Could you share the command you use on the command line to compress the HDF5 file?

I came across this link:

http://www.speedup.ch/workshops/w37_2008/HDF5-Tutorial-PDF/HDF5-Tools.pdf

It talks about various filters and compression extensions. What would you recommend so that the result can still be viewed in ParaView? In that link, they use something like this:

h5repack -f SHUF -f GZIP=1 output.he5 compressed_output.he5
...