Is there a nice way in FEniCS to assemble a system in parallel and store the complete (not partitioned) assembled sparse matrix? Serial assembly works, but it becomes very slow for large systems. A workaround like the one in https://fenicsproject.org/qa/435/read-and-write-matrix-in-dolfin using
A = PETScMatrix()
A.binary_dump()
would be fine for me, provided the dumped matrix can be loaded afterwards in serial.
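
To make the serial loading step concrete, here is a minimal sketch of what I have in mind. It assumes the dump is a standard PETSc binary matrix file (big-endian header with class id, rows, columns, nonzero count, followed by per-row nonzero counts, column indices and values) written by a default PETSc build with 32-bit indices and real double-precision scalars. The function name read_petsc_binary_matrix is made up for illustration and is not part of DOLFIN or PETSc. I have not verified that a file written from a parallel run has exactly this layout, so treat that as an assumption too.

```python
import numpy as np
from scipy.sparse import csr_matrix

# Class id that PETSc writes at the start of a binary Mat file.
MAT_FILE_CLASSID = 1211216


def read_petsc_binary_matrix(path):
    """Read a PETSc binary sparse matrix into a SciPy CSR matrix (serial).

    Assumption: 32-bit indices and real float64 values, stored big-endian.
    """
    with open(path, "rb") as f:
        # Header: class id, number of rows, number of columns, number of nonzeros.
        header = np.fromfile(f, dtype=">i4", count=4)
        if header.size != 4 or int(header[0]) != MAT_FILE_CLASSID:
            raise ValueError("not a PETSc binary matrix file")
        nrows, ncols, nnz = (int(v) for v in header[1:])

        # Number of nonzeros in each row, then column indices, then values.
        row_nnz = np.fromfile(f, dtype=">i4", count=nrows).astype(np.int64)
        cols = np.fromfile(f, dtype=">i4", count=nnz).astype(np.int32)
        vals = np.fromfile(f, dtype=">f8", count=nnz).astype(np.float64)

    # Build the CSR row pointer from the per-row counts.
    indptr = np.concatenate(([0], np.cumsum(row_nnz)))
    return csr_matrix((vals, cols, indptr), shape=(nrows, ncols))
```

Usage would then be something like A = read_petsc_binary_matrix("A.dat") in a serial script, where "A.dat" is a placeholder for whatever file name was passed to the dump. If petsc4py is available, loading through a binary viewer (PETSc.Viewer().createBinary(...) followed by PETSc.Mat().load(...)) might be the cleaner route, but I have not tried that with a matrix dumped from DOLFIN.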