But I was still wondering: would there be a way to assemble the forms
in serial and perform the linear solve in parallel?
If the parallelization is done with OpenMP, you can force the assembly to run in serial by temporarily setting num_threads to 0:
// Remember the current setting, then switch off threaded assembly
int num_threads = dolfin::parameters["num_threads"];
dolfin::parameters["num_threads"] = 0;
// do assembly
// Restore the original setting
dolfin::parameters["num_threads"] = num_threads;
I did this when assembling over facets (a boundary integral to compute the heat flux).
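
If the assembly can throw or return early, the same save/restore idea could be wrapped in a small RAII guard so the old value always comes back. This is only a rough sketch: ParameterMap is a plain std::map standing in for dolfin::parameters, and ScopedParameter and assemble_with_serial_threads are hypothetical names made up for illustration, not part of DOLFIN.

```cpp
#include <map>
#include <string>

// Stand-in for dolfin::parameters (assumption: a plain map, for illustration only).
using ParameterMap = std::map<std::string, int>;

// Hypothetical RAII helper: overrides one integer parameter and restores
// the previous value when the guard goes out of scope.
class ScopedParameter
{
public:
  ScopedParameter(ParameterMap& params, const std::string& key, int value)
    : _params(params), _key(key), _saved(params[key])
  {
    _params[_key] = value;
  }

  ~ScopedParameter() { _params[_key] = _saved; }

  // Non-copyable, so the saved value is restored exactly once.
  ScopedParameter(const ScopedParameter&) = delete;
  ScopedParameter& operator=(const ScopedParameter&) = delete;

private:
  ParameterMap& _params;
  std::string _key;
  int _saved;
};

// Hypothetical placeholder for the assembly step; it returns the thread
// count that was in effect while the guard was alive.
int assemble_with_serial_threads(ParameterMap& params)
{
  ScopedParameter guard(params, "num_threads", 0);
  // do assembly
  return params["num_threads"];
}
```

With the real library you would write to dolfin::parameters inside the guard instead of the map; the rest of the pattern should carry over unchanged.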