Slow assemble for real-valued space

Question

I am solving a problem which in its simplest form is the following weak problem

$$\int_\Omega \sigma\nabla u\cdot\nabla v \,dx + z^{-1}\left( \int_{E_1}(u-U)(v-V)\,ds + \int_{E_2}(u+U)(v+V)\,ds \right) = 2V$$

Here E_1 and E_2 are subdomains on the boundary, $\sigma$ is a function on $\Omega$, $u\in H^1(\Omega)$ and $U\in\mathbb{R}$, and similarly for the test functions (v,V).

When assembling the first part of the form

V = FunctionSpace(mesh, "CG", 1)
R = FunctionSpace(mesh, "R", 0)
W = MixedFunctionSpace([R,V])
(U,u) = TrialFunctions(W)
(V,v) = TestFunctions(W)
dom_int = sigma*inner(grad(u), grad(v))*dx
A = assemble(dom_int)

the assembly is incredibly slow, especially when compared to not using the real function space

V = FunctionSpace(mesh, "CG", 1)
u = TrialFunction(V)
v = TestFunction(V)
dom_int = sigma*inner(grad(u), grad(v))*dx
A = assemble(dom_int)

This seems to be a known problem:
http://fenicsproject.org/qa/1231/hybrid-fem-formulation-with-some-non-spatial-dofs-with-fenics

Since I have to reassemble the A-matrix many times (iterative method where sigma is changing), this slowdown is very bad. Is there an easy way to:

Build the A-matrix using only CG-1
Extend the matrix by a zero row and column (how is this done, and will it ruin performance?)

I tried a workaround with cbc.block by defining

M = block_mat([[0, 0], [0, A]])+block_mat([[B]])

where A is the CG-1 matrix from above and B is the assembled matrix for
$$z^{-1}\left( \int_{E_1}(u-U)(v-V)\,ds + \int_{E_2}(u+U)(v+V)\,ds \right)$$

It did not work, when I tried to solve the system it give the error (I am not even sure if cbc.block supports adding block matrices of different shapes like this):

RuntimeError: Can't allocate vector - no usable template for block 0.
Consider calling something like bb.allocate([V, Q]) to initialise the block_vec.

Any suggestions for a workaround that might help? Ideally the assembly with real spaces should be fixed since it is far too slow.

Kent-Andre Mardal · Answer 1 · May 31, 2016

Best answer

In cbc.block you would have separate matrices for CG and R spaces.
Have a look at e.g. the Stokes demo.

The reason that the use of R space is slow is that this space represent
global dofs that are coupled to all other dofs. This destroys performance.
Your best bet now, I think, is to assemble this matrix separately
for instance with cbc.block or directly in the PETSc backend.

answered May 31, 2016 by Kent-Andre Mardal FEniCS Expert (14,380 points)
selected Jun 1, 2016 by hgar

I am not sure it is possible to separate the two spaces in the forms on E_1 and E_2, at least I do not see any way to separate the reals U and V from the CG1 u and v.

commented May 31, 2016 by hgar FEniCS Novice (770 points)

It is simple. You get the A00 matrix by
using trial and test functions (u,0) and (v,0) in your form.
The A01 block is obtained by using (u,0) and (0, V).

commented May 31, 2016 by Kent-Andre Mardal FEniCS Expert (14,380 points)

Of course, you are completely right, honestly I do not know what I was thinking. Thank you for the help.

commented Jun 1, 2016 by hgar FEniCS Novice (770 points)

Slow assemble for real-valued space

1 Answer