This is a read only copy of the old FEniCS QA forum. Please visit the new QA forum to ask questions

Assembling a matrix equation takes longer than its solution

0 votes

I have a term

h_avg/vnorm('+') * inner(jump(grad(u), u_old), jump(grad(v), u_old))*dS +...

in a quadratic form, where u,v,u_old are VectorFunctions, and it takes considerably more time to assemble the system ("Assembling matrix over interior facets ...") than to solve it using gmres (the final matrix is symmetric). Is it a standard behaviour, or can I use some setting to make the assembling faster?

asked Nov 11, 2013 by franp9am FEniCS Novice (590 points)

(This is not a full answer so I post it as a comment.)

  1. You can turn on optimization of a form compiler and a C++ compiler

    parameters['form_compiler']['optimize'] = True
    parameters['form_compiler']['cpp_optimize'] = True
    

    Further you can tweak more detailed options of both the compilers.

  2. You can reduce quadrature degree, especially if you set globally

    parameters['form_compiler']['quadrature_degree'] = \
        unnecessarily_LARGE_value
    
  3. You should eliminate python-subclassed Expressions and SubDomains and used compiled ones.

  4. You should use tell to an assembler that a form to assemble is symmetric. (BTW you should use CG instead of GMRES for symmetric matrix.)

  5. You could precompute mesh-geometric quantities (like h_avg) if they don't evolve in time (i.e. project/interpolate them to appropriate function space).

Thanks; it seems to me that CG works mainly for positive definite matrices. In 2), a smaller number will result in less precise integration?

One more question: now it tells, in each assembly call, "Building sparsity pattern over interior facets ..." before it starts assembling. Is it not something that should be done only once (and how)?

If I'm not mistaken this can be handled like this:

# First assemble:
M = assemble(form)
# Subsequent assembles:
assemble(form, tensor=M, reset_sparsity=False)

CG works mainly for positive definite matrices

For indefinite symmetric matrices you could employ MINRES or SYMMLQ Krylov methods. The advantage over GMRES is that short recurrences are employed but the lost of orthogonality in Krylov iterations can be an issue. (But this is also lost in GMRES restarts.)

smaller number will result in less precise integration?

If integrand is a polynomial then there is some quadrature degree (under good circumstances estimated by FEniCS) which ensures exact quadrature. Whenever you use divisions, non-integer powers, (square-)roots, exponentiation, logarithms, sines, etc quadrature can no more be exact. The estimate by FEniCS is then just heuristic and it's up to you to consider a quadrature degree suitable for your needs.

You need to provide more detail e.g., the complete form, the function spaces and timings.

Thanks, Garth. Some details: I'm trying to implement some chorin method for NS equation with low viscosity and some kind of stabilization.
V=VectorFunctionSpace(mesh, "CG", 2), Q=FunctionSpace(mesh, "CG", 1)
noslip BC for velocity on the whole boundary
u,v = TrialFunction(V), TestFunction(V)
p,q = TrialFunction(Q), TestFunction(Q)
I was experimenting with this interior penalty aproach (that leads to symmetric matrices):
F1 = (1/dt)*inner(u - u_old, v)*dx + inner(grad(u_old)*u_old, v)*dx \
+ nu*inner(grad(u), grad(v))*dx - inner(f, v)*dx
F1 += tau*(h_avg/(vnorm_square('+')+1E-8))*inner(jump(grad(u),u_old), jump(grad(v),u_old))*dS
a1 = lhs(F1)
L1 = rhs(F1)
pressure and velocity corrections are similar than in the FEniCS demo
A2 resp. A3=assemble(a2) resp. a3

while time < max_time:
...A1, b1=assemble_system(a1, L1, bcu) # updating u_old in the a1 form..
...solver_vel.solve(A1, u_tent.vector(), b1)
...
Solver_vel is now Krylov 'cg' solver with "nonzero_initial_guess"=True, ["preconditioner"]["reuse"] = True etc.
Assembling a1, L1, bcu takes around 30 seconds; most of it writing
Assembling system (facet-wise) [=> ] 4.6%
and, considerably less time but still some time,
Building sparsity pattern over interior facets [====> ] 21.6%
etc.
Solving the system takes around 5 seconds and 70 iterations which is surprisingly fast compared to assembling the matrix. The matrix has around 200 000 rows.

I have already optimized some issues according to Blechta's advice, but maybe more could be done.. I'm still not sure, for example, how to tell to the assembler that the matrix is symmetric.
Thanks for possible feedback.
Peter

You need to supply a short and complete code in a readable form.

...