I wrote some software that uses FEniCS for the finite-element-related parts and, although I am happy with the result, I have the feeling it could run a lot faster.
The time-consuming part can be boiled down to this:
Repeatedly solve the following system for (a few thousand) different values of a scalar $\varepsilon$:
$\quad \quad \qquad (A + \varepsilon B)\, x = b$
where $A$ and $B$ are known, non-symmetric matrices and $b$ is a fixed right-hand side; I will spare you the weak forms they originate from.
I used to re-assemble every time $\varepsilon$ changed, but that is silly of course. So now I assemble $A$ and $B$ once, cast them as petsc4py matrices, and add them with `A.axpy(epsilon, B, 1)`, where the `1` makes sure the nonzero pattern is not recalculated every time.
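For context, the loop now looks roughly like this (a minimal sketch rather than my literal code; `a_form`, `b_form`, `L_form`, `V` and `epsilons` are placeholders for my actual forms, function space and parameter values, and I work on a copy of $A$ here just to keep the sketch self-contained):

```python
from dolfin import assemble, as_backend_type, Function, LUSolver

# a_form, b_form, L_form, V, epsilons: placeholders for my actual objects
A = as_backend_type(assemble(a_form))   # epsilon-independent matrices, assembled once
B = as_backend_type(assemble(b_form))
b = assemble(L_form)                    # right-hand side

u = Function(V)
for epsilon in epsilons:
    M = A.copy()                 # keep A untouched for the next epsilon
    M.axpy(epsilon, B, True)     # True (the '1' above): same nonzero pattern
    solver = LUSolver(M, 'petsc')
    solver.solve(u.vector(), b)
```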
This works, but a few things bother me:
For every different $\varepsilon$, it takes approximately 0.05 seconds to solve this (say, 9500x9500) sparse system. I tried `LUSolver('petsc')` and `LUSolver('umfpack')`, as those are the options available to me. If I cast the sparse matrix to a scipy type (i.e. `scipy.sparse.csr_matrix(A.array())`), it solves a factor of 6x or even 500x faster, using `spsolve` and `bicg` respectively. This is not worth doing, as the casting itself takes too much time. Hence the questions:

1. Is there an efficient way to cast to a scipy-friendly format, i.e. without the need for a dense array?
2. Any idea where the difference comes from, since PETSc should be on an equal footing with scipy's libraries?
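For what it is worth, the scipy comparison looks roughly like this (again a sketch; `M` is the assembled $A + \varepsilon B$ and `b` the assembled right-hand side from above, and the dense detour through `.array()` is exactly the part I would like to avoid):

```python
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Wasteful cast: DOLFIN matrix -> dense numpy array -> scipy CSR
M_csr = sp.csr_matrix(M.array())
b_np = b.array()

x_direct = spla.spsolve(M_csr, b_np)     # roughly 6x faster than LUSolver for me
x_iter, info = spla.bicg(M_csr, b_np)    # faster still; info == 0 means it converged
```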
If I try the same with `KrylovSolver`, every solution takes several seconds. Is a Krylov solver recommended for this kind of problem? I tried using the solution of the previous iteration as an initial guess for the next, but that didn't help much. Probably my preconditioner is badly chosen, but I don't really have an idea how to choose it here. Any light on the matter would be greatly appreciated.
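For reference, what I tried looks roughly like this (`'gmres'` and `'ilu'` are stand-ins, since choosing the method and preconditioner is precisely what I am unsure about; `nonzero_initial_guess` is how I reuse the previous solution):

```python
from dolfin import KrylovSolver

# 'gmres'/'ilu' are placeholders: the right choice here is part of my question
solver = KrylovSolver('gmres', 'ilu')
solver.parameters['nonzero_initial_guess'] = True   # warm-start from the last solve

for epsilon in epsilons:
    M = A.copy()
    M.axpy(epsilon, B, True)
    solver.set_operator(M)
    # u.vector() still holds the solution for the previous epsilon
    solver.solve(u.vector(), b)
```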
PS: I am stuck with FEniCS version 1.4.