
Run with MPI: Unable to successfully call PETSc function 'KSPSolve'

+1 vote

Hi All,
i'm new in using Dolfin and I'm trying to run codes in parallel with MPI.
The code is the following:

    from dolfin import *

    # Parameters
    parameters["mesh_partitioner"] = "SCOTCH"  # "ParMETIS"
    parameters["linear_algebra_backend"] = "PETSc"

    # Load mesh and subdomains
    mesh = Mesh("prague30.xml")
    sub_domains = MeshFunction("size_t", mesh, "subdomains_praga30.xml")

    # Define function spaces
    V = VectorFunctionSpace(mesh, "CG", 2)
    Q = FunctionSpace(mesh, "CG", 1)
    W = V * Q

    # No-slip boundary condition for velocity
    noslip = Constant((0, 0, 0))
    bc0 = DirichletBC(W.sub(0), noslip, sub_domains, 0)

    # Inflow boundary condition for pressure
    inflow = Expression("9.81*100*x[2]")
    bc1 = DirichletBC(W.sub(1), inflow, sub_domains, 1)

    # Boundary condition for pressure at outflow
    outflow = Expression("9.81*100*x[2]")
    bc2 = DirichletBC(W.sub(1), outflow, sub_domains, 2)

    # Boundary condition for pressure at freesurface
    zero = Constant(0)
    bc3 = DirichletBC(W.sub(1), zero, sub_domains, 3)

    # Collect boundary conditions
    bcs = [bc0, bc1, bc2, bc3]

    # Define variational problem
    (u, p) = TrialFunctions(W)
    (v, q) = TestFunctions(W)
    f = Constant((0, 0, -9.81))
    a = (inner(grad(u), grad(v)) - div(v)*p + q*div(u))*dx
    L = inner(f, v)*dx

    # Compute solution
    w = Function(W)

    (A, b) = assemble_system(a, L, bcs)
    ww = w.vector()
    solve(A, ww, b, "gmres", "ilu")

    # Split the mixed solution using deepcopy
    # (needed for further computation on coefficient vector)
    (u, p) = w.split(True)

    print "Norm of velocity coefficient vector: %.15g" % u.vector().norm("l2")
    print "Norm of pressure coefficient vector: %.15g" % p.vector().norm("l2")

    # Split the mixed solution using a shallow copy
    (u, p) = w.split()

    # Save solution in VTK format
    ufile_pvd = File("velocity.pvd")
    ufile_pvd << u
    pfile_pvd = File("pressure.pvd")
    pfile_pvd << p

    # Plot solution
    plot(u)
    plot(p)
    interactive()

The code works properly in serial, but when I run it with MPI I get the following error:

*** -------------------------------------------------------------------------
*** Error: Unable to successfully call PETSc function 'KSPSolve'.
*** Reason: PETSc error code is: 56.
*** Where: This error was encountered inside /build/buildd/dolfin-1.3.0+dfsg/dolfin/la/PETScKrylovSolver.cpp.
*** Process: 1


*** DOLFIN version: 1.3.0
*** Git changeset: unknown
*** -------------------------------------------------------------------------

I am using Ubuntu 13.04.
How can I fix this?

Thank you very much

Lisa

asked May 29, 2014 by lisa_grementieri FEniCS Novice (240 points)
edited May 29, 2014 by lisa_grementieri

Please format your code properly (indentation) so that others can read it.

2 Answers

+3 votes

Hi, the ilu preconditioner in your solve method only works in serial. For parallel ILU you
should set the preconditioner to hypre_euclid.
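
For example, a minimal change to the solve call in the script above (everything else unchanged) would be:

    # Hypre's Euclid provides a parallel ILU; PETSc's plain "ilu" is serial only
    solve(A, ww, b, "gmres", "hypre_euclid")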

answered May 29, 2014 by MiroK FEniCS Expert (80,920 points)
0 votes

Thank you MiroK, I followed your advice, but now I get the following error:

utentilamc@DICAM062014X:~/Desktop/Simulazioni Lisa/prague_30/python_script$ mpirun -n 16 python stokes-taylor-hood_prague30.py 

    Process 0: Solving linear system of size 222486 x 222486 (PETSc Krylov solver).

     ============= error stack trace ====================
     [4] ERROR: zero diagonal in local row 4
        iluk_seq  file= ilu_seq.c  line= 214

    [4] called from: factor_private  file= Euclid_dh.c  line= 541
    [4] called from: Euclid_dhSetup  file= Euclid_dh.c  line= 250
    [4] called from: HYPRE_EuclidSetup  file= HYPRE_parcsr_Euclid.c  line= 280

    --------------------------------------------------------------------------
   MPI_ABORT was invoked on rank 4 in communicator MPI COMMUNICATOR 5 DUP FROM 3 with errorcode -1.

   NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
   You may or may not see output from other processes, depending on
   exactly when Open MPI kills them.
   --------------------------------------------------------------------------

   [Ranks 7, 13, 12, 8, 15, 14, 1, 3, 11, 0, 2 and 6 then print the same
   "ERROR: zero diagonal in local row ..." stack trace
   (iluk_seq -> factor_private -> Euclid_dhSetup -> HYPRE_EuclidSetup),
   differing only in the reported local row.]
   [DICAM062014X:13564] 12 more processes have sent help message help-mpi-api.txt / mpi-abort
   [DICAM062014X:13564] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
   mpirun: killing job...

   --------------------------------------------------------------------------
   mpirun noticed that process rank 0 with PID 13565 on node DICAM062014X exited on signal 0 (Unknown signal 0).
   --------------------------------------------------------------------------
   16 total processes killed (some possibly by mpirun during cleanup)
   mpirun: clean termination accomplished

=====================================================================

Since I'm not very familiar with MPI, I can't fix this myself.
Is this not the right solver for this Stokes problem?
Which other solvers can I use when running in parallel with MPI?

Thank you very much

Lisa

answered May 29, 2014 by lisa_grementieri FEniCS Novice (240 points)
edited May 29, 2014 by lisa_grementieri

The Stokes iterative demo uses a different type of preconditioner. In general, to see all Krylov solvers and preconditioners available with your linear algebra backend, use list_krylov_solver_methods() and list_krylov_solver_preconditioners().
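
The "zero diagonal" messages are typical for saddle-point systems such as Stokes: the pressure block of the assembled matrix has zero diagonal entries, so ILU-type factorizations (ilu, hypre_euclid) can break down on them. As a quick check of what your PETSc backend offers, something along these lines should work:

    from dolfin import *

    # Print the Krylov methods and preconditioners available with the
    # current linear algebra backend (PETSc here)
    list_krylov_solver_methods()
    list_krylov_solver_preconditioners()

The stokes-iterative demo, for instance, assembles a separate preconditioner form (velocity Laplacian plus pressure mass matrix) and passes it to a Krylov solver such as MINRES or TFQMR with AMG, which avoids the zero-diagonal problem altogether.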

...