The code I'm writing contains:
problem = NonlinearVariationalProblem(L, u_, bcs, J)
solver = NonlinearVariationalSolver(problem)
while t < T:
    t += dt
    u0_.vector()[:] = u_.vector()
    solver.solve()
and I'm running it in parallel with 4 processes:
$ mpiexec -n 4 python2 rbc.py
...
Process 0: Solving linear system of size 124565 x 124565 (PETSc LU solver, mumps)
...
While the code does some things in parallel, such as applying boundary conditions, only one process (process 0 in the output above) does the solving.
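For reference, 124565 is the size of the whole system, so if the solve were shared out I would expect each of the 4 processes to own roughly a quarter of the rows. Below is a minimal sketch of that arithmetic. It assumes an even contiguous split of rows across ranks; the helper name `even_split` is hypothetical and not part of DOLFIN or PETSc, and the actual mesh-based partition will generally give slightly different counts.

```python
def even_split(n_rows, n_procs):
    """Rows owned by each rank under an even contiguous split (an assumption)."""
    base, extra = divmod(n_rows, n_procs)
    # The first `extra` ranks each take one additional row.
    return [base + (1 if rank < extra else 0) for rank in range(n_procs)]


# Global system size from the log line, and the 4 processes from `mpiexec -n 4`.
local_rows = even_split(124565, 4)
for rank, rows in enumerate(local_rows):
    print("Process %d: %d rows" % (rank, rows))
```

Comparing these rough numbers with the local vector size each process actually reports would show whether the system is really held on a single process.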