This is a read-only copy of the old FEniCS QA forum. Please visit the new QA forum to ask questions.

Parallel vector operations in C++ (Something akin to CellIterator?)

+1 vote

I'm trying to squeeze some performance out of my integration, and some low-hanging fruit appears to be a few vector operations (based on profiling).

For derived vectors, is there a way to iterate over DOFs to perform the calculation? Currently I have been defining an Expression and interpolating it onto the Function, but this leads to a significant amount of time spent evaluating basis functions.

For example, if I were to find internal energy, e, from Cv*T, I would hope to do

e.vector() = Cv.vector()*T.vector()

however, Cv is not a constant scalar so I have been using an expression. If it were all local I would just do a loop on the vector size, but I would like to keep the parallelism. Perhaps something like the CellIterator?

for (CellIterator c(mesh); !c.end(); ++c){
}

I suspect creating a Functional form would help, however, some of the vectors depend on external libraries that wouldn't fit into a UFL form.

Overall, I am looking to prevent e = Cv*T from consuming 10% of the CPU time (due to evaluating Cv and T on their basis functions).

Thanks!

asked May 28, 2014 by Charles FEniCS User (4,220 points)

Hi,

Vector objects support pointwise multiplication with another vector, so

e.vector() = c.vector()*t.vector()

can be achieved as

E = e.vector(); C = c.vector(); T = t.vector()
E.zero()
E.axpy(1, C)
E *= T
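In plain NumPy terms (a sketch only; the arrays stand in for the process-local entries of the dolfin vectors), the zero/axpy/in-place-multiply sequence above computes E = C * T elementwise:

```python
import numpy as np

# Stand-ins for the local parts of e.vector(), c.vector(), t.vector()
C = np.array([1.0, 2.0, 3.0])
T = np.array([10.0, 20.0, 30.0])
E = np.empty_like(C)

E[:] = 0.0    # E.zero()
E += 1.0 * C  # E.axpy(1, C): E <- E + 1*C
E *= T        # pointwise multiplication, as with dolfin's E *= T

print(E)  # [10. 40. 90.]
```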

Hello MiroK,

Definitely, and the pointwise multiplication is very useful. The issue is when I can't simplify it down to basic operators +-/*. Say, for example, rho=rho(p,T), where rho(p,T) is a library function. In these cases I use an Expression, but the cost of evaluating the basis functions dominates (in that case, evaluating p,T is more expensive than the library function).

Thanks!

To follow up a little, it would be nice to be able to loop through the vector and assign values in parallel.

Does this help? I know you're on C++, but the 'translation' is straightforward.

from dolfin import *
import numpy as np

mesh = RectangleMesh(-1, -1, 1, 1, 100, 100)
V = FunctionSpace(mesh, 'CG', 1)

u = Function(V)
U = u.vector()

dofmap = V.dofmap()
dof_x = dofmap.tabulate_all_coordinates(mesh).reshape((-1, 2))
first_dof, last_dof = dofmap.ownership_range()  # U.local_size()

rank = MPI.process_number()
new_values = np.zeros(last_dof - first_dof)
for i in range(len(new_values)):
    x, y = dof_x[i]
    new_values[i] = abs(x)**(rank+1) + abs(y)**(rank+1)

U.set_local(new_values)
U.apply('insert')

plot(u, title=str(rank))
interactive()
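The same per-dof loop also covers the rho = rho(p, T) case raised in the comments: instead of a coordinate formula, call the external library once per locally owned dof and then set_local the result. A NumPy-only sketch (rho_library is a hypothetical stand-in for the external property function, here an ideal-gas-like placeholder):

```python
import numpy as np

def rho_library(p, T):
    # Hypothetical external property function (ideal-gas placeholder)
    R = 287.0
    return p / (R * T)

# Stand-ins for the locally owned values of p and T
p_local = np.array([101325.0, 90000.0])
T_local = np.array([300.0, 280.0])

new_values = np.zeros(len(p_local))
for i in range(len(new_values)):
    new_values[i] = rho_library(p_local[i], T_local[i])

# In dolfin this would be followed by
#   RHO.set_local(new_values); RHO.apply('insert')
print(new_values)
```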

Thank you for that response MiroK.

I think I have found a simpler solution if I don't need any information about x_i, which is fortunately true for thermal properties. I ensured that all of my scalar functions share the same function space, so they get the same mapping/localization, and then simply did:

std::vector<double> T_val, e_val, cv_val;
T->vector()->get_local(T_val);
e->vector()->get_local(e_val);
cv->vector()->get_local(cv_val); 

for(size_t i=0; i < e_val.size(); ++i)
{
    e_val[i] = cv_val[i]*T_val[i];
}

e->vector()->set_local(e_val);
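A Python rendering of the same get_local/loop/set_local round trip (a sketch: NumPy arrays stand in for the local vector data, and the get_local/set_local calls would bracket the loop in dolfin):

```python
import numpy as np

# Stand-ins for T->vector()->get_local(T_val) and cv->vector()->get_local(cv_val)
T_val = np.array([300.0, 310.0, 320.0])
cv_val = np.array([718.0, 718.0, 718.0])
e_val = np.zeros_like(T_val)

# Per-entry loop over the process-local values, as in the C++ snippet
for i in range(len(e_val)):
    e_val[i] = cv_val[i] * T_val[i]

# e->vector()->set_local(e_val) would push the values back
print(e_val)  # [215400. 222580. 229760.]
```

In NumPy the vectorized form `e_val = cv_val * T_val` gives the same result without the explicit loop; the loop form is shown because it matches the C++ code and generalizes to per-entry library calls.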

1 Answer

+1 vote
 
Best answer

I found the following works rather well (and is absurdly fast in comparison to interpolation):

std::vector<double> T_val, e_val, cv_val;
T->vector()->get_local(T_val);
e->vector()->get_local(e_val);
cv->vector()->get_local(cv_val); 

for(size_t i=0; i < e_val.size(); ++i)
{
    e_val[i] = cv_val[i]*T_val[i];
}

e->vector()->set_local(e_val);
answered Jun 2, 2014 by Charles FEniCS User (4,220 points)
edited Jun 2, 2014 by Charles

Have you tried this with axpy + pointwise vector multiplication? That way there would be no need for a loop.

Yes, and it works well (I do it for u = rhou/rho, for example). This is a short example of what I am doing, where I am passing values into an external library.

...