This is a read-only copy of the old FEniCS QA forum. Please visit the new QA forum to ask questions.

Parallel vector operations in C++ (Something akin to CellIterator?)

+1 vote

I'm trying to squeeze some performance out of my integration, and some low-hanging fruit appears to be a few vector operations (based on profiling).

For derived vectors, is there a way to iterate over DOFs to perform the calculation? Currently I have been defining an Expression and interpolating it onto the Function, but this leads to a significant amount of time spent evaluating basis functions.

For example, if I were to find internal energy, e, from Cv*T, I would hope to do

e.vector() = Cv.vector()*T.vector()

however, Cv is not a constant scalar so I have been using an expression. If it were all local I would just do a loop on the vector size, but I would like to keep the parallelism. Perhaps something like the CellIterator?

for (CellIterator c(mesh); !c.end(); ++c){
}

I suspect creating a Functional form would help, however, some of the vectors depend on external libraries that wouldn't fit into a UFL form.

Overall, I am looking to prevent e = Cv*T from consuming 10% of the CPU time (due to evaluating Cv and T on their basis functions).

Thanks!

asked May 28, 2014 by Charles FEniCS User (4,220 points)

Hi,

Vector objects support pointwise multiplication with another vector, so

e.vector() = c.vector()*t.vector()

can be achieved as

E = e.vector(); C = c.vector(); T = t.vector()
E.zero()
E.axpy(1, C)
E *= T
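In plain NumPy terms (a sketch only; the arrays stand in for the process-local entries of the dolfin vectors), the zero/axpy/in-place-multiply sequence above computes E = C * T elementwise:

```python
import numpy as np

# Stand-ins for the local parts of e.vector(), c.vector(), t.vector()
C = np.array([1.0, 2.0, 3.0])
T = np.array([10.0, 20.0, 30.0])
E = np.empty_like(C)

E[:] = 0.0    # E.zero()
E += 1.0 * C  # E.axpy(1, C): E <- E + 1*C
E *= T        # pointwise multiplication, as with dolfin's E *= T

print(E)  # [10. 40. 90.]
```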

Hello MiroK,

Definitely, and the pointwise multiplication is very useful. The issue is when I can't simplify it down to basic operators +-/*. Say, for example, rho=rho(p,T), where rho(p,T) is a library function. In these cases I use an Expression, but the cost of evaluating the basis functions dominates (in that case, evaluating p,T is more expensive than the library function).

Thanks!

To follow up a little, it would be nice to be able to loop through the vector and assign values in parallel.

Does this help? I know you're on C++, but the 'translation' is straightforward.

from dolfin import *
import numpy as np

mesh = RectangleMesh(-1, -1, 1, 1, 100, 100)
V = FunctionSpace(mesh, 'CG', 1)

u = Function(V)
U = u.vector()

dofmap = V.dofmap()
dof_x = dofmap.tabulate_all_coordinates(mesh).reshape((-1, 2))
first_dof, last_dof = dofmap.ownership_range()  # U.local_size()

rank = MPI.process_number()
new_values = np.zeros(last_dof - first_dof)
for i in range(len(new_values)):
    x, y = dof_x[i]
    new_values[i] = abs(x)**(rank+1) + abs(y)**(rank+1)

U.set_local(new_values)
U.apply('insert')

plot(u, title=str(rank))
interactive()
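The same per-dof loop also covers the rho = rho(p, T) case raised in the comments: instead of a coordinate formula, call the external library once per locally owned dof and then set_local the result. A NumPy-only sketch (rho_library is a hypothetical stand-in for the external property function, here an ideal-gas-like placeholder):

```python
import numpy as np

def rho_library(p, T):
    # Hypothetical external property function (ideal-gas placeholder)
    R = 287.0
    return p / (R * T)

# Stand-ins for the locally owned values of p and T
p_local = np.array([101325.0, 90000.0])
T_local = np.array([300.0, 280.0])

new_values = np.zeros(len(p_local))
for i in range(len(new_values)):
    new_values[i] = rho_library(p_local[i], T_local[i])

# In dolfin this would be followed by
#   RHO.set_local(new_values); RHO.apply('insert')
print(new_values)
```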

Thank you for that response MiroK.

I think I have found a simpler solution if I don't need any information about x_i, which is fortunately true for thermal properties. I ensured that all of my scalar functions share the same function space, so they get the same mapping/localization, and then simply did:

std::vector<double> T_val, e_val, cv_val;
T->vector()->get_local(T_val);
e->vector()->get_local(e_val);
cv->vector()->get_local(cv_val); 

for(size_t i=0; i < e_val.size(); ++i)
{
    e_val[i] = cv_val[i]*T_val[i];
}

e->vector()->set_local(e_val);
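A Python rendering of the same get_local/loop/set_local round trip (a sketch: NumPy arrays stand in for the local vector data, and the get_local/set_local calls would bracket the loop in dolfin):

```python
import numpy as np

# Stand-ins for T->vector()->get_local(T_val) and cv->vector()->get_local(cv_val)
T_val = np.array([300.0, 310.0, 320.0])
cv_val = np.array([718.0, 718.0, 718.0])
e_val = np.zeros_like(T_val)

# Per-entry loop over the process-local values, as in the C++ snippet
for i in range(len(e_val)):
    e_val[i] = cv_val[i] * T_val[i]

# e->vector()->set_local(e_val) would push the values back
print(e_val)  # [215400. 222580. 229760.]
```

In NumPy the vectorized form `e_val = cv_val * T_val` gives the same result without the explicit loop; the loop form is shown because it matches the C++ code and generalizes to per-entry library calls.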

1 Answer

+1 vote
 
Best answer

I found the following works rather well (and is absurdly fast in comparison to interpolation):

std::vector<double> T_val, e_val, cv_val;
T->vector()->get_local(T_val);
e->vector()->get_local(e_val);
cv->vector()->get_local(cv_val); 

for(size_t i=0; i < e_val.size(); ++i)
{
    e_val[i] = cv_val[i]*T_val[i];
}

e->vector()->set_local(e_val);
answered Jun 2, 2014 by Charles FEniCS User (4,220 points)
edited Jun 2, 2014 by Charles

Have you tried this with axpy + pointwise vector multiplication? That way there would be no need for a loop.

Yes, and it works well (I do it for u = rhou/rho, for example). This is a short example of what I am doing, where I am passing values into an external library.

...