This is a read only copy of the old FEniCS QA forum. Please visit the new QA forum to ask questions

Is there a way to configure the just-in-time compiler to save memory?

+1 vote

The question is motivated by a problem I keep hitting: I have
repeatedly run out of memory on an 8 GB laptop when
trying to compile forms containing Expressions using polynomials
of 4th order.
This is rather limiting for me.

asked Jun 28, 2013 by moritzbraun FEniCS User (1,390 points)

Please provide complete but short code so that others can test it. You should also state which version of FEniCS and which OS you are using.

Dear Jan

An example is the following code, which
runs out of memory on my
Ubuntu 13.04 laptop with FEniCS 1.2.0 and 8 GB of RAM

regards

Moritz

    #!/usr/bin/env python

    import sys
    import numpy as np
    from dolfin import *
    from time import time
    Z1=1.0

    # Define mesh, function space

    parameters["linear_algebra_backend"]="uBLAS"
    parameters["form_compiler"]["quadrature_degree"]=20
    no=4
    nel = 6
    X = 6.0
    mesh = BoxMesh(-1,-1,-1, 1,1,1,nel,nel,nel)
    x=mesh.coordinates()[:,0]
    y=mesh.coordinates()[:,1]
    z=mesh.coordinates()[:,2]
    def trans(x):
        return np.sign(x)*abs(x)**2*X
    def denser(x,y,z):
        return [trans(x),trans(y),trans(z)]
    x1,y1,z1=denser(x,y,z)
    xyz=np.array([x1,y1,z1]).transpose()
    mesh.coordinates()[:]=xyz
    V = FunctionSpace(mesh, "CG", no)
    def u0_boundary(x, on_boundary):
        return on_boundary
    bc = DirichletBC(V, Expression('0'), u0_boundary)

    # Define basis and bilinear form

    u = TrialFunction(V)
    v = TestFunction(V)

    # coulomb potential with singularity removed

    cPot= Expression('-2.0*Z/sqrt(x[0]*x[0]+x[1]*x[1]+x[2]*x[2]+1e-16)-(4*Z*Z-4*Z/sqrt(x[0]*x[0]+x[1]*x[1]+x[2]*x[2]+1e-16))*exp(-2.0*Z*sqrt(x[0]*x[0]+x[1]*x[1]+x[2]*x[2]))/(1+exp(-2*Z*sqrt(x[0]*x[0]+x[1]*x[1]+x[2]*x[2])))',Z=Z1)

    # weight function = cusp-factor^2

    wf=Expression('pow((1+exp(-2*Z*sqrt(x[0]*x[0]+x[1]*x[1]+x[2]*x[2]))),2)',Z=Z1)

    # form for transformed Hamiltonian without weight function

    a = dot(wf*grad(u), grad(v))*dx+wf*u*cPot*v*dx

    # form for massmatrix with weight function

    m=wf*u*v*dx

    # form for massmatrix without weight function

    q=u*v*dx

    # A,M,Q and K are matrices corresponding to forms a,m,q,k

    A=Matrix()
    M=Matrix()
    Q=Matrix()

    #

    b=v*dx
    t1=time()
    A,_=assemble_system(a,b,bc)
    M,_=assemble_system(m,b,bc)
    Q,_=assemble_system(q,b,bc)
    bc.zero(M)
    bc.zero(Q)
    t2=time()

    print "Assembly:",t2-t1,"sec"

Please, edit your comment using markdown syntax (indent the code snippet by four spaces).

1 Answer

+3 votes
 
Best answer
  • It's not much of a surprise: quadrature degree 20 in 3D needs 1331 quadrature points!

  • For a compiled expression you should specify at least the degree

    Expression(code, degree=3)
    

    (or whatever degree you need) or the element. The just-in-time compiler has no algorithm to guess the degree of the C++ code supplied to Expression.

  • Specifying

    parameters["form_compiler"]["quadrature_degree"]
    

    should not be needed. FFC estimates the degree of each form using algorithms in UFL (although there are some bugs). It is better to specify the degree for each form separately, using the syntax

    foo*dx(None, form_compiler_parameters={'quadrature_degree': 14})
    

    (or an int specifying the domain instead of None)

  • Regarding your other post here, note that neither of your expressions is exactly integrable using Gauss quadrature. I don't know with what degree you intended to integrate wf and cPot, but 20 might be quite an overkill. For example, degree 20 for the form a assumes that wf*cPot is of degree 12. (In addition, wf and cPot are interpolated to degree 1 before integration because you didn't set the degree of the C++ expressions, as said above.)

  • You can specify some form compiler parameters to optimize for memory usage.

I set cPot = Expression(..., degree=3) and wf = Expression(..., degree=3). Then, with

    parameters["form_compiler"]["quadrature_degree"] = 14

FFC did its job very promptly and GCC needed 4 GB; output: Assembly: 277.316293955 sec; output on the 2nd run: Assembly: 8.01471805573 sec. With

    parameters["form_compiler"]["quadrature_degree"] = 14
    parameters["form_compiler"]["optimize"] = True
    parameters["form_compiler"]["representation"] = 'quadrature'

FFC took longer, but GCC finished very quickly with minimal memory; on the other hand, the actual assembly is much slower than with the tensor representation; output: Assembly: 99.2105529308 sec; output on the 2nd run: Assembly: 77.5997550488 sec. With

    parameters["form_compiler"]["quadrature_degree"] = 14
    parameters["form_compiler"]["representation"] = 'quadrature'

the result is practically the same; output: Assembly: 77.1860561371 sec; output on the 2nd run: Assembly: 55.6270360947 sec.
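
The numbers in this answer can be checked with a few lines of plain Python (no FEniCS needed). This is only a sketch under two assumptions that are not stated in the thread: a Gauss-type rule with n points per direction is exact up to degree 2n - 1, and the cells are affine, so each gradient lowers the polynomial degree by one. The helper names are made up for illustration.

```python
# Assumption: an n-point Gauss-type rule per direction is exact for
# polynomials up to degree 2*n - 1, and a 3D rule uses n points in each
# of the three directions.

def points_per_direction(degree):
    """Smallest n such that 2*n - 1 >= degree."""
    return (degree + 2) // 2

def quadrature_points(degree, dim=3):
    """Total number of points of the tensor-style rule in `dim` dimensions."""
    return points_per_direction(degree) ** dim

def integrand_degree(*factor_degrees):
    """The degree of a product of polynomials is the sum of the degrees."""
    return sum(factor_degrees)

no = 4     # degree of the CG trial and test functions in the question
expr = 3   # degree given to wf and cPot in this answer

# wf*u*cPot*v: every factor contributes its full degree
mass_like = integrand_degree(expr, no, expr, no)
# dot(wf*grad(u), grad(v)): each gradient has degree no - 1 (affine cells assumed)
stiffness_like = integrand_degree(expr, no - 1, no - 1)

for degree in (stiffness_like, mass_like, 20):
    print("degree %2d -> %4d points" % (degree, quadrature_points(degree)))
```

With expressions of degree 3, the term wf*u*cPot*v comes out at degree 14, which coincides with the quadrature degree used in the timings above, and it needs 512 points per cell under these assumptions instead of 1331.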

answered Jun 28, 2013 by Jan Blechta FEniCS Expert (51,420 points)
selected Jul 11, 2013 by Jan Blechta

Dear Jan

Thanks for your answer
However, making the degree only 3 will make the whole calculation
not accurate enough.
I still don't understand: what is the reason for
my code running out of memory on an 8 GB machine?

regards

Moritz

Then use the quadrature representation, with the understanding that assembly is much slower.
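
To illustrate the trade-off, here is a toy sketch in plain Python, not FFC's generated code: the same 1D P1 mass matrix computed once from a precomputed reference tensor and once by looping over quadrature points at assembly time. The element, the function names and the two-point Gauss rule are chosen only for illustration.

```python
import math

# Tensor style: the integrals of phi_i*phi_j over the unit interval are
# worked out ahead of time and stored; assembly only rescales them.
REFERENCE_MASS = [[1.0 / 3.0, 1.0 / 6.0],
                  [1.0 / 6.0, 1.0 / 3.0]]

def mass_tensor(h):
    """Scale the precomputed reference tensor by the cell length h."""
    return [[h * entry for entry in row] for row in REFERENCE_MASS]

# Quadrature style: nothing is precomputed; every cell loops over the
# points. Two-point Gauss rule on [0, 1], exact up to degree 3.
_OFFSET = 0.5 / math.sqrt(3.0)
GAUSS_POINTS = [0.5 - _OFFSET, 0.5 + _OFFSET]
GAUSS_WEIGHTS = [0.5, 0.5]

def basis(x):
    """P1 basis functions on the unit interval evaluated at x."""
    return [1.0 - x, x]

def mass_quadrature(h):
    """Accumulate w * phi_i * phi_j * h over all quadrature points."""
    A = [[0.0, 0.0], [0.0, 0.0]]
    for x, w in zip(GAUSS_POINTS, GAUSS_WEIGHTS):
        phi = basis(x)
        for i in range(2):
            for j in range(2):
                A[i][j] += w * phi[i] * phi[j] * h
    return A
```

Both functions return the same matrix. The first needs a table whose size grows with the number of basis functions and coefficients in the form, while the second does work proportional to the number of quadrature points for every cell.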

I still don't understand: what is the reason for
my code running out of memory on an 8 GB machine?

Please look at the generated code in ~/.instant/cache/foo/ffc_form_foo.h corresponding to the form a. You will understand.
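
One way to find the relevant file is to list the largest generated headers in the cache. The following is a small sketch using only the Python standard library. The cache location is the one quoted above; the helper name largest_files and the assumption that the interesting files end in .h are made up for illustration.

```python
import os

def largest_files(root, suffix=".h", count=5):
    """Return (size_in_bytes, path) pairs for the largest files under root."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                path = os.path.join(dirpath, name)
                found.append((os.path.getsize(path), path))
    found.sort(reverse=True)  # biggest first
    return found[:count]

if __name__ == "__main__":
    cache = os.path.expanduser("~/.instant/cache")  # location quoted above
    for size, path in largest_files(cache):
        print("%10d bytes  %s" % (size, path))
```

A header of many megabytes is a strong hint as to why the C++ compiler needs so much memory.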

Maybe you could also pass some flags to the C++ compiler to make it work with less memory.

parameters['form_compiler']['cpp_optimize'] = True
parameters['form_compiler']['cpp_optimize_flags'] = '-foo'
...