We could do an optimization with constraints, i.e., introduce the mean of u by introducing an additional globally constant Lagrange multiplier m. I did not test this or check if this is well-posed and all, but check this out and see if it helps...
from dolfin import *
mesh = UnitSquareMesh(8, 8)
f = Constant(1)
V = FunctionSpace(mesh, 'Lagrange', 1)
R = FunctionSpace(mesh, 'R', 0)
X = V*R
u, m = TrialFunctions(X)
v, r = TestFunctions(X)
F = dot(grad(u),grad(v))*dx + m*v*dx + (u-m)*r*dx - f*v*dx
x = Function(X)
bc = DirichletBC(X.sub(0), Constant(0), 'on_boundary')
solve(lhs(F) == rhs(F), x, bc)
uh, mh = x.split()
plot(uh)
plot(mh)
interactive()
Testing with (v,r) = (0,1) you see that if you obtain a solution, then m is in fact the mean of u (you may need to be careful for non-unit domains). The first equation (test with (v,0)) for any v in H^1_0 then results from deriving the usual weak form and replacing the mean of u with m.