[firedrake] Installing Firedrake on an HPC machine

Justin Chang jychang48 at gmail.com
Thu Aug 6 15:26:45 BST 2015


Lawrence,

> mpiexec -n 2 python no-fork.py

2
2

> mpiexec -n 2 python fork-before.py

child exiting
child exiting
2
2

> mpiexec -n 2 python fork-after.py

2
2
child exiting
child exiting

--------------------------------------------------------------------------
An MPI process has executed an operation involving a call to the
"fork()" system call to create a child process.  Open MPI is currently
operating in a condition that could result in memory corruption or
other system errors; your MPI job may hang, crash, or produce silent
data corruption.  The use of fork() (or system() or other calls that
create child processes) is strongly discouraged.

The process that invoked fork was:

  Local host:          compute-0-0 (PID 43057)
  MPI_COMM_WORLD rank: 0

If you are *absolutely sure* that your application will successfully
and correctly survive a call to fork(), you may disable this warning
by setting the mpi_warn_on_fork MCA parameter to 0.
--------------------------------------------------------------------------
[compute-0-0.local:43055] 1 more process has sent help message
help-mpi-runtime.txt / mpi_init:warn-fork
[compute-0-0.local:43055] Set MCA parameter "orte_base_help_aggregate" to 0
to see all help / error messages
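
[Editor's note: as the warning itself says, it can be silenced via the
mpi_warn_on_fork MCA parameter if you are certain fork() is safe in your
setup. A hedged example, assuming a stock Open MPI mpiexec; this only
suppresses the message, it does not make fork() after MPI_Init safe:]

```shell
# Pass the MCA parameter on the command line:
mpiexec --mca mpi_warn_on_fork 0 -n 2 python fork-after.py

# Or equivalently via the environment:
export OMPI_MCA_mpi_warn_on_fork=0
```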

> mpiexec -n 2 python closer-test.py

In parent True
In parent True
In child False
In child False

> mpiexec -n 2 python fork-pyop2.py

2
2


The last three examples (pyop2.py, import-firedrake.py, and
firedrake-test.py) did not run: each fails with "ImportError: cannot import
name op2". And now all of my Firedrake programs hit this exact error, which
is confusing.
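
[Editor's note: one likely cause, worth flagging here: the test script above
was itself named pyop2.py. Running it leaves a pyop2.py (and a compiled
pyop2.pyc) in the working directory, which shadows the installed pyop2
package, so every later "from pyop2 import op2" in that directory fails;
deleting or renaming that file should fix it. A minimal Python 3
reproduction of the shadowing, with hypothetical module names standing in
for pyop2/op2:]

```python
import os
import subprocess
import sys
import tempfile

# Build a fake "installed" package mypkg with a submodule op2
# (hypothetical names standing in for pyop2/op2).
d = tempfile.mkdtemp()
site = os.path.join(d, "site")
os.makedirs(os.path.join(site, "mypkg"))
open(os.path.join(site, "mypkg", "__init__.py"), "w").close()
open(os.path.join(site, "mypkg", "op2.py"), "w").close()

# A stray script named mypkg.py in the working directory, like the
# pyop2.py test file created above.
work = os.path.join(d, "work")
os.makedirs(work)
open(os.path.join(work, "mypkg.py"), "w").close()

# The working directory precedes PYTHONPATH on sys.path, so the stray
# mypkg.py wins over the installed package.
env = dict(os.environ, PYTHONPATH=site)
r = subprocess.run([sys.executable, "-c", "from mypkg import op2"],
                   cwd=work, env=env, capture_output=True, text=True)
print("cannot import name" in r.stderr)  # the local file shadows the package
```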


Thanks,
Justin

On Thu, Aug 6, 2015 at 4:18 AM, Lawrence Mitchell <
lawrence.mitchell at imperial.ac.uk> wrote:

>
> Hi Justin,
>
> On 06/08/15 06:16, Justin Chang wrote:
> > Hi everyone,
> >
> > I have installed firedrake on my university's HPC machine, and
> > whenever i attempt to run any Firedrake program, I get this error:
> >
> > --------------------------------------------------------------------------
> > An MPI process has executed an operation involving a call to the
> > "fork()" system call to create a child process.  Open MPI is currently
> > operating in a condition that could result in memory corruption or
> > other system errors; your MPI job may hang, crash, or produce silent
> > data corruption.  The use of fork() (or system() or other calls that
> > create child processes) is strongly discouraged.
> >
> > The process that invoked fork was:
> >
> >   Local host:          compute-0-0 (PID 28214)
> >   MPI_COMM_WORLD rank: 0
> >
> > If you are *absolutely sure* that your application will successfully
> > and correctly survive a call to fork(), you may disable this warning
> > by setting the mpi_warn_on_fork MCA parameter to 0.
>
> So I recently made some changes to PyOP2 to make us more robust in the
> face of OpenMPI disallowing fork(), which we need in order to invoke
> compilers when JIT-compiling code.  To do this, we fork a single process
> /before/ MPI is initialized (which is safe, because OpenMPI doesn't see
> it); this child process then performs all subsequent forks.  Naturally,
> this will fail if MPI is already initialized by the time we come to fork.
>
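
[Editor's note: the prefork pattern described above can be sketched as
follows. This is a hypothetical, minimal Python 3 illustration (fork a
helper before MPI is initialized; the helper then performs forks on request
over a socketpair), not PyOP2's actual code:]

```python
import os
import socket

# Pair of connected sockets for parent <-> helper requests.
parent_sock, helper_sock = socket.socketpair()

pid = os.fork()  # done *before* any MPI initialization
if pid == 0:
    # Helper: never touches MPI, so its forks are safe.
    parent_sock.close()
    while True:
        req = helper_sock.recv(1)
        if req != b"f":           # empty or b"q": shut down
            os._exit(0)
        if os.fork() == 0:        # grandchild: would exec a compiler here
            os._exit(0)
        os.wait()                 # reap the grandchild
        helper_sock.send(b"d")    # report completion
else:
    helper_sock.close()
    # ... MPI would be initialized here, after the helper already exists ...
    parent_sock.send(b"f")        # ask the helper to fork once
    assert parent_sock.recv(1) == b"d"
    parent_sock.send(b"q")        # tell the helper to exit
    os.waitpid(pid, 0)
    print("fork done by helper")
```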
> So possibly the programs you're running are initialising MPI?
>
> Let's check some things.
>
> Let's first try something that doesn't invoke fork at all:
>
> cat > no-fork.py << EOF
> from mpi4py import MPI
> print MPI.COMM_WORLD.size
> EOF
> mpiexec -n 2 python no-fork.py


> Now something that does call fork, but /before/ initialising MPI
>
> cat > fork-before.py << EOF
> import os
> def my_fork():
>     ret = os.fork()
>     if ret == 0:
>         print 'child exiting'
>         os._exit(0)
>     else:
>         pass
> my_fork()
> from mpi4py import MPI
> print MPI.COMM_WORLD.size
> EOF
> mpiexec -n 2 python fork-before.py
>
> I hope this one works!
>
> Now fork afterwards (which I expect to fail with the error message above):
>
> cat > fork-after.py << EOF
> import os
> def my_fork():
>     ret = os.fork()
>     if ret == 0:
>         print 'child exiting'
>         os._exit(0)
>     else:
>         pass
> from mpi4py import MPI
> print MPI.COMM_WORLD.size
> my_fork()
> EOF
> mpiexec -n 2 python fork-after.py
>
> Now something more like how PyOP2/Firedrake does things:
>
> cat > closer-test.py << EOF
> import os
> import socket
>
> def child(sock):
>     val = sock.recv(1)
>     import mpi4py.rc
>     mpi4py.rc.initialize = False
>     from mpi4py import MPI
>     print 'In child', MPI.Is_initialized()
>     os._exit(0)
>
> def parent(sock):
>     from mpi4py import MPI
>     print 'In parent', MPI.Is_initialized()
>     sock.send("1")
>
> a, b = socket.socketpair()
> ret = os.fork()
>
> if ret == 0:
>     a.close()
>     child(b)
> else:
>     b.close()
>     parent(a)
> EOF
> mpiexec -n 2 python closer-test.py
>
> Now let's try doing it the way PyOP2/firedrake does this:
>
> cat > fork-pyop2.py << EOF
> from pyop2_utils import enable_mpi_prefork
> enable_mpi_prefork()
> from mpi4py import MPI
> print MPI.COMM_WORLD.size
> EOF
> mpiexec -n 2 python fork-pyop2.py
>
> I hope this should work, because it's effectively just doing what
> fork-before.py does.
>
> Now let's just run pyop2 on its own:
>
> cat > pyop2.py << EOF
> from pyop2 import op2
> op2.init()
> EOF
> mpiexec -n 2 python pyop2.py
>
> And then firedrake:
>
> cat > import-firedrake.py << EOF
> from firedrake import *
> EOF
> mpiexec -n 2 python import-firedrake.py
>
> And finally a short test in firedrake:
>
> cat > firedrake-test.py << EOF
> from firedrake import *
> mesh = UnitSquareMesh(3, 3)
> print assemble(Constant(1)*dx(domain=mesh))
> EOF
> mpiexec -n 2 python firedrake-test.py
>
>
> Hopefully these tests will allow us to better see where things are
> going wrong.
>
> Cheers,
>
> Lawrence
>
> _______________________________________________
> firedrake mailing list
> firedrake at imperial.ac.uk
> https://mailman.ic.ac.uk/mailman/listinfo/firedrake
>