[firedrake] Hardware counters for Firedrake

Justin Chang jychang48 at gmail.com
Thu Jul 16 14:32:45 BST 2015


Lawrence,

I have attached the code I am working with. It's basically the one you sent
me a few weeks ago, but I am only working with selfp. Also attached are the
log files for 1, 2, and 4 processors on our local HPC machine (Intel Xeon
E5-2680v2, 2.8 GHz).

1) I wrapped the PyPAPI calls around solver.solve(), and I believe this is
doing what I want. Right now I am estimating the arithmetic intensity by
recording the flops, loads, and stores (a sketch of what I am doing is after
this list). When I compare the measured flop count with the PETSc manual
flop count, PAPI seems to overcount by a factor of 2 (which I suppose is
expected on a newer Intel machine). Anyway, in terms of computing the FLOPS
and AI this is what I want; I just wanted to make sure these counts don't
include the DMPlex initialization and so on, because:

2) According to the attached log summaries, DMPlexDistribute and Mesh
Migration still consume a significant portion of the time. By significant I
mean that the %T does not decrease as I increase the number of processors. I
remember Michael Lange's presentations (from PETSc-20 and the webinar)
mentioning something about this?

3) Bonus question: how do I also use PAPI_flops(&real_time, &proc_time,
&flpins, &mflops)? I see there is the flops() function, but in my limited
PAPI experience I seem to have issues whenever I try to put both that and
PAPI_start_counters into the same program, though I could be wrong.
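
For reference, here is roughly how I am wrapping the solve for (1). This is
a minimal sketch, assuming a PyPAPI binding that mirrors PAPI's high-level C
API; the exact module, function, and event names may differ in your wrapper:

    from pypapi import papi_high
    from pypapi import events as papi_events

    # Start counting just before the solve, so that mesh generation,
    # distribution, and function space setup are excluded.
    papi_high.start_counters([papi_events.PAPI_FP_OPS,   # floating-point ops
                              papi_events.PAPI_LD_INS,   # load instructions
                              papi_events.PAPI_SR_INS])  # store instructions
    solver.solve()
    flops, loads, stores = papi_high.stop_counters()

    # Estimated arithmetic intensity, assuming every counted load/store
    # moves an 8-byte double.
    ai = flops / (8.0 * (loads + stores))

On (3), my guess from the PAPI docs is that the rate calls like PAPI_flops()
manage their own internal event set, which would explain the clash with
PAPI_start_counters() in the same program.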

Thanks,
Justin

On Thu, Jul 16, 2015 at 3:46 AM, Lawrence Mitchell
<lawrence.mitchell at imperial.ac.uk> wrote:

> On 15/07/15 21:14, Justin Chang wrote:
> > First option works wonderfully for me, but now I am wondering how
> > I would employ the second option.
> >
> > Specifically, I want to profile SNESSolve()
>
> OK, so calls out to PETSc are done from Python (via petsc4py).  It's
> just the calls to integral assembly (i.e. evaluation of Jacobians and
> residuals) that go through a generated code path.
>
> To be more concrete, let's say you have the following code:
>
> F = some_residual
>
> problem = NonlinearVariationalProblem(F, u, ...)
>
> solver = NonlinearVariationalSolver(problem)
>
> solver.solve()
>
> Then the call chain inside solver.solve is effectively:
>
> solver.solve ->
>   SNESSolve -> # via petsc4py
>     SNESComputeJacobian ->
>       assemble(Jacobian) # Callback to Firedrake
>     SNESComputeFunction ->
>       assemble(residual) # Callback to Firedrake
>     KSPSolve
>
> So if you wrapped flop counting around the outermost solver.solve()
> call, you're pretty close to wrapping SNESSolve.
>
> Or do you mean something else when profiling SNESSolve?
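>
> If you want the PETSc log itself to isolate just the solve, you can
> also push a log stage around it from Python.  A minimal sketch,
> assuming petsc4py's PETSc.Log.Stage interface:
>
> from firedrake.petsc import PETSc
>
> stage = PETSc.Log.Stage("solve")
> stage.push()
> solver.solve()  # SNESSolve and the assembly callbacks land in this stage
> stage.pop()
>
> Everything before the push (mesh creation, DMPlexDistribute, and so
> on) then stays in the Main Stage, so the time and flop columns for
> the "solve" stage exclude the DMPlex setup.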
>
> > I would prefer to circumvent profiling of the DMPlex distribution
> > because it seems that is a major bottleneck for multiple processes
> > at the moment.
>
> Can you provide an example mesh/process count that demonstrates this
> issue, or at least characterize it a little better?  Michael Lange and
> Matt Knepley have done a lot of work over the last 9 months or so on
> making DMPlexDistribute much faster than it was.  So if it still turns
> out to be slow, we'd really like to know about it and try to fix it.
>
> Cheers,
>
> Lawrence
-------------- next part --------------
    Residual norms for selfp_ solve.
    0 KSP preconditioned resid norm 1.329954740724e+02 true resid norm 8.936413686950e-03 ||r(i)||/||b|| 1.000000000000e+00
    1 KSP preconditioned resid norm 1.235753253827e+01 true resid norm 1.136864734894e+00 ||r(i)||/||b|| 1.272171113289e+02
    2 KSP preconditioned resid norm 1.331959321130e+00 true resid norm 2.927389992008e-01 ||r(i)||/||b|| 3.275799548406e+01
    3 KSP preconditioned resid norm 2.005159390419e-01 true resid norm 1.428098374310e-01 ||r(i)||/||b|| 1.598066544743e+01
    4 KSP preconditioned resid norm 5.055808525009e-02 true resid norm 4.070690514827e-02 ||r(i)||/||b|| 4.555172418632e+00
    5 KSP preconditioned resid norm 2.625628299328e-02 true resid norm 3.056686404606e-02 ||r(i)||/||b|| 3.420484449001e+00
    6 KSP preconditioned resid norm 4.509710834114e-03 true resid norm 3.247821966385e-03 ||r(i)||/||b|| 3.634368416860e-01
    7 KSP preconditioned resid norm 1.372042183925e-03 true resid norm 5.355108948802e-04 ||r(i)||/||b|| 5.992458648845e-02
    8 KSP preconditioned resid norm 2.617267203849e-04 true resid norm 1.905031177743e-04 ||r(i)||/||b|| 2.131762521832e-02
    9 KSP preconditioned resid norm 1.015904451656e-04 true resid norm 6.258061304298e-05 ||r(i)||/||b|| 7.002877802576e-03
   10 KSP preconditioned resid norm 3.035554053736e-05 true resid norm 2.203965203820e-05 ||r(i)||/||b|| 2.466274817871e-03
   11 KSP preconditioned resid norm 7.233713447626e-06 true resid norm 4.400071472157e-06 ||r(i)||/||b|| 4.923755352310e-04
Total FLOPS: 2.457323e+09
0.18449829301
norm = 0.000001
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

mixed-poisson.py on a arch-linux2-c-opt named compute-0-0.local with 1 processor, by jchang23 Thu Jul 16 08:20:18 2015
Using Petsc Development GIT revision: v3.6-175-g274dabd  GIT Date: 2015-07-10 22:30:57 +0100

                         Max       Max/Min        Avg      Total 
Time (sec):           1.060e+01      1.00000   1.060e+01
Objects:              3.150e+02      1.00000   3.150e+02
Flops:                1.187e+09      1.00000   1.187e+09  1.187e+09
Flops/sec:            1.120e+08      1.00000   1.120e+08  1.120e+08
MPI Messages:         0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Message Lengths:  0.000e+00      0.00000   0.000e+00  0.000e+00
MPI Reductions:       0.000e+00      0.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 4.9004e+00  46.3%  0.0000e+00   0.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0% 
 1:           selfp: 5.6949e+00  53.7%  1.1865e+09 100.0%  0.000e+00   0.0%  0.000e+00        0.0%  0.000e+00   0.0% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

VecSet                 9 1.0 1.4837e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin        6 1.0 5.8479e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyBegin       4 1.0 4.5300e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         4 1.0 3.8953e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
DMPlexInterp           1 1.0 7.3866e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  7  0  0  0  0  15  0  0  0  0     0
DMPlexStratify         3 1.0 2.1979e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   4  0  0  0  0     0
SFSetGraph             7 1.0 2.9490e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0

--- Event Stage 1: selfp

VecMDot               11 1.0 4.7193e-02 1.0 1.65e+08 1.0 0.0e+00 0.0e+00 0.0e+00  0 14  0  0  0   1 14  0  0  0  3499
VecNorm               25 1.0 1.5265e-02 1.0 6.25e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  5  0  0  0   0  5  0  0  0  4098
VecScale              24 1.0 9.8898e-03 1.0 2.40e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0  2429
VecCopy               16 1.0 2.2394e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               158 1.0 1.3032e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   2  0  0  0  0     0
VecAXPY               13 1.0 1.4370e-02 1.0 3.25e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  3  0  0  0   0  3  0  0  0  2263
VecAYPX               12 1.0 1.4731e-02 1.0 1.50e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  1019
VecMAXPY              23 1.0 1.1603e-01 1.0 3.58e+08 1.0 0.0e+00 0.0e+00 0.0e+00  1 30  0  0  0   2 30  0  0  0  3083
VecScatterBegin       68 1.0 3.9157e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
VecNormalize          12 1.0 1.4005e-02 1.0 4.50e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  4  0  0  0   0  4  0  0  0  3216
MatMult               35 1.0 3.3133e-01 1.0 4.19e+08 1.0 0.0e+00 0.0e+00 0.0e+00  3 35  0  0  0   6 35  0  0  0  1263
MatMultAdd            92 1.0 2.7395e-01 1.0 3.80e+08 1.0 0.0e+00 0.0e+00 0.0e+00  3 32  0  0  0   5 32  0  0  0  1385
MatSolve              12 1.0 7.3616e-02 1.0 8.10e+07 1.0 0.0e+00 0.0e+00 0.0e+00  1  7  0  0  0   1  7  0  0  0  1100
MatLUFactorNum         1 1.0 3.1441e-02 1.0 8.97e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   1  1  0  0  0   285
MatILUFactorSym        1 1.0 2.6452e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatConvert             2 1.0 3.8408e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
MatScale               2 1.0 4.7297e-03 1.0 6.00e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  1268
MatAssemblyBegin       6 1.0 1.9073e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         6 1.0 3.9749e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
MatGetRow        1000000 1.0 6.0956e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
MatGetRowIJ            2 1.0 1.1921e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetSubMatrix        4 1.0 3.4897e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
MatGetOrdering         1 1.0 2.3370e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatZeroEntries         1 1.0 9.2080e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAXPY                1 1.0 2.1363e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   4  0  0  0  0     0
MatMatMult             1 1.0 1.0431e-01 1.0 1.50e+07 1.0 0.0e+00 0.0e+00 0.0e+00  1  1  0  0  0   2  1  0  0  0   144
MatMatMultSym          1 1.0 7.0812e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  1  0  0  0  0   1  0  0  0  0     0
MatMatMultNum          1 1.0 3.3484e-02 1.0 1.50e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   1  1  0  0  0   448
PCSetUp                4 1.0 2.0950e+00 1.0 3.00e+07 1.0 0.0e+00 0.0e+00 0.0e+00 20  3  0  0  0  37  3  0  0  0    14
PCSetUpOnBlocks       12 1.0 6.0278e-02 1.0 8.97e+06 1.0 0.0e+00 0.0e+00 0.0e+00  1  1  0  0  0   1  1  0  0  0   149
PCApply               12 1.0 4.1145e+00 1.0 1.38e+08 1.0 0.0e+00 0.0e+00 0.0e+00 39 12  0  0  0  72 12  0  0  0    34
KSPGMRESOrthog        11 1.0 1.0059e-01 1.0 3.30e+08 1.0 0.0e+00 0.0e+00 0.0e+00  1 28  0  0  0   2 28  0  0  0  3283
KSPSetUp               4 1.0 1.3707e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 5.1125e+00 1.0 1.18e+09 1.0 0.0e+00 0.0e+00 0.0e+00 48100  0  0  0  90100  0  0  0   232
SNESSolve              1 1.0 5.6778e+00 1.0 1.19e+09 1.0 0.0e+00 0.0e+00 0.0e+00 54100  0  0  0 100100  0  0  0   209
SNESFunctionEval       2 1.0 2.2012e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   4  0  0  0  0     0
SNESJacobianEval       1 1.0 3.4403e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   6  0  0  0  0     0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

           Container     6              3         1680     0
              Viewer     1              0            0     0
           Index Set    23             19     30062608     0
   IS L to G Mapping     4              0            0     0
             Section    24              6         3936     0
              Vector    16             31    194215136     0
      Vector Scatter     4              6         3888     0
              Matrix     5              3    112983612     0
      Preconditioner     1              5         4984     0
       Krylov Solver     1              5        23296     0
                SNES     1              1         1324     0
      SNESLineSearch     1              1          856     0
              DMSNES     1              0            0     0
    Distributed Mesh     9              3        14128     0
    GraphPartitioner     3              2         1192     0
Star Forest Bipartite Graph    21             10         7840     0
     Discrete System     9              3         2520     0

--- Event Stage 1: selfp

           Index Set    15             12         9216     0
              Vector   148            121    274399920     0
      Vector Scatter     6              0            0     0
              Matrix     5              2     38021356     0
      Preconditioner     5              1          992     0
       Krylov Solver     5              1         1296     0
     DMKSP interface     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 0
#PETSc Option Table entries:
-log_summary
-selfp_fieldsplit_0_ksp_type preonly
-selfp_fieldsplit_0_pc_type bjacobi
-selfp_fieldsplit_0_sub_pc_type ilu
-selfp_fieldsplit_1_ksp_type preonly
-selfp_fieldsplit_1_pc_type hypre
-selfp_ksp_monitor_true_residual
-selfp_ksp_rtol 1e-07
-selfp_ksp_type gmres
-selfp_pc_fieldsplit_schur_fact_type upper
-selfp_pc_fieldsplit_schur_precondition selfp
-selfp_pc_fieldsplit_type schur
-selfp_pc_type fieldsplit
-selfp_snes_type ksponly
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --download-chaco --download-ctetgen --download-exodusii=1 --download-fblaslapack --download-hdf5 --download-hypre=1 --download-metis --download-netcdf=1 --download-parmetis --download-triangle --with-cc=mpicc --with-cmake=cmake --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-mpiexec=mpiexec --with-shared-libraries=1 --with-valgrind=1 CFLAGS= COPTFLAGS=-O3 CXXFLAGS= CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt
-----------------------------------------
Libraries compiled on Mon Jul 13 02:29:52 2015 on opuntia.cacds.uh.edu 
Machine characteristics: Linux-2.6.32-504.1.3.el6.x86_64-x86_64-with-redhat-6.6-Santiago
Using PETSc directory: /home/jchang23/petsc-dev
Using PETSc arch: arch-linux2-c-opt
-----------------------------------------

Using C compiler: mpicc  -fPIC -O3  ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: mpif90  -fPIC -O3   ${FOPTFLAGS} ${FFLAGS} 
-----------------------------------------

Using include paths: -I/home/jchang23/petsc-dev/arch-linux2-c-opt/include -I/home/jchang23/petsc-dev/include -I/home/jchang23/petsc-dev/include -I/home/jchang23/petsc-dev/arch-linux2-c-opt/include -I/share/apps/intel/impi/5.0.2.044/intel64/include
-----------------------------------------

Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries: -Wl,-rpath,/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -L/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -L/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -lHYPRE -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -L/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -Wl,-rpath,/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -L/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib -L/share/apps/gcc-4.9.2/lib -lmpicxx -Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib/release_mt -Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lctetgen -lssl -lcrypto -lifport -lifcore -lm -lmpicxx -ldl -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -lmpifort -lmpi -lmpigi -lrt -lpthread -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -L/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -L/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 
-L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib -L/share/apps/gcc-4.9.2/lib -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib/release_mt -Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -L/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -L/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib -L/share/apps/gcc-4.9.2/lib -ldl 
-----------------------------------------
-------------- next part --------------
    Residual norms for selfp_ solve.
    0 KSP preconditioned resid norm 1.314485425210e+02 true resid norm 8.936413686950e-03 ||r(i)||/||b|| 1.000000000000e+00
    1 KSP preconditioned resid norm 1.181230440781e+01 true resid norm 1.139486425923e+00 ||r(i)||/||b|| 1.275104830461e+02
    2 KSP preconditioned resid norm 1.430744382617e+00 true resid norm 3.530346948857e-01 ||r(i)||/||b|| 3.950518712012e+01
    3 KSP preconditioned resid norm 2.245172837560e-01 true resid norm 1.378785534898e-01 ||r(i)||/||b|| 1.542884632693e+01
    4 KSP preconditioned resid norm 5.862664561242e-02 true resid norm 5.365277299917e-02 ||r(i)||/||b|| 6.003837207931e+00
    5 KSP preconditioned resid norm 2.446376192135e-02 true resid norm 2.280110185076e-02 ||r(i)||/||b|| 2.551482356289e+00
    6 KSP preconditioned resid norm 5.441132513221e-03 true resid norm 3.575961722808e-03 ||r(i)||/||b|| 4.001562425462e-01
    7 KSP preconditioned resid norm 1.706968562957e-03 true resid norm 9.450031837969e-04 ||r(i)||/||b|| 1.057474750947e-01
    8 KSP preconditioned resid norm 5.056345293112e-04 true resid norm 3.854704900981e-04 ||r(i)||/||b|| 4.313480816818e-02
    9 KSP preconditioned resid norm 1.553039159700e-04 true resid norm 1.467820388361e-04 ||r(i)||/||b|| 1.642516158920e-02
   10 KSP preconditioned resid norm 7.892034171071e-05 true resid norm 7.550996400946e-05 ||r(i)||/||b|| 8.449694324216e-03
   11 KSP preconditioned resid norm 2.548635548579e-05 true resid norm 2.158211858106e-05 ||r(i)||/||b|| 2.415076040244e-03
   12 KSP preconditioned resid norm 1.004822326206e-05 true resid norm 1.142457186034e-05 ||r(i)||/||b|| 1.278429161916e-03
Total FLOPS: 1.344459e+09
0.143665883646
Total FLOPS: 1.372575e+09
0.148051365581
norm = 0.000001
norm = 0.000001
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

mixed-poisson.py on a arch-linux2-c-opt named compute-0-0.local with 2 processors, by jchang23 Thu Jul 16 08:20:44 2015
Using Petsc Development GIT revision: v3.6-175-g274dabd  GIT Date: 2015-07-10 22:30:57 +0100

                         Max       Max/Min        Avg      Total 
Time (sec):           9.384e+00      1.00007   9.384e+00
Objects:              5.350e+02      1.01518   5.310e+02
Flops:                6.647e+08      1.00089   6.644e+08  1.329e+09
Flops/sec:            7.083e+07      1.00082   7.080e+07  1.416e+08
MPI Messages:         3.450e+02      1.23878   3.118e+02  6.235e+02
MPI Message Lengths:  3.543e+08      1.63156   9.165e+05  5.714e+08
MPI Reductions:       4.710e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 5.7043e+00  60.8%  0.0000e+00   0.0%  4.585e+02  73.5%  8.671e+05       94.6%  1.370e+02  29.1% 
 1:           selfp: 3.6794e+00  39.2%  1.3288e+09 100.0%  1.650e+02  26.5%  4.931e+04        5.4%  3.330e+02  70.7% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

VecSet                 4 1.0 6.9141e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin        6 1.0 2.6169e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterEnd          6 1.0 5.0068e-06 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyBegin       4 1.0 5.0068e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         4 1.0 3.3234e-02 1.1 0.00e+00 0.0 8.0e+00 1.3e+03 3.2e+01  0  0  1  0  7   1  0  2  0 23     0
Mesh Partition         2 1.0 7.4749e-01 1.1 0.00e+00 0.0 7.8e+01 5.4e+05 8.0e+00  8  0 13  7  2  13  0 17  8  6     0
Mesh Migration         2 1.0 1.0836e+00 1.0 0.00e+00 0.0 3.3e+02 1.4e+06 1.8e+01 12  0 52 77  4  19  0 71 82 13     0
DMPlexInterp           1 1.0 7.3538e-0173438.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  4  0  0  0  0   6  0  0  0  0     0
DMPlexDistribute       1 1.0 1.2285e+00 1.1 0.00e+00 0.0 1.4e+02 1.9e+06 5.0e+00 13  0 23 48  1  21  0 32 51  4     0
DMPlexDistCones        2 1.0 1.8938e-01 1.0 0.00e+00 0.0 4.8e+01 2.5e+06 0.0e+00  2  0  8 21  0   3  0 10 22  0     0
DMPlexDistLabels       2 1.0 6.2104e-01 1.0 0.00e+00 0.0 2.1e+02 1.3e+06 0.0e+00  7  0 34 46  0  11  0 46 49  0     0
DMPlexDistribOL        1 1.0 6.1585e-01 1.0 0.00e+00 0.0 2.7e+02 9.2e+05 2.1e+01  7  0 43 43  4  11  0 59 45 15     0
DMPlexDistField        3 1.0 5.0144e-02 1.1 0.00e+00 0.0 5.5e+01 5.9e+05 6.0e+00  1  0  9  6  1   1  0 12  6  4     0
DMPlexDistData         2 1.0 3.4269e-0112.8 0.00e+00 0.0 4.6e+01 3.9e+05 0.0e+00  2  0  7  3  0   3  0 10  3  0     0
DMPlexStratify         5 1.2 3.6035e-01 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   4  0  0  0  0     0
SFSetGraph            51 1.0 2.9863e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   5  0  0  0  0     0
SFBcastBegin          94 1.0 3.8696e-01 2.5 0.00e+00 0.0 4.2e+02 1.2e+06 0.0e+00  3  0 68 89  0   5  0 92 94  0     0
SFBcastEnd            94 1.0 2.4197e-01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   3  0  0  0  0     0
SFReduceBegin          4 1.0 2.5640e-03 1.5 0.00e+00 0.0 9.5e+00 1.3e+06 0.0e+00  0  0  2  2  0   0  0  2  2  0     0
SFReduceEnd            4 1.0 3.4568e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFFetchOpBegin         1 1.0 5.9605e-06 3.1 0.00e+00 0.0 1.0e+00 4.2e+03 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFFetchOpEnd           1 1.0 3.6907e-0412.7 0.00e+00 0.0 1.0e+00 4.2e+03 0.0e+00  0  0  0  0  0   0  0  0  0  0     0

--- Event Stage 1: selfp

VecMDot               12 1.0 2.9539e-02 1.1 9.76e+07 1.0 0.0e+00 0.0e+00 1.2e+01  0 15  0  0  3   1 15  0  0  4  6607
VecNorm               27 1.0 1.2375e-02 1.6 3.38e+07 1.0 0.0e+00 0.0e+00 2.7e+01  0  5  0  0  6   0  5  0  0  8  5459
VecScale              26 1.0 5.1279e-03 1.0 1.30e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0  5075
VecCopy               17 1.0 1.1794e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               120 1.0 2.6371e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
VecAXPY               14 1.0 6.1862e-03 1.0 1.75e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  3  0  0  0   0  3  0  0  0  5662
VecAYPX               13 1.0 7.4878e-03 1.0 8.13e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  2172
VecMAXPY              25 1.0 6.0983e-02 1.0 2.10e+08 1.0 0.0e+00 0.0e+00 0.0e+00  1 32  0  0  0   2 32  0  0  0  6893
VecScatterBegin      185 1.0 2.3257e-02 1.0 0.00e+00 0.0 1.1e+02 4.9e+03 0.0e+00  0  0 18  0  0   1  0 68  2  0     0
VecScatterEnd        185 1.0 6.4373e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize          13 1.0 8.5566e-03 1.2 2.44e+07 1.0 0.0e+00 0.0e+00 1.3e+01  0  4  0  0  3   0  4  0  0  4  5702
MatMult               38 1.0 1.8064e-01 1.0 2.27e+08 1.0 1.1e+02 4.9e+03 2.0e+02  2 34 18  0 42   5 34 68  2 60  2518
MatMultAdd           100 1.0 1.4863e-01 1.0 2.06e+08 1.0 1.0e+02 5.0e+03 0.0e+00  2 31 16  0  0   4 31 61  2  0  2776
MatSolve              13 1.0 4.0525e-02 1.0 4.39e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  7  0  0  0   1  7  0  0  0  2164
MatLUFactorNum         1 1.0 1.6012e-02 1.0 4.50e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0   560
MatILUFactorSym        1 1.0 8.9319e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatConvert             2 1.0 1.9395e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
MatScale               2 1.0 2.6979e-03 1.0 3.00e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  2222
MatAssemblyBegin       8 1.0 2.0168e-03 5.1 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  1   0  0  0  0  2     0
MatAssemblyEnd         8 1.0 3.8538e-02 1.0 0.00e+00 0.0 8.0e+00 1.0e+03 1.6e+01  0  0  1  0  3   1  0  5  0  5     0
MatGetRow         500000 1.0 4.3154e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
MatGetRowIJ            3 1.0 1.9073e-06 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetSubMatrix        4 1.0 1.6660e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  1   0  0  0  0  2     0
MatGetOrdering         1 1.0 1.0438e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatZeroEntries         1 1.0 3.3541e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAXPY                1 1.0 1.4154e-01 1.0 0.00e+00 0.0 4.0e+00 1.0e+03 1.2e+01  2  0  1  0  3   4  0  2  0  4     0
MatMatMult             1 1.0 9.3379e-02 1.0 5.50e+06 1.0 8.0e+00 3.4e+03 1.6e+01  1  1  1  0  3   3  1  5  0  5   118
MatMatMultSym          1 1.0 7.5960e-02 1.0 0.00e+00 0.0 7.0e+00 2.7e+03 1.4e+01  1  0  1  0  3   2  0  4  0  4     0
MatMatMultNum          1 1.0 1.7408e-02 1.0 5.50e+06 1.0 1.0e+00 8.4e+03 2.0e+00  0  1  0  0  0   0  1  1  0  1   631
MatGetLocalMat         2 1.0 1.4361e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetBrAoCol          2 1.0 3.2091e-04 1.3 0.00e+00 0.0 4.0e+00 5.8e+03 0.0e+00  0  0  1  0  0   0  0  2  0  0     0
PCSetUp                4 1.0 1.4880e+00 1.0 1.30e+07 1.0 2.0e+01 5.0e+05 6.6e+01 16  2  3  2 14  40  2 12 33 20    17
PCSetUpOnBlocks       13 1.0 2.6026e-02 1.0 4.50e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   1  1  0  0  0   345
PCApply               13 1.0 2.5847e+00 1.0 7.44e+07 1.0 1.3e+01 4.2e+03 4.0e+00 28 11  2  0  1  70 11  8  0  1    58
KSPGMRESOrthog        12 1.0 5.6044e-02 1.0 1.95e+08 1.0 0.0e+00 0.0e+00 1.2e+01  1 29  0  0  3   2 29  0  0  4  6964
KSPSetUp               4 1.0 5.8489e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 3.1988e+00 1.0 6.63e+08 1.0 1.3e+02 8.0e+04 3.0e+02 34100 21  2 65  87100 81 34 92   415
SNESSolve              1 1.0 3.6553e+00 1.0 6.65e+08 1.0 1.6e+02 1.3e+05 3.2e+02 39100 25  4 68  99100 95 67 96   364
SNESFunctionEval       2 1.0 1.7999e-01 1.0 0.00e+00 0.0 2.4e+01 4.2e+05 1.4e+01  2  0  4  2  3   5  0 15 33  4     0
SNESJacobianEval       1 1.0 2.7929e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   8  0  0  0  0     0
SFBcastBegin           5 1.0 3.7880e-0326.9 0.00e+00 0.0 1.6e+01 8.3e+03 0.0e+00  0  0  3  0  0   0  0 10  0  0     0
SFBcastEnd             5 1.0 5.3644e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
------------------------------------------------------------------------------------------------------------------------

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

           Container     6              3         1680     0
              Viewer     1              0            0     0
           Index Set    85             81     38264488     0
   IS L to G Mapping     7              3     19646728     0
             Section    66             49        32144     0
              Vector    26             39    109166240     0
      Vector Scatter     8              7     15013736     0
              Matrix    13              5     60464524     0
      Preconditioner     1              5         4984     0
       Krylov Solver     1              5        23296     0
                SNES     1              1         1324     0
      SNESLineSearch     1              1          856     0
              DMSNES     1              0            0     0
    Distributed Mesh    13              7        32792     0
    GraphPartitioner     5              4         2384     0
Star Forest Bipartite Graph    72             61        50176     0
     Discrete System    13              7         5880     0

--- Event Stage 1: selfp

           Index Set    19             16        16480     0
              Vector   163            135    147278688     0
      Vector Scatter     9              2         2128     0
              Matrix    13              8     57495728     0
      Preconditioner     5              1          872     0
       Krylov Solver     5              1         1296     0
     DMKSP interface     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 6.19888e-07
Average time for zero size MPI_Send(): 2.98023e-06
#PETSc Option Table entries:
-log_summary
-selfp_fieldsplit_0_ksp_type preonly
-selfp_fieldsplit_0_pc_type bjacobi
-selfp_fieldsplit_0_sub_pc_type ilu
-selfp_fieldsplit_1_ksp_type preonly
-selfp_fieldsplit_1_pc_type hypre
-selfp_ksp_monitor_true_residual
-selfp_ksp_rtol 1e-07
-selfp_ksp_type gmres
-selfp_pc_fieldsplit_schur_fact_type upper
-selfp_pc_fieldsplit_schur_precondition selfp
-selfp_pc_fieldsplit_type schur
-selfp_pc_type fieldsplit
-selfp_snes_type ksponly
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --download-chaco --download-ctetgen --download-exodusii=1 --download-fblaslapack --download-hdf5 --download-hypre=1 --download-metis --download-netcdf=1 --download-parmetis --download-triangle --with-cc=mpicc --with-cmake=cmake --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-mpiexec=mpiexec --with-shared-libraries=1 --with-valgrind=1 CFLAGS= COPTFLAGS=-O3 CXXFLAGS= CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt
-----------------------------------------
Libraries compiled on Mon Jul 13 02:29:52 2015 on opuntia.cacds.uh.edu 
Machine characteristics: Linux-2.6.32-504.1.3.el6.x86_64-x86_64-with-redhat-6.6-Santiago
Using PETSc directory: /home/jchang23/petsc-dev
Using PETSc arch: arch-linux2-c-opt
-----------------------------------------

Using C compiler: mpicc  -fPIC -O3  ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: mpif90  -fPIC -O3   ${FOPTFLAGS} ${FFLAGS} 
-----------------------------------------

Using include paths: -I/home/jchang23/petsc-dev/arch-linux2-c-opt/include -I/home/jchang23/petsc-dev/include -I/home/jchang23/petsc-dev/include -I/home/jchang23/petsc-dev/arch-linux2-c-opt/include -I/share/apps/intel/impi/5.0.2.044/intel64/include
-----------------------------------------

Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries: -Wl,-rpath,/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -L/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -L/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -lHYPRE -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -L/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -Wl,-rpath,/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -L/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib -L/share/apps/gcc-4.9.2/lib -lmpicxx -Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib/release_mt -Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lctetgen -lssl -lcrypto -lifport -lifcore -lm -lmpicxx -ldl -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -lmpifort -lmpi -lmpigi -lrt -lpthread -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -L/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -L/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 
-L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib -L/share/apps/gcc-4.9.2/lib -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib/release_mt -Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -L/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -L/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib -L/share/apps/gcc-4.9.2/lib -ldl 
-----------------------------------------
-------------- next part --------------
    Residual norms for selfp_ solve.
    0 KSP preconditioned resid norm 1.244799003682e+02 true resid norm 8.936413686950e-03 ||r(i)||/||b|| 1.000000000000e+00
    1 KSP preconditioned resid norm 1.077430018392e+01 true resid norm 1.135750781502e+00 ||r(i)||/||b|| 1.270924580362e+02
    2 KSP preconditioned resid norm 1.505255974309e+00 true resid norm 3.922502513617e-01 ||r(i)||/||b|| 4.389347506758e+01
    3 KSP preconditioned resid norm 2.307791848484e-01 true resid norm 1.372036571657e-01 ||r(i)||/||b|| 1.535332427214e+01
    4 KSP preconditioned resid norm 6.562715074666e-02 true resid norm 5.709626909902e-02 ||r(i)||/||b|| 6.389170320348e+00
    5 KSP preconditioned resid norm 2.402562173647e-02 true resid norm 2.385325465118e-02 ||r(i)||/||b|| 2.669220057036e+00
    6 KSP preconditioned resid norm 5.233903411575e-03 true resid norm 3.780925699813e-03 ||r(i)||/||b|| 4.230920626845e-01
    7 KSP preconditioned resid norm 2.077046170288e-03 true resid norm 1.290043295616e-03 ||r(i)||/||b|| 1.443580546747e-01
    8 KSP preconditioned resid norm 7.129909645110e-04 true resid norm 5.397953356046e-04 ||r(i)||/||b|| 6.040402274493e-02
    9 KSP preconditioned resid norm 1.938698927211e-04 true resid norm 2.092495692156e-04 ||r(i)||/||b|| 2.341538524802e-02
   10 KSP preconditioned resid norm 1.051661965597e-04 true resid norm 9.821862152141e-05 ||r(i)||/||b|| 1.099083200063e-02
   11 KSP preconditioned resid norm 3.347978145221e-05 true resid norm 2.588115205557e-05 ||r(i)||/||b|| 2.896145250456e-03
   12 KSP preconditioned resid norm 1.296607156832e-05 true resid norm 1.305570508666e-05 ||r(i)||/||b|| 1.460955764137e-03
   13 KSP preconditioned resid norm 4.310568985079e-06 true resid norm 4.048377784223e-06 ||r(i)||/||b|| 4.530204090859e-04
Total FLOPS: 7.159714e+08
0.13043831538
Total FLOPS: 7.254224e+08
0.129247453489
Total FLOPS: 7.336663e+08
0.140364615246
Total FLOPS: 7.246571e+08
0.133656827785
norm = 0.000001
norm = 0.000001
norm = 0.000001
norm = 0.000001
************************************************************************************************************************
***             WIDEN YOUR WINDOW TO 120 CHARACTERS.  Use 'enscript -r -fCourier9' to print this document            ***
************************************************************************************************************************

---------------------------------------------- PETSc Performance Summary: ----------------------------------------------

mixed-poisson.py on a arch-linux2-c-opt named compute-0-0.local with 4 processors, by jchang23 Thu Jul 16 08:21:04 2015
Using Petsc Development GIT revision: v3.6-175-g274dabd  GIT Date: 2015-07-10 22:30:57 +0100

                         Max       Max/Min        Avg      Total 
Time (sec):           7.361e+00      1.00010   7.361e+00
Objects:              5.610e+02      1.02186   5.530e+02
Flops:                3.710e+08      1.00173   3.706e+08  1.483e+09
Flops/sec:            5.040e+07      1.00167   5.035e+07  2.014e+08
MPI Messages:         6.210e+02      1.34125   5.746e+02  2.298e+03
MPI Message Lengths:  2.465e+08      2.25833   2.501e+05  5.748e+08
MPI Reductions:       4.900e+02      1.00000

Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
                            e.g., VecAXPY() for real vectors of length N --> 2N flops
                            and VecAXPY() for complex vectors of length N --> 8N flops

Summary of Stages:   ----- Time ------  ----- Flops -----  --- Messages ---  -- Message Lengths --  -- Reductions --
                        Avg     %Total     Avg     %Total   counts   %Total     Avg         %Total   counts   %Total 
 0:      Main Stage: 5.0010e+00  67.9%  0.0000e+00   0.0%  1.500e+03  65.3%  2.363e+05       94.5%  1.370e+02  28.0% 
 1:           selfp: 2.3601e+00  32.1%  1.4826e+09 100.0%  7.980e+02  34.7%  1.382e+04        5.5%  3.520e+02  71.8% 

------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
   Count: number of times phase was executed
   Time and Flops: Max - maximum over all processors
                   Ratio - ratio of maximum to minimum over all processors
   Mess: number of messages sent
   Avg. len: average message length (bytes)
   Reduct: number of global reductions
   Global: entire computation
   Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
      %T - percent time in this phase         %F - percent flops in this phase
      %M - percent messages in this phase     %L - percent message lengths in this phase
      %R - percent reductions in this phase
   Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event                Count      Time (sec)     Flops                             --- Global ---  --- Stage ---   Total
                   Max Ratio  Max     Ratio   Max  Ratio  Mess   Avg len Reduct  %T %F %M %L %R  %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------

--- Event Stage 0: Main Stage

VecSet                 4 1.0 6.1989e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterBegin        6 1.0 1.3058e-03 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecScatterEnd          6 1.0 3.0994e-06 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyBegin       4 1.0 5.0068e-06 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAssemblyEnd         4 1.0 2.0530e-02 1.2 0.00e+00 0.0 4.0e+01 5.7e+02 3.2e+01  0  0  2  0  7   0  0  3  0 23     0
Mesh Partition         2 1.0 8.0247e-01 1.1 0.00e+00 0.0 3.2e+02 1.3e+05 8.0e+00 11  0 14  7  2  16  0 22  8  6     0
Mesh Migration         2 1.0 6.8136e-01 1.0 0.00e+00 0.0 9.8e+02 4.5e+05 1.8e+01  9  0 43 77  4  14  0 65 82 13     0
DMPlexInterp           1 1.0 7.5920e-0183797.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   4  0  0  0  0     0
DMPlexDistribute       1 1.0 1.1350e+00 1.1 0.00e+00 0.0 3.4e+02 8.1e+05 5.0e+00 15  0 15 48  1  22  0 23 51  4     0
DMPlexDistCones        2 1.0 1.2553e-01 1.0 0.00e+00 0.0 1.4e+02 8.4e+05 0.0e+00  2  0  6 21  0   2  0 10 22  0     0
DMPlexDistLabels       2 1.0 4.0275e-01 1.0 0.00e+00 0.0 6.2e+02 4.3e+05 0.0e+00  5  0 27 46  0   8  0 41 49  0     0
DMPlexDistribOL        1 1.0 3.5835e-01 1.0 0.00e+00 0.0 9.9e+02 2.5e+05 2.1e+01  5  0 43 43  4   7  0 66 46 15     0
DMPlexDistField        3 1.0 2.9933e-02 1.1 0.00e+00 0.0 1.8e+02 1.8e+05 6.0e+00  0  0  8  6  1   1  0 12  6  4     0
DMPlexDistData         2 1.0 3.9284e-0122.0 0.00e+00 0.0 1.9e+02 9.6e+04 0.0e+00  4  0  8  3  0   6  0 13  3  0     0
DMPlexStratify         5 1.2 3.0168e-01 4.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   3  0  0  0  0     0
SFSetGraph            51 1.0 1.7229e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   3  0  0  0  0     0
SFBcastBegin          94 1.0 4.3039e-01 3.0 0.00e+00 0.0 1.4e+03 3.7e+05 0.0e+00  5  0 60 89  0   7  0 92 94  0     0
SFBcastEnd            94 1.0 2.0336e-01 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  2  0  0  0  0   3  0  0  0  0     0
SFReduceBegin          4 1.0 2.4109e-03 2.7 0.00e+00 0.0 4.2e+01 2.9e+05 0.0e+00  0  0  2  2  0   0  0  3  2  0     0
SFReduceEnd            4 1.0 4.2181e-03 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFFetchOpBegin         1 1.0 8.1062e-06 3.8 0.00e+00 0.0 5.0e+00 1.9e+03 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
SFFetchOpEnd           1 1.0 1.5402e-0410.3 0.00e+00 0.0 5.0e+00 1.9e+03 0.0e+00  0  0  0  0  0   0  0  0  0  0     0

--- Event Stage 1: selfp

VecMDot               13 1.0 2.0536e-02 1.1 5.70e+07 1.0 0.0e+00 0.0e+00 1.3e+01  0 15  0  0  3   1 15  0  0  4 11087
VecNorm               29 1.0 1.0441e-02 1.8 1.82e+07 1.0 0.0e+00 0.0e+00 2.9e+01  0  5  0  0  6   0  5  0  0  8  6949
VecScale              28 1.0 2.8911e-03 1.0 7.02e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  2  0  0  0   0  2  0  0  0  9695
VecCopy               18 1.0 6.7000e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecSet               129 1.0 1.5137e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
VecAXPY               15 1.0 3.6442e-03 1.1 9.39e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  3  0  0  0   0  3  0  0  0 10298
VecAYPX               14 1.0 4.4365e-03 1.1 4.38e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  3948
VecMAXPY              27 1.0 4.0347e-02 1.0 1.22e+08 1.0 0.0e+00 0.0e+00 0.0e+00  1 33  0  0  0   2 33  0  0  0 12092
VecScatterBegin      198 1.0 1.3824e-02 1.1 0.00e+00 0.0 6.1e+02 2.2e+03 0.0e+00  0  0 27  0  0   1  0 76  4  0     0
VecScatterEnd        198 1.0 7.1239e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
VecNormalize          14 1.0 6.6764e-03 1.5 1.31e+07 1.0 0.0e+00 0.0e+00 1.4e+01  0  4  0  0  3   0  4  0  0  4  7870
MatMult               41 1.0 1.0776e-01 1.1 1.23e+08 1.0 6.1e+02 2.2e+03 2.2e+02  1 33 27  0 44   4 33 76  4 61  4557
MatMultAdd           108 1.0 8.7452e-02 1.1 1.11e+08 1.0 5.4e+02 2.3e+03 0.0e+00  1 30 23  0  0   4 30 68  4  0  5095
MatSolve              14 1.0 2.3307e-02 1.0 2.36e+07 1.0 0.0e+00 0.0e+00 0.0e+00  0  6  0  0  0   1  6  0  0  0  4049
MatLUFactorNum         1 1.0 8.3220e-03 1.1 2.27e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   0  1  0  0  0  1078
MatILUFactorSym        1 1.0 4.7340e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatConvert             2 1.0 1.0427e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatScale               2 1.0 1.4601e-03 1.0 1.50e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0  4106
MatAssemblyBegin       8 1.0 4.3786e-03 39.3 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  1   0  0  0  0  2     0
MatAssemblyEnd         8 1.0 2.0906e-02 1.0 0.00e+00 0.0 4.0e+01 4.8e+02 1.6e+01  0  0  2  0  3   1  0  5  0  5     0
MatGetRow         250000 1.0 2.3014e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   1  0  0  0  0     0
MatGetRowIJ            3 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetSubMatrix        4 1.0 8.7552e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00  0  0  0  0  1   0  0  0  0  2     0
MatGetOrdering         1 1.0 5.6911e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatZeroEntries         1 1.0 1.8420e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatAXPY                1 1.0 7.5296e-02 1.0 0.00e+00 0.0 2.0e+01 4.8e+02 1.2e+01  1  0  1  0  2   3  0  3  0  3     0
MatMatMult             1 1.0 5.0722e-02 1.0 2.75e+06 1.0 4.0e+01 1.5e+03 1.6e+01  1  1  2  0  3   2  1  5  0  5   217
MatMatMultSym          1 1.0 4.1227e-02 1.0 0.00e+00 0.0 3.5e+01 1.2e+03 1.4e+01  1  0  2  0  3   2  0  4  0  4     0
MatMatMultNum          1 1.0 9.5499e-03 1.0 2.75e+06 1.0 5.0e+00 3.8e+03 2.0e+00  0  1  0  0  0   0  1  1  0  1  1151
MatGetLocalMat         2 1.0 8.4138e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
MatGetBrAoCol          2 1.0 3.0017e-04 1.7 0.00e+00 0.0 2.0e+01 2.6e+03 0.0e+00  0  0  1  0  0   0  0  3  0  0     0
PCSetUp                4 1.0 9.3147e-01 1.0 6.52e+06 1.0 7.6e+01 1.3e+05 6.6e+01 13  2  3  2 13  39  2 10 32 19    28
PCSetUpOnBlocks       14 1.0 1.3659e-02 1.1 2.27e+06 1.0 0.0e+00 0.0e+00 0.0e+00  0  1  0  0  0   1  1  0  0  0   657
PCApply               14 1.0 1.6104e+00 1.0 3.99e+07 1.0 7.0e+01 1.9e+03 4.0e+00 22 11  3  0  1  68 11  9  0  1    99
KSPGMRESOrthog        13 1.0 3.8282e-02 1.0 1.14e+08 1.0 0.0e+00 0.0e+00 1.3e+01  1 31  0  0  3   2 31  0  0  4 11895
KSPSetUp               4 1.0 3.7081e-03 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
KSPSolve               1 1.0 1.9752e+00 1.0 3.70e+08 1.0 6.9e+02 1.7e+04 3.2e+02 27 100 30  2 66  84 100 86 36 92   749
SNESSolve              1 1.0 2.3453e+00 1.0 3.71e+08 1.0 7.8e+02 2.8e+04 3.4e+02 32 100 34  4 69  99 100 98 68 96   632
SNESFunctionEval       2 1.0 1.5018e-01 1.0 0.00e+00 0.0 9.6e+01 1.1e+05 1.4e+01  2  0  4  2  3   6  0 12 32  4     0
SNESJacobianEval       1 1.0 2.2324e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  3  0  0  0  0   9  0  0  0  0     0
SFBcastBegin           5 1.0 3.3138e-03 24.6 0.00e+00 0.0 8.0e+01 3.8e+03 0.0e+00  0  0  3  0  0   0  0 10  1  0     0
SFBcastEnd             5 1.0 7.2956e-05 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00  0  0  0  0  0   0  0  0  0  0     0
------------------------------------------------------------------------------------------------------------------------
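
(Note: "Event Stage 1: selfp" above is a user-defined PETSc logging stage, which is
why the solve is reported separately from the mesh setup in Stage 0. A minimal
petsc4py sketch of registering such a stage around the solve; the stage name matches
the log, but the surrounding code is illustrative, not the exact contents of the
attached script:

    from petsc4py import PETSc

    # Create a named logging stage; PETSc events that occur between
    # push() and pop() are reported under "Event Stage 1: selfp".
    selfp_stage = PETSc.Log.Stage("selfp")

    selfp_stage.push()
    solver.solve()    # the NonlinearVariationalSolver set up earlier
    selfp_stage.pop()
)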

Memory usage is given in bytes:

Object Type          Creations   Destructions     Memory  Descendants' Mem.
Reports information only for process 0.

--- Event Stage 0: Main Stage

           Container     6              3         1680     0
              Viewer     1              0            0     0
           Index Set    91             87     26252316     0
   IS L to G Mapping     7              3     14763160     0
             Section    66             49        32144     0
              Vector    26             49     85656568     0
      Vector Scatter     8              7      7509320     0
              Matrix    13              5     30224220     0
      Preconditioner     1              5         4984     0
       Krylov Solver     1              5        23296     0
                SNES     1              1         1324     0
      SNESLineSearch     1              1          856     0
              DMSNES     1              0            0     0
    Distributed Mesh    13              7        32792     0
    GraphPartitioner     5              4         2384     0
Star Forest Bipartite Graph    72             61        50176     0
     Discrete System    13              7         5880     0

--- Event Stage 1: selfp

           Index Set    19             16        16200     0
              Vector   183            145     78752808     0
      Vector Scatter     9              2         2128     0
              Matrix    13              8     28752528     0
      Preconditioner     5              1          872     0
       Krylov Solver     5              1         1296     0
     DMKSP interface     1              0            0     0
========================================================================================================================
Average time to get PetscTime(): 1.19209e-07
Average time for MPI_Barrier(): 8.10623e-07
Average time for zero size MPI_Send(): 2.02656e-06
#PETSc Option Table entries:
-log_summary
-selfp_fieldsplit_0_ksp_type preonly
-selfp_fieldsplit_0_pc_type bjacobi
-selfp_fieldsplit_0_sub_pc_type ilu
-selfp_fieldsplit_1_ksp_type preonly
-selfp_fieldsplit_1_pc_type hypre
-selfp_ksp_monitor_true_residual
-selfp_ksp_rtol 1e-07
-selfp_ksp_type gmres
-selfp_pc_fieldsplit_schur_fact_type upper
-selfp_pc_fieldsplit_schur_precondition selfp
-selfp_pc_fieldsplit_type schur
-selfp_pc_type fieldsplit
-selfp_snes_type ksponly
#End of PETSc Option Table entries
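
(Note: stripped of the selfp_ prefix, which is presumably set as an options prefix in
the attached script, the option table above corresponds to a Firedrake
solver_parameters dict along these lines. This is a sketch reconstructed from the
table, not the verbatim contents of mixed-poisson.py:

    solver = NonlinearVariationalSolver(
        problem,  # the NonlinearVariationalProblem being solved
        solver_parameters={
            "snes_type": "ksponly",
            "ksp_type": "gmres",
            "ksp_rtol": 1e-7,
            "ksp_monitor_true_residual": True,
            "pc_type": "fieldsplit",
            "pc_fieldsplit_type": "schur",
            "pc_fieldsplit_schur_fact_type": "upper",
            "pc_fieldsplit_schur_precondition": "selfp",
            "fieldsplit_0_ksp_type": "preonly",
            "fieldsplit_0_pc_type": "bjacobi",
            "fieldsplit_0_sub_pc_type": "ilu",
            "fieldsplit_1_ksp_type": "preonly",
            "fieldsplit_1_pc_type": "hypre",
        })
)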
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --download-chaco --download-ctetgen --download-exodusii=1 --download-fblaslapack --download-hdf5 --download-hypre=1 --download-metis --download-netcdf=1 --download-parmetis --download-triangle --with-cc=mpicc --with-cmake=cmake --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-mpiexec=mpiexec --with-shared-libraries=1 --with-valgrind=1 CFLAGS= COPTFLAGS=-O3 CXXFLAGS= CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt
-----------------------------------------
Libraries compiled on Mon Jul 13 02:29:52 2015 on opuntia.cacds.uh.edu 
Machine characteristics: Linux-2.6.32-504.1.3.el6.x86_64-x86_64-with-redhat-6.6-Santiago
Using PETSc directory: /home/jchang23/petsc-dev
Using PETSc arch: arch-linux2-c-opt
-----------------------------------------

Using C compiler: mpicc  -fPIC -O3  ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: mpif90  -fPIC -O3   ${FOPTFLAGS} ${FFLAGS} 
-----------------------------------------

Using include paths: -I/home/jchang23/petsc-dev/arch-linux2-c-opt/include -I/home/jchang23/petsc-dev/include -I/home/jchang23/petsc-dev/include -I/home/jchang23/petsc-dev/arch-linux2-c-opt/include -I/share/apps/intel/impi/5.0.2.044/intel64/include
-----------------------------------------

Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries: -Wl,-rpath,/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -L/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -L/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -lHYPRE -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -L/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -Wl,-rpath,/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -L/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib -L/share/apps/gcc-4.9.2/lib -lmpicxx -Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib/release_mt -Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lctetgen -lssl -lcrypto -lifport -lifcore -lm -lmpicxx -ldl -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -lmpifort -lmpi -lmpigi -lrt -lpthread -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -L/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -L/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 
-L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib -L/share/apps/gcc-4.9.2/lib -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib/release_mt -Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -L/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -L/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib -L/share/apps/gcc-4.9.2/lib -ldl 
-----------------------------------------
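
(Note: the Flops column in the event table is PETSc's manual flop count, e.g.
3.71e+08 for SNESSolve on this process; the hardware numbers come from wrapping the
solve in PAPI counters. A rough sketch of that pattern, using the flozz/pypapi
bindings purely for illustration; the PyPAPI wrapper actually used here may expose
different calls, and the 8-bytes-per-access arithmetic-intensity estimate is a
convention, not a measurement:

    from pypapi import papi_high
    from pypapi import events as papi_events

    # Count floating-point operations, load instructions and store
    # instructions over the solve.
    papi_high.start_counters([papi_events.PAPI_FP_OPS,
                              papi_events.PAPI_LD_INS,
                              papi_events.PAPI_SR_INS])
    solver.solve()
    flops, loads, stores = papi_high.stop_counters()

    # Estimated arithmetic intensity, assuming each load/store moves
    # one 8-byte double.
    ai = flops / (8.0 * (loads + stores))
    print("FLOPs:", flops, "AI estimate:", ai)
)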
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mixed-poisson.py
Type: text/x-python
Size: 6688 bytes
Desc: not available
URL: <http://mailman.ic.ac.uk/pipermail/firedrake/attachments/20150716/37cc2c98/attachment.py>

