[firedrake] Hardware counters for Firedrake
Justin Chang
jychang48 at gmail.com
Thu Jul 16 14:32:45 BST 2015
Lawrence,
I have attached the code I am working with. It's basically the one you sent
me a few weeks ago, but I am only working with selfp. Also attached are the
log files from runs with 1, 2, and 4 processors on our local HPC machine
(Intel Xeon E5-2680v2, 2.8 GHz).
1) I wrapped the PyPAPI calls around solver.solve(). I guess this is doing
what I want. Right now I am estimating the arithmetic intensity by
recording the FLOPs, loads, and stores. When I compare the measured FLOPs
with the PETSc manual FLOP count, PAPI seems to over-count by a factor of 2
(which I suppose is expected coming from a new Intel machine). Anyway, in
terms of computing the FLOPs and AI this is what I want; I just wanted to
make sure these counts don't include the DMPlex initialization and so on,
because:
2) According to the attached log summaries, it seems DMPlexDistribute and
Mesh Migration still consume a significant portion of the time. By
significant I mean that their %T does not shrink as I increase the number
of processors. I remember Michael Lange's presentations (from PETSc-20 and
the webinar) mentioning something about this?
3) Bonus question: how do I also use
PAPI_flops(&real_time, &proc_time, &flpins, &mflops)? I see there is the
flops() function, but in my limited PAPI experience I seem to run into
issues whenever I put both that and PAPI_start_counters into the same
program, though I could be wrong.
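For concreteness, the measurement pattern in point 1) can be sketched as below. This is a minimal, self-contained stand-in: the start/stop callables replace the actual PyPAPI counter calls (which are not shown in this thread), the counter values are made up, and the 8-bytes-per-access assumption treats every load and store as one double. Only the arithmetic-intensity formula itself is meant literally.

```python
# Sketch of wrapping hardware counters around a solve and deriving the
# arithmetic intensity (AI). start/stop are stand-ins for PyPAPI calls.

def arithmetic_intensity(flops, loads, stores, bytes_per_access=8):
    """FLOPs per byte moved, treating each load/store as one 8-byte double."""
    return flops / ((loads + stores) * bytes_per_access)

class CounterWrapper:
    """Brackets a solve call between counter start/stop, the way the PAPI
    calls bracket solver.solve(). The callables here are injected stubs."""

    def __init__(self, start, stop):
        self._start, self._stop = start, stop
        self.counts = None

    def run(self, solve):
        self._start()
        solve()
        self.counts = self._stop()   # (flops, loads, stores)
        return self.counts

# Fake counter readings standing in for real hardware values:
wrapper = CounterWrapper(start=lambda: None,
                         stop=lambda: (2.4e9, 1.0e9, 5.0e8))
flops, loads, stores = wrapper.run(solve=lambda: None)
ai = arithmetic_intensity(flops, loads, stores)
```

With these placeholder numbers the sketch gives an AI of 2.4e9 / ((1.5e9) * 8) = 0.2 FLOPs/byte; the real values would come from the counters around solver.solve().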
Thanks,
Justin
On Thu, Jul 16, 2015 at 3:46 AM, Lawrence Mitchell
<lawrence.mitchell at imperial.ac.uk> wrote:
> On 15/07/15 21:14, Justin Chang wrote:
> > First option works wonderfully for me, but now I am wondering how
> > I would employ the second option.
> >
> > Specifically, I want to profile SNESSolve()
>
> OK, so calls out to PETSc are made from Python (via petsc4py). It's
> only the calls to integral assembly (i.e. evaluation of Jacobians and
> residuals) that go through a generated code path.
>
> To be more concrete, let's say you have the following code:
>
> F = some_residual
>
> problem = NonlinearVariationalProblem(F, u, ...)
>
> solver = NonlinearVariationalSolver(problem)
>
> solver.solve()
>
> Then the call chain inside solver.solve is effectively:
>
> solver.solve ->
> SNESSolve -> # via petsc4py
> SNESComputeJacobian ->
> assemble(Jacobian) # Callback to Firedrake
> SNESComputeFunction ->
> assemble(residual) # Callback to Firedrake
> KSPSolve
>
> So if you wrapped flop counting around the outermost solver.solve()
> call, you're pretty close to wrapping SNESSolve.
>
> Or do you mean something else when profiling SNESSolve?
>
> > I would prefer to circumvent profiling of the DMPlex distribution
> > because it seems that is a major bottleneck for multiple processes
> > at the moment.
>
> Can you provide an example mesh/process count that demonstrates this
> issue, or at least characterize it a little better? Over the last nine
> months or so, Michael Lange and Matt Knepley have done a lot of work on
> making DMPlexDistribute much faster than it was. So if it turns out
> that it is still slow, we'd really like to know about it and try to
> fix it.
>
> Cheers,
>
> Lawrence
> _______________________________________________
> firedrake mailing list
> firedrake at imperial.ac.uk
> https://mailman.ic.ac.uk/mailman/listinfo/firedrake
>
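The call chain Lawrence sketches above can be mimicked with a toy model to make the point concrete: because the assembly callbacks run inside solve(), flop counters wrapped around the outermost solver.solve() see the assembly work as well as the Krylov solve. This is pure Python with illustrative names only, not the real petsc4py/Firedrake API.

```python
# Toy model of the SNES callback structure: the outer solve drives the
# (pretend) PETSc loop, which calls back into (pretend) Firedrake for
# assembly. All class, method, and event names are illustrative.

class ToySNES:
    def __init__(self, assemble_jacobian, assemble_residual):
        self.assemble_jacobian = assemble_jacobian
        self.assemble_residual = assemble_residual
        self.events = []

    def solve(self):
        # One "Newton step": Jacobian and residual assembly, then KSP.
        self.events.append("SNESComputeJacobian")
        self.assemble_jacobian()       # callback to "Firedrake"
        self.events.append("SNESComputeFunction")
        self.assemble_residual()       # callback to "Firedrake"
        self.events.append("KSPSolve")

assembly_log = []
snes = ToySNES(lambda: assembly_log.append("assemble(Jacobian)"),
               lambda: assembly_log.append("assemble(residual)"))
snes.solve()
# Both assemblies happened inside solve(), so counters bracketing the
# solve() call capture them along with the linear-solve flops.
```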
-------------- next part --------------
Residual norms for selfp_ solve.
0 KSP preconditioned resid norm 1.329954740724e+02 true resid norm 8.936413686950e-03 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 1.235753253827e+01 true resid norm 1.136864734894e+00 ||r(i)||/||b|| 1.272171113289e+02
2 KSP preconditioned resid norm 1.331959321130e+00 true resid norm 2.927389992008e-01 ||r(i)||/||b|| 3.275799548406e+01
3 KSP preconditioned resid norm 2.005159390419e-01 true resid norm 1.428098374310e-01 ||r(i)||/||b|| 1.598066544743e+01
4 KSP preconditioned resid norm 5.055808525009e-02 true resid norm 4.070690514827e-02 ||r(i)||/||b|| 4.555172418632e+00
5 KSP preconditioned resid norm 2.625628299328e-02 true resid norm 3.056686404606e-02 ||r(i)||/||b|| 3.420484449001e+00
6 KSP preconditioned resid norm 4.509710834114e-03 true resid norm 3.247821966385e-03 ||r(i)||/||b|| 3.634368416860e-01
7 KSP preconditioned resid norm 1.372042183925e-03 true resid norm 5.355108948802e-04 ||r(i)||/||b|| 5.992458648845e-02
8 KSP preconditioned resid norm 2.617267203849e-04 true resid norm 1.905031177743e-04 ||r(i)||/||b|| 2.131762521832e-02
9 KSP preconditioned resid norm 1.015904451656e-04 true resid norm 6.258061304298e-05 ||r(i)||/||b|| 7.002877802576e-03
10 KSP preconditioned resid norm 3.035554053736e-05 true resid norm 2.203965203820e-05 ||r(i)||/||b|| 2.466274817871e-03
11 KSP preconditioned resid norm 7.233713447626e-06 true resid norm 4.400071472157e-06 ||r(i)||/||b|| 4.923755352310e-04
Total FLOPS: 2.457323e+09
0.18449829301
norm = 0.000001
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
mixed-poisson.py on a arch-linux2-c-opt named compute-0-0.local with 1 processor, by jchang23 Thu Jul 16 08:20:18 2015
Using Petsc Development GIT revision: v3.6-175-g274dabd GIT Date: 2015-07-10 22:30:57 +0100
Max Max/Min Avg Total
Time (sec): 1.060e+01 1.00000 1.060e+01
Objects: 3.150e+02 1.00000 3.150e+02
Flops: 1.187e+09 1.00000 1.187e+09 1.187e+09
Flops/sec: 1.120e+08 1.00000 1.120e+08 1.120e+08
MPI Messages: 0.000e+00 0.00000 0.000e+00 0.000e+00
MPI Message Lengths: 0.000e+00 0.00000 0.000e+00 0.000e+00
MPI Reductions: 0.000e+00 0.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 4.9004e+00 46.3% 0.0000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
1: selfp: 5.6949e+00 53.7% 1.1865e+09 100.0% 0.000e+00 0.0% 0.000e+00 0.0% 0.000e+00 0.0%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 10e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecSet 9 1.0 1.4837e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 6 1.0 5.8479e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyBegin 4 1.0 4.5300e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyEnd 4 1.0 3.8953e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
DMPlexInterp 1 1.0 7.3866e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 7 0 0 0 0 15 0 0 0 0 0
DMPlexStratify 3 1.0 2.1979e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 4 0 0 0 0 0
SFSetGraph 7 1.0 2.9490e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
--- Event Stage 1: selfp
VecMDot 11 1.0 4.7193e-02 1.0 1.65e+08 1.0 0.0e+00 0.0e+00 0.0e+00 0 14 0 0 0 1 14 0 0 0 3499
VecNorm 25 1.0 1.5265e-02 1.0 6.25e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 5 0 0 0 0 5 0 0 0 4098
VecScale 24 1.0 9.8898e-03 1.0 2.40e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 2429
VecCopy 16 1.0 2.2394e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 158 1.0 1.3032e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 2 0 0 0 0 0
VecAXPY 13 1.0 1.4370e-02 1.0 3.25e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 2263
VecAYPX 12 1.0 1.4731e-02 1.0 1.50e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1019
VecMAXPY 23 1.0 1.1603e-01 1.0 3.58e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 30 0 0 0 2 30 0 0 0 3083
VecScatterBegin 68 1.0 3.9157e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
VecNormalize 12 1.0 1.4005e-02 1.0 4.50e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 4 0 0 0 0 4 0 0 0 3216
MatMult 35 1.0 3.3133e-01 1.0 4.19e+08 1.0 0.0e+00 0.0e+00 0.0e+00 3 35 0 0 0 6 35 0 0 0 1263
MatMultAdd 92 1.0 2.7395e-01 1.0 3.80e+08 1.0 0.0e+00 0.0e+00 0.0e+00 3 32 0 0 0 5 32 0 0 0 1385
MatSolve 12 1.0 7.3616e-02 1.0 8.10e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 7 0 0 0 1 7 0 0 0 1100
MatLUFactorNum 1 1.0 3.1441e-02 1.0 8.97e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 1 0 0 0 285
MatILUFactorSym 1 1.0 2.6452e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatConvert 2 1.0 3.8408e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
MatScale 2 1.0 4.7297e-03 1.0 6.00e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1268
MatAssemblyBegin 6 1.0 1.9073e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyEnd 6 1.0 3.9749e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
MatGetRow 1000000 1.0 6.0956e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
MatGetRowIJ 2 1.0 1.1921e-06 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetSubMatrix 4 1.0 3.4897e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
MatGetOrdering 1 1.0 2.3370e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatZeroEntries 1 1.0 9.2080e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAXPY 1 1.0 2.1363e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 4 0 0 0 0 0
MatMatMult 1 1.0 1.0431e-01 1.0 1.50e+07 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 2 1 0 0 0 144
MatMatMultSym 1 1.0 7.0812e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 1 0 0 0 0 1 0 0 0 0 0
MatMatMultNum 1 1.0 3.3484e-02 1.0 1.50e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 1 0 0 0 448
PCSetUp 4 1.0 2.0950e+00 1.0 3.00e+07 1.0 0.0e+00 0.0e+00 0.0e+00 20 3 0 0 0 37 3 0 0 0 14
PCSetUpOnBlocks 12 1.0 6.0278e-02 1.0 8.97e+06 1.0 0.0e+00 0.0e+00 0.0e+00 1 1 0 0 0 1 1 0 0 0 149
PCApply 12 1.0 4.1145e+00 1.0 1.38e+08 1.0 0.0e+00 0.0e+00 0.0e+00 39 12 0 0 0 72 12 0 0 0 34
KSPGMRESOrthog 11 1.0 1.0059e-01 1.0 3.30e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 28 0 0 0 2 28 0 0 0 3283
KSPSetUp 4 1.0 1.3707e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 5.1125e+00 1.0 1.18e+09 1.0 0.0e+00 0.0e+00 0.0e+00 48 100 0 0 0 90 100 0 0 0 232
SNESSolve 1 1.0 5.6778e+00 1.0 1.19e+09 1.0 0.0e+00 0.0e+00 0.0e+00 54 100 0 0 0 100 100 0 0 0 209
SNESFunctionEval 2 1.0 2.2012e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 4 0 0 0 0 0
SNESJacobianEval 1 1.0 3.4403e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 6 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Container 6 3 1680 0
Viewer 1 0 0 0
Index Set 23 19 30062608 0
IS L to G Mapping 4 0 0 0
Section 24 6 3936 0
Vector 16 31 194215136 0
Vector Scatter 4 6 3888 0
Matrix 5 3 112983612 0
Preconditioner 1 5 4984 0
Krylov Solver 1 5 23296 0
SNES 1 1 1324 0
SNESLineSearch 1 1 856 0
DMSNES 1 0 0 0
Distributed Mesh 9 3 14128 0
GraphPartitioner 3 2 1192 0
Star Forest Bipartite Graph 21 10 7840 0
Discrete System 9 3 2520 0
--- Event Stage 1: selfp
Index Set 15 12 9216 0
Vector 148 121 274399920 0
Vector Scatter 6 0 0 0
Matrix 5 2 38021356 0
Preconditioner 5 1 992 0
Krylov Solver 5 1 1296 0
DMKSP interface 1 0 0 0
========================================================================================================================
Average time to get PetscTime(): 0
#PETSc Option Table entries:
-log_summary
-selfp_fieldsplit_0_ksp_type preonly
-selfp_fieldsplit_0_pc_type bjacobi
-selfp_fieldsplit_0_sub_pc_type ilu
-selfp_fieldsplit_1_ksp_type preonly
-selfp_fieldsplit_1_pc_type hypre
-selfp_ksp_monitor_true_residual
-selfp_ksp_rtol 1e-07
-selfp_ksp_type gmres
-selfp_pc_fieldsplit_schur_fact_type upper
-selfp_pc_fieldsplit_schur_precondition selfp
-selfp_pc_fieldsplit_type schur
-selfp_pc_type fieldsplit
-selfp_snes_type ksponly
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --download-chaco --download-ctetgen --download-exodusii=1 --download-fblaslapack --download-hdf5 --download-hypre=1 --download-metis --download-netcdf=1 --download-parmetis --download-triangle --with-cc=mpicc --with-cmake=cmake --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-mpiexec=mpiexec --with-shared-libraries=1 --with-valgrind=1 CFLAGS= COPTFLAGS=-O3 CXXFLAGS= CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt
-----------------------------------------
Libraries compiled on Mon Jul 13 02:29:52 2015 on opuntia.cacds.uh.edu
Machine characteristics: Linux-2.6.32-504.1.3.el6.x86_64-x86_64-with-redhat-6.6-Santiago
Using PETSc directory: /home/jchang23/petsc-dev
Using PETSc arch: arch-linux2-c-opt
-----------------------------------------
Using C compiler: mpicc -fPIC -O3 ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: mpif90 -fPIC -O3 ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/home/jchang23/petsc-dev/arch-linux2-c-opt/include -I/home/jchang23/petsc-dev/include -I/home/jchang23/petsc-dev/include -I/home/jchang23/petsc-dev/arch-linux2-c-opt/include -I/share/apps/intel/impi/5.0.2.044/intel64/include
-----------------------------------------
Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries: -Wl,-rpath,/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -L/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -L/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -lHYPRE -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -L/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -Wl,-rpath,/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -L/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib -L/share/apps/gcc-4.9.2/lib -lmpicxx -Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib/release_mt -Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lctetgen -lssl -lcrypto -lifport -lifcore -lm -lmpicxx -ldl -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -lmpifort -lmpi -lmpigi -lrt -lpthread -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt 
-Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -L/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -L/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib -L/share/apps/gcc-4.9.2/lib -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib/release_mt 
-Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -L/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -L/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib 
-L/share/apps/gcc-4.9.2/lib -ldl
-----------------------------------------
-------------- next part --------------
Residual norms for selfp_ solve.
0 KSP preconditioned resid norm 1.314485425210e+02 true resid norm 8.936413686950e-03 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 1.181230440781e+01 true resid norm 1.139486425923e+00 ||r(i)||/||b|| 1.275104830461e+02
2 KSP preconditioned resid norm 1.430744382617e+00 true resid norm 3.530346948857e-01 ||r(i)||/||b|| 3.950518712012e+01
3 KSP preconditioned resid norm 2.245172837560e-01 true resid norm 1.378785534898e-01 ||r(i)||/||b|| 1.542884632693e+01
4 KSP preconditioned resid norm 5.862664561242e-02 true resid norm 5.365277299917e-02 ||r(i)||/||b|| 6.003837207931e+00
5 KSP preconditioned resid norm 2.446376192135e-02 true resid norm 2.280110185076e-02 ||r(i)||/||b|| 2.551482356289e+00
6 KSP preconditioned resid norm 5.441132513221e-03 true resid norm 3.575961722808e-03 ||r(i)||/||b|| 4.001562425462e-01
7 KSP preconditioned resid norm 1.706968562957e-03 true resid norm 9.450031837969e-04 ||r(i)||/||b|| 1.057474750947e-01
8 KSP preconditioned resid norm 5.056345293112e-04 true resid norm 3.854704900981e-04 ||r(i)||/||b|| 4.313480816818e-02
9 KSP preconditioned resid norm 1.553039159700e-04 true resid norm 1.467820388361e-04 ||r(i)||/||b|| 1.642516158920e-02
10 KSP preconditioned resid norm 7.892034171071e-05 true resid norm 7.550996400946e-05 ||r(i)||/||b|| 8.449694324216e-03
11 KSP preconditioned resid norm 2.548635548579e-05 true resid norm 2.158211858106e-05 ||r(i)||/||b|| 2.415076040244e-03
12 KSP preconditioned resid norm 1.004822326206e-05 true resid norm 1.142457186034e-05 ||r(i)||/||b|| 1.278429161916e-03
Total FLOPS: 1.344459e+09
0.143665883646
Total FLOPS: 1.372575e+09
0.148051365581
norm = 0.000001
norm = 0.000001
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
mixed-poisson.py on a arch-linux2-c-opt named compute-0-0.local with 2 processors, by jchang23 Thu Jul 16 08:20:44 2015
Using Petsc Development GIT revision: v3.6-175-g274dabd GIT Date: 2015-07-10 22:30:57 +0100
Max Max/Min Avg Total
Time (sec): 9.384e+00 1.00007 9.384e+00
Objects: 5.350e+02 1.01518 5.310e+02
Flops: 6.647e+08 1.00089 6.644e+08 1.329e+09
Flops/sec: 7.083e+07 1.00082 7.080e+07 1.416e+08
MPI Messages: 3.450e+02 1.23878 3.118e+02 6.235e+02
MPI Message Lengths: 3.543e+08 1.63156 9.165e+05 5.714e+08
MPI Reductions: 4.710e+02 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 5.7043e+00 60.8% 0.0000e+00 0.0% 4.585e+02 73.5% 8.671e+05 94.6% 1.370e+02 29.1%
1: selfp: 3.6794e+00 39.2% 1.3288e+09 100.0% 1.650e+02 26.5% 4.931e+04 5.4% 3.330e+02 70.7%
------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecSet 4 1.0 6.9141e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 6 1.0 2.6169e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterEnd 6 1.0 5.0068e-06 1.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyBegin 4 1.0 5.0068e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyEnd 4 1.0 3.3234e-02 1.1 0.00e+00 0.0 8.0e+00 1.3e+03 3.2e+01 0 0 1 0 7 1 0 2 0 23 0
Mesh Partition 2 1.0 7.4749e-01 1.1 0.00e+00 0.0 7.8e+01 5.4e+05 8.0e+00 8 0 13 7 2 13 0 17 8 6 0
Mesh Migration 2 1.0 1.0836e+00 1.0 0.00e+00 0.0 3.3e+02 1.4e+06 1.8e+01 12 0 52 77 4 19 0 71 82 13 0
DMPlexInterp 1 1.0 7.3538e-01 73438.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 4 0 0 0 0 6 0 0 0 0 0
DMPlexDistribute 1 1.0 1.2285e+00 1.1 0.00e+00 0.0 1.4e+02 1.9e+06 5.0e+00 13 0 23 48 1 21 0 32 51 4 0
DMPlexDistCones 2 1.0 1.8938e-01 1.0 0.00e+00 0.0 4.8e+01 2.5e+06 0.0e+00 2 0 8 21 0 3 0 10 22 0 0
DMPlexDistLabels 2 1.0 6.2104e-01 1.0 0.00e+00 0.0 2.1e+02 1.3e+06 0.0e+00 7 0 34 46 0 11 0 46 49 0 0
DMPlexDistribOL 1 1.0 6.1585e-01 1.0 0.00e+00 0.0 2.7e+02 9.2e+05 2.1e+01 7 0 43 43 4 11 0 59 45 15 0
DMPlexDistField 3 1.0 5.0144e-02 1.1 0.00e+00 0.0 5.5e+01 5.9e+05 6.0e+00 1 0 9 6 1 1 0 12 6 4 0
DMPlexDistData 2 1.0 3.4269e-01 12.8 0.00e+00 0.0 4.6e+01 3.9e+05 0.0e+00 2 0 7 3 0 3 0 10 3 0 0
DMPlexStratify 5 1.2 3.6035e-01 2.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 4 0 0 0 0 0
SFSetGraph 51 1.0 2.9863e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 5 0 0 0 0 0
SFBcastBegin 94 1.0 3.8696e-01 2.5 0.00e+00 0.0 4.2e+02 1.2e+06 0.0e+00 3 0 68 89 0 5 0 92 94 0 0
SFBcastEnd 94 1.0 2.4197e-01 2.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 3 0 0 0 0 0
SFReduceBegin 4 1.0 2.5640e-03 1.5 0.00e+00 0.0 9.5e+00 1.3e+06 0.0e+00 0 0 2 2 0 0 0 2 2 0 0
SFReduceEnd 4 1.0 3.4568e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFFetchOpBegin 1 1.0 5.9605e-06 3.1 0.00e+00 0.0 1.0e+00 4.2e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFFetchOpEnd 1 1.0 3.6907e-04 12.7 0.00e+00 0.0 1.0e+00 4.2e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
--- Event Stage 1: selfp
VecMDot 12 1.0 2.9539e-02 1.1 9.76e+07 1.0 0.0e+00 0.0e+00 1.2e+01 0 15 0 0 3 1 15 0 0 4 6607
VecNorm 27 1.0 1.2375e-02 1.6 3.38e+07 1.0 0.0e+00 0.0e+00 2.7e+01 0 5 0 0 6 0 5 0 0 8 5459
VecScale 26 1.0 5.1279e-03 1.0 1.30e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 5075
VecCopy 17 1.0 1.1794e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 120 1.0 2.6371e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
VecAXPY 14 1.0 6.1862e-03 1.0 1.75e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 5662
VecAYPX 13 1.0 7.4878e-03 1.0 8.13e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 2172
VecMAXPY 25 1.0 6.0983e-02 1.0 2.10e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 32 0 0 0 2 32 0 0 0 6893
VecScatterBegin 185 1.0 2.3257e-02 1.0 0.00e+00 0.0 1.1e+02 4.9e+03 0.0e+00 0 0 18 0 0 1 0 68 2 0 0
VecScatterEnd 185 1.0 6.4373e-04 1.5 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecNormalize 13 1.0 8.5566e-03 1.2 2.44e+07 1.0 0.0e+00 0.0e+00 1.3e+01 0 4 0 0 3 0 4 0 0 4 5702
MatMult 38 1.0 1.8064e-01 1.0 2.27e+08 1.0 1.1e+02 4.9e+03 2.0e+02 2 34 18 0 42 5 34 68 2 60 2518
MatMultAdd 100 1.0 1.4863e-01 1.0 2.06e+08 1.0 1.0e+02 5.0e+03 0.0e+00 2 31 16 0 0 4 31 61 2 0 2776
MatSolve 13 1.0 4.0525e-02 1.0 4.39e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 7 0 0 0 1 7 0 0 0 2164
MatLUFactorNum 1 1.0 1.6012e-02 1.0 4.50e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 560
MatILUFactorSym 1 1.0 8.9319e-03 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatConvert 2 1.0 1.9395e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
MatScale 2 1.0 2.6979e-03 1.0 3.00e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 2222
MatAssemblyBegin 8 1.0 2.0168e-03 5.1 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0
MatAssemblyEnd 8 1.0 3.8538e-02 1.0 0.00e+00 0.0 8.0e+00 1.0e+03 1.6e+01 0 0 1 0 3 1 0 5 0 5 0
MatGetRow 500000 1.0 4.3154e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
MatGetRowIJ 3 1.0 1.9073e-06 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetSubMatrix 4 1.0 1.6660e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0
MatGetOrdering 1 1.0 1.0438e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatZeroEntries 1 1.0 3.3541e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAXPY 1 1.0 1.4154e-01 1.0 0.00e+00 0.0 4.0e+00 1.0e+03 1.2e+01 2 0 1 0 3 4 0 2 0 4 0
MatMatMult 1 1.0 9.3379e-02 1.0 5.50e+06 1.0 8.0e+00 3.4e+03 1.6e+01 1 1 1 0 3 3 1 5 0 5 118
MatMatMultSym 1 1.0 7.5960e-02 1.0 0.00e+00 0.0 7.0e+00 2.7e+03 1.4e+01 1 0 1 0 3 2 0 4 0 4 0
MatMatMultNum 1 1.0 1.7408e-02 1.0 5.50e+06 1.0 1.0e+00 8.4e+03 2.0e+00 0 1 0 0 0 0 1 1 0 1 631
MatGetLocalMat 2 1.0 1.4361e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetBrAoCol 2 1.0 3.2091e-04 1.3 0.00e+00 0.0 4.0e+00 5.8e+03 0.0e+00 0 0 1 0 0 0 0 2 0 0 0
PCSetUp 4 1.0 1.4880e+00 1.0 1.30e+07 1.0 2.0e+01 5.0e+05 6.6e+01 16 2 3 2 14 40 2 12 33 20 17
PCSetUpOnBlocks 13 1.0 2.6026e-02 1.0 4.50e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 1 0 0 0 345
PCApply 13 1.0 2.5847e+00 1.0 7.44e+07 1.0 1.3e+01 4.2e+03 4.0e+00 28 11 2 0 1 70 11 8 0 1 58
KSPGMRESOrthog 12 1.0 5.6044e-02 1.0 1.95e+08 1.0 0.0e+00 0.0e+00 1.2e+01 1 29 0 0 3 2 29 0 0 4 6964
KSPSetUp 4 1.0 5.8489e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 3.1988e+00 1.0 6.63e+08 1.0 1.3e+02 8.0e+04 3.0e+02 34 100 21 2 65 87 100 81 34 92 415
SNESSolve 1 1.0 3.6553e+00 1.0 6.65e+08 1.0 1.6e+02 1.3e+05 3.2e+02 39 100 25 4 68 99 100 95 67 96 364
SNESFunctionEval 2 1.0 1.7999e-01 1.0 0.00e+00 0.0 2.4e+01 4.2e+05 1.4e+01 2 0 4 2 3 5 0 15 33 4 0
SNESJacobianEval 1 1.0 2.7929e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 8 0 0 0 0 0
SFBcastBegin 5 1.0 3.7880e-03 26.9 0.00e+00 0.0 1.6e+01 8.3e+03 0.0e+00 0 0 3 0 0 0 0 10 0 0 0
SFBcastEnd 5 1.0 5.3644e-05 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Container 6 3 1680 0
Viewer 1 0 0 0
Index Set 85 81 38264488 0
IS L to G Mapping 7 3 19646728 0
Section 66 49 32144 0
Vector 26 39 109166240 0
Vector Scatter 8 7 15013736 0
Matrix 13 5 60464524 0
Preconditioner 1 5 4984 0
Krylov Solver 1 5 23296 0
SNES 1 1 1324 0
SNESLineSearch 1 1 856 0
DMSNES 1 0 0 0
Distributed Mesh 13 7 32792 0
GraphPartitioner 5 4 2384 0
Star Forest Bipartite Graph 72 61 50176 0
Discrete System 13 7 5880 0
--- Event Stage 1: selfp
Index Set 19 16 16480 0
Vector 163 135 147278688 0
Vector Scatter 9 2 2128 0
Matrix 13 8 57495728 0
Preconditioner 5 1 872 0
Krylov Solver 5 1 1296 0
DMKSP interface 1 0 0 0
========================================================================================================================
Average time to get PetscTime(): 9.53674e-08
Average time for MPI_Barrier(): 6.19888e-07
Average time for zero size MPI_Send(): 2.98023e-06
#PETSc Option Table entries:
-log_summary
-selfp_fieldsplit_0_ksp_type preonly
-selfp_fieldsplit_0_pc_type bjacobi
-selfp_fieldsplit_0_sub_pc_type ilu
-selfp_fieldsplit_1_ksp_type preonly
-selfp_fieldsplit_1_pc_type hypre
-selfp_ksp_monitor_true_residual
-selfp_ksp_rtol 1e-07
-selfp_ksp_type gmres
-selfp_pc_fieldsplit_schur_fact_type upper
-selfp_pc_fieldsplit_schur_precondition selfp
-selfp_pc_fieldsplit_type schur
-selfp_pc_type fieldsplit
-selfp_snes_type ksponly
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --download-chaco --download-ctetgen --download-exodusii=1 --download-fblaslapack --download-hdf5 --download-hypre=1 --download-metis --download-netcdf=1 --download-parmetis --download-triangle --with-cc=mpicc --with-cmake=cmake --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-mpiexec=mpiexec --with-shared-libraries=1 --with-valgrind=1 CFLAGS= COPTFLAGS=-O3 CXXFLAGS= CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt
-----------------------------------------
Libraries compiled on Mon Jul 13 02:29:52 2015 on opuntia.cacds.uh.edu
Machine characteristics: Linux-2.6.32-504.1.3.el6.x86_64-x86_64-with-redhat-6.6-Santiago
Using PETSc directory: /home/jchang23/petsc-dev
Using PETSc arch: arch-linux2-c-opt
-----------------------------------------
Using C compiler: mpicc -fPIC -O3 ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: mpif90 -fPIC -O3 ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/home/jchang23/petsc-dev/arch-linux2-c-opt/include -I/home/jchang23/petsc-dev/include -I/home/jchang23/petsc-dev/include -I/home/jchang23/petsc-dev/arch-linux2-c-opt/include -I/share/apps/intel/impi/5.0.2.044/intel64/include
-----------------------------------------
Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries: -Wl,-rpath,/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -L/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -L/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -lHYPRE -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -L/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -Wl,-rpath,/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -L/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib -L/share/apps/gcc-4.9.2/lib -lmpicxx -Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib/release_mt -Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lctetgen -lssl -lcrypto -lifport -lifcore -lm -lmpicxx -ldl -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -lmpifort -lmpi -lmpigi -lrt -lpthread -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt 
-Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -L/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -L/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib -L/share/apps/gcc-4.9.2/lib -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib/release_mt 
-Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -L/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -L/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib 
-L/share/apps/gcc-4.9.2/lib -ldl
-----------------------------------------
-------------- next part --------------
Residual norms for selfp_ solve.
0 KSP preconditioned resid norm 1.244799003682e+02 true resid norm 8.936413686950e-03 ||r(i)||/||b|| 1.000000000000e+00
1 KSP preconditioned resid norm 1.077430018392e+01 true resid norm 1.135750781502e+00 ||r(i)||/||b|| 1.270924580362e+02
2 KSP preconditioned resid norm 1.505255974309e+00 true resid norm 3.922502513617e-01 ||r(i)||/||b|| 4.389347506758e+01
3 KSP preconditioned resid norm 2.307791848484e-01 true resid norm 1.372036571657e-01 ||r(i)||/||b|| 1.535332427214e+01
4 KSP preconditioned resid norm 6.562715074666e-02 true resid norm 5.709626909902e-02 ||r(i)||/||b|| 6.389170320348e+00
5 KSP preconditioned resid norm 2.402562173647e-02 true resid norm 2.385325465118e-02 ||r(i)||/||b|| 2.669220057036e+00
6 KSP preconditioned resid norm 5.233903411575e-03 true resid norm 3.780925699813e-03 ||r(i)||/||b|| 4.230920626845e-01
7 KSP preconditioned resid norm 2.077046170288e-03 true resid norm 1.290043295616e-03 ||r(i)||/||b|| 1.443580546747e-01
8 KSP preconditioned resid norm 7.129909645110e-04 true resid norm 5.397953356046e-04 ||r(i)||/||b|| 6.040402274493e-02
9 KSP preconditioned resid norm 1.938698927211e-04 true resid norm 2.092495692156e-04 ||r(i)||/||b|| 2.341538524802e-02
10 KSP preconditioned resid norm 1.051661965597e-04 true resid norm 9.821862152141e-05 ||r(i)||/||b|| 1.099083200063e-02
11 KSP preconditioned resid norm 3.347978145221e-05 true resid norm 2.588115205557e-05 ||r(i)||/||b|| 2.896145250456e-03
12 KSP preconditioned resid norm 1.296607156832e-05 true resid norm 1.305570508666e-05 ||r(i)||/||b|| 1.460955764137e-03
13 KSP preconditioned resid norm 4.310568985079e-06 true resid norm 4.048377784223e-06 ||r(i)||/||b|| 4.530204090859e-04
Total FLOPS: 7.159714e+08
0.13043831538
Total FLOPS: 7.254224e+08
0.129247453489
Total FLOPS: 7.336663e+08
0.140364615246
Total FLOPS: 7.246571e+08
0.133656827785
norm = 0.000001
norm = 0.000001
norm = 0.000001
norm = 0.000001
************************************************************************************************************************
*** WIDEN YOUR WINDOW TO 120 CHARACTERS. Use 'enscript -r -fCourier9' to print this document ***
************************************************************************************************************************
---------------------------------------------- PETSc Performance Summary: ----------------------------------------------
mixed-poisson.py on a arch-linux2-c-opt named compute-0-0.local with 4 processors, by jchang23 Thu Jul 16 08:21:04 2015
Using Petsc Development GIT revision: v3.6-175-g274dabd GIT Date: 2015-07-10 22:30:57 +0100
Max Max/Min Avg Total
Time (sec): 7.361e+00 1.00010 7.361e+00
Objects: 5.610e+02 1.02186 5.530e+02
Flops: 3.710e+08 1.00173 3.706e+08 1.483e+09
Flops/sec: 5.040e+07 1.00167 5.035e+07 2.014e+08
MPI Messages: 6.210e+02 1.34125 5.746e+02 2.298e+03
MPI Message Lengths: 2.465e+08 2.25833 2.501e+05 5.748e+08
MPI Reductions: 4.900e+02 1.00000
Flop counting convention: 1 flop = 1 real number operation of type (multiply/divide/add/subtract)
e.g., VecAXPY() for real vectors of length N --> 2N flops
and VecAXPY() for complex vectors of length N --> 8N flops
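The flop-counting convention above can be sketched as a small helper; this is an illustrative assumption-free restatement of the stated rule (one real multiply-add per entry for real scalars, eight real operations per entry for complex scalars), not part of PETSc's API:

```python
# PETSc log_summary flop convention: 1 flop = 1 real number operation.
# VecAXPY computes y <- alpha*x + y: one multiply and one add per entry,
# so a real vector of length N costs 2*N flops; the complex case costs 8*N
# per the convention stated in the log header.
def vecaxpy_flops(N, complex_scalars=False):
    return 8 * N if complex_scalars else 2 * N

print(vecaxpy_flops(1000))        # 2000
print(vecaxpy_flops(1000, True))  # 8000
```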
Summary of Stages: ----- Time ------ ----- Flops ----- --- Messages --- -- Message Lengths -- -- Reductions --
Avg %Total Avg %Total counts %Total Avg %Total counts %Total
0: Main Stage: 5.0010e+00 67.9% 0.0000e+00 0.0% 1.500e+03 65.3% 2.363e+05 94.5% 1.370e+02 28.0%
1: selfp: 2.3601e+00 32.1% 1.4826e+09 100.0% 7.980e+02 34.7% 1.382e+04 5.5% 3.520e+02 71.8%
------------------------------------------------------------------------------------------------------------------------
See the 'Profiling' chapter of the users' manual for details on interpreting output.
Phase summary info:
Count: number of times phase was executed
Time and Flops: Max - maximum over all processors
Ratio - ratio of maximum to minimum over all processors
Mess: number of messages sent
Avg. len: average message length (bytes)
Reduct: number of global reductions
Global: entire computation
Stage: stages of a computation. Set stages with PetscLogStagePush() and PetscLogStagePop().
%T - percent time in this phase %F - percent flops in this phase
%M - percent messages in this phase %L - percent message lengths in this phase
%R - percent reductions in this phase
Total Mflop/s: 1e-6 * (sum of flops over all processors)/(max time over all processors)
------------------------------------------------------------------------------------------------------------------------
Event Count Time (sec) Flops --- Global --- --- Stage --- Total
Max Ratio Max Ratio Max Ratio Mess Avg len Reduct %T %F %M %L %R %T %F %M %L %R Mflop/s
------------------------------------------------------------------------------------------------------------------------
--- Event Stage 0: Main Stage
VecSet 4 1.0 6.1989e-06 1.2 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterBegin 6 1.0 1.3058e-03 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecScatterEnd 6 1.0 3.0994e-06 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyBegin 4 1.0 5.0068e-06 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAssemblyEnd 4 1.0 2.0530e-02 1.2 0.00e+00 0.0 4.0e+01 5.7e+02 3.2e+01 0 0 2 0 7 0 0 3 0 23 0
Mesh Partition 2 1.0 8.0247e-01 1.1 0.00e+00 0.0 3.2e+02 1.3e+05 8.0e+00 11 0 14 7 2 16 0 22 8 6 0
Mesh Migration 2 1.0 6.8136e-01 1.0 0.00e+00 0.0 9.8e+02 4.5e+05 1.8e+01 9 0 43 77 4 14 0 65 82 13 0
DMPlexInterp 1 1.0 7.5920e-01 83797.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 4 0 0 0 0 0
DMPlexDistribute 1 1.0 1.1350e+00 1.1 0.00e+00 0.0 3.4e+02 8.1e+05 5.0e+00 15 0 15 48 1 22 0 23 51 4 0
DMPlexDistCones 2 1.0 1.2553e-01 1.0 0.00e+00 0.0 1.4e+02 8.4e+05 0.0e+00 2 0 6 21 0 2 0 10 22 0 0
DMPlexDistLabels 2 1.0 4.0275e-01 1.0 0.00e+00 0.0 6.2e+02 4.3e+05 0.0e+00 5 0 27 46 0 8 0 41 49 0 0
DMPlexDistribOL 1 1.0 3.5835e-01 1.0 0.00e+00 0.0 9.9e+02 2.5e+05 2.1e+01 5 0 43 43 4 7 0 66 46 15 0
DMPlexDistField 3 1.0 2.9933e-02 1.1 0.00e+00 0.0 1.8e+02 1.8e+05 6.0e+00 0 0 8 6 1 1 0 12 6 4 0
DMPlexDistData 2 1.0 3.9284e-01 22.0 0.00e+00 0.0 1.9e+02 9.6e+04 0.0e+00 4 0 8 3 0 6 0 13 3 0 0
DMPlexStratify 5 1.2 3.0168e-01 4.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 3 0 0 0 0 0
SFSetGraph 51 1.0 1.7229e-01 1.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 3 0 0 0 0 0
SFBcastBegin 94 1.0 4.3039e-01 3.0 0.00e+00 0.0 1.4e+03 3.7e+05 0.0e+00 5 0 60 89 0 7 0 92 94 0 0
SFBcastEnd 94 1.0 2.0336e-01 3.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 2 0 0 0 0 3 0 0 0 0 0
SFReduceBegin 4 1.0 2.4109e-03 2.7 0.00e+00 0.0 4.2e+01 2.9e+05 0.0e+00 0 0 2 2 0 0 0 3 2 0 0
SFReduceEnd 4 1.0 4.2181e-03 2.1 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFFetchOpBegin 1 1.0 8.1062e-06 3.8 0.00e+00 0.0 5.0e+00 1.9e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
SFFetchOpEnd 1 1.0 1.5402e-04 10.3 0.00e+00 0.0 5.0e+00 1.9e+03 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
--- Event Stage 1: selfp
VecMDot 13 1.0 2.0536e-02 1.1 5.70e+07 1.0 0.0e+00 0.0e+00 1.3e+01 0 15 0 0 3 1 15 0 0 4 11087
VecNorm 29 1.0 1.0441e-02 1.8 1.82e+07 1.0 0.0e+00 0.0e+00 2.9e+01 0 5 0 0 6 0 5 0 0 8 6949
VecScale 28 1.0 2.8911e-03 1.0 7.02e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 2 0 0 0 0 2 0 0 0 9695
VecCopy 18 1.0 6.7000e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecSet 129 1.0 1.5137e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
VecAXPY 15 1.0 3.6442e-03 1.1 9.39e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 3 0 0 0 0 3 0 0 0 10298
VecAYPX 14 1.0 4.4365e-03 1.1 4.38e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 3948
VecMAXPY 27 1.0 4.0347e-02 1.0 1.22e+08 1.0 0.0e+00 0.0e+00 0.0e+00 1 33 0 0 0 2 33 0 0 0 12092
VecScatterBegin 198 1.0 1.3824e-02 1.1 0.00e+00 0.0 6.1e+02 2.2e+03 0.0e+00 0 0 27 0 0 1 0 76 4 0 0
VecScatterEnd 198 1.0 7.1239e-04 1.6 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
VecNormalize 14 1.0 6.6764e-03 1.5 1.31e+07 1.0 0.0e+00 0.0e+00 1.4e+01 0 4 0 0 3 0 4 0 0 4 7870
MatMult 41 1.0 1.0776e-01 1.1 1.23e+08 1.0 6.1e+02 2.2e+03 2.2e+02 1 33 27 0 44 4 33 76 4 61 4557
MatMultAdd 108 1.0 8.7452e-02 1.1 1.11e+08 1.0 5.4e+02 2.3e+03 0.0e+00 1 30 23 0 0 4 30 68 4 0 5095
MatSolve 14 1.0 2.3307e-02 1.0 2.36e+07 1.0 0.0e+00 0.0e+00 0.0e+00 0 6 0 0 0 1 6 0 0 0 4049
MatLUFactorNum 1 1.0 8.3220e-03 1.1 2.27e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 0 1 0 0 0 1078
MatILUFactorSym 1 1.0 4.7340e-03 1.3 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatConvert 2 1.0 1.0427e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatScale 2 1.0 1.4601e-03 1.0 1.50e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 4106
MatAssemblyBegin 8 1.0 4.3786e-03 39.3 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0
MatAssemblyEnd 8 1.0 2.0906e-02 1.0 0.00e+00 0.0 4.0e+01 4.8e+02 1.6e+01 0 0 2 0 3 1 0 5 0 5 0
MatGetRow 250000 1.0 2.3014e-02 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 1 0 0 0 0 0
MatGetRowIJ 3 1.0 0.0000e+00 0.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetSubMatrix 4 1.0 8.7552e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 6.0e+00 0 0 0 0 1 0 0 0 0 2 0
MatGetOrdering 1 1.0 5.6911e-04 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatZeroEntries 1 1.0 1.8420e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatAXPY 1 1.0 7.5296e-02 1.0 0.00e+00 0.0 2.0e+01 4.8e+02 1.2e+01 1 0 1 0 2 3 0 3 0 3 0
MatMatMult 1 1.0 5.0722e-02 1.0 2.75e+06 1.0 4.0e+01 1.5e+03 1.6e+01 1 1 2 0 3 2 1 5 0 5 217
MatMatMultSym 1 1.0 4.1227e-02 1.0 0.00e+00 0.0 3.5e+01 1.2e+03 1.4e+01 1 0 2 0 3 2 0 4 0 4 0
MatMatMultNum 1 1.0 9.5499e-03 1.0 2.75e+06 1.0 5.0e+00 3.8e+03 2.0e+00 0 1 0 0 0 0 1 1 0 1 1151
MatGetLocalMat 2 1.0 8.4138e-03 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
MatGetBrAoCol 2 1.0 3.0017e-04 1.7 0.00e+00 0.0 2.0e+01 2.6e+03 0.0e+00 0 0 1 0 0 0 0 3 0 0 0
PCSetUp 4 1.0 9.3147e-01 1.0 6.52e+06 1.0 7.6e+01 1.3e+05 6.6e+01 13 2 3 2 13 39 2 10 32 19 28
PCSetUpOnBlocks 14 1.0 1.3659e-02 1.1 2.27e+06 1.0 0.0e+00 0.0e+00 0.0e+00 0 1 0 0 0 1 1 0 0 0 657
PCApply 14 1.0 1.6104e+00 1.0 3.99e+07 1.0 7.0e+01 1.9e+03 4.0e+00 22 11 3 0 1 68 11 9 0 1 99
KSPGMRESOrthog 13 1.0 3.8282e-02 1.0 1.14e+08 1.0 0.0e+00 0.0e+00 1.3e+01 1 31 0 0 3 2 31 0 0 4 11895
KSPSetUp 4 1.0 3.7081e-03 2.8 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
KSPSolve 1 1.0 1.9752e+00 1.0 3.70e+08 1.0 6.9e+02 1.7e+04 3.2e+02 27 100 30 2 66 84 100 86 36 92 749
SNESSolve 1 1.0 2.3453e+00 1.0 3.71e+08 1.0 7.8e+02 2.8e+04 3.4e+02 32 100 34 4 69 99 100 98 68 96 632
SNESFunctionEval 2 1.0 1.5018e-01 1.0 0.00e+00 0.0 9.6e+01 1.1e+05 1.4e+01 2 0 4 2 3 6 0 12 32 4 0
SNESJacobianEval 1 1.0 2.2324e-01 1.0 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 3 0 0 0 0 9 0 0 0 0 0
SFBcastBegin 5 1.0 3.3138e-03 24.6 0.00e+00 0.0 8.0e+01 3.8e+03 0.0e+00 0 0 3 0 0 0 0 10 1 0 0
SFBcastEnd 5 1.0 7.2956e-05 1.4 0.00e+00 0.0 0.0e+00 0.0e+00 0.0e+00 0 0 0 0 0 0 0 0 0 0 0
------------------------------------------------------------------------------------------------------------------------
Memory usage is given in bytes:
Object Type Creations Destructions Memory Descendants' Mem.
Reports information only for process 0.
--- Event Stage 0: Main Stage
Container 6 3 1680 0
Viewer 1 0 0 0
Index Set 91 87 26252316 0
IS L to G Mapping 7 3 14763160 0
Section 66 49 32144 0
Vector 26 49 85656568 0
Vector Scatter 8 7 7509320 0
Matrix 13 5 30224220 0
Preconditioner 1 5 4984 0
Krylov Solver 1 5 23296 0
SNES 1 1 1324 0
SNESLineSearch 1 1 856 0
DMSNES 1 0 0 0
Distributed Mesh 13 7 32792 0
GraphPartitioner 5 4 2384 0
Star Forest Bipartite Graph 72 61 50176 0
Discrete System 13 7 5880 0
--- Event Stage 1: selfp
Index Set 19 16 16200 0
Vector 183 145 78752808 0
Vector Scatter 9 2 2128 0
Matrix 13 8 28752528 0
Preconditioner 5 1 872 0
Krylov Solver 5 1 1296 0
DMKSP interface 1 0 0 0
========================================================================================================================
Average time to get PetscTime(): 1.19209e-07
Average time for MPI_Barrier(): 8.10623e-07
Average time for zero size MPI_Send(): 2.02656e-06
#PETSc Option Table entries:
-log_summary
-selfp_fieldsplit_0_ksp_type preonly
-selfp_fieldsplit_0_pc_type bjacobi
-selfp_fieldsplit_0_sub_pc_type ilu
-selfp_fieldsplit_1_ksp_type preonly
-selfp_fieldsplit_1_pc_type hypre
-selfp_ksp_monitor_true_residual
-selfp_ksp_rtol 1e-07
-selfp_ksp_type gmres
-selfp_pc_fieldsplit_schur_fact_type upper
-selfp_pc_fieldsplit_schur_precondition selfp
-selfp_pc_fieldsplit_type schur
-selfp_pc_type fieldsplit
-selfp_snes_type ksponly
#End of PETSc Option Table entries
Compiled without FORTRAN kernels
Compiled with full precision matrices (default)
sizeof(short) 2 sizeof(int) 4 sizeof(long) 8 sizeof(void*) 8 sizeof(PetscScalar) 8 sizeof(PetscInt) 4
Configure options: --download-chaco --download-ctetgen --download-exodusii=1 --download-fblaslapack --download-hdf5 --download-hypre=1 --download-metis --download-netcdf=1 --download-parmetis --download-triangle --with-cc=mpicc --with-cmake=cmake --with-cxx=mpicxx --with-debugging=0 --with-fc=mpif90 --with-mpiexec=mpiexec --with-shared-libraries=1 --with-valgrind=1 CFLAGS= COPTFLAGS=-O3 CXXFLAGS= CXXOPTFLAGS=-O3 FOPTFLAGS=-O3 PETSC_ARCH=arch-linux2-c-opt
-----------------------------------------
Libraries compiled on Mon Jul 13 02:29:52 2015 on opuntia.cacds.uh.edu
Machine characteristics: Linux-2.6.32-504.1.3.el6.x86_64-x86_64-with-redhat-6.6-Santiago
Using PETSc directory: /home/jchang23/petsc-dev
Using PETSc arch: arch-linux2-c-opt
-----------------------------------------
Using C compiler: mpicc -fPIC -O3 ${COPTFLAGS} ${CFLAGS}
Using Fortran compiler: mpif90 -fPIC -O3 ${FOPTFLAGS} ${FFLAGS}
-----------------------------------------
Using include paths: -I/home/jchang23/petsc-dev/arch-linux2-c-opt/include -I/home/jchang23/petsc-dev/include -I/home/jchang23/petsc-dev/include -I/home/jchang23/petsc-dev/arch-linux2-c-opt/include -I/share/apps/intel/impi/5.0.2.044/intel64/include
-----------------------------------------
Using C linker: mpicc
Using Fortran linker: mpif90
Using libraries: -Wl,-rpath,/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -L/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -lpetsc -Wl,-rpath,/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -L/home/jchang23/petsc-dev/arch-linux2-c-opt/lib -lHYPRE -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -L/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -Wl,-rpath,/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -L/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib -L/share/apps/gcc-4.9.2/lib -lmpicxx -Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib/release_mt -Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib -lflapack -lfblas -lparmetis -lmetis -lchaco -lexoIIv2for -lexodus -lnetcdf -lhdf5hl_fortran -lhdf5_fortran -lhdf5_hl -lhdf5 -ltriangle -lX11 -lctetgen -lssl -lcrypto -lifport -lifcore -lm -lmpicxx -ldl -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -lmpifort -lmpi -lmpigi -lrt -lpthread -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt 
-Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -L/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -L/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib -L/share/apps/gcc-4.9.2/lib -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib/release_mt 
-Wl,-rpath,/opt/intel/mpi-rt/5.0/intel64/lib -limf -lsvml -lirng -lipgo -ldecimal -lcilkrts -lstdc++ -lgcc_s -lirc -lirc_s -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -L/share/apps/intel/impi/5.0.2.044/intel64/lib/release_mt -Wl,-rpath,/share/apps/intel/impi/5.0.2.044/intel64/lib -L/share/apps/intel/impi/5.0.2.044/intel64/lib -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -L/share/apps/intel/composer_xe_2015.1.133/tbb/lib/intel64/gcc4.4.7 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -L/share/apps/gcc-4.9.2/lib/gcc/x86_64-unknown-linux-gnu/4.9.2 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib64 -L/share/apps/gcc-4.9.2/lib64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/ipp/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/compiler/lib/intel64 -Wl,-rpath,/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -L/share/apps/intel/composer_xe_2015.1.133/mkl/lib/intel64 -Wl,-rpath,/share/apps/gcc-4.9.2/lib 
-L/share/apps/gcc-4.9.2/lib -ldl
-----------------------------------------
-------------- next part --------------
A non-text attachment was scrubbed...
Name: mixed-poisson.py
Type: text/x-python
Size: 6688 bytes
Desc: not available
URL: <http://mailman.ic.ac.uk/pipermail/firedrake/attachments/20150716/37cc2c98/attachment.py>