[firedrake] More detailed breakdown of PETSc timings / higher order geometric MG
Eike Mueller
e.mueller at bath.ac.uk
Mon Mar 30 20:19:43 BST 2015
Dear all,
apologies for posting from the wrong account again...
Having talked to Lawrence last week, I have now resolved this. The reason the velocity mass solve was so expensive is that I used the solve() interface instead of a LinearSolver object. Using solve() resulted in recomputing the block-ILU factorisation in every mass-matrix solve, which is clearly a bad idea.
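For anyone who runs into the same issue, the pattern now looks roughly like this (mesh, function space and solver options below are purely illustrative):

from firedrake import *

mesh = UnitSquareMesh(16, 16)
V = FunctionSpace(mesh, "RT", 1)          # placeholder velocity space
u, v = TrialFunction(V), TestFunction(V)
a = inner(u, v)*dx                        # velocity mass matrix
L = inner(Constant((1.0, 0.0)), v)*dx     # some right-hand side

A = assemble(a)                           # assemble the operator once
b = assemble(L)
w = Function(V)

# The block-ILU factorisation is built once and cached inside the solver object...
mass_solver = LinearSolver(A, solver_parameters={'ksp_type': 'cg',
                                                 'pc_type': 'bjacobi',
                                                 'sub_pc_type': 'ilu'})

# ...so repeated mass-matrix solves are cheap:
for k in range(10):
    mass_solver.solve(w, b)

# Calling solve(a == L, w, solver_parameters=...) inside the loop instead
# rebuilds the factorisation on every call.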
I'm now also able to measure the time spent in different parts of the PETSc solver by extracting the KSP/PC objects manually and measuring those.
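In case it is useful to anyone else, this is roughly the pattern I use ('solver' stands for the Firedrake LinearSolver of the full mixed system with a Schur fieldsplit preconditioner, so it is a placeholder here; the split ordering and the test vector are also only illustrative):

import time
from firedrake.petsc import PETSc

# 'solver' is the LinearSolver for the mixed system (placeholder). The
# fieldsplit sub-KSPs only exist after the first solve (or after setUp()).
ksp = solver.ksp                          # underlying PETSc KSP
pc = ksp.getPC()
ksp_u, ksp_p = pc.getFieldSplitSubKSP()   # 0: velocity block, 1: Schur complement

# Time a standalone application of the velocity-block solver on a test vector.
A_u, _ = ksp_u.getOperators()
x_u, b_u = A_u.createVecs()
b_u.setRandom()
t0 = time.time()
ksp_u.solve(b_u, x_u)
PETSc.Sys.Print("velocity mass solve: %.3fs in %d iterations"
                % (time.time() - t0, ksp_u.getIterationNumber()))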
I'm now looking at the timings in more detail to understand why the matrix-free solver is still slower than expected.
Thanks,
Eike
> On 17 Mar 2015, at 09:02, Eike Mueller <eike.h.mueller at googlemail.com> wrote:
>
> Hi Lawrence (cc firedrake),
>
> having talked to Rob yesterday, I'm also looking at the performance of the current (non-hybridised) solver at higher order again. As I said, the main bottleneck that makes the geometric multigrid more expensive is the high cost of the velocity mass-matrix solve, which I have to do in the Schur-complement forward and backward substitution (i.e. when applying the triangular factors of the full Schur factorisation), and also in the pressure solve (since I don't use the diagonal-only form, i.e. I run with 'pc_fieldsplit_schur_fact_type': 'FULL'). However, the PETSc solver has to invert the velocity mass matrix as well, so it should be hit by the same costs. Do you know how I can extract the time for this to make a proper comparison? If I run with PETSC_OPTIONS=-log_summary I get a breakdown of the PETSc timings, but the only relevant entry I can see is the aggregate PCApply time.
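>
> For reference, the solver options I'm running with look roughly like the following (the inner solver choices here are illustrative; only the Schur factorisation type is exactly the one quoted above, and the split prefixes assume the default field names 0 and 1):
>
> solver_parameters = {
>     'ksp_type': 'gmres',
>     'pc_type': 'fieldsplit',
>     'pc_fieldsplit_type': 'schur',
>     'pc_fieldsplit_schur_fact_type': 'FULL',
>     # velocity block (field 0): mass matrix with block-ILU
>     'fieldsplit_0_ksp_type': 'cg',
>     'fieldsplit_0_pc_type': 'bjacobi',
>     'fieldsplit_0_sub_pc_type': 'ilu',
>     # pressure block (field 1): approximate Schur complement
>     'fieldsplit_1_ksp_type': 'cg',
>     'fieldsplit_1_pc_type': 'gamg',
> }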
>
> I guess what I'm saying is that I'm now unsure what PCApply actually measures. If the PETSc solver does 11 GMRES iterations, it reports 23 calls to PCApply, so my conjecture is that this counts the 11 pressure solves and the 11 mass-matrix solves, but probably not the time spent in the forward/backward substitution (as I said, I do run with the 'pc_fieldsplit_schur_fact_type': 'FULL' option).
>
> Can I break those times down further, so that I get, for example, the time spent in the two velocity mass-matrix solves in the forward/backward substitution and the time spent solving the Schur-complement pressure system M_p + \omega^2 D^T diag(M_u)^{-1} D?
>
> Data I have currently:
> In the matrix-free solver, one velocity mass-matrix inverse costs 2.27s, and I need two per iteration just for the forward/backward substitution, i.e. 4.54s per iteration before doing anything else. On the other hand, one complete GMRES iteration of the PETSc solver (which includes everything: applying the mixed operator, solving the pressure system and inverting the velocity mass matrices) takes only 3.87s, so something is not right there.
>
> If I can get a better like-for-like comparison of the times in the PETSc and matrix-free solver it should be possible to identify the bottlenecks.
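>
> One thing I could do on my side is to register custom PETSc log events around the corresponding operations in the matrix-free solver, so that they appear as separate lines next to PCApply etc. in the -log_summary output. A rough sketch (mass_solver, pressure_solver and the Function arguments are placeholders for my own solver objects):
>
> from firedrake.petsc import PETSc
>
> mass_event = PETSc.Log.Event("VelocityMassSolve")
> schur_event = PETSc.Log.Event("PressureSchurSolve")
>
> # inside the Schur-complement forward/backward substitution:
> mass_event.begin()
> mass_solver.solve(w, b)        # matrix-free velocity mass solve
> mass_event.end()
>
> # inside the pressure solve:
> schur_event.begin()
> pressure_solver.solve(phi, r)  # pressure (Schur complement) solve
> schur_event.end()
>
> These events are only recorded when PETSc logging is active (i.e. when running with -log_summary), but then the matrix-free mass and pressure solve times can be read off directly alongside the PETSc solver's own counters.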
>
> Thanks,
>
> Eike