[firedrake] Hardware counters for Firedrake

Justin Chang jychang48 at gmail.com
Thu Jul 16 15:09:58 BST 2015


Lawrence

1) Okay that makes sense. Don't know why I didn't see this earlier.

2) Okay, thanks.

3) The only reason I wanted that function was to get &real_time. Is there
a more efficient way (or a more "Firedrake" way) of getting this metric?
In my case, I only want the time spent in SNESSolve().
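[One way to isolate just the solve time, for what it's worth: a minimal wall-clock sketch standing in for PAPI's real_time. The `timed_solve` helper is hypothetical, not a Firedrake API; wrapping only `solver.solve()` excludes mesh setup and DMPlex initialization by construction.]

```python
import time

def timed_solve(solver):
    """Time a single solver.solve() call with plain wall-clock timing.

    A simple stand-in for PAPI's real_time: only the call we care about
    is inside the timed region, so setup costs are excluded.
    """
    t0 = time.perf_counter()
    result = solver.solve()
    elapsed = time.perf_counter() - t0
    return result, elapsed
```

[Alternatively, the -log_summary output you already attach reports the time PETSc itself attributes to the SNESSolve event, with no extra instrumentation in the script.]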

Thanks,
Justin

On Thu, Jul 16, 2015 at 8:47 AM, Lawrence Mitchell <
lawrence.mitchell at imperial.ac.uk> wrote:

>
> On 16/07/15 14:32, Justin Chang wrote:
> > Lawrence,
> >
> > I have attached the code I am working with. It's basically the one
> > you sent me a few weeks ago, but I am only working with selfp.
> > Attached are the log files with 1, 2, and 4 processors on our local
> > HPC machine (Intel Xeon E5-2680v2 2.8 GHz)
> >
> > 1) I wrapped the PyPAPI calls around solver.solve(). I guess this
> > is doing what I want. Right now I am estimating the arithmetic
> > intensity from the measured FLOPS, loads, and stores. When I
> > compare the measured FLOPS with the PETSc manual FLOP count,
> > PAPI seems to overcount by a factor of 2 (which I suppose is
> > to be expected on a newer Intel machine). In terms of computing
> > the FLOPS and AI this is what I want; I just wanted to make sure
> > these counts don't include the DMPlex initialization and so on,
> > because:
>
> So note that inside SNESSolve, PETSc attributes zero flops to forming
> the residual and Jacobian (since those are user functions that it
> knows nothing about).  We could actually do a reasonable job of adding
> a PetscLogFlops call, since we can inspect the kernel and make a
> reasonable estimate of the number of flops it performs, but we don't
> currently do that.
>
> This may explain the difference in flop counts.
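[As a rough illustration of the arithmetic-intensity bookkeeping being discussed: a sketch only. The 8-bytes-per-access figure assumes every counted load/store moves one double, and ignores vector width and cache effects; the counter names in the docstring are the standard PAPI presets, but which ones your machine supports is hardware-dependent.]

```python
def arithmetic_intensity(fp_ops, loads, stores, bytes_per_access=8.0):
    """Estimate arithmetic intensity as useful flops / bytes moved.

    fp_ops, loads and stores would typically come from PAPI counters
    such as PAPI_FP_OPS, PAPI_LD_INS and PAPI_SR_INS.  The 8-byte
    figure assumes double-precision accesses (an assumption, not a
    measurement).
    """
    bytes_moved = (loads + stores) * bytes_per_access
    return fp_ops / bytes_moved
```

[If PAPI really is overcounting flops by 2x relative to PETSc's manual count, that factor propagates straight into the AI estimate, so it is worth deciding which count to trust before comparing against a roofline.]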
>
> > 2) According to the attached log summaries, it seems
> > DMPlexDistribute and MeshMigration still consume a significant
> > portion of the time. By significant I mean that the %T doesn't
> > decrease as I increase the number of processors. I remember
> > Michael Lange's presentations (from PETSc-20 and the webinar)
> > mentioning something about this?
>
> Yes, for more details on what scales and doesn't, see this paper:
>
> http://arxiv.org/abs/1506.06194
>
>
> > 3) Bonus question: how do I also use PAPI_flops(&real_time,
> > &proc_time, &flpins, &mflops)? I see there's the flops() function,
> > but in my limited PAPI experience, I seem to have issues whenever I
> > try to put both that and PAPI_start_counters into the same program,
> > but I could be wrong.
>
> I'm by no means a PAPI expert, but can you not just obtain the result
> of PAPI_flops by measuring the PAPI_FP_INS counter?
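[A minimal sketch of reading PAPI_FP_INS directly, along the lines Lawrence suggests. The module and function names follow the python-papi (`pypapi`) package's high-level API and are an assumption about your particular binding; the no-PAPI fallback path is purely for illustration.]

```python
# Assumed binding: the python-papi package's high-level interface.
try:
    from pypapi import papi_high
    from pypapi import events as papi_events
    HAVE_PAPI = True
except ImportError:  # PAPI not installed: degrade to a no-op stub
    HAVE_PAPI = False

def count_fp_ins(fn):
    """Run fn() and return (result, PAPI_FP_INS count or None).

    Using only start_counters/stop_counters avoids mixing the
    high-level flops() helper with explicit counters in one program,
    which is where the reported conflicts tend to arise.
    """
    if not HAVE_PAPI:
        return fn(), None
    papi_high.start_counters([papi_events.PAPI_FP_INS])
    result = fn()
    (fp_ins,) = papi_high.stop_counters()
    return result, fp_ins
```

[Whether PAPI_FP_INS or PAPI_FP_OPS is available, and how the two relate, depends on the CPU; on some Intel parts only one of them is supported.]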
>
> Cheers,
>
> Lawrence
>
> _______________________________________________
> firedrake mailing list
> firedrake at imperial.ac.uk
> https://mailman.ic.ac.uk/mailman/listinfo/firedrake
>

