[firedrake] PP14 slides

Fri Feb 14 10:28:37 GMT 2014

Hi Paul,

Comments inline below.  Updated slides attached.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: SIAM-PP-2014-02-22.pdf
Type: application/pdf
Size: 463201 bytes
Desc: not available
URL: <http://mailman.ic.ac.uk/pipermail/firedrake/attachments/20140214/e0b441b9/attachment.pdf>
-------------- next part --------------

On 13 Feb 2014, at 20:36, Kelly, Paul H J <p.kelly at imperial.ac.uk> wrote:

> Hi Lawrence
> 
> slide 7 - if you want to show that image it needs to fill the slide.
> 
> (people don't notice if you occasionally turn off the background, the top and bottom bars etc).

Thanks, yes, it looks better bigger.

> slide 8 - consider using blank lines and perhaps even comments, to clarify structure (I hope you will point to lines of code to explain what is going on - so you need the bits separated).
> I like to say that FFC "weaves" the weak form of the PDE with the function space specifications (in the "aspect oriented programming" sense of the word weave).
> This helps clarify that the tool is precise, "do what I say", it doesn't make up any details of what is to be calculated.

I'm not sold on this slide at all.  I don't really have the time to walk through what's going on in detail, so I'm sort of inclined to remove it.

> slide 11 "for each ele" but there is no ele - ah OK....

> slide 11 it's unclear whether "count" is a user-chosen identifier or a PyOP thing to say it's a counter (maybe this even applies to the other variable names).

I've added some sketch variable declarations as well.  But in some sense there's a reason this looks almost like pseudo-code!

> slide 11-12 people often ask why we can't *analyse* the kernels to determine the access descriptors - why do we impose the ugly burden of spelling them out?
> In contrast, the Liszt people require the mesh to be accessed via getters and setters, and they statically analyse the kernel to track them all down - to compute what they call the "stencil of the kernel".  
> This is a different point in the design space with different tradeoffs.  They *still* require access to the mesh to be abstract - ie via getters and setters.  We hide all details of the mesh representation from the kernel - there is no explicit dereferencing of a map/pointer in the kernel.  We say the access descriptors belong to the loop, they say the stencil belongs to the kernel.

I'm uninterested in addressing this up front.  It's a decision we've made.
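
For anyone reading this in the archive, the design point in the quoted paragraph can be sketched in plain Python. This is a toy, not PyOP2 or Liszt code: `par_loop`, `READ`, `INC`, `count_kernel` and the tiny mesh are all names invented for the illustration. It only shows the shape of the idea, assuming the loop call declares how each argument is accessed while the kernel works on already-gathered local values and never dereferences a map.

```python
# Toy illustration only: par_loop, READ and INC are hypothetical names,
# not the PyOP2 API.
READ, INC = "READ", "INC"


def par_loop(kernel, n_elements, *args):
    """Apply kernel to every element.

    Each arg is (data, element_to_entries_map, access). The loop, not the
    kernel, performs the indirection through the map.
    """
    for e in range(n_elements):
        # Gather: READ arguments get copies of the current values,
        # INC arguments get zero-initialised local accumulators.
        local = []
        for data, emap, access in args:
            if access == READ:
                local.append([data[i] for i in emap[e]])
            else:
                local.append([0.0] * len(emap[e]))
        kernel(*local)
        # Scatter: only arguments declared INC are written back.
        for (data, emap, access), vals in zip(args, local):
            if access == INC:
                for i, v in zip(emap[e], vals):
                    data[i] += v


def count_kernel(count, weight):
    # The kernel sees only local data; there is no map or mesh access here.
    for k in range(len(count)):
        count[k] += weight[0]


# Two edges sharing the middle vertex of a 3-vertex line mesh.
edge_to_vertex = [[0, 1], [1, 2]]
edge_to_self = [[0], [1]]
weights = [1.0, 2.0]          # one value per edge
count = [0.0, 0.0, 0.0]       # one value per vertex

par_loop(count_kernel, 2,
         (count, edge_to_vertex, INC),
         (weights, edge_to_self, READ))
print(count)
```

Because the access mode is stated at the call site, the loop knows which arguments to write back without inspecting the kernel body. A static-analysis design would instead have to recover the same information from the kernel's getter and setter calls.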

> Slide 16 - consider merging this with slide 15 so you can point to the picture.

I've reproduced the picture on slide 16 as well as 15.

> Slide 21 "stream bandwidth" - you need words on the slide to clarify that you mean the well-known STREAM benchmark?

I've capitalised and will say things.  I'm unwilling to spell everything out on slides, I'm not writing a paper.

> Slide 22 - how many layers in this experiment?
> (you have 8 threads but show perf only up to 4?)

1 layer (clarified).  I don't think 8 threads really adds anything at this point.

> Slide 23 - the L3 cache bandwidth at 4 threads appears to match the STREAM bandwidth (is this for a well-ordered mesh?)

1. It doesn't.  2. Meaningless I think.

> Slide 25 - *valuable bandwidth* reaches 82% of STREAM.

The figure legend indicates this is valuable bandwidth, and I will be saying things.

> Is the title right, given the actual content of the talk?

That is the talk title we're advertised with.  I'm happy to consider alternative options.
