[firedrake] cached kernels

Mon Nov 9 09:45:37 GMT 2015

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 08/11/15 10:06, Eike Mueller wrote:
> Hi Lawrence (copied to firedrake, since overheads from loading 
> libraries might be a general concern),
> 
> I tried it on ARCHER, and adding caching for the kernels does not
> make any difference. The LU solve performance at lowest order is
> poor, but an individual call actually takes more time (~0.01s) than
> the operator application (~0.001s), so I would have thought the
> overheads are relatively smaller for the LU solve. For the
> operator application the reported bandwidth is excellent, but for
> the LU solve it is very poor. At higher order both bandwidths are
> good; here the data volume is larger, but the time for one LU solve
> call is still ~0.01s. Maybe in this case any overhead that shows up
> at lowest order is hidden.
> 
> Could there be an overhead from loading the LAPACK library, which
> is required for the LU solve?

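This isn't how dynamic loading works.  Symbol resolution is a one-off
cost: the .so is loaded once, and the first call to the function (which
happens in the warmup phase) goes through the trampoline, resolves the
symbol, and patches the entry to point straight at the function.  Later
calls pay no lookup cost.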
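
A minimal sketch of the "resolve once, then call directly" behaviour, in
Python for brevity.  It assumes a POSIX system where the C maths library
can be found; the helper name time_call is made up for illustration.
Note that ctypes looks the symbol up with dlsym when the attribute is
first accessed, not through a lazily bound trampoline, so this only
illustrates the broader point that loading and lookup happen once and
are not repeated on every call.

```python
import ctypes
import ctypes.util
import time

# Assumption: libm is discoverable (Linux/macOS). If find_library returns
# None, CDLL(None) falls back to the symbols already loaded in the process.
libm = ctypes.CDLL(ctypes.util.find_library("m"))  # library mapped once

cos = libm.cos                    # symbol looked up once, then reused
cos.argtypes = [ctypes.c_double]
cos.restype = ctypes.c_double


def time_call(fn, x):
    """Hypothetical helper: return (result, elapsed seconds) for one call."""
    start = time.perf_counter()
    result = fn(x)
    return result, time.perf_counter() - start


# The first call plays the role of the warmup phase.
first_result, first_time = time_call(cos, 0.0)

# Steady-state calls reuse the already resolved function pointer.
later_times = [time_call(cos, 0.0)[1] for _ in range(1000)]
later_median = sorted(later_times)[len(later_times) // 2]

print(f"first call: {first_time:.2e} s")
print(f"later calls (median): {later_median:.2e} s")
```

If library loading were being paid on every call, the later timings
would look like the first one rather than settling to a much smaller
steady-state value.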

I have effectively no idea what's going on.  Does the LU solve take
this long on this much data if you just call it from C?

IOW, I think it's not "our" fault, unless somehow you're managing to
get a recompile or similar every time you call _lu_solve.
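
One way to check for the "recompile every call" case is to count how
often the build step runs and to time each call individually.  The
sketch below is an assumption-laden toy: compile_kernel, cached_kernel
and per_call_times are hypothetical names, not Firedrake or PyOP2 API,
and the "kernel" is a trivial Python function standing in for generated
code.

```python
import time

compile_count = 0  # how many times the (pretend) expensive build step ran


def compile_kernel(name):
    """Hypothetical stand-in for code generation plus compilation."""
    global compile_count
    compile_count += 1
    return lambda x: 2.0 * x


_cache = {}


def cached_kernel(name):
    """Build the kernel once per name, then hand back the cached object."""
    if name not in _cache:
        _cache[name] = compile_kernel(name)
    return _cache[name]


def per_call_times(fn, n):
    """Time n individual calls of fn, returning one duration per call."""
    times = []
    for _ in range(n):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return times


times = per_call_times(lambda: cached_kernel("lu_solve_demo")(1.0), 100)
rest = sorted(times[1:])

print("compiles:", compile_count)
print(f"first call: {times[0]:.2e} s, median of rest: {rest[len(rest) // 2]:.2e} s")
```

With working caching the build counter stays at one however many calls
are made.  A counter that grows with the number of calls, or per-call
times that never drop below the first one, would point at a recompile
(or similar) on every call.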

Lawrence
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)

iQEcBAEBAgAGBQJWQGtBAAoJECOc1kQ8PEYv+PUH/0o9la78TbSn7UTWe9anzMwC
o4GkJ0lfbwvmZ6PWI+fPzrsH4lnR1AOiWSvG/BBNIW4SQvMhx50otImyeQePZ+9s
7uZqOcKdyvsRncFDSpdlND5eDO4+o9QVfINrmw4W9eXe9WsIUPHAWNsINkvyqnfX
GlW8dRynKoIPqs7ZR3DfNHUF0RRtbY3z4Zo/jjeDzGXnvdXVagmhLRG17UQ2WB8H
p8qSFBTNgnSKS1kKvUNlaR0cL2agTuoPSAY6ITnb7hJzBxSGXrWNcj8dFuune6hi
wWkSxS5Y2Lgio+X/Jw36zMUdBTXLzwWSfjBhiYpHgch9zGXAskNSMdZM8DbeA3I=
=7KZc
-----END PGP SIGNATURE-----