[klee-dev] Dealing with missing intrinsics (Rust support)

Alastair Reid adreid at google.com
Mon Aug 17 10:25:20 BST 2020


Thanks Cristian and Martin,

The pointer to lib/Core/Executor.cpp was exactly what I needed to finish
implementing this and I've submitted a PR.

And, in case anybody else out there is trying to use KLEE with Rust at the
moment, here is a list of the PRs that I am currently merging into master
in my work.
These are sufficient that I can verify properties of programs that use Rust
data structures (Vec, BTreeMap, Option, enums, traits, etc.).

https://github.com/klee/klee/pull/1295 More robust handling of unknown
intrinsics
https://github.com/klee/klee/pull/1290 Stub definition of
__cxa_thread_atexit_impl
https://github.com/klee/klee/pull/1271 [WIP] Handle global variables

(The last one is a WIP change by Martin that he mentioned here
https://github.com/klee/klee/issues/1263#issuecomment-643254348)

--
Alastair

On Fri, Aug 14, 2020 at 6:21 PM Cristian Cadar <c.cadar at imperial.ac.uk>
wrote:

> +1 to this, we should only terminate those paths that hit unhandled
> intrinsics.  This is what we do with unavailable external functions and
> inline assembly for instance, so we should adopt the same behaviour here.
>
> Best,
> Cristian
>
> On 14/08/2020 18:06, Nowack, Martin wrote:
> > Hi Alastair,
> >
> > I would also rather use option 3.
> > Normally, the `LowerIntrinsic` pass should try to replace appropriate
> > calls if possible. It should not terminate it.
> >
> > Beside your attached code, I would suggest to change the following code
> > line in `lib/Core/Executor.cpp`:
> >
> https://github.com/klee/klee/blob/088fc21e12c9675161172899be5ef5051f1ae96b/lib/Core/Executor.cpp#L1514
> >
> > Instead of an `klee_error` which terminates the whole KLEE execution,
> > one could terminate the current path.
> > This allows to explore other paths that don’t contain such instructions
> > or at least explore a path until it reaches such instruction, which I
> > guess might be valuable in itself.
> >
> >
> > As you mentioned inline assembly, there is a currently a pass which
> > tries to lift mostly memory barriers into LLVM intrinsics that are
> > handles as NOPs. More complex things are hard to handle.
> > (
> https://github.com/klee/klee/blob/088fc21e12c9675161172899be5ef5051f1ae96b/lib/Module/RaiseAsm.cpp
> )
> >
> > Hope that helps.
> >
> > Best,
> > Martin
> >
> >> On 14. Aug 2020, at 17:33, Alastair Reid <adreid at google.com
> >> <mailto:adreid at google.com>> wrote:
> >>
> >> Hi,
> >> <mailto:klee-dev at imperial.ac.uk>
> >>
> >> I want to check what the best way is to deal with missing intrinsics
> >> when adapting KLEE for a new language and new standard libraries.
> >> In particular, what kind of changes should I make and submit as PRs?
> >>
> >> I am trying to use KLEE with a modest sized Rust application (about
> >> 20kloc) but that depends on over 100 other libraries ("crates") and
> >> Rust's fairly rich standard library.
> >> Given the amount of code involved, it is unsurprising that the
> >> resulting LLVM bitcode files use some intrinsics that KLEE does not
> >> support.
> >> I don't know whether I have a complete list but at least the following
> >> are needed:
> >>
> >>     maxnum               minnum
> >>     x86_avx2_pshuf_b     x86_ssse3_pshuf_b_128
> >>     x86_avx2_pslli_d     x86_avx2_psrli_d
> >>     x86_avx_ldu_dq_256   x86_sse3_ldu_dq
> >>     x86_sse2_pslli_d     x86_sse2_psrli_d
> >>     x86_sse2_pause
> >>     x86_avx_vzeroupper
> >>
> >> As far as I can tell, KLEE checks that it accepts all intrinsics
> >> before starting symbolic execution.
> >> So it can reject a program even if there is no feasible path these
> >> intrinsics just because they are in the bitcode file.
> >> I have seen this happen in my current trials (on short test programs
> >> that only use the standard library).
> >>
> >>
> >> I see three solutions - with various pros/cons
> >>
> >> 1) Implement all missing intrinsics.
> >>     This seems like a good option for target independent intrinsics
> >> like maxnum (IEEE float min/max)
> >>     And it might be necessary for x86_sse2_pause because LLVM will
> >> emit this instruction even if you disable SSE2 (this instruction is a
> >> NOP on old x86 cores - it makes spinlocks faster on modern cores).
> >>
> >>     The disadvantage is that implementing all LLVM intrinsics is a
> >> Sisyphean task.
> >>     (I have not even explored what intrinsics are required on Arm -
> >> which is a much more
> >>     widespread ISA than x86!)
> >>
> >> 2) Compile the libraries with AVX and SSE features disabled.
> >>     This should fix most of the intrinsics (except for
> >> maxnum/minnum/x86_sse2_pause).
> >>
> >>     The disadvantage of messing with compilation options is that this
> >> becomes quite hard
> >>      to do in large, complex projects.
> >>
> >> 3) Change KLEE to ignore any missing intrinsics unless it actually
> >> executes them.
> >>     This would make option 1 less painful because I would only need to
> >> implement
> >>     intrinsics that are truly needed.
> >>
> >>     I have a change that would do this - but I don't think it is ready
> >> to submit as a PR yet.
> >>     (diff attached at end).
> >>
> >>     (Using this change, I got a bit further with the application - I
> >> was able to start running
> >>     the code in KLEE but it then got stuck because one of the
> >> libraries contains a mix of
> >>     Rust, C and assembly code. But that's a problem for next week...)
> >>
> >> I am currently leaning towards option 3 but I want to check which is
> >> preferred and there are a few possible variants (sketched at the end)
> >>
> >> --
> >> Alastair Reid
> >>
> >> ps Here is the change. It ignores all intrinsics on the above list
> >> during the Cleaner pass (but still reports an error if it hits them
> >> during execution).
> >>
> >> The main goal of this code is to avoid calling
> >> "IL->LowerIntrinsicCall(ii);" in the default case because that LLVM
> >> function fails an assertion if it cannot lower the call.
> >> (The function is defined in LLVM in
> >> llvm/lib/CodeGen/IntrinsicLowering.cpp)
> >>
> >> Some things that I am unsure about are:
> >> - I am a bit uneasy about hardcoding the list of intrinsics to ignore.
> >> - Is there a need for a command line option to enable/disable ignoring
> >> unknown intrinsics.
> >>
> >> diff --git a/lib/Module/IntrinsicCleaner.cpp
> >> b/lib/Module/IntrinsicCleaner.cpp
> >> index a1d4fdda..e88d5300 100644
> >> --- a/lib/Module/IntrinsicCleaner.cpp
> >> +++ b/lib/Module/IntrinsicCleaner.cpp
> >> @@ -10,6 +10,7 @@
> >>  #include "Passes.h"
> >>
> >>  #include "klee/Config/Version.h"
> >> +#include "klee/Support/ErrorHandling.h"
> >>  #include "llvm/Analysis/MemoryBuiltins.h"
> >>  #include "llvm/Analysis/ConstantFolding.h"
> >>  #include "llvm/IR/Constants.h"
> >> @@ -20,6 +21,7 @@
> >>  #include "llvm/IR/Instruction.h"
> >>  #include "llvm/IR/Instructions.h"
> >>  #include "llvm/IR/IntrinsicInst.h"
> >> +#include "llvm/IR/IntrinsicsX86.h"
> >>  #include "llvm/IR/Module.h"
> >>  #include "llvm/IR/Type.h"
> >>  #include "llvm/Pass.h"
> >> @@ -340,6 +342,33 @@ bool
> >> IntrinsicCleanerPass::runOnBasicBlock(BasicBlock &b, Module &M) {
> >>          break;
> >>        }
> >>  #endif
> >> +
> >> +        // The following intrinsics are known to be missing from KLEE
> >> +        // but can be generated by Rustc and are found in libcore,
> >> libstd,
> >> +        // common Rust crates, etc.
> >> +        //
> >> +        // Adding an intrinsic to this list allows KLEE to execute
> >> programs that
> >> +        // use these intrinsics as long as the paths explored by KLEE
> >> +        // do not invoke these intrinsics.
> >> +      case Intrinsic::maxnum:
> >> +      case Intrinsic::minnum:
> >> +      case Intrinsic::x86_avx2_pshuf_b:
> >> +      case Intrinsic::x86_avx2_pslli_d:
> >> +      case Intrinsic::x86_avx2_psrli_d:
> >> +      case Intrinsic::x86_avx_ldu_dq_256:
> >> +      case Intrinsic::x86_avx_vzeroupper:
> >> +      case Intrinsic::x86_sse2_pause:
> >> +      case Intrinsic::x86_sse2_pslli_d:
> >> +      case Intrinsic::x86_sse2_psrli_d:
> >> +      case Intrinsic::x86_sse3_ldu_dq:
> >> +      case Intrinsic::x86_ssse3_pshuf_b_128:
> >> +      {
> >> +        const Function *Callee = ii->getCalledFunction();
> >> +        llvm::StringRef name = Callee->getName();
> >> +        klee_warning_once((void*)Callee, "unsupported intrinsic
> >> %.*s", (int)name.size(), name.data());
> >> +        break;
> >> +      }
> >> +
> >>        default:
> >>          IL->LowerIntrinsicCall(ii);
> >>          dirty = true;
> >>
> >>
> >>
> >>
> >> pps Some alternatives to having a hardcoded list of intrinsics to
> >> ignore are
> >>
> >> 1) to change the last three lines of the above (i.e., the "default
> >> case") to only invoke
> >>    LowerIntrinsicCall if the intrinsic is one of the intrinsics that
> >> it will not barf on.
> >>    (LowerIntrinsicCall accepts 44 intrinsics. They are all target
> >> independent.)
> >>
> >> 2) to change the hardcoded list of intrinsics to ignore to
> >>
> >>     case x86_3dnow_pavgusb ... x86_xtest :
> >>
> >>     which will match against all x86 intrinsics.
> >>     (However, this also feels fragile - what if a new intrinsic is
> >> added before pavgusb or
> >>     after xtest?)
> >>
> >> _______________________________________________
> >> klee-dev mailing list
> >> klee-dev at imperial.ac.uk <mailto:klee-dev at imperial.ac.uk>
> >> https://mailman.ic.ac.uk/mailman/listinfo/klee-dev
> >
> >
> > _______________________________________________
> > klee-dev mailing list
> > klee-dev at imperial.ac.uk
> > https://mailman.ic.ac.uk/mailman/listinfo/klee-dev
> >
>
-------------- next part --------------
HTML attachment scrubbed and removed


More information about the klee-dev mailing list