[klee-dev] Dealing with missing intrinsics (Rust support)

Nowack, Martin m.nowack at imperial.ac.uk
Fri Aug 14 18:06:55 BST 2020


Hi Alastair,

I would also rather use option 3.
Normally, the `LowerIntrinsic` pass should try to replace appropriate calls if possible. It should not terminate it.

Beside your attached code, I would suggest to change the following code line in `lib/Core/Executor.cpp`:
https://github.com/klee/klee/blob/088fc21e12c9675161172899be5ef5051f1ae96b/lib/Core/Executor.cpp#L1514

Instead of an `klee_error` which terminates the whole KLEE execution, one could terminate the current path.
This allows to explore other paths that don’t contain such instructions or at least explore a path until it reaches such instruction, which I guess might be valuable in itself.


As you mentioned inline assembly, there is a currently a pass which tries to lift mostly memory barriers into LLVM intrinsics that are handles as NOPs. More complex things are hard to handle.
(https://github.com/klee/klee/blob/088fc21e12c9675161172899be5ef5051f1ae96b/lib/Module/RaiseAsm.cpp)

Hope that helps.

Best,
Martin

On 14. Aug 2020, at 17:33, Alastair Reid <adreid at google.com<mailto:adreid at google.com>> wrote:

Hi,
<mailto:klee-dev at imperial.ac.uk>

I want to check what the best way is to deal with missing intrinsics when adapting KLEE for a new language and new standard libraries.
In particular, what kind of changes should I make and submit as PRs?

I am trying to use KLEE with a modest sized Rust application (about 20kloc) but that depends on over 100 other libraries ("crates") and Rust's fairly rich standard library.
Given the amount of code involved, it is unsurprising that the resulting LLVM bitcode files use some intrinsics that KLEE does not support.
I don't know whether I have a complete list but at least the following are needed:

    maxnum               minnum
    x86_avx2_pshuf_b     x86_ssse3_pshuf_b_128
    x86_avx2_pslli_d     x86_avx2_psrli_d
    x86_avx_ldu_dq_256   x86_sse3_ldu_dq
    x86_sse2_pslli_d     x86_sse2_psrli_d
    x86_sse2_pause
    x86_avx_vzeroupper

As far as I can tell, KLEE checks that it accepts all intrinsics before starting symbolic execution.
So it can reject a program even if there is no feasible path these intrinsics just because they are in the bitcode file.
I have seen this happen in my current trials (on short test programs that only use the standard library).


I see three solutions - with various pros/cons

1) Implement all missing intrinsics.
    This seems like a good option for target independent intrinsics like maxnum (IEEE float min/max)
    And it might be necessary for x86_sse2_pause because LLVM will emit this instruction even if you disable SSE2 (this instruction is a NOP on old x86 cores - it makes spinlocks faster on modern cores).

    The disadvantage is that implementing all LLVM intrinsics is a Sisyphean task.
    (I have not even explored what intrinsics are required on Arm - which is a much more
    widespread ISA than x86!)

2) Compile the libraries with AVX and SSE features disabled.
    This should fix most of the intrinsics (except for maxnum/minnum/x86_sse2_pause).

    The disadvantage of messing with compilation options is that this becomes quite hard
     to do in large, complex projects.

3) Change KLEE to ignore any missing intrinsics unless it actually executes them.
    This would make option 1 less painful because I would only need to implement
    intrinsics that are truly needed.

    I have a change that would do this - but I don't think it is ready to submit as a PR yet.
    (diff attached at end).

    (Using this change, I got a bit further with the application - I was able to start running
    the code in KLEE but it then got stuck because one of the libraries contains a mix of
    Rust, C and assembly code. But that's a problem for next week...)

I am currently leaning towards option 3 but I want to check which is preferred and there are a few possible variants (sketched at the end)

--
Alastair Reid

ps Here is the change. It ignores all intrinsics on the above list during the Cleaner pass (but still reports an error if it hits them during execution).

The main goal of this code is to avoid calling "IL->LowerIntrinsicCall(ii);" in the default case because that LLVM function fails an assertion if it cannot lower the call.
(The function is defined in LLVM in llvm/lib/CodeGen/IntrinsicLowering.cpp)

Some things that I am unsure about are:
- I am a bit uneasy about hardcoding the list of intrinsics to ignore.
- Is there a need for a command line option to enable/disable ignoring unknown intrinsics.

diff --git a/lib/Module/IntrinsicCleaner.cpp b/lib/Module/IntrinsicCleaner.cpp
index a1d4fdda..e88d5300 100644
--- a/lib/Module/IntrinsicCleaner.cpp
+++ b/lib/Module/IntrinsicCleaner.cpp
@@ -10,6 +10,7 @@
 #include "Passes.h"

 #include "klee/Config/Version.h"
+#include "klee/Support/ErrorHandling.h"
 #include "llvm/Analysis/MemoryBuiltins.h"
 #include "llvm/Analysis/ConstantFolding.h"
 #include "llvm/IR/Constants.h"
@@ -20,6 +21,7 @@
 #include "llvm/IR/Instruction.h"
 #include "llvm/IR/Instructions.h"
 #include "llvm/IR/IntrinsicInst.h"
+#include "llvm/IR/IntrinsicsX86.h"
 #include "llvm/IR/Module.h"
 #include "llvm/IR/Type.h"
 #include "llvm/Pass.h"
@@ -340,6 +342,33 @@ bool IntrinsicCleanerPass::runOnBasicBlock(BasicBlock &b, Module &M) {
         break;
       }
 #endif
+
+        // The following intrinsics are known to be missing from KLEE
+        // but can be generated by Rustc and are found in libcore, libstd,
+        // common Rust crates, etc.
+        //
+        // Adding an intrinsic to this list allows KLEE to execute programs that
+        // use these intrinsics as long as the paths explored by KLEE
+        // do not invoke these intrinsics.
+      case Intrinsic::maxnum:
+      case Intrinsic::minnum:
+      case Intrinsic::x86_avx2_pshuf_b:
+      case Intrinsic::x86_avx2_pslli_d:
+      case Intrinsic::x86_avx2_psrli_d:
+      case Intrinsic::x86_avx_ldu_dq_256:
+      case Intrinsic::x86_avx_vzeroupper:
+      case Intrinsic::x86_sse2_pause:
+      case Intrinsic::x86_sse2_pslli_d:
+      case Intrinsic::x86_sse2_psrli_d:
+      case Intrinsic::x86_sse3_ldu_dq:
+      case Intrinsic::x86_ssse3_pshuf_b_128:
+      {
+        const Function *Callee = ii->getCalledFunction();
+        llvm::StringRef name = Callee->getName();
+        klee_warning_once((void*)Callee, "unsupported intrinsic %.*s", (int)name.size(), name.data());
+        break;
+      }
+
       default:
         IL->LowerIntrinsicCall(ii);
         dirty = true;




pps Some alternatives to having a hardcoded list of intrinsics to ignore are

1) to change the last three lines of the above (i.e., the "default case") to only invoke
   LowerIntrinsicCall if the intrinsic is one of the intrinsics that it will not barf on.
   (LowerIntrinsicCall accepts 44 intrinsics. They are all target independent.)

2) to change the hardcoded list of intrinsics to ignore to

    case x86_3dnow_pavgusb ... x86_xtest :

    which will match against all x86 intrinsics.
    (However, this also feels fragile - what if a new intrinsic is added before pavgusb or
    after xtest?)


_______________________________________________
klee-dev mailing list
klee-dev at imperial.ac.uk<mailto:klee-dev at imperial.ac.uk>
https://mailman.ic.ac.uk/mailman/listinfo/klee-dev

-------------- next part --------------
HTML attachment scrubbed and removed


More information about the klee-dev mailing list