[klee-dev] Dealing with missing intrinsics (Rust support)

Cristian Cadar c.cadar at imperial.ac.uk
Fri Aug 14 18:21:17 BST 2020


+1 to this, we should only terminate those paths that hit unhandled 
intrinsics.  This is what we do with unavailable external functions and 
inline assembly for instance, so we should adopt the same behaviour here.

Best,
Cristian

On 14/08/2020 18:06, Nowack, Martin wrote:
> Hi Alastair,
> 
> I would also rather use option 3.
> Normally, the `LowerIntrinsic` pass should try to replace appropriate 
> calls if possible. It should not terminate it.
> 
> Beside your attached code, I would suggest to change the following code 
> line in `lib/Core/Executor.cpp`:
> https://github.com/klee/klee/blob/088fc21e12c9675161172899be5ef5051f1ae96b/lib/Core/Executor.cpp#L1514
> 
> Instead of an `klee_error` which terminates the whole KLEE execution, 
> one could terminate the current path.
> This allows to explore other paths that don’t contain such instructions 
> or at least explore a path until it reaches such instruction, which I 
> guess might be valuable in itself.
> 
> 
> As you mentioned inline assembly, there is a currently a pass which 
> tries to lift mostly memory barriers into LLVM intrinsics that are 
> handles as NOPs. More complex things are hard to handle.
> (https://github.com/klee/klee/blob/088fc21e12c9675161172899be5ef5051f1ae96b/lib/Module/RaiseAsm.cpp)
> 
> Hope that helps.
> 
> Best,
> Martin
> 
>> On 14. Aug 2020, at 17:33, Alastair Reid <adreid at google.com 
>> <mailto:adreid at google.com>> wrote:
>>
>> Hi,
>> <mailto:klee-dev at imperial.ac.uk>
>>
>> I want to check what the best way is to deal with missing intrinsics 
>> when adapting KLEE for a new language and new standard libraries.
>> In particular, what kind of changes should I make and submit as PRs?
>>
>> I am trying to use KLEE with a modest sized Rust application (about 
>> 20kloc) but that depends on over 100 other libraries ("crates") and 
>> Rust's fairly rich standard library.
>> Given the amount of code involved, it is unsurprising that the 
>> resulting LLVM bitcode files use some intrinsics that KLEE does not 
>> support.
>> I don't know whether I have a complete list but at least the following 
>> are needed:
>>
>>     maxnum               minnum
>>     x86_avx2_pshuf_b     x86_ssse3_pshuf_b_128
>>     x86_avx2_pslli_d     x86_avx2_psrli_d
>>     x86_avx_ldu_dq_256   x86_sse3_ldu_dq
>>     x86_sse2_pslli_d     x86_sse2_psrli_d
>>     x86_sse2_pause
>>     x86_avx_vzeroupper
>>
>> As far as I can tell, KLEE checks that it accepts all intrinsics 
>> before starting symbolic execution.
>> So it can reject a program even if there is no feasible path these 
>> intrinsics just because they are in the bitcode file.
>> I have seen this happen in my current trials (on short test programs 
>> that only use the standard library).
>>
>>
>> I see three solutions - with various pros/cons
>>
>> 1) Implement all missing intrinsics.
>>     This seems like a good option for target independent intrinsics 
>> like maxnum (IEEE float min/max)
>>     And it might be necessary for x86_sse2_pause because LLVM will 
>> emit this instruction even if you disable SSE2 (this instruction is a 
>> NOP on old x86 cores - it makes spinlocks faster on modern cores).
>>
>>     The disadvantage is that implementing all LLVM intrinsics is a 
>> Sisyphean task.
>>     (I have not even explored what intrinsics are required on Arm - 
>> which is a much more
>>     widespread ISA than x86!)
>>
>> 2) Compile the libraries with AVX and SSE features disabled.
>>     This should fix most of the intrinsics (except for 
>> maxnum/minnum/x86_sse2_pause).
>>
>>     The disadvantage of messing with compilation options is that this 
>> becomes quite hard
>>      to do in large, complex projects.
>>
>> 3) Change KLEE to ignore any missing intrinsics unless it actually 
>> executes them.
>>     This would make option 1 less painful because I would only need to 
>> implement
>>     intrinsics that are truly needed.
>>
>>     I have a change that would do this - but I don't think it is ready 
>> to submit as a PR yet.
>>     (diff attached at end).
>>
>>     (Using this change, I got a bit further with the application - I 
>> was able to start running
>>     the code in KLEE but it then got stuck because one of the 
>> libraries contains a mix of
>>     Rust, C and assembly code. But that's a problem for next week...)
>>
>> I am currently leaning towards option 3 but I want to check which is 
>> preferred and there are a few possible variants (sketched at the end)
>>
>> --
>> Alastair Reid
>>
>> ps Here is the change. It ignores all intrinsics on the above list 
>> during the Cleaner pass (but still reports an error if it hits them 
>> during execution).
>>
>> The main goal of this code is to avoid calling 
>> "IL->LowerIntrinsicCall(ii);" in the default case because that LLVM 
>> function fails an assertion if it cannot lower the call.
>> (The function is defined in LLVM in 
>> llvm/lib/CodeGen/IntrinsicLowering.cpp)
>>
>> Some things that I am unsure about are:
>> - I am a bit uneasy about hardcoding the list of intrinsics to ignore.
>> - Is there a need for a command line option to enable/disable ignoring 
>> unknown intrinsics.
>>
>> diff --git a/lib/Module/IntrinsicCleaner.cpp 
>> b/lib/Module/IntrinsicCleaner.cpp
>> index a1d4fdda..e88d5300 100644
>> --- a/lib/Module/IntrinsicCleaner.cpp
>> +++ b/lib/Module/IntrinsicCleaner.cpp
>> @@ -10,6 +10,7 @@
>>  #include "Passes.h"
>>
>>  #include "klee/Config/Version.h"
>> +#include "klee/Support/ErrorHandling.h"
>>  #include "llvm/Analysis/MemoryBuiltins.h"
>>  #include "llvm/Analysis/ConstantFolding.h"
>>  #include "llvm/IR/Constants.h"
>> @@ -20,6 +21,7 @@
>>  #include "llvm/IR/Instruction.h"
>>  #include "llvm/IR/Instructions.h"
>>  #include "llvm/IR/IntrinsicInst.h"
>> +#include "llvm/IR/IntrinsicsX86.h"
>>  #include "llvm/IR/Module.h"
>>  #include "llvm/IR/Type.h"
>>  #include "llvm/Pass.h"
>> @@ -340,6 +342,33 @@ bool 
>> IntrinsicCleanerPass::runOnBasicBlock(BasicBlock &b, Module &M) {
>>          break;
>>        }
>>  #endif
>> +
>> +        // The following intrinsics are known to be missing from KLEE
>> +        // but can be generated by Rustc and are found in libcore, 
>> libstd,
>> +        // common Rust crates, etc.
>> +        //
>> +        // Adding an intrinsic to this list allows KLEE to execute 
>> programs that
>> +        // use these intrinsics as long as the paths explored by KLEE
>> +        // do not invoke these intrinsics.
>> +      case Intrinsic::maxnum:
>> +      case Intrinsic::minnum:
>> +      case Intrinsic::x86_avx2_pshuf_b:
>> +      case Intrinsic::x86_avx2_pslli_d:
>> +      case Intrinsic::x86_avx2_psrli_d:
>> +      case Intrinsic::x86_avx_ldu_dq_256:
>> +      case Intrinsic::x86_avx_vzeroupper:
>> +      case Intrinsic::x86_sse2_pause:
>> +      case Intrinsic::x86_sse2_pslli_d:
>> +      case Intrinsic::x86_sse2_psrli_d:
>> +      case Intrinsic::x86_sse3_ldu_dq:
>> +      case Intrinsic::x86_ssse3_pshuf_b_128:
>> +      {
>> +        const Function *Callee = ii->getCalledFunction();
>> +        llvm::StringRef name = Callee->getName();
>> +        klee_warning_once((void*)Callee, "unsupported intrinsic 
>> %.*s", (int)name.size(), name.data());
>> +        break;
>> +      }
>> +
>>        default:
>>          IL->LowerIntrinsicCall(ii);
>>          dirty = true;
>>
>>
>>
>>
>> pps Some alternatives to having a hardcoded list of intrinsics to 
>> ignore are
>>
>> 1) to change the last three lines of the above (i.e., the "default 
>> case") to only invoke
>>    LowerIntrinsicCall if the intrinsic is one of the intrinsics that 
>> it will not barf on.
>>    (LowerIntrinsicCall accepts 44 intrinsics. They are all target 
>> independent.)
>>
>> 2) to change the hardcoded list of intrinsics to ignore to
>>
>>     case x86_3dnow_pavgusb ... x86_xtest :
>>
>>     which will match against all x86 intrinsics.
>>     (However, this also feels fragile - what if a new intrinsic is 
>> added before pavgusb or
>>     after xtest?)
>>
>> _______________________________________________
>> klee-dev mailing list
>> klee-dev at imperial.ac.uk <mailto:klee-dev at imperial.ac.uk>
>> https://mailman.ic.ac.uk/mailman/listinfo/klee-dev
> 
> 
> _______________________________________________
> klee-dev mailing list
> klee-dev at imperial.ac.uk
> https://mailman.ic.ac.uk/mailman/listinfo/klee-dev
> 



More information about the klee-dev mailing list