[klee-dev] On KLEE's ability to perform formula-slicing

Mon Mar 2 17:46:17 GMT 2015

Hey Andrea,
I actually noticed the same thing.  I believe that what you're seeing is coming from the fact that during its exploration phase, Klee slices the incoming constraints and only checks the Satisiability of the the newest subformula.  You can see this if you look at the computeTruth, computeValidity, and the computeValue functions of lib/Solver/IndependentSolver.cpp which are tasked with solving/operating on only the newest piece of the formula.  These functions utilize the IndependentElementSet class (also in IndepenentSolver.cpp), which does the graph based analysis I think you used. 

After a path reaches the exit of the program or a termination condition is met (too much time, forks, etc), Klee enters its test generation phase.  During this phase, complete solutions to the set of constraints along a particular path are required.  A different function, computeInitialValues, is invoked to solve for these full constraints.  The results for this function become the tests that Klee outputs.  (Note: Usually for large programs during a default search this phase occurs almost exclusively after Klee times out.  It is possible, however, for this phase to be mixed in with the regular exploration phase.  This occurs when a path reaches a program exit.  In this case, a test is generated immediately.)

I think it's likely that you're observation of sliceable constraints comes from this particular part of the program.  In it's current form, computeInitialValues passes these large constraints forward without any analysis. (line 289 of IndependentSolver.cpp) I believe, however, that it is possible to slice the larger constraint, solve the resulting pieces individually, and then reassemble them to create a complete solution.  I have submitted a pull request that will handle a lot of these larger, unsliced queries that appear towards the end of Klee's execution. (#198 "Independent Solver GetInitalValues")  I'm working with Christian Cadar right now to get them integrated.

To test whether the computeInitialValues function is indeed the place where these unsliced formula's are coming from, I suggest running Klee with the additional option --no-output.  This eliminates the test generation phase and should similarly eliminate a lot of the unsliced calls.

Just as a warning, this is the first time I've posted an answer on this forum, so be open to the possibility of a complete reversal coming from a more experienced kleer.

Eric Rizzi

________________________________________
From: klee-dev-bounces at imperial.ac.uk <klee-dev-bounces at imperial.ac.uk> on behalf of Andrea Aquino <andrex.aquino at gmail.com>
Sent: Monday, March 2, 2015 5:01 AM
To: klee-dev at imperial.ac.uk
Subject: [klee-dev] On KLEE's ability to perform formula-slicing

Dear all,

I succeeded in extracting the formulas generated by KLEE during the analysis of the coreutils tools. I am currently running KLEE (llvm version 2.9) on a Ubuntu-linux 32 bit virtual machine setting a timeout of 60 seconds with the following command:

klee --use-cache=false —use-cex-cache=false —use-query-log=all:smt2 —use-query-log=solver:smt2 --simplify-sym-indices --write-cvcs --write-cov --output-module --max-memory=1000 --disable-inlining --optimize --use-forked-solver --libc=uclibc --posix-runtime --allow-external-sym-calls --only-output-states-covering-new --environ=test.env --run-in=/tmp/sandbox --max-sym-array-size=4096 --max-instruction-time=30. --max-time=60. --watchdog --max-memory-inhibit=false --max-static-fork-pct=1 --max-static-solve-pct=1 --max-static-cpfork-pct=1 --switch-type=internal --randomize-fork --search=random-path --search=nurs:covnew --use-batching-search --batch-instructions=10000 ./PROGRAM.bc --sym-args 0 1 10 --sym-args 0 2 2 --sym-files 1 8 --sym-stdout

Everything is ok and several thousands of formulas are generated. Having disabled both the standard cache and the counter-example cache, the number of formulas in the logged files (all-formulas and solver-formulas) are the same.

Now I was trying to slice these formulas with my own procedure. Basically I am separating every formula into the set of its independent subformulas (i.e. subformulas that do not share any variable). To do so I create a graph consisting of a node for each clause in the original formula, then I connect the nodes representing two clauses iff they have at least a variable in common, finally I extract the connected components of the graph.

As far as I know, KLEE does this too (at least to some extent, that is using path condition slicing), that is why I was assuming my slicing procedure not to be able to slice any formula generated during the abovementioned analysis. Suprisingly, I found some formulas that was actually sliceable.

I attach to this email a formula “sliceable.smt2” that my tool can slice in 9 independent subformulas. This formula belongs to the analysis of the coreutil program [.bc (test).
I manually double-checked it and it seems the formula can indeed be sliced into 9 subformulas (composed of a single variable each). Notice that even if the formula defines 12 variables only 9 of them are actually used in its assertion.

My questions are the followings:

is it normal that KLEE generates formulas that are sliceable even if it performs formulas slicing?

am I doing some mistake in the slicing of this formula or is it indeed sliceable in 9 subformulas?

am I missing some flag to enable the slicing capability of KLEE?

Best regards,
Andrea Aquino