[klee-dev] Non-determinism in KLEE
Hongxu Chen
leftcopy.chx at gmail.com
Tue Nov 12 16:10:39 GMT 2013
Dear all,
We are doing some experiments related to determinism in KLEE.
I found that there were at least two earlier threads about it:
1. Non-determinism in Klee (http://keeda.stanford.edu/pipermail/klee-dev/2010-September/000470.html)
2. computing the coverage (http://keeda.stanford.edu/pipermail/klee-dev/2010-March/000251.html)
Unfortunately I failed to fully understand them.
So here is what we've done:
(1) We basically follow the options from the "coreutils experiments" page.
klee \
--simplify-sym-indices --write-cvcs --write-cov --output-module \
--max-memory=1000 --disable-inlining --optimize --use-forked-solver \
--use-cex-cache --libc=uclibc --posix-runtime \
--allow-external-sym-calls --only-output-states-covering-new \
--environ=test.env --run-in=/tmp/sandbox \
--max-sym-array-size=4096 --max-instruction-time=30. --max-time=3600. \
--watchdog --max-memory-inhibit=false --max-static-fork-pct=1 \
--max-static-solve-pct=1 --max-static-cpfork-pct=1 --switch-type=internal \
--randomize-fork --search=random-path --search=nurs:covnew \
--use-batching-search --batch-instructions=10000 \
./rm.bc --sym-args 0 1 10 --sym-args 0 2 2 --sym-files 1 8 --sym-stdout
First, we changed the search strategy to DFS, i.e.
--search=dfs
But when tested with a slightly modified rm case, we found some HUGE
differences in running time: KLEE found the error within about 2400s once,
but about one day later it found the exact same error within only 30s-50s!
So is this a normal result?
The only potential difference I can think of is that the machine I ran KLEE
on may have been used by other CPU-bound tasks when KLEE took 2400s to find
the bug (but since I don't have the privilege to know the details of the
machine, I cannot be sure).
(2) Later, in order to make the results a bit more deterministic, we also
1. discarded "--randomize-fork"
2. discarded "--use-batching-search --batch-instructions=10000"
So the final options we are using are
klee \
--simplify-sym-indices --write-cvcs --write-cov --output-module \
--max-memory=1000 --disable-inlining --optimize --use-forked-solver \
--use-cex-cache --libc=uclibc --posix-runtime \
--allow-external-sym-calls --only-output-states-covering-new \
--environ=test.env --run-in=/tmp/sandbox \
--max-sym-array-size=4096 --max-instruction-time=30. --max-time=3600. \
--watchdog --max-memory-inhibit=false --max-static-fork-pct=1 \
--max-static-solve-pct=1 --max-static-cpfork-pct=1 --switch-type=internal \
--search=dfs \
./rm.bc --sym-args 0 1 10 --sym-args 0 2 2 --sym-files 1 8 --sym-stdout
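(For reference, here is a rough sketch of how two runs can be compared
directly; it assumes this KLEE build supports --output-dir, and OPTS is just
a placeholder for the full option list above.)

# run the same invocation twice into separate output directories
OPTS="--search=dfs --switch-type=internal"   # shortened; full list as above
klee --output-dir=run1 $OPTS ./rm.bc --sym-args 0 1 10 --sym-args 0 2 2 --sym-files 1 8 --sym-stdout
klee --output-dir=run2 $OPTS ./rm.bc --sym-args 0 1 10 --sym-args 0 2 2 --sym-files 1 8 --sym-stdout
diff -r run1 run2      # compare generated test files / path conditions
klee-stats run1 run2   # compare instruction counts and times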
However, it seems that there are still some differences in running time even
on a SINGLE machine (the time is still unstable: from what we observed, the
longest run may be more than 10% longer than the shortest one).
And for two machines with almost the same power and system configuration,
the running time difference is even bigger.
The counterexample path conditions also show several differences for a
simple test case (I only diffed the xxx.pc files and noticed several
changes, but did not get a good understanding of their semantics).
Is this reasonable?
(3) I also tested with a script, running a simple case:
This case is taken from one of the previous issues on GitHub:
https://github.com/ccadar/klee/issues/50
Only the "main" function's signature has been changed to the two-argument version.
#include <assert.h>
#include <klee/klee.h>

const char *const errmsg[2] = {0, };

const char *get_error_message(int err) {
  char const *x = errmsg[err];
  return x;
}

int main(int argc, char** argv) {
  int err;
  klee_make_symbolic(&err, sizeof(err), "err");
  get_error_message(err);
}
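(The bitcode is built in the usual way, e.g. as below; the include path is
just a placeholder for wherever the KLEE headers are installed.)

clang -I /path/to/klee/include -emit-llvm -c -g test.c -o test.bc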
I ran it with a script like the one below:
while [ 1 ]
do
  klee --search=dfs test.bc
  sleep 10
done
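(A sketch of one way to record the per-run times: wrap each run with GNU
time and read per-run statistics from the klee-last symlink via klee-stats.)

while [ 1 ]
do
  /usr/bin/time -a -o time.log klee --search=dfs test.bc
  klee-stats klee-last >> stats.log
  sleep 10
done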
From the 306 runs KLEE executed, the longest time is 76.88s (50.15%) and the
shortest is 41.89s (TSolver: 48.22%).
So is this common?
Also, I notice that when using a zero-argument version of "main", the time is
much shorter; is that because of the function call "stack" or the
environment (though there is no posix-runtime here)?
Best regards,
Hongxu Chen