Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Path not found #23

Open
vanhauser-thc opened this issue Sep 21, 2020 · 4 comments
Open

Path not found #23

vanhauser-thc opened this issue Sep 21, 2020 · 4 comments
Labels
documentation Improvements or additions to documentation enhancement New feature or request

Comments

@vanhauser-thc
Copy link

This example file:

#include <stdio.h>
#include <string.h>
#include <stdarg.h>
#include <stdlib.h>
#include <stdint.h>
#include <unistd.h>
int main(int argc, char *argv[]) {

  char    buf[1024];
  ssize_t i;
  if ((i = read(0, buf, sizeof(buf) - 1)) < 24) return 0;
  buf[i] = 0;
  if (buf[0] != 'A') return 0;
  if (buf[1] != 'B') return 0;
  if (buf[2] != 'C') return 0;
  if (buf[3] != 'D') return 0;
  if (memcmp(buf + 4, "1234", 4) || memcmp(buf + 8, "EFGH", 4)) return 0;
  if (strncmp(buf + 12, "IJKL", 4) == 0 && strcmp(buf + 16, "DEADBEEF") == 0)
    abort();
  return 0;

}

the abort() can be triggered with:
echo -en 'ABCD1234EFGHIJKLDEADBEEF\0'|./test

when I compile the test.c with symcc, and feed the most advanced input it finds, it can not get past "ABCD1234EFGHAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA", so it cannot solve the last strcmp with DEADBEEF\0 - I guess this is a bug? or is this too complex to solve?

@sebastianpoeplau
Copy link
Collaborator

@vanhauser-thc this is most likely because our symbolic understanding of libc is incomplete. So when you call strncmp, SymCC can't trace the computations that happen in the function. There are several ways to address the issue:

  1. We can add strncmp and strcmp to the collection of wrapped libc functions and register the wrapper in the compiler. This is quick and easy, but it doesn't scale if your target uses many libc functions on symbolic data.
  2. Alternatively, one can build a fully instrumented libc (like, for example, KLEE does with uclibc). I refrained from doing this so far because it seems like a big burden on the user to set up their own instrumented libc; maybe we could automate the process to make it less tedious.
  3. Something I've done in the past is to cherry-pick individual libc functions from a libc implementation (I used musl) and include them in the target program. For example, the implementation of strncmp is self-contained, so you can just add it to your build (e.g., symcc -o test test.c /path/to/musl/src/string/strncmp.c).

Suggestions how to make such situations easier to deal with are highly welcome :)

@sebastianpoeplau sebastianpoeplau added documentation Improvements or additions to documentation enhancement New feature or request labels Sep 22, 2020
@vanhauser-thc
Copy link
Author

I think string compare functions make sense to add (so solution 1) as these are a limited number of common functions. 2 sounds like too much overhead and 3) should only be needed to be done for very specific cases (e.g. a third party library that is binary only),

@adrianherrera
Copy link
Contributor

KLEE's libc may also be a good starting point for 1 and/or 3: https://github.com/klee/klee/tree/master/runtime/klee-libc

@yiyunliu
Copy link

Has anyone already tried 2)? If all it takes to get full support of libc functions is to compile uclibc with SymCC from scratch then I don't mind the extra work. Are there any additional steps required to set up the instrumented libc that I'm missing?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

4 participants