Fuzzing: AFL and maybe JavanWartyPig
  1. Testing a system using inputs that are wholly or partially randomly generated
  2. Goal: find inputs that make the SUT (system under test) behave in interesting ways
  3. Oracles:
    • An oracle determines whether a test case passes or fails
    • Usually an oracle for correctness does not exist (beyond source-code assertions)
    • Derived oracles can be used for correctness testing instead (see compiler fuzzing below)
  4. Generation-based fuzzing:
    • Inputs are produced from scratch
  5. Mutation-based fuzzing:
    • Inputs are produced by modifying and/or combining existing inputs (both strategies are sketched below)
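A minimal sketch of the two strategies, assuming a SUT that consumes a byte buffer; the function names and the single-bit-flip mutation are illustrative choices, not taken from any particular fuzzer:

```c
#include <stdlib.h>
#include <string.h>

/* Generation-based: build an input from scratch (here random bytes;
   a smarter generator would follow a grammar or file-format spec). */
size_t generate_input(unsigned char *out, size_t max_len) {
    size_t len = 1 + (size_t)rand() % max_len;
    for (size_t i = 0; i < len; i++)
        out[i] = (unsigned char)rand();
    return len;
}

/* Mutation-based: copy an existing (seed) input and perturb it
   (here a single bit flip; real fuzzers apply many operators). */
size_t mutate_input(const unsigned char *seed, size_t len, unsigned char *out) {
    memcpy(out, seed, len);
    out[(size_t)rand() % len] ^= (unsigned char)(1u << (rand() % 8));
    return len;
}
```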
  6. Types of inputs the fuzzer may aim to produce:
    • Totally invalid inputs (random strings)
    • Somewhat malformed inputs (sequences of tokens)
    • Inputs with a degree of validity (sentences in the grammar, but not well-typed)
    • High-integrity inputs (well-formed inputs free from any undefined behaviour)
  7. Dumb Fuzzing:
    • Generates or mutates inputs without any knowledge of the SUT's input format
    • Used in the original fuzzing paper (Miller et al., 1990), which crashed or hung between a quarter and a third of the UNIX utilities tested
    • Could trigger stack overflows, division by zero, maybe an integer overflow (a harness sketch follows this item)
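A harness in the spirit of that 1990 experiment might pipe wholly random bytes into a utility's stdin and watch for abnormal exits; the ./sut name and the byte counts below are placeholders:

```c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    for (int run = 0; run < 100; run++) {
        /* POSIX popen; ./sut is a placeholder for the utility under test */
        FILE *p = popen("./sut > /dev/null 2>&1", "w");
        if (!p) return 1;
        for (int i = 0; i < 4096; i++)
            fputc(rand() & 0xFF, p);   /* totally invalid input */
        int status = pclose(p);        /* nonzero status: crash or hang? */
        if (status != 0)
            printf("run %d: abnormal exit (status %d)\n", run, status);
    }
    return 0;
}
```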
  8. Smart Fuzzing:
    • Uses some information about the expected input format
    • More likely to get past input validation, so has a higher chance of triggering deeper bugs such as stack underflows and signed integer overflows
    • Can be helped by compiler sanitisers, which turn silent misbehaviour into detectable failures (see the example after this item)
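As an illustration, signed integer overflow is silent undefined behaviour in C, but building with -fsanitize=undefined (supported by both clang and gcc) makes it a loud runtime error, giving the fuzzer a crisp failure signal:

```c
#include <limits.h>

int scale(int x) {
    return x * 3;   /* undefined behaviour when x * 3 exceeds INT_MAX */
}

int main(void) {
    /* compile with: cc -fsanitize=undefined; UBSan reports the overflow */
    return scale(INT_MAX / 2) > 0;
}
```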
  9. Feedback-directed Fuzzing:
    • Can be smart or dumb
    • Input generation or mutation procedure is informed by observing the SUT on previous inputs
    • An input can be flagged if it makes the SUT do something noteworthy:
      • Cover new code
      • Consume more memory than usual
      • Log a previously unseen message
      • Execute a particular syscall
    • Once an input has been flagged:
      • Generation-based fuzzer tries to generate inputs with similar properties
      • Mutation-based fuzzer prioritises it over others when making a mutation
    • Intuition:
      • Input providing new coverage exercises the SUT in a new way
      • Related input may push the SUT further into the unknown
      • A previously unseen message suggests either new coverage or some property that is noteworthy for the SUT (a toy feedback loop is sketched after this item)
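A toy version of this loop, with everything hand-rolled: sut() stands in for a real target and "coverage" is three manually placed probes. The point it illustrates is that keeping flagged inputs turns finding the 4-byte trigger from a 2^32-style lottery into a few cheap single-byte discoveries:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define INPUT_LEN 8
static int new_coverage;            /* set when a probe fires for the first time */

static void probe(int id) {
    static int seen[3];
    if (!seen[id]) { seen[id] = 1; new_coverage = 1; }
}

/* Toy SUT: crashes only on inputs starting with "FUZZ". */
static void sut(const unsigned char *in) {
    if (in[0] == 'F') { probe(0);
        if (in[1] == 'U') { probe(1);
            if (in[2] == 'Z') { probe(2);
                if (in[3] == 'Z') { puts("bug input reached"); abort(); }
            }
        }
    }
}

int main(void) {
    unsigned char corpus[INPUT_LEN], trial[INPUT_LEN];
    memset(corpus, 'A', INPUT_LEN);                    /* seed input */
    for (long i = 0; i < 1000000; i++) {
        memcpy(trial, corpus, INPUT_LEN);
        trial[rand() % INPUT_LEN] = (unsigned char)rand(); /* mutate one byte */
        new_coverage = 0;
        sut(trial);
        if (new_coverage)               /* flagged: prioritise this input */
            memcpy(corpus, trial, INPUT_LEN);
    }
    return 0;
}
```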
  10. Black-box fuzzing:
    • SUT is exercised in an unmodified form
    • Advantages:
      • Can be applied to closed-source SUTs
      • SUT runs at full speed, enabling higher rate of fuzzing
    • Disadvantages:
      • Feedback-directed fuzzing can only make use of externally-visible behaviour
      • No coverage-directed fuzzing -> may leave many execution paths uncovered
  11. Grey-box fuzzing:
    • Instrumentation applied to the SUT, to expose information about its inner workings
    • Typically exposes coverage information (an AFL-style sketch follows this item)
    • Instrumentation can be applied at compile time or at the binary level
    • Advantages:
      • A feedback-directed fuzzer has more information to work with
    • Disadvantages:
      • Instrumentation overhead -> Slower rate of fuzzing
      • Compile-time instrumentation requires access to the SUT sources
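For example, AFL's compile-time instrumentation injects something morally equivalent to the following at each basic block (the 64 KiB map and the shift trick follow AFL's documented design; everything else is simplified):

```c
#include <stdint.h>

#define MAP_SIZE (1 << 16)
uint8_t shared_map[MAP_SIZE];       /* shared with the fuzzer process in real AFL */
static uint16_t prev_location;

/* Called at every basic block; cur_location is a per-block
   compile-time random constant. */
void afl_style_edge_log(uint16_t cur_location) {
    shared_map[cur_location ^ prev_location]++; /* hash of the (prev, cur) edge */
    prev_location = cur_location >> 1;          /* makes A->B distinct from B->A */
}
```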
  12. White-box fuzzing:
    • Analysis via constraint solving is used to generate inputs that exercise different parts of the SUT
    • Can provably cover all execution paths (given enough time)
    • Known as symbolic or concolic execution
    • In many ways this is not really fuzzing
    • Advantages:
      • Constraint solvers generate inputs that reach hard-to-hit parts of the SUT (see the example after this item)
    • Disadvantages:
      • Slow (due to the overhead of constraint solving)
      • Heavyweight; requires a specialised toolchain
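To see why solving helps, consider the guard below: random 32-bit inputs satisfy it with probability about 2^-32, while a symbolic executor records the branch condition as a constraint and a solver returns x = 143 at once. The constants are arbitrary:

```c
#include <stdint.h>
#include <stdio.h>

void sut(int32_t x) {
    /* random fuzzing hits this branch with probability ~2^-32;
       a solver dispatches the constraint 7 * x == 1001 instantly */
    if ((int64_t)x * 7 == 1001)     /* 64-bit multiply avoids signed overflow UB */
        printf("hard-to-hit branch reached (x = %d)\n", x);
}
```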
  13. AFL:
    • Mutation-based, dumb, grey-box fuzzer
    • State of the art in security fuzzing
    • TODO: Read the paper on AFL
  14. Fuzzing compilers:
    • Challenges:
      • Absence of an oracle (for code generation)
      • Nondeterminism
      • Lack of clear language semantics
      • Undefined, unspecified and implementation-defined behaviour:
        1. Undefined - The standard imposes no requirement at all
        2. Unspecified - The standard defines a set of allowable behaviours; the implementation may choose any of them at each occurrence and need not document the choice
        3. Implementation-defined - The standard defines a set of allowable behaviours; the compiler has to choose one, document it, and behave consistently (small C examples follow this list)
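Small C examples of the three categories (the overflow line is commented out, since executing it would itself be undefined behaviour):

```c
#include <limits.h>
#include <stdio.h>

static int f(void) { return printf("f"); }
static int g(void) { return printf("g"); }

int main(void) {
    int x = INT_MAX;
    /* Undefined: signed overflow; the standard imposes no requirement.
       x = x + 1; */

    /* Unspecified: f() and g() may run in either order ("fg" or "gf");
       the compiler need not document or even be consistent about it. */
    int y = f() + g();

    /* Implementation-defined: the width of int; the implementation
       must pick a value and document it. */
    printf("\nint is %zu bits, y = %d\n", sizeof(int) * CHAR_BIT, y);
    return 0;
}
```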
    • Compilers are non-testable because:
      • There does not exist an oracle (for code generation)
      • It is theoretically possible, but practically too difficult to determine the correct output
    • Ways to solve the absence of an oracle for code generation:
      • Derived oracle -- not an absolute source of truth
      • Cross-check implementations:
        1. Method:
          1. Compile the same program with multiple compilers, run the resulting binaries, and compare their outputs
          2. Mismatches indicate bugs
          3. The majority result is probably correct
          4. Known as differential testing (a driver sketch follows this list)
        2. Requirements:
          1. Program must be deterministic
          2. Program must be free from undefined/unspecified behaviour
          3. Multiple compilers must exist
          4. Compilers must agree on implementation-defined behaviour
      • Cross-check equivalent programs:
        1. Method:
          1. Compile equivalent programs with one compiler, run them, and compare their outputs
          2. Mismatches indicate bugs
          3. The majority result is probably correct
          4. This is a form of metamorphic testing (see item 15)
        2. Requirements:
          1. Equivalent programs (obviously)
          2. Program must be deterministic
          3. Program must be free from undefined/unspecified behaviour
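A minimal differential-testing driver, assuming test.c is a deterministic, UB-free program and that gcc and clang are both installed; all names here are placeholders:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Run a command and capture (a prefix of) its stdout. */
static void run(const char *cmd, char *out, size_t cap) {
    FILE *p = popen(cmd, "r");
    size_t n = p ? fread(out, 1, cap - 1, p) : 0;
    out[n] = '\0';
    if (p) pclose(p);
}

int main(void) {
    char out_a[4096], out_b[4096];
    if (system("gcc   -O2 -o sut_gcc   test.c") != 0) return 1;
    if (system("clang -O2 -o sut_clang test.c") != 0) return 1;
    run("./sut_gcc",   out_a, sizeof out_a);
    run("./sut_clang", out_b, sizeof out_b);
    if (strcmp(out_a, out_b) != 0)
        puts("MISMATCH: at least one compiler miscompiled test.c");
    else
        puts("outputs agree");
    return 0;
}
```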
  15. Metamorphic testing:
    • Used when no reference implementation (oracle) exists
    • Exploit a known relation between inputs: transform an input so that the effect on the output is predictable (e.g. sin(x) == sin(x + 2π))
    • In short, find an input transformation whose effect on the output is deterministic (a sketch follows this item)
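For instance, a sine implementation can be tested metamorphically with no oracle for sin(x) itself, since sin(x) and sin(x + 2π) must agree up to rounding error; the tolerance and iteration count below are arbitrary choices:

```c
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    const double TWO_PI = 6.283185307179586; /* 2π; M_PI is not standard C */
    for (int i = 0; i < 1000000; i++) {
        double x = 100.0 * rand() / RAND_MAX;     /* random test input */
        double a = sin(x);
        double b = sin(x + TWO_PI);               /* metamorphic follow-up */
        if (fabs(a - b) > 1e-9)                   /* allow rounding error */
            printf("metamorphic violation at x = %.17g\n", x);
    }
    return 0;
}
```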
  16. Test case reduction:
    • Repeatedly transform (shrink) the input in such a way that the property of interest is preserved, e.g. "still crashes the compiler"
    • C-Reduce is a test case reducer for C programs (a simplified, delta-debugging-style sketch follows)
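C-Reduce itself works by source-level transformations; the sketch below shows the underlying idea at its simplest, greedy chunk deletion in the spirit of delta debugging, where interesting() is a stand-in for "still triggers the bug":

```c
#include <stdio.h>
#include <string.h>

/* Placeholder predicate: here the "bug" fires whenever both
   'a' and 'z' survive in the input. */
static int interesting(const char *in, size_t len) {
    return memchr(in, 'a', len) && memchr(in, 'z', len);
}

/* Try deleting chunks; keep any deletion that preserves the property
   of interest, halving the chunk size as we go. */
static size_t reduce(char *buf, size_t len) {
    for (size_t chunk = len / 2; chunk >= 1; chunk /= 2) {
        for (size_t i = 0; i + chunk <= len; ) {
            char saved[64];                    /* assumes small inputs */
            memcpy(saved, buf + i, chunk);
            memmove(buf + i, buf + i + chunk, len - i - chunk); /* delete */
            if (interesting(buf, len - chunk)) {
                len -= chunk;                  /* smaller input still works */
            } else {                           /* undo the deletion */
                memmove(buf + i + chunk, buf + i, len - i - chunk);
                memcpy(buf + i, saved, chunk);
                i += chunk;
            }
        }
    }
    return len;
}

int main(void) {
    char input[] = "qqqaqqqqzqqq";
    size_t len = reduce(input, strlen(input));
    printf("reduced to %zu bytes: %.*s\n", len, (int)len, input);
    return 0;
}
```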