Supported by Clang and GCC (MSan is Clang-only)
  1. Compiler-based dynamic analysis: checks are inserted at compile time and reliably detect errors at run time
  2. Address Sanitiser:
    • ASan, -fsanitize=address
    • To get reasonable performance, one can use -O1
    • To get nicer stack trace in error messages use -fno-omit-frame-pointer
    • Detects out-of-bounds accesses, use-after-free errors, memory leaks, etc.
    • Can detect stack buffer overflow
    • Accesses made by uninstrumented code are not checked (its allocations are still tracked, so later accesses from instrumented code are checked)
    • Instrumentation is implemented as an LLVM compiler pass
    • Before each memory access (*addr), the pass adds an if_poisoned check that reports an error
    • Memory mapping and instrumentation:
      • Virtual memory is divided into two disjoint classes (Mem and Shadow)
      • Shadow contains metadata about the actual memory used by the application (Mem)
      • Poisoning a byte in application memory means writing a special byte into the Shadow partition
      • Both classes are laid out in a way which makes conversion Shadow<->Mem fast
      • ASan maps 8 bytes of the application memory to 1 byte of the Shadow memory
    • Shadow memory:
      • All 8 bytes are addressable: shadow_byte = 0
      • No bytes are addressable: shadow_byte < 0
      • Only the first k bytes are addressable (1 <= k <= 7): shadow_byte = k
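The mapping and the shadow encoding above can be sketched as a toy model. This is a minimal sketch, not ASan's actual runtime: the 64-byte application range, the `shadow` array, and the helpers `mem_to_shadow`, `set_shadow`, and `access_ok` are hypothetical names, the platform-specific offset that the real mapping adds is left out, and the access is assumed not to cross an 8-byte granule.

```cpp
#include <cstddef>
#include <cstdint>

// Toy address space: 64 bytes of application memory (Mem),
// described by 8 shadow bytes (Shadow). Hypothetical sizes.
constexpr std::size_t kMemSize = 64;
constexpr std::size_t kGranule = 8;  // 8 application bytes per shadow byte

// Zero-initialised: every granule starts fully addressable.
static std::int8_t shadow[kMemSize / kGranule] = {};

// Mem -> Shadow conversion: a shift (the real mapping also adds an offset).
std::size_t mem_to_shadow(std::size_t addr) { return addr >> 3; }

// 0 = all 8 bytes addressable, k in 1..7 = first k bytes, negative = none.
void set_shadow(std::size_t granule, std::int8_t value) {
    shadow[granule] = value;
}

// The check inserted before an access of `size` bytes at `addr`.
// Assumes the access stays inside a single granule.
bool access_ok(std::size_t addr, std::size_t size) {
    const std::int8_t k = shadow[mem_to_shadow(addr)];
    if (k == 0) return true;   // whole granule addressable
    if (k < 0) return false;   // whole granule poisoned
    const std::size_t last_byte = (addr & (kGranule - 1)) + size - 1;
    return last_byte < static_cast<std::size_t>(k);
}
```

For example, after `set_shadow(1, 4)` a 4-byte access at address 8 passes, while a 4-byte access at address 10 fails because it touches bytes beyond the first four of that granule.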
    • In order to catch stack buffer overflow:
      • Allocate a 32 byte poisoned buffer before an array on the stack
      • Allocate a 32 byte poisoned buffer after the array on the stack
      • Pad the array on the stack to a multiple of 32 bytes
      • The current compact mapping cannot catch unaligned accesses that are only partially OOB. A viable solution is described here (https://github.com/google/sanitizers/issues/100), but it comes at a performance cost
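The stack layout described above comes down to a little arithmetic. The sketch below is illustrative only: `StackLayout`, `round_up`, and `layout_for_array` are hypothetical names, offsets are relative to an imagined start of the instrumented frame region, and the 32-byte red zone is taken from the notes rather than from any particular compiler version.

```cpp
#include <cstddef>

constexpr std::size_t kRedzone = 32;  // red-zone size used in the notes

// Byte offsets of each region, relative to the start of the layout.
struct StackLayout {
    std::size_t left_redzone_begin;   // poisoned
    std::size_t array_begin;          // addressable
    std::size_t padding_begin;        // poisoned padding up to a multiple of 32
    std::size_t right_redzone_begin;  // poisoned
    std::size_t end;
};

constexpr std::size_t round_up(std::size_t n, std::size_t align) {
    return (n + align - 1) / align * align;
}

StackLayout layout_for_array(std::size_t array_bytes) {
    StackLayout l{};
    l.left_redzone_begin = 0;
    l.array_begin = kRedzone;
    l.padding_begin = l.array_begin + array_bytes;
    l.right_redzone_begin = l.array_begin + round_up(array_bytes, kRedzone);
    l.end = l.right_redzone_begin + kRedzone;
    return l;
}
```

Under these assumptions a 10-byte array occupies offsets 32 to 41, is padded up to offset 64, and the whole region spans 96 bytes.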
    • The runtime library replaces malloc/free and provides reporting functions like __asan_report_load8
    • The replacement malloc allocates each block of memory with red zones around it
    • The replacement free poisons the shadow values for the entire region and puts the chunk into a quarantine queue, which delays its reuse
    • Cannot deal with globals allocated in the uninstrumented code
    • More details: https://github.com/google/sanitizers/wiki/AddressSanitizer
    • False positives:
      • None by design
    • False Negatives:
      • Overflows bypassing the red-zone padding
      • Use-after-free with large allocations in-between
        1. Memory may be reused
        2. A stale pointer could now access a valid address that was recently allocated
      • Unaligned access that is partially OOB:
        1. int* a = new int[2]; --> array occupies bytes [0-7]
        2. int* pointer = (int*)((char*)a + 6);
        3. *pointer = 1; --> access touches bytes [6-9], of which 8 and 9 are out of range
      • Cannot detect errors in custom allocators
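The use-after-free false negative follows from the quarantine having a finite size. The toy model below is a sketch under stated assumptions, not ASan's allocator: `ToyQuarantine`, its byte budget, and the integer chunk ids are all hypothetical, and eviction from the quarantine is treated as immediate reuse of the chunk.

```cpp
#include <cstddef>
#include <deque>
#include <set>

class ToyQuarantine {
private:
    struct Entry {
        int id;
        std::size_t size;
    };
    std::size_t capacity_;
    std::size_t used_ = 0;
    std::deque<Entry> queue_;  // FIFO of freed chunks
    std::set<int> poisoned_;   // chunks whose shadow is still poisoned

public:
    explicit ToyQuarantine(std::size_t capacity_bytes)
        : capacity_(capacity_bytes) {}

    // Called on free: the chunk is poisoned and parked instead of reused.
    void on_free(int chunk_id, std::size_t size) {
        queue_.push_back({chunk_id, size});
        poisoned_.insert(chunk_id);
        used_ += size;
        // Over budget: evict the oldest chunks, which may then be reused.
        while (used_ > capacity_ && !queue_.empty()) {
            const Entry oldest = queue_.front();
            queue_.pop_front();
            poisoned_.erase(oldest.id);
            used_ -= oldest.size;
        }
    }

    // A stale access is reported only while the chunk is still poisoned.
    bool stale_access_detected(int chunk_id) const {
        return poisoned_.count(chunk_id) != 0;
    }
};
```

With a 100-byte budget, freeing a 40-byte chunk and then an 80-byte chunk evicts the first one, so a stale pointer into the first chunk would no longer be reported in this model.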
  3. Memory Sanitiser:
    • MSan, -fsanitize=memory
    • Detects uninitialised memory accesses
    • To get nicer stack trace in error messages use -fno-omit-frame-pointer
    • Can track every uninitialised address to its origin with -fsanitize-memory-track-origins
    • Additional functionality (like origin tracking) comes with a higher performance cost
    • MSan implements a subset of the functionality found in Valgrind (but it is much faster)
    • Bit-level granularity is important for bitfields and bitarrays
    • Algorithm:
      • Shadow each bit of allocated memory with a validity bit
      • When uninitialised memory is allocated, set its validity bits to 1 (1 marks a bit as uninitialised)
      • Every operation that produces a value also updates the validity bits of the result (including copies)
      • Before each read that affects the program's observable behaviour, the validity bits are checked
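The algorithm above can be illustrated with a toy value type that carries its own shadow bits. This is a sketch, not MSan's implementation: `Shadowed` and the helper functions are hypothetical, and the propagation rule for addition (OR of the operand shadows) is a conservative simplification chosen for the example.

```cpp
#include <cstdint>

// A 32-bit value paired with its shadow: a set shadow bit means the
// corresponding value bit is uninitialised.
struct Shadowed {
    std::uint32_t value;
    std::uint32_t shadow;
};

// Freshly allocated memory: every bit is uninitialised.
Shadowed make_uninit() { return {0u, 0xFFFFFFFFu}; }

// Constants are fully initialised.
Shadowed make_const(std::uint32_t v) { return {v, 0u}; }

// Writing a bit range (e.g. a bitfield) initialises only those bits.
Shadowed set_bits(Shadowed x, std::uint32_t mask, std::uint32_t bits) {
    return {(x.value & ~mask) | (bits & mask), x.shadow & ~mask};
}

// AND with a constant mask: only the kept bits can still be uninitialised.
Shadowed and_const(Shadowed a, std::uint32_t mask) {
    return {a.value & mask, a.shadow & mask};
}

// Addition: conservatively, any uninitialised input bit taints the result.
Shadowed add(Shadowed a, Shadowed b) {
    return {a.value + b.value, a.shadow | b.shadow};
}

// The check placed before a use that affects observable behaviour.
bool use_is_reported(Shadowed x) { return x.shadow != 0u; }
```

Initialising only the low four bits and then masking with `0xF` yields a value that passes the check, which is why bit-level granularity matters for bitfields; adding the partially initialised word to a constant is still reported.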
    • False Positives:
      • Interaction with uninstrumented code
      • Initialisations performed in uninstrumented code are not intercepted, so the memory still appears uninitialised
      • MSan allows marking objects as "safe" (initialised) to suppress such reports
  4. Undefined Behaviour Sanitiser:
    • UBSan, -fsanitize=undefined
    • Detects division by 0, overshifts, signed integer overflow, indirect call of a function through a function pointer of the wrong type
    • Classic examples: saturating add and abs (what happens close to INT_MAX and INT_MIN?)
    • Once undefined behaviour occurs, anything at all can happen
    • UB in C:
      • Signed integer overflow
      • Reading from uninitialised variable
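      • Forming a pointer beyond one past the last element of an array, or dereferencing the one-past-the-end pointer
      • Accessing an object through a pointer of an incompatible type (strict aliasing)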
      • Calling memcpy with overlapping buffers
    • UB concept:
      • The programming language has rules
      • When optimising the code, the compiler assumes that the rules were followed
      • If the rules are broken, the optimisations can cause unexpected behaviour
    • Saturating add:
      • The compiler treats x + y as mathematical addition that never overflows
      • The optimiser therefore removes the final overflow guard in the saturating add
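The notes do not show the saturating add itself, so the following is an assumed reconstruction of the pattern: `sat_add_broken` checks for overflow after the addition has already happened, and `sat_add` is one well-defined alternative that compares against the limits first. Both function names are hypothetical.

```cpp
#include <climits>

// Broken pattern: the guard runs after x + y, which is already undefined
// when the mathematical result does not fit in an int. The compiler may
// assume no overflow happened and remove the guards.
int sat_add_broken(int x, int y) {
    int sum = x + y;
    if (x > 0 && y > 0 && sum < 0) return INT_MAX;
    if (x < 0 && y < 0 && sum >= 0) return INT_MIN;
    return sum;
}

// Well-defined alternative: test against the limits before adding,
// so the addition is only performed when it cannot overflow.
int sat_add(int x, int y) {
    if (y > 0 && x > INT_MAX - y) return INT_MAX;
    if (y < 0 && x < INT_MIN - y) return INT_MIN;
    return x + y;
}
```

Building the broken variant with -fsanitize=undefined should report the signed integer overflow at run time when it is called with values near INT_MAX or INT_MIN.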
    • Abs:
      • abs(x) is undefined for x = -2^31 (INT_MIN); for every other int the result is non-negative
      • So a guard of (abs(x) < 0) is optimised away
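A small sketch of the abs case, with hypothetical function names: `abs_guard_broken` is the guard the optimiser is entitled to fold to false, and `wide_abs` avoids the problem by widening first. It assumes `long long` is wider than `int`, which holds on mainstream platforms.

```cpp
#include <climits>
#include <cstdlib>

// std::abs(INT_MIN) is undefined, so the compiler may assume the result
// is never negative and replace this whole function with `return false`.
bool abs_guard_broken(int x) {
    return std::abs(x) < 0;
}

// Well-defined alternative: widen before negating, so that the magnitude
// of INT_MIN is representable (assumes long long is wider than int).
long long wide_abs(int x) {
    const long long wide = x;
    return wide < 0 ? -wide : wide;
}
```

Checking `x == INT_MIN` before calling `abs` is another option when the result must stay an `int`.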
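    • Security-critical example: struct x *str = object->str; if (!object) { return error; } --> object is dereferenced before the null check, so the compiler may assume it is non-null and delete the check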
    • Why do languages allow undefined behaviour:
      • Catering for diverse hardware:
        1. x86 signed addition wraps on overflow
        2. MIPS signed addition traps on overflow
      • To allow powerful optimisations:
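        1. Extending an int loop index to long once on a 64-bit machine when it is used to compute addresses, instead of re-extending it on every iteration (valid because signed overflow cannot occur)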
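A loop of the following shape is the kind of code this optimisation targets. The function name `sum_signed_index` is hypothetical, and whether a given compiler actually performs the widening depends on the target and optimisation level.

```cpp
// The index i is a 32-bit int used to address 64-bit memory. Because
// signed overflow is undefined, the compiler may keep i in a 64-bit
// register for the whole loop instead of sign-extending it each time.
long long sum_signed_index(const int* data, int n) {
    long long total = 0;
    for (int i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

If the language instead required i to wrap on overflow, the compiler would have to preserve that wrapping whenever it could not prove the index stays in range.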