The art of disrespecting AV (and other old-school controls), Part 3

February 4, 2016 in Malware Analysis, Preaching

This is the third part of the series (part 1, part 2). This time it is somewhat shorter, and it is really just an excuse to jot down some notes about the actual engines that AV uses internally.

Many people complain about AV using hashes to detect malware – I would say that an AV that detects malware via hashes only should not even be on the market, because it would not survive. Your average AV contains a significant number of engines and sub-engines using many algos – many of which are lightning fast. Reducing the discussion about AV internal workings to ‘AV uses hashes’ is simply not fair.

Let’s have a look. I use the word ‘engine’ quite loosely here: an ‘engine’ does not necessarily implement pure detection-specific logic, but it often facilitates the detection itself. Each of these is typically quite a serious programming effort, and they are combined to create the ‘holistic’ coverage. Yes, it fails, and it contains vulnerabilities like any other software, but take a moment to think about the effort that goes into designing and testing all this clustergoodness:

  • static binary string search
  • binary string search with simple wildcards
  • binary string search with a regex (or regex-like) pattern
  • multi-pattern search engines that are using lookup tables of any sort/trees/tries and proprietary algorithms
  • container/archiver processor – reads files or streams embedded inside the other files/containers
  • file/specific content analyzer/processor – for each file type or content type there is a dedicated engine, e.g. MBR, old DOS .COM file, Flash, OLE files, Symbian SIS, ISO, etc. – note that many of these engines become obsolete because the technologies are no longer in use/popular, but they are still _there_
  • unpacker  – decompresses streams of data to present them to other engines
  • emulator – simple state machines with a basic understanding of some opcodes
  • emulator – full-blown emulator with most opcodes supported
  • sandbox – full-blown emulator with support of API & memory
  • hooks – dynamic, for on-access scans
  • heuristics engine
  • whitelisting engine
  • detection engine based on file properties
  • rootkit detection engine
  • native file system parser (for various file systems)
  • memory dumper/file rebuilders
  • online scanner (virustotal-like)
  • behavioral engines
  • reputation engines
  • quarantine engine
  • crc/incremental crc search
  • hash-based search
  • entropy analysis
  • X-rays
  • and finally… removal and repair engine – if none of the above engines impress you… think for a second how much effort goes into ensuring you can remove a complex polymorphic or metamorphic file virus from a gazillion files on the system without corrupting the files and crashing the system.
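
To make one of the simplest entries on the list a bit more concrete, here is a minimal sketch of a binary string search with simple wildcards. It is only an illustration, not how any particular AV product does it: the signature syntax (hex bytes separated by spaces, `??` for "any byte") and the function names `compile_signature` and `find_signature` are my own assumptions for this example.

```python
import re


def compile_signature(sig: str) -> "re.Pattern[bytes]":
    """Compile a hex signature such as 'E8 ?? ?? ?? ?? 5D' into a bytes regex.

    Hypothetical syntax for this sketch: each token is either a two-digit
    hex byte or '??', which matches any single byte.
    """
    parts = []
    for token in sig.split():
        if token == "??":
            parts.append(b".")  # wildcard: any one byte
        else:
            # re.escape keeps bytes like 0x2E ('.') from acting as regex syntax
            parts.append(re.escape(bytes([int(token, 16)])))
    # DOTALL so that '.' also matches the newline byte 0x0A
    return re.compile(b"".join(parts), re.DOTALL)


def find_signature(data: bytes, sig: str) -> int:
    """Return the offset of the first match of sig in data, or -1 if absent."""
    match = compile_signature(sig).search(data)
    return match.start() if match else -1
```

Calling `find_signature(buffer, "E8 ?? ?? ?? ?? 5D")` returns the offset of the first place where the byte `E8` is followed by any four bytes and then `5D`. A real engine would scan for thousands of such patterns at once using the multi-pattern techniques mentioned above rather than one regex per signature.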

There are probably others which I forgot about, but this is really a lot more than just hashing.
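
Entropy analysis from the list above is a good example of a check that has nothing to do with hashes. The sketch below computes the Shannon entropy of a byte buffer in bits per byte. The function name `shannon_entropy` is mine, and how the resulting number is used (thresholds, per-section vs. whole-file) is entirely up to the engine, so treat this as an illustration under those assumptions only.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # frequency of each byte value
    # H = -sum(p * log2(p)) over the byte values that actually occur
    return -sum((c / total) * math.log2(c / total) for c in counts.values())
```

Packed, compressed or encrypted data tends to score close to 8 bits per byte, while plain code and text usually score noticeably lower, which is why such a cheap calculation can be a useful hint for other engines.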

If you talk about AV detection and the only thing you talk about is hash, it is probably because you smoke too much of it… 🙂
