Removed some nonsense, fixed grammar mistakes and added info about regression.
I am really tired today, so the only thing I can do is preach. I have a few (I hope cool) research posts piled up, but really no time recently to polish and publish them. Please wait for the second half of November, when I am back from holidays.
Today I want to talk about Virus Total.
It’s an awesome web site that went from a resource known to a few to ‘yet another lucky guy acquired by Google’.
Now, let me tell you one thing: Virus Total’s importance is the biggest B/S in the IR universe. Note that I love VT, I just hate the perception of its importance.
There are many reasons, but the simplest to pick up on is “but VT says so”.
It is not uncommon nowadays for people, often including those who can’t distinguish a virus from a trojan, to use VT on a daily basis and treat its statistics as a deity that tells them about law & order in the software/sample universe.
I uploaded the file XYZ to VT and it says: bad.
Let me tell you a little secret of the antivirus industry here:
- Problem: Lots of samples. Mucho unknown samples.
- Question: How to cut corners?
- Answer: Use other AV engines to tell us if it is bad; if we get lucky, we generate an automatic def/sig for it and move on. We can be smart and rely on that judgment only if at least 1, 2, 3…N of them say so, but we are still cutting corners.
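To make the shortcut concrete, here is a minimal sketch of what such an "N others say so" rule could look like. The function name, the engine names and the threshold are all made up for illustration; no vendor's actual pipeline is being quoted here.

```python
# Hypothetical sketch of the "at least N other engines say bad" shortcut.
# Engine names and the threshold are illustrative assumptions only.

def auto_verdict(other_engine_verdicts, threshold=3):
    """Return True if at least `threshold` other engines flag the sample."""
    hits = sum(1 for flagged in other_engine_verdicts.values() if flagged)
    return hits >= threshold


# What the other engines say about some unknown sample.
verdicts = {
    "engine_a": True,
    "engine_b": False,
    "engine_c": True,
    "engine_d": True,
}

# No human ever looks at the file: the decision is a head count.
flag_automatically = auto_verdict(verdicts, threshold=3)
print(flag_automatically)
```

Note that nothing in this rule checks whether engine_a, engine_c and engine_d reached their verdicts independently or by running the very same rule against each other.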
Yup. You heard that right.
Your VT score is now worth mierda. One guy detects it, suddenly everyone detects it. Human involvement = 0, maybe 1. All of it is bots at work. And sometimes the first guy realizes it was an FP and removes the sig. Yet the others who blindly followed don’t. Not everyone cares about, or can afford, regression testing, so the file remains ‘detected’ forever.
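The cascade is easy to play out as a toy simulation. Everything in this sketch is an assumption made for illustration: the number of followers, how many of them run regression tests, and the idea that every follower copies the first detection.

```python
# Toy simulation of a false positive spreading between engines.
# All names and numbers are hypothetical; this models the argument, not real data.

def simulate_cascade(num_followers, followers_with_regression):
    """Return (peak_score, final_score) for one false-positive cascade."""
    # One engine flags the file first (it later turns out to be an FP).
    detections = {"first_engine"}

    # The followers copy the verdict automatically.
    followers = [f"follower_{i}" for i in range(num_followers)]
    detections.update(followers)
    peak_score = len(detections)

    # The first engine notices the FP and removes its sig.
    detections.discard("first_engine")

    # Only followers that do regression testing re-check and drop theirs.
    for name in followers[:followers_with_regression]:
        detections.discard(name)

    return peak_score, len(detections)


peak, final = simulate_cascade(num_followers=10, followers_with_regression=2)
print(peak, final)
```

With these made-up numbers the file peaks at 11 detections and still sits at 8 after the original detection is gone, even though exactly one engine ever made a judgment of its own.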
VT is a resource that presents you with aggregated information from various providers including, but not limited to:
- antivirus vendors
- results of running various proprietary tools
but it DOES NOT tell you how many of these detections/hits are derived from one another, whether copied from other vendors on the list or produced by some heuristic rules.
Read the score. Understand it. But also understand the context of it.
The score is unreliable & you should NOT use it blindly, or you will let bots replace you as the decision maker.