Enter Sandbox – part 17: The clique of clickers

Every sandbox needs to click stuff. Be it a ‘Next’, ‘Finish’, or ‘OK’ button – when the sample runs and the GUI shows up, these buttons simply have to be pressed somehow to ensure the execution continues.

Writing a clicker tool that handles it smoothly is not a trivial task.

First, it has to click only windows that are created by new processes – this can be kinda easily achieved by PID filtering, e.g. based on a process snapshot obtained prior to launching the sample. But don’t be fooled – OS message boxes, running services, or other programs may sometimes require additional ‘out of band’ clicking (e.g. hard error messages from the csrss.exe process). There may also be some issues if the sandbox tools don’t handle errors silently (e.g. exceptions in .NET tools).
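
The PID filtering idea can be sketched roughly as below. This is only a minimal illustration, not a finished clicker: the `Window` record, the `clickable_windows` helper and the `OUT_OF_BAND` allowlist are hypothetical names made up for this example, and the actual enumeration of windows and their owning PIDs (on Windows, typically via EnumWindows and GetWindowThreadProcessId) is left out so that only the filtering logic is shown.

```python
from dataclasses import dataclass

# Hypothetical allowlist: processes that existed before the sample was launched,
# but whose dialogs may still need a click (e.g. hard error boxes from csrss.exe)
OUT_OF_BAND = {"csrss.exe"}


@dataclass(frozen=True)
class Window:
    hwnd: int           # window handle
    pid: int            # PID of the process that owns the window
    process_name: str   # image name of that process


def clickable_windows(windows, baseline_pids, out_of_band=OUT_OF_BAND):
    """Keep windows owned by processes missing from the pre-launch snapshot,
    plus windows owned by allowlisted 'out of band' processes."""
    result = []
    for w in windows:
        is_new = w.pid not in baseline_pids
        is_out_of_band = w.process_name.lower() in out_of_band
        if is_new or is_out_of_band:
            result.append(w)
    return result


# Snapshot taken before the sample runs, then windows seen afterwards
baseline = {4, 500, 612}
seen = [
    Window(1, 500, "explorer.exe"),   # pre-existing, ignored
    Window(2, 4242, "setup.exe"),     # new process, clickable
    Window(3, 612, "CSRSS.EXE"),      # pre-existing, but allowlisted
]
targets = clickable_windows(seen, baseline)
```

Note that the sketch assumes PIDs are not reused between the snapshot and the check; a more careful version would probably also compare process creation times.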

Next, the actual clicking…

Oh boy… this is a tricky one…

Do we send a keyboard event, or mouse event?

Do we even know where the actual button we must click… is?

For installers that rely on the Windows GUI system to draw their components, it’s quite easy. Every such GUI control is a separate window which we can enumerate, find, inspect (class name, text) and send a message to, if we need to. Unfortunately, the generic solution is much harder to write, because:

  • Borland/Delphi/Embarcadero etc. use VCL and its own components (and programmers often created their own libraries with dedicated owner-drawn controls – this was really popular in the late 90s and early noughties)
  • .NET uses its own components (read about Windows Presentation Foundation (WPF))
  • Office uses its own GUI components (forms, etc.) – read about UI automation below and about Office UI Fabric
  • Java uses its own components; many testing frameworks rely on access to the source code (from what I learnt so far), so it’s hard to do a blackbox test
  • Adobe Flash and Silverlight are also custom drawn; the same goes for Google Chrome, Mozilla Firefox and many other software packages that may, and often do, use their own GUI libraries
  • Some (especially old-school) applications use their own components that are proprietary and only known to the software company that produced them, so it’s hard to write a generic clicker for them (Lotus Notes anyone?)
  • Then there are CLI tools that require manual input; these are impossible to handle in a generic way (interestingly, many executables submitted to VT are student projects and require manual input to produce some results; while not malicious, they pose a significant challenge to sandboxes)
  • The GUI elements can be localized
  • The localization itself is one issue; the other is the fact that localized software, especially older software, often relies on the old ANSI code pages and ANSI APIs, so you have to guess what code page is being used by the sample’s GUI
  • The GUI elements can be ordered, or not (this may determine which button is the ‘default’ one – in the absence of standard buttons like ‘OK’ or ‘Finish’ this may be important)
  • The GUI elements can be bitmaps, so there is no way to determine what to click… other than via known bitmap hashes, OCR/ICR, or matrix clicking, where you walk through the surface of the window and click every few pixels, moving both horizontally and vertically through the whole window area; this is prone to errors and may force the app to exit early
  • The GUI elements can be drawn using HTML, so you may need to either access it via existing means (e.g. via WM_HTML_GETOBJECT) or use Accessibility options (read about that below)
  • Many installers are interactive
    • Installers often include a EULA that has to be accepted (various keys are used to accept, depending on the language)
    • You may also need to choose options for installation e.g. installed UI languages, destination path, etc.
  • Many installers download other components from the net or drop generic component installers during the installation, so the clicker needs to work across multiple processes w/o disturbing the parent while the child process is still being ‘processed’ (e.g. it can’t send ‘Next’ to the parent process until the child process has exited successfully; for examples, see anything that relies on winpcap)
  • There could be decoy GUI elements (e.g. transparent elements added to fool auto clickers, ad hoc passwords, captchas, etc.)
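
To make the matrix clicking from the list above a bit more concrete, here is a rough sketch of how the click points could be generated. The function name `matrix_points` and the default step of 16 pixels are assumptions picked for this example only; the actual click delivery (mouse event injection or window messages) is deliberately left out.

```python
def matrix_points(left, top, right, bottom, step=16):
    """Yield (x, y) points covering a window rectangle row by row,
    one point every `step` pixels, starting half a step inside the edge."""
    if step <= 0:
        raise ValueError("step must be positive")
    for y in range(top + step // 2, bottom, step):
        for x in range(left + step // 2, right, step):
            yield (x, y)


# A 40x20 client area walked with a 20 pixel step gives two points in one row
points = list(matrix_points(0, 0, 40, 20, step=20))
```

A smaller step means fewer missed buttons, but also more clicks, and so a higher chance of hitting ‘Cancel’ or a close button and making the app exit early.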

What can significantly help in coding a good clicker is the use of mixed technologies. And in general, be ready for lots of tuning:

  • Recognize installer / sample file type and apply appropriate playbook
    • This is very important, because sometimes you can use unattended options, if available
    • You can also try to decompile the installer to extract files w/o GUI
  • In a generic way:
    • If possible, enumerate windows and their children, and recognize standard controls
    • Ensure you use the proper Windows APIs e.g. RealChildWindowFromPoint vs. ChildWindowFromPoint/ChildWindowFromPointEx
    • Recognize standard shortcuts/accelerators
    • Recognize default buttons (depends on the sequence of resources compiled into the binary)
    • Recognize the ‘Next’, ‘Finish’, ‘OK’ class of buttons, etc.
    • Recognize button classes in general (substrings: ‘button’, ‘btn’, etc.)
    • Recognize Window texts e.g. ‘Accept EULA’ type of buttons (checkboxes, or radiobuttons)
    • Recognize localized versions of the above
    • Recognize automation possibilities
      • The problem of automatic clicking is actually not reserved to sandboxes only
      • QA testing of any sort relies on it a lot: regression tests for coding and localization projects etc. involve lots of clicking, and huge progress has been made over the years to make it work smoothly and fast
      • Luckily, many software packages support GUI controls’ automation via the IAccessible interface
      • Have a read about Automated UI testing and Microsoft UI Automation
    • Have a look at other large projects where the strings for standard buttons are already localized e.g. LibreOffice
    • While enumerating windows, search for both Unicode and ANSI versions of these strings
    • If no success, consider injecting instrumentation code that is ‘native’ to the installer’s platform
    • If no success, consider OCR/ICRing elements of the screen, or use the Babylon technique of hooking API/redrawing text and catching text printed using Text APIs
    • For Delphi, it’s a mundane task of understanding all versions of their VCLs (I have coded some PoC code and it was painful, but I was able to read some labels from Delphi programs)
    • As a last resort, redirect troublesome samples to a team doing manual malware analysis so the clicker can be tuned
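
The ‘recognize’ steps for standard controls could start from something like the sketch below. It assumes the class name and window text have already been read from each enumerated child window; all the names here (`normalize_caption`, `accelerator`, `looks_like_button`, `should_click`) and the two tiny starter sets are hypothetical and only serve the illustration. A real clicker would need much longer, localized lists.

```python
import re

# Hypothetical starter sets, for illustration only
BUTTON_CLASS_HINTS = ("button", "btn")
POSITIVE_CAPTIONS = {"next", "finish", "ok", "i agree"}


def normalize_caption(text):
    """Strip accelerator markers and decoration such as ' >' or '...'."""
    # '&&' is a literal ampersand, a single '&' marks the accelerator key
    text = text.replace("&&", "\0").replace("&", "").replace("\0", "&")
    text = re.sub(r"^[<\s]+|[>.\s]+$", "", text)
    return text.casefold()


def accelerator(text):
    """Return the accelerator character of a caption, or None if there is none."""
    match = re.search(r"&([^&\s])", text.replace("&&", ""))
    return match.group(1).casefold() if match else None


def looks_like_button(class_name):
    """Substring check that catches 'Button', 'TButton', 'MyBtnCtrl', etc."""
    name = class_name.casefold()
    return any(hint in name for hint in BUTTON_CLASS_HINTS)


def should_click(class_name, text):
    """True for button-like controls whose caption is a known 'go ahead' label."""
    return looks_like_button(class_name) and normalize_caption(text) in POSITIVE_CAPTIONS


# Example: a Delphi-style button with an accelerator and an arrow decoration
decision = should_click("TButton", "&Next >")
```

Knowing the accelerator is handy when the control itself can’t be reached directly: sending the Alt+key combination to the parent window is sometimes the only thing that works.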

And just to give you a localization sample… this is a list of ‘Agree’ buttons in various languages extracted from one of the Open Source projects:

  • Afrikaans Regso
  • Albanian Pranoj
  • Arabic موافق
  • Armenian Համաձայն եմ
  • Basque Onartu
  • Belarusian Згодзен
  • Bosnian Prihvatam
  • Breton A-du emaon
  • Bulgarian Съгласен
  • Catalan Hi estic d’acord
  • Cibemba Nasumina
  • Croatian Prihvaćam
  • Czech Souhlasím
  • Danish Acceptér
  • Dutch Akkoord
  • Efik Ami Mmenyịme
  • English I Agree
  • Esperanto Akceptite
  • Estonian Nõustun
  • Farsi موافقم
  • Finnish Hyväksyn
  • French J’accepte
  • Galician Aceito
  • Georgian ვეთანხმები
  • German Annehmen
  • Greek Συμφωνώ
  • Hebrew אני מסכים
  • Hindi सहमत
  • Hungarian Elfogadom
  • Icelandic Ég Samþykki
  • Igbo M Kwere
  • Indonesian Saya Setuju
  • Irish Glacaim Leis
  • Italian Accetto
  • Khmer I យល់​ព្រម​
  • Korean 동의함
  • Kurdish Ez Dipejirînim
  • Latvian Es piekrītu
  • Lithuanian Sutinku
  • Luxembourgish Unhuelen
  • Macedonian Да
  • Malagasy Ekeko
  • Mongolian Зєвшєєрлєє
  • Norwegian Godta
  • Pashto زه منم
  • Polish Zgadzam się
  • Portuguese (Brazilian) Eu Concordo
  • Romanian De acord
  • Russian Принимаю
  • Serbian Прихватам
  • Sesotho Kea Lumela
  • Shona Ndinobvuma
  • Simplified Chinese 我接受
  • Slovak Súhlasím
  • Slovenian Se strinjam
  • Spanish Acepto
  • Swahili Nakubali
  • Swedish Jag Godkänner
  • Tamil A நான் ஒப்புக்கொள்ளுகிறேன்
  • Thai ตกลง
  • Turkish Kabul Ediyorum
  • Twi Migye Tom
  • Ukrainian Приймаю
  • Uyghur قوشۇلىمەن
  • Uzbek Qabul qilaman
  • Valencian Accepte
  • Vietnamese Tôi đồng ý
  • Welsh Cytuno
  • Yoruba Mo Gbà
  • Zulu Ngiyavuma
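
A list like this can feed both the Unicode/ANSI string search and the code page guessing mentioned earlier. The sketch below takes four labels from the list and four Windows code pages as an example; the selection of code pages and the helper names `search_patterns` and `guess_codepage` are assumptions made for this illustration, and a real clicker would cover many more labels and code pages.

```python
# A few labels taken from the list above
AGREE_LABELS = {
    "German": "Annehmen",
    "Greek": "Συμφωνώ",
    "Polish": "Zgadzam się",
    "Russian": "Принимаю",
}
# Assumed candidate ANSI code pages (Central European, Cyrillic, Western, Greek)
CANDIDATE_CODEPAGES = ("cp1250", "cp1251", "cp1252", "cp1253")


def search_patterns(label, codepages=CANDIDATE_CODEPAGES):
    """Byte patterns to search for: UTF-16LE plus every code page that can encode the label."""
    patterns = {"utf-16-le": label.encode("utf-16-le")}
    for cp in codepages:
        try:
            patterns[cp] = label.encode(cp)
        except UnicodeEncodeError:
            pass  # this code page cannot represent the label
    return patterns


def guess_codepage(raw, labels=AGREE_LABELS, codepages=CANDIDATE_CODEPAGES):
    """Decode raw ANSI bytes with each candidate code page and
    return the (code page, language) pairs that give a known label."""
    hits = []
    for cp in codepages:
        try:
            text = raw.decode(cp)
        except UnicodeDecodeError:
            continue
        for language, label in labels.items():
            if text.casefold() == label.casefold():
                hits.append((cp, language))
    return hits


# Button text as an old ANSI application on a Russian system might store it
hits = guess_codepage("Принимаю".encode("cp1251"))
```

Pure ASCII labels such as ‘Annehmen’ decode the same way under all four code pages, so they say nothing about the code page in use; the labels with national characters are the ones that help with guessing.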