{"id":9496,"date":"2024-10-02T23:08:05","date_gmt":"2024-10-02T23:08:05","guid":{"rendered":"https:\/\/www.hexacorn.com\/blog\/?p=9496"},"modified":"2024-10-02T23:08:05","modified_gmt":"2024-10-02T23:08:05","slug":"using-guids-to-guide-the-id-of-samples-capabilities-or-unique-attributable-properties","status":"publish","type":"post","link":"https:\/\/www.hexacorn.com\/blog\/2024\/10\/02\/using-guids-to-guide-the-id-of-samples-capabilities-or-unique-attributable-properties\/","title":{"rendered":"Using Guids to guide the ID of samples&#8217; capabilities or unique (attributable) properties&#8230;"},"content":{"rendered":"\n<p>A few days ago <a href=\"https:\/\/x.com\/struppigel\">Karsten<\/a> asked me what tool did I use for GUID extraction. I <a href=\"https:\/\/x.com\/Hexacorn\/status\/1838982521054257232\">replied<\/a> that it was my own old tool written waaaay before yara&#8217;s birth.<\/p>\n\n\n\n<p>In this post I will elaborate on this bit a bit&#8230;<\/p>\n\n\n\n<p>That old GUID extraction tool was written in perl &#8211; yeah, I know&#8230; &#8230; and&#8230; it was basically reading the content of the whole sample to memory, and then, within that content, it was searching for&#8230; known GUIDs&#8230;. It was badly written, superslow, but&#8230; at that time&#8230; superuseful!<\/p>\n\n\n\n<p>Why?<\/p>\n\n\n\n<p>Because my short GUID list was curated. My tool looked only for GUIDs associated with known adware\/spyware + popular GUIDs associated with COM interfaces abused by malware at that time. 
So, it was very &#8216;focused&#8217; as it was helping me to quickly ID samples belonging to 180Solutions, Zango, BetterInternet, Ezula, Bonzi, ClearSearch, VirtuMonde and many others, and&#8230; was also highlighting to me some potentially interesting features of triaged samples, such as references to COM interfaces operating on shortcut files (IShellLink) or generic (IPersistFile) methods for saving files&#8230;<\/p>\n\n\n\n<p>A <a href=\"https:\/\/en.wikipedia.org\/wiki\/Universally_unique_identifier\">GUID<\/a> is a very interesting IOC on its own. In theory, it is supposed to act as a global, unique identifier. In practice, it is not just an identifier, but also a capability determinant, amongst other things.<\/p>\n\n\n\n<p>In my <a href=\"https:\/\/www.hexacorn.com\/blog\/2022\/07\/22\/week-of-data-dumps-part-2-guids\/\" data-type=\"post\" data-id=\"8158\">old post<\/a> I dumped a lot of &#8216;GUID to &lt;something>&#8217; mappings that any data hoarder should find useful&#8230; For example, taking just that list, validating it (it actually had some bugs!), and converting it to a set of yara rules is a step we can take to kinda partially duplicate the features of my old perl tool.<\/p>\n\n\n\n<p>The conversion process walks through all GUIDs from the input file and creates a small yara rule for each of these GUIDs, where each of them is converted to 3 strings:<\/p>\n\n\n\n<ul>\n<li>GUID string as an ASCII string<\/li>\n\n\n\n<li>GUID string as a Wide string (UTF-16)<\/li>\n\n\n\n<li>binary representation of the GUID<\/li>\n<\/ul>\n\n\n\n<p>The resulting file looks like <a href=\"https:\/\/hexacorn.com\/d\/guids.yar\">this<\/a>.<\/p>\n\n\n\n<p>The rules written this way take care of any textual references to GUIDs present inside the sample (ASCII and Unicode\/Wide), plus they recognize the most popular way of storing GUIDs &#8211; the 16-byte binary form. 
That is, it will pick up known GUID references inside the resources, embedded IDL files, as well as any actual code\/data strings and of course, the binary form of GUID that programmers (often unknowingly) introduce to their programs.<\/p>\n\n\n\n<p>Now that we have this yara file, we can test it by applying it to f.ex. win11&#8217;s Notepad.exe:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">yara guids.yar notepad.exe<\/pre>\n\n\n\n<p>The results are:<\/p>\n\n\n\n<pre class=\"wp-block-preformatted\">guid_IUnknown notepad.exe\nguid_IMarshal notepad.exe\nguid_IAsyncInfo notepad.exe\nguid___FIAsyncOperationCompletedHandler_1_Windows__CSystem__CLaunchQuerySupportStatus notepad.exe\nguid_IPropertyDescriptionList notepad.exe\nguid___FIAsyncOperationCompletedHandler_1_Windows__CSecurity__CEnterpriseData__CFileProtectionInfo notepad.exe\nguid___x_ABI_CWindows_CStorage_CIStorageItem notepad.exe\nguid_IFileDialog notepad.exe\nguid_IShellItem notepad.exe\nguid___x_ABI_CWindows_CFoundation_CIUriRuntimeClassFactory notepad.exe\nguid___FIEventHandler_1_Windows__CSecurity__CEnterpriseData__CProtectedContentRevokedEventArgs notepad.exe\nguid___x_ABI_CWindows_CSecurity_CEnterpriseData_CIFileProtectionManagerStatics notepad.exe\nguid___x_ABI_CWindows_CStorage_CIStorageFileStatics notepad.exe\nguid___x_ABI_CWindows_CSystem_CILauncherStatics2 notepad.exe\nguid_IAccPropServices notepad.exe\nguid_IFileSaveDialog notepad.exe\nguid_IAgileObject notepad.exe\nguid_CAccPropServices notepad.exe\nguid___x_ABI_CWindows_CSecurity_CEnterpriseData_CIProtectionPolicyManagerStatics2 notepad.exe\nguid_FileSaveDialog notepad.exe\nguid___x_ABI_CWindows_CSecurity_CEnterpriseData_CIProtectionPolicyManagerStatics notepad.exe\nguid___FIEventHandler_1_IInspectable notepad.exe\nguid___x_ABI_CWindows_CApplicationModel_CDataTransfer_CIClipboardStatics notepad.exe\nguid_IFileOpenDialog notepad.exe\nguid___x_ABI_CWindows_CApplicationModel_CDataTransfer_CIDataPackagePropertySetView3 
notepad.exe\nguid_FileOpenDialog notepad.exe\nguid___FIAsyncOperationCompletedHandler_1_Windows__CStorage__CStorageFile notepad.exe\nguid_IFileDialogCustomize notepad.exe\nguid_LocalAppData notepad.exe<\/pre>\n\n\n\n<p>Even without a single second spent in a disassembler or decompiler we can already see what sort of GUIDs Notepad.exe references. Some of them are related to COM functionality (f.ex. guid_IFileSaveDialog), some are just GUIDs passed as arguments to functions (f.ex. guid_LocalAppData).<\/p>\n\n\n\n<p>Is it very useful? <\/p>\n\n\n\n<p>I guess&#8230; it depends&#8230;.<\/p>\n\n\n\n<p>If you had a good adware\/spyware GUID database back in 2005-2008 you could quickly identify a lot of adware\/spyware samples w\/o even looking at their code. It worked really nicely.<\/p>\n\n\n\n<p>There are also existing plug-ins for disassemblers\/decompilers that try to recognize existing GUIDs inside the code\/data and rename data chunks that look like known GUIDs with the appropriate names of classes\/interfaces or associated artifacts (f.ex. <a href=\"https:\/\/learn.microsoft.com\/en-us\/windows\/win32\/shell\/knownfolderid\">Known Folder IDs<\/a>).<\/p>\n\n\n\n<p>GUID values are also present inside the PDB \/ RSDS structure included in some PE files &#8211; they link the .EXE file with the .PDB file. The Module Version ID (MVID) and TypeLib ID are both GUIDs that are present inside compiled .NET assemblies and can be extracted &amp; collected. Their unique values can be used to link samples coming from the same Visual Studio instance, and\/or build environment. Last, but not least &#8211; it was allegedly a GUID that linked the first iteration of the Melissa virus to its author, who eventually got arrested. 
<\/p>\n\n\n\n<p>GUIDs are great artifacts and it&#8217;s wise to both collect all the extractable instances of it, and look for the presence of the known ones in the analyzed samples.<\/p>\n\n\n\n<p><\/p>\n","protected":false},"excerpt":{"rendered":"<p>A few days ago Karsten asked me what tool did I use for GUID extraction. I replied that it was my own old tool written waaaay before yara&#8217;s birth. In this post I will elaborate on this bit a bit&#8230; &hellip; <a href=\"https:\/\/www.hexacorn.com\/blog\/2024\/10\/02\/using-guids-to-guide-the-id-of-samples-capabilities-or-unique-attributable-properties\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[53,28,39,9],"tags":[],"_links":{"self":[{"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/posts\/9496"}],"collection":[{"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/comments?post=9496"}],"version-history":[{"count":4,"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/posts\/9496\/revisions"}],"predecessor-version":[{"id":9500,"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/posts\/9496\/revisions\/9500"}],"wp:attachment":[{"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/media?parent=9496"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/categories?post=9496"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/tags?post=9496"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}"
,"templated":true}]}}