One of the tools that caught my eye in that thread is DocFetcher. As per the web site:
DocFetcher is an Open Source desktop search application: It allows you to search the contents of files on your computer. — You can think of it as Google for your local files. The application runs on Windows, Linux and OS X, and is made available under the Eclipse Public License.
Sounds cool. I have been looking for something like this for ages. If you are a hoarder like me, you must have tons of docs in many formats all over the place, and grepping through them is tiring. I always wanted to clean it up a bit, so learning about this tool was a great opportunity to give both the cleanup and the tool a try.
I ended up not liking this tool at all! It does its job and provides you a way to search through all these indexed documents, but somehow the usability factor is just not there 🙁
I then started looking for alternatives.
One that caught my eye was recoll.
After installing it and seeing it in action, I now like it quite a lot.
- It can index lots of files for you and in many different formats (.pdf, .doc, .xls, .epub, .mobi, .txt, etc.)
- The UI is simple, but very “result-oriented” (see below)
- As you type queries, they nicely autocomplete:
- You can run advanced queries:
- It presents results in a customizable way: you can modify the HTML-driven results page. In the example below I added <hr>, made the icons smaller, and changed the font to be more readable
- You probably noticed it shows you snippets of text as well.
- When you open a doc of your choice, it will highlight the findings in the doc:
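For those who prefer scripting to clicking, here is a rough sketch of how the advanced queries mentioned above could be put together and run outside the GUI. It assumes the `recollq` command-line tool that ships with Recoll is on your PATH and that the `ext:` and `dir:` field filters behave as described in the Recoll manual. The function names `build_query` and `run_query` are made up for this example and are not part of Recoll.

```python
import shutil
import subprocess


def build_query(terms, ext=None, directory=None):
    """Assemble a Recoll query-language string (hypothetical helper)."""
    parts = []
    for term in terms:
        # Quote multi-word terms so they are searched as a phrase.
        parts.append(f'"{term}"' if " " in term else term)
    if ext:
        # Assumed filter: restrict results to one file extension.
        parts.append(f"ext:{ext}")
    if directory:
        # Assumed filter: restrict results to one directory subtree.
        parts.append(f"dir:{directory}")
    return " ".join(parts)


def run_query(query):
    """Run the query through recollq if installed; return stdout, else None."""
    exe = shutil.which("recollq")
    if exe is None:
        return None
    # Pass the query as a single argument; no shell is involved.
    result = subprocess.run([exe, query], capture_output=True, text=True, check=False)
    return result.stdout
```

Calling `build_query(["ransomware", "spear phishing"], ext="pdf")` gives a query string limited to PDFs, which you can paste into the Recoll search box or hand to `run_query`. If `recollq` is not installed, `run_query` simply returns `None`.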
Now, you may be asking yourself why I mentioned Threat Intelligence Analysts in the title.
Well, we all use search engines and it’s easier to just go and Google stuff. However, not all the stuff that is searchable is on the Internet. For instance, documents shared privately, customer reports, documents under NDA or TLP:RED, etc. will not make it to the Internet (hopefully). A tool that can index these documents in so many different formats and make them searchable in an instant is very desirable for any report reader. That’s pretty much all of us in infosec at this stage – we are all Threat Intel Analysts.