nsrlquery

Quick links:

  1. News
  2. Downloads
  3. What’s nsrlquery?
  4. How do I install it?
  5. How do I use it?
  6. Why should I trust it?
  7. How is it licensed?
  8. Support
  9. Acknowledgments

News

  1. nsrlsvr-1.1 released! Get it while it’s hot. This release brings some significant performance improvements: version 1.1 introduces TCP connection reuse and a new wire protocol that allows for a couple of interesting new things, like querying the server status.

  2. Bug in nsrllookup 1.1-1! A bug has been found in nsrllookup 1.1 and prior that could result in it not returning records. A new version, 1.1-2, has been released. This bug only affected UNIX machines: the Windows version did not have the offending code, and 1.1-1 is still the latest there.

  3. nsrllookup version 1.1 released! You won’t notice many changes, but under the hood there are quite a few. It now reuses TCP/IP connections whenever possible, which means you can send millions of queries to your server without needing to worry about port exhaustion. (The prior version limited itself to about 4,000 queries per connection.) Connection reuse has also made it a little faster.

  4. nsrlsvr version 1.1 coming soon! In fact, if you’re using Kyrus’ public NSRL server then you’re already using version 1.1; they’re running a pre-release snapshot. This version supports TCP/IP connection reuse, which helps with port exhaustion problems.

  5. Kyrus is hosting a public nsrlsvr! If you want to experiment with these tools but don’t have a beefy server capable of handling the entire NSRL RDS, just point nsrllookup at the server nsrl.kyr.us! (Neither I nor my employer is connected with Kyrus. We’re friends with people over there, though, and we think they’re cool.)
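A note on why connection reuse matters: every new TCP connection takes a fresh ephemeral port on the client, and a closed connection typically lingers for a while before its port is freed, so opening one connection per small batch of queries can run a busy client out of ports. The sketch below sends several queries over a single connection. It is an illustration only: the line-based HIT/MISS protocol, the toy server, and every name in it are invented for this example and are not nsrlsvr’s actual wire protocol.

```python
import socket
import socketserver
import threading
from typing import Iterable, List, Tuple


class ToyHandler(socketserver.StreamRequestHandler):
    """Answers each received line with HIT or MISS (toy protocol, NOT nsrlsvr's)."""

    KNOWN = {"d41d8cd98f00b204e9800998ecf8427e"}  # hypothetical one-entry data set

    def handle(self) -> None:
        # Keep serving queries on the same connection until the client closes it.
        for raw in self.rfile:
            digest = raw.decode("ascii").strip().lower()
            reply = "HIT" if digest in self.KNOWN else "MISS"
            self.wfile.write((reply + "\n").encode("ascii"))


def start_toy_server() -> socketserver.ThreadingTCPServer:
    """Start the toy server on a free localhost port in a background thread."""
    server = socketserver.ThreadingTCPServer(("127.0.0.1", 0), ToyHandler)
    server.daemon_threads = True
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def query_reusing_connection(address: Tuple[str, int],
                             digests: Iterable[str]) -> List[str]:
    """Send every digest over ONE connection, so only one client port is used."""
    replies = []
    with socket.create_connection(address) as sock, \
            sock.makefile("r", encoding="ascii") as reader:
        for digest in digests:
            sock.sendall((digest + "\n").encode("ascii"))
            replies.append(reader.readline().strip())
    return replies
```

The point is the shape of the client loop: the socket is opened once, outside the loop, rather than once per query.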

Downloads

The latest version of nsrlsvr is 1.1; nsrllookup is at 1.1. The version numbers do not track each other: if they happen to be in sync at any given time, it’s only by chance.

  1. nsrlsvr 1.1 downloads:
    1. Source code
  2. nsrllookup 1.1 downloads:
    1. Source code
    2. Windows binary
    3. Fedora 16 RPMs (i386, x86_64)

The binary packages are digitally signed: the Windows binaries with Authenticode signatures, and the Fedora packages with GnuPG (certificate 0xD6B98E10).

What’s nsrlquery?

nsrlquery is an umbrella project that’s home to two separate, distinct subprojects: nsrlsvr, which provides a server that yields NSRL RDS information on request, and nsrllookup, a simple command-line application that queries the server. The server is UNIX-only, but the client runs just fine on Windows.

But wait, what’s the NSRL RDS and why is it important? Glad you asked!

The National Institute of Standards and Technology (NIST) hosts the National Software Reference Library (NSRL). This is a set of millions of applications, libraries, common configuration files and every other thing imaginable that gets stored on a hard drive. As part of the NSRL, they’ve also published SHA-1 and MD5 hashes of everything in the NSRL. This list of hashes is called the Reference Data Set (RDS).

Many digital investigations are plagued by a needle-in-a-haystack problem: out of terabytes of data, the investigator may only be interested in a small fraction. One of the most important tasks in digital forensics is winnowing out what might be wheat from what is overwhelmingly likely to be chaff. Many forensics tools, such as md5deep, can vet the hashes they create against a known-good list, but these tools are often ill-suited to make use of the RDS, which is well over a gigabyte. Loading a gigabyte of data every time one wishes to use md5deep is just not practical, so a more pragmatic approach was needed.
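The winnowing step itself is simple; the hard part is the size of the reference set. As a rough illustration only (this is not how nsrlsvr or md5deep is implemented, and the names `md5_hex` and `winnow` are hypothetical), filtering a batch of files against a set of known MD5 digests held in memory looks like this:

```python
import hashlib
from typing import Dict, List, Set, Tuple


def md5_hex(data: bytes) -> str:
    """Return the lowercase hexadecimal MD5 digest of some file content."""
    return hashlib.md5(data).hexdigest()


def winnow(files: Dict[str, bytes],
           known_digests: Set[str]) -> Tuple[List[str], List[str]]:
    """Split file names into (hits, misses) against a set of known digests."""
    hits: List[str] = []
    misses: List[str] = []
    for name, content in files.items():
        if md5_hex(content) in known_digests:
            hits.append(name)    # known file: likely chaff
        else:
            misses.append(name)  # unknown file: worth a closer look
    return hits, misses
```

With tens of millions of digests, rebuilding `known_digests` on every run is the expensive part, which is the cost a long-running server is meant to avoid.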

How do I install it?

nsrllookup is just a ./configure && make && make install dance, like any other well-mannered application. For the Windows binaries it’s even easier: just drop the executable somewhere on your PATH and start having fun.

nsrlsvr requires a little more work. Read the included INSTALL file carefully.

How do I use it?

Once the server is built, starting it is as simple as launching it from the command line. Alternatively, since it’s a well-behaved UNIX daemon, it can be easily integrated into your particular UNIX’s daemon management system (launchctl, /etc/init.d, etc.).

Using the lookup tool is as simple as:

$ md5deep -r /path/to/mounted/disk | nsrllookup

It will print a list of all files whose hashes are not found in the NSRL RDS (misses). You may invert the behavior (only listing hits) with the -k flag. Alternatively, if you need to generate both hits and misses in a single pass, use both the -K and -U flags:

$ md5deep -r /path/to/mounted/disk | nsrllookup -K KNOWN -U UNKNOWN

Once it finishes, the file KNOWN will contain hits (hashes known to the NSRL RDS) and the file UNKNOWN will contain misses (hashes unknown).
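If you want to post-process the two files in a script, a few lines of Python will do. This is a minimal sketch and not part of nsrlquery: it assumes each line keeps md5deep’s layout (a hex digest, whitespace, then the file path), and the helper names `parse_hash_line` and `summarize` are made up for illustration.

```python
from typing import Dict, Iterable, Optional, Tuple


def parse_hash_line(line: str) -> Optional[Tuple[str, str]]:
    """Split one 'digest  path' line; return None for blank or malformed lines."""
    parts = line.rstrip("\n").split(None, 1)  # split on the first run of whitespace
    if len(parts) != 2:
        return None
    digest, path = parts
    return digest.lower(), path


def summarize(known_lines: Iterable[str],
              unknown_lines: Iterable[str]) -> Dict[str, object]:
    """Count hits and misses and collect the paths that still need review."""
    known = [p for p in map(parse_hash_line, known_lines) if p]
    unknown = [p for p in map(parse_hash_line, unknown_lines) if p]
    total = len(known) + len(unknown)
    return {
        "known": len(known),
        "unknown": len(unknown),
        "known_fraction": len(known) / total if total else 0.0,
        "unknown_paths": [path for _, path in unknown],
    }
```

Both arguments can be open file objects, so after the run above you could pass in the opened KNOWN and UNKNOWN files directly; the `unknown_paths` entry is the list of files left to examine.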

Why should I trust it?

Right now you probably shouldn’t trust it — at least not without doing your own checks on its operation in order to ensure that it’s working correctly enough for you!

Although these tools are in use by real people doing real investigative tasks, that’s a pretty lousy reason to trust a piece of software. Real trust comes from having a codebase that’s small enough to read, well-written enough to be clear, and documented enough to accurately guide you through the code as you make your own decision of whether it’s trustworthy.

nsrlsvr is in the neighborhood of a thousand lines of well-written C++ code. It defines a grand total of one custom object which amounts to maybe twenty lines. Everything else is written in a very C-like dialect of C++ for ease of auditing, although it makes heavy use of C++’s superior memory management facilities, built-in data structures, and file I/O. Read it. I don’t think you’ll be disappointed!

nsrllookup is slightly smaller, but still in the neighborhood of a thousand lines of well-written C++ code. Like the server, the code is readable. Read it, and make your own decision about whether to trust it.

How is it licensed?

Under the ISC License, which is functionally equivalent to the two-clause BSD License. It is an FSF-approved Free License, an OSI-approved Open Source License, and meets the Debian Free Software Guidelines.

Support

The best place to get support is the discussion forum on SourceForge. Alternatively, you can email me directly, although I vastly prefer forum posts. A question that gets publicly asked and answered is a question I won’t have to answer again later, and all that.

Acknowledgments

These tools were inspired by Jesse Kornblum, who mused “you know, there’s no good way to query the RDS with md5deep…” Edison famously declared genius to be one percent inspiration and ninety-nine percent perspiration, but the fact is that one percent is absolutely necessary. Without that, nsrlquery wouldn’t exist.

My employers, RedJack Security LLC, have graciously allowed me to work on nsrlquery during company time. It’s wonderful to work for people who believe in the value of research projects and let you open-source your results.

NIST is, of course, the bee’s knees. It took a lot of work to collect, collate and organize 78 million hashes. Thank you, NIST, for all your hard work.

Kyrus has graciously set up a public nsrlsvr, and has given a lot of valuable feedback and bug reports.

Adrian Preston and Technical Reinforcements were early adopters of the Windows binaries, and responsible for a couple of serious bugs being found and fixed.

Finally…

Good luck, and good hunting. I hope nsrlquery is useful to you in your pursuits. Regardless of whether it is or isn’t, I hope you’ll tell me.