MD5 is a one-way hashing algorithm for creating digest
"signatures" or checksums of strings (usually entire files).
Ruby’s standard library includes MD5 as part of the Digest::
set of extension classes.
Creating MD5 checksums is a simple matter of requiring the
digest/md5 library and using either the
Digest::MD5.digest or Digest::MD5.hexdigest class methods
to return the digest of a given string. We will use hexdigests in
this article as they are printable.
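As a minimal sketch of the two class methods named above (the input string 'hello world' is an arbitrary choice for illustration), the following compares the raw digest with its printable form:

```ruby
require 'digest/md5'

text = 'hello world'  # arbitrary sample input

raw = Digest::MD5.digest(text)     # 16 raw bytes, generally not printable
hex = Digest::MD5.hexdigest(text)  # the same value as 32 hexadecimal characters

puts raw.bytesize  # 16
puts hex.length    # 32
puts hex           # 5eb63bbbe01eeed093cb22bb8f5acdc3

# The hexdigest is simply the raw digest written out in hexadecimal.
puts raw.unpack('H*').first == hex  # true
```

The raw form is more compact when storing many digests in binary form, while the hex form is what checksum listings normally publish.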
MD5 digests are 128-bit (16-byte) signatures and have long been a common way
of providing checksums for files available on the net. To create a checksum
of an entire file you need only pass in the file's contents as a string. The
following will print out the filename and MD5 digest of each file passed to
it on the command line:
  require 'digest/md5'
  ARGV.each do |f|
    puts "#{f}: #{Digest::MD5.hexdigest(File.read(f))}"
  end
Sometimes you want to do more than just calculate the checksum of a single
string — maybe you have a large file and want to calculate the digest
in small, memory friendly chunks; or maybe you are calculating a digest
from a stream of input. In such cases you can create a Digest::MD5 object
and use the #update (alias: #<<), #digest,
and #hexdigest methods.
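The chunked approach described above might look roughly like the sketch below. The helper name md5_in_chunks and the 64 KiB default chunk size are assumptions made for illustration, not part of the Digest library:

```ruby
require 'digest/md5'

# Hypothetical helper: compute a file's MD5 hexdigest while holding only
# one chunk in memory at a time. The default chunk size is arbitrary.
def md5_in_chunks(path, chunk_size = 64 * 1024)
  digest = Digest::MD5.new
  File.open(path, 'rb') do |file|
    # With a positive length, File#read returns nil at end of file,
    # which ends the loop.
    while (chunk = file.read(chunk_size))
      digest.update(chunk)
    end
  end
  digest.hexdigest
end

# Print "filename: digest" for each file named on the command line.
ARGV.each { |path| puts "#{path}: #{md5_in_chunks(path)}" }
```

The file is opened in binary mode so the bytes are hashed exactly as stored. For the common case of hashing a whole file, Digest::MD5.file(path).hexdigest should give the same result without a hand-written loop.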
As an example, we will create a digest by reading a test file and adding one
line at a time, and also calculate the digest of the same file all at
once. I will use the source for this article as the test file:
  require 'digest/md5'

  filename = 'MD5.rdoc'
  all_digest = Digest::MD5.hexdigest(File.read(filename))
  incr_digest = Digest::MD5.new
  File.open(filename, 'r') do |file|
    file.each_line do |line|
      incr_digest << line
    end
  end
  puts incr_digest.hexdigest
  puts all_digest
Saving this as md.rb and running it shows that the incremental and
all-at-once digests are identical.
In addition to providing checksums for files you make available, or
checking files and packages you’ve downloaded, another use is
fingerprinting sensitive directories on your system. Creating a database of
MD5 digests for sensitive files or directories means you can periodically
cross-check your sensitive data against the database to see if anything has
been changed without your knowledge. This gives you an addition to your
intrusion detection tools that is very simple to implement.
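One possible shape for such a fingerprint database is sketched below. The helper names fingerprint and compare_fingerprints, the JSON storage format, and the command-line usage are all assumptions chosen for illustration. A real tool would also need to protect the database itself from tampering.

```ruby
require 'digest/md5'
require 'json'

# Hypothetical helper: map every regular file under dir to its MD5 hexdigest.
# Dir.glob skips dotfiles by default; pass File::FNM_DOTMATCH to include them.
def fingerprint(dir)
  Dir.glob(File.join(dir, '**', '*'))
     .select { |path| File.file?(path) }
     .sort
     .each_with_object({}) do |path, table|
       table[path] = Digest::MD5.file(path).hexdigest
     end
end

# Hypothetical helper: compare a stored table with a fresh one and report
# which paths were added, removed, or changed.
def compare_fingerprints(old, new)
  {
    added:   new.keys - old.keys,
    removed: old.keys - new.keys,
    changed: (old.keys & new.keys).reject { |path| old[path] == new[path] }
  }
end

# Usage (assumed): ruby fingerprint.rb DIR DATABASE.json
# The first run writes the database; later runs report differences.
if ARGV.length == 2
  dir, db = ARGV
  current = fingerprint(dir)
  if File.exist?(db)
    stored = JSON.parse(File.read(db))
    p compare_fingerprints(stored, current)
  else
    File.write(db, JSON.pretty_generate(current))
    puts "Wrote #{current.size} fingerprints to #{db}"
  end
end
```

Keeping the comparison in its own function makes it easy to run the check from a scheduled job and alert only when one of the three lists is non-empty.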