MD5 Function

Returns the result of the message digest of a specified string.

Category: Character

Syntax

MD5(string)

Required Argument

string

specifies a character constant, variable, or expression.

Tip: Enclose a literal string of characters in quotation marks.

Details

Length of Returned Variable

In a DATA step, if the MD5 function returns a value to a variable that has not previously been assigned a length, then that variable is given a length of 200 bytes.
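
Because the digest itself occupies 16 bytes, one option is to declare the length of the receiving variable before the assignment rather than rely on the default. The following is a minimal sketch of that idea; the variable names digest and len are illustrative only.

```sas
data _null_;
   length digest $ 16;       /* an MD5 digest occupies 16 bytes */
   digest = md5('abc');
   len = vlength(digest);    /* VLENGTH returns the declared length of the variable */
   put len=;                 /* writes len=16 to the log */
run;
```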

The Basics

The MD5 function converts a string, based on the MD5 algorithm, into a 128-bit hash value. This hash value is referred to as a message digest (digital signature), which is nearly unique for each string that is passed to the function.
The MD5 function does not format its own output. Because the returned value is a character string of raw binary data, you must specify a valid character format (such as $hex32. or $binary128.) to view readable results.
Operating Environment Information: In the z/OS operating environment, the MD5 function produces output in EBCDIC rather than in ASCII. Therefore, the output for a given string differs from the output that is produced in ASCII-based operating environments.
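If the digest needs to be stored or compared as readable text, one approach is to convert it once with the PUT function and keep the 32-character hexadecimal string. This is a sketch rather than a prescribed method, and the variable name hexdigest is an assumption made for illustration.

```sas
data _null_;
   length hexdigest $ 32;                   /* 16 bytes become 32 hexadecimal characters */
   hexdigest = put(md5('abc'), $hex32.);    /* format the raw digest as hexadecimal text */
   put hexdigest=;
run;
```

In an ASCII-based session, this should write the well-known MD5 digest of 'abc', 900150983CD24FB0D6963F7D28E17F72, to the log. In the z/OS operating environment, the value differs.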

The Message Digest Algorithm

A message digest results from manipulating and compacting an arbitrarily long stream of binary data. An ideal message digest algorithm never generates the same result for two different sets of input. However, generating such a unique result would require a message digest as long as the input itself. Therefore, MD5 generates a message digest of modest size (16 bytes), created with an algorithm that is designed to make a nearly unique result.

Using the MD5 Function

You can use the MD5 function to track changes in your data sets. The MD5 function can generate a digest of a set of column values in a record in a table. This digest could be treated as the signature of the record, and be used to keep track of changes that are made to the record. If the digest from the new record matches the existing digest of a record in a table, then the two records are the same. If the digest is different, then a column value in the record has changed. The new changed record could then be added to the table along with a new surrogate key because it represents a change to an existing keyed value.
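The change-tracking idea described above can be sketched as follows. The data sets (current, incoming, changes), the variables, and the choice of CATX with a '|' delimiter to combine column values are all assumptions made for illustration, not a documented recipe.

```sas
/* Hypothetical stored record, saved together with its digest. */
data current;
   length cust_id 8 name $ 20 city $ 20 digest $ 16;
   cust_id = 101; name = 'Ann Lee'; city = 'Cary';
   digest = md5(catx('|', name, city));
run;

/* Hypothetical incoming version of the same record. */
data incoming;
   length cust_id 8 name $ 20 city $ 20;
   cust_id = 101; name = 'Ann Lee'; city = 'Raleigh';
run;

/* Recompute the digest for the incoming record and compare it */
/* with the digest that was stored earlier.                    */
data changes;
   merge current(keep=cust_id digest rename=(digest=old_digest))
         incoming(in=in_new);
   by cust_id;
   if in_new;
   length new_digest $ 16;
   new_digest = md5(catx('|', name, city));
   changed = (new_digest ne old_digest);    /* 1 if any column value differs */
   put cust_id= changed=;
run;
```

The delimiter keeps adjacent values from running together (for example, 'ab','c' versus 'a','bc'). Note that CATX omits blank values, so a different way of combining columns might be needed if blank values are significant.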
The MD5 function can be useful when developing shell scripts or Perl programs for software installation, for file comparison, and for detection of file corruption and tampering.
You can also use the MD5 function to create a unique identifier for observations to be used as the key of a hash object. For information about hash objects, see Introduction to DATA Step Component Objects in SAS Language Reference: Concepts.
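As a rough illustration of using a digest as a hash object key, the following sketch flags repeated observations. The data set rows, the hash object name seen, and the variables are hypothetical.

```sas
/* Hypothetical input in which the third observation repeats the first. */
data rows;
   length name $ 20 city $ 20;
   input name $ city $;
   datalines;
Ann Cary
Bob Apex
Ann Cary
;
run;

data _null_;
   length digest $ 16;
   if _n_ = 1 then do;
      declare hash seen();          /* hash object keyed by the digest */
      seen.defineKey('digest');
      seen.defineDone();
   end;
   set rows;
   digest = md5(catx('|', name, city));
   /* CHECK returns 0 when the key is already in the hash object. */
   if seen.check() = 0 then put 'Duplicate: ' name= city=;
   else rc = seen.add();
run;
```

Keying on the 16-byte digest keeps the key at a fixed length regardless of how many columns contribute to it.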

Example

The following is an example of how to generate results that are returned by the MD5 function.
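data _null_;
   /* MD5 returns the raw 16-byte digest as a character value. */
   y = md5('abc');
   z = md5('access method');
   /* Write each digest twice: unformatted, then as 32 hexadecimal characters. */
   put y= / y = $hex32.;
   put z= / z = $hex32.;
run;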
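The output from this program contains unprintable characters, because the unformatted values of y and z are raw binary data. The lines that are written with the hexadecimal format show the same digests in readable form.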