I couldn't find a thorough spec for the format called "unified diff" so I decided to research it. Here are my findings.
I haven't found a satisfactory specification of the unified diff
format (the one on the GNU website is hopelessly incomplete).
Here's what I've discovered by experimenting with diff(1) on Red Hat
Linux; this identifies itself as 'diff (GNU diffutils) 2.8.1'.
Hopefully this is useful for someone who needs to generate unified
diffs or who needs to parse them. (I had both needs recently. :-)
The header lines look like this:
indicator ' ' filename '\t' date ' ' time ' ' timezone
indicator is '---' for the old file and '+++' for the new
date has the form YYYY-MM-DD
time has the form hh:mm:ss.nnnnnnnnn on a 24-hour clock
timezone has the form ('+'|'-') hhmm, where hhmm is hours and
minutes east (if the sign is +) or west (if the sign is -) of UTC.
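To make the header layout concrete, here is a minimal Python sketch that builds and parses one header line. The names format_header, parse_header and HEADER_RE are made up for this illustration; they are not part of diff(1) or of any library. It assumes the timestamp is a timezone-aware datetime and pads Python's six microsecond digits with zeros to get the nine fractional digits described above.

```python
import re
from datetime import datetime, timedelta, timezone

# Hypothetical pattern for one header line, following the layout described above.
HEADER_RE = re.compile(
    r"^(?P<indicator>---|\+\+\+) (?P<filename>[^\t]*)\t"
    r"(?P<date>\d{4}-\d{2}-\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{9}) "
    r"(?P<tz>[+-]\d{4})$"
)


def format_header(indicator, filename, when):
    """Build one header line; `when` must be a timezone-aware datetime."""
    # %f yields 6 digits (microseconds); append "000" to reach 9 digits.
    stamp = when.strftime("%Y-%m-%d %H:%M:%S.%f") + "000"
    # %z yields the sign followed by hhmm, e.g. "-0800".
    return f"{indicator} {filename}\t{stamp} {when.strftime('%z')}"


def parse_header(line):
    """Return the header fields as a dict, or None if the line does not match."""
    m = HEADER_RE.match(line.rstrip("\n"))
    return m.groupdict() if m else None


when = datetime(2002, 2, 21, 23, 30, 39, 942229,
                tzinfo=timezone(timedelta(hours=-8)))
line = format_header("---", "a.txt", when)
fields = parse_header(line)
```

The parser is deliberately strict: it only accepts the exact layout described here, so headers written by other tools may need a looser pattern.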
Each chunk starts with a line that looks like this:
'@@ -' range ' +' range ' @@'
where range is either one unsigned decimal number or two separated
by a comma. The first number is the start line of the chunk in the
old or new file. The second number is chunk size in that file; it
and the comma are omitted if the chunk size is 1.
(Email from a reader suggests that this omission is optional
and may be phased out.) If the chunk size is
0, the first number is one lower than one would expect (it is the
line number after which the chunk should be inserted or deleted; in
all other cases it gives the first line number of the replaced range).
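The range rules can be sketched in Python as follows. This is an illustration, not code taken from diff(1); the names HUNK_RE, parse_hunk_header and format_range are made up. The parser treats a missing size as 1, and the formatter assumes its caller passes the 1-based line number at which the chunk would start, so for an empty range it writes one less than that.

```python
import re

# '@@ -' range ' +' range ' @@', where range is "start" or "start,size".
# Anything after the closing '@@' is ignored by this sketch.
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(line):
    """Return (old_start, old_size, new_start, new_size) as integers."""
    m = HUNK_RE.match(line)
    if m is None:
        raise ValueError(f"not a chunk header: {line!r}")
    old_start, old_size, new_start, new_size = m.groups()
    # An omitted size means the chunk covers exactly one line in that file.
    return (
        int(old_start), 1 if old_size is None else int(old_size),
        int(new_start), 1 if new_size is None else int(new_size),
    )


def format_range(first, size):
    """Format one range; `first` is the 1-based line where the chunk would start."""
    if size == 0:
        # Empty range: write the line number after which the change applies.
        return f"{first - 1},0"
    if size == 1:
        # Size 1: the comma and the size are omitted.
        return str(first)
    return f"{first},{size}"


def format_hunk_header(old_first, old_size, new_first, new_size):
    """Assemble a full chunk header from the two ranges."""
    return (f"@@ -{format_range(old_first, old_size)}"
            f" +{format_range(new_first, new_size)} @@")
```

For example, inserting two lines after line 3 of the old file with no context lines gives format_hunk_header(4, 0, 4, 2), which produces '@@ -3,0 +4,2 @@'.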
A chunk then continues with lines starting with ' ' (common line),
'-' (only in old file), or '+' (only in new file). If the last line
of a file doesn't end in a newline character, it is displayed with a
newline character, and the following line in the chunk has the
literal text (starting in the first column):