The Artima Developer Community
Sponsored Link

All Things Pythonic
Unified Diff Format
by Guido van van Rossum
June 14, 2006
Summary
I couldn't find a thorough spec for the format called "unified diff" so I decided to research it. Here are my findings.

Advertisement

I haven't found a satisfactory specification of the unified diff format (the one on the GNU website is hopelessly incomplete). Here's what I've discovered by experimenting with diff(1) on Red Hat Linux; this identifies itself as 'diff (GNU diffutils) 2.8.1'. Hopefully this is useful for someone who needs to generate unified diffs or who needs to parse them. (I had both needs recently. :-)

The header lines look like this:

indicator ' ' filename '\t' date ' ' time ' ' timezone

where:

Each chunk starts with a line that looks like this:

'@@ -' range ' +' range ' @@'

where range is either one unsigned decimal number or two separated by a comma. The first number is the start line of the chunk in the old or new file. The second number is chunk size in that file; it and the comma are omitted if the chunk size is 1. (Email from a reader suggests that this omission is optional and may be phased out.) If the chunk size is 0, the first number is one lower than one would expect (it is the line number after which the chunk should be inserted or deleted; in all other cases it gives the first line number or the replaced range of lines).

A chunk then continues with lines starting with ' ' (common line), '-' (only in old file), or '+' (only in new file). If the last line of a file doesn't end in a newline character, it is displayed with a newline characer, and the following line in the chunk has the literal text (starting in the first column):

'\ No newline at end of file'

Talk Back!

Have an opinion? Readers have already posted 5 comments about this weblog entry. Why not add yours?

RSS Feed

If you'd like to be notified whenever Guido van van Rossum adds a new entry to his weblog, subscribe to his RSS feed.

About the Blogger

Guido van Rossum is the creator of Python, one of the major programming languages on and off the web. The Python community refers to him as the BDFL (Benevolent Dictator For Life), a title straight from a Monty Python skit. He moved from the Netherlands to the USA in 1995, where he met his wife. Until July 2003 they lived in the northern Virginia suburbs of Washington, DC with their son Orlijn, who was born in 2001. They then moved to Silicon Valley where Guido now works for Google (spending 50% of his time on Python!).

This weblog entry is Copyright © 2006 Guido van van Rossum. All rights reserved.

Sponsored Links



Google
  Web Artima.com   

Copyright © 1996-2019 Artima, Inc. All Rights Reserved. - Privacy Policy - Terms of Use