- Title: Is there a file comparison/diffing tool that considers whitespace less important while not ignoring it?
- Tags: diff
- In the codebase I work on, people take indentation very seriously; not just for control structures (if, while, functions, etc.) but also parameters of functions on multiple lines, content inside comments, etc. What I observe in the graphical diff and merging tools I have used is that they seem to work as follows:
- - do the usual line diffing
- - do character diffing within each difference hunk thus found
- While this works for the most part, when there is an indentation difference (say something has been wrapped in an if) I get suboptimal results, a bit as if the tool told me: "Nope, these lines are different: in the second file there are four more spaces, they cannot possibly be related, same for this one… Ooh, shiny, two closing braces at the same indentation level! They must necessarily match! Let me move around the lines before and after these braces so that the braces can match, and prevent these (actually related) moved lines from being compared character by character! There, aren't you glad I did all the work for you?" This is even worse when you want to do a merge: good luck figuring out where a single-line change from base to right must land when the same section has been indented between base and left.
- Ignoring whitespace works only up to a point here: I do want to review whitespace changes, and I do want the merged single-line change to end up at the right indentation level, for example.
- So my question is, is there a file comparison/diffing tool (graphical or not), or an algorithm, that would have more sensible behavior: take whitespace changes into account, but consider them much less important than other changes, so that matching works better with indentation or other whitespace-only changes? Seems to me I am not alone with this issue, and it could even be as simple as doing the line diffing while ignoring whitespace but then doing the character diffing while taking it into account.
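- To make that last idea concrete, here is a minimal sketch in Python using the standard difflib module. It is only an illustration of the two-pass approach, not how any existing tool works: the names normalize, align_lines and char_changes are made up for this example, and stripping all whitespace for the alignment key is a simplifying assumption (it would also treat "a b" and "ab" as the same line).

```python
import difflib


def normalize(line):
    """Alignment key (assumption): the line with all whitespace removed."""
    return "".join(line.split())


def align_lines(old, new):
    """Pass 1: line diff on whitespace-free keys, reported on the real lines.

    Returns (tag, old_line, new_line) tuples, where tag is 'same',
    'ws' (lines differ only in whitespace), 'delete' or 'insert'.
    """
    matcher = difflib.SequenceMatcher(
        None,
        [normalize(line) for line in old],
        [normalize(line) for line in new],
        autojunk=False,
    )
    result = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Keys match; check whether the original text matches too.
            for a, b in zip(old[i1:i2], new[j1:j2]):
                result.append(("same" if a == b else "ws", a, b))
        else:
            # 'replace', 'delete' and 'insert' are flattened for brevity.
            for a in old[i1:i2]:
                result.append(("delete", a, ""))
            for b in new[j1:j2]:
                result.append(("insert", "", b))
    return result


def char_changes(a, b):
    """Pass 2: character diff of two aligned lines, whitespace included."""
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [
        (tag, a[i1:i2], b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    old = ["foo(a);", "bar(b);", "}"]
    new = ["if (x) {", "    foo(a);", "    bar(b);", "}", "}"]
    for tag, a, b in align_lines(old, new):
        detail = char_changes(a, b) if tag == "ws" else ""
        print(tag, repr(a), repr(b), detail)
```

- On this small "wrapped in an if" example, the two statements stay paired with their indented counterparts and are reported as whitespace-only changes (four inserted spaces each), while the added if line and the extra closing brace show up as insertions. A real tool would still have to handle 'replace' hunks more carefully and decide how to weight whitespace in a merge.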
- Answer 1:
- As a general rule, I don't think you are actually interested in the whitespace; the code (in many languages) mostly doesn't care. You want to compare the code structure. See http://stackoverflow.com/a/2828866
- Answer 2:
- Our program, [Beyond Compare](http://www.scootersoftware.com/), does that. It classifies text as important (strings, identifiers, operators, etc.) and unimportant (comments, whitespace). The alignment is done on the important text first and the unimportant text afterwards, and it shows the important and unimportant differences in different colors so it's obvious what kind it is.