|
Home > Archive > Unix Programming > November 2006 > line diff
|
|
| Henry Townsend 2006-11-25, 7:22 pm |
| It's commonly said that the beauty of Unix is in part due to its large
collection of small utilities (cat/sed/grep/awk/cut/...) which can be
pasted together via pipelines to do just about anything. But there's one
thing that seems to be missing: the ability to find diffs *within* a
long line.
My use case: I'm using a build system which generates really long
command lines. I'm looking at one right now that's ~2500 characters,
which on an 80-column screen wraps around into 32 visual lines. I've got
one version that builds correctly and another that doesn't, so I have
two of these giant lines and need to analyze their differences. And it's
not just once, I'll be spending a *lot* of time doing this over the next
few months while working on a porting project. This is a very hard thing
to do visually, over and over.
So does anyone know of a tool which can highlight the parts of a line
which differ or remove the parts that don't? If not a utility, ideas on
a Perl script or similar?
TIA,
HT
| |
| Chris F.A. Johnson 2006-11-26, 1:28 am |
| On 2006-11-26, Henry Townsend wrote:
> It's commonly said that the beauty of Unix is in part due to its large
> collection of small utilities (cat/sed/grep/awk/cut/...) which can be
> pasted together via pipelines to do just about anything. But there's one
> thing that seems to be missing; the ability to find diffs *within* a
> long line.
>
> My use case: I'm using a build system which generates really long
> command lines. I'm looking at one right now that's ~2500 characters,
> which on an 80-column screen wraps around into 32 visual lines. I've got
> one version that builds correctly and another that doesn't, so I have
> two of these giant lines and need to analyze their differences. And it's
> not just once, I'll be spending a *lot* of time doing this over the next
> few months while working on a porting project. This is a very hard thing
> to do visually, over and over.
>
> So does anyone know of a tool which can highlight the parts of a line
> which differ or remove the parts that don't? If not a utility, ideas on
> a PERL script or similar?
There are probably too many variations to make it a standard
utility. What would you like it to do in each case?
The following script, which works in bash and ksh93, prints the
common words from two strings:
{
a='asd fgh jal zxc vbn '
b='asd fgh jkl zxc vbm '
# comm expects sorted input, so sort each word list first
comm -12 <(set -f; printf "%s\n" $a | sort) <(set -f; printf "%s\n" $b | sort) | tr '\012' ' '
echo
}
Change -12 for other results.
--
Chris F.A. Johnson, author <http://cfaj.freeshell.org/shell>
Shell Scripting Recipes: A Problem-Solution Approach (2005, Apress)
===== My code in this post, if any, assumes the POSIX locale
===== and is released under the GNU General Public Licence
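To illustrate "Change -12 for other results": comm(1) prints three columns (lines only in the first input, lines only in the second, lines in both), and each digit in the option suppresses one of them. The following is only a sketch, assuming a POSIX sh with mktemp(1) available; the function name comm_words is made up for this example. It sorts both word lists, since comm expects sorted input, so the original word order is lost.

```shell
# comm_words FLAGS STRING1 STRING2
# Split both strings into words and run comm with the given column flags.
# (Hypothetical helper; temp files are used so it also runs in plain sh.)
comm_words() {
    tmp1=$(mktemp) || return 1
    tmp2=$(mktemp) || { rm -f "$tmp1"; return 1; }
    # One word per line, empty lines dropped, sorted and de-duplicated.
    printf '%s\n' "$2" | tr -s ' ' '\n' | sed '/^$/d' | sort -u > "$tmp1"
    printf '%s\n' "$3" | tr -s ' ' '\n' | sed '/^$/d' | sort -u > "$tmp2"
    # -12 keeps common words, -23 words only in STRING1, -13 words only in STRING2.
    comm "$1" "$tmp1" "$tmp2"
    rm -f "$tmp1" "$tmp2"
}
```

With the two sample strings above, -23 should leave jal and vbn, and -13 should leave jkl and vbm.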
| |
| Kenny McCormack 2006-11-26, 1:28 am |
| In article <B5ydnYc5mKN_efXYnZ2dnUVZ_t2dnZ2d@comcast.com>,
Henry Townsend <henry.townsend@not.here> wrote:
>It's commonly said that the beauty of Unix is in part due to its large
>collection of small utilities (cat/sed/grep/awk/cut/...) which can be
>pasted together via pipelines to do just about anything. But there's one
>thing that seems to be missing; the ability to find diffs *within* a
>long line.
When I need to do this, I usually do something like:
gawk '{l=length($0);for (i=1; i<=l; i+=75) print substr($0,i,75)}' file.1 > file1
gawk '{l=length($0);for (i=1; i<=l; i+=75) print substr($0,i,75)}' file.2 > file2
diff file1 file2 (*)
(*) Or: gvim -d file1 file2
Not particularly high tech, and the hot dogs will probably come up with
tricky use of things like <( and so on, but the advantage of doing it
this way is that you get to see what you are doing as you're doing it.
And, of course, it can be wrapped up in a script when you get it all
working.
| |
| Ed Morton 2006-11-26, 1:28 am |
| Kenny McCormack wrote:
> In article <B5ydnYc5mKN_efXYnZ2dnUVZ_t2dnZ2d@comcast.com>,
> Henry Townsend <henry.townsend@not.here> wrote:
>
>
>
> When I need to do this, I usually do something like:
>
> gawk '{l=length($0);for (i=1; i<=l; i+=75) print substr($0,i,75)}' file.1 > file1
> gawk '{l=length($0);for (i=1; i<=l; i+=75) print substr($0,i,75)}' file.2 > file2
> diff file1 file2 (*)
>
> (*) Or: gvim -d file1 file2
I was thinking "fold" instead of "gawk" and "tkdiff" (or "gtkdiff")
instead of "gvim", but same idea....
Ed.
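A minimal sketch of the fold(1) variant, assuming a POSIX sh plus mktemp(1). The function name chunkdiff and its optional width argument are invented for illustration, and the input is assumed to be ASCII so that bytes and characters coincide.

```shell
# chunkdiff FILE1 FILE2 [WIDTH]
# Fold both files into fixed-width chunks and diff the results.
# (Hypothetical helper name; WIDTH defaults to 75 as in the gawk version.)
chunkdiff() {
    width=${3:-75}
    tmp1=$(mktemp) || return 1
    tmp2=$(mktemp) || { rm -f "$tmp1"; return 1; }
    # fold -b counts bytes rather than screen columns.
    fold -b -w "$width" "$1" > "$tmp1"
    fold -b -w "$width" "$2" > "$tmp2"
    # Keep diff's exit status (0 same, 1 different) for the caller.
    status=0
    diff "$tmp1" "$tmp2" || status=$?
    rm -f "$tmp1" "$tmp2"
    return "$status"
}
```

Fixed-width chunking works best when characters are substituted in place. A single inserted or deleted character shifts every later chunk, so all following chunks show up as changed; in that case splitting on words is usually more robust.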
| |
| Janis Papanagnou 2006-11-26, 1:16 pm |
| Chris F.A. Johnson wrote:
> On 2006-11-26, Henry Townsend wrote:
>
>
>
> There are probably too many variations to make it a standard
> utility. What would you like it to do in each case?
>
> The following script, which works in bash and ksh93, prints the
> common words from two strings:
>
> {
> a='asd fgh jal zxc vbn '
> b='asd fgh jkl zxc vbm '
> comm -12 <(set -f; printf "%s\n" $a) <(set -f; printf "%s\n" $b) | tr '\012' ' '
> echo
> }
>
> Change -12 for other results.
>
A variant for character-oriented comparison (assuming the two lines are
stored in the files line1 and line2, respectively)...
comm -12 <( sed 's/./&\n/g' <line1 ) <( sed 's/./&\n/g' <line2 ) |
tr -d '\012'
Janis
| |
| William James 2006-11-26, 1:16 pm |
| Chris F.A. Johnson wrote:
> On 2006-11-26, Henry Townsend wrote:
>
> There are probably too many variations to make it a standard
> utility. What would you like it to do in each case?
>
> The following script, which works in bash and ksh93, prints the
> common words from two strings:
>
> {
> a='asd fgh jal zxc vbn '
> b='asd fgh jkl zxc vbm '
> comm -12 <(set -f; printf "%s\n" $a) <(set -f; printf "%s\n" $b) | tr '\012' ' '
> echo
> }
Using Ruby:
a = 'asd fgh jal zxc vbn '.split
b = 'asd fgh jkl zxc vbm '.split
puts a.zip(b).map{|x,y| (x==y) ? x : "(#{x}|#{y})" }.join(" ")
---> asd fgh (jal|jkl) zxc (vbn|vbm)
| |
| Icarus Sparry 2006-11-26, 1:16 pm |
| On Sat, 25 Nov 2006 19:50:45 -0500, Henry Townsend wrote:
> It's commonly said that the beauty of Unix is in part due to its large
> collection of small utilities (cat/sed/grep/awk/cut/...) which can be
> pasted together via pipelines to do just about anything. But there's one
> thing that seems to be missing; the ability to find diffs *within* a
> long line.
>
> My use case: I'm using a build system which generates really long
> command lines. I'm looking at one right now that's ~2500 characters,
> which on an 80-column screen wraps around into 32 visual lines. I've got
> one version that builds correctly and another that doesn't, so I have
> two of these giant lines and need to analyze their differences. And it's
> not just once, I'll be spending a *lot* of time doing this over the next
> few months while working on a porting project. This is a very hard thing
> to do visually, over and over.
>
> So does anyone know of a tool which can highlight the parts of a line
> which differ or remove the parts that don't? If not a utility, ideas on
> a PERL script or similar?
>
> TIA,
> HT
'wdiff' from the FSF will narrow the change down to the "word" level. You
can customise it to emit escape sequences if the "less" mode is not
enough for you.
| |
| phil-news-nospam@ipal.net 2006-11-26, 1:16 pm |
| In comp.unix.programmer Henry Townsend <henry.townsend@not.here> wrote:
| It's commonly said that the beauty of Unix is in part due to its large
| collection of small utilities (cat/sed/grep/awk/cut/...) which can be
| pasted together via pipelines to do just about anything. But there's one
| thing that seems to be missing; the ability to find diffs *within* a
| long line.
|
| My use case: I'm using a build system which generates really long
| command lines. I'm looking at one right now that's ~2500 characters,
| which on an 80-column screen wraps around into 32 visual lines. I've got
| one version that builds correctly and another that doesn't, so I have
| two of these giant lines and need to analyze their differences. And it's
| not just once, I'll be spending a *lot* of time doing this over the next
| few months while working on a porting project. This is a very hard thing
| to do visually, over and over.
|
| So does anyone know of a tool which can highlight the parts of a line
| which differ or remove the parts that don't? If not a utility, ideas on
| a PERL script or similar?
echo first big long command | tr ' ' '\012' > 1
echo other big long command | tr ' ' '\012' > 2
diff 1 2
rm 1 2
--
|---------------------------------------/----------------------------------|
| Phil Howard KA9WGN (ka9wgn.ham.org) / Do not send to the address below |
| first name lower case at ipal.net / spamtrap-2006-11-26-1225@ipal.net |
|------------------------------------/-------------------------------------|
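The same tr-and-diff idea can be wrapped in a function that cleans up after itself. This is a sketch only, assuming a POSIX sh with mktemp(1); the name worddiff is hypothetical. It uses temporary files from mktemp instead of files named 1 and 2 in the current directory, and printf instead of echo, since echo may mangle backslashes or leading dashes on some systems.

```shell
# worddiff STRING1 STRING2
# Put each space-separated word on its own line and diff the two lists.
# (Hypothetical helper name, not a standard utility.)
worddiff() {
    tmp1=$(mktemp) || return 1
    tmp2=$(mktemp) || { rm -f "$tmp1"; return 1; }
    # tr -s squeezes runs of spaces so they do not produce empty lines.
    printf '%s\n' "$1" | tr -s ' ' '\n' > "$tmp1"
    printf '%s\n' "$2" | tr -s ' ' '\n' > "$tmp2"
    # Keep diff's exit status (0 same, 1 different) for the caller.
    status=0
    diff "$tmp1" "$tmp2" || status=$?
    rm -f "$tmp1" "$tmp2"
    return "$status"
}
```

Called as worddiff "$first_command" "$other_command", it reports only the words that changed, in normal diff format.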
| |
| Michael Paoli 2006-11-26, 7:18 pm |
| Henry Townsend wrote:
> It's commonly said that the beauty of Unix is in part due to its large
> collection of small utilities (cat/sed/grep/awk/cut/...) which can be
> pasted together via pipelines to do just about anything. But there's one
> thing that seems to be missing; the ability to find diffs *within* a
> long line.
>
> My use case: I'm using a build system which generates really long
> command lines. I'm looking at one right now that's ~2500 characters,
> which on an 80-column screen wraps around into 32 visual lines. I've got
> one version that builds correctly and another that doesn't, so I have
> two of these giant lines and need to analyze their differences. And it's
> not just once, I'll be spending a *lot* of time doing this over the next
> few months while working on a porting project. This is a very hard thing
> to do visually, over and over.
>
> So does anyone know of a tool which can highlight the parts of a line
> which differ or remove the parts that don't? If not a utility, ideas on
> a PERL script or similar?
With just standard tools, we can do something like this (where we've
put each of the lines into a separate file):
$ wc l?
1 304 2500 l1
1 304 2500 l2
2 608 5000 total
$ cmp l?
l1 l2 differ: char 1250, line 1
$ cut -c1240-1260 l1; cut -c1240-1260 l2
Afghanistan's Afghan
AfghanistAn's Afghan
$
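The cmp/cut steps above can be combined into one function. What follows is a sketch under stated assumptions: a POSIX sh, single-byte (ASCII) text, and a made-up function name first_diff with an optional context argument. Rather than parsing the "differ: char N" message, it uses cmp -l, which lists every differing byte as "offset octal octal", and keeps the first offset.

```shell
# first_diff FILE1 FILE2 [CONTEXT]
# Show the text around the first differing byte of two files.
# (Hypothetical helper name; CONTEXT defaults to 10 bytes either side.)
first_diff() {
    ctx=${3:-10}
    # cmp -l prints one line per differing byte; take the offset from the first.
    pos=$(cmp -l "$1" "$2" 2>/dev/null | awk 'NR == 1 { print $1 }')
    if [ -z "$pos" ]; then
        echo "no differing byte found (identical, or one file is a prefix of the other)"
        return 0
    fi
    start=$((pos - ctx))
    [ "$start" -lt 1 ] && start=1
    end=$((pos + ctx))
    echo "first difference at byte $pos"
    cut -c"$start-$end" "$1"
    cut -c"$start-$end" "$2"
}
```

With the default context of 10, a first difference at byte 1250 selects the same 1240-1260 window as the cut commands above.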
wdiff(1) is also freely available, and has syntax similar to comm(1),
while being "word" oriented:
$ wdiff -3 l?
======================================================================
[-Afghanistan's-] {+AfghanistAn's+}
======================================================================
$
One could also create such a tool, e.g. by using some of these tools:
awk(1)
cmp(1)
comm(1)
cut(1)
diff(1)
sed(1)
sh(1)
tput(1)
tr(1)
tty(1)
or
perl(1)
Also, if feasible (depending on language/interpreter, etc.), it might
be useful to generate command lines of a more human-readable length,
e.g. for sh(1):
a very very very ... very long command
can also typically be rewritten as:
a \
very \
very \
very \
.... \
very \
long \
command
(one can easily imagine how a ~2500 character command line could
typically be split into more human-readable lines of about 80 characters or fewer)
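The rewrite shown above can be produced mechanically. A rough sketch, assuming a POSIX sh; the function name split_command is invented here. It splits naively on spaces, so runs of spaces are squeezed and arguments that contain quoted spaces may not survive intact.

```shell
# split_command STRING
# Print STRING one word per line, with a trailing backslash on every
# line except the last, so the result is still a single sh command.
# (Hypothetical helper name; naive split on spaces only.)
split_command() {
    # sed: on every line but the last ($!), append " \" at end of line.
    printf '%s\n' "$1" | tr -s ' ' '\n' | sed '$!s/$/ \\/'
}
```

Two command lines rewritten this way can also be compared directly with diff(1), since each word then sits on its own line.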
| |
| John W. Krahn 2006-11-26, 7:18 pm |
| Henry Townsend wrote:
> It's commonly said that the beauty of Unix is in part due to its large
> collection of small utilities (cat/sed/grep/awk/cut/...) which can be
> pasted together via pipelines to do just about anything. But there's one
> thing that seems to be missing; the ability to find diffs *within* a
> long line.
>
> My use case: I'm using a build system which generates really long
> command lines. I'm looking at one right now that's ~2500 characters,
> which on an 80-column screen wraps around into 32 visual lines. I've got
> one version that builds correctly and another that doesn't, so I have
> two of these giant lines and need to analyze their differences. And it's
> not just once, I'll be spending a *lot* of time doing this over the next
> few months while working on a porting project. This is a very hard thing
> to do visually, over and over.
>
> So does anyone know of a tool which can highlight the parts of a line
> which differ
$ perl -le'
use Term::ANSIColor q(:constants);
use Text::ParseWords;
my $cl1 = q(-d -f filename -L/lib/mylibrary --optionX --optionG "some text" -q -vv --print);
my $cl2 = q(--print --optionG "other text" --optionY -d -q -vvv -o filename);
print for $cl1, $cl2, "";
# split each command line into words, keeping quoted strings together
my %hash1 = map { $_ => 1 } my @words1 = parse_line qr/\s+/, 1, $cl1;
my %hash2 = map { $_ => 1 } my @words2 = parse_line qr/\s+/, 1, $cl2;
# highlight the words that do not occur in the other command line
print join q[ ], map { exists $hash2{ $_ } ? $_ : join q[], BOLD, BLUE, $_, RESET } @words1;
print join q[ ], map { exists $hash1{ $_ } ? $_ : join q[], BOLD, RED, $_, RESET } @words2;
'
-d -f filename -L/lib/mylibrary --optionX --optionG "some text" -q -vv --print
--print --optionG "other text" --optionY -d -q -vvv -o filename
-d -f filename -L/lib/mylibrary --optionX --optionG "some text" -q -vv --print
--print --optionG "other text" --optionY -d -q -vvv -o filename
* Note that the colours won't show here but it should work on a terminal. :-)
If you replaced the constants BOLD, BLUE, RED and RESET with the strings
'<BOLD>', '<BLUE>', '<RED>' and '<RESET>' you would get this output:
-d -f filename -L/lib/mylibrary --optionX --optionG "some text" -q -vv --print
--print --optionG "other text" --optionY -d -q -vvv -o filename
-d <BOLD><BLUE>-f<RESET> filename <BOLD><BLUE>-L/lib/mylibrary<RESET>
<BOLD><BLUE>--optionX<RESET> --optionG <BOLD><BLUE>"some text"<RESET> -q
<BOLD><BLUE>-vv<RESET> --print
--print --optionG <BOLD><RED>"other text"<RESET> <BOLD><RED>--optionY<RESET>
-d -q <BOLD><RED>-vvv<RESET> <BOLD><RED>-o<RESET> filename
> or remove the parts that don't?
$ perl -le'
use Text::ParseWords;
my $cl1 = q(-d -f filename -L/lib/mylibrary --optionX --optionG "some text" -q -vv --print);
my $cl2 = q(--print --optionG "other text" --optionY -d -q -vvv -o filename);
print for $cl1, $cl2, "";
# split each command line into words, keeping quoted strings together
my %hash1 = map { $_ => 1 } my @words1 = parse_line qr/\s+/, 1, $cl1;
my %hash2 = map { $_ => 1 } my @words2 = parse_line qr/\s+/, 1, $cl2;
# keep only the words that do not occur in the other command line
print join q[ ], grep { not exists $hash2{ $_ } } @words1;
print join q[ ], grep { not exists $hash1{ $_ } } @words2;
'
-d -f filename -L/lib/mylibrary --optionX --optionG "some text" -q -vv --print
--print --optionG "other text" --optionY -d -q -vvv -o filename
-f -L/lib/mylibrary --optionX "some text" -vv
"other text" --optionY -vvv -o
John
--
Perl isn't a toolbox, but a small machine shop where you can special-order
certain sorts of tools at low cost and in short order. -- Larry Wall
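For comparison, the "remove the parts that don't differ" step can be approximated with standard tools only. This is a sketch, not an equivalent of the Perl above: it assumes a POSIX sh with mktemp(1), the name unique_words is made up, and it splits on spaces only, so a quoted argument such as "some text" is treated as two words where Text::ParseWords keeps it as one. Like the Perl version, it preserves the original word order.

```shell
# unique_words STRING1 STRING2
# Print, in original order, the words of STRING1 that do not occur in STRING2.
# (Hypothetical helper name; naive split on spaces only.)
unique_words() {
    tmp=$(mktemp) || return 1
    # Word list of STRING2, one per line, used as the pattern file.
    printf '%s\n' "$2" | tr -s ' ' '\n' > "$tmp"
    # -F fixed strings, -x whole-line match, -v invert, -f patterns from file.
    status=0
    printf '%s\n' "$1" | tr -s ' ' '\n' | grep -F -x -v -f "$tmp" || status=$?
    rm -f "$tmp"
    # grep exits 1 when nothing is selected; only higher codes are errors.
    [ "$status" -le 1 ]
}
```

Running it twice with the arguments swapped gives the words unique to each command line.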
|
|
|
|
|