Unix Shell - How to sort block of text


Jeff

2006-02-22, 6:04 pm

Hello all,

Can someone give me hints on how to sort a block of text like the one below?
In other words, I would like to sort numerically by paragraph. Are there
any existing tools or scripts for that type of sorting?


2189 Specify a remote server.
$ error string displayed when no remote server is specified.

2188 Specify a mount point.
$ error string displayed when no mount point is specified.

2190 Specify a remote directory.
$ error string displayed when no remote directory is specified.

112 Defragment Directories: %s
113
$ Title of the form. Will appear at top of screen when a file
$ system is selected from the file systems tab and the defrag action
$ is clicked.


Thanks a bunch!
Jeff

William James

2006-02-22, 6:04 pm

Jeff wrote:
> Hello all,
>
> Can someone give me hints on how to sort a block of text like the one below?
> In other words, I would like to sort numerically by paragraph. Are there
> any existing tools or scripts for that type of sorting?
>
>
> 2189 Specify a remote server.
> $ error string displayed when no remote server is specified.
>
> 2188 Specify a mount point.
> $ error string displayed when no mount point is specified.
>
> 2190 Specify a remote directory.
> $ error string displayed when no remote directory is specified.
>
> 112 Defragment Directories: %s
> 113
> $ Title of the form. Will appear at top of screen when a file
> $ system is selected from the file systems tab and the defrag action
> $ is clicked.



ruby -e 'puts gets(nil).split(/\n\n/).sort_by{|s|s.to_i}.join("\n\n")'

Jeff

2006-02-22, 6:04 pm

William James wrote:

> Jeff wrote:
>
>
> ruby -e 'puts gets(nil).split(/\n\n/).sort_by{|s|s.to_i}.join("\n\n")'


I am sorry, I do not know anything about ruby (I believe I should...). When
I execute the command above, the result is not sorted. I want it sorted
numerically by paragraph so the output would be:

112 Defragment Directories: %s
113
$ Title of the form. Will appear at top of screen when a file
$ system is selected from the file systems tab and the defrag action
$ is clicked.

2188 Specify a mount point.
$ error string displayed when no mount point is specified.

2189 Specify a remote server.
$ error string displayed when no remote server is specified.

2190 Specify a remote directory.
$ error string displayed when no remote directory is specified.


Xicheng

2006-02-22, 6:04 pm

Jeff wrote:
> William James wrote:
>
>
> I am sorry, I do not know anything about ruby (I believe I should...). When
> I execute the command above, the result is not sorted. I want it sorted
> numerically by paragraph so the output would be:


How about using Perl (if your file is not too large):

perl -n00e '$h{$_}++ }{ print sort {$a <=> $b} keys %h' myfile.txt

Xicheng

> 112 Defragment Directories: %s
> 113
> $ Title of the form. Will appear at top of screen when a file
> $ system is selected from the file systems tab and the defrag action
> $ is clicked.
>
> 2188 Specify a mount point.
> $ error string displayed when no mount point is specified.
>
> 2189 Specify a remote server.
> $ error string displayed when no remote server is specified.
>
> 2190 Specify a remote directory.
> $ error string displayed when no remote directory is specified.


Jeff

2006-02-22, 6:04 pm


I want to apologize. Both the Ruby and Perl scripts work like a charm.
My "real" file had CR/LF line endings.

Thank you so much!
Jeff
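
A minimal sketch of why CR/LF line endings defeat the first one-liner. The sample strings and variable names are made up for illustration; only split(/\n\n/) and delete("\r") come from the thread.

```ruby
# Hypothetical sample: two paragraphs, once with LF and once with CR/LF endings.
unix = "2189 Specify a remote server.\n$ comment\n\n2188 Specify a mount point.\n$ comment\n"
dos  = unix.gsub("\n", "\r\n")

# With LF endings the blank line shows up as "\n\n", so the split finds two paragraphs.
unix_paragraphs = unix.split(/\n\n/)

# With CR/LF endings the blank line is "\n\r\n"; /\n\n/ never matches,
# so the whole input stays one "paragraph" and sorting changes nothing.
dos_paragraphs = dos.split(/\n\n/)

# Stripping the carriage returns first restores the expected split.
fixed_paragraphs = dos.delete("\r").split(/\n\n/)

puts unix_paragraphs.size
puts dos_paragraphs.size
puts fixed_paragraphs.size
```

The middle count is the symptom Jeff saw: one element goes in, one element comes out, and the output looks unsorted.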

William James

2006-02-22, 6:04 pm

Jeff wrote:
> William James wrote:
>
>
> I am sorry, I do not know anything about ruby (I believe I should...). When
> I execute the command above, the result is not sorted. I want it sorted
> numerically by paragraph so the output would be:
>
> 112 Defragment Directories: %s
> 113
> $ Title of the form. Will appear at top of screen when a file
> $ system is selected from the file systems tab and the defrag action
> $ is clicked.
>
> 2188 Specify a mount point.
> $ error string displayed when no mount point is specified.
>
> 2189 Specify a remote server.
> $ error string displayed when no remote server is specified.
>
> 2190 Specify a remote directory.
> $ error string displayed when no remote directory is specified.


Perhaps the lines that appear to be empty actually contain
spaces or tabs. If so, "split(/\n\n/)" should be changed to
"split(/\n[ \t]*\n/)". Also, let's remove any carriage-returns.

ruby -e 'puts gets(nil).chomp.delete("\r").
split(/\n[ \t]*\n/).sort_by{|s|s.to_i}.join("\n\n")' myfile

Explanation:
  gets(nil)             reads the whole file
  chomp                 removes the linefeed at the end
  delete("\r")          removes carriage returns that Windows may have inserted
  split(/\n[ \t]*\n/)   creates an array by splitting on blank lines
  sort_by{|s| s.to_i }  sorts the array on the integer value of each element
  join("\n\n")          converts the array back into a string with a blank
                        line between each pair of elements
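
The same chain can be written as a small script, which may be easier to read and test than the one-liner. This is only a sketch: the method name sort_paragraphs and the sample text are assumptions made for illustration, not something from the thread.

```ruby
# sort_paragraphs is a hypothetical helper name wrapping the chain explained above.
def sort_paragraphs(text)
  text.chomp                     # drop the final line terminator
      .delete("\r")              # drop carriage returns from CR/LF files
      .split(/\n[ \t]*\n/)       # split on empty or whitespace-only lines
      .sort_by { |s| s.to_i }    # String#to_i reads the leading integer
      .join("\n\n")              # one blank line between paragraphs
end

# Made-up sample with CR/LF endings and a "blank" line that holds a space and a tab.
sample = "2189 Specify a remote server.\r\n$ comment a\r\n \t\r\n" \
         "112 Defragment Directories: %s\r\n113\r\n$ comment b\r\n"

sorted = sort_paragraphs(sample)
puts sorted
```

To run it against a file instead of the sample, something like `puts sort_paragraphs(File.read("myfile"))` should work, assuming the file fits in memory.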

William James

2006-02-22, 6:04 pm


Jeff wrote:
> I want to apologize. Both the Ruby and Perl scripts work like a charm.
> My "real" file had CR/LF line endings.


Mickeysoft strikes again.

William James

2006-02-22, 6:04 pm

Xicheng wrote:
> Jeff wrote:
>
> How about using Perl (if your file is not too large):
>
> perl -n00e '$h{$_}++ }{ print sort {$a <=> $b} keys %h' myfile.txt


ruby -00e 'puts $<.sort_by{|s|s.to_i}' myfile.txt

Jeff

2006-02-22, 8:51 pm

William James wrote:

> Xicheng wrote:
>
> ruby -00e 'puts $<.sort_by{|s|s.to_i}' myfile.txt


I am impressed. Let's say the file were in this format instead, with
comments first and the message second. I need to understand the sort_by
syntax better.

$ error string displayed when no remote server is specified.
2189 Specify a remote server.

$ error string displayed when no mount point is specified.
2188 Specify a mount point.

$ error string displayed when no remote directory is specified.
2190 Specify a remote directory.

$ Title of the form. Will appear at top of screen when a file
$ system is selected from the file systems tab and the defrag action
$ is clicked.
112 Defragment Directories: %s
113

Thanks to both of you. This is really interesting and helpful.
Jeff

William James

2006-02-23, 2:54 am

Jeff wrote:
> William James wrote:
>
>
> I am impressed. Let's say the file were in this format instead, with
> comments first and the message second. I need to understand the sort_by
> syntax better.
>
> $ error string displayed when no remote server is specified.
> 2189 Specify a remote server.
>
> $ error string displayed when no mount point is specified.
> 2188 Specify a mount point.
>
> $ error string displayed when no remote directory is specified.
> 2190 Specify a remote directory.
>
> $ Title of the form. Will appear at top of screen when a file
> $ system is selected from the file systems tab and the defrag action
> $ is clicked.
> 112 Defragment Directories: %s
> 113
>
> Thanks to both of you. This is really interesting and helpful.
> Jeff


ruby -00e 'puts $<.sort_by{|s| s[/^\d+/].to_i }' myfile

Explanation:
  $<
    Sort of a file-handle to all of the files on the command line.
    Synonym for ARGF.
  sort_by{ |s|
    s represents one element of the array being sorted: a string
    containing a paragraph. We have to specify what part of that
    string should be used for determining order. The first number
    that is found at the start of a line is what we want.
  s[/^\d+/]
    In Ruby you can grab a part of a string by appending square brackets
    with a regular expression inside. The ^ represents the start of a
    line; \d means a digit; + means one or more of the preceding item.
  .to_i
    Convert the string to an integer.
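
A short sketch of the difference between the two sort keys on a comments-first paragraph. The variable names and the second sample string are invented for illustration.

```ruby
# Hypothetical paragraph in the comments-first layout from the thread.
para = "$ error string displayed when no mount point is specified.\n" \
       "2188 Specify a mount point.\n"

# String#to_i parses from the very start of the string, which is "$" here.
plain_key = para.to_i

# The regex slice picks the first run of digits that begins a line.
regex_key = para[/^\d+/].to_i

# A paragraph without any number: the slice is nil, and nil.to_i is 0,
# so such paragraphs sort to the front instead of raising an error.
no_number = "$ only a comment\n"
missing_key = no_number[/^\d+/].to_i

puts plain_key, regex_key, missing_key
```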

Stephane CHAZELAS

2006-02-23, 2:54 am

2006-02-22, 15:53(-05), Jeff:
> Hello all,
>
> Can someone give me hints on how to sort a block of text like the one below?
> In other words, I would like to sort numerically by paragraph. Are there
> any existing tools or scripts for that type of sorting?
>
>
> 2189 Specify a remote server.
> $ error string displayed when no remote server is specified.
>
> 2188 Specify a mount point.
> $ error string displayed when no mount point is specified.
>
> 2190 Specify a remote directory.
> $ error string displayed when no remote directory is specified.
>
> 112 Defragment Directories: %s
> 113
> $ Title of the form. Will appear at top of screen when a file
> $ system is selected from the file systems tab and the defrag action
> $ is clicked.

[...]

POSIXly, you could do the following (though you may run into awk record
size limits, which POSIX allows to be as low as 2048 characters):

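< file awk -vRS= '
  {
    gsub("_", "_u")    # escape "_" so that "_s" can stand for "/"
    gsub("/", "_s")    # free up "/" for use as a newline marker
    gsub("\n", "/")    # fold the paragraph onto a single line
    print
  }' | sort -n | awk '
  NR > 1 {print ""}    # blank line between paragraphs
  {
    gsub("/", "\n")    # undo the three substitutions in reverse order
    gsub("_s", "/")
    gsub("_u", "_")
    print
  }'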
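
The escaping scheme used by the pipeline above can be checked in isolation. The sketch below mirrors the gsub calls in Ruby; encode and decode are hypothetical helper names, the sample text is made up, and sort_by with to_i only stands in for `sort -n`.

```ruby
# Fold a paragraph onto one line: "_" -> "_u", "/" -> "_s", newline -> "/".
def encode(paragraph)
  paragraph.gsub("_", "_u").gsub("/", "_s").gsub("\n", "/")
end

# Undo the folding by applying the inverse substitutions in reverse order.
def decode(line)
  line.gsub("/", "\n").gsub("_s", "/").gsub("_u", "_")
end

# Made-up input containing "/", "_" and a literal "_s" to exercise the escaping.
text = "2189 a/b_c\n$ note\n\n112 first\n113\n$ x_s y\n"

paragraphs = text.split(/\n{2,}/).map(&:chomp)
lines      = paragraphs.map { |p| encode(p) }
sorted     = lines.sort_by { |l| l.to_i }       # stand-in for `sort -n`
result     = sorted.map { |l| decode(l) }.join("\n\n")

puts result
```

Escaping "_" first is what keeps a literal "_s" in the input from being mistaken for an encoded "/" on the way back.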


--
Stéphane

Jeff

2006-02-26, 10:18 am

William James wrote:

>
> ruby -00e 'puts $<.sort_by{|s| s[/^\d+/].to_i }' myfile


Thanks a lot. I really appreciate your time. I will definitely keep Ruby in
mind.

Jeff

> Explanation:
>   $<
>     Sort of a file-handle to all of the files on the command line.
>     Synonym for ARGF.
>   sort_by{ |s|
>     s represents one element of the array being sorted: a string
>     containing a paragraph. We have to specify what part of that
>     string should be used for determining order. The first number
>     that is found at the start of a line is what we want.
>   s[/^\d+/]
>     In Ruby you can grab a part of a string by appending square brackets
>     with a regular expression inside. The ^ represents the start of a
>     line; \d means a digit; + means one or more of the preceding item.
>   .to_i
>     Convert the string to an integer.


William James

2006-02-26, 10:18 am

Jeff wrote:
> William James wrote:
>
>
> Thanks a lot. I really appreciate your time. I will definitely keep Ruby in
> mind.
>
> Jeff
>

Since sorting is the topic, I should have written
"Something like a file-handle" to avoid confusion.
Copyright 2003 - 2008 webservertalk.com