René Nyffenegger's collection of things on the web

Sorting with Perl

The sort operator

Sorting strings

The operator in Perl to sort a list is sort, which seems sort of logical.
my @strings = qw (delaware ape citron blue);

my @sorted_strings= sort @strings;
print "\nStrings, correctly sorted\n";
print join "\n",@sorted_strings;
The output of this Perl script is:
ape
blue
citron
delaware

Sorting numbers

The problem with sort's default behaviour is that it compares the elements as strings, so the result is in alphabetical order, not numerical order. Here's a snippet that incorrectly sorts numbers:
my @numbers = (20,1,10,2);
my @incorrectly_sorted_numbers=sort @numbers;
print "\n\nNumbers, incorrectly sorted\n";
print join "\n",@incorrectly_sorted_numbers;
Here's the output of this Perl script. Probably not what was intended.
1
10
2
20
If you want to have the numbers sorted by their value, you need to use the <=> operator (aka the spaceship operator) in a comparison block:
my @numbers = (20,1,10,2);
my @correctly_sorted_numbers = sort {$a <=> $b} @numbers;
print "\n\nNumbers, correctly sorted\n";
print join "\n",@correctly_sorted_numbers;
The variables $a and $b are automatically set by Perl to the two elements being compared. The output is now:
1
2
10
20
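The same block syntax covers other orderings as well. As a minimal sketch (the variable names @descending and @as_strings are made up for illustration), swapping $a and $b reverses the order, and using cmp in the block reproduces the string comparison that a plain sort performs:

```perl
use strict;
use warnings;

my @numbers = (20, 1, 10, 2);

# Swapping $a and $b sorts numerically in descending order.
my @descending = sort { $b <=> $a } @numbers;

# cmp compares as strings, which is what a plain sort does by default.
my @as_strings = sort { $a cmp $b } @numbers;

print join(", ", @descending), "\n";   # 20, 10, 2, 1
print join(", ", @as_strings), "\n";   # 1, 10, 2, 20
```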

User defined sorting

Perl's sort isn't limited to Perl's built-in operators. You can also call a function of your own to compute the sort criterion. Here, this function is lowest_letter. It is given a string, finds the letter with the lowest value in it and returns that letter. So, after the sort, zebra will be first because it contains an A, and bostich will be second because it doesn't contain an A, but does contain a B.
my @strings_ll = qw(why bostich socket zebra);
my @sorted_strings_ll = sort {lowest_letter($a) cmp lowest_letter($b)} @strings_ll;
print "\n\nStrings, sorted for their lowest letter\n";
print join "\n",@sorted_strings_ll;

sub lowest_letter {
  my $string = shift;
  my $lowest_letter = chr(255);
  foreach my $letter (split //, $string) {
    $lowest_letter = $letter if $letter lt $lowest_letter;
  }
  return $lowest_letter;
}
The output...
Strings, sorted for their lowest letter
zebra
bostich
socket
why
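Because sort evaluates the comparison block many times, lowest_letter runs repeatedly for the same string. One way to avoid that, sketched here under the assumption that computing the key is the expensive part, is to compute each key once into a hash (the name %lowest is hypothetical) and compare the cached values:

```perl
use strict;
use warnings;

sub lowest_letter {
  my $string = shift;
  my $lowest_letter = chr(255);
  foreach my $letter (split //, $string) {
    $lowest_letter = $letter if $letter lt $lowest_letter;
  }
  return $lowest_letter;
}

my @strings_ll = qw(why bostich socket zebra);

# Compute the sort key once per string instead of once per comparison.
my %lowest;
$lowest{$_} = lowest_letter($_) for @strings_ll;

# The comparison block now only looks up the cached keys.
my @sorted_strings_ll = sort { $lowest{$a} cmp $lowest{$b} } @strings_ll;
print join("\n", @sorted_strings_ll), "\n";
```

For four short strings the difference is negligible. The pattern only pays off with long lists or costly key functions.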
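It is also possible to give a function's name directly to sort. $a and $b are set automatically here as well, and the function must return a negative number, zero or a positive number, just like cmp and <=> do.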
my @strings = qw(why bostich socket zebra);

my @sorted_strings_3rd = sort third_letter @strings;
print "\n\nStrings, sorted by their third letter\n";
print join "\n",@sorted_strings_3rd;

sub third_letter {
  return substr($a,2,1) cmp substr($b,2,1);
}
Strings, sorted by their third letter
zebra
socket
bostich
why
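When two strings share the same third letter, the comparison above returns 0 and says nothing about their relative order. A minimal sketch of a tie-breaker, assuming a fall-back to plain string comparison is what is wanted (the sub name third_letter_then_whole and the word list are made up for illustration):

```perl
use strict;
use warnings;

# Compare by third letter first; if those are equal (cmp returns 0),
# fall back to comparing the whole strings.
sub third_letter_then_whole {
  return substr($a, 2, 1) cmp substr($b, 2, 1)
      || $a cmp $b;
}

my @words = qw(tree area free zebra);
my @sorted_words = sort third_letter_then_whole @words;
print join("\n", @sorted_words), "\n";
```

Note the use of || rather than or: because or binds more loosely than return, `return X or Y` would return X before Y is ever considered.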

Thanks

Thanks to James Sullivan, who notified me of an error on this page, which is now corrected.