Sorting Spaces

There seems to be some disagreement at Apple Computer about exactly what the word “ignore” means. From Apple’s “sort” man page:

-d Sort in `phone directory’ order: ignore all characters except letters, digits and blanks when sorting.

What does that suggest to you? Well, let’s compare it to the GNU “sort” man page:

-d, --dictionary-order
consider only blanks and alphanumeric characters

So you’d THINK that sorting with these two options would be equivalent, right?

Nope!

Here’s a simple list:

- 192.168.2.4 foo
- 192.168.2.42 foo

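How should these lines be sorted when the -d option is in effect? With the periods ignored, the first difference is the space after the “4” in the first line against the final “2” in the second. So you’ve got a conundrum: is a space sorted BEFORE a digit or AFTER a digit?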

Curse you, alphabet! You’re never around when I need you!

And, of course, BSD and GNU answer that question differently. On GNU the answer is AFTER; on BSD the answer is BEFORE! Oh goody.
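To see why that one decision flips the result, here is a toy model in Python. It is only a sketch of the comparison described above, not the actual code of either sort; the function name `dict_key` and the `blank_rank` parameter are made up for illustration.

```python
def dict_key(line, blank_rank):
    """Toy model of -d: keep only letters, digits and blanks.

    blank_rank is a hypothetical switch: "before" ranks a blank below
    every letter and digit, "after" ranks it above them.
    """
    key = []
    for ch in line:
        if ch.isalnum():
            key.append((1, ch))
        elif ch in " \t":
            key.append((0, ch) if blank_rank == "before" else (2, ch))
        # everything else (such as '.') is ignored
    return key


lines = ["192.168.2.4 foo", "192.168.2.42 foo"]

blank_first = sorted(lines, key=lambda l: dict_key(l, "before"))
blank_last = sorted(lines, key=lambda l: dict_key(l, "after"))

print(blank_first)  # ['192.168.2.4 foo', '192.168.2.42 foo']
print(blank_last)   # ['192.168.2.42 foo', '192.168.2.4 foo']
```

Same input, same "ignore the punctuation" rule, opposite orders. The only thing that changed is where the blank ranks.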

Here’s a workaround if you need the sorting results to be the same on both BSD and GNU: replace all spaces with some other non-alphanumeric character that isn’t used in the file (such as an underscore, an ellipsis, or an em-dash). Since -d ignores that character on both systems, the question of where a space sorts never comes up. Then sort with -ds (the -s makes the sort stable, so no last-minute saving throws!), then replace the underscore (or whatever) with a space again.

And if you need it to be consistent on OS X too, make it a -dfs sort (the -f folds case, so that capitals and lower-case letters are considered the same).
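The swap-sort-swap-back trick can be sketched in Python as well. Again, this is a toy model and not what either sort really does internally: `dict_key`, `portable_sort`, `blank_rank`, and the default `placeholder` are all names invented for this example, and Python's stable `sorted()` is standing in for the -s flag.

```python
def dict_key(line, blank_rank):
    """Toy -df key: keep letters, digits and blanks, fold case,
    and rank blanks "before" or "after" letters and digits."""
    key = []
    for ch in line:
        if ch.isalnum():
            key.append((1, ch.lower()))
        elif ch in " \t":
            key.append((0, ch) if blank_rank == "before" else (2, ch))
        # anything else, including the placeholder, is ignored
    return key


def portable_sort(lines, blank_rank, placeholder="_"):
    # The placeholder must not already occur in the data.
    assert not any(placeholder in line for line in lines)
    swapped = [line.replace(" ", placeholder) for line in lines]
    # sorted() is stable, which stands in for the -s flag here.
    ordered = sorted(swapped, key=lambda l: dict_key(l, blank_rank))
    return [line.replace(placeholder, " ") for line in ordered]


lines = ["192.168.2.4 foo", "192.168.2.42 foo"]
print(portable_sort(lines, "before"))  # ['192.168.2.42 foo', '192.168.2.4 foo']
print(portable_sort(lines, "after"))   # ['192.168.2.42 foo', '192.168.2.4 foo']
```

With the spaces swapped out, no blank is left for the two rankings to disagree about, so both give the same order. Note that the order is decided by "f" against "2" rather than by the space, which is why the .42 line comes first.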

Comments (1)

Mr. Jaggs:

Brilliant.
Never thought of replacing all spaces with a more manageable character.

Thanks

About

This page contains a single entry from the blog posted on March 12, 2008 6:08 PM.

Creative Commons License
This weblog is licensed under a Creative Commons License.