Sorting Spaces

There seems to be some disagreement, at Apple Computer, about exactly what the definition of the word “ignore” is. From the “sort” man page:

-d Sort in `phone directory’ order: ignore all characters except letters, digits and blanks when sorting.

What does that suggest to you? Well, let’s compare it to the GNU “sort” man page:

-d, --dictionary-order
consider only blanks and alphanumeric characters

So you’d THINK that sorting with these two options would be equivalent, right?


Here’s a simple list:

- foo
- foo

How should these things be sorted when the -d option is in effect? You’ve got a conundrum: is a space sorted BEFORE a number or AFTER a number?

Curse you, alphabet! You’re never around when I need you!

And, of course, BSD and GNU answer that question differently. On GNU, the answer is AFTER; on BSD, the answer is BEFORE! Oh goody.

Here’s a better way if you need the sorting results to be the same on both BSD and GNU: replace all spaces with some other non-alphanumeric character that isn’t used in the file (such as an underscore, or an ellipsis, or an em-dash). Then sort with -ds (the -s keeps the sort stable, so no last-minute saving throws!), then replace the underscore (or whatever) with a space again.
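
The recipe above might look something like this as a shell function. Treat it as a rough sketch rather than the one true way: the function name is made up, the underscore placeholder assumes no underscores already live in your file, and pinning LC_ALL=C on the sort is my own addition to keep locale collation out of the picture.

```shell
#!/bin/sh
# Sketch of the space-swapping workaround.
# Assumption: "_" never appears in the input; pick another unused,
# non-alphanumeric placeholder if it does.
sort_spaces_portably() {
    # 1. Hide every space behind the placeholder.
    # 2. Sort with -d (the placeholder is ignored, like any other
    #    punctuation) and -s (no tie-breaking pass, so lines that
    #    compare equal keep their input order).
    #    LC_ALL=C is an added assumption, not part of the original recipe.
    # 3. Turn the placeholder back into a space.
    tr ' ' '_' | LC_ALL=C sort -ds | tr '_' ' '
}
```

You would call it as `sort_spaces_portably < names.txt` (names.txt being a stand-in for your own file). Since the placeholder is ignored by -d, lines that differ only in their spaces compare as equal and simply stay in the order they arrived in.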

And if you need it to be consistent on OS X platforms too, make it a -dfs sort (so that capitals and lower-case are considered the same).


Mr. Jaggs:

Never thought of replacing all spaces with a more manageable character.


