Breakthrough Motor!

I recently found this article: The Techno Maestro’s Amazing Machine, which seems like the coolest thing since sliced bread. A quick summary:

A fellow in Japan named Minato has invented an electric motor that is 80% efficient. As a demonstration of what this means, a standard washer/drier that normally needs a 220 watt outlet can run on 16 watts (that’s a 35kg motor).

Yes, I know it’s impossible. It’s still cool. :)

Posted by Kyle Wheeler on April 17, 2004 4:03 PM | Permalink | Comments (0)

In Defense of Macs

Everybody knows that Macs are more expensive than PCs. Except Paul Murphy, who begs to differ. He says:

At the low end, therefore, the PC desktops are marginally less expensive than the Macs—if you can do without their connectivity and multimedia capabilities—and considerably more expensive if you can’t. At the very high end, however, all of the design focus is on multimedia processing and the PCs simply aren’t competitive from either hardware or cost perspectives.

But, as he says, the mitigating factor that comes up in most people’s minds is:

The PC community response is, first, that the multimedia features distinguishing the Mac aren’t necessary and, secondly, that the PC is so far ahead of the Mac on speed that the comparisons are pointless anyway.

Paul Murphy discusses this thing as well. (Hint: Macs Kick PC Ass)

Posted by Kyle Wheeler on October 9, 2004 6:57 PM | Permalink | Comments (0) | TrackBacks (0)

Predicting the Future

I just found this link off of slashdot. The summary is this: a scientist has developed a machine for generating random numbers. This machine appears to be able to be influenced by human thought, and appears to be able to predict (when several are aggregated from around the world) catastrophes several hours before they happen.

I, for one, would love to know how these things work. Random number generation is kinda difficult, and the first of these devices supposedly was developed back in the 70’s. I say: show me the money. But still, the article (mostly fluff, really) claims they have the power of statistics and a lot of respected mathematicians and other scientists behind them. I’m skeptical, but… what a fascinating concept, if true.

Posted by Kyle Wheeler on February 13, 2005 12:10 PM | Permalink | Comments (0) | TrackBacks (0)

Aluminum

Factoids from the History Channel:

- 3rd most common element in the Earth’s crust
- usually found as aluminum oxide, and very very VERY hard to separate from the oxygen, thus:
- used to be more expensive than gold
- the Washington Monument was capped with aluminum, because at the time, it was one of the most expensive metals we could lay hands on, but not any more
- how to separate from oxide: dissolve it in molten cryolite, and run GIGANTIC amounts of electricity through it

Posted by Kyle Wheeler on May 12, 2005 12:51 AM | Permalink | Comments (0) | TrackBacks (0)

History of Childhood

I just thought this take on childhood was fascinating. It talks about the historical context of the concept of childhood. For example:

It is around this same time that another change was occurring - this one a shift in attitude. In the early seventeenth century, there did not seem to be the same emphasis on sexual decency and propriety that we see in later years (Aries, 1962). People were not shocked by coarse jokes and sexual games with children (e.g., attempting to grab a child’s genitals). Children under the age of puberty were simply not considered aware of sexual matters. Nobody thought that sexual references could corrupt a child’s innocence because the idea of childish innocence did not yet exist (Aries, 1962).

Posted by Kyle Wheeler on June 6, 2005 2:45 AM | Permalink | Comments (0) | TrackBacks (0)

A Spam Idea

Something I’ve been noticing (and this obviously isn’t a serious problem on a system as small as the one I administer) is that spam bounces frequently fill up the queue.

What happens is that some spammer sends spam to one of my legitimate users (illegitimate users are rejected right away, which may make gathering legitimate names easy, but avoids illegitimate bounces). Some of my legitimate users have their mail forwarded to verizon.com, which enjoys refusing to accept email. Of course, my policy on MY email server is to accept all mail: my spam filters (e.g. spamassassin) can be wrong and therefore are only advisory. When verizon refuses to accept the mail, my mail server is stuck holding the bag: I can’t deliver it, and I usually can’t bounce it either. I’d like to just dispose of it, but my queue lifetime is 7 days, so I have to keep it in the queue for 7 days while qmail realizes that verizon is never going to accept it, and the return address doesn’t exist.

So here’s my thought: have two qmail installs, one with a queue lifetime of 7 days, one with a queue lifetime of 1 day. Then, put a script in the QMAILQUEUE chain that decides which qmail-queue to use (the 7-day one or the 1-day one) based on whatever I want (e.g. the X-Spam-Score header, or the $RECIPIENT environment variable).
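
A minimal sketch of what such a chooser script might look like. qmail-queue reads the message on descriptor 0 and the envelope on descriptor 1, so the wrapper has to spool the message to a file, peek at its headers, and then hand both descriptors to whichever qmail-queue it picked. The second install's location (/var/qmail-short), the cutoff of 5, and the function names are all assumptions made up for illustration, and only the integer part of X-Spam-Score is compared.

```bash
#!/bin/bash
# Hypothetical locations: the stock install (7-day queue lifetime) and a
# second install configured with a 1-day lifetime.
LONG_QUEUE="${LONG_QUEUE:-/var/qmail/bin/qmail-queue}"
SHORT_QUEUE="${SHORT_QUEUE:-/var/qmail-short/bin/qmail-queue}"
SPAM_THRESHOLD="${SPAM_THRESHOLD:-5}"   # assumed cutoff, integer

# Print the qmail-queue binary to use for the message stored in file $1.
function choose_queue \
{
    local score
    # Stop at the first blank line so only real headers are considered,
    # and keep just the integer part of the first X-Spam-Score found.
    score=$(sed -n -e '/^$/q' \
        -e 's/^X-Spam-Score:[[:space:]]*\(-\{0,1\}[0-9]\{1,\}\).*/\1/p' \
        "$1" | head -n 1)
    if [[ -n "$score" && "$score" -ge "$SPAM_THRESHOLD" ]]; then
        echo "$SHORT_QUEUE"
    else
        echo "$LONG_QUEUE"
    fi
}

# Spool the message (fd 0), pick a queue, then exec it with the spooled
# copy as its fd 0. The envelope on fd 1 is never touched here.
function queue_wrapper \
{
    local tmp queue
    tmp=$(mktemp) || return 1
    cat > "$tmp" || return 1
    queue=$(choose_queue "$tmp")
    exec < "$tmp"      # stdin now reads the spooled message from the start
    rm -f "$tmp"       # the open descriptor keeps the data alive
    exec "$queue"
}

# As the last line of the installed script:
# queue_wrapper
```

With the QMAILQUEUE patch applied, the idea would be to point QMAILQUEUE at this script so that likely spam lands in the short-lived queue and everything else goes through the normal one.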

Probably won’t happen on the WOPR here, because, of course, people are skittish about that kind of fiddling (read: it’s a production machine), which means it’ll basically never get done. Oh well - but it was a thought.

Posted by Kyle Wheeler on June 27, 2005 4:40 PM | Permalink | Comments (0) | TrackBacks (0)

Fluffy

Posted by Kyle Wheeler on July 25, 2005 1:58 AM | Permalink | Comments (0) | TrackBacks (0)

The Big One

She Said YES!

or, to put it another way…

Emily and I Are Getting Married!

So, last Friday (the 29th of July), at 11pm, I said to myself “why am I still in South Bend? I could be proposing to Emily right now.” Being unable to come up with a satisfactory answer, I threw a few things in a bag, hopped in my car, and drove through the night, stopping only at McDonald’s and gas stations. (Oh, and I stopped at Meijer’s to pick up 2 dozen roses).

I arrived at Emily’s front door at 1:30 (2:30 her time) the next day (Saturday). She was planning on heading out to the library and then off to her friend Kate’s house to accompany her to a different friend’s birthday party. To keep her in her apartment until I could arrive, I enlisted the help of Rich to keep her online! I arrived seconds before she was going to leave. I knocked on the door, and presented a thoroughly flabbergasted Emily with 2 dozen roses. When we got inside, in the front hallway, I kneeled down, presented the ring, and asked her to marry me. She cried, and said yes! Then we went to her friend’s house.

The night I left, she had said over instant messenger that when I came and visited next (which was planned for two weeks time), we would need to “talk about where this relationship is headed”. After she said yes, I asked her if she was okay with where the relationship was headed. (She just smiled and cried a little more.)

So far, we’re thinking May of 2007. I’m leaning towards SB, but that far out, it may have to move.

Posted by Kyle Wheeler on August 2, 2005 8:52 AM | Permalink | Comments (0) | TrackBacks (0)

Really Hot

Posted by Kyle Wheeler on August 5, 2005 10:34 AM | Permalink | Comments (0) | TrackBacks (0)

Fonts

I found some ancient fonts I created once. Be a shame for them to be forgotten in the mists of time. So, here they are:

(Mac Win)

I also found my old “Poetry & Prose Archive”… it’s childish in places, but still was something I invested a lot of time in back in the day. Rather than let it fade, I’m making it live forever, here, unaltered from exactly how it was back in 1998. Let’s pretend it’s not embarrassing. :) By the way, I highly doubt any of the email addresses on that site still work.

Finally, an old MOD site I put together. No comment.

Posted by Kyle Wheeler on December 27, 2005 10:26 AM | Permalink | Comments (0) | TrackBacks (0)

Qmail Quickstarter

Well, it’s official: I am now a published book author! Of what do I speak? Qmail Quickstarter! Yep, seriously, that’s me. It’s even on Amazon!

This has been a kinda under-ground writing project for me so far. I haven’t told a lot of people (including most of my friends) what I have been up to. I think originally I had some reasons for that; stuff like I was unsure whether it would actually pan out (several other qmail-related books have started and died in recent years, and I didn’t want to jinx this one), and I didn’t want to bother with answering questions like “How will you have time for that between writing your proposal, getting married, and doing work for Sandia?” (because of course the answer would have been “I haven’t a clue, but it’s an opportunity I can’t pass up!” — which isn’t a particularly good answer). In retrospect, though, I guess mostly it just felt like bragging.

This project got started back in… wow, apparently, back in August. I got an email out of the blue from a fellow named Nanda Padmanabhan, a Development Editor for Packt Publishing (a company I hadn’t heard of before) in England. He had apparently been subscribed to the qmail mailing list for a while, and wanted to know whether I’d be interested in contributing to a qmail book project that their company was planning on embarking on. Riiiiiiight, I thought. The Nigerian scammers have thought up a new one.

Doing a little research, it turns out that Packt is a real company (surprise!) and has been at least somewhat active in the nerdly fields before: they were mentioned on Slashdot for some coding competitions with cash prizes that they did.

So I exchanged a few emails with Nanda. They were interested in putting together a qmail book that would get a person (of unknown experience) up and running quickly (i.e. in around 150 pages or so). I suggested a few other folks I know that are very qmail-savvy, but was curious why they were interested. After all, there’s several other books out there, including the exhaustive The qmail Handbook by Dave Sill, O’Reilly’s book on the subject by John Levine, and even another (less well-regarded by the qmail community) book by Richard Blum. In fact, on the qmail mailing list, people are often referred to Dave Sill’s book and online reference, http://www.lifewithqmail.org, because they are so thorough and well-done. Who is not being served by these books? Like any good psychologist, Nanda turned that question to me: what do I think might be missing?

My answer was a book with a better sense of the “why” of qmail (i.e. what is qmail’s internal architecture, why is it that way, and how can you use this knowledge?), and examples of the cool tricks you can do with qmail once you understand what’s going on. I had the thought that maybe a book titled “Things I Wish I’d Known About Qmail” or something like that would fit the bill. Dave’s book gives you all you ever wanted to know about the “what” and the “how” of qmail, and is more like a book from the “missing manual” series of books (by Pogue). Nanda also said that people are a bit turned off by the sheer volume of some of these books. They want something quick and light and useful, rather than an exhaustive catalog of every config file and option that qmail has.

This, of course, then spiraled into me brainstorming an outline for what I wished a book would look like, which turned into an official contract shockingly fast (well, shocking for me; it was in October when I got the official contract).

And thus, I began writing. Mostly I packed writing into the weekends, when I could spend all day writing and editing and not be bothered by anyone or anything. This made for a pretty tight schedule. The publisher wanted chapters ASAP (of course), but in general, I got about three weeks for each chapter (according to the contract)… which, because of my goofy schedule of visiting Emily, going to weddings, and holidays, typically meant I got a single weekend to pound out each chapter. Thankfully that was just for the first draft; I got a little more time to race through doing all the editing… but not a lot!

Finally, now, it is done! And you can buy it! (Please do! Buy several!)

I really could not be prouder.

Time to go get married! :D

Posted by Kyle Wheeler on June 17, 2007 9:48 PM | Permalink | Comments (2) | TrackBacks (0)

Book Reviews

I’ve been stumbling across a few reviews of my book! Here they are:

Free Software Magazine

I liked many parts of the book: firstly, the diagrams dividing down the underlying responsibilities in a crisp no nonsense approach as exampled by the queuing diagram on page 39. Secondly, the book is not verbose and does what it has to do with no extra embellishments. For a busy system administrator the book is thus more viable than a 500-page manual. Thirdly, I enjoyed the discussion of the Sender Policy Framework (SPF) and also DomainKeys contained within the pages 93-97. Finally, the mentioning of Silly Qmail Syndrome (page 132) and a patch solution should for some of you short cut a degree of pain and potential embarrassment.

AdminSpotting

qmail is a great email system. It’s been around forever — the latest stable version is almost 10 years old. I considered The qmail Handbook to be the only quality book on qmail available. However, it’s now going to have to share the spotlight with the release of Qmail Quickstarter.

As mentioned before, the latest qmail release is almost 10 years old. A lot has happened with email administration since then — for instance with security and filtering. Not only has the qmail community adapted to these changes, but they’re also all covered in Qmail Quickstarter. This book also does a great job at describing the unique architecture to qmail. qmail takes the age-old Unix approach of using small programs to do one thing well. Each of the pieces that make up qmail are explained and illustrated here.

Though I haven’t used qmail in years, I’m glad to see that it hasn’t fallen behind in the world of email servers. On the contrary, it’s alive and well and this book does a great job showing that off. I give it a 10/10.

There are others… one was submitted to slashdot but appears to have been rejected (I may post it in its entirety later; the author sent me a copy).

Posted by Kyle Wheeler on July 7, 2007 9:33 AM | Permalink | Comments (0) | TrackBacks (0)

Another Review!

I just got a link to another review of my book! This time, by a fellow named James Craig Burley, who also frequents the qmail mailing list. Here’s his intro and conclusion:

Email servers, also called Mail Transfer Agents (MTAs), today do much of the heavy lifting required to transport email from sender to recipient, ideally without the sender or recipient being particularly aware of them. They store (queue) incoming email, then forward it to user’s mailboxes, sometimes via other email servers, while often trying to avoid accepting or sending out spam or viruses. They also allow users to read email waiting for them in their mailboxes. qmail is a popular email server for Unix-based systems; “Qmail Quickstarter: Install, Set Up, and Run your own Email Server” introduces the reader to qmail as an email-delivery architecture that provides the building blocks for an email server.

“Qmail Quickstarter” is a good book for anyone wanting to come up to speed on qmail, whether as their email server of choice or as a means to better choose among the many email servers available. I give this book a 7 out of 10.

Posted by Kyle Wheeler on July 10, 2007 7:44 PM | Permalink | Comments (0) | TrackBacks (0)

I wish I had a C-based lock-free hash table...

I recently stumbled across a Google tech talk video by a man named Cliff Click Jr. He works for a company named Azul, and he has a blog here. This tech talk was all about a lock-free hash table that he’d invented for Azul. The video is here.

Lock free hash table? Heck yeah I want a lock free hash table!

Sadly, Cliff’s implementation is Java-only, and relies on some Java memory semantics, but it’s on sourceforge if anyone’s interested.

So, as I began reading up on the subject, I discovered that he’s not the only one interested. In fact, there’s another fellow who has a C-based library here. Only problem? IT’S NOT ACTUALLY LOCK FREE!!! At least, not yet. At the moment it’s a pthread/mutex-based hash table that happens to have all the pthreads stuff ifdef’d out (joy). There are other people out there who talk about it. A fellow from IBM named Maged M. Michael has a paper about how to do lock-free hash tables, and he even has a patent on his particular method, but no implementations appear to be available. Chris Purcell wrote a paper on the topic, which contains pseudocode, but yet again, no implementation.

So it would appear that if I want a lock-free hash table, I’m going to have to implement it myself. But boy, it gets me giddy just thinking about it. :) Pthreads, you’re going down!

Posted by Kyle Wheeler on September 20, 2007 2:46 PM | Permalink | Comments (3) | TrackBacks (0)

Concurrent Hash Table Tricks

So, I’m working on qthreads (which is open-sourced, but currently lacks a webpage), and thinking about its Unix implementation.

The Unix implementation emulates initialization-free synchronization (address locks and FEBs) by storing addresses in a hash table (well, okay, a striped hash table, but if we make the stripe 1, then it’s just a hash table). Let’s take the simplest: address locks. The semantics of the hash table at the moment are really pretty basic: if an address is in the hash, it’s locked. If it’s not in the hash, it’s not locked. The hash is the cp_hashtable from libcprops, a library which I appreciate greatly for giving C programmers easy access to high-quality basic data structures (I’ve contributed some significant code to it as well). Anyway, the downside of using this hash table is that it’s a bottleneck. The table is made thread-safe by simply wrapping it in a lock, and every operation (lock and unlock) requires locking the table to either insert an entry or remove an entry.

So how could we do this with a more concurrent hash table? I’ve seen two hash table APIs that are concurrent: the lock-free hash in Java that I talked about previously, and the concurrent_hash_map from Intel’s Threading Building Blocks library (which, given that it’s in C++, is something I can actually use).

The way the TBB hash works is that you can perform three basic operations on your hash: find(), insert(), and erase(). When you do either of the first two operations, you can lock that entry in the hash and prevent others from getting at it, or you can access it read-only. The erase function merely takes a key and removes it from the hash table, giving you no access to whatever might have been deleted from the hash table. Worse yet, you cannot erase something that you currently have a lock on, even if it’s a write lock!

Using this hash the way that I currently use the cprops hash is thus impossible. Why? Because erasing things from the TBB hash is ALWAYS a race condition. Put another way, all TBB hash erase operations are “blind erase” operations, when what you really want is “erase if it’s still in an erasable state”. You can never be certain that erasing an entry from the hash table is a good idea, because you can never be certain that someone else didn’t add something important to that entry in between the time that you decided the entry was erasable and the time you actually erased it. If I insert a value (to “lock” an address, say), I can associate that value with a queue of waiting threads (i.e. other threads that also want to lock that address), but I can never erase that entry in the hash table! The reason is that since I can’t erase something that I have access to (i.e. have a write-lock on), there’s a race condition between me fetching the contents of that hash table entry and me removing that entry from the hash table.

A different approach to this might be to simply never remove entries from the hash table, and to simply say that if the associated list of threads is empty (or NULL), then the lock is unlocked. That would work well, except for that tiny little problem of the hash table eternally growing and never reclaiming memory from unused entries. So, if I had an application that created lots of locks all over the place (i.e. inserted lots of different entries into the hash), but never had more than a handful locked (i.e. in the hash) at a time, I’d be wasting memory (and possibly, LOTS of it).

Is there another way to use such a hash table to implement locks more efficiently? I don’t know, but I don’t think so (I’d love to be proved wrong). Any way you slice it, you come back to the problem of deleting things that are in a deletable state, but not knowing if it’s safe to do so.

The Azul Java-only hash is an interesting hash that behaves differently. It is based upon compare-and-swap (CAS) atomic operations. Thus, for a given key, you can atomically read the contents of a value, but there’s no guarantee that that value isn’t changed the MOMENT you read it. Deleting an entry, in this case, means swapping a tombstone marker into place where the entry’s value is supposed to be, and the compare step (the C of the CAS) lets you skip the swap if that value changed before you got to it. Thus, after you’ve extracted the last thread that’d locked that address (i.e. you’ve set the value to NULL), you can avoid marking a thing as “deleted” when it has really just been re-locked: if the value changed to non-NULL (and the compare part of the CAS fails), you can simply ignore the failure and assume that whoever changed it knew what they were doing. Thus, you CAN safely delete elements from the hash table. Better still, it easily integrates with (and may even require) a lock-free CAS-based linked list for queueing blocked threads. (You may be saying to yourself “um, dude, a hash table entry with a tombstone as a value is still taking up memory”, and I say to you: yeah? so? they get trimmed out of the hash table whenever the hash table is resized, thereby being an awesome idea.)

And, as I think about it, forcing users to do blind erases makes Intel TBB hash tables ALMOST unusable for an entire class of problems and/or algorithms. That category of algorithms is any algorithm that needs to delete entries that could potentially be added back at any time. They really ought to provide an equivalent of a CAS: let the user say “delete this hash entry if the value is equal to this”.

I say “ALMOST unusable” because it’s fixable. Consider the ramifications of having control over the comparison and equivalence functions: a key can be associated with a “deletable” flag that provides much of the needed functionality. With such a flag, the result of any find() operation can be considered invalid not only if it returns false but also if the deletable flag associated with the result’s key is true. Essentially, finding something in the hash becomes:

while (hash.find(result, &findme) && result->first->deletable) {
    result.release(); // release() is a member of the accessor itself
}

It’s an extra layer of indirection, and can cause something to spin once or twice, but it works. Your comparison struct functions must then be something like this:

typedef struct evilptrabstraction {
    bool deletable;
    void * key;
} epa_s;

typedef epa_s * epa;

struct EPAHashCompare {
    static size_t hash(const epa &x) {
        return (size_t)x->key; // or a more complex hash
    }
    static bool equal (const epa &x, const epa &y) {
        if (x->deletable && y->deletable) return true;
        if (x->deletable || y->deletable) return false;
        return x->key == y->key;
    }
};

Note that anything marked deletable is equivalent, but doesn’t match anything non-deletable. Thus, safely deleting something becomes the following (assuming findme is a version of the epa struct not marked deletable):

accessor *result = new accessor();

bool found = hash.find(*result, &findme);
while (found && (*result)->first->deletable)  {
    result->release(); // result points at the accessor, which owns release()
    found = hash.find(*result, &findme);
}

if (found) {
    (*result)->first->deletable = true;
    delete result; // release the lock
    findme.deletable = true;
    hash.erase(&findme);
} else {
    delete result;
}

This opens the question of inserting safely, though, because during the insertion process, your inserted object might have already existed, and if it already existed, it may have been in the process of being deleted (i.e. it might have been marked as deleted). There’s the potential that your “freshly-inserted” item got marked deletable if it was already in the hash. So how do you insert safely?

bool inserted = hash.insert(result, insertme);
// !inserted means insertme was already in the hash
while (!inserted && result->first->deletable) {
    result.release();
    inserted = hash.insert(result, insertme);
}
if (!inserted) delete insertme;

Note that we can’t simply toggle the deletable mark, because an erase() operation may already be waiting for the hash value, and it doesn’t expect that the key for the item may have changed while it was waiting for the item to be locked (so changing the deletable flag won’t stop it from being erased). The downside, of course, is that popularly erased/re-inserted items may cause a fair bit of memory churn, but that’s unavoidable with the TBB’s bare-bones erase() interface.

Posted by Kyle Wheeler on October 9, 2007 2:50 PM | Permalink | Comments (3) | TrackBacks (0)

My Bashrc

There are few things that, over my time using Unix-like systems, I have put more cumulative effort into than into my configuration files. I've been tweaking them since the day I discovered them, attempting to make my environment more and more to my liking. I have posted them on my other website (here), but it occurred to me that they've gotten sufficiently hoary and complex that a walkthrough might help someone other than myself.

Anyway, my bashrc is first on the list. (Or, if you like, the pure text version.)

The file is divided into several (kinda fuzzy) sections:
- Initialization & Other Setup
- Useful Functions
- Loading System-wide Bashrc
- Behavioral Settings
- Environment Variables
- Character Set Detection
- Aliases
- Tab-completion Options
- Machine-local settings
- Auto-logout

Let's take them one at a time.

Initialization & Other Setup

Throughout my bashrc, I use a function I define here ( dprint ) to allow me to quickly turn on debugging information, which includes printing the seconds-since-bash-started variable ( SECONDS ) in case something is taking too long and you want to find the culprit. Yes, my bashrc has a debug mode. This is essentially controlled by the KBWDEBUG environment variable. Then, because this has come in useful once or twice, I allow myself to optionally create a ~/.bashrc.local.preload file which is sourced now, before anything else. Here's the code:

KBWDEBUG=${KBWDEBUG:-no}

function dprint {
if [[ "$KBWDEBUG" == "yes" && "$-" == *i* ]]; then
    #date "+%H:%M:%S $*"
    echo "$SECONDS $*"
fi
}
dprint alive
if [ -r "${HOME}/.bashrc.local.preload" ]; then
    dprint "Loading bashrc preload"
    source "${HOME}/.bashrc.local.preload"
fi

Useful Functions

This section started with some simple functions for PATH manipulation. Then those functions got a little more complicated, then I wanted some extra functions for keeping track of my config files (which were now in CVS), and then they got more complicated...

You'll notice something about these functions. Bash (these days) will accept function declarations in this form:

function fname()
{
    do stuff
}

But that wasn't always the case. To maintain compatibility with older bash versions, I avoid using the uselessly cosmetic parens and I make sure that the curly-braces are on the same line (logically, at least: the backslash continues the line), like so:

function fname \
{
    do stuff
}

Anyway, the path manipulation functions are pretty typical — they're similar to the ones that Fink uses, but slightly more elegant. The idea is based on these rules of PATH variables:

- Paths must not have duplicate entries
- Paths are faster if they don't have symlinks in them
- Paths must not have "." in them
- All entries in a path must exist (usually)
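
To make those rules concrete, here is a small checker sketch that reports which of them a given path value breaks. It is not part of my bashrc; the name path_problems and the message wording are hypothetical, and it takes the path's value rather than the variable's name.

```bash
# Hypothetical helper: print one line per rule violation found in the
# colon-separated path value given as $1.
function path_problems \
{
    [[ -n "$1" ]] || return 0
    local rest="$1:" entry seen=":"
    while [[ -n "$rest" ]]; do
        entry="${rest%%:*}"     # everything before the first colon
        rest="${rest#*:}"       # everything after it
        if [[ -z "$entry" || "$entry" == "." ]]; then
            echo "current directory in path"
        elif [[ "$seen" == *":$entry:"* ]]; then
            echo "duplicate: $entry"
        elif [[ ! -d "$entry" ]]; then
            echo "missing: $entry"
        elif [[ "$entry" != "$( \cd "$entry" 2>/dev/null && \pwd -P )" ]]; then
            echo "unresolved or inaccessible: $entry"
        fi
        seen="$seen$entry:"
    done
}
```

Something like `path_problems "$PATH"` on a freshly inherited environment gives a quick idea of how much cleanup there is to do.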

There are two basic path manipulation functions: add_to_path and add_to_path_first. They do predictable things — the former appends something to a given path variable (e.g. PATH or MANPATH or LD_LIBRARY_PATH ) unless it's already in that path, and the latter function prepends something to the given PATH variable (or, if it's already in there, moves it to the beginning). Before they add a value to a path, they first check it to make sure it exists, is readable, that I can execute things that are inside it, and they resolve any symlinks in that path (more on that in a moment). Here's the code (ignore the reference to add_to_path_force in add_to_path for now; I'll explain shortly):

function add_to_path \
{
    local folder="${2%%/}"
    [ -d "$folder" -a -x "$folder" ] || return
    folder=`( cd "$folder" ; \pwd -P )`
    add_to_path_force "$1" "$folder"
}

function add_to_path_first \
{
    local folder="${2%%/}"
    [ -d "$folder" -a -x "$folder" ] || return
    folder=`( cd "$folder" ; \pwd -P )`
    # no path yet (tested first: an empty value also passes the -z tests below)
    if eval '[[' -z "\"\$$1\"" ']]'; then
        eval "$1=\"$folder\""
    # in the middle, move to front
    elif eval '[[' -z "\"\${$1##*:$folder:*}\"" ']]'; then
        eval "$1=\"$folder:\${$1//:\$folder:/:}\""
    # at the end, move to front
    elif eval '[[' -z "\"\${$1%%*:\$folder}\"" ']]'; then
        eval "$1=\"$folder:\${$1%%:\$folder}\""
    # not in the path at all
    elif ! eval '[[' -z "\"\${$1##\$folder:*}\"" '||' \
      "\"\$$1\"" '==' "\"$folder\"" ']]'; then
        eval "export $1=\"$folder:\$$1\""
    fi
}
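
The eval lines are dense mostly because the variable's name arrives as $1. Stripped of that indirection, each test is just a prefix or suffix removal that leaves an empty string when the folder sits at a particular position. Here is a sketch of the same tests on a plain value; where_in_path is a hypothetical name used only for this illustration.

```bash
# Hypothetical illustration: report where folder $2 sits in path value $1.
function where_in_path \
{
    local path="$1" folder="$2"
    if [[ -z "$path" ]]; then
        echo empty
    elif [[ "$path" == "$folder" || -z "${path##$folder:*}" ]]; then
        echo first      # removing the prefix "folder:*" consumed everything
    elif [[ -z "${path##*:$folder:*}" ]]; then
        echo middle     # "*:folder:*" matched the whole value
    elif [[ -z "${path%%*:$folder}" ]]; then
        echo last       # removing the suffix "*:folder" consumed everything
    else
        echo absent
    fi
}

where_in_path "/bin:/usr/bin:/opt/bin" "/usr/bin"   # prints "middle"
```

When the folder is absent, none of the patterns match, the expansion returns the value unchanged, and the -z tests all fail.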

Then, because I was often logging into big multi-user Unix systems (particularly Solaris systems) with really UGLY PATH settings that had duplicate entries, often included ".", not to mention directories that either didn't exist or that I didn't have sufficient permissions to read, I added the function verify_path. All this function does is separates a path variable into its component pieces, eliminates ".", and then reconstructs the path using add_to_path, which handily takes care of duplicate and inaccessible entries. Here's that function:

function verify_path \
{
    # separating cmd out is stupid, but is compatible
    # with older, buggy, bash versions (2.05b.0(1)-release)
    local cmd="echo \$$1"
    local arg="`eval $cmd`"
    eval "$1=\"\""
    while [[ $arg == *:* ]] ; do
        dir="${arg%:${arg#*:}}"
        arg="${arg#*:}"
        if [ "$dir" != "." -a -d "$dir" -a \
          -x "$dir" -a -r "$dir" ] ; then
            dir=`( \cd "$dir" ; \pwd -P )`
            add_to_path "$1" "$dir"
        fi
    done
    if [ "$arg" != "." -a -d "$arg" -a -x "$arg" -a -r "$arg" ] ;
    then
        arg=`( cd "$arg" ; \pwd -P )`
        add_to_path "$1" "$arg"
    fi
}

Finally, I discovered XFILESEARCHPATH — a path variable that requires a strange sort of markup (it's for defining where your app-defaults files are for X applications). This wouldn't work for add_to_path, so I created add_to_path_force that still did duplicate checking but didn't do any verification of the things added to the path.

function add_to_path_force \
{
    if eval '[[' -z "\$$1" ']]'; then
        eval "export $1='$2'"
    elif ! eval '[[' \
        -z "\"\${$1##*:\$2:*}\"" '||' \
        -z "\"\${$1%%*:\$2}\"" '||' \
        -z "\"\${$1##\$2:*}\"" '||' \
        "\"\${$1}\"" '==' "\"$2\"" ']]'; then
        eval "export $1=\"\$$1:$2\""
    fi
}

I mentioned that I resolved symlinks before adding directories to path variables. This is a neat trick I discovered due to the existence of pwd -P and subshells. pwd -P will return the "real" path to the folder you're in, with all symlinks resolved. And it does so very efficiently (without actually resolving symlinks — it just follows all the ".." records). Since you can change directories in a subshell (i.e. between parentheses) without affecting the parent shell, a quick way to transform a folder's path into a resolved path is this: ( \cd "$folder"; pwd -P). I put the backslash in there to use the shell's builtin cd, just in case I'd somehow lost my mind and aliased cd to something else.
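
Pulled out on its own, the trick looks something like this. The function name resolve_dir is hypothetical (my bashrc inlines the subshell instead), and this is just a sketch of the idea.

```bash
# Hypothetical wrapper around the subshell trick: print the symlink-free
# path of directory $1 without changing the caller's working directory.
function resolve_dir \
{
    [ -d "$1" -a -x "$1" ] || return 1
    ( \cd "$1" && \pwd -P )
}
```

Because the cd happens inside the parentheses, the calling shell's $PWD is untouched when the function returns.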

And then, just because it was convenient, I added another function: have, which detects whether a binary is accessible or not:

function have { type "$1" &>/dev/null ; }

Then I had to confront file paths, such as the MAILCAPS variable. A lot of the same logic (i.e. add_to_path_force), but entry validation is different:

function add_to_path_file \
{
    local file="${2}"
    [ -f "$file" -a -r "$file" ] || return
    # realpath alias may not be set up yet
    file=`realpath_func "$file"`
    add_to_path_force "$1" "$file"
}

You'll note the realpath_func line in there. realpath is a program that takes a filename or directory name and resolves the symlinks in it. Unfortunately, realpath is a slightly unusual program; I've only ever found it on OSX (it may be on other BSDs). But, with the power of my pwd -P trick, I can fake most of it. The last little piece (resolving a file symlink) relies on a tool called readlink ... but I can fake that too. Here are the two functions:

function readlink_func \
{
    if have readlink ; then
        readlink "$1"
    #elif have perl ; then # seems slower than alternative
    #    perl -e 'print readlink("'"$1"'") . "\n"'
    else
        \ls -l "$1" | sed 's/[^>]*-> //'
    fi
}

function realpath_func \
{
    local input="${1}"
    local output="/"
    if [ -d "$input" -a -x "$input" ] ; then
        # All too easy...
        output=`( cd "$input"; \pwd -P )`
    else
        # sane-itize the input to the containing folder
        input="${input%%/}"
        local fname="${input##*/}"
        input="${input%/*}"
        if [ ! -d "$input" -o ! -x "$input" ] ; then
            echo "$input is not an accessible directory" >&2
            return
        fi
        output="`( cd "$input" ; \pwd -P )`/"
        input="$fname"
        # output is now the realpath of the containing folder
        # so all we have to do is handle the fname (aka "input")
        if [ ! -L "$output$input" ] ; then
            output="$output$input"
        else
            input="`readlink_func "$output$input"`"
            while [ "$input" ] ; do
                if [[ $input == /* ]] ; then
                    output="$input"
                    input=""
                elif [[ $input == ../* ]] ; then
                    output="${output%/*/}/"
                    input="${input#../}"
                elif [[ $input == ./* ]] ; then
                    input="${input#./}"
                elif [[ $input == */* ]] ; then
                    output="$output${input%${input#*/}}"
                    input="${input#*/}"
                else
                    output="$output$input"
                    input=""
                fi
                if [ -L "${output%%/}" ] ; then
                    if [ "$input" ] ; then
                        input="`readlink_func "${output%%/}"`/$input"
                    else
                        input="`readlink_func "${output%%/}"`"
                    fi
                    output="${output%%/}"
                    output="${output%/*}/"
                fi
            done
        fi
    fi
    echo "${output%%/}"
}
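
For comparison, here is a much shorter sketch of the same idea. It is not a replacement for realpath_func: it assumes readlink, dirname, and basename are all available, and it gives up after 40 hops rather than detecting loops properly. The name resolve_file is made up for the example.

```shell
# Hypothetical, simplified resolver: handles a file reached through
# symlinked directories and through a chain of file symlinks.
resolve_file() {
    _path=$1
    _hops=0
    while [ "$_hops" -lt 40 ]; do              # crude guard against loops
        # pwd -P gives the physical path of the containing folder
        _dir=$( \cd "$(dirname "$_path")" && pwd -P ) || return 1
        _path="$_dir/$(basename "$_path")"
        [ -L "$_path" ] || break               # not a link: finished
        _target=$(readlink "$_path")
        case $_target in
            /*) _path=$_target ;;              # absolute target
            *)  _path="$_dir/$_target" ;;      # relative to the link's folder
        esac
        _hops=$((_hops + 1))
    done
    echo "$_path"
}

tmp=$(mktemp -d)
mkdir "$tmp/real"
touch "$tmp/real/file"
ln -s real "$tmp/dirlink"                      # directory symlink
ln -s file "$tmp/real/alias"                   # file symlink, relative target

resolved=$(resolve_file "$tmp/dirlink/alias")
physical=$( \cd "$tmp/real" && pwd -P )
echo "$resolved"
rm -rf "$tmp"
```

The result is the physical path of real followed by /file, because the directory link and the file link are both followed.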

Loading System-wide Bashrc

This section isn't too exciting. According to the man page:

When bash is invoked as an interactive login shell, or as a non-interactive shell with the --login option, it first reads and executes commands from the file /etc/profile, if that file exists. After reading that file, it looks for ~/.bash_profile, ~/.bash_login, and ~/.profile, in that order, and reads and executes commands from the first one that exists and is readable.

SOME systems have a version of bash that appears not to obey this rule. And some systems put crucial configuration settings in /etc/bashrc (why?!?). And some systems even do something silly like use /etc/bashrc to source ~/.bashrc (I did this myself, once upon a time, when I knew not-so-much). I've decided that this behavior cannot be relied upon, so I explicitly source these files myself. The only interesting bit is that I added a workaround so that systems that use /etc/bashrc to source ~/.bashrc won't get into an infinite loop. There's probably a lot more potential trouble here that I'm ignoring. But here's the code:

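if [[ -r /etc/bashrc && $SYSTEM_BASHRC != 1 ]]; then
    dprint " - loading /etc/bashrc"
    # set the guard first, so that an /etc/bashrc that sources
    # ~/.bashrc doesn't send us around in a loop
    export SYSTEM_BASHRC=1
    . /etc/bashrc
fi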
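
A guard variable only breaks the cycle if it is set before the inner file is read. The sketch below shows that with two temporary stand-in files (system_rc and user_rc are invented names, as are DEMO_SYSTEM_RC and LOAD_COUNT); it is a demonstration of the guard pattern, not a copy of any real /etc/bashrc.

```shell
tmp=$(mktemp -d)

# A stand-in for /etc/bashrc that (unwisely) sources the user's file.
cat > "$tmp/system_rc" <<EOF
. "$tmp/user_rc"
EOF

# A stand-in for ~/.bashrc, with the guard set before the source call.
cat > "$tmp/user_rc" <<EOF
LOAD_COUNT=\$((LOAD_COUNT + 1))
if [ "\$DEMO_SYSTEM_RC" != 1 ]; then
    DEMO_SYSTEM_RC=1
    . "$tmp/system_rc"
fi
EOF

LOAD_COUNT=0
unset DEMO_SYSTEM_RC
. "$tmp/user_rc"
echo "user_rc was read $LOAD_COUNT times"
rm -rf "$tmp"
```

The user file is read twice (once directly, once through the system file) and then the guard stops the recursion.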

Behavioral Settings

This is basic stuff, but after you get used to certain behaviors (such as whether * matches . and ..), you often get surprised when they don't work that way on other systems. Some of this is because I found a system that did it another way by default; some is because I decided I like my defaults and I don't want to be surprised in the future.

The interactive-shell-detection here is nice. $- is a variable set by bash containing a set of letters indicating certain settings. It always contains the letter i if bash is running interactively. So far, this has been quite backwards-compatible.

shopt -s extglob # Fancy patterns, e.g. +()
# only interactive
if [[ $- == *i* ]]; then
    dprint setting the really spiffy stuff
    shopt -s checkwinsize # don't get confused by resizing
    shopt -s checkhash # if hash is broken, doublecheck it
    shopt -s cdspell # be tolerant of cd spelling mistakes
fi
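
The same $- test can be wrapped in a function, which reads a little better when several parts of a startup file need it. A minimal sketch (is_interactive is a made-up name) using case rather than [[ ]], so it should also work in plainer shells:

```shell
# Report whether the shell reading this code is interactive.
is_interactive() {
    case $- in
        *i*) return 0 ;;
        *)   return 1 ;;
    esac
}

if is_interactive; then
    echo "interactive: safe to set prompts and key bindings"
else
    echo "non-interactive: skip the cosmetic settings"
fi
```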

Environment Variables

There are a slew of standard environment variables that bash defines for you (such as HOSTNAME). There are even more standard environment variables that various programs pay attention to (such as EDITOR and PAGER). And there are a few others that are program-specific (such as PARINIT and CVSROOT).

Before I get going, though, let me show you a secret. Ssh doesn't like transmitting information from client to server shell... the only reliable way to do it that I've found is the TERM variable. So... I smuggle info through that way, delimited by colons. Before I set any other environment variables, first, I find my smuggled information:

if [[ $TERM == *:* && ( $SSH_CLIENT || $SSH_TTY || $SSH_CLIENT2 ) ]] ; then
    dprint "Smuggled information through the TERM variable!"
    term_smuggling=( ${TERM//:/ } )
    export SSH_LANG=${term_smuggling[1]}
    TERM=${term_smuggling[0]}
    unset term_smuggling
fi
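
To see the unpacking in isolation, here is a sketch that splits a sample smuggled value with prefix and suffix stripping instead of the array used above. The sample string is invented; the stripping approach is a swapped-in alternative that avoids word splitting and keeps everything after the first colon together.

```shell
# Split a smuggled value such as "xterm-256color:en_US.UTF-8" into the
# real terminal type and the language tacked on after the colon.
smuggled="xterm-256color:en_US.UTF-8"

real_term=${smuggled%%:*}     # everything before the first colon
ssh_lang=${smuggled#*:}       # everything after the first colon

echo "TERM=$real_term"
echo "SSH_LANG=$ssh_lang"
```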

I begin by setting GROUPNAME and USER in a standard way:

if [[ $OSTYPE == solaris* ]] ; then
    idout=(`/bin/id -a`)
    USER="${idout[0]%%\)*}"
    USER="${USER##*\(}"
    [[ $USER == ${idout[0]} ]] && USER="UnknownUser"
    GROUPNAME="UnknownGroup"
    unset idout
else
    [[ -z $GROUPNAME ]] && GROUPNAME="`id -gn`"
    [[ -z $USER ]] && USER="`id -un`"
fi
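
The two-step stripping in the Solaris branch is easier to follow on a fixed string. This sketch uses a made-up sample of the first field that id prints, rather than calling id:

```shell
# Sample first field of id's output; a real run would capture it from id.
sample="uid=1001(kyle)"

user=${sample%%\)*}     # drop the closing parenthesis and anything after it
user=${user##*\(}       # drop everything up to the opening parenthesis
# If nothing was stripped, there were no parentheses to find.
[ "$user" = "$sample" ] && user="UnknownUser"

echo "$user"
```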

Then some standard things (MAILPATH is used by bash to check for mail, that kind of thing), including creating OS_VER and HOST to allow me to identify the system I'm running on:

# I tote my own terminfo files around with me
[ -d ~/.terminfo ] && export TERMINFO=~/.terminfo/
[ "$TERM_PROGRAM" == "Apple_Terminal" ] && \
    export TERM=nsterm-16color

MAILPATH=""
MAILCHECK=30
add_to_path_file MAILPATH /var/spool/mail/$USER
add_to_path MAILPATH $HOME/Maildir/
[[ -z $MAILPATH ]] && unset MAILCHECK
[[ -z $HOSTNAME ]] && \
    export HOSTNAME=`/bin/hostname` && echo 'Fake Bash!'
HISTSIZE=1000
HOST=${OSTYPE%%[[:digit:]]*}
OS_VER=${OSTYPE#$HOST}
[ -z "$OS_VER" ] && OS_VER=$( uname -r )
OS_VER=(${OS_VER//./ })
TTY=`tty`
PARINIT="rTbgq B=.,?_A_a P=_s Q=>|}+"

export USER GROUPNAME MAILPATH HISTSIZE OS_VER HOST TTY PARINIT
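
As an illustration of how HOST and OS_VER fall out of OSTYPE, here is the same stripping applied to a sample value in the darwin style (the value is only an example, and the lowercase variable names are made up so nothing real gets overwritten):

```shell
# A sample OSTYPE-style value.
ostype="darwin9.0"

host=${ostype%%[[:digit:]]*}   # strip from the first digit onward
version=${ostype#"$host"}      # what remains is the version string
major=${version%%.*}           # leading component of the version

echo "host=$host version=$version major=$major"
```

When OSTYPE carries no digits at all (plain linux-gnu, for example) the version part comes out empty, which is why the code above falls back to uname -r.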

I've also gotten myself into trouble in the past with UMASK being set improperly, so it's worth setting manually. Additionally, to head off trouble, I make it hard to leave myself logged in as root on other people's systems accidentally:

if [[ $GROUPNAME == $USER && $UID -gt 99 ]]; then
    umask 002
else
    umask 022
fi

if [[ $USER == root ]] ; then
    [[ $SSH_CLIENT || $SSH_TTY || $SSH_CLIENT2  ]] && \
        export TMOUT=600 || export TMOUT=3600
fi

if [[ -z $INPUTRC && ! -r $HOME/.inputrc && -r /etc/inputrc ]];
then
    export INPUTRC=/etc/inputrc
fi

It is at this point that we should pause and load anything that was in /etc/profile, just in case it was left out (and, if it's in there, maybe it should override what we've done so far):

export BASHRCREAD=1

if [[ -r /etc/profile && -z $SYSTEM_PROFILE ]]; then
    dprint "- loading /etc/profile ... "
    . /etc/profile
    export SYSTEM_PROFILE=1
fi

Now I set my prompt (but only if this is an interactive shell). There are several details here (obviously). The first is that, if I'm logged into another system, I want to see how long I've been idle. So I include a timestamp whenever I'm logged into a remote system. I also added color to my prompt in two ways, which has been very useful. First, it changes the color of the $ at the end of the prompt to red if the last command didn't exit cleanly. Second, remote systems have yellow prompts, whenever I'm root I have a red prompt, and I created commands to flip between a few other colors (blue, purple, cyan, green, etc.) in case I find that useful to quickly distinguish between terminals. Anyway, here's the code:

if [[ $- == *i* ]]; then
    if [[ $TERM == xterm* || $OSTYPE == darwin* ]]; then
        # This puts the term information into the title
        PSterminfo='\[\e]2;\u@\h: \w\a\]'
    fi
    PSparts[3]='(\d \T)\n'
    PSparts[2]='[\u@\h \W]'
    PSparts[1]='\$ '
    PScolors[2]='\[\e[34m\]' # Blue
    PScolors[3]='\[\e[35m\]' # Purple
    PScolors[4]='\[\e[36m\]' # Cyan
    PScolors[5]='\[\e[32m\]' # Green
    PScolors[6]='\[\e[33m\]' # Yellow
    PScolors[100]='\[\e[31m\]' # Red, for a bad exit status
    PScolors[0]='\[\e[0m\]' # Reset
    if [[ $USER == root ]] ; then
        PScolors[1]='\[\e[31m\]' # Red
    elif [[ $SSH_CLIENT || $SSH_TTY || $SSH_CLIENT2 ]] ; then
        PScolors[1]="${PScolors[6]}" # yellow
        if [[ $HOSTNAME == marvin ]] ; then
            PScolors[1]="${PScolors[5]}" # green
        fi
    else
        unset PSparts[3]
        PScolors[1]=""
    fi
    function bashrc_genps {
        if [ "$1" -a "${PScolors[$1]}" ] ; then
            PSgood="$PSterminfo${PSparts[3]}${PScolors[$1]}${PSparts[2]}${PSparts[1]}${PScolors[0]}"
        else
            PSgood="$PSterminfo${PSparts[3]}${PSparts[2]}${PSparts[1]}"
        fi
        PSbad="$PSterminfo${PSparts[3]}${PScolors[$1]}${PSparts[2]}${PScolors[100]}${PSparts[1]}${PScolors[0]}"
    }
    bashrc_genps 1
    function safeprompt {
        export PS1='{\u@\h \W}\$ '
        unset PROMPT_COMMAND
    }
    alias stdprompt='bashrc_genps 1'
    alias blueprompt='bashrc_genps 2'
    alias purpleprompt='bashrc_genps 3'
    alias cyanprompt='bashrc_genps 4'
    alias greenprompt='bashrc_genps 5'
    alias whiteprompt='bashrc_genps'
    # this is executed before every prompt is displayed
    # it changes the prompt based on the preceding command
    export PROMPT_COMMAND='[ $? = 0 ] && PS1=$PSgood || PS1=$PSbad'
fi
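
The exit-status switch is the part worth seeing in isolation. In this sketch the two prompt strings are plain stand-ins (no colors or escapes), and eval plays the role that bash plays when it runs PROMPT_COMMAND before drawing each prompt:

```shell
PSgood='ok$ '
PSbad='FAILED$ '
# Same shape as the PROMPT_COMMAND above: pick a prompt from $?.
pick='[ $? = 0 ] && PS1=$PSgood || PS1=$PSbad'

false                 # a command that fails
eval "$pick"
after_fail=$PS1

true                  # a command that succeeds
eval "$pick"
after_ok=$PS1

echo "after a failure: $after_fail"
echo "after a success: $after_ok"
```

The string is evaluated right after the command of interest, so $? still holds that command's exit status when the test runs.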

Now I set up the various paths. Note that it doesn't matter if these paths don't exist; they'll be checked and ignored if they don't exist:

verify_path PATH
add_to_path PATH "/usr/local/sbin"
add_to_path PATH "/usr/local/teTeX/bin"
add_to_path PATH "/usr/X11R6/bin"
add_to_path PATH "$HOME/bin"
add_to_path_first PATH "/sbin"

add_to_path_first PATH "/bin"
add_to_path_first PATH "/usr/sbin"
add_to_path_first PATH "/opt/local/bin"
add_to_path_first PATH "/usr/local/bin"

if [[ $OSTYPE == darwin* ]] ; then
    add_to_path PATH "$HOME/.conf/darwincmds"

    # The XFILESEARCHPATH (for app-defaults and such)
    # is a wonky kind of path
    [ -d /opt/local/lib/X11/app-defaults/ ] && \
        add_to_path_force XFILESEARCHPATH \
            /opt/local/lib/X11/%T/%N
    [ -d /sw/etc/app-defaults/ ] && \
        add_to_path_force XFILESEARCHPATH /sw/etc/%T/%N
    add_to_path_force XFILESEARCHPATH /private/etc/X11/%T/%N
fi

verify_path MANPATH
add_to_path MANPATH "/usr/man"
add_to_path MANPATH "/usr/share/man"
add_to_path MANPATH "/usr/X11R6/man"
add_to_path_first MANPATH "/opt/local/share/man"
add_to_path_first MANPATH "/opt/local/man"
add_to_path_first MANPATH "/usr/local/man"
add_to_path_first MANPATH "/usr/local/share/man"

verify_path INFOPATH
add_to_path INFOPATH "/usr/share/info"
add_to_path INFOPATH "/opt/local/share/info"

And now there are STILL MORE environment variables to set. This final group may rely on some of the previous paths being set (most notably, PATH).

export PAGER='less'
have vim && export EDITOR='vim' || export EDITOR='vi'
if [[ -z $DISPLAY && $OSTYPE == darwin* ]]; then
    processes=`ps ax`
    # there are double-equals here, even though they don't show
    # on the webpage
    if [[ $processes == *xinit* || $processes == *quartz-wm* ]]; then
        export DISPLAY=:0
    else
        unset DISPLAY
    fi
fi
if [[ $HOSTNAME == wizard ]] ; then
    dprint Wizards X forwarding is broken
    unset DISPLAY
fi
export TZ="US/Central"
if [ "${BASH_VERSINFO[0]}" -le 2 ]; then
    export HISTCONTROL=ignoreboth
else
    export HISTCONTROL="ignorespace:erasedups"
fi
export HISTIGNORE="&:ls:[bf]g:exit"
export GLOBIGNORE=".:.."
export CVSROOT=kyle@cvs.memoryhole.net:/home/kyle/cvsroot
export CVS_RSH=ssh
export BASH_ENV=$HOME/.bashrc
add_to_path_file MAILCAPS $HOME/.mailcap
add_to_path_file MAILCAPS /etc/mailcap
add_to_path_file MAILCAPS /usr/etc/mailcap
add_to_path_file MAILCAPS /usr/local/etc/mailcap
export EMAIL='kyle-envariable@memoryhole.net'
export GPG_TTY=$TTY
export RSYNC_RSH="ssh -2 -c arcfour -o Compression=no -x"
if [ -d /opt/local/include -a -d /opt/local/lib ] ; then
    export CPPFLAGS="-I/opt/local/include $CPPFLAGS"
    export LDFLAGS="-L/opt/local/lib $LDFLAGS"
fi
if have glibtoolize ; then
    have libtoolize || export LIBTOOLIZE=glibtoolize
fi

One little detail that I rather like is having xterm's window title tell me exactly which user I am and which machine I'm on, particularly when I'm ssh'd into another host. This little bit of code ensures that this happens:

if [[ $TERM == xterm* || $OSTYPE == darwin* ]]; then
    export PROMPT_COMMAND='echo -ne "\033]0;${USER}@${HOSTNAME/.*/}: ${PWD/${HOME}/~}\007"'
else
    unset PROMPT_COMMAND
fi

Character Set Detection

I typically work in a UTF-8 environment. MacOS X (my preferred platform for day-to-day stuff) has made this pretty easy with really excellent UTF-8 support, and Linux has come a long way (to parity, as far as I can tell) in the last few years. Most of my computing is done via a uxterm (aka. xterm with UTF-8 capability turned on), but I also occasionally dabble in other terminals (sometimes without realizing it). Despite the progress made, however, not all systems support UTF-8, and neither do all terminals. Some systems, including certain servers I've used, simply don't have UTF-8 support installed, even though they're quite capable of it.

The idea is that the LANG environment variable is supposed to reflect the language you want to use and character set your terminal can display. So, this is where I try and figure out what LANG should be.

The nifty xprop trick here is from a vim hint I found. I haven't used it for very long, but so far it seems to be a really slick way of finding out what sort of environment your terminal is actually providing, even if it hasn't set the right environment variables (e.g. LANG).

One of the more annoying details of this stuff is that ssh doesn't pass LANG (or any other locale information) along when you connect to a remote server. Granted, there are good reasons for this (just because my computer is happy when LANG=en_US.utf-8 doesn't mean any server I connect to would be), but at the same time, shouldn't the remote server be made aware of my local terminal's capabilities? Imagine if I connected to a server that defaults to Japanese, but I want it to know that I use English! Remember how I smuggled that information through in TERM and stuck it in the SSH_LANG variable? Here's where it becomes important.

I've also fiddled with different variations of this code to make it as compatible as possible. So far, this should work with Bash 2.05b and up... though that makes it slightly awkward-looking.

As a final note here, I discovered that less is capable of handling multibyte charsets (at least, recent versions of it are), but for whatever reason it doesn't always support LANG and other associated envariables. It DOES however support LESSCHARSET...

Anyway, here's the code:

if [[ -z $LANG ]] ; then
    dprint no LANG set
    if [[ $WINDOWID ]] && have xprop ; then
        dprint querying xprop
        __bashrc__wmlocal=(`xprop -id $WINDOWID -f WM_LOCALE_NAME 8s ' $0' -notype WM_LOCALE_NAME`)
        export LANG=`eval echo ${__bashrc__wmlocal[1]}`
        unset __bashrc__wmlocal
    elif [[ $OSTYPE == darwin* ]] ; then
        dprint "I'm on Darwin"
        if [[ ( $SSH_LANG && \
            ( $SSH_LANG == *.UTF* || $SSH_LANG == *.utf* ) || \
            $TERM_PROGRAM == Apple_Terminal ) && \
            -d "/usr/share/locale/en_US.UTF-8" ]] ; then
            export LANG='en_US.UTF-8'
        elif [ -d "/usr/share/locale/en_US" ] ; then
            export LANG='en_US'
        else
            export LANG=C
        fi
    elif [[ $TERM == linux || $TERM_PROGRAM == GLterm ]] ; then
        if [ -d "/usr/share/locale/en_US" ] ; then
            export LANG='en_US'
        else
            export LANG=C # last resort
        fi
    else
        if [[ $SSH_LANG == C ]] ; then
            export LANG=C
        elif have locale ; then
            dprint "checking locale from big list (A)"
            locales=`locale -a`
            locales="${locales//[[:space:]]/|}" # not +() because that's slow
            if [[ en_US.utf8 == @($locales) ]] ; then
                export LANG='en_US.utf8'
            elif [[ en_US.utf-8 == @($locales) ]] ; then
                export LANG='en_US.utf-8'
            elif [[ en_US.UTF8 == @($locales) ]] ; then
                export LANG='en_US.UTF8'
            elif [[ en_US.UTF-8 == @($locales) ]] ; then
                export LANG='en_US.UTF-8'
            elif [[ en_US == @($locales) ]] ; then
                export LANG='en_US'
            else
                export LANG=C
            fi
            unset locales
        fi
    fi
else
    dprint "- LANG IS ALREADY SET! ($LANG)"
    if [[ $SSH_LANG && $SSH_LANG != $LANG ]]; then
        if [[ $SSH_LANG == C ]] ; then
            export LANG=C
        else
            dprint "checking locale from big list (B)"
            locales=`locale -a`
            locales="${locales//[[:space:]]/|}" # not +() because that's slow
            if [[ $SSH_LANG == @(${locales}) ]] ; then
                dprint "- SSH_LANG is a valid locale, resetting LANG"
                LANG=$SSH_LANG
            else
                dprint "- SSH_LANG is NOT a valid locale"
                wantutf8=no
                if [[ $SSH_LANG == *.@(u|U)@(t|T)@(f|F)?(-)8 ]] ; then
                    wantutf8=yes
                    if [[ ! $LANG == *.@(u|U)@(t|T)@(f|F)?(-)8 ]] ; then
                        dprint "- want utf-8, but LANG is not utf8, unsetting"
                        unset LANG
                    fi
                else
                    dprint "- don't want utf-8"
                fi
                if [[ ! $LANG || ! $LANG == @($locales) ]] ; then
                    if [ "$wantutf8" = yes ] ; then
                        dprint "- finding a utf8 LANG"
                        if [[ en_US.utf8 == @($locales) ]] ; then
                            export LANG='en_US.utf8'
                        elif [[ en_US.utf-8 == @($locales) ]] ; then
                            export LANG='en_US.utf-8'
                        elif [[ en_US.UTF8 == @($locales) ]] ; then
                            export LANG='en_US.UTF8'
                        elif [[ en_US.UTF-8 == @($locales) ]] ; then
                            export LANG='en_US.UTF-8'
                        elif [[ en_US == @($locales) ]] ; then
                            export LANG='en_US'
                        else
                            export LANG=C
                        fi
                    else
                        dprint "- finding a basic LANG"
                        if [[ en_US == @($locales) ]] ; then
                            export LANG='en_US'
                        else
                            export LANG=C
                        fi
                    fi
                fi
                unset wantutf8
            fi
            unset locales
        fi
    else
        dprint "- ... without SSH_LANG, why mess with it?"
    fi
fi
dprint - LANG is $LANG
unset LESSCHARSET
if [[ $LANG == *.@(u|U)@(t|T)@(f|F)?(-)8 ]] ; then
    export LESSCHARSET=utf-8
fi
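
The membership test at the heart of all that is the @($locales) pattern: the list of locales is joined with | and handed to extglob as a set of alternatives. Here is a sketch of just that piece, with the repeated if/elif chain folded into a loop. It needs bash (for extglob, [[ ]], and local); pick_lang and the sample locale list are invented for the example, and a real run would fill the list from locale -a.

```shell
shopt -s extglob

# Hypothetical locale list; a real run would use: locales=$(locale -a)
locales=$'C\nPOSIX\nen_US\nen_US.utf8\nja_JP.eucJP'
locales=${locales//[[:space:]]/|}       # join the names with |

# Print the first candidate that appears in the list, or C if none do.
pick_lang() {
    local candidate
    for candidate in "$@"; do
        if [[ $candidate == @($locales) ]]; then
            echo "$candidate"
            return 0
        fi
    done
    echo C                              # last resort
}

pick_lang en_US.UTF-8 en_US.utf8 en_US
```

Candidates are tried in the order given, so the caller decides which spelling of UTF-8 wins.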

Aliases

This is where a lot of the real action is, in terms of convenience settings. Like anyone who uses a computer every day, I type a lot; and if I can avoid it, so much the better. (I'm a lazy engineer.)

Sometimes I can't quite get what I want out of an alias. In csh aliases can specify what to do with their arguments. In bash, aliases are really more just shorthand — "pretend I really typed this" kind of stuff. Instead, if you want to be more creative with argument handling, you have to use functions (it's not a big deal, really). Here's a few functions I added just because they're occasionally handy to have the shell do for me:

function exec_cvim {
/Applications/Vim.app/Contents/MacOS/Vim -g "$@" &
}

function darwin_locate { mdfind "kMDItemDisplayName == '$*'wc"; }
if [[ $- == *i* && $OSTYPE == darwin* && ${OS_VER[0]} -ge 8 ]] ;
then
alias locate=darwin_locate
fi

function printargs { for F in "$@" ; do echo "$F" ; done ; }
function psq { ps ax | grep -i $@ | grep -v grep ; }
function printarray {
for ((i=0;$i<`eval 'echo ${#'$1'[*]}'`;i++)) ; do
    echo $1"[$i]" = `eval 'echo ${'$1'['$i']}'`
done
}
alias back='cd $OLDPWD'

There are often a lot of things that I just expect to work. For example, when I type "ls", I want it to print out the contents of the current directory. In color if possible, without if necessary. It often annoys me, on Solaris systems, when the working version of ls is buried in the path, while a really lame version is up in /bin for me to find first. Here's how I fix that problem:

# GNU ls check
if [[ $OSTYPE == darwin* ]]; then
    dprint "- DARWIN ls"
    alias ls='/bin/ls -FG'
    alias ll='/bin/ls -lhFG'
elif have colorls ; then
    dprint "- BSD colorls"
    alias ls='colorls -FG'
    alias ll='colorls -lhFG'
else
    __kbwbashrc__lsarray=(`\type -ap ls`)
    __kbwbashrc__lsfound=no
    for ((i=0;$i<${#__kbwbashrc__lsarray[*]};i=$i+1)) ; do
        if ${__kbwbashrc__lsarray[$i]} --version &>/dev/null ;
        then
            dprint "- found GNU ls: ${__kbwbashrc__lsarray[$i]}"
            alias ls="${__kbwbashrc__lsarray[$i]} --color -F"
            alias ll="${__kbwbashrc__lsarray[$i]} --color -F -lh"
            __kbwbashrc__lsfound=yes
            break
        fi
    done
    if [ "$__kbwbashrc__lsfound" == no ] ; then
        if ls -F &>/dev/null ; then
            dprint "- POSIX ls"
            alias ls='ls -F'
            alias ll='ls -lhF'
        else
            alias ll='ls -lh'
        fi
    fi
    unset __kbwbashrc__lsarray __kbwbashrc__lsfound
fi
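
The --version probe generalizes to other tools. A minimal sketch, assuming only that the GNU versions accept --version and that many traditional implementations reject it (looks_gnu is a made-up name, and a zero exit status is a hint, not proof):

```shell
# Succeeds if the named command runs and accepts --version.
looks_gnu() {
    "$1" --version >/dev/null 2>&1
}

for tool in ls sed make; do
    if looks_gnu "$tool"; then
        echo "$tool: accepts --version (probably GNU)"
    else
        echo "$tool: rejects --version, or is not installed"
    fi
done
```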

Similar things are true of make and sed and such. I've gotten used to GNU's version, and if they exist on the machine I'd much rather automatically use them than have to figure out whether it's really called gnused or gsed or justtowasteyourtimesed all by myself:

if [[ $OSTYPE == linux* ]] ; then
    # actually, just Debian, but this works for now
    alias gv="gv --watch --antialias"
else
    alias gv="gv -watch -antialias"
fi
if have gsed ; then
    alias sed=gsed
elif have gnused ; then
    alias sed=gnused
fi
if have gmake ; then
    alias make=gmake
elif have gnumake ; then
    alias make=gnumake
fi

The rest of them are mostly boring, with one exception:

alias macfile="perl -pe 'tr/\x0d/\x0a/'"
have tidy && alias tidy='tidy -m -c -i'
have vim && alias vi='vim'
alias vlock='vlock -a'
alias fastscp='scp -c arcfour -o Compression=no' # yay speed!
alias startx='nohup ssh-agent startx & exit'
alias whatlocale='printenv | grep ^LC_'
alias fixx='xauth generate $DISPLAY'
alias whatuses='fuser -v -n tcp'
alias which=type
alias ssh='env TERM="$TERM:$LANG" ssh'
have realpath || alias realpath=realpath_func
if have readlink ; then
    unset -f readlink_func
else
    alias readlink=readlink_func
fi
if [[ $OSTYPE == darwin* ]]; then
    alias top='top -R -F -ocpu -Otime'
    alias cvim='exec_cvim'
    alias gvim='exec_cvim'
fi

Did you note that ssh alias? Heh.

Tab-completion Options

Bash has had, for a little while at least, the ability to do custom tab-completion. This is really convenient (for example, when I've typed cvs commit and I hit tab, bash can know that I really just want to tab-complete files that have been changed). However, I won't bore you with a long list of all the handy tab-completions that are out there. Most of mine are just copied from here anyway. But I often operate in places where that big ol' bash-completion file can be in multiple places. Here's the simple little loop I use. You'll notice that it only does the loop after ensuring that bash is of recent-enough vintage:

completion_options=(
~/.conf/bash_completion
/etc/bash_completion
/opt/local/etc/bash_completion
)
if [[ $BASH_VERSION && -z $BASH_COMPLETION && $- == *i* ]] ;
then
    bash=${BASH_VERSION%.*}; bmajor=${bash%.*}; bminor=${bash#*.}
    if [ $bmajor -eq 2 -a $bminor '>' 04 ] || [ $bmajor -gt 2 ] ;
    then
        for bc in "${completion_options[@]}" ; do
            if [[ -r $bc ]] ; then
                dprint Loading the bash_completion file
                if [ -z "$BASH_COMPLETION" ] ; then
                    BASH_COMPLETION="$bc"
                fi
                #COMP_CVS_REMOTE=yes
                export COMP_CVS_ENTRIES=yes
                source "$bc"
                break
            fi
        done
    fi
    unset bash bminor bmajor
fi
unset completion_options
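
The version test is compact enough to be cryptic, so here it is pulled out into a function and run against a few sample strings. This is a sketch of the same comparison, not a different rule; new_enough is a made-up name and the sample strings only imitate the BASH_VERSION format.

```shell
# True when a BASH_VERSION-style string is 2.05 or newer.
new_enough() {
    ver=${1%.*}            # "2.05b.0(1)-release" becomes "2.05b"
    major=${ver%.*}        # "2"
    minor=${ver#*.}        # "05b"
    [ "$major" -gt 2 ] && return 0
    # Within major version 2, compare the minor part as a string,
    # since values like "05b" are not numbers.
    [ "$major" -eq 2 ] && [ "$minor" \> 04 ]
}

for v in "2.04.0(1)-release" "2.05b.0(1)-release" "3.2.57(1)-release"; do
    if new_enough "$v"; then
        echo "$v: new enough for completion"
    else
        echo "$v: too old"
    fi
done
```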

Machine-local settings

You'd be surprised how useful this can be sometimes. Sometimes I need machine-specific settings. For example, on some machines there's a PGI compiler I want to use, and maybe it needs some environment variable set. Rather than put it in the main bashrc, I just put that stuff into ~/.bashrc.local and have it loaded:

dprint checking for bashrc.local in $HOME
if [ -r "${HOME}/.bashrc.local" ]; then
    dprint Loading local bashrc
    source "${HOME}/.bashrc.local"
fi

Auto-logout

Lastly, it is sometimes the case that the TMOUT variable has been set, either by myself, or by a sysadmin who doesn't like idle users (on a popular system, too many idle users can unnecessarily run you out of ssh sockets, for example). In any case, when my time is limited, I like being aware of how much time I have left. So I have my bashrc detect the TMOUT variable and print out a big banner so that I know what's up and how much time I have. Note that bash can do simple math all by itself with the $(( )) construction. Heheh. Anyway:

if [[ $TMOUT && "$-" == *i* ]]; then
    echo '~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'
    echo You will be autologged out after:
    echo -e -n '\t'
    seconds=$TMOUT
    days=$((seconds/60/60/24))
    seconds=$((seconds-days*24*60*60))
    hours=$((seconds/60/60))
    seconds=$((seconds-hours*60*60))
    minutes=$((seconds/60))
    seconds=$((seconds-minutes*60))
    [[ $days != 0 ]] && echo -n "$days days "
    [[ $hours != 0 ]] && echo -n "$hours hours "
    [[ $minutes != 0 ]] && echo -n "$minutes minutes "
    [[ $seconds != 0 ]] && echo -n "$seconds seconds "
    echo
    echo ... of being idle.
    unset days hours minutes seconds
fi
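
The arithmetic is easy to check by pulling it into a function. This sketch uses the remainder operator instead of the subtract-what-was-used steps above, which comes to the same thing; format_duration is a made-up name.

```shell
# Split a number of seconds into days, hours, minutes, and seconds,
# leaving out any unit that comes to zero.
format_duration() {
    secs=$1
    out=""
    days=$((secs / 86400));  secs=$((secs % 86400))
    hours=$((secs / 3600));  secs=$((secs % 3600))
    mins=$((secs / 60));     secs=$((secs % 60))
    [ "$days"  -ne 0 ] && out="$out$days days "
    [ "$hours" -ne 0 ] && out="$out$hours hours "
    [ "$mins"  -ne 0 ] && out="$out$mins minutes "
    [ "$secs"  -ne 0 ] && out="$out$secs seconds "
    echo "${out% }"          # trim the trailing space
}

format_duration 600          # the root-over-ssh timeout set earlier
format_duration 93784
```

The first call prints 10 minutes; the second prints 1 days 2 hours 3 minutes 4 seconds.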

dprint BASHRC_DONE

While I'm at it, I suppose I should point out that I also have a ~/.bash_logout file that's got some niceness to it. If it's the last shell, it clears sudo's cache, empties the console's scrollback buffer, and clears the screen. Note: DO NOT PUT THIS IN YOUR BASHRC. You wouldn't like it in there.

if [ "$SHLVL" -eq 1 ] ; then
    sudo -k
    type -P clear_console &>/dev/null && clear_console 2>/dev/null
    clear
fi

And that's about it! Of course, I'm sure I'll add little details here and there and this blog entry will become outdated. But hopefully someone finds my bashrc useful. I know I've put a lot of time and effort into it. :)

Posted by Kyle Wheeler on March 13, 2008 12:51 PM | Permalink | Comments (0) | TrackBacks (0)

YAASI: Yet Another Anti-Spam Idea

Branden and I had an idea to help with the spam problem on our system, and it’s proven particularly effective. How effective? Here’s the graphs from the last year of email on my system. Can you tell when I started using the system?

If you want to see the live images, check here.

The idea is based on the following observations: certain addresses on my domain ONLY get spam. This is generally because they either don’t exist or because I stopped using them; for example, spammers often send email to buy@memoryhole.net. Branden and I also both use the user-tag@domain scheme, so we get a lot of disposable addresses that way. These addresses are such that we know for certain that anyone sending email to them is a spammer. Some of these addresses were already being rejected as invalid; some we hadn’t gotten around to invalidating yet.

By simply rejecting emails sent to those addresses, we were able to reduce the spam load of our domains by a fair bit, and the false-positive rate is nil. But we took things a step further: since spammers rarely send only one message, often they will send spam to both invalid AND valid addresses.

If I view those known-bad addresses as, essentially, honeypots, I can say: aha! Any IP sending to a known-bad address is a spammer, and I can refuse (with a permanent fail) any email from that IP for some short time. I started with 5 minutes, but have moved to an exponentially increasing timeout system. Each additional spam increases the length of the timeout (5 minutes for the first spam, 6 for the second, 8 for the third, and so on). Longer-term bans, as a result of the exponentially increasing timeout, are made more efficient via the equivalent of /etc/hosts.deny. I haven’t gotten into the maintaining-my-spammer-database much yet, but I think this may not be terribly important (I’ll explain in a moment).
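
To get a feel for how quickly such a schedule grows, here is a sketch that assumes the timeout for the nth spam is 4 + 2^(n-1) minutes. That formula is only a guess that happens to reproduce the 5, 6, 8 sequence; the real filter may compute it differently, and timeout_minutes is a made-up name.

```shell
# Assumed schedule: 4 + 2^(n-1) minutes for the nth spam from one IP.
# This matches the 5, 6, 8 sequence but is a guess at the real formula.
timeout_minutes() {
    n=$1
    echo $(( 4 + (1 << (n - 1)) ))
}

for n in 1 2 3 4 5 10; do
    echo "spam #$n -> $(timeout_minutes "$n") minutes"
done
```

Under that assumption the fourth and fifth spams earn 12 and 20 minutes, and the tenth earns 516 minutes, so the penalty stays gentle for a stray message or two and climbs steeply for repeat offenders.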

One of the best parts of the system is that it is fast: new spammers that identify themselves by sending to honeypot addresses get blocked quickly and without my intervention. So far this has been particularly helpful in eliminating spam spikes. Another feature that I originally thought would be useful, but hasn’t really appeared to be (yet) is that it allows our multiple domains to share information about spam sources. Thus far, however, our domains seem to be plagued by different spammers.

Now, interestingly, about a week after we started using the system, our database of known spammers was wiped out (it’s kept in /tmp, and we rebooted the system). Result? No noticeable change in effectiveness. How’s that for a result? And, as you can see from the graph above, there’s no obvious change in spam blocking over the course of a month that would indicate that the long-term history is particularly useful. So, it may be sufficient to keep a much shorter history. Maybe only a week is necessary, maybe two weeks, I haven’t decided yet (and, as there hasn’t yet been much of a speed penalty for it, there’s no pressure to establish a cutoff). But, given that most spam is sent from botnets with dynamic IPs, this isn’t a particularly surprising behavior.

Forkit.org and memoryhole.net have been using this filter for a month so far. The week before we started using this filter, memoryhole.net averaged around 262 emails per hour. The week after instituting this filter, the average was around 96 per hour (a 60+% reduction!). Before using the filter, forkit.org averaged 70 emails per hour; since starting to use the filter, that number is down to 27.4 per hour (also a 60+% reduction). We have recorded spams from over 33,000 IPs, most of which only ever sent one or two spams. We typically have between 100 and 150 IPs that are “in jail” at any one time (at this moment: 143), and most of those (at this moment 134) are blocked for sending more than ten spams (114 of them have a timeout measured in days rather than minutes).

Now, granted, I know that by simply dropping 60% of all connections we’d get approximately the same results. But I think our particular technique is superior to that because it’s based on known-bad addresses. Anyone who doesn’t send to invalid addresses will never notice the filter.

The biggest potential problem that I can see with this system is that of spammers who have taken over a normally friendly host, such as Gmail spam. I’ve waffled on this potential problem: on the one hand, Gmail has so many outbound servers that it’s unlikely to get caught (a couple bad emails won’t have much of a penalty). Thus far, I’ve seen a few yahoo servers in Japan sending us spam, but no Gmail servers. On the other hand, as long as I simply use temporary failures (at least for good addresses), and as long as ND doesn’t retry in the same order every time, messages will get through.

I’ve also begun testing a “restricted sender” feature to work with this. For example, I have the address kyle-slashdot@memoryhole.net that I use exclusively for my slashdot.org account. The only sender allowed to use that email address is slashdot.org (e.g. if I forget my password). If anyone from any other domain attempts that address, well, then I know that the sending IP is a spammer and I can treat the attempt as if it had been sent to a known-bad address. Not applicable to every email address, obviously, but it’s a start.
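Roughly, the restricted-sender check could look like the sketch below. This is not the actual filter code: the table, the function names, and the choice to match on the envelope sender's domain (including subdomains) are all assumptions made for illustration. A real setup might key off the connecting host instead.

```python
# Hypothetical table: restricted recipient -> sender domains allowed to use it.
RESTRICTED_SENDERS = {
    "kyle-slashdot@memoryhole.net": {"slashdot.org"},
}


def sender_domain(envelope_sender: str) -> str:
    """Return the lowercased domain part of an envelope sender address."""
    return envelope_sender.rpartition("@")[2].lower()


def violates_restriction(recipient: str, envelope_sender: str) -> bool:
    """True if `recipient` is restricted and the sender's domain is not allowed.

    Unrestricted recipients never violate anything. Subdomains of an allowed
    domain (e.g. mail.slashdot.org) are accepted; that is an assumption.
    """
    allowed = RESTRICTED_SENDERS.get(recipient.lower())
    if allowed is None:
        return False
    domain = sender_domain(envelope_sender)
    return not any(domain == d or domain.endswith("." + d) for d in allowed)
```

A violation would then be handled exactly like a delivery attempt to a honeypot address: the sending IP earns a strike.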

It’s been pointed out that this system is, in some respects, a variant on greylisting. The major difference is that it’s a penalty-based system, rather than a “prove yourself worthy by following the RFC” system, and I like that a bit better. I’m somewhat tempted to define some bogus address (bogus@memoryhole.net) and sign it up for spam (via spamyourenemies.com or something similar), but given that part of the benefit here is due to spammers trying both valid and invalid addresses, I think it would probably just generate lots of extra traffic and not achieve anything particularly useful.
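To make the "penalty-based" idea concrete, here is a minimal sketch of a penalty box. It is only an illustration: the class name, the one-minute starting timeout, the doubling rule, and the one-week cap are all assumptions, not the values the real filter uses. The point is just that every spam to a known-bad address makes the sender's block longer, whereas greylisting makes every unknown sender prove itself first.

```python
import time

BASE_TIMEOUT_SECONDS = 60            # assumed: the first strike costs one minute
MAX_TIMEOUT_SECONDS = 7 * 24 * 3600  # assumed: blocks are capped at one week


class PenaltyBox:
    """Track strikes per IP and block repeat offenders for longer each time."""

    def __init__(self):
        self.strikes = {}       # ip -> number of spams recorded so far
        self.jailed_until = {}  # ip -> unix time at which the block expires

    def record_spam(self, ip, now=None):
        """Record one spam from `ip` and return the new block length in seconds."""
        now = time.time() if now is None else now
        count = self.strikes.get(ip, 0) + 1
        self.strikes[ip] = count
        # Assumed escalation: the timeout doubles with every strike, up to the cap.
        timeout = min(BASE_TIMEOUT_SECONDS * 2 ** (count - 1), MAX_TIMEOUT_SECONDS)
        self.jailed_until[ip] = now + timeout
        return timeout

    def is_jailed(self, ip, now=None):
        """True while `ip` should still be given a temporary failure."""
        now = time.time() if now is None else now
        return self.jailed_until.get(ip, 0.0) > now
```

Under these made-up numbers, a sender needs a dozen strikes before its timeout is measured in days rather than minutes or hours, and an IP that never hits a bad address never shows up in the table at all.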

Now, this technique is simply one of many; it’s not sufficient to guarantee a spam-free inbox. I use it in combination with several other antispam techniques, including a greet-delay system and a frequently updated SpamAssassin setup. But check out the difference it’s made in our CPU utilization:

Okay, so, grand scheme of things: knocking the CPU use down three percentage points isn’t huge, but knocking it down by 50%? That sounds better, anyway. And as long as it doesn’t cause problems by making valid email disappear (possible, but rather unlikely), it seems to me to be a great way to cut my spam load relatively easily.

Posted by Kyle Wheeler on April 24, 2008 6:22 PM | Permalink | Comments (0) | TrackBacks (0)

Noodler Black Inks

I have been learning about fountain pens for a little while now, ever since being turned on to them by my parents and friends. My first fountain pen was a bit of a disappointing disaster, and I nearly wrote off fountain pens entirely as a result (pun intended). I really like super-fine-tip pens for note-taking, which is most of what I do when I’m writing things out by hand. I like pens like the Pilot Precise V5 and similar needle-nose rollerball pens, because they’re fine-tipped and smooth, reliable writers. However, those pens can have problems with line quality - sometimes they leak a little too much ink on the paper, or sometimes when you’re writing quickly the line gets unusually thin or even skips, and so forth. What really impressed me about good fountain pens, when I finally found a fountain pen I love (a Pilot Namiki Vanishing Point with an extra-fine nib), is the line quality: very thin, with an extremely consistent line width. And now that I’ve used it, I’m spoiled, and the tiniest inconsistencies that I get from other pens now annoy me. It’s like if you’ve gotten used to a 60Hz refresh rate on your monitor; when you go back to a 30Hz refresh rate, the flicker is noticeable and annoying! Anyway, now that I have a pen I really love, I set about the next process of learning: ink!

Before we talk about ink, though, let’s take a short detour into paper. Paper varies in quality a lot. Paper is generally made of some combination of plant pulp (e.g. wood pulp), cotton (or similar plant fiber), clay and other binding agents. Since the heart of paper is the fiber, which generally comes from plants, the main chemical component of paper is cellulose. The length of the fibers affects many of the important properties of the paper: short fibers are easier to come by (use more wood pulp, which helps make the paper inexpensive), but are more likely to get pulled out of the paper by a super-fine-tip pen or by a very wet ink (which makes the fibers swell a little and detach from their neighbors). Long fibers (e.g. with more cotton or similar) stay in the paper and generally make the paper tougher, but are more expensive to make (cotton is pricier than wood pulp). The clay and other binding agents help with the smoothness and brightness of the paper, and have an effect on ink penetration and propagation through the paper (they can limit it). Long-fiber paper is common when you need something more durable (e.g. many checks use long fibers) or when you expect to use very fine-tip pens (e.g. it’s standard note-paper in places like Japan), whereas short-fiber paper is common when you don’t need the durability and want something a bit less expensive (it’s common in standard copy-paper, especially in the United States). This is why many fountain-pen fans have strong opinions on the kind of paper they want to use, or will choose ink knowing what kind of paper they will use.

The first thing many people look for in ink is the color, and preferences are all personal. I really prefer black ink. There’s a formality and universality about it that I really like. However, not all black inks are created equal. There are all manner of considerations that I would not have initially thought of before I started educating myself. For example:

Blackness: Some vary from black to dark-grey, and this can be affected by the flow of the pen, the width of the nib, as well as the properties of the paper. You can even get blacks that have a little of some other color, like a blue-black or a green-black, to give your writing a subtle (or not-so-subtle) hint of being “special”.
Flow: This was one of the problems with my first fountain pen. The ink flow wasn’t consistent - sometimes it was quite wet and wrote well, sometimes it was dry and I had to tap and shake the darn thing to coax the ink to the tip of the nib. Pens that don’t consistently write when you want them to are really frustrating! I also once got a cheap fountain pen as a freebie that had the opposite problem: the ink would occasionally come out in a big droplet for no particular reason (I think this was mostly the pen’s fault, not the ink’s fault, though). Inks have an impact here, based on their viscosity, how much they stick to the inside of the pen, their surface tension (which affects wicking), and so forth. So-called “dry” inks often have a lower surface tension, which means they don’t wick as well. In practice, that means you can use-up the ink in the tip of the pen and it doesn’t draw more ink down to the tip. “Wet” inks have a higher surface tension, and so wick more readily. Which type of ink is better depends on your pen and your writing velocity.
Feathering: when ink enters the paper, it soaks into the paper. Depending on its viscosity and chemical reactions with the paper, it can wick along the fibers of the paper. The result somewhat depends on the paper - sometimes this means just a thicker line than you expected (common in short-fiber paper), sometimes it means that there are teeny tiny black lines stretching away from what you wrote (the length of these lines depends on the length of the fibers). It’s the latter that gives the effect the name “feathering”, but essentially it’s referring to the ink going someplace other than where you put it. Feathering goes in three dimensions as well - when it goes down into the paper, it’s sometimes called penetration or “bleeding”. The properties that cause feathering are often the properties that improve ink flow - the better the ink wicks to the tip of the pen, usually the better it soaks into the paper.
Drying time: This is of extra importance to left-handed writers, because they’re often dragging their hand across the page, but inks that take a while to dry can smear for anyone, and can even transfer onto paper that’s laid on top of what you wrote (i.e. when you turn the page). Faster drying inks also tend to be thinner in viscosity, which means they often tend to feather more.
Permanence: how long will this color stay this color? Does it fade over time (e.g. does the pigment oxidize)? Lots of older inks turn grey or brown when left alone for a few decades. Does it fade in sunlight? Lots of inks are intended for writing in notebooks, and as such will be obliterated or altered by intense sunlight, such as on a sign or if you leave your notebook open on the windowsill.
Immutability: Can this ink be removed from the paper? For instance, if I spilled water on it, will the ink run off? Or, if someone was trying to wipe your ink off of a check, e.g. with bleach, isopropyl alcohol, acetone, or some other method, how successful would they be? Bleach is a common tool for so-called “check-washers”, and it’s remarkably good at removing a lot of inks, as if they had never been there.
Viscosity: Thicker inks may not flow as quickly, especially in a thin-tipped pen. This may be desirable, though, in a wide-tipped pen. Viscous inks may also take longer to dry, and viscosity can also reduce feathering or penetration of the paper. Also, viscosity can be used to keep the color very vibrant, by allowing you to lay down a thick layer of ink. Viscous inks have their place - the key is finding just the right balance for what you want to use it for.
Acidity: Acidic inks can often be more immutable, because they eat into the paper a little to thoroughly bond with it. However, acidic inks can also cause staining or even cause the paper to fall apart over the long term, which makes acidic ink unsuitable for archival purposes. Additionally, acidic ink can corrode the internals of the pen, including the nib. Most older inks, and even some modern inks, are at least a little acidic - that’s one reason quality fountain pen nibs are often made with (or are plated with) gold or stainless steel. Corrosion of the nib and the other metal pieces of a pen is a big consideration if you’re looking at buying older (used) fountain pens. Also, it can slowly degrade the rubber and soft plastic, like the cap seal or the ink bladder (in ink-bladder pens).
Lubrication: This affects several things, from how the pen “glides” over the paper to how smoothly the piston slides in the ink well of the pen (if you have a piston-based refill mechanism). The glide can be more of a personal preference thing - some people like their pen to glide over the paper like oil, some prefer a little bit (or a lot) of tactile feedback when writing. Personally, I used to be a fan of very smooth gliding over paper, but my fountain pen has just a little bit of tactile feedback and I’ve really grown to like that. I find it gives me just that little extra bit of control over the pen, and I miss it when using rollerball pens now.

The first ink I used in a fountain pen was Levenger Black, which came with the fountain pen I bought from them (an L-Tech 3.0). I really didn’t like the ink and pen combination - it had flow problems, as I mentioned (this was not the only thing I didn’t like about the pen, but was the least forgivable), and I got rid of the pen so quickly, I didn’t really test the other properties of the ink. I replaced it, by the way, with an L-Tech 3.0 rollerball, which has a nice needlenose tip refill that I really liked… until I fell for fountain pens. When I got my current (and favorite) pen, I used the Pilot ink that came with it: Pilot Namiki Blue. I refilled it with Pilot Namiki Black ink. Both of those inks work really nicely in that pen - excellent flow, very reliable, decent color. I don’t think I would have grown to like fountain pens nearly so much if that ink hadn’t been such nice ink.

But then, out of curiosity and because some of my friends had other inks, I began to investigate the options. Many pen companies (Waterman, Levenger, Pilot, Pelikan, Sailor, etc. etc.) all make inks as well, and folks have their favorites (Pelikan Blue, for instance, is a classic, well-regarded ink). There are also companies that only make ink: Diamine, J. Herbin, Private Reserve, and such. If you focus on black ink, though, you will inevitably come across one name: Noodler.

Noodler is purely an ink company, 100% made in the USA, all archival-quality (i.e. pH-neutral) and focused on value: they fill their ink bottles up to the tippy top, use whatever glass bottle is cheapest at the time (so the bottle shapes tend to change from time to time), and even use their own ink for all of their labels. They’re quite popular among fountain-pen fans, and have a solid reputation for quality (can you tell I’m a fan?). When I started looking into them, I was bewildered by the breadth of colors they have. In particular, they have a bunch of different “black” inks, with almost no obvious explanation of why or what the differences are between them:
- Bulletproof Black (or simply “Black”)
- Heart of Darkness
- Borealis Black
- X-Feather
- Old Manhattan
- Bad Black Moccasin
- Black Eel
- Dark Matter
- El Lawrence
- Bernanke Black
- Polar Black
- Blackerase

That’s twelve different black inks! So, to help out the next guy looking at buying Noodler’s ink, here’s what I’ve gleaned - please correct me if I’ve gotten anything wrong.

Noodler’s Bulletproof Black

To my knowledge, this is the original Noodler ink. As you can read here, they use the term “Bulletproof” to describe the ink’s immutability: it is water-proof, bleach-proof, UV-light-proof, etc. They use the term “Eternal” to describe the ink’s permanence: it doesn’t oxidize, and it doesn’t fade in UV light. This ink is designed to react with the cellulose in the paper, much like the way people dye clothing, and as such is extremely immutable. All of their ink is pH-neutral, which means (among other things) it’s an archival-quality ink. This ink is also quite black, flows nicely, and raised eyebrows with how little it feathers, even on low-quality paper. It can seem to sit on top of the paper, rather than soak into it, which helps reduce the feathering. This ink is what, to my knowledge, Noodler built their reputation on. They even had an open challenge (for $1000) for a while, to see if anyone could find a way of erasing this ink from the page! (More on this in a moment.) As a result, the permanence and immutability of this ink are quite well-studied. It is sometimes regarded as THE benchmark for black inks; it is that consistent and that popular.

Noodler’s Heart of Darkness

As Noodler was making their name with their Black ink, some folks, inevitably, wanted it a bit darker. So, the brain behind Noodler set out to make an ink that was as dark black as he could possibly make it, while still being both Bulletproof (immutable) and Eternal (permanent). This is that ink: as black as could be engineered (at the time, anyway), and just as Bulletproof and Eternal as the original Black. Let’s not kid around: this is VERY black ink. It is a relatively quick-drying ink, and penetrates the paper more than the standard Bulletproof Black - which means it works better on shinier paper than Bulletproof Black, but also means it can feather or bleed more if you lay down a lot of ink. The feathering depends heavily on the paper and the wetness - in my experience (with an extra-fine nib), it feathers much less than the Pilot Namiki Black, and some find it feathers similarly to Bulletproof Black, but your experience will depend heavily on the nib/paper combination. It also flows quite well, which is important in extra-fine nibs. It’s my current favorite black ink. Its permanence and immutability aren’t as well-studied as those of Bulletproof Black, but are believed to be the same.

Noodler’s Borealis Black

This ink is the absolute blackest Noodler could make. It’s fashioned after traditional 1950’s inks that are EXTREMELY black. According to Goulet Pens, this was made to emulate an ink by the Aurora ink company, “Aurora Black”. In any event, it’s so black that multiple layers of the ink are just as black as a single layer. However, sacrifices had to be made to achieve this level of light-absorption (i.e. blackness). This ink is somewhat water-resistant, but is neither Bulletproof nor Eternal. It’s a “wet” writing ink, and takes a little longer to dry (so could be a bit “smeary” in practice). It also feathers more than the basic Bulletproof Black. However, if you want as absolutely black as possible, this is the ink you want.

Noodler’s X-Feather Black

This ink is specifically designed to feather as little as possible, even on very absorptive paper. Really, it’s Noodler showing off their mastery of the chemical properties of ink, because even their Bulletproof Black doesn’t feather much, and this feathers even less! It is more viscous than their other black inks, which makes it flow less well in particularly fine nibs (depends on the pen), and so is more popular for use with dip pens. It also dries quite slowly, comparatively speaking. It is fairly dark black - about the same as Bulletproof Black - and is also both Bulletproof and Eternal. However, because of the anti-feathering properties, this ink can be laid down quite thickly (or in multiple layers) to become VERY VERY black without becoming messy. As a result, this ink is particularly popular with calligraphers, who typically use pens that are quite wet (i.e. a very broad nib). If you don’t lay it down thickly, it is merely a very solid black.

Noodler’s Old Manhattan

This is an ink that Noodler doesn’t sell themselves - it’s exclusively made for a website called The Fountain Pen Hospital. This ink is reputed to be even blacker than Heart of Darkness, but not quite as black as Borealis Black (apparently being super super black is somewhat at odds with being bulletproof). It is supposed to be both Bulletproof and Eternal, however it likely makes a tradeoff in terms of its other properties (flow, feathering, etc.) to reach that additional level of blackness. Some have noted that this ink has sediment in it, and you need to shake it up before filling your pens. This sediment is bonded with the paper when it dries, but also bonds with your pen and may need a proper cleaning (with a cleaning solution, not just rinsing with water) to get it out again.

Noodler’s Bad Black Moccasin

I mentioned that there was a competition to try to erase the Bulletproof Black ink. A Yale scholar, Nicholas Masluk, actually found a way to do it, using carefully controlled lasers to blast it off of the cellulose in the paper (I believe this technique depends on knowing the precise makeup of the ink, so you use the exactly right wavelength of laser). This potential problem, naturally, needed a response, and this ink is that response. It is even MORE permanent than the standard Bulletproof Black, being impervious to lasers as well, and is essentially the same color. It dries more slowly than the standard Bulletproof Black, but flows faster. As a result, it feathers a bit more than Bulletproof Black. Actually, Noodler has created an entire line of laser-proof inks, all with a name beginning with “Bad” (e.g. Bad Belted Kingfisher, which is a green ink). Noodler calls this line of inks the “Warden” series. These inks are intended to be state-of-the-art in anti-forgery technology, so, among other things, they’re purposefully mixed with a slightly different recipe in every single bottle, to make it that much harder to forge and that much harder to remove (because the forger can’t know exactly what’s in it ahead of time).

Noodler’s Black Eel

This ink is in Noodler’s “eel” line, also sometimes referred to as Black American Eel, and is identical to Noodler’s Black with lubrication added in. This lubrication is intended to lubricate the piston in piston-refill pens, which would otherwise need to be dismantled and lubed on occasion. Many cartridge converters also use a piston design, and it’s good for that too. It is considered Bulletproof and Eternal, takes longer to dry as a result of the lube, but is otherwise identical to Bulletproof Black. The lube also affects the writing performance: it feels smoother going on the page. As I understand it, longer dry time doesn’t seem to increase the feathering of this ink relative to the Bulletproof Black, which is somewhat interesting.

Noodler’s Dark Matter

This is an ink that was formulated to replicate a special ink that was used by scientists at Los Alamos, New Mexico on all of their government documents during the Manhattan Project. The man behind Noodler was provided a bottle of the original ink and asked to replicate it, which he did, although he made the ink pH-neutral, and out of modern ingredients. It’s not really a black ink; more of a very dark grey (dark enough to be mistaken for black in thin lines). It’s also not considered Bulletproof or Eternal (it’s replicating a very old ink!), but it is water-resistant. In case you’re curious, part of the reason there was a special ink for Los Alamos during the Manhattan Project was so that the ink could be identified, authenticated, and traced if it showed up in random documents somewhere it shouldn’t have been (e.g. in the possession of Russian spies).

Noodler’s El Lawrence

This is another ink that Noodler doesn’t sell themselves - it’s exclusively made for a company called Goulet Pens. This ink has the color more of used motor oil: not quite black, a little bit green, a little bit brown. It is also a unique color because it fluoresces under UV light. It is considered Bulletproof and Eternal, but tends to stick to the pen a bit more, and so requires a bit more cleaning of the pen, especially when you change inks. I don’t know much about the flow or feathering of this ink.

Noodler’s Bernanke Black

This is a fast-drying ink (the label makes a point about needing to print money especially quickly without smearing). It achieves this by absorbing into the paper quickly, which means it’s quite “wet”, has really excellent flow, and consequently, that means it feathers quite readily, especially when laid on thickly. The color is about the same as Bulletproof Black.

Noodler’s Polar Black

The Noodler “Polar” line of inks is intended to work in extremely cold temperatures (i.e. less water content, and the ink won’t freeze unless it gets extremely cold). All of the “Polar” inks are based on “Eel” inks (lubricated inks), but with anti-freeze added as well as lube. Accordingly, this ink is based on Black Eel, which was based on Bulletproof Black. The anti-freeze thins the ink a bit, which means that, partially because of the lube-induced longer drying time, this ink feathers a bit, similar to Bernanke Black. It is considered Bulletproof and Eternal, and is the same color as the Bulletproof Black.

Noodler’s Blackerase

This is part of Noodler’s “erase” line of inks, which is intended to work on wet-erase whiteboards. It was originally done as an experiment, but has been popular enough to stick around. Essentially, it goes on, dries, and can be completely removed with a wet rag. It is a relatively “wet” ink, in that it penetrates well (it’s intended for being used in a felt-tip marker), and as a result feathers a fair bit. It is neither Bulletproof nor Eternal (obviously), but it is very black. It’s not recommended for use on paper, though of course you can.

If you poke around the internet, you will likely find people who have different impressions of the properties (flow, feathering, blackness, etc.) of these inks. I am sure that their experiences are accurate; the thing to keep in mind is this: everyone’s experience will be somewhat different because of variations in pen and paper. Additionally, Noodler’s ink is all handcrafted, so there can be slight variations in the effective properties of each ink from batch to batch (of course, they try to minimize these differences, except in the Warden series, but it happens). What I’m trying to explain here is why these inks were made, so you can understand what the purpose of each is, and what their key properties are.

That said, if you’re looking for a solid, well-behaved, very black, very permanent ink, Bulletproof Black is an excellent starting point.

As I see it (this is just my opinion), the mainstays of Noodler’s black ink offerings are their Bulletproof Black, the Heart of Darkness, and X-Feather. They’re all Bulletproof, they’re all Eternal, they’re all popular and generally well-behaved inks. Heart of Darkness flows faster and dries quickly and so is good for finer pens, drier pens or lefties, while X-Feather is good for very wet, wide pens, and Bulletproof Black is halfway in between - a good “all around” black ink. All three are very black; Heart of Darkness was intended to be the blackest, but the darkness you achieve depends on your paper and how much ink you lay down. They have some “special” inks that are intended to re-create special inks and very specific hues, like El Lawrence and Dark Matter. Then there are the inks that are designed to have specific unique properties - Bernanke Black dries extra fast, Bad Black Moccasin is even more Bulletproof than the rest (more than most people would realistically need), Black American Eel is specially lubricated (for those that want a smoother ink or that have problems with older, finicky piston pens), Polar Black won’t freeze (for those that need to operate in those conditions), Borealis Black is for pitch-black extremophiles, and Blackerase is for wet-erase markers. To achieve those special properties they have made sacrifices in the other properties of the ink (e.g. usability, permanence, and/or immutability), but that’s just the price you pay for those special properties. In practice, however, ALL of these inks are excellent, and with the exception of Dark Matter, are all very black inks. If they work well for you, in your pen and on the paper you use, there’s no reason not to use them. I wouldn’t necessarily recommend, for example, that someone using Bad Black Moccasin start using some other ink unless it was behaving in a way they didn’t like.

Personally, I use an extra-fine nib and I tend to write very quickly, so things like flow are very important considerations, while feathering is less likely (simply because my lines use so little ink). Because of that, and because I was attracted to the ultra-black intent, I started with the Heart of Darkness and I’ve been very happy - it works very well for me. It flows very well, dries quickly, and doesn’t bleed much at all, even on absorptive paper. For example, I use Levenger note cards for things like Todo lists. In my experience, Pilot Namiki Black feathered pretty badly on those cards: my nice thin lines doubled in width on those cards. Heart of Darkness, on the other hand, hardly feathers at all there. And on most paper Heart of Darkness looks a touch blacker than the Pilot ink, which I like (not that I was upset with the blackness of the Pilot ink!). However, some people find that Heart of Darkness bleeds too much for them - these people are likely using wider-nibs or wetter pens than I am, but maybe it’s different paper, or maybe they write more slowly than I do, or maybe they just have a super-low tolerance for feathering. In any event, if that is you, I would suggest that you go try Bulletproof Black or even X-Feather, because they have a reputation for not feathering. You could try others, like Bad Black Moccasin or Black American Eel, but they would likely feather just as much (maybe more) and those inks make trade-offs in other ways that might end up being just as annoying to work with. On the other hand, if you are interested in particular ink qualities, for instance if you’re particularly concerned about anti-forgery and want the extra protection that Bad Black Moccasin provides, then you really don’t have much of a choice: there’s only one black ink in Noodler’s arsenal that provides that property.

If what you’re after is simply the blackest ink Noodler makes, get Borealis Black. If you want the absolute blackest Bulletproof ink they make, get Old Manhattan. To get that black, though, you have to sacrifice something, such as permanence, immutability, feathering, drying time, or what-have-you.

It’s worth noting, in closing, that there are other inks out there that provide some of the properties that Noodler’s has made famous. For instance, Private Reserve Invincible Black is supposed to be “Bulletproof” as well, using a cellulose-reaction technology similar to the one in Noodler’s Bulletproof inks, and some people like various things about it better (e.g. it’s a little bit more lubricated, and so a little bit smoother - along the lines of Black American Eel - but the exact shade of black is likely different). Noodler’s is far from the only game in town. I’m not advocating Noodler’s exclusively, just trying to explain what I’ve learned about their multitude of black inks.

Posted by Kyle Wheeler on August 7, 2016 6:19 PM | Permalink | Comments (0) | TrackBacks (0)

Kyle

High catecholamine levels

Cool Stuff Archives

April 17, 2004

Breakthrough Motor!

October 9, 2004

In Defense of Macs

February 13, 2005