The Homepage of tagurl.pl

Version 1.0 can be downloaded here. This is a Perl script that tags URLs in correctly-encoded MIME email messages. It generates a file that can be sourced by the mutt email client to define numerical macros to access those URLs.

Why?

Urlview is a great program, and my extract_url.pl script works very well, but both require a multiple-step process, and neither shows you the context of each link as you select it.

Dependencies

Mandatory (these usually come with Perl):

How to use it

This Perl script expects a valid email message to be piped in via STDIN. It can be used either as a prefilter for entries in your ~/.mailcap file or as a $display_filter. As a prefilter it works well with HTML email, but can be a bit annoying for plain-text email. As a $display_filter it can only tag URLs that are displayed as URLs, so it is not a good option for HTML email. Here's how you can test it:

        cat message.html | tagurl.pl -o muttmacros > taggedmessage.html

The file muttmacros should then contain a bunch of mutt-compatible macros for opening the first nine URLs in the HTML message.

Here's an example mailcap entry:

        text/html; cat %s | tagurl.pl -o ~/.muttmacros | elinks -dump -force-html; copiousoutput

Then all you have to do is get mutt to load the macros. The file has to be generated before it is sourced, so something like <push> is necessary. For example (but see Known Problems):

        message-hook . 'push <enter-command>source ~/.muttmacros<enter>'

Using this script as a display filter is also really easy:

        set display_filter="tagurl.pl -o ~/.muttmacros"

The trick is to use it as a display filter for text/plain messages but turn it off for text/html messages. Here's one way to do it (but see Known Problems):

                message-hook "~h 'Content-Type: [tT][eE][xX][tT]/[pP][lL][aA][iI][nN]'" \
                        'set display_filter="tagurl.pl -o ~/.muttmacros"'
                message-hook "~h 'Content-Type: [tT][eE][xX][tT]/[hH][tT][mM][lL]'" \
                        'unset display_filter'

The script has several arguments that it will accept:

Config File

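You can specify what command to use to view a URL by putting it in the ~/.tagurl file. So far, there are two kinds of lines you can have in this file: COMMAND lines, which give the command used to open a URL (%s stands in for the URL), and MACROFILE lines, which name the macro file to write.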

Here is an example config file:

        COMMAND mozilla-firefox -remote "openURL(%s,new-window)"
        MACROFILE ~/.muttmacros
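
Other scripts can reuse the same settings by reading this file themselves. The following is only a sketch in bash, not how tagurl.pl parses its config: the function name tagurl_setting is hypothetical, and it assumes each line is a keyword followed by whitespace and then the value, as in the example above.

```shell
#!/bin/bash
# Hypothetical helper (not part of tagurl.pl): print the value of the first
# line in FILE whose first word is KEY, e.g. COMMAND or MACROFILE.
# Assumes the "KEYWORD value..." layout shown in the example config file.
tagurl_setting() {
    local key=$1 file=$2 name value
    # "read -r" keeps backslashes literal; the second clause handles a
    # final line that has no trailing newline.
    while read -r name value || [ -n "$name" ]; do
        if [ "$name" = "$key" ]; then
            printf '%s\n' "$value"
            return 0
        fi
    done < "$file"
    return 1   # keyword not found
}
```

Calling tagurl_setting MACROFILE ~/.tagurl would print the configured macro file path exactly as written in the file. A leading ~ is not expanded, so the caller has to handle that.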

Known Problems

Many messages are mis-labelled, un-labelled, or are simply too complex for the message-hook examples given above. I don't know any really bulletproof way to toggle the display_filter setting, so you're pretty much on your own there. If anyone comes up with a good way, please let me know! (For what it's worth, I currently prefer using the extract_url.pl script for this very reason.)

The message-hook suggested above that uses push to source the muttmacros file hits the very old bug #1365 in mutt. The problem stems from the fact that message-hooks get triggered by the <save-message> command. Thus, anything pushed by a message-hook ends up being entered as the filename for any <save-message>, completely breaking this latter command. The best solution at the moment (that I know of) is to abandon sourcing the tagurl-generated macro file and instead use another external script to open the URLs. The script extracts the nth URL from the tagurl-generated file (where n is passed to the script as an argument), then calls a browser command to open the URL. With that script, instead of the above message-hook, all that's required is to bind the number keys in .muttrc to macros which call this new script, passing the appropriate argument, for example:

        macro pager 01 "<shell-escape>tagurl_open 1<enter>"
        macro pager 02 "<shell-escape>tagurl_open 2<enter>"
        ...
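
Rather than typing each line by hand, the macro lines can be generated. This is a rough sketch in bash: the function name tagurl_macro_lines is hypothetical, and it assumes the remaining lines simply continue the pattern of the two shown above, with a default of nine to match the nine URLs that tagurl.pl tags.

```shell
#!/bin/bash
# Hypothetical helper: print one mutt macro line per URL number,
# following the pattern of the two example lines above.
# The optional first argument is how many lines to print (default 9).
tagurl_macro_lines() {
    local count=${1:-9} n
    for ((n = 1; n <= count; n++)); do
        printf 'macro pager 0%d "<shell-escape>tagurl_open %d<enter>"\n' "$n" "$n"
    done
}
```

The output could be appended to .muttrc, or written to a separate file that .muttrc sources.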

The tagurl_open script can be very simple, and could be even simpler if tagurl were modified to simply spit out URLs instead of macros. Here is an example that works with the macro file:

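        #!/bin/bash
        # Browser command, and the macro file written by tagurl.pl
        COMMAND="firefox"
        MACROFILE="$HOME/.tagurl.macro-output"
        # Take line $1 of the macro file and keep the text inside its last pair of single quotes
        url=$(sed -n "$1 {s/.*'\([^']*\)'.*/\1/; p}" "$MACROFILE")
        $COMMAND "$url"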
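
For comparison, here is a sketch of the "even simpler" variant. It assumes a modified tagurl that writes one bare URL per line, which the current version does not do. The file name ~/.tagurl.urls and the function names nth_url and open_nth_url are made up for illustration.

```shell
#!/bin/bash
# Hypothetical variant of tagurl_open. Assumes a file with one bare URL per
# line; tagurl.pl 1.0 writes mutt macros, not a plain URL list.
URLFILE="${URLFILE:-$HOME/.tagurl.urls}"   # assumed path, not a tagurl default
COMMAND="${COMMAND:-firefox}"

# Print line N of FILE. Fails if N is not a positive integer.
nth_url() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;       # empty or contains a non-digit
    esac
    [ "$1" -gt 0 ] || return 1         # sed has no line 0
    sed -n "${1}{p;q;}" "$2"           # print line N, then stop reading
}

# Open URL number N from URLFILE with the configured browser command.
open_nth_url() {
    local url
    url=$(nth_url "$1" "$URLFILE") || return 1
    [ -n "$url" ] || return 1          # fewer than N URLs in the file
    $COMMAND "$url"
}
```

With a URL list there is no quoting to strip, so the sed expression shrinks to a plain line lookup.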

Security

Before any URL is used in a shell, its dangerous shell characters (namely the single quote and the dollar sign) are percent-encoded. This should eliminate the possibility of a bad URL breaking the shell.
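
To illustrate the idea, here is a small bash sketch of the same transformation. It is not the code tagurl.pl uses (that is Perl and is not shown here), and the function name sanitize_url is hypothetical. %27 and %24 are the standard percent-encodings of the single quote and the dollar sign.

```shell
#!/bin/bash
# Hypothetical illustration of the sanitising step described above;
# not taken from tagurl.pl itself.
sanitize_url() {
    local url=$1
    url=${url//\'/%27}    # single quote -> %27
    url=${url//\$/%24}    # dollar sign  -> %24
    printf '%s\n' "$url"
}
```

After this step the URL can no longer close a single-quoted shell string or trigger variable expansion. Other characters are left untouched.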