Version 1.5 can be downloaded here. This is a Perl script that extracts URLs from correctly-encoded MIME email messages. This can be used either as a pre-parser for urlview, or to replace urlview entirely.
Urlview is
a great program, but has some deficiencies. In particular, it
isn't particularly configurable, and cannot handle URLs that have
been broken over several lines in format=flowed
delsp=yes email messages. Also, urlview
doesn't eliminate duplicate URLs. This perl script handles all of
that. It also sanitizes URLs so that they can't break out of the
command shell.
This is designed primarily for use with the mutt emailer. The idea is that if you want to access a URL in an email, you pipe the email to a URL extractor (like this one) which then lets you select a URL to view in some third program (such as Firefox). An alternative design is to access URLs from within mutt's pager by defining macros and tagging the URLs in the display to indicate which macro to use. A script you can use to do that is tagurl.pl.
Mandatory (these usually come with Perl):
Optional:
--version and --listThis perl script expects a valid email to be piped in via STDIN. Its STDOUT can be a pipe into urlview (it will detect this). Here's how you can use it:
cat message.txt | extract_url.pl
OR
cat message.txt | extract_url.pl | urlview
For use with mutt 1.4.x, here's a macro you can use:
macro index,pager \cb "<enter-command> unset pipe_decode<enter><pipe-message>extract_url.pl<enter>" "get URLs"
For use with mutt 1.5.x, here's a more complicated macro you
can use:
macro index,pager \cb "<enter-command> set my_pdsave=\$pipe_decode<enter>\
<enter-command> unset pipe_decode<enter>\
<pipe-message>extract_url.pl<enter>\
<enter-command> set pipe_decode=\$my_pdsave<enter>" "get URLs"
Here's a suggestion for how to handle encrypted email:
macro index,pager ,b "<enter-command> set my_pdsave=\$pipe_decode<enter>\
<enter-command> unset pipe_decode<enter>\
<pipe-message>extract_url.pl<enter>\
<enter-command> set pipe_decode=\$my_pdsave<enter>" "get URLs"
macro index,pager ,B "<enter-command> set my_pdsave=\$pipe_decode<enter>\
<enter-command> set pipe_decode<enter>\
<pipe-message>extract_url.pl<enter>\
<enter-command> set pipe_decode=\$my_pdsave<enter>" "decrypt message, then get URLs"
message-hook . 'macro index,pager \cb ,b "URL viewer"'
message-hook ~G 'macro index,pager \cb ,B "URL viewer"'
It's not perfect, but it works for me.
If you're using it with Curses::UI (i.e. as
a standalone URL selector), this perl script will try and figure
out what command to use based on the contents of your
~/.urlview file. However, it also has its own
configuration file (~/.extract_urlview) that will be
used instead, if it exists. So far, there are eight kinds of lines
you can have in this file:
COMMAND ...%s, which will be
replaced by the URL inside single-quotes. If it does not
contain a %s, the URL will simply be appended to
the command. If this line is not present, the command is
assumed to be "open", which is the correct command
for MacOS X systems.SHORTCUTNOREVIEWPERSISTENTextract_url.pl will exit. If you would
like it to be ready to view another URL without re-parsing the
email (i.e. much like standard urlview
behavior), add this line to the config file.IGNORE_EMPTY_TAGSHTML_TAGS ...a, applet, area,
blockquote, embed, form,
frame, iframe, input,
ins, isindex, head,
layer, link, object,
q, script, and xmp tags
for links. If you would like it to examine just a subset of
these (e.g. you only want a tags to be examined),
merely list the subset you want. The list is expected to be a
comma-separated list. If there are multiple of these lines in
the config file, the script will look for the minimum set of
specified tags.ALTSELECT ...extract_url.pl will quit
after the URL viewer has been launched for the selected URL.
This key will then make extract_url.pl launch the
URL viewer but will not quit. However, if
PERSISTENT is specified in the config file, the
opposite is true: normal selection of a URL will launch the URL
viewer and will not cause extract_url.pl to exit,
but this key will. This setting defaults to 'k'.DEFAULT_VIEW {url|context}extract_url.pl shows a list of URLs.SHORTCUT COMMAND mozilla-firefox -remote "openURL(%s,new-window)" HTML_TAGS a,iframe,link ALTSELECT Q DEFAULT_VIEW context
All URLs have any potentially dangerous shell characters (namely a single quote and a dollar sign) removed (transformed into percent-encoding) before they are used in a shell. This should eliminate the possibility of a bad URL breaking the shell.
Here's what it looks like for a standard email:

If a URL is too big for your terminal, when you select it,
extract_url.pl will (by default) ask you to review
it in a way that you can see the whole thing. Here's what that
looks like:

ALTSELECT), allowing a person to, in
effect, temporarily negate the PERSISTENT
setting.multipart/alternative parts
don't actually have an alternative, which could fool this
script. Now they're handled correctly (normal text/plain
have 0 parts, according to the MIME parser).