
Wasabi … Perl and RFC2047. 5th of March, 2006 POST·MERIDIEM 03:32

I saw http://www.imdb.com/title/tt0281364/ (‘Wasabi: The Japanese Dip That Kicks Like a Mule’) last night, a Bittorrented copy on the recommendation of random internet people, and my, was the dubbing terrible. (“It’s not the former West?” Of course it’s not, Mr. Harry-Callahan’s-boss-transplanted-to-Paris.) The film itself wasn’t bad; very kitsch, and Jean Reno wasn’t as sympathetic as I’ve found him in the past, but his daughter (in the film) is hot, and the plot’s occasional extremely ridiculous touches are diverting.

Here’s some Perl you can use with MIME::Parser to transform your email headers from the ugly RFC 2047 mess of inline charsets plus quoted-printable or base64 into UTF-8, praise be on its name. The MIME::Parser people just punted, bless their monolingual hearts.

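use strict;
use warnings;
use MIME::Words qw/decode_mimewords/;
use Text::Iconv;
use Carp;

# Cache of Text::Iconv handles, keyed by lower-cased source charset.
my $iconv_cache_to_utf_8 = {};

# Decode an RFC 2047 header value and return it as UTF-8 octets.
sub decode_mime_to_utf_8 {
    my $enmimed = shift;
    my $concatted = "";

    # In list context, decode_mimewords returns [DATA, CHARSET] pairs;
    # CHARSET is undefined for text that wasn't an encoded word.
    for (decode_mimewords($enmimed)) {
        my ($data, $from_charset) = @$_;

        unless (defined $from_charset) {
            $concatted .= $data;
            next;
        }
        $from_charset = lc $from_charset;

        # The MIME name for the character set isn't supported on my machine.
        $from_charset = "cp1252" if $from_charset eq "windows-1252";

        unless (exists $iconv_cache_to_utf_8->{$from_charset}) {
            # Trap a die as well as an undef return for an unsupported charset.
            $iconv_cache_to_utf_8->{$from_charset} =
                eval { Text::Iconv->new($from_charset, 'utf-8') };
            Carp::carp "Couldn't get a Text::Iconv handle for $from_charset"
                unless defined $iconv_cache_to_utf_8->{$from_charset};
        }

        my $iconv = $iconv_cache_to_utf_8->{$from_charset};
        my $converted = defined $iconv ? $iconv->convert($data) : undef;

        # Fall back to the raw bytes if there's no handle or conversion failed.
        $concatted .= defined $converted ? $converted : $data;
    }
    return $concatted;
}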
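
If Text::Iconv isn't available on your machine, something similar can probably be had from the core Encode module, which ships a 'MIME-Header' encoding. What follows is a minimal sketch of that alternative, not the approach used above: the helper name header_to_utf8_bytes and the sample subject line are made up for illustration, and I'm assuming the Encode that comes with any reasonably recent Perl.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper name, not part of the code above.
sub header_to_utf8_bytes {
    my ($raw_header) = @_;

    # 'MIME-Header' decodes RFC 2047 encoded words into a Perl
    # character string, whatever charset each word declared.
    my $chars = decode('MIME-Header', $raw_header);

    # Encode explicitly so the caller gets UTF-8 octets back,
    # matching what decode_mime_to_utf_8 returns.
    return encode('UTF-8', $chars);
}

# Made-up sample header value: one Latin-1 encoded word plus plain text.
my $subject = '=?ISO-8859-1?Q?Caf=E9?= menu';
my $utf8 = header_to_utf8_bytes($subject);
print "$utf8\n";
```

The trade-off is that Encode decides which charset names it recognises, so an alias workaround like the cp1252 one above may still be wanted for odd names.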
