
Sometimes I Wish That It Would Rain Here

Wednesday, January 21, 2009

fabulous

I recently came across some brilliant creative work: spam poetry. these are poems constructed entirely from the subject lines of spam emails the poet receives. the dates on the website suggest the work is a couple of years old, but it's new to me.

here are two of my favorite stanzas. the first is from "Sometimes I worry":

grammarian and under
funded
education is dead
in America.

I wonder if the poet looks for subject lines that include punctuation, like that ending period, or if she adds punctuation as needed. the other one I like is from "全国総合出会いセンターよりお知らせです。":

To everybody in Columbus
An apology -
The cookies are coming....

RUN FOR YOUR LIFE!!!!!!!!!!!!!

possibly the best poem currently on the front page is "Your secret?" which must be read in its entirety to appreciate it fully.

while there are myriad reasons that this work is fabulous, rather than engaging in an intellectual diatribe (and since I've still much more data to analyze today), I'll just leave it for you to enjoy.


Monday, January 12, 2009

an intentional error?

while looking up email addresses for some faculty in my department, I noticed an interesting anomaly. if I simply click on the faculty member's email address, it opens my email client with a new email properly addressed. however, if I right click, select "Copy Email Address," then paste it in the To field for a new email, there is a leading "%20", the URL percent-encoding for a space. I initially thought this might be just a typo, that there was an extra space in the mailto link, but the characters appear on all the email addresses. perhaps this is some sort of counter-spam effort.

looking closer at the HTML source, it does seem to be counter-spam. for example, the link to email Richard Taylor shows up as follows:

<a href="'&#109;&#097;&#105;&#108;&#116;&#111;&#058;%20taylor&#64;ics&#46;uci&#46;edu'">taylor&#64;ics&#46;uci&#46;edu</a>

to the average human eye, this looks mostly like a bunch of mumbo-jumbo, but effectively it's encoding the mailto link using HTML numeric character references, swapping characters for their ASCII code points (&#64; is "@" and &#46; is "."). thus, if someone looks directly at the HTML source, as most spam harvesters do, it doesn't look like an email address. even if someone does decode those references and try to automatically harvest it, there's still an extra encoded space (%20) in front of the address. however, when the browser renders it, everything looks normal, and you can even click on the address and your email client will automatically remove the extra leading space for you.
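to make the encoding concrete, here's a minimal Python sketch of how a page generator might produce that kind of link. this is only my guess at the scheme, based on the one link above; the function names (to_reference, obfuscate_text, obfuscate_href, build_link) are hypothetical and not anything the department actually uses, and I've left out the stray single quotes that appear inside the real href.

```python
def to_reference(ch, width=0):
    """Return the decimal HTML character reference for one character,
    optionally zero-padded (e.g. "a" -> "&#097;" with width=3)."""
    return "&#" + str(ord(ch)).zfill(width) + ";"


def obfuscate_text(address):
    """Encode only "@" and "." the way the visible link text does."""
    return "".join(to_reference(c) if c in "@." else c for c in address)


def obfuscate_href(address):
    """Encode every character of "mailto:", then add the decoy "%20"."""
    scheme = "".join(to_reference(c, width=3) for c in "mailto:")
    return scheme + "%20" + obfuscate_text(address)


def build_link(address):
    """Assemble an anchor tag from the two obfuscated pieces."""
    return '<a href="{}">{}</a>'.format(
        obfuscate_href(address), obfuscate_text(address)
    )


if __name__ == "__main__":
    print(build_link("taylor@ics.uci.edu"))
```

the zero padding is there only because the original writes "a" as &#097; and ":" as &#058;; browsers accept the references with or without it.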

what I don't quite understand is how or whether this actually impedes spam harvesters. if a browser can render the above coding into a meaningful email address, why can't an email-harvesting bot do the same? do most harvesting bots just go for low-hanging fruit rather than trying to decode obfuscated email addresses? is it just a matter of adding one more layer of resistance? or is there something intrinsically difficult about having a bot resolve the above HTML to a meaningful email address?
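for what it's worth, decoding doesn't look intrinsically hard. here's a rough Python sketch using only the standard library (html.unescape for the character references, urllib.parse.unquote for the %20). the EMAIL_RE pattern and the harvest function are my own simplified assumptions, not how any real harvester works, and the pattern is far looser than a proper address grammar.

```python
import html
import re
from urllib.parse import unquote

# The obfuscated link, copied from the page source.
SNIPPET = (
    '<a href="\'&#109;&#097;&#105;&#108;&#116;&#111;&#058;'
    '%20taylor&#64;ics&#46;uci&#46;edu\'">'
    'taylor&#64;ics&#46;uci&#46;edu</a>'
)

# Deliberately simple pattern for things that look like addresses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def harvest(markup, decode_percent=True):
    """Decode character references (and optionally percent-escapes),
    then return the unique address-like strings found, sorted."""
    decoded = html.unescape(markup)
    if decode_percent:
        decoded = unquote(decoded)
    return sorted(set(EMAIL_RE.findall(decoded)))


if __name__ == "__main__":
    print(harvest(SNIPPET, decode_percent=False))
    print(harvest(SNIPPET))
```

with percent-decoding turned off, the pattern also picks up "%20taylor@ics.uci.edu" from the href, which may be exactly the junk the extra space is meant to feed a lazy bot. with both decoding steps, only the clean address is left, so in this sketch at least the obfuscation costs a bot two library calls.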
