
It always fascinates me that nobody ever brings up regular expressions in these kinds of discussions.

Spammers literally sit around all day figuring out ways to deliver more spam. I'm fairly certain they've spent the 30 minutes it takes to craft a regular expression that harvests the easy 80% of these 'obfuscated' e-mails.



Sending spam is a means; the goal is to have people click on ads, buy stuff, download malware, etc.

The distribution of the probability that users click an ad is highly skewed. I suspect that it has probabilities close to zero for people who use "these 'obfuscated' e-mails".

If that is the case, spamming those users does not make economic sense.


Sure, it's a means.

So is scraping: a means to sell big lists of e-mail addresses to people who buy them for many thousands of dollars, with little regard for where they came from, only that they receive e-mail.


I was going to come in to post this. Regular expressions are the first thing I think of when it comes to this, and making a pattern to match [AT] instead of @ is just a few keystrokes away. Make up a few permutations of popular replacements and you've just defeated 90% of email obfuscation in 15 minutes.

And yet I see seasoned programmers who should ostensibly know a little about regex using this kind of obfuscation all the time!
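
For anyone who wants to see how little effort that takes, here is a rough sketch of the kind of pattern described above. Everything in it is an assumption for illustration: the names (`harvest`, `AT`, `DOT`) are made up, and the only replacements it covers are bracketed "at" and "dot" tokens such as [AT], (at), [dot]. It is not taken from any real harvester.

```python
import re

# Hypothetical building blocks; the set of replacement tokens is an assumption.
LOCAL = r"[A-Za-z0-9._%+-]+"                    # part before the @
LABEL = r"[A-Za-z0-9-]+"                        # one domain label
AT = r"(?:@|\s*[\[({]\s*at\s*[\])}]\s*)"        # @, [at], (at), {at}
DOT = r"(?:\.|\s*[\[({]\s*dot\s*[\])}]\s*)"     # ., [dot], (dot), {dot}

# local part, an "at" token, then two or more labels joined by "dot" tokens
PATTERN = re.compile(rf"({LOCAL}){AT}({LABEL}(?:{DOT}{LABEL})+)", re.IGNORECASE)
DOT_RE = re.compile(DOT, re.IGNORECASE)


def harvest(text):
    """Return normalized, lowercased addresses found in text."""
    found = []
    for match in PATTERN.finditer(text):
        local, domain = match.group(1), match.group(2)
        # Collapse every "dot" token in the domain back into a literal dot.
        domain = DOT_RE.sub(".", domain)
        found.append(f"{local}@{domain}".lower())
    return found


print(harvest("Mail alice [AT] example [DOT] com or bob(at)mail.example.org."))
# → ['alice@example.com', 'bob@mail.example.org']
```

It deliberately skips unbracketed forms like "alice at example dot com", since those would also match ordinary sentences; catching them is a matter of adding a few more alternatives and filtering the false positives.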


"I don't have to outrun the bear, just you."

So long as you're in the other 20%, you'll probably do well.



