I <3 Bots!
fuckthespam!

email spam

hiding the email address

The best way to avoid spam on your email address is, of course, not to give it to anybody; but you may want to be contacted, whether through your website or otherwise. The solutions I give here are fairly easy to use and should, for now, reduce the number of spam bots that can read your email address, while human visitors still can.

using an image with your email inside

This solution is really famous; lots of webmasters have decided to create little images with the email inside. We can see this on Facebook and so on.
The main problem is that it's usually not pretty, and if you change your email address you may need to redo some work.
One solution may be to use a remote script like this one, to which you pass the Base64 encoding (for example) of your email address.
The script is called like this:
		
<img src="./tools/string_image.php?email=bmFtZUBkb21haW4uY29t&black" />
		
		
This solution is okay, but OCR is getting better and faster. Even if bots don't do this yet, they should soon be able to extract the email from images.
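
As a rough sketch of how the Base64 value in the URL above could be produced, the following JavaScript encodes an address and builds the image URL. The function names encodeEmail and emailImageUrl are hypothetical; the string_image.php path and its parameters are simply copied from the example above. The sketch assumes an environment where btoa is available (browsers, recent Node.js) and falls back to Buffer otherwise.

```javascript
// Hypothetical helper: Base64-encode a plain ASCII email address.
function encodeEmail(email) {
  if (typeof btoa === "function") {
    return btoa(email); // browsers and recent Node.js versions
  }
  return Buffer.from(email, "latin1").toString("base64"); // older Node.js
}

// Build the image URL in the same shape as the example above.
// encodeURIComponent protects the "+", "/" and "=" that Base64 may produce.
function emailImageUrl(email) {
  return "./tools/string_image.php?email=" +
    encodeURIComponent(encodeEmail(email)) + "&black";
}
```

Keep in mind that Base64 is an encoding, not encryption: it only keeps the address out of plain sight in the page source.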

email address obfuscation

There are many ways to protect an email address, and obfuscation is one of them.

obfuscation using JavaScript

Obfuscation generally uses JavaScript (or any other client-side scripting technology) to have the browser reconstruct the email address dynamically. Of course, we are assuming that our users have the technology enabled.
Here is an example of the JavaScript obfuscation I use on my websites:
		
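// For the email name@domain.com
function obfuscate_email() {
  // the address is split into pieces so that it never appears whole in the source
  var email_start = "" + "name";
  var email_end = "." + "com";
  var email_middle = "dom" + "ain";
  // String.fromCharCode(64) is the "@" character
  return email_start + String.fromCharCode(64) + email_middle + email_end;
}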
		
		
You can also use a tool to pack the JavaScript for stronger obfuscation, as in the following code, which is the packed version of the previous JavaScript function:
		
eval(function(p,a,c,k,e,r){e=function(c){return c.toString(a)};if(!''.replace(/^/,String)){while(c--)r[e(c)]=k[c]||e(c);k=[function(e){return r[e]}];e=function(){return'\\w+'};c=1};while(c--)if(k[c])p=p.replace(new RegExp('\\b'+e(c)+'\\b','g'),k[c]);return p}('b 5(){1=""+"6";0="."+"4";2="9"+"7";8 1+3.a(c)+2+0}',13,13,'email_end|email_start|email_middle|String|com|obfuscate_email|name|ain|return|dom|fromCharCode|function|64'.split('|'),0,{}))
		
		
Then, you can insert your email like this:
		
<a href="javascript:void(0)" onclick="this.href='mailto:'+obfuscate_email();" onmouseover="window.status='mailto:'+obfuscate_email();return true;" onmouseout="window.status='';return true;">contact me</a>
		
		
Just remember that the most important thing is that your email address never appears in clear text in the HTML source code.

obfuscation using the CSS

Another way of hiding the email address is to reverse the display order of its letters. With a simple CSS line (unicode-bidi: bidi-override; direction: rtl;), you can reverse the order of the letters in a sentence. The same trick is possible in plain HTML using the bdo (bi-directional override) tag:
		
<bdo dir="rtl">moc.niamod@eman</bdo>
		
		
The browser will display name@domain.com once the text has been rendered.
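
To avoid reversing the address by hand, a small helper can do it. This is only a sketch: reverseString and bdoEmail are hypothetical names, and the helper assumes a plain ASCII address.

```javascript
// Reverse the characters of a string (fine for ASCII addresses).
function reverseString(text) {
  return text.split("").reverse().join("");
}

// Produce the bdo markup shown above for a given address.
function bdoEmail(email) {
  return '<bdo dir="rtl">' + reverseString(email) + "</bdo>";
}
```

Ideally the reversed markup is generated once, ahead of time, so that the readable address never has to appear in the page source.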

obfuscation using unicode mirroring characters

To get the same result as with the previous idea, we can use the special Unicode directional override characters (RLO and LRO): &#8238; and &#8237;. These characters reverse the display direction of the text, as in the following example:
		
&#8238;moc.niamod@eman&#8237;
		
		
This will also print name@domain.com.

obfuscation using encodings

We can also simply encode the characters into a different HTML representation. For example, the letter 'e' can be written in several ways in HTML. It can be 'e' itself, but you get the same result by encoding it as a decimal or hexadecimal HTML entity (note that you can also use URL encoding if you are writing a link). You can see the example below:
Letter   Decimal   Hexadecimal   URL encoding
e        &#101;    &#x65;        %65
So, for example, name@domain.com can be represented by:
		
&#x6e;&#x61;&#x6d;&#x65;&#x40;&#x64;&#x6f;&#x6d;&#x61;&#x69;&#x6e;&#x2e;&#x63;&#x6f;&#x6d;
		
		
There are a couple of websites providing email encoding techniques, such as Spam-Me-Not! and Mysterious Ways. I personally use h4k.in for encoding the variables and so on.
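
If you would rather not depend on an online tool, the entity encoding is easy to sketch in JavaScript. The function names below are hypothetical, and the sketch assumes plain ASCII input.

```javascript
// Encode every character as a hexadecimal HTML entity, e.g. "e" -> "&#x65;".
function toHexEntities(text) {
  var out = "";
  for (var i = 0; i < text.length; i++) {
    out += "&#x" + text.charCodeAt(i).toString(16) + ";";
  }
  return out;
}

// Encode every character as a decimal HTML entity, e.g. "e" -> "&#101;".
function toDecimalEntities(text) {
  var out = "";
  for (var i = 0; i < text.length; i++) {
    out += "&#" + text.charCodeAt(i) + ";";
  }
  return out;
}
```

Calling toHexEntities("name@domain.com") gives the same entity string as the example above.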

obfuscation using challenges

Finally, the most complex method, but also definitely the most secure, since it is the only one that doesn't rely on spam bots not understanding the technology yet.
A service like reCAPTCHA offers a special service for hiding emails. The idea is that you have to solve a CAPTCHA before you are able to see the email. The email never appears in clear text in your HTML pages. The only problem with this method so far (and, more generally, with CAPTCHAs) is usability. It's really annoying to solve CAPTCHAs and sometimes it's not even easy...

fight the spam bots, protect your website

Spam bots are bots that crawl your website and try to insert advertising into it. This section mostly concerns webmasters using popular software such as WordPress, phpBB, etc.

using a web service

A couple of web services are available to check the spam you may get on your website. Whether the service is free or not, it works the same way: when a user sends you a comment, a trackback or something similar, you send this data to the web service, which tells you whether it is spam or not. You can easily find these services by searching the web; a famous and free one is Akismet.

using a nonce to prevent basic automation

Some spam bots are really basic and target only a couple of web applications (with or without knowledge of specific vulnerabilities). There is an attack called Cross-Site Request Forgery (CSRF) which may be related to spam bots. This vulnerability occurs when a single request like the following is enough to perform an action:
	
http://domain.com/action.php
(POST)name=Viagra&new_email=name%40domain.com&comment_content=spam...
	
	
The problem here is that a single request is enough to perform an action on the website, in this case sending a comment. We call this a CSRF because if I make you execute this request under your session, you will be the one who sent the comment. The vulnerability is of course more powerful than just sending spam, and is currently one of the most dangerous and widespread.
The idea for protecting against this vulnerability is to force the user to have a two-step interaction with the website. The basic solution is to generate a unique token (a nonce) when you create the form; then, when the form is submitted, you verify that nonce on the server side. The following example is really simple but aims to explain how such an implementation should work:
	
<?php
	// the session must be started before $_SESSION can be used
	session_start();
	// just a way of generating something...
	function n($rand_value) {
		 return sha1(session_id() . $_SERVER['REMOTE_ADDR'] . $rand_value);
	}
	// generate a random value and store it in the SESSION
	// (random_int, PHP 7+, is harder to predict than mt_rand)
	$rand_value = random_int(1, PHP_INT_MAX);
	$_SESSION['RAND'] = $rand_value;
	$nonce = n($rand_value);
	// then, write the nonce in the HTML
?>
	
	
The generated HTML should look like this:
	
<form method="post" action="check.php">
	<input type="hidden" name="nonce" value="64dab0...8a39f" />
	<input type="text" name="username" />
	...
	
	
And when you check your form, you verify that the nonce is correct compared to the random value that you have in your session:
	
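<?php
	// the session must be started before $_SESSION can be used
	session_start();
	// just a way of generating something...
	function n($rand_value) {
		 return sha1(session_id() . $_SERVER['REMOTE_ADDR'] . $rand_value);
	}

	if (isset($_POST['submit'])
	&& isset($_POST['nonce']) && isset($_SESSION['RAND'])
	&& is_string($_POST['nonce'])
	// hash_equals compares the two strings in constant time
	&& hash_equals(n($_SESSION['RAND']), $_POST['nonce']))
	{
		// seems to be okay for the nonce
	}
?>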
	
	
With this, and if there is no flaw in the process, you can be sure that the user/bot actually came to your website and sent the form you wrote with the nonce. Of course a bot can do that... but not all of them.
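
The same flow can be sketched in JavaScript for a Node.js server. Everything here is an assumption made for illustration: the function names are hypothetical, session stands for whatever per-user session object your framework provides (only its id and rand properties are used), SHA-256 replaces the SHA-1 of the PHP example, and the random value is discarded after the first check so that a nonce cannot be replayed, which is an addition not present in the PHP example.

```javascript
const nodeCrypto = require("crypto");

// Derive the nonce from the session id, the client address and a random value,
// mirroring the n() function of the PHP example.
function computeNonce(sessionId, remoteAddr, randValue) {
  return nodeCrypto.createHash("sha256")
    .update(sessionId + remoteAddr + randValue)
    .digest("hex");
}

// Called when rendering the form: store a fresh random value in the session
// and return the nonce to write into the form.
function issueNonce(session, remoteAddr) {
  session.rand = nodeCrypto.randomBytes(16).toString("hex");
  return computeNonce(session.id, remoteAddr, session.rand);
}

// Called when the form is submitted: returns true only if the submitted nonce
// matches the one derived from the session. The random value is deleted after
// the first check, so each nonce can be used at most once.
function checkNonce(session, remoteAddr, submitted) {
  if (typeof submitted !== "string" || !session.rand) {
    return false;
  }
  const expected = Buffer.from(computeNonce(session.id, remoteAddr, session.rand));
  delete session.rand;
  const given = Buffer.from(submitted);
  // timingSafeEqual requires buffers of equal length, so check that first
  return given.length === expected.length &&
    nodeCrypto.timingSafeEqual(given, expected);
}
```

A handler would call issueNonce when it renders the form and checkNonce when it receives the POST, rejecting the submission if the check returns false.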

using a captcha to prevent automation

As with the email obfuscation method using a challenge, if you can stop the automation by asking the client to solve a challenge, you will most likely stop all bots for now. But keep in mind that a CAPTCHA can be weak, and that a human can solve it for money (or even solve it without knowing that it is your CAPTCHA), etc. So if your website is important and spammers really want to spam it, they will be able to.
It is really important to understand that a CAPTCHA cannot do everything, especially if you trust your users after asking them to solve a CAPTCHA only at registration: a human can register and then send you thousands of spam messages before you notice.

prevent bots with tricks

The main idea in this section is to prevent the bot from understanding how your system works; if it does understand, it will be able to send spam. We have different solutions, as in the email obfuscation section: creating the forms dynamically, shuffling input parameter names, using CSS to add an input that invalidates the client, and so on. Most of these solutions assume that the human has JavaScript/VBScript enabled and that the spam bot does not understand scripting or CSS.

create your forms dynamically

Here is the fact: if a spam bot that doesn't execute the JavaScript in your web page doesn't see the form, it will not be able to use it to send spam.
So, a possible solution is to use client-side scripting (JavaScript) to render your form. Let's look at this JavaScript:
		
// the placeholder element that will receive the form
var formPos  = document.getElementById("formPosID");
var postForm = document.createElement("form");
postForm.method = "post";
postForm.action = "http://domain.com/action.php";
postForm.setAttribute("name", "MyPostForm");

// create an input
var inputName = document.createElement('input');
inputName.type = "text";
inputName.name = "name";
// create the submit button
var inputSubmit = document.createElement('input');
inputSubmit.type = "submit";

// now add the inputs to the form, and the form to the DOM
postForm.appendChild(inputName);
postForm.appendChild(inputSubmit);
formPos.appendChild(postForm);
		
		
This script only creates a POST form which sends a name variable to http://domain.com/action.php; the only constraint is that your HTML page must contain a specific location to receive this form, such as:
		
<div id="formPosID"></div>
		
		

shuffle input parameter names, server-side solution

Some spam bots are almost smart. They are able to perform multi-step actions on your website. I once had to manage a phpBB 2.x forum where a lot of spam bots (~20 a day) were coming, registering, and posting their spam in random sections. I spent some time looking at their behavior: they waited a random time between actions, sometimes performed a random action (reading a random topic), and so on. It is therefore really difficult to block them by looking at their behavior.

The question now is how a bot is able to do that. You should know how such a bot works in order to block it. The spam bot has specific script locations to look at (post_comment.php and so on); this is built-in knowledge. If the bot targets phpBB forums or any other specific platform, it knows where the important scripts are, what the parameter names are and what they mean, so it is able to work with them.

A possible solution to this problem is to shuffle the input parameter names. The idea is not to do it once, but every time the page is loaded. There are many ways to perform such an operation, but here is a possible solution using PHP, assuming that you have to protect a form with only username and password fields. The scripts assume that everything goes well and shouldn't be used as they are on a real website.
		
<?php
	// the session must be started before $_SESSION can be used
	session_start();
	// just a way of generating something...
	function n($rand_value, $prefix) {
		 return sha1($prefix . session_id() . $_SERVER['REMOTE_ADDR'] . $rand_value);
	}
	// generating unique names and store the random value in the SESSION
	// (never 0, since 0 is used below to mark the value as already consumed)
	$rand_value = random_int(1, PHP_INT_MAX);
	$_SESSION['RAND'] = $rand_value;
	$passwordFieldName = n($rand_value, "pass");
	$usernameFieldName = n($rand_value, "user");
?>
		
		
When you receive the submitted values, you only need to reconstruct the names and look them up, with something like this:
		
<?php
	// the session must be started before $_SESSION can be used
	session_start();
	// just a way of generating something...
	function n($rand_value, $prefix) {
		 return sha1($prefix . session_id() . $_SERVER['REMOTE_ADDR'] . $rand_value);
	}
	// you got something
	if (isset($_POST['submit']) && (isset($_SESSION['RAND']) && $_SESSION['RAND'] !== 0)) {
		$rand_value = $_SESSION['RAND'];
		$passwordFieldName = n($rand_value, "pass");
		$usernameFieldName = n($rand_value, "user");

		if (isset($_POST[$passwordFieldName]) && isset($_POST[$usernameFieldName])) {
			// do your actions...
		}
		// set as 0 in order to block the next try of the bot
		// if it wants to flood you
		$_SESSION['RAND'] = 0;
	}
?>
		
		
This solution may be considered good, since it generates new names every time the page is reloaded. But it's true that it can be tough to add to existing software...

shuffle input parameter names, client-side solution

Server-side modifications can be really tough, since you may have to change the scripts, templates, etc. One way around this is to make the modification only in the template (HTML code).
As always, JavaScript or client-side scripting helps. The solution I chose is to reverse the names of the input parameters in the HTML code, then reverse them back with JavaScript before submission. Here is the source code:
		
<script>
String.prototype.reverse = function() { return this.split('').reverse().join(''); };
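function reverseNames() {
	// the collection of controls of the first form in the page
	var formElements = document.forms[0].elements;
	for (var i = 0; i < formElements.length; i++) {
		formElements[i].name = formElements[i].name.reverse();
	}
	// no explicit submit() is needed here: the browser submits the form
	// once the onsubmit handler returns
}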
</script>

<form method="post" action="check.php" onsubmit="reverseNames()">
	<label for="emanresu">&#8238;emanresu&#8237;</label> <input type="text" name="emanresu" />
	...
		
		
Basically, we plug the code onsubmit="reverseNames()" and the associated script into our usual form. As we also want to display username in our HTML, I use the reversing trick with the special characters.
When the form is submitted by the user, the reverseNames function is executed and reverses the names of the input parameters.

This solution is really simple and relies on bots not understanding this code; however, it is actually easy for them to spot that username or its reversed variant is in the web page. Another solution, then, would be not to reverse the names but to use something more complicated, such as Sven Vetsch's jsVigenere, which uses the Vigenère cipher.
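
To give an idea of what a Vigenère-based variant could look like, here is a minimal sketch. It is not the jsVigenere code or its API: the vigenere function is a hypothetical helper written for this illustration, and it assumes lowercase a-z field names and a lowercase a-z key.

```javascript
// Shift each lowercase letter by the matching key letter (Vigenère cipher).
// Characters outside a-z are copied unchanged and do not consume a key letter.
// Pass decrypt = true to reverse the operation.
function vigenere(text, key, decrypt) {
  var out = "";
  var k = 0; // index of the next key letter to use
  for (var i = 0; i < text.length; i++) {
    var code = text.charCodeAt(i) - 97; // 0..25 for a..z
    if (code < 0 || code > 25) {
      out += text.charAt(i);
      continue;
    }
    var shift = key.charCodeAt(k % key.length) - 97;
    if (decrypt) {
      shift = 26 - shift;
    }
    out += String.fromCharCode(97 + (code + shift) % 26);
    k++;
  }
  return out;
}
```

The template would contain the encrypted names, and the onsubmit handler would call vigenere(name, key, true) on each element instead of reversing it. Since the key has to be present in the page for the script to work, this only raises the bar for bots that do not execute JavaScript.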

dynamically insert an input parameter in the form

Even though this solution is really basic, it is the simplest one using JavaScript, and it seems to be efficient enough to stop all the spam bots I had on my phpBB 2 forum.
The idea is simply to use document.write to insert an input tag into the form. This input tag contains a key which will only be sent by clients that can execute JavaScript. Here is a basic implementation:
		
<script>
// the key I use here is "My_KEY"
function generateKey() {
	var middle = "_K";
	return "My" + middle + "EY";
}
</script>

<form method="post" action="check.php">
	<script>
	document.write("<input type='hidden' name='persoCaptcha' value='" + generateKey() + "' />");
	</script>
	<label for="username">username</label> <input type="text" name="username" />
	...
		
		
On the server side, you only need to verify that the persoCaptcha input was sent with the form and that its value is your "key". Another way of doing this is with the following code:
		
<script>
// generateKey() is the function defined in the previous example
var inputInserted = false;
function addInput() {
	var W3CDOM = (document.createElement);
	// do nothing without DOM support, or if the input was already added
	if (!W3CDOM || inputInserted)
		return;
	// create the hidden input
	var hiddenInput = document.createElement('input');
	hiddenInput.type = "hidden";
	hiddenInput.name = "persoCaptcha";
	hiddenInput.value = generateKey();
	// now add the input to the DOM.
	document.forms[0].appendChild(hiddenInput);
	inputInserted = true;
}
</script>

<form method="post" action="check.php">
	<label for="username">username</label> <input type="text" name="username" onfocus="addInput()"/>
	...
		
		
The two versions above give the same result.
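
The server-side verification can be sketched as follows. This is only an illustration under assumptions: hasValidKey is a hypothetical function, and fields is assumed to be a plain object holding the submitted form values, however your server provides them.

```javascript
// Same value as the one produced by generateKey() in the page.
var EXPECTED_KEY = "My_KEY";

// Returns true only if persoCaptcha was sent and carries the expected key.
function hasValidKey(fields) {
  return Object.prototype.hasOwnProperty.call(fields, "persoCaptcha") &&
    fields.persoCaptcha === EXPECTED_KEY;
}
```

A submission for which hasValidKey returns false most likely comes from a client that did not execute the JavaScript, and can be rejected.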

hiding an input using CSS

As a bot has no graphical representation of the page, it is not aware of the visual content unless this was part of its design (built-in OCR, etc.). So another solution for invalidating a bot that fills in a lot of content is to add a field to the HTML and hide it with CSS. If the hidden field has been changed, then a bot did it. Here is a simple example:
		
<form method="post" action="check.php">
	<input style="display:none" type="text" name="changeMeBot" value=""/>
	<label for="username">username</label> <input type="text" name="username" value=""/>
	...
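
On the server side, the check for this trap field could look like the sketch below. The function looksLikeBot is hypothetical, and fields is assumed to be a plain object holding the submitted values. Treating a missing field as suspicious is an extra assumption: a browser submitting the real form sends the field with an empty value.

```javascript
// Returns true when the trap field was filled in, or is absent altogether.
function looksLikeBot(fields) {
  var trap = fields.changeMeBot;
  // a real browser submits the hidden field with an empty string
  return typeof trap !== "string" || trap !== "";
}
```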
		
		

server-side bot detection using htaccess

Another type of protection I've seen uses a blacklist to detect the bots and redirect them to an error page. They are then unable to access your website at all.
Ronald van den Heetkamp has a list of them in his .htaccess. If you don't understand the syntax, here is an example with some explanation:
	
# Enable the rewrite engine
RewriteEngine On
# Writing the conditions
# If the User-Agent starts with libwww-perl, something common with the bots
RewriteCond %{HTTP_USER_AGENT} ^libwww-perl [OR]
...
# The last condition of the list must not carry the [OR] flag
RewriteCond %{HTTP_USER_AGENT} ^Python-urllib
# If the User-Agent has been detected, then we send the bot
# to the page /bot_spotted.php and stop processing further rules
RewriteRule ^(.*)$ /bot_spotted.php [L]
	
	
The main problem is that it's really easy to change the User-Agent when scripting, so spam bots may have changed theirs to look like a typical browser. This solution is still good for preventing the Google, MSN, and Yahoo! bots from crawling your website if you want.
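
If you cannot use .htaccess, the same blacklist idea can be sketched in application code. The names below are hypothetical, the two patterns are the ones from the example above, and the same caveat applies: the User-Agent is trivial to fake.

```javascript
// User-Agent prefixes taken from the .htaccess example above.
var USER_AGENT_BLACKLIST = [/^libwww-perl/, /^Python-urllib/];

// Returns true if the User-Agent string starts with a blacklisted prefix.
function isBlacklistedUserAgent(userAgent) {
  if (typeof userAgent !== "string") {
    return false;
  }
  return USER_AGENT_BLACKLIST.some(function (pattern) {
    return pattern.test(userAgent);
  });
}
```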