The Apache server supports directory-level configuration via .htaccess files. A .htaccess file can override Apache's default configuration for the directory it lives in and for its subdirectories.

If you are not a lazy blogger, you may be interested in some tips I recently discovered for tuning your .htaccess file to improve your search engine position, cut down on spam comments and protect your content.

Redirection

Search engines see http://www.mapelli.info and http://mapelli.info as two different sites. This is bad for two reasons:

  1. search engines penalize sites with duplicated content, removing some (if not all) of the duplicated pages
  2. some sites will link to you as www.yoursite.com and others as yoursite.com, so your PageRank and your link popularity are split between the two addresses

To avoid this, you can simply redirect all requests from http://www.yoursite.com to http://yoursite.com (or vice versa) by adding a few directives to the .htaccess file in your web root.

Use the following code:

RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.domain\.com$ [NC]
RewriteRule ^(.*)$ http://domain.com/$1 [L,R=301]

Explanation:

RewriteEngine On

activates the rewrite engine (that is, the ability to change the requested URL to something else).

RewriteCond %{HTTP_HOST} ^www\.domain\.com$ [NC]

this says that the rewrite action (specified in the RewriteRule line) should be applied only if the host name of the request is exactly www.domain.com: ^ anchors the start, $ anchors the end, and the dots are escaped as \. so that they match a literal dot. The [NC] says to check in case insensitive mode.

RewriteRule ^(.*)$ http://domain.com/$1 [L,R=301]

this says that each request matching the RewriteCond should be rewritten as follows: capture the whole requested path (^ is the start, .* is any string, $ is the end) in the first variable (called $1), then rewrite it as http://domain.com/$1, redirect using 301 Moved Permanently (that’s the R=301) and stop applying rules from .htaccess (L for Last). This means a request directed to http://www.domain.com/foo will be redirected to http://domain.com/foo
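
For the opposite direction mentioned above (yoursite.com to www.yoursite.com), a minimal sketch could look like the following. It assumes the same placeholder host domain.com and plain http; adjust both to your site.

```apache
RewriteEngine On
# Only act when the host is the bare domain (placeholder name, case insensitive)
RewriteCond %{HTTP_HOST} ^domain\.com$ [NC]
# Send the same path to the www host with a permanent redirect, then stop
RewriteRule ^(.*)$ http://www.domain.com/$1 [L,R=301]
```

Use only one of the two directions; with both in place the rules would bounce every request back and forth.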

Spam Blocking

wp-comments-post.php protection

In WordPress, when a user posts a comment the file wp-comments-post.php is accessed.
A normal user posts the comment from one of your blog’s pages, so the request carries an internal referrer (i.e. the page that took the user to wp-comments-post.php).
A spammer accesses the wp-comments-post.php file directly, with no referrer or with an outside (not from your domain) referrer. You can use this difference to block spam comments via .htaccess. If you don’t use WordPress, change the file name to the one that fits your software; the technique still applies.
Here’s the code:

RewriteEngine On
RewriteCond %{REQUEST_METHOD} POST
RewriteCond %{REQUEST_URI} /wp-comments-post\.php
RewriteCond %{HTTP_REFERER} !yourdomain\.com [OR]
RewriteCond %{HTTP_USER_AGENT} ^$
RewriteRule (.*) http://%{REMOTE_ADDR}/ [R=301,L]

Here’s the explanation:

RewriteEngine On

activates the rewrite engine

RewriteCond %{REQUEST_METHOD} POST

if the request method is POST

RewriteCond %{REQUEST_URI} /wp-comments-post\.php

if the request URI (the page requested) contains /wp-comments-post.php (the dot is escaped as \. so that it matches a literal dot)

RewriteCond %{HTTP_REFERER} !yourdomain\.com [OR]

if the referrer does not contain your domain, or (the [OR] flag does an or with the next condition). Note that an empty referrer also counts as "not your domain", so visitors whose browser or proxy strips the Referer header will be blocked too.

RewriteCond %{HTTP_USER_AGENT} ^$

if the user agent is empty

RewriteRule (.*) http://%{REMOTE_ADDR}/ [R=301,L]

redirect the request back to the client’s own IP address (that’s the %{REMOTE_ADDR}) with a 301 and stop processing rules. You can put any other URL here instead, for example http://www.somesite.com/
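
If you would rather refuse the request than redirect it, a variant of the rule above can answer with 403 Forbidden. This is only a sketch: yourdomain.com is a placeholder, and the stricter referrer pattern (anchored at the start, http or https) is my assumption, not part of the original recipe.

```apache
RewriteEngine On
# Only look at comment submissions
RewriteCond %{REQUEST_METHOD} POST
RewriteCond %{REQUEST_URI} /wp-comments-post\.php
# Referrer must start with your own site (placeholder domain), or...
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?yourdomain\.com/ [NC,OR]
# ...the client sent no user agent at all
RewriteCond %{HTTP_USER_AGENT} ^$
# "-" means no substitution; F answers with 403 Forbidden
RewriteRule ^ - [F]
```

Anchoring the referrer pattern avoids matching outside URLs that merely contain your domain name somewhere in their path or query string.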

Tor servers blocking

The Tor network is a noble thing, but it’s often used by spammers to run spambots.
I would not recommend this, but if you really need to, you can block the entire network of Tor proxies using the Tor blacklist (just copy the content of the file into your .htaccess file).

Protection

IP banning

Say you want to block a spammer who always uses the same IP:

deny from 192.168.0.1

This denies access to 192.168.0.1. Note that you can also write a partial address such as 192.168.0 (with no trailing wildcard) to ban every address that starts with it, or a CIDR block such as 192.168.0.0/20 to ban a subnet by its mask.
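
The forms of the deny directive described above can be sketched as follows. The addresses are only examples, and the second block assumes Apache 2.4, where the Order/Allow/Deny directives are kept for compatibility (mod_access_compat) and Require is the newer syntax.

```apache
# Apache 2.2 style (the syntax used in this post)
# Block a single address
Deny from 192.168.0.1
# Block every address that starts with 192.168.0 (partial address)
Deny from 192.168.0
# Block a subnet in CIDR notation
Deny from 192.168.0.0/20
```

```apache
# Apache 2.4 style: allow everyone except one address
<RequireAll>
    Require all granted
    Require not ip 192.168.0.1
</RequireAll>
```

Pick one style per file; mixing the old and new directives in the same place can produce surprising results.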

Deny .htaccess access

This can be used to prevent access to the .htaccess file itself:

<Files .htaccess>
order allow,deny
deny from all
</Files>

This way all requests for the .htaccess file will return a 403 error code (Forbidden).
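
On Apache 2.4 the same protection can be written with the newer Require directive. This is a sketch that assumes mod_authz_core is available, which is the usual case on 2.4.

```apache
# Apache 2.4 syntax: refuse every request for the .htaccess file
<Files ".htaccess">
    Require all denied
</Files>
```

Many stock Apache configurations already deny access to files whose names start with .ht, so check your server's main configuration before adding this.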

Stop hotlinking

If you don’t want other sites to link directly to the images on your server, you can redirect png/jpg requests to a particular image (saying something like “this site is trying to steal my images”) with code like this:

RewriteEngine On
RewriteCond %{HTTP_REFERER} !^http://(www\.)?yourdomain\.com/.*$ [NC]
RewriteCond %{REQUEST_URI} !/thief\.jpg$ [NC]
RewriteRule .*\.(jpg|png)$ http://www.yourdomain.com/thief.jpg [R,NC,L]

Here’s the explanation:

RewriteCond %{HTTP_REFERER} !^http://(www\.)?yourdomain\.com/.*$ [NC]

this says that the rule should be applied if the referrer does not start with http://www.yourdomain.com/ or http://yourdomain.com/ (case insensitive)

RewriteCond %{REQUEST_URI} !/thief\.jpg$ [NC]

this says that the rule should not be applied to thief.jpg itself; without this condition the redirected request for thief.jpg would match the rule again and loop.

RewriteRule .*\.(jpg|png)$ http://www.yourdomain.com/thief.jpg [R,NC,L]

this says that requests ending with .jpg or .png (not case sensitive) should be redirected to www.yourdomain.com/thief.jpg and that this will be the last rule to be applied (the L flag).
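
Note that the referrer condition above also matches requests with no referrer at all (direct visits, or browsers and proxies that strip the header) and pages served over https. A more lenient sketch, assuming you are happy to answer hotlinkers with 403 Forbidden instead of a replacement image, might look like this; yourdomain.com is still a placeholder.

```apache
RewriteEngine On
# Let requests without any referrer through
RewriteCond %{HTTP_REFERER} !^$
# Anything else must come from your own site, over http or https
RewriteCond %{HTTP_REFERER} !^https?://(www\.)?yourdomain\.com/ [NC]
# Refuse matching image requests with 403 Forbidden ("-" means no substitution)
RewriteRule \.(jpe?g|png)$ - [F,NC]
```

Since nothing is redirected here, there is no replacement image to exclude and no risk of a redirect loop.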

francesco mapelli