
Stopping The Dark Side
Preventing Session SEO Spam attacks

by John Hughes
of http://www.oyster-web.co.uk
Last updated: 19 Dec 2006
In our previous article, The Dark Side Attacks, Chance Hoggan suggested a way that you could theoretically damage a competitor's search engine listings. Of course, we don’t condone such behaviour, but what would stop a competitor using such a technique against you?
Well, in short, there is nothing to stop a competitor trying this technique against you, but there is a way to protect yourself.
First of all, we have to say that Google is getting better at ignoring session ids in URLs, although it is far from comprehensive in doing so. We expect that before long Google will recognise session ids more often than not, even when they are spoofed as they are in the dark-side technique. Therefore, we expect the dark-side technique to have a relatively limited life-span.
In the meantime, however, here is a relatively simple way to protect your site from this kind of URL spamming. It also protects you from a wider version of the attack, which could use randomly made-up URL variables rather than just session ids.
Firstly, you need to list all of the URL variables that your site currently uses. For this example, let’s say your site uses the variables "page", "offset", and "results". Therefore a URL might look like:
http://www.site.com/product.php?page=45&offset=22&results=10
Using the dark-side technique, a competitor might try to duplicate your content on your own site by linking to an extended version of the URL, for example:
http://www.site.com/product.php?page=45&offset=22&results=10&PHPSESSID=h32b0832h893he94rp98473
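To make the difference between the two URLs concrete, here is a minimal sketch that lists any query variables in a URL that are not on your list. The function name `unexpectedVars` and the variable names `$allowed` and `$spamUrl` are our own for illustration; they are not part of the script described in this article.

```php
<?php
// Hypothetical helper (illustration only): return the names of query
// variables in $url that are not in the $allowed list.
function unexpectedVars($url, array $allowed)
{
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return array(); // the URL has no query string
    }
    parse_str($query, $vars); // decode the query string into an array
    return array_values(array_diff(array_keys($vars), $allowed));
}

// The example variables and URL used in this article.
$allowed = array("page", "offset", "results");
$spamUrl = "http://www.site.com/product.php?page=45&offset=22&results=10"
         . "&PHPSESSID=h32b0832h893he94rp98473";

print_r(unexpectedVars($spamUrl, $allowed)); // lists PHPSESSID only
```

Any name this returns is one your site never uses, so a request carrying it can safely be redirected to the clean URL.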
Next, create a file (we’re assuming PHP here, but similar principles apply in other languages). Let’s call the file "darksidestopper.php". This file should contain an array of all the variable names your site uses in URLs, and the following code to check whether the requested URL contains any others:
<?php
$urlVars = array("page", "offset", "results");
// replace this with your site variables
$newQS = "";
// this variable builds the real query string
foreach ($urlVars as $value) {
    // this loop checks the values of the
    // real query string against that requested
    if (isset($_GET[$value]) && is_string($_GET[$value])
        && strlen($_GET[$value]) > 0) {
        // urlencode() keeps the rebuilt string a valid URL
        if ($newQS == "") {
            $newQS = $value . "=" . urlencode($_GET[$value]);
        } else {
            $newQS .= "&" . $value . "=" . urlencode($_GET[$value]);
        }
    }
}
// $_SERVER['QUERY_STRING'] does not depend on register_globals
$requestedQS = isset($_SERVER['QUERY_STRING']) ? $_SERVER['QUERY_STRING'] : "";
if ($newQS != $requestedQS) {
    // this conditional redirects the file with a
    // 301 Redirect if the wrong query is requested
    header("HTTP/1.1 301 Moved Permanently");
    if (strlen($newQS) > 0) {
        header("Location: " . $_SERVER['SCRIPT_NAME']
            . "?" . $newQS);
    } else {
        header("Location: " . $_SERVER['SCRIPT_NAME']);
    }
    exit; // stop here so the page body is not sent with the redirect
}
?>
The code has the added benefit of ensuring that the variables in your URLs always appear in the order you specify in the array, because a request with the same variables in a different order is redirected to the one canonical URL. This solves another common problem with query-string driven websites and Supplemental Results in Google.
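The reordering effect is easiest to see in isolation. The following is a minimal sketch, not the script itself: `canonicalQueryString` is a hypothetical function of ours that applies the same rule to an ordinary array instead of `$_GET`, and the query string passed to `parse_str` is made up for the example.

```php
<?php
// Hypothetical pure-function variant of the check (illustration only).
// Builds a query string holding just the allowed variables, in the
// order given by $allowed, from the request parameters in $params.
function canonicalQueryString(array $allowed, array $params)
{
    $pairs = array();
    foreach ($allowed as $name) {
        if (isset($params[$name]) && is_string($params[$name])
            && $params[$name] !== "") {
            $pairs[] = $name . "=" . urlencode($params[$name]);
        }
    }
    return implode("&", $pairs);
}

$allowed = array("page", "offset", "results");

// The same variables in a different order, plus a spoofed session id.
parse_str("results=10&PHPSESSID=abc123&page=45&offset=22", $requested);

echo canonicalQueryString($allowed, $requested), "\n";
// prints "page=45&offset=22&results=10"
```

Because the result differs from the requested query string, the script would answer such a request with a 301 redirect to the canonical URL.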
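The final step is to make sure that you include the darksidestopper.php script at the very start of all your web pages, before any output is sent to the browser, because PHP can only send the redirect headers before the page content begins.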
ADDENDUM
Depending on the version and precise configuration of PHP you have installed, the above code might need tweaking to fit your circumstances. If it does not work, the most likely explanation is that the way PHP has configured server variables is slightly different (some people might say slightly wrong, but who am I to judge!). If this is the case, try using $_REQUEST or, on PHP versions before 5.4 only, $HTTP_GET_VARS instead of $_GET.






