I came to the office this morning to discover that there were a lot of new referrers to posts on my weblog, people offering legal services, weight loss pills, discount travel, and other sorts of products relevant to software, business, Java, XML, and web services… I solved the problem, in the short term, by simply turning off the display of links to ostensibly referring sites.

Referrer spam isn't a new problem, and there is already plenty being said about it.

The heart of the motivation for referrer spammers, as well as the "heart of our [Google's] software", is PageRank, and ironically, most of the pages that are the targets of the spam referrer links have Google AdSense banners on them.
No Easy Way Out
It doesn't look like there is a naive solution. A quick look at the logs of the Apache proxy shows that blocking an IP address or range of IP addresses would be ineffective:

[prb /var/log/httpd] grep lawyer access_log | awk '{print $1}' \
| sort | uniq | wc -l
135

And this is because the spammers are using other people's machines to do their work:

[prb /var/log/httpd] grep optinpr access_log | awk '{print $1}' \
| sort | uniq | xargs -n 1 dig +short -x
cache-ra08.proxy.aol.com
...
resnet172-51.plymouth.edu
...
gateway102.gsi.gov.uk
...
firewall.bassett.org
...
cf.nhyoko.med.navy.mil
...
dialup-67.30.253.211.Dial1.Atlanta1.Level3.net
...

The majority of the addresses look like home DSL and cable modem users, which would normally cause me to attribute something like this to a network of compromised machines… but that's not the case.
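For anyone who would rather not chain grep, awk, sort, uniq, and wc, the same count of distinct client addresses can be sketched in Python. This is only an illustration: it assumes the client address is the first whitespace-separated field of each log line (as the awk '{print $1}' above does), and the sample lines use made-up documentation addresses and hostnames, not entries from my actual log.

```python
def distinct_clients(log_lines, keyword):
    """Return the set of client addresses on lines that contain keyword.

    Assumes the client address is the first whitespace-separated field,
    mirroring `grep keyword access_log | awk '{print $1}' | sort | uniq`.
    """
    clients = set()
    for line in log_lines:
        if keyword in line:
            fields = line.split()
            if fields:
                clients.add(fields[0])
    return clients


# Hypothetical log lines (documentation-only addresses and hostnames).
SAMPLE_LOG = [
    '192.0.2.10 - - [01/Jan/2005:10:00:00 +0000] "GET /space/start HTTP/1.1" 200 512 "http://lawyer.example/" "-"',
    '198.51.100.7 - - [01/Jan/2005:10:00:05 +0000] "GET /space/start HTTP/1.1" 200 512 "http://lawyer.example/" "-"',
    '192.0.2.10 - - [01/Jan/2005:10:00:09 +0000] "GET /space/about HTTP/1.1" 200 512 "http://lawyer.example/" "-"',
    '203.0.113.5 - - [01/Jan/2005:10:01:00 +0000] "GET /space/start HTTP/1.1" 200 512 "-" "-"',
]

if __name__ == "__main__":
    # Two distinct addresses sent the "lawyer" referrer in the sample.
    print(len(distinct_clients(SAMPLE_LOG, "lawyer")))  # prints 2
```

To run it against a real log, pass an open file object instead of the sample list, since the function only needs an iterable of lines.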
Too Bad
I would like to be able to provide links back to people who link to my blog, and the reasonable first guess is to validate that a link actually exists on the referring page. However, in these cases, there are links on the referring pages!

Loading one of the pages shows a block of HTML like so:

[...]
[...]

Now I'll have to think about a long-term solution...
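For reference, the first-guess check described above (does the referring page really contain a link to my site?) might look roughly like the sketch below. It is not what SnipSnap does. The function name, the hostnames, and the decision to match on hostname only are all assumptions made for illustration, and fetching the page is left out so the example stays self-contained. As noted, this check alone would not have stopped these spammers, because their pages do carry the links.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class _LinkHostCollector(HTMLParser):
    """Collect the hostnames that the <a href="..."> tags of a page point to."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.hosts = set()

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                try:
                    # Resolve relative links against the referring page's URL.
                    host = urlsplit(urljoin(self.base_url, value)).hostname
                except ValueError:
                    continue  # skip malformed URLs
                if host:
                    self.hosts.add(host)


def links_to_host(html, page_url, my_host):
    """Return True if the HTML of page_url contains a link to my_host."""
    collector = _LinkHostCollector(page_url)
    collector.feed(html)
    return my_host.lower() in collector.hosts


if __name__ == "__main__":
    # Hypothetical page content and hostnames.
    page = '<p>See <a href="http://blog.example.org/space/start">this post</a>.</p>'
    print(links_to_host(page, "http://referrer.example.com/", "blog.example.org"))
```

A stricter variant could require the link to point at the exact post that was requested, but that still only proves the link exists, not that the page is worth linking back to.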
Short-Term / Immediate Solution
For similarly afflicted users of SnipSnap 0.4.2, the place to shut down the noise is org.snipsnap.render.filter.links.BackLinks, and there is a JAR attached to this snip that you can use to patch the installation. (Add the one class in the JAR to snipsnap-servlets.jar.)

Although it's not a scalable solution, I've also modified the Apache proxy to deflect hits from undesired referrers:

RewriteMap deflector txt:/path/to/deflector.map

# Referrers mapped to "-" are redirected back to themselves.
RewriteCond %{HTTP_REFERER} !=""
RewriteCond ${deflector:%{HTTP_REFERER}} ^-$
RewriteRule ^.* %{HTTP_REFERER} [R,L]

# Referrers mapped to anything else are redirected to the mapped URL.
RewriteCond %{HTTP_REFERER} !=""
RewriteCond ${deflector:%{HTTP_REFERER}|NOT-FOUND} !=NOT-FOUND
RewriteRule ^.* ${deflector:%{HTTP_REFERER}} [R,L]

(Snipped from the Apache docs.)
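To make the behaviour of those rules easier to follow, here is a small Python model of the decision they make. It is a sketch of my reading of the rules, not Apache's implementation. It assumes the usual txt: map layout of one "key value" pair per line with # comment lines, and the map entries shown are hypothetical, not my real deflector.map.

```python
# Hypothetical deflector.map contents: referrer URL, whitespace, then either
# "-" (bounce the request back to the referrer) or a URL to redirect to.
SAMPLE_MAP = """\
# referrer                            target
http://spam.example.com/bad.html      -
http://spam.example.net/              http://elsewhere.example.org/
"""


def parse_map(text):
    """Parse 'key value' lines into a dict, skipping blanks and # comments."""
    mapping = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split()
        if len(parts) >= 2:
            mapping[parts[0]] = parts[1]
    return mapping


def deflect(referer, mapping):
    """Return the redirect target for a request, or None to serve it normally."""
    if not referer:
        return None          # both rule sets require a non-empty Referer
    target = mapping.get(referer)
    if target is None:
        return None          # referrer not listed: no redirect
    if target == "-":
        return referer       # first rule set: send the hit back where it came from
    return target            # second rule set: redirect to the mapped URL


if __name__ == "__main__":
    mapping = parse_map(SAMPLE_MAP)
    print(deflect("http://spam.example.com/bad.html", mapping))
    print(deflect("http://spam.example.net/", mapping))
    print(deflect("http://friend.example.org/", mapping))
```

The lookup is an exact match on the whole Referer string, which is part of why this approach doesn't scale: every spam URL has to be listed individually.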