Wednesday, December 02, 2009

Browsing as Googlebot to circumvent paywalls

An interesting technique from a Slashdot article, posted by commenter darthflo:

Most 'papers like Google and the visitors Google sends them; so the Google Bot and hits with a Referer tend to get a free pass. Use this to your advantage:

  • Google the Article's URI, click the link and off you go (with a real Google referer).
  • If it's not indexed yet and you're using Opera: Go to any Google page, press Ctrl + U, change any one link's href to the article's URI, click "Save Changes", click the link and off you go (with a fake Google referer). This works for any fake referer, by the way.
  • If they're picky, they mightn't let hits from Google through but still allow the Google bot to index their pages. Change your User-Agent accordingly. In Firefox, go to about:config and change general.useragent.extra.firefox to Googlebot/2.1 and off you go (as Googlebot).
  • As a last resort, there are quite a few ad-blocking personal proxies out there. Most of them allow you to fake Referers or change User-Agents, for any browser.
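
All of the tricks above come down to two HTTP request headers, Referer and User-Agent. As a rough illustration only, here is a minimal Python sketch that builds (but does not send) a request carrying both. The helper name build_request, the example.com article URL, and the Google search referer are made up for the example, and the full Googlebot User-Agent string is an assumption on my part, since the comment only says "Googlebot 2.1". Sites that verify crawlers by other means, such as reverse DNS lookups, would not be fooled by headers alone.

```python
import urllib.request

# Assumed full form of the Googlebot User-Agent; the original comment
# only mentions "Googlebot 2.1".
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"


def build_request(url, referer=None, user_agent=None):
    """Build a Request object with optional Referer and User-Agent headers.

    Hypothetical helper for illustration; nothing is sent over the network.
    """
    headers = {}
    if referer is not None:
        headers["Referer"] = referer
    if user_agent is not None:
        headers["User-Agent"] = user_agent
    return urllib.request.Request(url, headers=headers)


# Placeholder URLs, not taken from the original post.
req = build_request(
    "http://example.com/article",
    referer="http://www.google.com/search?q=example",
    user_agent=GOOGLEBOT_UA,
)

# Show the headers that would accompany the request.
for name, value in req.header_items():
    print(name + ": " + value)
```

Passing the object to urllib.request.urlopen(req) would actually perform the request. Note that urllib normalises header names internally, so the User-Agent header is stored under the key "User-agent".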