WordPress Duplicate Content Prevention with robots.txt
To prevent the duplicate content that Google may index for a WordPress site, whichever version of WordPress you use, here is typical content for the robots.txt file:
User-agent: *
Disallow: /comments/feed/
Disallow: /feed/
Disallow: /feed/atom/
Disallow: /feed/rss/
Disallow: /rss/
Disallow: /trackback/
Disallow: /wp/
Disallow: /*/comments/feed/$
Disallow: /*/feed/$
Disallow: /*/feed/atom/$
Disallow: /*/feed/rss/$
Disallow: /*/rss/$
Disallow: /*/trackback/$
The above code is based on the following assumptions:
- The WordPress address (URL) is in the form http://www.yoursite.com/wp
- The robots.txt file is located at http://www.yoursite.com/robots.txt
- The above robots.txt file content is to be used with the AdFlex WordPress theme
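To get a feel for which URLs these rules cover, here is a minimal Python sketch. It assumes Google-style matching, where `*` matches any run of characters, a trailing `$` anchors the end of the path, and a rule without `$` is a prefix match. Other crawlers may interpret the wildcard lines differently, and the helper names `rule_to_regex` and `is_blocked` as well as the sample paths are only illustrative.

```python
import re

# Disallow rules copied from the robots.txt above.
DISALLOW_RULES = [
    "/comments/feed/",
    "/feed/",
    "/feed/atom/",
    "/feed/rss/",
    "/rss/",
    "/trackback/",
    "/wp/",
    "/*/comments/feed/$",
    "/*/feed/$",
    "/*/feed/atom/$",
    "/*/feed/rss/$",
    "/*/rss/$",
    "/*/trackback/$",
]


def rule_to_regex(rule):
    """Translate one Disallow pattern into a compiled regex.

    Assumed semantics: '*' matches any run of characters, a trailing '$'
    anchors the end of the path, and without '$' the rule is a prefix match.
    """
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))


COMPILED_RULES = [rule_to_regex(rule) for rule in DISALLOW_RULES]


def is_blocked(path):
    """Return True if any Disallow rule matches the URL path."""
    return any(regex.match(path) for regex in COMPILED_RULES)


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise the rules.
    for path in ["/feed/", "/2007/05/my-post/feed/", "/2007/05/my-post/"]:
        print(path, "blocked" if is_blocked(path) else "allowed")
```

Running a few of your own permalinks through `is_blocked` before uploading the file is a cheap way to confirm that the posts themselves stay crawlable while their feed and trackback variants do not.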

Hey VK,
Another question. I added the lines to my robots.txt a few days ago, but I haven’t seen any changes yet.
How long does it take to be effective?
Do I need to add other things somewhere else?
Mike
Hi Mike,
Google and the other search engines take it into account the next time their spiders visit your site.
Now, if there are already items in your SERPs that you want to remove, you’ll have to use the URL removal tools provided by Google, Yahoo and MSN.
I suppose you’re mainly interested in Google, as it’s currently the only search engine that may give you duplicate content problems when similar items are in your SERPs.
For Google, you need to use the Google URL Remover tool (also referred to as the Google Automatic URL Removal System). It’s located at: http://services.google.com/urlconsole/controller
This tool is not as robust as other Google tools you may know, but it does the job in most cases. It is down from time to time. It’s also not as secure as other Google tools, as it is accessible over HTTP and not HTTPS.
The process is simple:
1) After you’ve added the content to exclude in your robots.txt file, connect to the URL:
http://services.google.com/urlconsole/controller
2) There, just create an account to use the tool. You’ll get an e-mail that you’ll need to confirm.
3) After confirming the e-mail for the account creation, just connect to the Google URL Remover service.
4) Select the option “Remove pages, subdirectories or images using a robots.txt file.”
5) In the field “URL to your robots.txt”, type the URL to your robots.txt. For instance: http://www.yoursite.com/robots.txt
6) Click on the button “Remove Pages”
After about 2 days, you can check whether Google has processed the URL removal request by running a Google site command. The site command means you do a Google search for the following string:
site:www.yoursite.com
Of course, you have to replace www.yoursite.com with your own site name, but don’t put http:// in front of it.
You’ll see that the unwanted URLs that were in your SERPs have been removed, and you’ll get much cleaner SERPs when running the Google site command.
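If you check several sites, building the site command by hand gets tedious. Here is a small Python sketch that strips the http:// part and builds the search string. The function names are made up for illustration, and the Google search URL format used in `google_search_url` is an assumption, not something documented by the removal tool.

```python
from urllib.parse import quote_plus, urlsplit


def site_query(site):
    """Return the 'site:' search string for a site, dropping any scheme and path."""
    # urlsplit only fills in netloc when a scheme or '//' is present,
    # so prepend '//' for bare host names such as 'www.yoursite.com'.
    parts = urlsplit(site if "//" in site else "//" + site)
    return "site:" + parts.netloc


def google_search_url(site):
    """Build a search URL for the site command (URL format is an assumption)."""
    return "https://www.google.com/search?q=" + quote_plus(site_query(site))


if __name__ == "__main__":
    print(site_query("http://www.yoursite.com/wp"))
    print(google_search_url("www.yoursite.com"))
```

Both calls accept the address with or without http://, so you can paste the WordPress address as-is.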
Now some notes:
a) - Be really sure to put the right content in the robots.txt file, because when you use the Google Automatic URL Removal System, the matching URLs are removed from the Google index for 6 months!
b) - As said previously, the Google Automatic URL Removal System at http://services.google.com/urlconsole/controller is often down. If that’s the case, you can use the alternate URL http://services.google.com:8882/urlconsole/controller, which is up more often.
This second URL uses port 8882, which means you may not be able to access it from behind a company firewall that blocks such non-standard port numbers.
Of course, a direct connection to the Internet will work fine most of the time.
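If you are not sure whether your network lets you out on port 8882, a quick TCP check can tell you before you blame the tool. This is only a rough Python sketch: the function name `port_reachable` and the 5 second timeout are my own choices, and a successful connection only shows that the port is open from where you are, not that the removal tool itself is working.

```python
import socket


def port_reachable(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike.
        return False


if __name__ == "__main__":
    # Host and port taken from the alternate URL mentioned above.
    print(port_reachable("services.google.com", 8882))
```

If this prints False from the office but True from home, the firewall is the likely culprit.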
Hope it helps.
Cheers,
VK
How big is the problem with duplicate content in WordPress? I’ve never seen any huge problems with it.
It’s really not so much a problem as something that holds your site back from getting higher rankings, more targeted visitors, and fewer supplementals.
Check this new updated version:
http://www.askapache.com/seo/updated-robotstxt-for-wordpress.html
Nice WordPress tips. I’d never thought about duplicates like that.