09.09.08

Robots.txt File Disallowed Pages Still Gain PageRank

By Bill Hartzer

In a previous blog post, I talked about duplicate content and search engine optimization, and why it is important to fix duplicate content.

I personally prefer to completely remove all internal links to duplicate pages rather than adding a "disallow" rule for them in the robots.txt file. Why?

According to Matt Cutts of Google, even if you stop the crawlers from crawling a web page, that web page can still accrue PageRank. Let's take a look at what Matt Cutts said in this old interview:

Now, robots.txt says you are not allowed to crawl a page, and Google therefore does not crawl pages that are forbidden in robots.txt. However, they can accrue PageRank, and they can be returned in our search results.

It is important to note that even if a web page is not allowed to be crawled by the search engine, it can still show up in the search results. An example of this would be a web page that has external links (links from another web page pointing to it). In that case, even though Google is told not to crawl the page, it can still show up in the search results.

It is also important to note that even if a web page is not allowed to be crawled by the search engine, it can still accrue PageRank, and it can still pass PageRank to another web page. In fact, even if the page cannot be crawled (it is disallowed in the robots.txt file) and even if a "noindex" meta tag is added to the page, the page can still accrue PageRank and it can still pass PageRank.
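To make the "disallow" behavior concrete, here is a minimal Python sketch that uses the standard library's robots.txt parser. The rules, the example.com URLs, and the is_crawlable helper are assumptions made up for illustration; they are not taken from any real site. The sketch only shows whether a well-behaved crawler would be permitted to fetch a URL. It says nothing about whether the URL can appear in search results or accrue PageRank.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules for a blog; the paths are illustrative only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /archives/
Disallow: /tag/
"""


def is_crawlable(url, user_agent="Googlebot"):
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/archives/2008/09/",
                "https://example.com/my-post/"):
        status = "crawlable" if is_crawlable(url) else "blocked"
        print(url, "->", status)
```

A URL reported as blocked here can still be linked to from other sites, which is how it can end up in the search results anyway.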

If you do not want a web page to pass PageRank, then you need to use the nofollow attribute on its links.
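As a rough way to check where nofollow is missing, the following Python sketch lists the links in a piece of HTML that do not carry rel="nofollow". The class name, the helper function, and the sample markup are all hypothetical and exist only for this example.

```python
from html.parser import HTMLParser


class FollowedLinkFinder(HTMLParser):
    """Collect href values of <a> tags that lack rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.followed = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr = dict(attrs)
        href = attr.get("href")
        # rel may hold several space-separated tokens, e.g. "nofollow noopener".
        rel_tokens = (attr.get("rel") or "").lower().split()
        if href and "nofollow" not in rel_tokens:
            self.followed.append(href)


def followed_links(html):
    """Return the hrefs of all links in html that are not nofollowed."""
    finder = FollowedLinkFinder()
    finder.feed(html)
    return finder.followed


# Hypothetical sidebar markup: one nofollowed archive link, one normal tag link.
SAMPLE = """
<a href="/archives/2008/09/" rel="nofollow">September 2008</a>
<a href="/tag/seo/">SEO</a>
"""

print(followed_links(SAMPLE))  # prints ['/tag/seo/']
```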

A while back, my friend Aaron Wall talked about getting your blog out of Google's Supplemental index, which, in my opinion, has to do with duplicate content on your blog. Unfortunately, the default installation of most blog software and most blog themes creates all sorts of duplicate content. If you want your blog posts to rank well in the search engines, you need to look at removing the duplicate content from your blog. The archives and tags pages of your blog tend to be duplicate content, and you need to take care of them.

There are two ways to do this:

1. Remove the links to the duplicate content
2. Disallow them in the robots.txt file

If you choose option two, which is to simply "disallow" the duplicate content from being crawled, those pages can still accrue and pass PageRank. If you disallow your blog archives pages from being crawled, you also need to make sure that you add noindex to the pages. Also, make sure that you add the nofollow attribute to the appropriate links.

I prefer to remove links to the archives and the tags and any other pages that I believe are duplicates, so that they cannot be crawled and people won't link to them.

So, even if you plan a strategy to "optimize" your blog by disallowing pages on your site in the robots.txt file, it is important to consider the fact that those pages can still accrue and pass PageRank on to other web pages. If someone can still "get to" that web page in their web browser, you might also consider adding a noindex meta tag to the page and adding the nofollow attribute to the appropriate links on your site.
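The noindex part of this advice can be checked with a short Python sketch like the one below, which looks for a robots meta tag containing noindex. The class name, the has_noindex helper, and the sample page are assumptions for illustration only. One thing to keep in mind: a crawler that honors a Disallow rule does not download the page, so it would not read this tag, which means a check like this is most useful for pages that are still crawlable.

```python
from html.parser import HTMLParser


class RobotsMetaReader(HTMLParser):
    """Gather the directives found in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, follow".
            for token in content.split(","):
                token = token.strip().lower()
                if token:
                    self.directives.add(token)


def has_noindex(html):
    """Return True if the page carries a robots meta tag with noindex."""
    reader = RobotsMetaReader()
    reader.feed(html)
    return "noindex" in reader.directives


# Hypothetical archive page head, used only to exercise the helper.
ARCHIVE_PAGE = """
<html><head>
<title>Archives: September 2008</title>
<meta name="robots" content="noindex, follow">
</head><body></body></html>
"""

print(has_noindex(ARCHIVE_PAGE))
```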


About the Author:
Bill Hartzer manages the Search Engine Marketing and Social Media Marketing team at Vizion Interactive, a leading search engine marketing, social media marketing, and web design firm based in the Dallas, Texas area. Hartzer recently joined Vizion Interactive, where his vast experience in both search engine marketing and social media marketing bolsters Vizion Interactive's already robust search engine marketing and social media marketing offerings.
About WebProNewsFrance
The French edition of WebProNews is designed to keep French Internet professionals up to date on the latest news and trends in the online world. Stay up to date with WebProNewsFrance, your source for news, commentary and expert tutorials designed to help your online business efforts succeed.











WebProNewsFrance is a publication of iEntry, Inc.
iEntry, Inc. 2549 Richmond Rd. Lexington KY, 40509
© 2008 iEntry Inc. All Rights Reserved
