Index and Root Pages Merged by Google
Posted: April 5th, 2008
It’s taken a while, but Google has finally merged the index page with the root page of a domain, as long as the content is the same. That is, Google now interprets www.abc.com and its counterpart www.abc.com/index.html as one page rather than two separate pages. For you PageRank fanatics, this implies that PageRank won’t be split between the two pages.
The merging could be the result of the duplicate content filter going to work, but for now there hasn’t been any official comment on the matter.
Inadvertently creating duplicate content through the naming structure of a website has always been an issue, and SEOs have had to handle these situations carefully with whatever they have in their toolkit. Apache mod_rewrite rules come into play here, alongside careful inspection of how you link to your internal pages.
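As a rough sketch of the kind of mod_rewrite rule that handles the index/root case, something like the following could go in the .htaccess file at the document root. It assumes mod_rewrite is enabled and that the index file is named index.html; both are assumptions, so adjust them to match your own server.

```apache
# Sketch only: send /index.html (and /dir/index.html) to the bare directory URL.
RewriteEngine On

# Match only when the visitor literally asked for index.html. Without this
# check, the server's own internal mapping of "/" to index.html could
# trigger the rule again and cause a redirect loop.
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9}\s/([^/\s?]+/)*index\.html[\s?]

# 301 = permanent redirect; $1 holds any leading directories.
RewriteRule ^(([^/]+/)*)index\.html$ /$1 [R=301,L]
```

With a rule like this in place, a request for www.abc.com/index.html should be answered with a permanent redirect to www.abc.com/, so only one URL is exposed for the page.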
So, I don’t have to worry about all of this anymore, right? Wrong.
Don’t delete that .htaccess file just yet! Keep in mind that Google once merged www.randomdomain.com and randomdomain.com but later decided to separate them again, so this index/root grouping could be only temporary as well.
*If you aren’t redirecting your domain’s www <-> non-www versions through an .htaccess file, you can still nudge Google to see them as the same through Webmaster Tools.*
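For anyone who does want to handle the www redirect in .htaccess, a minimal sketch might look like the following. The domain example.com is a placeholder, and the rule assumes mod_rewrite is enabled and that you prefer the www version; swap the two hostnames if you prefer non-www.

```apache
# Sketch only: permanently redirect example.com to www.example.com.
RewriteEngine On

# Apply only when the request came in on the bare (non-www) hostname.
RewriteCond %{HTTP_HOST} ^example\.com$ [NC]

# Keep the requested path and issue a 301 (permanent) redirect.
RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]
```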

Nice article. I’m a bit confused about this .htaccess file: what do we have to write in it to get better results from Google’s spiders and the rank meter?
As crazy as this sounds, I can see why Google is having problems with both the http://www.* and */index.* issue, especially given how each function works.
With DNS, http://www.* is a unique host so http://www.domain.com and domain.com are considered completely different hosts. So, keep the .htaccess. I prefer to redirect to the non-www variation of my domain, but that’s a matter of personal preference.
With the /index.* issue, it all depends on how your web server is configured. It’s a fairly simple matter to tell your web server to treat any filename as the directory index before index.*. So, let’s say I want a splash page before my main site index. I would simply tell my server to treat /splash.* as the directory index before index.*. In any directory where there is both a splash.* and an index.*, the splash.* will be treated as the directory index before index.*. If no splash.* is found, the server then treats index.* as the directory index.
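On Apache, the setup described above is typically done with the DirectoryIndex directive. The following is only a sketch: the filenames are assumptions, and since DirectoryIndex takes explicit names rather than wildcards, each candidate file is listed in full.

```apache
# Sketch only: try splash.html first, then fall back to the index files.
# The server uses the first file in this list that exists in the directory.
DirectoryIndex splash.html index.html index.php
```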
Sadly, I’ve seen both of the above situations often enough that you cannot discount them as possibilities. Clearly, this has to be a headache for Google.
@Blogging on my Way:
Check out this post to learn more about server-side redirection: HTTP 301 Permanent Redirect Codes.
@JMorris:
As you can tell from this blog, I prefer the www variant, but like you said, it’s really just a personal preference (call me traditional, lol).
Thanks for the insight about splash pages. Fortunately I haven’t had to deal with them from an SEO point of view, but I know there are still splash pages out there.