Site Language Affected by PDF Files
Posted: June 21st, 2008
Imagine having a website that is written entirely in English and is targeted towards an English-speaking community. All your incoming links are from other English websites, and even your geographic targeting setting in Webmaster Tools is set to Australia, Canada, the United States, or the United Kingdom. Yet Google labels your website as Spanish, and it shows up in the SERPs for Spanish keywords.
An incident similar to this occurred with a website about Vidal Sassoon. Despite not having any Spanish words on any of the pages, the site was showing up for Spanish keywords, which was "severely affecting the site’s rankings."
It turns out that the Spanish text was contained in PDF files hosted on the website, and Googlebot’s crawling of them led to the incorrect interpretation of the site’s language.
What if this happens to you?
If the PDFs are inconsequential to your site, then just delete those files altogether. If the PDF documents do bring value to your visitors, then the best way to tackle this issue is to use your robots.txt file to prevent crawling of the PDF documents. After a while, these files should be removed from the search index and no longer affect your site’s language.
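As a rough illustration of the robots.txt approach, here is a minimal Python sketch that checks a set of rules before you upload them. It assumes, purely for the example, that all of your PDFs sit under a /pdfs/ directory and that your site is www.example.com; the is_crawlable helper is a made-up name, not part of any Google tool. It relies on Python's standard urllib.robotparser module, which matches rules by simple path prefix, so adjust the Disallow line to wherever your PDFs actually live.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules: assumes every PDF lives under /pdfs/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /pdfs/
"""


def is_crawlable(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the rules above allow user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com URLs are placeholders for your own pages.
    for url in (
        "https://www.example.com/pdfs/brochure.pdf",
        "https://www.example.com/index.html",
    ):
        print(url, "crawlable" if is_crawlable(url) else "blocked")
```

If the PDF is reported as blocked and your regular pages as crawlable, the two lines in ROBOTS_TXT are what would go into the robots.txt file at the root of your site.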
Tags: Adobe, Google, indexing, languages, PDF
I had no idea that could happen. Thanks for cluing me in; I will need to be more careful in the future.
That would be kind of funny to have your site keywords targeted in a different language. I will remember that for the future.
Yes, I had no idea this could happen either. Thanks for the info.