Canonical Link Element Mistakes
Duplicate content can be a proper nightmare. The Canonical Link Element was introduced to help webmasters and site owners stop duplicate URLs from getting indexed in Google.
Matt Cutts talks about the Canonical Link Element in the following video and blogged about it here.
So how does Google treat Canonical Mistakes?
In Matt’s video he mentions that the presence of the Canonical Link Element in a page’s code is a strong hint as to whether or not a URL should be indexed. If a webmaster makes a mistake, Matt says ‘we don’t promise we will abide by this 100%’, and Google reserves the ‘right to do what we think best’ for the user. So if Google thinks a webmaster has accidentally messed up, the page(s) may still be indexed.
So here’s how the 3 engines (Google, Yahoo and Bing) treated a Canonical ‘Shoot Yourself in the Foot’ mistake.
- A website with 30 pages of content. The canonical element pointing to http://www.thewebaddress.com/ was inserted into a shared header, so it appeared on every page of the site.
- Every page had quality and unique content with images. The site had unique titles and meta descriptions.
- There was a standard navigation and good internal linkage.
- The site had inbound links and PageRank.
So what happened?
Google clearly didn’t realise that this was a mistake: it only indexed 1 page, the homepage, which was the URL the canonical element pointed to. Google continually crawled the site over a 2-3 month period but only indexed the homepage. Yahoo and Bing indexed all 30 or so pages. After a while, when I realised Google wasn’t going to figure this out by itself, I got bored and removed the canonical element, and the site’s pages got indexed.
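A mistake like the one above (the same canonical URL in a header shared by every page) is easy to catch before a search engine does. Here is a minimal sketch in Python, assuming you already have each page's HTML to hand. The function names and the /about and /contact paths are made up for illustration, and the trailing-slash comparison is a simplification, not how any search engine normalises URLs.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />.
        if tag != "link":
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values and attr.get("href"):
            self.canonicals.append(attr["href"])


def find_canonical(html):
    """Return the first canonical URL declared in the HTML, or None."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals[0] if finder.canonicals else None


def pages_pointing_elsewhere(pages):
    """Given {page_url: html}, return {page_url: canonical_url} for every
    page whose canonical element points at a different URL.

    Assumption: URLs are compared after stripping a trailing slash only.
    """
    mismatched = {}
    for url, html in pages.items():
        canonical = find_canonical(html)
        if canonical and canonical.rstrip("/") != url.rstrip("/"):
            mismatched[url] = canonical
    return mismatched


# The shared header from the example site, repeated on every page.
SHARED_HEAD = (
    '<html><head>'
    '<link rel="canonical" href="http://www.thewebaddress.com/" />'
    '</head><body>Unique content</body></html>'
)

# Hypothetical page URLs standing in for the 30 real pages.
example_pages = {
    "http://www.thewebaddress.com/": SHARED_HEAD,
    "http://www.thewebaddress.com/about": SHARED_HEAD,
    "http://www.thewebaddress.com/contact": SHARED_HEAD,
}

for page, canonical in pages_pointing_elsewhere(example_pages).items():
    print(f"{page} declares canonical {canonical}")
```

If nearly every page on the site turns up in that report with the same canonical URL, the element has almost certainly ended up in a shared template by accident. A handful of mismatches on genuine duplicate pages is what you would expect from a correct setup.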