Why you should still care about duplicate content
Written by Ian Lurie
Monday, 24 January 2011 14:21

Google's so dang nice. I could just hug them all.

Recently, they announced that we no longer have to worry about duplicate content. See, Google will sort it all out for us.

So, if you have the same article on your site at www.mysite.com/article/, www.mysite.com/article/?from=home, and www.mysite.com/article.html, they'll be all charitable and figure out which one to use. We're saved, Google says. Go about our business.

I immediately got 30 snippy e-mails from developers who already hate me, telling me I've been wasting their time making them clean up duplication issues.

No.

I could elaborate on 'No', but it'd require cursing, so I'm going to stick with 'No' and explain why duplicate content still sucks.

 

Wasted crawl budget

I don't care if Google can suss out every instance of duplicate content on the web. You're still forcing them to suss it out.

If you have a 10,000 page web site, and 9,000 of those pages are duplicates, then Googlebot still has to crawl 9,000 pages it doesn't need. There. is. no. way. that that is a good thing. Use rel=canonical if you want; Google still has to hit each URL. You're wasting their time. No one likes having their time wasted.

This is all about crawl efficiency. Don't waste a search engine's time if you don't have to. Let a visiting spider grab what it needs and go on its way.
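
Want a rough idea of how much crawl you're wasting? Here's a minimal Python sketch, not a finished tool. It collapses URL variants into one form and counts the fetches that were redundant. The names `IGNORED_PARAMS`, `canonical_url` and `wasted_fetches` are made up for illustration, and the assumption that `from` never changes the page content is exactly that: an assumption you'd have to check against your own site.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical: query parameters that never change what the page shows.
IGNORED_PARAMS = {"from"}

def canonical_url(url):
    """Collapse a URL variant into one form: lowercase host, no ignorable params."""
    parts = urlsplit(url)
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in IGNORED_PARAMS
    )
    path = parts.path or "/"
    return urlunsplit((parts.scheme, parts.netloc.lower(), path, urlencode(query), ""))

def wasted_fetches(crawled_urls):
    """Number of fetches a spider made beyond one per distinct page."""
    distinct_pages = {canonical_url(url) for url in crawled_urls}
    return len(crawled_urls) - len(distinct_pages)

crawled = [
    "http://www.mysite.com/article/",
    "http://www.mysite.com/article/?from=home",
    "http://WWW.mysite.com/article/?from=topnav",
]
print(wasted_fetches(crawled))  # 2
```

Run something like this over a crawl log or a site export. It won't catch duplicates that live at genuinely different paths (like /article.html), which need site-specific rules.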

Duplicate content still sucks.

 

There is another search engine

Ever hear of Bing? It's not so speedy or clever. But it does generate 10-15% of all web traffic. If you think that's not worth bothering about, you're in better shape than I am. I'll take any smidgen of relevant traffic I can get.

Duplicate content will still wreak havoc on Bing, as well as on many vertical search engines, Facebook's proto-search engine and everything else people use to crawl the web.

Duplicate content still sucks!

 

Link love

I come to your site and find the article to which I want to link at www.site.com/?blah=foo, and then someone else finds the same content at www.site.com/?blah=foo&dir=dem and links to it there. Congratulations! You just split your link authority in half for that page! Nice job.

Except it's not a nice job. It's a stupid job. And again, rel=canonical may help sort out the link chaos, but not as well as just doing it right in the first f@#)($* place.
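
To see the split in numbers, here's a small Python sketch that tallies inbound links per URL, then again after collapsing the variants. It's an illustration only: `strip_params` and `links_per_page` are hypothetical helpers, and treating `dir` as a parameter that doesn't change the content is an assumption based on the example above.

```python
from collections import Counter
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def strip_params(url, ignored):
    """Return url without the query parameters named in `ignored`."""
    parts = urlsplit(url)
    kept = [(key, value) for key, value in parse_qsl(parts.query) if key not in ignored]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

def links_per_page(inbound_links, ignored):
    """Count inbound links per page after collapsing duplicate URLs."""
    return Counter(strip_params(target, ignored) for target in inbound_links)

inbound = [
    "http://www.site.com/?blah=foo",
    "http://www.site.com/?blah=foo&dir=dem",
]
split = Counter(inbound)                   # two URLs, one link each
merged = links_per_page(inbound, {"dir"})  # one page, both links
```

Same two links either way. The only question is whether one page gets credit for both or two "pages" get one apiece.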

Duplicate content still sucks!!!

 

Server performance

First thing you do to improve server performance is set up some kind of caching. Caching stores a copy of all, or the most-accessed, pages on your site. But most caching schemes are based on page URLs. Say you have the same exact article at three different URLs. Your web server or caching server will have to store three copies of the same page.

That wastes storage, memory and resources on your server. It also means that, until all three versions of the page are cached, you're still not delivering the performance improvement caching normally generates.
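
Here's a toy Python version of a URL-keyed cache, just to show the mechanics. It is not how any particular caching server works. `IGNORED_PARAMS`, `PATH_ALIASES`, `cache_key` and `PageCache` are all hypothetical names, and the alias table assumes /article.html really is the same page as /article/.

```python
from urllib.parse import urlsplit, parse_qsl, urlencode

IGNORED_PARAMS = {"from"}                      # hypothetical tracking parameter
PATH_ALIASES = {"/article.html": "/article/"}  # hypothetical alternate paths

def cache_key(url):
    """Map every known variant of a page to one cache key."""
    parts = urlsplit(url)
    path = PATH_ALIASES.get(parts.path, parts.path)
    query = sorted(
        (key, value) for key, value in parse_qsl(parts.query)
        if key not in IGNORED_PARAMS
    )
    return path + ("?" + urlencode(query) if query else "")

class PageCache:
    """Stores one rendered copy per cache key and counts how often it renders."""

    def __init__(self, render):
        self.render = render
        self.store = {}
        self.renders = 0

    def get(self, url):
        key = cache_key(url)
        if key not in self.store:
            self.renders += 1
            self.store[key] = self.render(key)
        return self.store[key]

cache = PageCache(lambda key: "<html>" + key + "</html>")
for url in (
    "http://www.mysite.com/article/",
    "http://www.mysite.com/article/?from=home",
    "http://www.mysite.com/article.html",
):
    cache.get(url)
```

With the normalized key, the three requests share one stored copy and one render. Key the cache on the raw URL instead and you get three of each. Better still, don't publish three URLs in the first place.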

Duplicate content still sucks!!!!!!

 

Analytics mayhem

Trying to track the attention a single page on your site gets? Duplicate content turns it into a shell game. Multiple versions of each page mean tracking down each version, averaging time-on-page, averaging bounce rate, etc.
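
And that averaging isn't as simple as it sounds. If one version gets 900 views and another gets 100, a plain average of the two rows gives the small one far too much say. Here's a rough Python sketch of the merge, weighted by views. The field names and numbers are invented for illustration, and weighting bounce rate by views (rather than by entrances, which is what most analytics packages use) is a simplifying assumption.

```python
def merge_page_stats(rows):
    """Combine per-URL rows for one page into a single row, weighting by views."""
    total_views = sum(row["views"] for row in rows)
    if total_views == 0:
        return {"views": 0, "avg_seconds": 0.0, "bounce_rate": 0.0}
    return {
        "views": total_views,
        "avg_seconds": sum(row["views"] * row["avg_seconds"] for row in rows) / total_views,
        "bounce_rate": sum(row["views"] * row["bounce_rate"] for row in rows) / total_views,
    }

# Made-up numbers for two versions of the same article.
rows = [
    {"url": "/article/", "views": 900, "avg_seconds": 60.0, "bounce_rate": 0.40},
    {"url": "/article/?from=topnav", "views": 100, "avg_seconds": 120.0, "bounce_rate": 0.20},
]
merged = merge_page_stats(rows)
```

With these numbers the weighted time-on-page comes out at 66 seconds, where a plain average of the two rows would say 90. Now do that for every duplicated page on the site.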

The irony is that many developers create duplication trying to make analytics easier: They'll add something like ?from=topnav to all links in the top navigation so that these show up as separate clicks in traffic reports.

Not smart. You can track which clicks come from which areas using tools like ClickTale or CrazyEgg. And you've created a total mess for engagement analysis.

Duplicate. Content. Is. The. El. Sucko.

 

You get my point

Hopefully by now you get the point. Duplicate content is bad for plenty of reasons. Google's latest questionable claim is another excuse for doing it wrong. Don't buy it. Build your site right, fix duped content and you'll have a faster, better-ranking, easier-to-measure site.

Thoughts?

Ian Lurie -

Ian Lurie is Chief Marketing Curmudgeon and President at Portent, an internet marketing company he started in 1995. Portent is a full-service internet marketing company whose services include SEO, SEM and strategic consulting. He started practicing SEO in 1997 and has been addicted ever since. Ian rants and raves, with a little teaching mixed in, on his internet marketing blog, Conversation Marketing. He recently co-published the Web Marketing for Dummies All In One Desk Reference. In it, he wrote the sections on SEO, blogging, social media and web analytics.

Last Updated on Monday, 24 January 2011 14:23
 

Comments  

 
0 #1 Bill Marshall 2011-01-24 18:02
Well said sir, it often seems to be a constant battle against developers and their (usually) open source content management systems which find ever-more ingenious ways of producing half a dozen addresses for each page.

Just like the old arguments about whether validation helps ranking (when the argument should be does good coding help produce good websites) another ill-considered Google quote gives them all the excuse to carry on doing it wrong.

And if one more programmer tells me that "it uses a 302 redirect because that's the default" then I may just turn into Freddie and start slicing and dicing...
 
 
0 #2 Seo manager France 2011-01-24 18:17
Definitely a great post about duplicate content!
It's obvious for many of us, but having too many similar pages is a pain, for crawlers and people alike!
 
 
0 #3 seo india 2011-01-25 07:30
:D Thanks. I myself have a site of 300 pages and some are similar. Helpful post for me.
 
 
0 #4 regional seo 2011-01-25 09:07
No doubt this post is very informative, and it shows that duplicate content can cause many issues beyond the typical "content duplication penalty". Analytics experts should also take note, because their practices do nourish duplicate pages.
 
 
0 #5 reverse seo services 2011-01-25 09:09
Thanks for this information, as I do tend to use those tags for analytics, and I also have a site with around 40k pages.
 
 
0 #6 Ben 2011-01-25 22:41
Totally agree. Sculpting is just as important now with competition on the rise.
 
 
0 #7 Doc Sheldon 2011-01-28 04:05
Good article, Ian. I've been somewhat on the fence regarding dup. content. But I've always leaned toward the cautious side. It's just not good practice, IMO, and it certainly does nothing for the user experience, even if Google does catch it all.
 
 
+1 #8 g1smd 2011-01-28 23:38
There's a seemingly endless list of ways to screw up - developers seem to be getting ever more inventive at how to produce sites that are "teh suck".

Most forum, blog, cart and CMS software is riddled with these problems. Sure, Google will "clean it up for you" - by taking a guess which URL to use. You can be quite sure that it will not be the one that you would have chosen.

Good post. Maybe this time someone will sit up and take notice, but I am not holding my breath.
 
 
0 #9 Ana Hoffman 2011-01-31 02:14
Ian - do you have any resources you could share regarding this announcement from Google?

Would love to read more about it...

Thanks!

Ana
 
 
0 #10 Bill Marshall 2011-01-31 12:48
Personally I gave up reading Google announcements 'cos they always lack substance and are full of generalities and idealised situations (and often a bit of FUD too). They said 10 years ago that they could easily detect hidden text, yet their serps have been full of sites using it blatantly ever since.
Watch what they do, not what they say!
 
 
0 #11 Link Building 2011-07-27 23:07
I agree that we should remove duplicates, especially when we know they exist -- it is a waste of the search engine's time. While one may think the task is tedious, it makes for a better website.
 
