One of the most common problems e-commerce website owners worry about is duplicate content. It’s incredibly easy to create an e-commerce site where your content is seen as totally irrelevant by the search engines because it either looks like every other e-commerce site out there, or every page on your own site looks nearly identical. Fortunately, there are ways that you can minimize the duplicate content issue.
Are There Duplicate Content Penalties?
The short answer is no, there is no “penalty” for duplicate content. There is, however, a filter that will keep content that is basically identical from dominating the SERPs. Why would Google want to show a dozen pages in a row that effectively say exactly the same thing in exactly the same way? Is that a result you’d want to see?
To many people, the end result still looks like a penalty since their page doesn’t rank while other identical pages do rank well. In these cases, the other pages have you beat in other areas that are causing them to rank while you’re nowhere to be found.
Crawl Budget
Excessive duplicate content can affect your website in other ways as well. Google decides how often and how deeply to crawl your website looking for updates and new content. Each website is given its own crawl budget, and forcing Google to crawl and index multiple pages with identical content can lead to you burning through that budget much faster than your content needs.
If your crawl budget is 500 pages per day, and you have 400 products, each with 4 or 5 identical pages, you are now asking Google (or any search engine) to crawl 1600 to 2000 pages looking for the right one. This can lead to important pages being overlooked.
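To put numbers on that, here is a minimal sketch of the arithmetic in Python. It assumes a fixed daily budget and that every URL gets crawled exactly once, which is a simplification of how search engines actually schedule crawls. The function name is just for illustration.

```python
import math

def days_to_crawl(products: int, urls_per_product: int, crawl_budget_per_day: int) -> int:
    """Whole days needed to crawl every URL once at a fixed daily budget."""
    total_urls = products * urls_per_product
    return math.ceil(total_urls / crawl_budget_per_day)

# 400 products at a budget of 500 pages per day (the figures used above)
for copies in (1, 4, 5):
    total = 400 * copies
    print(f"{copies} URL(s) per product: {total} URLs, "
          f"{days_to_crawl(400, copies, 500)} day(s) to crawl")
```

With one URL per product the whole catalog fits in a single day's budget. With 4 or 5 copies of each page, a full pass takes several days.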
Here Are a Few Ways You Are Creating Duplicate Content
Multiple, Identical Product Pages
One of the most common places you’ll find duplicate content is on your own website. It’s amazing how many different ways programmers think you need to be able to get to your product pages. Let’s take a quick look at Shopify.
When you create a new product in Shopify you are given a URL for that product. Usually in this format
domain.com/products/my-amazing-product
The problem comes in when you add that product to a collection. Now that same product has the URL
domain.com/collections/collection-name/my-amazing-product
You now have two pages with exactly the same content. Want to make it even worse? Add that same product to multiple collections. You end up with
domain.com/collections/other-collection-name/my-amazing-product
Use The Canonical Tag
Fortunately, there’s a rather simple way to deal with this. You can designate one of those URLs as the primary “canonical” page, and the search engines will treat all the others as alternate versions of it rather than as duplicate content. Your canonical tag will look like this
<link rel="canonical" href="https://domain.com/products/my-amazing-product">
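If your templates don't produce this tag for you, the mapping is simple enough to sketch in Python. This is only an illustration: it assumes the URL shapes shown above (and also accepts a /products/ segment inside the collection path, which some stores use), and the function names are made up for this example.

```python
import re
from html import escape

# Assumed URL shapes:
#   /collections/<collection-name>/<product-handle>
#   /collections/<collection-name>/products/<product-handle>
COLLECTION_PRODUCT = re.compile(r"^/collections/[^/]+/(?:products/)?([^/]+)/?$")

def canonical_path(path: str) -> str:
    """Map a collection-scoped product path to its /products/ path."""
    match = COLLECTION_PRODUCT.match(path)
    if match:
        return f"/products/{match.group(1)}"
    return path  # already canonical, or not a product page

def canonical_tag(domain: str, path: str) -> str:
    """Build the <link rel="canonical"> tag for a requested path."""
    href = f"https://{domain}{canonical_path(path)}"
    return f'<link rel="canonical" href="{escape(href, quote=True)}">'

print(canonical_tag("domain.com", "/collections/other-collection-name/my-amazing-product"))
```

Every collection URL for the product ends up pointing at the same /products/ address, which is the whole point of the tag.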
There are more ways you can create confusion with your page URLs. If you are using UTM tracking for social media, or other session modifiers to track users on your website, you may end up with even more versions of your pages that could make use of the canonical tag. But what do you do if you can’t implement a canonical tag for some reason?
Use The Robots.txt file
It’s possible to tell the search engines not to crawl pages with session IDs or UTM tags, or just about anything else, with just a few lines in your robots.txt file.
For example, you can use
User-agent: *
Disallow: /*?utm*
Disallow: /*?sid=*
And so on.
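Before you rely on rules like these, it helps to check which URLs they actually catch. The Python sketch below is a hand-rolled matcher, not any search engine's real parser. It assumes the common wildcard convention where * matches any run of characters and a trailing $ anchors the end of the URL.

```python
import re

def rule_to_regex(rule: str) -> re.Pattern:
    """Translate a Disallow pattern with * and $ wildcards into a regex."""
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    pattern = ".*".join(re.escape(part) for part in body.split("*"))
    return re.compile("^" + pattern + ("$" if anchored else ""))

# The two rules from the example above
DISALLOW = ["/*?utm*", "/*?sid=*"]
RULES = [rule_to_regex(rule) for rule in DISALLOW]

def is_blocked(path_and_query: str) -> bool:
    """True if any Disallow rule matches the path plus query string."""
    return any(rule.match(path_and_query) for rule in RULES)

for url in (
    "/products/my-amazing-product",
    "/products/my-amazing-product?utm_source=facebook",
    "/products/my-amazing-product?color=red&utm_source=facebook",
):
    print(url, "->", "blocked" if is_blocked(url) else "allowed")
```

Under these assumptions the third URL is not blocked, because the UTM tag is not the first parameter after the question mark. If that matters for your site, you would need an extra rule for parameters that follow an ampersand.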
WWW vs. Non-WWW
Believe it or not, there are still websites out there that serve both the www and non-www versions of their site. This is almost always an incredibly easy fix. If you’re on a Linux-based server running Apache and have access to your .htaccess file, you can simply add these lines:
RewriteEngine On
RewriteCond %{HTTP_HOST} ^www\.yourdomain\.com$ [NC]
RewriteRule ^(.*)$ http://yourdomain.com/$1 [L,R=301]
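If you want to reason about which requests should be redirected, and where to, the same idea can be sketched in Python. This is an illustration rather than a replacement for the server rule: the function name is hypothetical, it keeps whatever scheme the request came in on, and it ignores port numbers.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

def non_www_redirect(url: str) -> Optional[str]:
    """Return the non-www URL to 301 to, or None if no redirect is needed."""
    parts = urlsplit(url)
    host = parts.hostname or ""  # hostname is lowercased; ports are ignored here
    if not host.startswith("www."):
        return None
    # Keep the path and query string, drop only the leading "www."
    return urlunsplit((parts.scheme, host[4:], parts.path, parts.query, ""))

print(non_www_redirect("http://www.yourdomain.com/products/my-amazing-product"))
print(non_www_redirect("http://yourdomain.com/products/my-amazing-product"))
```

The check looks for a literal "www." prefix, dot included, so a host that merely starts with the letters www is left alone.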
Product Feeds / Affiliate Programs / Syndicated Content
All three of these items are designed to make your life easier, either by letting you use content created elsewhere on your site or by letting others use your content on theirs.
Product Feeds make it possible for you to launch large quantities of products with little effort. The issue is that there are probably dozens, if not hundreds, of other people doing exactly the same thing and creating a mess of duplicate content on all of these websites.
Look at this snippet of a search:
This search goes on for nearly 9,000 entries. It’s likely that most don’t rank very well.
Affiliate programs that copy your content are just as bad. If you choose to run an affiliate program I highly recommend putting a “write your own content” rule for your affiliates.
Syndicated content is just as bad. Either you’re using syndicated content on your website, or allowing your content to be used elsewhere. In both cases, you’re creating a potential duplicate content issue.
Staging and Development Servers
Yes, people are still allowing their staging and development servers to be crawled and indexed by the search engines. The best solution is to put these servers behind passwords to keep the bots and users out. But if you don’t want to do that, you should be adding:
<meta name="robots" content="noindex" />
to the website. Just be sure to remove it from the live version of your website.
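Forgetting to remove that tag at launch is an easy mistake to make, so it can be worth checking for it automatically. The sketch below uses Python's standard html.parser to look for a robots meta tag containing noindex. The class and function names are made up for this example, and fetching the page itself is left to you.

```python
from html.parser import HTMLParser

class RobotsMetaFinder(HTMLParser):
    """Collects the directives from every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            content = attr.get("content", "")
            self.directives += [d.strip().lower() for d in content.split(",")]

def has_noindex(html: str) -> bool:
    """True if the page's robots meta tag asks not to be indexed."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives

staging_page = '<html><head><meta name="robots" content="noindex" /></head></html>'
print(has_noindex(staging_page))
```

Run against your staging pages, this should come back true. Run against your live pages, it should come back false.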
You Can Solve Many of These Issues with Great Product Descriptions
The number one thing that you can do to avoid dealing with duplicate content issues is to write your own great product and service descriptions. When you have interesting products and services this can be an easy task. Unfortunately, most e-commerce sites are selling simple products that are likely sold by dozens, if not hundreds, of other people. The key here is to not copy the manufacturer’s description directly, but rather to create your own description.
Here are a few ways you can contribute to making your product pages unique.
- Encourage reviews on your product pages. Customer reviews, especially those that mention the product or describe it, can add unique content to your product pages.
- Explain how the product can help the customer rather than “just” a description of the product. Sure, everyone knows that a rubber tourniquet is a short length of rubber tape or tubing that can be used to reduce or stop blood flow. But how about explaining how it saved your life that one time when you fell into a pit full of snakes?
- Provide examples of your product in use. Bolts aren’t sexy. But showing a tricked-out big block with a mile-high blower and pointing out how your fasteners are keeping that blower from exploding off the top of the motor is far more interesting than just saying it is a 1/2″ by 5/8″ fine thread, Grade 8 bolt.
- Explain why your products are better than similar products. Don’t just say “this is a half carat green emerald”. Say that your green emerald is a natural green, untreated emerald that sparkles with a deep green that reminds you of the eyes of that person in high school that you had a crush on.
- Stop using basic words. Expand your vocabulary. Don’t limit yourself just because writing product descriptions can be dead boring sometimes. Keep a tab open to http://www.thesaurus.com/ when writing to help.
Video: https://www.youtube.com/watch?v=DK8ednS0skQ
How Do You Discover if Your Content is Being Used by Someone Else?
The most popular way to search for duplicate content is to use a service such as copyscape.com. You can enter a URL to be checked, or, if you are outsourcing some of your content creation, you can upload the document and verify that it is unique before you even post it.
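For a rough local check, for example comparing your own description against the manufacturer's before you publish, you can measure how many short word sequences two texts share. The Python sketch below does that with three-word "shingles". It is only an illustration of the idea, not how any particular service works, and the function names and sample texts are invented for this example.

```python
import re

def shingles(text: str, size: int = 3) -> set:
    """All runs of `size` consecutive words, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}

def overlap(a: str, b: str, size: int = 3) -> float:
    """Share of word runs the two texts have in common, from 0.0 to 1.0."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

manufacturer = "Grade 8 fine thread bolt with zinc plating"
copied = "Grade 8 fine thread bolt with zinc plating"
rewritten = "These fasteners keep the blower bolted to the motor"

print(overlap(manufacturer, copied))
print(overlap(manufacturer, rewritten))
```

A score near 1.0 means the two texts are close to identical, while a score near 0.0 means they share almost no phrasing.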
In the end, having unique content is the best way to ensure that your site is completely indexed and has the best chance of ranking well in the search engine results. You may not rank #1 because of it, but at least you won’t be stuck on page 734 with everyone else that thought using the manufacturer’s description for their product page was a good idea.



