Last week we published the third instalment of our complete guide to Google ranking signals.
It concentrated on ‘content freshness’ and the practical ways you can ensure your webpages are recognised by Google as being relevant and up-to-date.
This week we continue diving into on-page content factors, with duplicate content and syndication.
What is duplicate content?
Duplicate content refers to a webpage’s content that appears in more than one place on the internet.
Let’s say you write an article on all the animals discovered and named after Frank Zappa. Then someone else copies and pastes your exact text into a new webpage on their own website. Then you’re both going to have duplicate content issues. As well as a big ‘falling out’ over multiple emails.
Duplicate content isn’t necessarily a bad or wrong thing, apart from in the above example where it’s just straight-up theft.
Google won’t penalise duplicate content. Instead it will decide which version of the duplicate post should appear in search results and ignore the other.
Chances are Google won’t care which came first; it’s more likely to care which site has the higher authority.
There is technically an in-built safety net here. A high-profile site will, in theory, never indulge in outright stealing smaller sites’ articles. It’s just not worth the damage.
Conversely, many large sites are constantly scraped of their content to be republished elsewhere. Our own SEW articles are ripped off more than 10 times each by different scraper sites, but it never affects our rankings because those sites are of such poor quality they’re ignored by Google.
However, there are perfectly acceptable ways of dealing with duplicate content in a ‘white hat’ manner.
How to manage duplicate content
1) Obvious one to start… do not copy another site’s content without asking permission first. It’s bad for you, the site you’re copying from and the reader too.
2) If you’re using segments or quotes from another webpage in reference to your article, then give them credit and a link back to the original source.
3) If you have duplicate content on your own site, set up a 301 redirect so Google only indexes your preferred page.
4) Ensure that Google is only indexing your preferred domain, i.e. either with the www prefix or without it: http://www.example.com or http://example.com. Google may treat the www and non-www versions of your domain as separate sites with separate pages, thus harming your visibility.
You can set your preference in Search Console.
5) You may experience duplicate content issues if you use a separate mobile version of your site. Using a mobile responsive website instead will solve the problem.
6) Before you accept posts from guest writers, double-check they haven’t been published elsewhere. Bloggers probably aren’t acting unscrupulously if they have been previously published; they may just not understand that this can cause search visibility problems.
However, if you have permission from the original author and website, there are ‘safe’ ways of publishing duplicate content that will benefit you, the author and the original website, and keep Google happy.
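As a rough illustration of points 3 and 4, here is a minimal Python sketch of the decision a server makes before issuing a 301 redirect. It assumes the www version is the preferred domain; the names PREFERRED_HOST, ALTERNATE_HOSTS, DUPLICATE_PATHS and redirect_for, and the paths in the table, are hypothetical and only there for the example. In practice this logic usually lives in your web server or CMS configuration rather than in your own code.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumption: the www version is the preferred domain.
PREFERRED_HOST = "www.example.com"
ALTERNATE_HOSTS = {"example.com"}

# Hypothetical map of duplicate pages to the preferred page on the same site.
DUPLICATE_PATHS = {"/old-article": "/article"}


def redirect_for(url):
    """Return (301, target_url) if the URL should redirect, otherwise None."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    path = parts.path or "/"

    # Point 4: send the non-preferred host to the preferred one.
    new_host = PREFERRED_HOST if host in ALTERNATE_HOSTS else host
    # Point 3: send a known duplicate page to the preferred page.
    new_path = DUPLICATE_PATHS.get(path, path)

    if (new_host, new_path) == (host, path):
        return None  # already the preferred URL, serve it as normal
    target = urlunsplit((parts.scheme, new_host, new_path, parts.query, parts.fragment))
    return 301, target
```

Calling redirect_for("http://example.com/page") would hand back a 301 pointing at the www version, while a URL that is already the preferred one returns None and is served as usual.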
Content syndication is the term used for the tactical republishing of your original article on another third-party website. It’s particularly useful if you’re a smaller publisher or an up-and-coming writer who wants a larger audience.
If content syndication is carried out correctly by the site republishing the content, there should be no reason why this will lead to duplicate content issues.
Here are a few SEO-friendly methods for content syndication…
- rel=canonical tag
The safest way to ensure there are no duplicate content problems is to use a rel=canonical tag on the republished article.
This will tell Google that the linked article is the original and should therefore be the one that is indexed, and any ranking benefits will be passed through to it.
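To make this concrete, here is a short Python sketch that builds the tag the republishing site would place in the head of its copy, then checks a page for it. The helper names (canonical_tag, CanonicalFinder, find_canonical) and the example URL are illustrative assumptions, not part of any Google tooling.

```python
from html import escape
from html.parser import HTMLParser


def canonical_tag(original_url):
    """Build the <link> element that points at the original article."""
    return f'<link rel="canonical" href="{escape(original_url, quote=True)}">'


class CanonicalFinder(HTMLParser):
    """Records the href of the first rel=canonical link found in a page."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        rels = (attr.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rels and self.canonical is None:
            self.canonical = attr.get("href")


def find_canonical(html_text):
    """Return the canonical URL declared in the page, or None if there isn't one."""
    finder = CanonicalFinder()
    finder.feed(html_text)
    return finder.canonical


# Hypothetical original article URL.
print(canonical_tag("http://www.example.com/original-article"))
```

Running find_canonical over the republished page is a quick way to confirm the partner site has actually added the tag and that it points at your original URL.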
- meta noindex tag
According to Eric Enge’s Whiteboard Friday video, this works on a similar principle to the canonical tag. The republishing site implements a meta noindex tag on the page, instructing the search engine to keep that page out of its index.
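For reference, the tag itself is a single robots meta element in the head of the republished page. The Python sketch below shows the tag and a simple way to check whether a page carries it; NOINDEX_TAG, RobotsMetaFinder and is_noindexed are names made up for this example.

```python
from html.parser import HTMLParser

# The tag the republishing site adds to the <head> of its copy.
NOINDEX_TAG = '<meta name="robots" content="noindex">'


class RobotsMetaFinder(HTMLParser):
    """Collects the directives from any robots meta tags in a page."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta" and (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            # Directives are comma-separated, e.g. "noindex, follow".
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def is_noindexed(html_text):
    """Return True if the page asks search engines not to index it."""
    finder = RobotsMetaFinder()
    finder.feed(html_text)
    return "noindex" in finder.directives
```

This only inspects the HTML of the page. A site can also send the same instruction in an HTTP header, which this sketch does not look at.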
- Clean link to original article
You can also just use a clean text link within the article itself.
For example: many of these content syndication tips originally appeared in ‘What is content syndication and how do I get started?’
This is an adequate method if you have limited access to the HTML of your article and can’t implement a rel=canonical tag.
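If you are producing that credit line programmatically, a sketch along these lines would do. It assumes ‘clean’ simply means a plain anchor pointing straight at the original URL with no extra attributes; the function name attribution_link, the wording of the credit line and the example URL are all illustrative.

```python
from html import escape


def attribution_link(original_url, original_title):
    """Build a plain text credit line linking back to the original article."""
    return (
        "This article originally appeared as "
        f'<a href="{escape(original_url, quote=True)}">{escape(original_title)}</a>.'
    )


# Hypothetical URL for the original article.
print(attribution_link(
    "http://www.example.com/original",
    "What is content syndication and how do I get started?",
))
```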