What is Duplicate Content?

In search engine optimization, duplicate content is content that appears in identical form at more than one URL, whether on the same website or on different ones. When the same content can be reached via different URLs, this can cause considerable problems for Google and for the site operators concerned.
  • Same content can be accessed and indexed via different URLs
  • Usually has a negative effect on the SERP ranking
  • Should definitely be avoided by the site operator

Duplicate Content – Dangers and Avoidance

Duplicate content is an important topic in search engine optimization. Indexing pages with duplicate content can have a negative impact on SERP rankings, so each piece of content on a website should be accessible via exactly one URL. Duplicate content confronts Google and other search engines with the problem of deciding which URL to display and which ranking signals to assign to which address. That is what makes it risky for website operators. This is not only about copied passages of text, but also about entire pages that are completely identical. To avoid ranking problems, a website needs enough unique content, i.e. content that was created exclusively for one page and appears only on that page.

A distinction is made between internal and external duplicate content. Internal means that the same content appears more than once within one domain; external means that it exists on several domains. When search engines come across duplicate content, they may rank the affected pages lower, which makes the content harder to find, or filter them out of the results entirely. Googlebot & Co. therefore do not like duplicate content, especially when a domain contains a lot of it: the usual consequence is a loss of ranking positions. After all, Google must be able to decide which page is more relevant.

The ideal for a search engine optimized site is 100% unique content, although this is not always achievable in practice. The easiest way to be sure is to search for individual passages of text and check whether they also turn up elsewhere. Very similar pages (“near duplicate content”) can be a problem as well, even though they usually arise unintentionally.

The most basic strategy is to prevent duplicate content from arising in the first place. This starts with clean crawl control: do not link to duplicate content at all, so that search engines such as Google never have to deal with it.
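Checking passages for duplicates and near duplicates can be roughed out in a few lines of Python. The sketch below compares two texts by their overlapping three-word sequences ("shingles") and reports the share they have in common (Jaccard similarity). This is one common technique and an assumption of this example, not a description of how Google measures similarity; the sample texts and the threshold are hypothetical.

```python
import re


def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word n-grams (shingles) in text."""
    words = re.findall(r"\w+", text.lower())
    if len(words) < size:
        # Text shorter than one shingle: treat the whole text as one unit.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a: str, text_b: str, size: int = 3) -> float:
    """Share of shingles the two texts have in common (0.0 to 1.0)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 1.0  # two empty texts are trivially identical
    return len(a & b) / len(a | b)


# Hypothetical page texts, for illustration only.
page_a = "Free shipping on all orders over 50 euros within Germany."
page_b = "Free shipping on all orders over 50 euros within Austria."
page_c = "Our handmade leather wallets are stitched in a small workshop."

NEAR_DUPLICATE_THRESHOLD = 0.6  # assumed cutoff, tune it for your own site

for name, other in (("page_b", page_b), ("page_c", page_c)):
    score = jaccard_similarity(page_a, other)
    flag = "near duplicate" if score >= NEAR_DUPLICATE_THRESHOLD else "distinct"
    print(f"page_a vs {name}: {score:.2f} ({flag})")
```

Here page_a and page_b differ in a single word and share 7 of 9 distinct shingles, so they are flagged, while page_c shares none. On a real site the same comparison would run over the extracted main text of each page pair.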

Duplicate Content in Practice

If duplicate content already exists, the ideal solution is a 301 redirect from each duplicate straight to the preferred original URL, which keeps the website lean and healthy. Duplicate content has many faces: URLs that are reachable in both lower and upper case, additional PDFs with product information, product detail pages, parameters in affiliate URLs and much more.

Often, duplicate content is useful to the user. Such pages can be kept, but should point to the original URL via a canonical link. If duplicate content should be equally discoverable in search, it is best to individualize it. This applies, for example, to product descriptions in online shops or to service offers that are repeated over and over.

Recurring text modules are best kept to a minimum. Short summaries with links to more detailed information are the best solution here, because even single short paragraphs that appear identically on many pages count as a kind of duplicate content for Google. Shipping information and similar boilerplate also fall into this category, and search engines can react sensitively to it.

External duplicate content confronts Google with the decision of which version to treat as the original when the same content appears on different domains. Usually this is the website on which the bot first found the content. Source links and other signals can also serve as indicators for Google & Co. The safest position is always to be the first to publish a piece of content, and requesting indexing immediately after publication helps the bot find it sooner. Quotations are not a big problem as long as they are marked as such in the source code.
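The 301 redirect and canonical link advice from this section can be illustrated with a small Python sketch. It collapses two of the URL variants named above, mixed-case paths and affiliate parameters, onto one address, then derives either a redirect target or a canonical link element from it. The parameter names in TRACKING_PARAMS, the example.com URLs and the helper function names are hypothetical, and the sketch assumes the site really serves the same content regardless of path casing.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names that only track visits and never change content.
TRACKING_PARAMS = {"affiliate_id", "ref", "utm_source", "utm_medium", "utm_campaign"}


def canonical_url(url: str) -> str:
    """Collapse common URL variants of one page onto a single address."""
    parts = urlsplit(url)
    # Assumption: the site serves identical content for any path casing.
    path = parts.path.lower() or "/"
    # Keep only parameters that change the content, in a stable order.
    query = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in TRACKING_PARAMS
    )
    # Scheme and host are case-insensitive; the fragment is dropped.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(query), "")
    )


def redirect_target(requested_url: str):
    """Return the URL to answer with a 301, or None if already canonical."""
    target = canonical_url(requested_url)
    return target if target != requested_url else None


def canonical_link_tag(url: str) -> str:
    """Build the canonical link element for the page's HTML head."""
    href = escape(canonical_url(url), quote=True)
    return f'<link rel="canonical" href="{href}">'


# A duplicate that should simply disappear: answer it with a 301.
print(redirect_target("https://example.com/Shop/Blue-Shirt?affiliate_id=42"))

# A variant that is kept for users: point it at the original instead.
print(canonical_link_tag("https://example.com/shop/blue-shirt?color=blue&ref=newsletter"))
```

Which of the two mechanisms to use follows the text above: redirect when the duplicate has no value of its own, and use the canonical link when the page should stay reachable for visitors.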

