Duplicate Content

Substantially identical content appearing at multiple URLs, which can cause Google to consolidate them or choose the wrong canonical.

Duplicate content occurs when the same or very similar content is accessible at multiple URLs, whether within a site (internal duplicates) or across different sites (external duplicates). It does not trigger a penalty by itself, but it forces Google to choose which version to index and rank, and that choice is not always the one you intended.
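As a rough illustration of how internal duplicates can be spotted, the sketch below groups URLs whose extracted text is identical after whitespace and case normalisation. The function names and the sample URLs are hypothetical, and the approach only catches exact matches; "very similar" pages would need a near-duplicate technique that is not shown here.

```python
import hashlib
import re
from collections import defaultdict
from typing import Dict, List


def fingerprint(text: str) -> str:
    """Hash page text after collapsing whitespace and lowercasing."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def group_duplicates(pages: Dict[str, str]) -> List[List[str]]:
    """Return groups of URLs that share the same text fingerprint.

    `pages` maps URL -> already-extracted page text (an assumption:
    fetching and HTML stripping are out of scope for this sketch).
    """
    groups = defaultdict(list)
    for url, text in pages.items():
        groups[fingerprint(text)].append(url)
    # Keep only fingerprints seen at more than one URL.
    return [sorted(urls) for urls in groups.values() if len(urls) > 1]


if __name__ == "__main__":
    sample = {
        "https://example.com/a": "Hello  World",
        "https://example.com/a?print=1": "hello world ",
        "https://example.com/b": "Other",
    }
    print(group_duplicates(sample))
```

Each group returned is a set of URLs that would compete with each other for the same content, which makes it a reasonable starting list for deciding on a preferred URL.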

Common causes: HTTP vs HTTPS versions, www vs non-www hostnames, trailing slash vs no trailing slash, URL parameters (session IDs, tracking codes, sorting/filtering parameters), printer-friendly versions, pagination, syndicated content (the same article on multiple domains), and scrapers copying your content. The usual fixes are canonical tags for internal duplicates and 301 redirects for protocol and hostname variations. HSTS can reinforce the HTTPS preference, but it applies only to the protocol and does not replace the redirect.
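To make the URL-level causes concrete, here is a minimal sketch that maps common variants (protocol, www, trailing slash, tracking parameters) to a single preferred form, which is roughly what a redirect rule or canonical tag would point to. The preferred form (HTTPS, non-www, no trailing slash) and the `IGNORED_PARAMS` list are assumptions chosen for illustration; a real site would pick its own conventions.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of parameters assumed not to change page content.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "gclid"}


def normalize_url(url: str) -> str:
    """Collapse common duplicate-URL variants into one preferred URL.

    Assumed conventions: HTTPS, non-www host, no trailing slash
    (except the root path), tracking/session parameters removed.
    Any port or fragment in the input is dropped.
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    # Strip trailing slashes but keep "/" for the site root.
    path = parts.path.rstrip("/") or "/"
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    ]
    return urlunsplit(("https", host, path, urlencode(kept), ""))


if __name__ == "__main__":
    for variant in (
        "http://www.example.com/shoes/?utm_source=x&color=red",
        "https://example.com/shoes?color=red",
    ):
        print(normalize_url(variant))
```

Both sample variants normalise to the same string, so they can be treated as one page. Parameters that genuinely change the content (here `color`) are kept on purpose; whether sorting or filtering parameters belong in the ignore list depends on the site.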

The "duplicate content penalty" myth: Google does not penalise sites for having duplicate content unless it appears to be deliberately manipulative (e.g., large-scale scraping and republishing). What Google does is consolidate duplicates — pick one URL to rank — which means you may lose rankings for a URL you wanted indexed if Google prefers a different version.

Related SEO Terms

Canonical Tag

An HTML link element that tells search engines which version of a page…

Hreflang

An HTML attribute that tells search engines which language and region …

Indexing

The process by which search engines add crawled pages to their searcha…

Thin Content

Pages with little original value — low word count, duplicated content,…
