Noindex
A directive that instructs search engines not to include a page in their index.
Noindex can be set in two ways: as an HTML meta tag (<meta name="robots" content="noindex">) in the page's <head>, or as an HTTP response header (X-Robots-Tag: noindex). Both methods let Googlebot (and other crawlers) fetch the page while telling them not to index it. This is different from a robots.txt Disallow rule, which blocks crawling rather than indexing.
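To make the two placements concrete, here is a minimal Python sketch that checks a fetched page for noindex in either location. The names has_noindex and RobotsMetaParser are hypothetical helpers invented for this example, not part of any crawler. The sketch assumes a generic name="robots" meta tag and an unprefixed header value; it ignores crawler-specific variants.

```python
from html.parser import HTMLParser


def _split_directives(value):
    """Split a comma-separated directive string into a lowercase set."""
    return {part.strip().lower() for part in value.split(",") if part.strip()}


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() == "robots":
            self.directives |= _split_directives(attr.get("content", ""))


def has_noindex(html, headers):
    """Return True if the meta tag or the X-Robots-Tag header says noindex.

    `headers` is a plain dict of response headers (an assumption of this
    sketch); header names are matched case-insensitively.
    """
    parser = RobotsMetaParser()
    parser.feed(html)
    header_directives = set()
    for name, value in headers.items():
        if name.lower() == "x-robots-tag":
            header_directives |= _split_directives(value)
    return "noindex" in parser.directives or "noindex" in header_directives
```

For example, has_noindex("<html></html>", {"X-Robots-Tag": "noindex"}) returns True, because the header alone is enough.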
Important distinction: if robots.txt blocks a page, Googlebot never fetches it and so never sees its noindex directive, yet the URL can still appear in Google's index if other pages link to it. For noindex to work, the page must be crawlable: once Googlebot recrawls the page and reads the directive, the page is dropped from the index.
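The interaction with robots.txt can be checked before relying on noindex. The sketch below uses Python's standard urllib.robotparser module. The function name noindex_can_be_read and the example URLs are hypothetical, and the module's rule matching is simpler than Googlebot's own, so treat the result as an approximation.

```python
from urllib.robotparser import RobotFileParser


def noindex_can_be_read(robots_txt, url, user_agent="Googlebot"):
    """Return True if robots.txt lets the crawler fetch `url`.

    A noindex directive on the page can only take effect when this is
    True; a blocked page is never fetched, so its directive is never seen.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical robots.txt content used only for illustration.
EXAMPLE_ROBOTS_TXT = "User-agent: *\nDisallow: /private/\n"
```

With this example file, a noindex tag on https://example.com/private/page.html would go unread, while one on https://example.com/public/page.html could be honored.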
Common use cases: pagination (noindex on page 2 and beyond of paginated lists), thank-you pages, account/login pages, admin pages, thin category pages, printer-friendly versions, and staging environments. Noindex can be combined with nofollow (noindex,nofollow) to also tell crawlers not to follow the links on the page.
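As a small illustration of combining the directives, the following Python sketch builds the same directive string for either placement. The helper names (robots_directives, robots_meta_tag, x_robots_tag_header) are made up for this example; a real site would normally set these values through its CMS or server configuration.

```python
def robots_directives(noindex=True, nofollow=False):
    """Join the selected directives into a comma-separated string."""
    parts = []
    if noindex:
        parts.append("noindex")
    if nofollow:
        parts.append("nofollow")
    return ",".join(parts)


def robots_meta_tag(directives):
    """Render the meta tag form, to be placed in the page's <head>."""
    return f'<meta name="robots" content="{directives}">'


def x_robots_tag_header(directives):
    """Return the HTTP response header form as a (name, value) pair."""
    return ("X-Robots-Tag", directives)
```

Calling robots_meta_tag(robots_directives(nofollow=True)) yields <meta name="robots" content="noindex,nofollow">, which could suit a thank-you or admin page.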