Crawling

The process by which search engine bots systematically browse the web to discover and fetch web pages.

Crawling is the first step in how search engines index the web. Googlebot starts from a set of known URLs (seed URLs from sitemaps, previously crawled URLs, and links discovered during past crawls), fetches those pages, renders them (executing JavaScript), extracts links, and adds new URLs to its crawl queue. This process repeats continuously across the entire web.
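
The fetch, extract, and enqueue loop described above can be sketched in a few lines. This is a minimal illustration, not how Googlebot is implemented: the `crawl` function, the `LinkExtractor` class, and the in-memory `FAKE_WEB` dictionary are all hypothetical names introduced here, the "fetch" is a dictionary lookup rather than an HTTP request, and the JavaScript rendering step is left out entirely.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects absolute URLs from <a href="..."> tags on one page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links and drop the #fragment part.
                absolute, _fragment = urldefrag(urljoin(self.base_url, href))
                self.links.append(absolute)


def crawl(seeds, fetch, max_pages=100):
    """Breadth-first crawl; `fetch(url)` returns HTML or None if unavailable."""
    queue = deque(seeds)   # URLs waiting to be fetched (the crawl queue)
    seen = set(seeds)      # URLs already queued, so nothing is queued twice
    crawled = []
    while queue and len(crawled) < max_pages:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue       # nothing to parse for this URL
        crawled.append(url)
        parser = LinkExtractor(url)
        parser.feed(html)
        for link in parser.links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return crawled


# Hypothetical three-page site standing in for the web.
FAKE_WEB = {
    "https://example.com/": '<a href="/a">A</a> <a href="/b#top">B</a>',
    "https://example.com/a": '<a href="/b">B</a> <a href="/missing">gone</a>',
    "https://example.com/b": '<a href="/">home</a>',
}

order = crawl(["https://example.com/"], FAKE_WEB.get)
print(order)
```

The `seen` set is what keeps the loop from fetching the same URL twice when several pages link to it; a real crawler would also add scheduling, politeness limits, and URL normalization.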

Googlebot respects robots.txt rules and reacts to HTTP status codes. It does not support the nonstandard crawl-delay directive that some other crawlers honor; Google sets its crawl rate with its own algorithm. A 200 response leads to parsing; a 301 sends the crawler to the redirect target; a 404 or 410 signals that the URL should be dropped from the crawl queue; a 500 or 503 causes a retry later.
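
As a rough sketch of these rules, the snippet below checks a URL against robots.txt with Python's standard `urllib.robotparser` and maps the status codes named above to a next step. The `next_action` function, its return labels, and the sample `ROBOTS_TXT` content are assumptions made for illustration; they do not describe Google's actual logic, and Python's parser may differ from Google's in edge cases.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for example.com.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

robots = RobotFileParser()
robots.parse(ROBOTS_TXT.splitlines())  # parse from text, no network request


def next_action(status):
    """Map an HTTP status code to the crawler's next step (illustrative only)."""
    if status == 200:
        return "parse"
    if status == 301:
        return "follow_redirect"
    if status in (404, 410):
        return "drop_from_queue"
    if status in (500, 503):
        return "retry_later"
    return "other"  # codes the text above does not cover


blocked = robots.can_fetch("Googlebot", "https://example.com/private/page")
allowed = robots.can_fetch("Googlebot", "https://example.com/blog/")
print(blocked, allowed)
print([next_action(code) for code in (200, 301, 404, 410, 500, 503)])
```

A crawler would run the robots.txt check before fetching and the status check after, so a disallowed URL never produces a status code at all.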

For SEO, ensuring pages are crawlable means: no robots.txt blocks on important pages, correct HTTP responses, working internal links, and fast server responses. Use Google Search Console's URL Inspection tool to see when a page was last crawled and what Googlebot saw.
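
The checklist above could be turned into a simple per-page audit along these lines. Everything here is a hypothetical sketch: the `PageCheck` record, the `crawlability_issues` function, and especially the 1000 ms `slow_ms` threshold are assumptions chosen for illustration, not values published by Google. The inputs are expected to come from your own crawl or log data.

```python
from dataclasses import dataclass, field


@dataclass
class PageCheck:
    """Observations about one URL, gathered by your own tooling."""
    url: str
    robots_allowed: bool          # does robots.txt permit crawling this URL?
    status: int                   # HTTP status code returned by the server
    response_ms: float            # server response time in milliseconds
    broken_links: list = field(default_factory=list)  # failing internal links


def crawlability_issues(page, slow_ms=1000.0):
    """Return a list of human-readable problems; empty means no issue found."""
    issues = []
    if not page.robots_allowed:
        issues.append("blocked by robots.txt")
    if page.status != 200:
        issues.append(f"unexpected HTTP status {page.status}")
    if page.broken_links:
        issues.append(f"{len(page.broken_links)} broken internal link(s)")
    if page.response_ms > slow_ms:  # threshold is an arbitrary assumption
        issues.append("slow server response")
    return issues


healthy = PageCheck("https://example.com/", True, 200, 180.0)
problem = PageCheck("https://example.com/old", False, 404, 2400.0, ["/gone"])
print(crawlability_issues(healthy))
print(crawlability_issues(problem))
```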
