Robots.txt
Robots.txt is a plain-text file that tells search engine crawlers which pages of a site they may access and which to avoid.
Definition
Robots.txt provides directives for web crawlers about which parts of a site to crawl or ignore.
Why it matters
Improper settings can block important pages or expose sensitive ones.
Example: Disallowing /admin/ in robots.txt tells compliant bots not to crawl admin pages, as in the snippet below.
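A minimal robots.txt applying that rule (the wildcard user-agent line is an assumption; it matches any compliant crawler):

User-agent: *
Disallow: /admin/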
Use Cases
Crawl Control
Guide bots on what to crawl.
Prevent Indexing
Block private or duplicate sections.
Optimize Crawl Budget
Focus bots on valuable content rather than low-value URLs (see the sketch after this list).
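A sketch of a robots.txt combining these use cases (the paths below are hypothetical placeholders, not taken from this article):

User-agent: *
# Prevent crawling of a private section
Disallow: /private/
# Save crawl budget by skipping duplicate search-result URLs
Disallow: /search/
# Point crawlers at the valuable content
Sitemap: https://site.com/sitemap.xml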
Testing
Validate your robots.txt settings with testing tools before relying on them.
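One way to check the rules programmatically is Python's built-in robots.txt parser; a minimal sketch, assuming the file lives at the standard site.com/robots.txt location mentioned in the FAQ below:

from urllib.robotparser import RobotFileParser

# Fetch and parse the live robots.txt file.
parser = RobotFileParser()
parser.set_url("https://site.com/robots.txt")
parser.read()

# can_fetch(user_agent, url) applies the parsed rules to a URL.
print(parser.can_fetch("*", "https://site.com/admin/page"))  # False if /admin/ is disallowed
print(parser.can_fetch("*", "https://site.com/blog/post"))   # True when no rule blocks it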
Frequently Asked Questions
What is robots.txt?
A text file that sets crawler rules.
Can it block Google?
Yes. A misconfigured Disallow rule can block Google from crawling important pages.
Where is it located?
At the root of the domain: site.com/robots.txt.
Does it guarantee privacy?
No. The file is publicly readable and its directives are advisory, so it cannot hide or secure sensitive content.
How to test robots.txt?
With Google Search Console (GSC) or dedicated robots.txt testing tools.