Robots TXT Generator

Disallow Directories:

Each path must contain: "/"

Sitemap:

Usually xml file

Crawl interval:

unlimited

Search engines (default for all robots): Allow or Refused

Search engine (each can be set to Default, Allow, or Refused):

Google, Bing, Yahoo, Ask/Teoma, Alexa/Wayback, Cuil, MSN Search, Scrub The Web, DMOZ, GigaBlast

Special search engine (robot) (each can be set to Default, Allow, or Refused):

Google Image, Google Mobile, Yahoo MM, Yahoo Blogs, MSN PicSearch

Please copy and save the results below as robots.txt

# robots.txt generated at /robots-txt-generator
User-agent: *
Disallow:
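The form options above can be turned into robots.txt text with very little code. The following is a minimal sketch, not the generator's actual implementation: the function name build_robots_txt and all of its parameters are hypothetical, and mapping the interval option to the non-standard Crawl-delay directive is an assumption.

```python
def build_robots_txt(default_allow=True, disallow_paths=(), sitemap=None,
                     crawl_delay=None, per_robot=None):
    """Return robots.txt text. All names here are hypothetical."""
    for path in disallow_paths:
        # The form requires every path to contain "/".
        if "/" not in path:
            raise ValueError(f"path must contain '/': {path!r}")

    lines = ["User-agent: *"]
    if not default_allow:
        # Refusing all robots by default blocks the whole site.
        lines.append("Disallow: /")
    elif disallow_paths:
        lines.extend(f"Disallow: {path}" for path in disallow_paths)
    else:
        # An empty Disallow value allows everything.
        lines.append("Disallow:")
    if crawl_delay is not None:
        # Assumption: the interval maps to the non-standard Crawl-delay.
        lines.append(f"Crawl-delay: {crawl_delay}")

    # Per-robot overrides: True allows everything, False refuses everything.
    for name, allowed in (per_robot or {}).items():
        lines.append("")  # a blank line separates records
        lines.append(f"User-agent: {name}")
        lines.append("Disallow:" if allowed else "Disallow: /")

    if sitemap:
        lines.append("")
        lines.append(f"Sitemap: {sitemap}")
    return "\n".join(lines) + "\n"


print(build_robots_txt(disallow_paths=["/tmp/"], per_robot={"slurp": False}))
```

Called with no arguments, the sketch produces the same two directives as the default output shown above.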

About Robots TXT Generator

Robots.txt is a plain text file stored in the root of a site. It is simple to set up, and it is effective with crawlers that follow the protocol. It can tell search engine spiders to crawl only the specified content, or it can keep them from crawling some or all of the website.

The Robots.txt file should be placed in the root of the website and be accessible from the Internet. For example, if your website address is https://www.yourdomain.com/ then the file must open and show its content at https://www.yourdomain.com/robots.txt.
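As a small illustration of the location rule, the sketch below derives the robots.txt address from any URL on a site. The helper name robots_url is hypothetical; it relies only on urljoin from the Python standard library.

```python
from urllib.parse import urljoin


def robots_url(site_url):
    # A leading "/" makes urljoin replace the whole path, so any page URL
    # on the site maps to the root-level robots.txt.
    return urljoin(site_url, "/robots.txt")


print(robots_url("https://www.yourdomain.com/blog/post.html"))
# prints https://www.yourdomain.com/robots.txt
```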

User-agent:

Names the search engine spider that a record applies to. A "Robots.txt" file may contain multiple User-agent records, meaning that multiple search engine spiders are subject to the protocol, and it must contain at least one User-agent record. If the value is set to *, the record applies to any search engine spider. A "Robots.txt" file can contain only one "User-agent: *" record.

Disallow:

Describes a URL that you do not want robots to access. The value can be a complete path or just the beginning of one. Any URL that starts with the Disallow value will not be accessed by the robot.

Example:

Example 1: "Disallow: /help" means that search engine spiders may crawl neither /help.html nor /help/index.html.

Example 2: "Disallow: /help/" means that search engine spiders may fetch /help.html but not /help/index.html.

Example 3: An empty Disallow record means that all pages of the website may be crawled by search engines. The "/robots.txt" file requires at least one Disallow record. If "/robots.txt" is an empty file, the site is open to crawling by all search engine spiders.
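The prefix rule behind these three examples can be sketched in a few lines. This is an illustration of the matching logic only, not a full parser, and the function name is_disallowed is hypothetical.

```python
def is_disallowed(path, disallow_values):
    # A path is blocked when it starts with any non-empty Disallow value.
    # An empty value matches nothing, so it never blocks a path.
    return any(value != "" and path.startswith(value)
               for value in disallow_values)


# Example 1: "/help" blocks both paths.
print(is_disallowed("/help.html", ["/help"]))         # True
print(is_disallowed("/help/index.html", ["/help"]))   # True

# Example 2: "/help/" blocks only the directory contents.
print(is_disallowed("/help.html", ["/help/"]))        # False
print(is_disallowed("/help/index.html", ["/help/"]))  # True

# Example 3: an empty Disallow value blocks nothing.
print(is_disallowed("/help.html", [""]))              # False
```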

#: The comment character in the Robots.txt protocol.

Comprehensive example:

Example 1: Use "/robots.txt" to prevent all search engine spiders from crawling the "/bin/cgi/" directory, as well as the "/tmp/" directory and the /foo.html file. The settings are as follows:

User-agent: *
Disallow: /bin/cgi/
Disallow: /tmp/
Disallow: /foo.html

Example 2: Use "/robots.txt" to give one search engine full access while restricting the others. For example, the search engine spider named "slurp" may crawl everything, while all other search engine spiders are refused access to the contents of the "/cgi/" directory. The settings are as follows:

User-agent: *
Disallow: /cgi/

User-agent: slurp
Disallow:
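One way to confirm that per-spider records behave as intended is to run them through Python's standard urllib.robotparser module. The sketch below checks the Example 2 settings; "OtherBot" is a made-up spider name used only for illustration.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /cgi/

User-agent: slurp
Disallow:
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

# "slurp" has its own record with an empty Disallow, so nothing is blocked.
print(parser.can_fetch("slurp", "/cgi/script"))      # True
# Any other spider falls back to the "*" record.
print(parser.can_fetch("OtherBot", "/cgi/script"))   # False
print(parser.can_fetch("OtherBot", "/index.html"))   # True
```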

Example 3: Prohibit every search engine from crawling the website. The settings are as follows:

User-agent: *
Disallow: /

Example 4: Prohibit only one search engine from crawling the website. For example, only the search engine spider named "slurp" is prohibited from crawling. The settings are as follows:

User-agent: slurp
Disallow: /
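The difference between Examples 3 and 4 comes down to which User-agent the "Disallow: /" record is attached to. A minimal sketch using Python's standard urllib.robotparser module follows; the helper name can_crawl and the spider name "OtherBot" are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def can_crawl(robots_txt, agent, path="/page.html"):
    # Parse the given robots.txt text and ask whether `agent` may fetch `path`.
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, path)


block_all = "User-agent: *\nDisallow: /\n"      # Example 3
block_one = "User-agent: slurp\nDisallow: /\n"  # Example 4

print(can_crawl(block_all, "slurp"))     # False
print(can_crawl(block_all, "OtherBot"))  # False
print(can_crawl(block_one, "slurp"))     # False
print(can_crawl(block_one, "OtherBot"))  # True
```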