How to give instructions to Googlebot
Search engines use web robots (crawlers) that visit and index web pages. Website owners can use the /robots.txt file to give these robots instructions about their site; this convention is known as the Robots Exclusion Protocol.
For example, if a robot wants to visit a page on a website, say http://www.smartideasforlife.com/index.html, before it can see any other page it first checks http://www.smartideasforlife.com/robots.txt, and finds:
User-agent: *
Disallow: /
The "User-agent: *" means this section applies to all robots.
The "Disallow: /" tells the robot that it should not visit any pages on the site.
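The effect of these two lines can be verified with Python's standard urllib.robotparser module. This is a quick sketch that parses the rules locally (no network request is made; the example URL above is reused only as a path to test):

```python
from urllib.robotparser import RobotFileParser

# Parse the example rules directly, without fetching them over the network.
rp = RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /",
])

# "User-agent: *" plus "Disallow: /" means no robot may fetch any page.
print(rp.can_fetch("Googlebot", "http://www.smartideasforlife.com/index.html"))
```

Because the rules apply to every user-agent and block the root path, can_fetch returns False for any robot name and any page on the site.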
There are two important considerations when using /robots.txt:
- robots can ignore your /robots.txt. In particular, malware robots that scan the web for security vulnerabilities, and email address harvesters used by spammers, will pay no attention to it.
- the /robots.txt file is publicly available. Anyone can see which sections of your server you don't want robots to visit.
Blocking user-agents
The Disallow line lists the pages you want to block. You can list a specific URL or a pattern. The entry should begin with a forward slash (/).
- To block the entire site, use a forward slash.
- To block a directory and everything in it, follow the directory name with a forward slash.
- To block a page, list the page.
- To remove a specific image from Google Images, add the following:
User-agent: Googlebot-Image
Disallow: /images/dogs.jpg
- To remove all images on your site from Google Images:
User-agent: Googlebot-Image
Disallow: /
- To block files of a specific file type (for example, .gif), use the following:
User-agent: Googlebot
Disallow: /*.gif$
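Putting the patterns above together, a complete /robots.txt might look like the sketch below. The directory and page names here (/private/, /secret.html) are hypothetical examples, not paths from the site above:

```
# Rules for all robots
User-agent: *
Disallow: /private/      # block a directory and everything in it
Disallow: /secret.html   # block a single page

# Rules specific to Googlebot
User-agent: Googlebot
Disallow: /*.gif$        # block all .gif files (Googlebot supports * and $ wildcards)
```

Note that wildcard patterns such as /*.gif$ are an extension honored by Googlebot and some other major crawlers; robots that follow only the original Robots Exclusion Protocol treat Disallow values as plain path prefixes.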
How to create a /robots.txt file