Definition:
The robots.txt file tells search engine crawlers which files, folders, or web pages they may crawl and which they should not crawl.
- By default, robots.txt allows everything; a Sitemap line in the file helps crawlers discover the URLs you want indexed.
- Keep the sitemap file up to date. If a page is missing from the sitemap, crawlers may be slow to find it, i.e. your page may not be indexed or shown in search results.
Important Tips:
Create the sitemap file first, then create robots.txt, so that the Sitemap line in robots.txt points to a file that already exists.
General Syntax for Robots.txt file
User-agent: *
Allow: /
Sitemap: https://www.-----.com/sitemap.xml
- The above syntax allows every folder, every file, and every image to be crawled.
- Disallow: /img/service1.jpg --- this tells the search engine not to crawl the service1.jpg image in the img folder.
- Disallow: /facilities.html --- this tells the search engine not to crawl the facilities page.
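The effect of these rules can be checked before uploading anything. Below is a minimal sketch using Python's standard urllib.robotparser module. The domain www.example.com and the helper name is_allowed are placeholders for illustration, not part of the original setup. Python's parser applies the first rule that matches, so the Disallow lines are placed before Allow: / here; crawlers that pick the most specific rule are not affected by the order.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration; www.example.com is a placeholder domain.
# Disallow lines come first because Python's parser uses the first matching rule.
RULES = """\
User-agent: *
Disallow: /img/service1.jpg
Disallow: /facilities.html
Allow: /
Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def is_allowed(path: str) -> bool:
    """Return True if a generic crawler ("*") may fetch the given path."""
    return parser.can_fetch("*", "https://www.example.com" + path)


for path in ("/", "/img/service1.jpg", "/img/other.jpg", "/facilities.html"):
    print(path, "allowed" if is_allowed(path) else "blocked")
```

Only the two listed paths are reported as blocked; everything else stays crawlable.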
1. Write the above syntax in Notepad.
2. Save this file as robots.txt.
3. Check the saved file on your website by opening it in the browser.
Syntax: www.digitalvishnu.in/robots.txt
4. If you want to modify it, go to public_html in cPanel, find robots.txt, right-click it, and choose the Edit option. Insert or delete Disallow lines as needed, then click Save Changes and refresh the webpage.
5. Disallow everything
Syntax:
Disallow: /
(The path in a rule should start with /, so a bare * is not reliably understood by crawlers.)
6. Disallow a particular image in a folder
Syntax:
Disallow: /img/img1.jpg
or
Disallow: /img/img1.jpeg
7. If you want to disallow a particular page
Tips: don’t copy the whole URL. Copy only the path that follows the domain name (Ex: /Home) and put it in the Disallow syntax.
Syntax: Disallow: /Home --- Right
8. If you want to disallow the whole website
Syntax:
Disallow: /
Sitemap: https://www.-----.com/sitemap.xml
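Steps 3, 7, and 8 can be sketched in Python with the standard library only. This is a rough illustration under assumptions: www.example.com is a placeholder domain, and the helper names robots_txt_url, disallow_line, and blocks are made up for this example. The sketch builds the address where robots.txt should be checked, turns a full page URL into a path-only Disallow rule, and confirms what each rule set blocks.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.robotparser import RobotFileParser


def robots_txt_url(page_url: str) -> str:
    """Build the address where a site's robots.txt is expected (the site root)."""
    parts = urlsplit(page_url)
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def disallow_line(page_url: str) -> str:
    """Turn a full page URL into a Disallow rule that uses only the path."""
    path = urlsplit(page_url).path or "/"
    return f"Disallow: {path}"


def blocks(rules: str, page_url: str) -> bool:
    """Return True if the given rules stop a generic crawler from fetching page_url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return not parser.can_fetch("*", page_url)


page = "https://www.example.com/Home"  # placeholder URL
print(robots_txt_url(page))  # https://www.example.com/robots.txt
print(disallow_line(page))   # Disallow: /Home

# Rules are only honoured inside a User-agent group, so one is added here.
one_page = "User-agent: *\n" + disallow_line(page) + "\n"
whole_site = "User-agent: *\nDisallow: /\n"

print(blocks(one_page, page))
print(blocks(one_page, "https://www.example.com/About"))
print(blocks(whole_site, "https://www.example.com/About"))
```

Paths are matched as case-sensitive prefixes, so /Home and /home are treated as different pages.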