How to Add Custom Robots.txt to Blogger

What is Robots.txt?


Robots.txt is a plain text file containing a few lines of directives. It is saved on your website or blog's server and tells web crawlers how to crawl and index your blog for search results. That means you can restrict any page on your blog from web crawlers so it doesn't get indexed in search engines, such as your blog's label pages, a demo page, or any other pages that are not important enough to be indexed. Always remember that search crawlers check the robots.txt file before crawling any page.
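As a quick illustration (the page path here is only a made-up placeholder, not something you need on your blog), a minimal robots.txt that hides one page from every crawler while leaving the rest of the site crawlable could look like this:

User-agent: *
Disallow: /private-page.html
Allow: /

The User-agent line says which crawlers the rules apply to, and each Disallow or Allow line gives a path relative to your domain.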





Here is My Video Tutorial on How To Add Custom Robots.txt in Blogger


How To Add Custom Robots.txt in Your Blog - Blogger

1) Log in to your Blogger account.
2) Open the blog for which you want to add robots.txt.
3) Go to Settings.
4) Go to Search preferences.
5) Find the Crawlers and indexing section.
6) Next to Custom robots.txt, click Edit.
7) Set "Enable custom robots.txt content?" to Yes.
8) Copy the data below and paste it in (change "yourdomain.com" to your own blog's domain name).
9) Click Save Changes.
10) That's it, you are done adding robots.txt to your blog.

User-agent: Mediapartners-Google
Disallow:
User-agent: *
Disallow: /search
Allow: /
Sitemap: http://www.yourdomain.com/atom.xml?redirect=false&start-index=1&max-results=500

What Do the Above Lines Mean?
 

Mediapartners-Google: 


Mediapartners-Google is the user agent for Google AdSense; it is used to serve better, more relevant ads on your site based on your content. If you disallow it, you won't be able to show any ads on the blocked pages.
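For example, the following lines would block the AdSense crawler from your whole blog, which is exactly what you do not want if you run ads; it is shown here only to make the effect of a Disallow rule on this user agent clear:

User-agent: Mediapartners-Google
Disallow: /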

User-agent: *

Now that you know what a user agent is, what does User-agent: * mean? A rule block whose user agent is marked with an asterisk (*) applies to every crawler and robot, whether that is a Bing robot, an affiliate crawler, or any other client software.
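To illustrate the difference (the Bingbot block and the /private path below are only examples, not part of the file above), rules under a named user agent apply only to that crawler, while the rules under the asterisk apply to every crawler that has no group of its own:

User-agent: Bingbot
Disallow: /private

User-agent: *
Disallow: /search
Allow: /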

Disallow: 

By adding Disallow you are telling robots not to crawl, and therefore not index, the listed pages. Below User-agent: * you can see Disallow: /search, which means you are disallowing your blog's search result pages by default. You are blocking crawlers from the /search directory that comes right after your domain name, so a search page like http://yourdomain.com/search/label/yourlabel will not be crawled and will never be indexed.
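As another example (the page name below is only a placeholder), you can add more Disallow lines under the same User-agent: * block to keep individual pages out of search results, such as a static Blogger page that lives under /p/:

User-agent: *
Disallow: /search
Disallow: /p/demo-page.html
Allow: /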

Allow: /

Allow: / means you are explicitly telling search engines that everything else on the blog, starting with the homepage, may be crawled as long as it is not covered by a Disallow rule.
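If you want to confirm how these rules behave, here is a small sketch using Python's standard urllib.robotparser (the domain and the post URL are placeholders; substitute your own) that checks which URLs a generic crawler is allowed to fetch:

from urllib.robotparser import RobotFileParser

# Placeholder domain; replace with your own blog's address.
ROBOTS_URL = "http://www.yourdomain.com/robots.txt"

parser = RobotFileParser()
parser.set_url(ROBOTS_URL)
parser.read()  # downloads and parses the live robots.txt

# A label/search page should be blocked by "Disallow: /search".
print(parser.can_fetch("*", "http://www.yourdomain.com/search/label/yourlabel"))   # expected: False

# A normal post should be allowed by "Allow: /".
print(parser.can_fetch("*", "http://www.yourdomain.com/2024/01/sample-post.html")) # expected: True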

Sitemap:

The Sitemap line helps crawlers discover and index all of your accessible pages, which is why this robots.txt explicitly points crawlers to your blog's sitemap. There is an issue with the default Blogger sitemap: on its own, the blog feed only lists your most recent posts, which is why the line above adds the start-index and max-results parameters so that up to 500 posts are covered.
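If your blog has more than 500 posts, a commonly used approach (adjust the start-index values to your own post count) is to list several Sitemap lines, each covering the next batch of up to 500 posts:

Sitemap: http://www.yourdomain.com/atom.xml?redirect=false&start-index=1&max-results=500
Sitemap: http://www.yourdomain.com/atom.xml?redirect=false&start-index=501&max-results=500
Sitemap: http://www.yourdomain.com/atom.xml?redirect=false&start-index=1001&max-results=500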





