Good afternoon, TechWelkin readers. Today, I am going to tell you about the importance of a small text file called robots.txt in the context of search engine optimization (SEO). Last month, I was doing SEO for a client. My objective was to increase search engine traffic to the client’s website.
When I analyzed their site, I found that most things were in order, but there were a few obvious problems too. After the analysis, I advised them to plug a few holes through which search ranking was leaking. Now, a month later, search traffic on that website has picked up and is still rising.
One of the major problems that I noticed with that website was the absence of a robots.txt file. Let’s see how this file can affect Google’s view of your website.
What is robots.txt?
robots.txt is a plain text file, placed at the root of your website, that tells search engines what to crawl and what to leave alone on your web server. This file contains directives for search engine bots, like Googlebot. These directives tell a bot which files and directories it may crawl and which ones it should stay out of. Anything that is not explicitly disallowed is treated as open to crawling.
This file is called robots because it deals with search (ro)bots. Search bots (aka robots) are automated programs that crawl websites and make an index of what they find there.
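To make the idea of directives concrete, here is a minimal sketch in Python using the standard library’s urllib.robotparser module. The rules, the example.com URLs, and the WordPress-style directory names are assumptions chosen for illustration; they do not come from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: every bot ("*") is asked to stay
# out of two system directories; everything else remains crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /wp-includes/
"""


def load_rules(robots_text):
    """Parse robots.txt text into a RobotFileParser (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# A regular post is not covered by any Disallow line, so it may be crawled.
post_allowed = rules.can_fetch("Googlebot", "https://example.com/my-first-post/")

# A file under a disallowed directory is off limits to a compliant bot.
admin_allowed = rules.can_fetch("Googlebot", "https://example.com/wp-admin/options.php")

print(post_allowed, admin_allowed)
```

Python’s parser is only one reading of the rules; individual search engines may interpret edge cases a little differently.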
Do search engines honor robots.txt?
Good and responsible search engines (like Google, Bing, Yahoo! etc.) do honor the commands written in this file. However, this file cannot forcibly stop a search bot that is bent upon spidering / crawling the disallowed locations.
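Compliance is voluntary because the check against robots.txt happens inside the crawler itself. The following sketch, again in Python, contrasts a bot that consults the rules with one that ignores them. The bot name TechBot, the rules, and the URLs are all made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: all bots are asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

# Hypothetical URLs that a crawler has discovered on the site.
CANDIDATE_URLS = [
    "https://example.com/",
    "https://example.com/blog/hello-world/",
    "https://example.com/private/notes.txt",
]


def polite_crawl_plan(robots_text, user_agent, urls):
    """Return the URLs a well-behaved bot would fetch after reading robots.txt."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return [url for url in urls if parser.can_fetch(user_agent, url)]


def rude_crawl_plan(urls):
    """A bot that never reads robots.txt simply plans to fetch everything."""
    return list(urls)


polite = polite_crawl_plan(ROBOTS_TXT, "TechBot", CANDIDATE_URLS)
rude = rude_crawl_plan(CANDIDATE_URLS)
print(len(polite), len(rude))
```

Nothing in the file itself prevents the second plan. robots.txt is a request, not a lock.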
Why is robots.txt important for SEO?
In recent years, search engines have increasingly focused on the quality of a website. Everyone talks about content quality, and it is true, without an iota of doubt, that high-quality content is the biggest determining factor of your site’s search rank. You write excellent stuff on your website. But if you think that search engines can find only what you post on your website, you’re wrong! If you do not have a properly configured robots.txt in place, search engines can snake through every publicly reachable file and directory on your server.
These files don’t contain material that you want to offer to your website’s users. But search engines index them nonetheless, because you’re not stopping them. Then, upon analysis, search engines may find the content of these files irrelevant to the overall theme of your website. As a result of this irrelevance, your rank can tank.
Let’s understand it by an example.
You have been working on your blog for several months and now, let’s say, you have 100 good posts. Search engines have indexed them all. But because robots.txt is absent, search bots may also be indexing your “system files” (e.g. theme files, CMS files etc.) and other files that you have placed on the server. Assuming that 100 such files are available on the server, the total number of entries in the search index could be as high as 200 (100 posts + 100 other files present on the server). Of these entries, 50% are irrelevant to your website’s users, and you would want to get them removed from Google’s index. For this, you should configure your robots.txt file.
This is why it is very important to block certain portions of your web server from search bots, so that they index only what you really want to show to your visitors.
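The arithmetic of this example can be checked with a short Python sketch. The /posts/ and /system/ paths, the file names, and the single Disallow rule are assumptions invented to mirror the 100 + 100 scenario, not a real site layout.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical site layout matching the example: 100 posts plus
# 100 system files that visitors never need to see.
post_urls = [f"https://example.com/posts/post-{n}/" for n in range(1, 101)]
system_urls = [f"https://example.com/system/file-{n}.php" for n in range(1, 101)]
all_urls = post_urls + system_urls

# Hypothetical robots.txt that keeps bots out of the system directory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /system/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# URLs that a compliant bot is still allowed to crawl.
crawlable = [url for url in all_urls if parser.can_fetch("Googlebot", url)]

without_robots = len(all_urls)   # with no robots.txt, everything is open to crawling
with_robots = len(crawlable)     # with the rule in place, only the posts remain
irrelevant_share = len(system_urls) / without_robots

print(without_robots, with_robots, f"{irrelevant_share:.0%}")
```

With the rule in place, the crawlable set shrinks from 200 URLs to the 100 posts, which is exactly the half you care about.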
If Google has indexed significantly more entries than the number of your actual posts, then you would want to look into it. You should do a deep analysis of what Google has indexed and what it should actually be indexing.
I hope you got the idea of why you need a well-configured robots.txt. You’ll likely see good results if you configure it properly. Let me know if you have any questions or comments.
Dear Lalit,
I have a classified site on WordPress. It is around a year old. As it is a classified site, I feel I do not have to do much posting; users will do that.
In this case, I am not getting the desired traffic. I do some promotion activity, but it is not enough. I know what robots.txt, crawl, webmaster, and SEO are, but I am not an expert. Could you please help me learn how to do proper checking and sort it out, so that I can get the desired traffic?
Your personal assistance will be highly appreciated.
Regards,
Vijay Sharma
Vijay, promoting free classified websites up the index of Google search is a tough task, as Google looks down upon most such sites. Way too many people have created such websites, and most of them are low quality. This is the reason why you get little traffic. When you say “I feel I do not have to do much”… you lay the foundation of failure. You have to act like a webmaster and work hard. Anyway, robots.txt is a simple file. If you have only a clean installation of your website on your server, you may not even need a robots.txt file. If you have folders that you want to hide from search engines, then you can use robots.txt.
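For anyone who does have folders to hide, here is a minimal sketch of how such a file could be generated in Python. The helper name build_robots_txt and the folder names are hypothetical, chosen only for illustration.

```python
def build_robots_txt(hidden_folders, user_agent="*"):
    """Build robots.txt text that asks bots to skip the given folders."""
    lines = [f"User-agent: {user_agent}"]
    if not hidden_folders:
        # An empty Disallow value means nothing is blocked.
        lines.append("Disallow:")
    for folder in hidden_folders:
        # Normalize "private", "/private" and "private/" to "/private/".
        lines.append("Disallow: /" + folder.strip("/") + "/")
    return "\n".join(lines) + "\n"


# Hypothetical folder names, for illustration only.
robots_text = build_robots_txt(["private", "/drafts/"])
print(robots_text)
```

The resulting text would be saved as robots.txt at the root of the website.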