I'm frequently amazed by the number of webmasters and programmers out there unable to tell the difference between nofollow and noindex tags as well as robots.txt.
Let's start with the nofollow tag.
If you really want to stop Google or any major search engine from crawling past a certain section of your website, you need to implement a nofollow attribute. A lot of people confuse the nofollow tag with the noindex tag, hoping nofollow will prevent search engines from indexing certain pages. It won't: if a page has only the nofollow tag, crawlers can still fetch and index that page, but the spider won't follow the links on it.
I personally use the nofollow tag for outbound links I don't want to pass authority to, and also on pages whose links I don't want crawlers to follow. I have never used the nofollow attribute alone on internal pages; I always combine it with the noindex tag.
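To check which links on a page actually carry nofollow, something like the following rough sketch could work. It uses only Python's standard `html.parser`; the class name `NofollowLinkFinder`, the helper `find_links`, and the sample URLs are made up for illustration.

```python
from html.parser import HTMLParser


class NofollowLinkFinder(HTMLParser):
    """Collect every <a href> and note whether it carries rel="nofollow"."""

    def __init__(self):
        super().__init__()
        self.links = []  # list of (href, is_nofollow) tuples

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        href = attr_map.get("href")
        if href is None:
            return
        # rel may hold several space-separated tokens, e.g. "nofollow sponsored"
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        self.links.append((href, "nofollow" in rel_tokens))


def find_links(html):
    """Return (href, is_nofollow) for each link found in the HTML string."""
    finder = NofollowLinkFinder()
    finder.feed(html)
    return finder.links


# Hypothetical sample markup: one normal link, one nofollow link.
sample = (
    '<p><a href="https://example.com/partner">partner</a> '
    '<a href="https://example.com/ad" rel="nofollow sponsored">ad</a></p>'
)

for href, is_nofollow in find_links(sample):
    print(href, "nofollow" if is_nofollow else "followed")
```

This only reports what the markup says; whether a given search engine honors the attribute is up to that engine.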
I usually have two options. The first is to put noindex on a page but still allow robots to crawl it, because I probably want some lower-level pages to get indexed. The second is to use both nofollow and noindex, to make sure the page won't get indexed and the links on the page itself don't get crawled.
Next comes the noindex tag.
The sole purpose of the noindex tag is to prevent search engines from indexing a certain page. If you really want to prevent a certain page from turning up in Google search results, add the following line to the page source code: <META NAME="robots" CONTENT="noindex"> - it has worked for me 100% of the time. I have had problems with Google ignoring my robots.txt, but I have never seen Google or any other search engine index a page that had robots noindex implemented.
Contrary to popular belief, the noindex attribute won't prevent search engines from crawling your website at will. Sure, they won't index the page that carries it, but they will still crawl it and index anything they find that doesn't have a noindex tag.
To ensure crawlers don't index a certain page and don't crawl past it, you need to implement both a noindex and a nofollow attribute, like this: <META NAME="robots" CONTENT="noindex,nofollow">
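To verify that a page really ships the tag you think it does, a small check along these lines may help. It is a minimal sketch using Python's standard `html.parser`; the names `RobotsMetaParser`, `robots_directives`, and `describe` are hypothetical, and it only looks at the generic `robots` meta tag, not at bot-specific tags or HTTP headers.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Gather the directives found in <meta name="robots" content="...">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META NAME=...> matches.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() != "robots":
            return
        for token in (attr_map.get("content") or "").split(","):
            token = token.strip().lower()
            if token:
                self.directives.add(token)


def robots_directives(html):
    """Return the set of robots meta directives present in the HTML string."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


def describe(html):
    """Summarize what the robots meta tag asks crawlers to do."""
    directives = robots_directives(html)
    return {
        "may_index": "noindex" not in directives,
        "may_follow": "nofollow" not in directives,
    }


print(describe('<META NAME="robots" CONTENT="noindex">'))
print(describe('<META NAME="robots" CONTENT="noindex,nofollow">'))
```

The first call corresponds to the "noindex but still crawl" option, the second to the combined noindex and nofollow option.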
If you have a WordPress blog, you can easily implement both nofollow and noindex with the help of an SEO plugin like Yoast SEO.
Robots.txt
Robots.txt is meant to keep certain crawling robots and other spiders from crawling or indexing certain portions or sections of your website. Although robots.txt is usually respected by search engines, I have plenty of times seen Google index the homepage of a website where I had placed "Disallow: /", which means I don't want any crawler coming and going through my website.
So yeah, disallowing something in robots.txt doesn't necessarily mean it won't get indexed. Google can bypass that and index it nevertheless. If this happens, you might see results that have the following meta description: "A description for this result is not available because of this site's robots.txt – learn more"
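What "Disallow: /" asks of a well-behaved crawler can be checked with Python's standard `urllib.robotparser`. This is a minimal sketch; the helper `can_crawl` and the example.com URLs are assumptions for illustration. Note that it only answers the crawling question, and says nothing about whether a URL ends up in a search index.

```python
from urllib.robotparser import RobotFileParser

# The block-everything file discussed above.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""


def can_crawl(robots_txt, url, user_agent="*"):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Both the homepage and a deeper page are off limits for every user agent.
print(can_crawl(BLOCK_ALL, "https://example.com/"))
print(can_crawl(BLOCK_ALL, "https://example.com/blog/post", "Googlebot"))
```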
If you really want to stop crawlers from indexing a page and following its links, you need to implement the robots meta tag shown above, with both noindex and nofollow.
Unless I need to tell crawlers not to index and crawl a long list of URL filters or other generated URLs, which Google understands perfectly and doesn't ignore, I usually have a very simple robots.txt that always looks the same.
Example of a simple and proper robots.txt that allows everything:
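A minimal sketch of such a file, assuming the conventional allow-all form (`User-agent: *` followed by an empty `Disallow:`), wrapped in a quick check with Python's standard `urllib.robotparser`. The constant `ALLOW_ALL`, the helper `is_allowed`, and the example.com URL are hypothetical names used only for this illustration.

```python
from urllib.robotparser import RobotFileParser

# Assumed allow-all robots.txt: an empty Disallow value blocks nothing.
ALLOW_ALL = """\
User-agent: *
Disallow:
"""


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Any path is permitted for any crawler.
print(is_allowed(ALLOW_ALL, "https://example.com/any/page"))
```

The two lines inside `ALLOW_ALL` are what would go into the robots.txt file itself.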
Tronia
Thank you for giving us more detailed information regarding these three things. I can honestly admit that I only knew the very basic functions of each one, but I could still tell them apart. Noindex = don't index that page (for example on Google), so its content won't appear on the search engine. Nofollow = used for links, self-explanatory name. There are also other META tags like nosnippet and noarchive. Definitely worth reading and learning.