By downloading the “Robots Exclusion Checker” extension from the Chrome Web Store, you can see, for each page you visit, whether your domain’s robots.txt file is blocking Google from crawling it.
Robots meta tags.
Iñaki talks about the meta robots tags in more depth in the blog, but if you were not familiar with them, you should know that “index/noindex” and “follow/nofollow” are directives used to guide search engines on how they should treat a specific page, and that crawlers follow them.
What exactly do they consist of?
Robots meta tags are a sort of traffic light in your page’s HTML code: they tell search engine robots whether they should stop and take note of your content or just move on. In more technical terms, when robots crawl your page, they pull metadata from the HTTP and/or HTML headers to find out whether they are blocked from indexing and/or crawling it, so whether they save the page’s information in their databases depends on these directives, if there are any.
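To make this concrete, here is a minimal sketch (not Google’s actual implementation, and using a made-up HTML snippet) of how a crawler could read the robots directives out of a page’s HTML using only Python’s standard library:

```python
# Minimal sketch of a crawler reading the robots meta tag from a page.
# The HTML below is a hypothetical example page, not a real site.
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collects the directives from any <meta name="robots"> tag."""
    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name", "").lower() == "robots":
            # Directives are comma-separated, e.g. "noindex, follow"
            self.directives += [d.strip().lower()
                                for d in attrs.get("content", "").split(",")]

html = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
parser = RobotsMetaParser()
parser.feed(html)

print(parser.directives)  # ['noindex', 'follow']
```

A real crawler would also check the X-Robots-Tag HTTP header before deciding whether to store the page, but the decision logic is the same: the directives found here determine whether the page is saved and whether its links are queued.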
index/noindex: These decide whether a page should be included in the search engine index. The “index” tag is like giving the green light for the page to appear in search results, while “noindex” is a red light, keeping the page out of the reach of searches.
follow/nofollow: These determine whether search engines should follow the links on a page, i.e. whether or not those links should be moved into their crawl queue. “follow” invites robots to explore the links they find, extending the discovery network, while “nofollow” tells them to ignore the links, as if to say “nothing to see here.”
These tags can be placed in the X-Robots-Tag HTTP header (although this is not so common), as you can see in this example: X-Robots-Tag: noindex, follow. Or, more often, in the <head> section of the HTML, where it would look something like this.
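For example, a meta robots tag asking search engines not to index the page but still follow its links could look like this:

```html
<head>
  <!-- Do not index this page, but do follow the links it contains -->
  <meta name="robots" content="noindex, follow">
</head>
```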