Have you ever come across the message ‘Indexed, Though Blocked by Robots.txt’ in Google Search Console?
How about ‘New Index coverage issue detected?’
Then you’re not alone!
If so, you are probably concerned about the warning and looking for the best way to fix the ‘Indexed, though blocked by robots.txt’ issue.
If that’s the case, here’s a post that has been specially tailored to answer these queries and more.
You’ll also get to know how to block sites in robots.txt, whether robots.txt is legally binding, and whether ignoring robots.txt is illegal.
So, walk with us and we’ll guide you through the whole process.
Why ‘Indexed, Though Blocked by Robots.txt?’
Naturally, you’ll want to know why ‘Indexed, though blocked by robots.txt’ shows up in your Google Search Console.
Therefore, here’s something to work with.
This warning tells you that Google has indexed certain pages on your site even though your robots.txt file stops Googlebot from crawling them.
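The alert helps you take the right steps to make the necessary changes, especially if these pages were mistakenly blocked in the robots.txt file.
If you actually want to keep a page out of the index, the code to use is the robots meta tag, <meta name="robots" content="noindex">, and the page must not be blocked in robots.txt, otherwise Google cannot crawl the page and see the tag.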
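To confirm whether a page actually carries the robots meta tag shown above, a short script can scan its HTML. The following is a minimal sketch using only Python's standard library; the class name `RobotsMetaFinder`, the function `has_noindex`, and the sample HTML are made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collects directives from every <meta name="robots"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            # The content attribute may hold several comma-separated directives.
            parts = attr_map.get("content", "").split(",")
            self.directives.extend(p.strip().lower() for p in parts if p.strip())


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with 'noindex'."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


# Sample markup, invented for this example.
sample_html = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
print(has_noindex(sample_html))
```

In practice you would feed the function the HTML of the affected page, which you could copy from your browser's "view source" window.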
About the Robots.txt File
The robots.txt file tells search engine bots which pages on your site they may crawl and which they should ignore.
Pages that are open to crawling can be read in full and displayed normally in search engine result pages.
Pages that bots have been told to ignore will not be crawled, so their content cannot be read and they are unlikely to rank well alongside others.
To that effect, it is important to fix this error since it could potentially affect the visibility of your blog on search engines.
It is also worth noting that a page that has been blocked in the robots.txt file may still show up in Google’s search results.
However, the description of the page will not be shown and the same goes for the images, PDFs, videos, and any other content on the page.
This means that one way to know that the page has been blocked in the robots.txt file is when you see it in SERPs this way.
The solution is to remove the entry in the robots.txt file that is blocking Googlebot from accessing the page.
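Before editing anything, it can help to check whether a given URL is really blocked for a given bot. Here is a rough sketch using Python's built-in `urllib.robotparser`; the robots.txt content, the helper name `is_blocked`, and the example.com URLs are placeholders. Note that the standard-library parser uses simple path-prefix rules, so it may not match Google's handling of wildcards exactly.

```python
from urllib.robotparser import RobotFileParser

# Made-up robots.txt content for illustration.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""


def is_blocked(robots_txt, user_agent, url):
    """Return True if robots_txt disallows user_agent from fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)


print(is_blocked(SAMPLE_ROBOTS, "Googlebot", "https://example.com/private/page.html"))
print(is_blocked(SAMPLE_ROBOTS, "Googlebot", "https://example.com/blog/post.html"))
```

To test your own site, paste the contents of your robots.txt file into the string and swap in the URL reported by Search Console.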
How to Find Robots.txt File on WordPress
To find the Robots.txt file on WordPress, follow these steps:
- Access your Wp-admin section.
- Choose the Yoast SEO plugin
- Select Tools
- Navigate to the File editor
The steps above are for users of WordPress with the Yoast SEO plugin.
However, if you’re using the Rank Math SEO plugin, follow these steps instead:
- Access the Wp-admin section.
- Navigate to Rank Math
- Go to General Settings
- Access Edit robots.txt.
Finally, use these steps to access the robots.txt file if you run WordPress with All in One SEO:
- Access the Wp-admin section
- Go to All in One SEO
- Go to Robots.txt.
Here’s the basic layout of a robots.txt file, where each User-agent line is followed by the Allow or Disallow rules for that bot:
Sitemap: [URL location of sitemap]
User-agent: [bot identifier]
User-agent: [another bot identifier]
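To make that layout more concrete, the sketch below splits a robots.txt file into its sitemap lines and its per-bot rule groups. It is a simplified illustration, not a full parser: the function name `parse_robots_groups` and the sample file are invented, and only the Sitemap, User-agent, Allow, and Disallow fields are handled.

```python
def parse_robots_groups(text):
    """Split robots.txt text into sitemap URLs and per-bot rule groups."""
    sitemaps = []
    groups = {}          # user-agent -> list of (directive, value)
    current_agents = []  # bots that the rules being read apply to
    reading_rules = False
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "sitemap":
            sitemaps.append(value)
        elif field == "user-agent":
            if reading_rules:  # a User-agent line after rules starts a new group
                current_agents = []
                reading_rules = False
            current_agents.append(value)
            groups.setdefault(value, [])
        elif field in ("allow", "disallow"):
            reading_rules = True
            for agent in current_agents:
                groups[agent].append((field, value))
    return sitemaps, groups


# Sample file, invented for this example.
SAMPLE = """\
Sitemap: https://example.com/sitemap.xml

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

User-agent: Bingbot
Disallow: /search/
"""

sitemaps, groups = parse_robots_groups(SAMPLE)
print(sitemaps)
print(groups)
```

Reading the file as groups makes it easier to see which bots a given Disallow line applies to.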
How to Fix ‘Indexed, Though Blocked by Robots.txt’
Whether you blocked certain pages intentionally or by mistake, or can’t work out how it occurred, there’s an easy fix.
Follow the tips outlined below to fix ‘Indexed, Though Blocked by Robots.txt’.
Find the Affected Pages
The first step is to find the affected pages that are prompting the error message.
To do this:
- Log into your Google Search Console account
- Click on the ‘Coverage’ section
If there is no warning message, then there are currently no errors on your website.
However, you may see a message of this nature:
“To owner of ‘URL’
Search Console has identified that your site is affected by 1 new Index coverage related issue. This means that Index coverage may be negatively affected in Google Search results. We encourage you to fix this issue.”
In that case:
- Copy the affected URLs that you want to be indexed
- Go back to the robots.txt file on your blog and update it
Use the Robots.txt Tester
If you are still unable to identify the section of your robots.txt file that is blocking the URLs, do the following:
- Copy the affected URL
- Try using the Google robots.txt tester
- The next page will show the section in your robots.txt file where the blockage comes from
- Make changes to your robots.txt file
- Finally, click the ‘Validate Fix’ button in Search Console
Clicking this button will allow Google to recheck these URLs and your robots.txt.
And if there are still errors, you’ll be notified.
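If you would like to locate the blocking line yourself, the idea behind the tester can be approximated in a few lines of Python. This is a deliberately simplified sketch: the function name `find_blocking_rules` and the sample file are invented, and the matching is a plain path-prefix check that ignores wildcards and Allow rules, so the real tester may give a different answer in some cases.

```python
from urllib.parse import urlparse


def find_blocking_rules(robots_txt, url, user_agent="googlebot"):
    """Return (line_number, rule_text) for each Disallow line whose path is a
    prefix of the URL path, within groups for the given bot or for '*'.
    Simplified: no wildcard handling and no Allow precedence."""
    path = urlparse(url).path or "/"
    matches = []
    agents = []
    in_rules = False
    for number, raw in enumerate(robots_txt.splitlines(), start=1):
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        field, value = (part.strip() for part in line.split(":", 1))
        field = field.lower()
        if field == "user-agent":
            if in_rules:  # a User-agent line after rules starts a new group
                agents, in_rules = [], False
            agents.append(value.lower())
        elif field in ("allow", "disallow"):
            in_rules = True
            applies = "*" in agents or user_agent.lower() in agents
            if field == "disallow" and value and applies and path.startswith(value):
                matches.append((number, raw.strip()))
    return matches


# Sample file, invented for this example.
SAMPLE = """\
User-agent: *
Disallow: /wp-admin/
Disallow: /category/
"""

print(find_blocking_rules(SAMPLE, "https://example.com/category/news/"))
```

The line numbers in the result point at the entries you would review or remove in your robots.txt file.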
Changes in Theme
Certain themes can also cause this error.
Therefore, if you had not encountered the error before and you recently changed your theme, it is worth checking the theme.
All you have to do is fix the theme or change it entirely so that it does not hurt your traffic.
Why are Pages Not Indexed?
There are important pages on your Blogger or WordPress site that should be indexed.
But in certain cases, these pages may not be indexed for the following reasons:
Wrong URL format
It is possible to have the wrong URL for a page, which causes a redirection error.
You can fix this by pointing to the right URL. But first, it is important to know where the wrong URL points to.
If it points to a page you would like to show to users on Google, then the right URL needs to be recrawled.
Robots.txt directives
Your robots.txt file could have directives preventing certain pages from being indexed.
These pages could be categories or tags (/search/label pages), and because each of these has its own URL, blocking them could affect the posts within those tags.
Redirect chains
Redirection points one URL to another URL.
However, if there is a redirect chain where one URL points to another and that one points to yet another, the final page could become unreachable.
A long redirect chain may also contain an error at some point, which stops Googlebot from following it.
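A chain like this can be traced with a small function. The sketch below works on a plain dictionary of redirects rather than live HTTP requests, so it runs anywhere; the function name `follow_redirects`, the paths, and the default limit of 10 hops are all arbitrary choices for illustration.

```python
def follow_redirects(redirects, start, max_hops=10):
    """Follow a {source: target} redirect map starting from `start`.
    Returns (chain, status), where status is 'ok', 'loop', or 'too_long'."""
    chain = [start]
    seen = {start}
    current = start
    while current in redirects:
        current = redirects[current]
        chain.append(current)
        if current in seen:
            return chain, "loop"      # the chain points back at itself
        seen.add(current)
        if len(chain) - 1 > max_hops:
            return chain, "too_long"  # more hops than we are willing to follow
    return chain, "ok"


# Hypothetical redirect map: /old -> /older -> /new
redirects = {"/old": "/older", "/older": "/new"}
print(follow_redirects(redirects, "/old"))
```

If the status is anything other than 'ok', pointing the first URL straight at the final page removes the intermediate hops.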
Wrong canonical tag
If the wrong canonical link is used, it could also lead to indexing errors.
The canonical tag is a link element placed in the head section of a page’s HTML.
It tells Googlebot which page is the preferred (canonical) version when duplicate content is found.
For example, there could be a French version of a page originally written in English, and the canonical tag may be used in this case.
The tag could point to the page written in English.
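To check which URL a page declares as canonical, you can read the link element out of its HTML. This is a minimal sketch with Python's standard library; the class name `CanonicalFinder`, the helper `canonical_matches`, and the example.com URLs are assumptions made for the example.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Records the href of the first <link rel="canonical"> tag (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr_map = {name: (value or "") for name, value in attrs}
        # rel can hold several space-separated values.
        if "canonical" in attr_map.get("rel", "").lower().split():
            self.canonical = attr_map.get("href") or None


def canonical_matches(html, expected_url):
    """Return True if the page's canonical URL equals expected_url."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical == expected_url


# Sample markup, invented for this example.
french_page = '<head><link rel="canonical" href="https://example.com/en/page/"></head>'
print(canonical_matches(french_page, "https://example.com/en/page/"))
```

If the function returns False for a page you expect to be indexed, the canonical link is pointing somewhere else and is worth correcting.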
Is Robots.txt Legally Binding?
According to https://www.robotstxt.org/, there is no law stating that /robots.txt must be obeyed.
In the same vein, /robots.txt does not constitute a binding contract between a site owner and its users.
Despite this, it is useful to have a /robots.txt file, because it can be relevant in legal cases.
An error such as ‘Indexed, Though Blocked by Robots.txt’ can easily be fixed if you know your way around it.
There are several reasons why the message could’ve popped up in Google Search Console, and we’ve outlined the most likely ones.
And now that you know the right places to look, you can potentially solve this issue in no time.
Therefore, give it a try.
But if you encounter issues or you’re still unable to resolve this issue, please let us know by commenting below.