There are countless powerful and effective SEO techniques used by specialists and amateurs everywhere. But what if I told you that there is one simple thing that can wreck your SEO in one blow? Scary? Yeah, the unknown is always terrifying, so let’s get to the point.
Below, we are going to discuss the soft 404 error: what it is, why it occurs, how it can ruin your SEO, and how to fight it with your bare hands. So, let’s get started!
What Is a Soft 404 Error?
Simply put, a soft 404 error is a label a search engine gives your URL when the page is effectively missing but your server does not return a 404 status code for it. The definition is simple enough, but what hides behind it is not the most obvious thing on earth.
Basically, every response code from 400 to 499 signals a client error: the request could not be fulfilled, for example because access is denied or the page doesn’t exist at all. The 404 code has a specific meaning: the server cannot find the requested page. It does not say whether the page is missing temporarily or for good; for a page that has been removed permanently, there is the 410 (Gone) code.
Note: Unfortunately, a URL reported with an HTTP 404 Not Found error is not always a genuinely missing page, so check such reports manually using Google Search Console, SEMrush, WebCEO, Majestic SEO or other backlink checker tools.
According to Google Search Console Help, the catch with a soft 404 error is that a page which should return a 404 response code sends a code other than 404 or 410 (commonly 200 (OK)) to the browser and the bot. The page remains unreachable to the client, and as a result the useless URL may be crawled and added to the search engine’s index.
In most cases, a soft 404 error occurs because:
- A page that has been deleted or temporarily removed does not return the proper HTTP 404 response code
- A missing page redirects the client to a completely irrelevant page (e.g. a page from another category or your homepage)
- A page has very little content, or no content at all
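The three causes above can be sketched as a small check. This is only an illustration, not how any search engine actually decides: the `PageResponse` type, the `soft_404_reason` function, the phrase list, and the 50-word threshold are all assumptions made up for this example, and only redirects to the homepage are detected.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical heuristics; tune them for your own site.
MISSING_PHRASES = ("not found", "no longer available", "does not exist")
MIN_WORDS = 50  # arbitrary cut-off for "little or no content"


@dataclass
class PageResponse:
    url: str
    status: int
    body_text: str = ""
    redirect_target: Optional[str] = None


def soft_404_reason(page: PageResponse) -> Optional[str]:
    """Return a reason if the response looks like a soft 404, else None."""
    if page.status in (404, 410):
        return None  # a real "missing" signal, nothing soft about it
    if page.status in (301, 302) and page.redirect_target:
        # Cause 2: a removed page bounces visitors to the homepage.
        if urlsplit(page.redirect_target).path in ("", "/"):
            return "redirects to homepage"
        return None
    if page.status == 200:
        text = page.body_text.lower()
        # Cause 1: an error message is served with a success code.
        if any(phrase in text for phrase in MISSING_PHRASES):
            return "error message served with 200"
        # Cause 3: the page is empty or nearly empty.
        if len(text.split()) < MIN_WORDS:
            return "little or no content"
    return None
```

For example, `soft_404_reason(PageResponse("https://example.com/old", 200, "Sorry, page not found"))` returns `"error message served with 200"`.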
You should always be aware of the 404 and soft 404 pages on your website. To find them manually, crawl your website on a regular basis and make sure there are no URLs that might harm your website’s ranking.
Where Are Your Soft 404 Errors Listed?
To illustrate how you can check whether your website has soft 404 errors, let’s use the Google Search Console.
After logging into the Google Search Console, pick your website and click on the Crawl tab. Then select the Crawl Errors report, and you will see the particular pages with soft 404 errors.
How Does a Soft 404 Error Impact Your Website Rankings?
As I’ve already mentioned, when a URL that should return 404 sends back a 200 success response while the page is still not available to the user, you get a soft 404 error. But how can it influence your organic search, how is it connected to the crawl rate limit, and what in the world is the crawl limit? Let’s figure this out.
First of all, a soft 404 error is literally telling the crawler to index removed or deleted pages, or pages with little or no content, and to show them in the SERP. Accordingly, instead of indexing the pages that are strategically important for your SEO, the crawler wastes its valuable crawl budget on worthless soft 404 pages, which also hurts your organic search results. That is why it is crucial to make every effort to eradicate soft 404 errors from your website as soon as possible.
What Is a Crawl Budget?
The crawl rate limit is one part of what Google calls the crawl budget. It “represents the number of simultaneous parallel connections Googlebot may use to crawl the site, as well as the time it has to wait between the fetches” (Google Webmaster Blog). It is not a big deal for small websites, but it is extremely important for large ones, because it makes little sense for Google to let the crawler spend an eternity on your site when there are billions of others.
For instance, imagine that you own a huge e-commerce store with about twelve thousand pages. Approximately one thousand of them are soft 404 pages that do not return the proper error response code. The crawler will now spend part of its resources on those one thousand pages, which are completely useless and even harmful for SEO. As a result, you can end up with a thousand 404 Page Not Found pages in the SERP (instead of a thousand important product pages), and the crawler is gone until the next scheduled fetch.
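To put rough numbers on this example, here is a back-of-the-envelope sketch. It reads the store size as about 12,000 pages and assumes, purely for illustration, that the crawler spreads its fetches evenly across all URLs (real crawlers prioritize, so treat this as an intuition aid only). The `wasted_fetches` function and the figure of 3,000 fetches per visit are invented for the example.

```python
def wasted_fetches(total_pages: int, soft_404_pages: int, fetches_per_visit: int):
    """Estimate the share and number of fetches spent on soft 404 pages,
    assuming every URL is equally likely to be fetched."""
    share = soft_404_pages / total_pages
    return share, round(fetches_per_visit * share)


# 12,000 pages, 1,000 of them soft 404s, 3,000 fetches per visit (assumed).
share, wasted = wasted_fetches(12_000, 1_000, 3_000)
print(f"{share:.1%} of the budget, about {wasted} fetches per visit")
```

Under these assumptions, roughly one fetch in twelve goes to a page that should never have been crawled.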
For this reason, a high percentage of soft 404 pages can become a real challenge for a website whose millions of pages already consume large chunks of the crawl budget.
How to Fix Soft 404 Errors
To begin with, ordinary 404 errors are a common thing to have on your website. Some pages may have been deleted a long time ago, intentionally or not, and others may have been temporarily removed on purpose. Or perhaps you have decided to change the structure of your URLs.
If you are aware of these cases, open your Google Search Console, go to Crawl -> Crawl Errors -> URL Errors, and mark them as fixed.
Dealing with soft 404 errors, on the other hand, is a little more involved than it might seem at first glance. First, export the list of all the 404 errors from the Google Search Console.
Since Google allows us to export only 1,000 errors at a time, mark the first thousand as fixed right after you have processed them. After a while, Google will update this section and you will be able to export the next thousand errors.
Next, with the list of URLs in hand, figure out why these pages are flagged as 404, paying special attention to the soft 404 errors.
Avoid checking numerous URLs one by one, because it will take too much of your time. It is more efficient to use a specialized service for that, e.g. httpstatus.io.
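If you would rather script the check than paste URLs into a web service, a minimal sketch along these lines might do. It uses only the Python standard library and deliberately does not follow redirects, so a 301/302 shows up as such together with its target. The names `fetch_status` and `check_urls` and the User-Agent string are made up for this example; it does not read the list from a Search Console export, so the URLs have to be supplied by you.

```python
import http.client
from typing import Callable, Dict, Iterable, Optional, Tuple
from urllib.parse import urlsplit

Result = Tuple[Optional[int], Optional[str]]


def fetch_status(url: str, timeout: float = 10.0) -> Result:
    """Return (status code, Location header) without following redirects."""
    parts = urlsplit(url)
    conn_cls = (http.client.HTTPSConnection if parts.scheme == "https"
                else http.client.HTTPConnection)
    conn = conn_cls(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        # The User-Agent value is an arbitrary label for this sketch.
        conn.request("GET", path, headers={"User-Agent": "soft404-check/0.1"})
        response = conn.getresponse()
        return response.status, response.getheader("Location")
    finally:
        conn.close()


def check_urls(urls: Iterable[str],
               fetch: Callable[[str], Result] = fetch_status) -> Dict[str, Result]:
    """Check each URL; on a network failure store (None, error message)."""
    report: Dict[str, Result] = {}
    for url in urls:
        try:
            report[url] = fetch(url)
        except (OSError, http.client.HTTPException) as exc:
            report[url] = (None, str(exc))
    return report
```

Calling `check_urls(["https://example.com/deleted-page"])` returns a dictionary mapping each URL to its status code and redirect target. A 200 for a page you know is gone, or a 301/302 pointing at the homepage, is a candidate soft 404 worth a closer look.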
On further analysis, you will notice that soft 404 errors most often return a 200 (OK) code. That is a good example of what a soft 404 error generally is: the HTTP response code says that the page exists and should be added to the index, yet there is nothing to show to the user.
In some cases, however, a soft 404 is triggered by an inappropriate 301/302 redirect. Sometimes webmasters intentionally redirect all deleted pages to the homepage (probably trying to preserve link equity), which is puzzling and annoying for the crawler.
There are several ways to fix soft 404 problems:
- Think ahead to prevent soft 404 errors from happening
- Return an HTTP 404 status code for all unneeded and non-existent pages (delete them entirely and remove them from the sitemap for good)
- Redirect the problem page (301/302 redirect) to the URL with the most RELEVANT content (not to the custom 404 error page or the homepage)
- If the soft 404 URL has plenty of link equity, put <META NAME="ROBOTS" CONTENT="NOINDEX, FOLLOW"> in the <head> of the page and leave it as it is
- Recover the page if it was removed accidentally
- Add content to pages that a bot misidentified as missing
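On the server side, the first few fixes boil down to sending the right status code along with your friendly error page. The sketch below shows the idea with Python's built-in `http.server`; it is a toy under stated assumptions, not a production setup. The page tables (`PAGES`, `MOVED`, `REMOVED`), the paths in them, and the `resolve` and `run` helpers are all hypothetical.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional, Tuple

# Hypothetical site data for the example.
PAGES = {"/": "<h1>Home</h1>", "/red-shoes": "<h1>Red shoes</h1>"}
MOVED = {"/crimson-shoes": "/red-shoes"}  # old URL -> most relevant new URL
REMOVED = {"/discontinued-hat"}           # deleted on purpose, for good
NOT_FOUND_HTML = "<h1>Sorry, this page does not exist</h1>"


def resolve(path: str) -> Tuple[int, Optional[str], str]:
    """Map a request path to (status code, redirect target, HTML body)."""
    if path in PAGES:
        return 200, None, PAGES[path]
    if path in MOVED:
        return 301, MOVED[path], ""        # redirect to relevant content
    if path in REMOVED:
        return 410, None, NOT_FOUND_HTML   # permanently gone
    return 404, None, NOT_FOUND_HTML       # custom page, real 404 status


class Handler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        status, location, html = resolve(self.path.split("?", 1)[0])
        body = html.encode("utf-8")
        self.send_response(status)
        if location:
            self.send_header("Location", location)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port: int = 8000) -> None:
    """Serve the toy site locally until interrupted."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

The point of the design is that the visitor still sees a helpful "not found" page, while the crawler receives a 404 or 410 instead of a misleading 200.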
The soft 404 error is something a lot of SEO practitioners, webmasters, and marketers are not aware of. However, ignorance does not relieve you of responsibility. Therefore, it is extremely important to track your website’s URL health so that you can apply timely first aid when needed, especially if you own a website with a few hundred thousand pages.
This is a guest contribution from Jenna Brandon. She is a blogger, content creator, and digital marketer at Writology Custom Writing Service. When she’s not busy writing or studying the latest marketing trends, she cooks pizza or goes hiking with her friends. Jenna is also an avid traveler, and she is secretly Italian at heart.