This guide will help you with the blessed task of finding external backlinks that point to broken pages on your website, WITHOUT missing any, and will go into detail on how to fix them PROPERLY.
If you only have the time or resources to implement one technical SEO recommendation, go with THIS one.
Here is a traffic graph from one of my clients, where the only thing we asked for was 301 redirects for some of the website’s 404 pages:
Pretty neat, right?
What causes broken links on a website?
Broken backlinks are basically website pages that do not exist anymore (usually, we expect them to return a 404 not found status code) which also have backlinks from external websites.
It is quite common to find broken links on almost EVERY domain, as most websites change from time to time. It can be a design refresh, an infrastructure update, a move to a new domain, or even a decision to remove some categories, articles or useful pages from the site.
Why should you fix your site’s broken links?
- Reclaiming link equity (link juice) – in 2018, links are still an important SEO aspect. A REALLY important aspect. According to Backlinko, the number of unique referring domains has the strongest correlation with organic search rankings. So if one of your old URLs returns a 404, it will not benefit from the links pointing to it and will not receive any link equity. In other words, if you have links pointing to broken pages, you should handle them ASAP.
- Traffic – in some instances, you are in a double whammy situation. Not only is your domain not receiving the value of the backlinks, but you are also missing out on potential traffic.
How can you find it? We will assume that every 404 page has the same Meta Title. Check yours to see if it says “Page not Found” or something similar. In Google Analytics, you can find pages by their Meta Title:
Behavior -> Site Content -> Landing Pages
In the secondary dimension, look for Page Title:
Then filter those pages by the Meta Title of your “not found” page:
This way you can see all the 404 pages that still receive traffic and that you should redirect.
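If you export that Landing Pages report to CSV, the same filter can be scripted. Here is a minimal sketch in Python. The column names (`Landing Page`, `Page Title`, `Sessions`) and the title marker are my assumptions, so adjust them to match your own export and your own 404 template.

```python
import csv
import io

# Assumed fragment of the 404 template's Meta Title; change it to match yours.
NOT_FOUND_MARKER = "page not found"


def find_404_landing_pages(csv_text, marker=NOT_FOUND_MARKER):
    """Return (landing_page, sessions) pairs whose title looks like a 404 page."""
    hits = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if marker in row["Page Title"].lower():
            hits.append((row["Landing Page"], int(row["Sessions"])))
    # Most-visited broken pages first, since those are the most urgent to redirect.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical export rows, for illustration only.
sample = """Landing Page,Page Title,Sessions
/discontinued-product,Page not Found,45
/blog/seo-tips,SEO Tips,900
/old-guide,Page not Found,120
"""

for page, sessions in find_404_landing_pages(sample):
    print(page, sessions)
```

Sorting by sessions simply puts the broken pages that lose the most visitors at the top of your redirect list.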
Common pitfalls with broken links and how to handle them
In Ahrefs, go to Pages -> Best by links
Then filter the results by 301 or 302 redirects to see the list of redirected pages.
However, there is a catch.
The tool only displays the first status code in the response header. This means that if a URL is redirected, we cannot tell its final destination without further investigation.
Therefore, there can be two scenarios that will require our attention:
- Redirect chain – according to John Mueller, Google will follow up to 5 redirects (not more):
He also suggests keeping any redirect chain as short as possible.
So if a URL goes through more than two redirects, it is highly recommended to fix it and reduce the number of hops in the chain to a minimum.
- Redirect leads to a 404 page – as I have already mentioned, Ahrefs shows you only the first status code, not the final one. So what happens when, after the first redirect (or a few more), the final URL returns a 404? How can we identify it?
Step 1 – In Screaming Frog, go to Configuration >> Spider >> Advanced
And tick the “Always Follow Redirects” option:
Step 2 – Export the full list of URLs with links and filter it to URLs with at least one referring domain.
Step 3 – In Screaming Frog, go to:
Mode -> List
And then upload the Excel sheet or txt file with all the filtered links.
Step 4 – Export the following report:
Reports – > Redirect and canonical chains
Step 5 – To fix redirect chains: check the number of redirects for every URL and reduce it to a minimum (ask for a redirect straight from URL A to URL B, without unnecessary hops in between).
To fix 404s: check the status code of the final destination URL. If it returns a 404, no link equity is passed and you should definitely consider fixing this link.
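To make the logic behind these steps concrete, here is a rough Python sketch of what "following a redirect to its final status code" means. It is not how Screaming Frog works internally. The `fetch` function is a hypothetical helper you would supply yourself (it must return the status code and the `Location` header for one URL without following redirects automatically), and the sample URLs are made up.

```python
from urllib.parse import urljoin

MAX_HOPS = 5  # the limit of redirects Google follows, as cited above
REDIRECT_CODES = (301, 302, 303, 307, 308)


def trace_redirects(url, fetch, max_hops=MAX_HOPS):
    """Follow redirects one hop at a time and return (chain, final_status).

    final_status is None when the chain loops or exceeds max_hops.
    """
    chain = [url]
    status, location = fetch(url)
    while status in REDIRECT_CODES and location:
        if len(chain) > max_hops:
            return chain, None  # already followed max_hops redirects
        next_url = urljoin(chain[-1], location)  # Location may be relative
        if next_url in chain:
            return chain + [next_url], None  # redirect loop
        chain.append(next_url)
        status, location = fetch(next_url)
    return chain, status


# Made-up responses standing in for real HTTP requests.
FAKE_SITE = {
    "https://example.com/a": (301, "/b"),
    "https://example.com/b": (301, "https://example.com/c"),
    "https://example.com/c": (404, None),
    "https://example.com/ok": (200, None),
}


def fake_fetch(url):
    return FAKE_SITE[url]


chain, status = trace_redirects("https://example.com/a", fake_fetch)
print(len(chain) - 1, "redirects, final status:", status)
```

Any URL whose final status is 404 (or `None`) is a candidate for a direct redirect to a live, relevant page.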
2. 301 redirects to the homepage
As stated by Google, never redirect all your 404 pages to the homepage by default. If there is no relevancy, Google will treat them as soft 404s and devalue them.
Repeat the Screaming Frog process described above to follow each redirect to its final destination, and check whether the redirected path makes sense. Is it a product page or an old blog post? Does it make sense to redirect it to your homepage, or might there be other pages that are more relevant?
For example, think about a subcategory page that does not exist anymore – will it be better to redirect it to the homepage or a similar/closer category page?
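Once you have each old URL paired with its final destination, spotting blanket homepage redirects is easy to script. This is only a sketch: the `destinations` mapping is hypothetical data, and treating "empty path or `/` with no query string" as the homepage is my own simplification.

```python
from urllib.parse import urlsplit


def is_homepage(url):
    """True when the URL points at the site root (path is empty or "/")."""
    parts = urlsplit(url)
    return parts.path in ("", "/") and not parts.query


def homepage_redirects(final_destinations):
    """Return the old URLs whose redirect ends on the homepage."""
    return [
        source
        for source, destination in final_destinations.items()
        if is_homepage(destination) and not is_homepage(source)
    ]


# Hypothetical old URL -> final destination pairs.
destinations = {
    "https://example.com/shoes/sandals": "https://example.com/",
    "https://example.com/blog/old-post": "https://example.com/blog/new-post",
    "https://example.com/old-category": "https://example.com",
}

for url in homepage_redirects(destinations):
    print("Review:", url)
```

The script only flags candidates. Whether the homepage is really the most relevant target is still a judgment call you need to make page by page.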
3. Soft 404 pages
What are soft 404 pages?
A soft 404 is a “not found” page that returns a 200 header instead of a 404 (or a 410 in some instances).
How can you check if you have soft 404’s on your site?
In your browser, type a random string after the root domain:
Check the response header for this page (I use the Ayima Chrome extension). If it returns a 404, you can easily find your broken pages with Ahrefs and redirect them:
However, if you see a 200 header, your error pages are soft 404s, and you will not be able to identify broken links by status code alone, if there are any.
Step 1 – Fix it. Ask your developer to return a 404 header for all error pages.
Step 2 – Export the list from Ahrefs again (this time only the URLs with 200 headers) and run it in Screaming Frog with Mode -> List.
Step 3 – Check all page titles. You should expect to see the same title on every error page (“404 not found” or something similar).
Another option is to check the error pages by their H1:
Filter by the H1 column in Screaming Frog and you are basically done.
Check them by either the Meta Title tab or the H1:
Step 4 – After identifying all soft 404’s, you can redirect them to the most relevant pages.
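If the crawl is large, the title and H1 check can be run over the exported CSV instead of by eye. The sketch below assumes the export has columns named `Address`, `Status Code`, `Title 1` and `H1-1`, and that your error template contains the words "not found". Both are assumptions to verify against your own file.

```python
import csv
import io


def find_soft_404s(crawl_csv, error_text="not found"):
    """Return URLs that answer 200 but whose title or H1 looks like an error page."""
    soft_404s = []
    for row in csv.DictReader(io.StringIO(crawl_csv)):
        looks_like_error = (
            error_text in row["Title 1"].lower()
            or error_text in row["H1-1"].lower()
        )
        if row["Status Code"] == "200" and looks_like_error:
            soft_404s.append(row["Address"])
    return soft_404s


# Hypothetical crawl export, for illustration only.
crawl = """Address,Status Code,Title 1,H1-1
https://example.com/old-page,200,Page Not Found,Oops! Page not found
https://example.com/about,200,About Us,About Us
https://example.com/gone,404,Page Not Found,Oops! Page not found
"""

print(find_soft_404s(crawl))
```

Pages that already return a real 404 are deliberately skipped here, because those show up in the regular broken-link reports.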
4. Page is currently not in our index
According to ahrefs, that indicates that the target page is not in their index.
While I have not found anything that should prevent these pages from being indexed, we still need a solution for this.
It is problematic because we do not know the status code of those pages: are they returning 200 or 404, or are they being redirected appropriately?
The solution: export the list from Ahrefs and crawl it in Screaming Frog. You will get the status code for every URL on the list, which tells you which URLs need to be redirected.
5. Robots.txt
What if I told you about a scenario in which all your broken links are redirected properly, but still do not pass any value (AKA link juice)?
This can happen much more often than you would expect, when the deployment of the robots.txt file has been overlooked.
If you disallow specific folders or URLs via robots.txt, Google will not crawl them and the link juice will not be passed.
So how can you check it?
Once your robots.txt is in place, export the list of backlinked URLs and check whether any of its rules block them.
If some lines are blocking the redirected URLs, make sure you remove them (but carefully).
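For a quick check of a URL list against your rules, Python's standard library ships a robots.txt parser. The sketch below uses made-up rules and URLs. Keep in mind that `urllib.robotparser` follows the original robots.txt conventions and does not interpret wildcard patterns the way Google does, so treat the result as a first pass rather than the final word.

```python
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt, urls, user_agent="Googlebot"):
    """Return the URLs that the given robots.txt rules disallow for user_agent."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(user_agent, url)]


# Made-up rules and backlinked URLs, for illustration only.
robots = """User-agent: *
Disallow: /old-shop/
"""

backlinked = [
    "https://example.com/old-shop/item-1",
    "https://example.com/blog/post",
]

print(blocked_urls(robots, backlinked))
```

Any URL in the output has backlinks but cannot be crawled, so its redirect will never be seen.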
6. Meta robots noindex, nofollow
Similar to section no. 5 (regarding robots.txt), this one can be easily missed, as most link analysis tools report neither the robots.txt status nor the Meta Robots tag.
So even if these pages return a 200 status code, it still does not guarantee Google will index and crawl them.
Let’s say you have dozens of out-of-stock product pages that you have decided to block with noindex a while ago. How can you make sure you’re not missing any of those?
Back to the Screaming Frog again, after crawling the list of given URLs, go to the “directives” column and filter by noindex or nofollow:
If you find pages with backlinks and a conflicting Meta Robots tag (noindex, nofollow or both), decide what to do with each page:
- Remove the Meta Robots tag – in case you find these pages useful
- Perform a 301 redirect to the most relevant pages
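If you would rather check the HTML yourself, the Meta Robots tag can be read with Python's built-in HTML parser. This is a simplified sketch: it only looks at `<meta name="robots">` and `<meta name="googlebot">` tags in markup you pass in, and the sample HTML is made up.

```python
from html.parser import HTMLParser

BLOCKING = {"noindex", "nofollow", "none"}  # "none" means noindex plus nofollow


class MetaRobotsParser(HTMLParser):
    """Collect the directives found in robots/googlebot meta tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() in ("robots", "googlebot"):
            content = attributes.get("content") or ""
            for part in content.split(","):
                if part.strip():
                    self.directives.add(part.strip().lower())


def blocking_directives(html):
    """Return the sorted blocking directives present in the page's meta tags."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return sorted(parser.directives & BLOCKING)


# Made-up page source, for illustration only.
page = '<html><head><meta name="robots" content="NOINDEX, follow"></head></html>'
print(blocking_directives(page))
```

An empty list means the markup itself carries no blocking directive. A non-empty list on a page with backlinks is your cue to pick one of the two options above.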
7. Unknown old domains
You are starting to work with a new client and their link graph is nothing to brag about. But wait!
You do not need any fancy SEO tools to check it. Just ASK your client if there are any old domains or properties. These can be either an earlier version of their main website or other old domains that are at their disposal.
You will be surprised to find how many times clients don’t understand the importance behind their old domains and do not even bother to mention it.
And NO, you will not see them in Ahrefs or any other tool if nobody has bothered to redirect them.
Recently, I even found an old domain for one of my clients by accident. In their mind, it was just an old website, and they were not the slightest bit aware of its potential (although that’s always the first question I ask).
The result? They forgot to renew the domain ownership, someone else had bought it, and just like that – more than 400 relevant referring domains were flushed down the drain:
8. Toxic links
Before blindly asking to redirect 404 pages, you should first ask yourself whether the links pointing to them are worth reclaiming.
It’s not only the NUMBER of referring domains that matters, but also their QUALITY.
Take into account that in some instances, if a given page returns a 404 header (or a 410 for that matter), it may have been intentional.
Perhaps the former SEO company was smart enough to remove some old pages that had toxic spammy links that had been built manually in the past.
So the idea is to really analyze those links BEFORE deciding whether to redirect them or not.
Make sure you understand what an unnatural link is before taking action (although this guide is originally from 2013, it is still relevant in 2018).
- If you have valuable broken links to your website – make sure you properly fix them.
- Make sure you are not missing out on anything.
- Analyze the links. Fix them properly.
- It’s not enough to just go over your backlinks.
What do you think? After checking your link graph, did you find any nice surprises? 🙂 Hit me in the comments section!