As your website grows, the number of big and small issues with your content snowballs. This is inevitable: CMS limitations, lagging behind new SEO trends, or simple human error all take their toll. Whatever the cause, content issues should be addressed promptly to minimize the damage they do to your website’s SEO.
However, a technical SEO audit can take ages unless you use a tool that scans the entire website in minutes and builds a comprehensive automated report. For this purpose, we’re using Website Audit by SE Ranking to see what content-related SEO issues we can spot and fix.
What to Expect from SE Ranking’s Website Audit
If you have an account with SE Ranking, its Website Audit makes detecting your website’s technical SEO issues a matter of one click. Once launched, the audit runs in the background and checks your website against 100 parameters at a speed of 500 pages per minute. You get the results in the form of an Overview Dashboard, an Issue Report, and several lists of crawled pages, links, and resources.
To see if your website has any content-related SEO issues, you should go through several sections of the Issue Report, where you can access the full list of pages affected by every issue. Now let’s check what problems you can detect and resolve.
Misused Crawling and Indexing Restrictions
Your website will have pages that you want to see in search results, such as blog posts or product pages, as well as pages that you don’t want to see there, such as login pages or site search results. To forbid search engine bots from processing some pages, add page-level restrictive directives to:
- The robots meta tag. To the head section of the HTML file, add <meta name="robots" content="noindex" />. This directive blocks all robots from indexing the page.
- The X-Robots-Tag. To the HTTP response headers for a given URL, add X-Robots-Tag: noindex or X-Robots-Tag: none. These directives block robots from indexing the page, or from both indexing it and following its links, respectively.
- The robots.txt file. To the robots.txt file in the root directory of your website, add Disallow: [path]. This directive tells search bots which pages not to crawl.
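If you want to sanity-check your robots.txt logic before deploying it, Python’s standard-library urllib.robotparser applies the same matching rules that compliant crawlers use. The rules and paths below are hypothetical examples, not your actual site:

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules for illustration
rules = """\
User-agent: *
Disallow: /login/
Disallow: /search
""".splitlines()

parser = RobotFileParser()
parser.parse(rules)

# Public content stays crawlable...
print(parser.can_fetch("*", "https://example.com/blog/post"))  # → True
# ...while utility pages are blocked
print(parser.can_fetch("*", "https://example.com/login/"))     # → False
```

Running a few known-good and known-bad URLs through such a check catches typos in Disallow paths before they block the wrong pages in production.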
Crawling and indexing restrictions may be mistakenly applied to the wrong pages and block your SEO content from search engines. Conversely, some of the pages that should be blocked might get overlooked and accidentally make it into the index.
To ensure that your indexing restrictions work as intended, you should regularly look through lists of pages blocked from search bots. If any sitewide changes are being made to your code, indexing restrictions deserve a double check. Every time you run a website audit, you’ll find all pages blocked from bots in the Crawling section of your Issue Report.
In the Crawling section of the report, you’ll see how many pages have been blocked from crawling by your robots.txt file and how many have been blocked from indexing by the robots meta tag or X-Robots-Tag. Click the number in the Pages column to see the full list of such pages. You can also check and filter the list separately in the Crawled Pages section.
If you see that some of the listed pages shouldn’t be there, or that some pages are missing, go to your site and fix the restrictive directives for those pages. Before doing so, keep in mind how the directives in robots.txt, the robots meta tag, and the X-Robots-Tag interact. There are two things to remember:
- You should use the disallow directive in the robots.txt file to limit crawler traffic to the site, but you shouldn’t expect it to prevent your pages from indexing. Search engines can find and index a disallowed page if other websites link to it.
- If you noindex some pages using the robots meta tag or X-Robots-Tag, you shouldn’t disallow these pages in your robots.txt file. If bots are forbidden from accessing these pages, they won’t be able to read your noindex directives, and as a result, the pages may get indexed if other websites link to them.
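To illustrate the second point, here is a hypothetical setup where noindex works as intended: the page stays crawlable in robots.txt, so bots can actually read the directive. The paths are made up for the example.

```text
# robots.txt — note that /old-promo/ is NOT disallowed here;
# if it were, bots could never see the noindex directive below
User-agent: *
Disallow: /admin/

<!-- HTML head of /old-promo/, which bots are free to crawl -->
<meta name="robots" content="noindex" />
```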
After you fix all the issues detected, you can restart the audit to see if the old problems have been solved or if any new problems have arisen. You can restart the audit manually at any time. But if you configure it to run on schedule—once a week or once a month—you’ll never miss any critical issues.
Similar Content on Multiple Pages
You don’t have to engage in deceptive practices, like scraping or copying someone else’s content, to face the issue called duplicate content. Pages with identical or overlapping content can appear on your site unintentionally, for purely technical reasons, including:
- Using UTM parameters for tracking your content distribution or marketing campaigns. A parameterized URL that you share on social media (something like example.com/page?utm_source=facebook) serves content 100% identical to the original clean URL.
- Adding site search, filtering, and sorting features to your site and having them show results on parameterized URLs (https://onlinestore.com/clothing/shirts.html?Size=S). Different combinations of filters or search parameters may return nearly identical results and create thousands of near-duplicate pages on different URLs.
- Maintaining several versions of your website. If you haven’t redirected your site’s www version to non-www, or the HTTP version to HTTPS, you’ll have at least one duplicate for every page. You’ll also have duplicated pages if you have URLs with and without trailing slashes or index.html segments or similar issues.
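The UTM case above lends itself to a simple normalization rule: strip the tracking parameters and the URL collapses back to its clean form. A minimal sketch using Python’s stdlib urllib.parse (the list of tracking parameters is an assumption; extend it for your own campaigns):

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Common tracking parameters (assumed list; adjust to your campaigns)
TRACKING = {"utm_source", "utm_medium", "utm_campaign", "utm_term", "utm_content"}

def canonical_url(url):
    """Strip tracking parameters so URL variants collapse to one clean page."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query) if k not in TRACKING]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(query), ""))

print(canonical_url("https://example.com/page?utm_source=facebook"))
# → https://example.com/page
print(canonical_url("https://example.com/shirts?utm_source=fb&Size=S"))
# → https://example.com/shirts?Size=S
```

Note that functional parameters like Size=S survive the cleanup; only the tracking noise is removed.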
By all means, you should get rid of duplicated pages: they waste your site’s crawl budget, compete against each other for the same target keywords on SERPs, and dilute link equity if different variants of the same page acquire backlinks.
SE Ranking’s Website Audit will help you find all the sources of identical content on your site and get the lists of affected pages. You’ll find them in the Duplicate Content section of the Issue Report.
When dealing with the coexistence of www and non-www versions of your website or similar technical errors, setting 301 redirects is a go-to solution. After you set up a redirect, you’ll have only one version of the duplicate page available to both visitors and search engines.
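As an illustration, here is a minimal nginx sketch (assuming you run nginx; Apache would use Redirect or RewriteRule instead) that 301-redirects both the HTTP and www variants to a single canonical HTTPS non-www host. The hostname is a placeholder:

```nginx
# Send every http:// request, www or not, to the canonical host
server {
    listen 80;
    server_name example.com www.example.com;
    return 301 https://example.com$request_uri;
}

# Send https://www requests to the non-www host
server {
    listen 443 ssl;
    server_name www.example.com;
    # ssl_certificate / ssl_certificate_key directives omitted for brevity
    return 301 https://example.com$request_uri;
}
```

Using $request_uri preserves the original path and query string, so deep links keep working after the redirect.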
When you deal with parameterized URLs, the best solution is to canonicalize the original URLs by adding the <link rel="canonical" href="https://example.com/original-page" /> tag in the <head> section of every duplicate page. This is how you tell search engines that a given page is a copy of the specified URL and should be consolidated with it.
However, when using canonicals, make sure you don’t add more than one rel="canonical" to a page. SE Ranking will flag pages with two or more canonicals pointing to different URLs. It’s considered an issue because when a page specifies different URLs as canonicals, search engines ignore these instructions and consolidate the page with none of them.
On-page SEO Issues Concerning Text Content
The lion’s share of SEO efforts is centered around the textual content of every page. There are rules for every aspect of the content, like the recommended word count per page, formatting best practices, and compression techniques for text-based resources.
SE Ranking’s Website Audit can help you keep an eye on the most measurable on-page SEO metrics. You’ll find information on 10 types of common text-related SEO issues in the Textual Content, Title, and Description sections of the Issue Report.
Some issues reported by the audit will indicate pages that don’t follow general SEO content rules. You’ll see if any of your pages are affected by the following problems:
- Missing or empty H1 tag. Every page should have a headline marked up with the first-level HTML header tag. Search engines use the <h1> tag to define what the page is about. If <h1> is missing from the HTML or doesn’t contain anything, you lose an opportunity to add the primary keyword to a prominent place.
- Multiple H1 tags. Although you won’t be penalized for having multiple <h1> tags on a page, the best practice is to have just one. Multiple <h1> tags can dilute the weight of your primary keyword by spreading it across too much content marked up with this tag.
- Duplicate H1 tags. Every page should have a unique <h1> tag. If multiple pages on a site have identical header tags, it will be more difficult for search engines to decide which page is more relevant to a particular search query.
- Title matches H1 tag. Every page should have <title> and <h1> tags that present it from slightly different angles. Both tags should incorporate keywords but can use different variations. By having identical <title> and <h1> tags, you’re missing out on a good optimization opportunity.
- Missing or empty H2 tags. Pages with long content should be organized and structured. By adding <h2>, <h3> tags, and next-level subheadings to the text, you divide it into subsections representing subtopics covered in the text. If a long text doesn’t have <h2> tags, it looks unstructured and difficult to navigate for readers and search engines.
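As a rough illustration of how an audit tool can detect the missing-H1 and multiple-H1 cases above, the stdlib html.parser can count <h1> tags in a page’s markup. This is a sketch of the general technique, not SE Ranking’s actual implementation:

```python
from html.parser import HTMLParser

class H1Counter(HTMLParser):
    """Counts opening <h1> tags encountered in an HTML document."""
    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        if tag == "h1":
            self.count += 1

def count_h1(html):
    parser = H1Counter()
    parser.feed(html)
    return parser.count

print(count_h1("<h1>One headline</h1><h2>Subsection</h2>"))  # → 1
print(count_h1("<p>No heading at all</p>"))                  # → 0
```

A page scoring 0 here maps to the "missing H1" issue, and anything above 1 maps to "multiple H1 tags".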
Other issues flagged by the audit tool concern pages with parameters that fall outside the optimal range you specify in the audit settings. You can either rely on the default values or adjust them to meet the specific requirements of your niche:
- Low word count (fewer than 250 words by default). The optimal text length depends on the type of page. For example, a blog post should run more than 750 words to fully cover its topic, while product pages in online stores rarely reach 500 words. Still, most search engines will likely consider content with fewer than 250 words insufficient.
- Wordy H1/H2 heading (over 100 characters by default). Headings of all levels should be concise and informative. Wordy headings can fail to communicate the main message of the page or its section. Headings can also appear in some SERP elements (for example, in the “Also covered on this page” feature or in featured snippets), so it’s good if they stay informative even when they get cut off.
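Both threshold checks above boil down to counting words and characters. A minimal sketch, using the default thresholds mentioned in the list (the function name and return shape are made up for illustration):

```python
def content_checks(body_text, headings, min_words=250, max_heading_chars=100):
    """Flag a page that falls outside the default audit thresholds."""
    return {
        "low_word_count": len(body_text.split()) < min_words,
        "wordy_headings": [h for h in headings if len(h) > max_heading_chars],
    }

report = content_checks("word " * 100, ["A concise H1", "B" * 120])
print(report["low_word_count"])        # → True (100 words is below 250)
print(len(report["wordy_headings"]))   # → 1 (the 120-character heading)
```

Passing different min_words values per page type (blog post vs. product page) mirrors how you would adjust the thresholds in the audit settings.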
One more critical aspect of text content is its weight. Every redundant byte in your HTML file wastes users’ bandwidth and slows down page loading. Therefore, you should apply text compression algorithms, such as GZIP, to make your pages more lightweight. The Website Speed section of the Issue Report will show if you have any pages with uncompressed content.
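To get a feel for why compression matters, this sketch gzips a repetitive HTML payload with Python’s stdlib gzip module. The sample markup is made up; real pages compress less dramatically, but HTML’s repetitive tags still shrink substantially:

```python
import gzip

# A deliberately repetitive page body; markup like this compresses very well
html = b"<p>" + b"Lorem ipsum dolor sit amet. " * 200 + b"</p>"
compressed = gzip.compress(html)

print(f"{len(html)} bytes -> {len(compressed)} bytes "
      f"({len(compressed) / len(html):.0%} of original)")
```

In production you wouldn’t compress in application code like this; you’d enable gzip (or Brotli) in the web server so responses are compressed on the fly.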
On-page SEO Issues Caused by Visual Content
On the one hand, visuals are often the heaviest elements on a web page and can slow down your site, unless you properly compress them. On the other hand, visuals are a valuable SEO asset, as they can rank in image search results if search engines correctly identify what’s on them using the alt text.
File size and alt text are the two most basic image attributes you should control and optimize. Image size affects page loading speed, while alt text is what users see if the image fails to load. On top of that, alt texts offer an additional optimization opportunity because they can contain keywords related to the image.
To check if your images and alt texts are properly optimized, you should look at the Images section of your Issue Report. There, you’ll be able to see the list of images with problems for every page, as well as their location, size, and load time.
If you see that some images are flagged as too big, check their current file size and see if they can be optimized. The potential savings from compressing images can amount to hundreds of megabytes sitewide. To maximize savings, choose file formats that deliver the smallest file size with minimal loss of image quality. Instead of the traditional JPEG, GIF, and PNG formats, consider next-generation formats such as AVIF or WebP.
Another thing that can be fixed quickly is adding alt text to images that lack it, especially if you download the full list of pages containing <img> tags that are missing the alt="" attribute. The alt attribute helps search engine robots understand what is depicted in the image and is displayed to users if the browser can’t load it. Alt texts also make your content friendlier to visually impaired people who use screen readers, improving your website’s accessibility.
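Both fixes can be combined in markup. The hypothetical snippet below serves a lighter WebP version where the browser supports it, falls back to JPEG elsewhere, and carries descriptive alt text (file names and alt wording are examples):

```html
<picture>
  <!-- Browsers that support WebP download the lighter file -->
  <source srcset="/images/red-shirt.webp" type="image/webp" />
  <!-- Older browsers fall back to JPEG; alt text describes the image -->
  <img src="/images/red-shirt.jpg" alt="Red cotton slim-fit shirt, front view" />
</picture>
```

The alt attribute lives on the fallback <img> element, and it applies regardless of which source the browser actually loads.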
There’s one more thing to keep in mind when checking your image SEO parameters. Images are stored separately from the page’s HTML file. They can live in dedicated folders on your site or even on a third-party host. An image from a different site may load over the insecure HTTP protocol. This can result in a mixed content issue and affect your site’s user experience. To check if you have this issue, look at the Mixed Content issue in the Website Security section of the Issue Report.
If your page’s images load through HTTP, search engines may flag your website for insecure connection, even though the main HTML file is loaded through HTTPS. To avoid unpleasant surprises on SERPs, make sure that images load through HTTPS.
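A simple way to hunt for this kind of mixed content in raw HTML is to scan <img> tags for http:// sources. A sketch with the stdlib html.parser (the sample URLs are made up):

```python
from html.parser import HTMLParser

class MixedContentFinder(HTMLParser):
    """Collects image URLs that would load over insecure HTTP."""
    def __init__(self):
        super().__init__()
        self.insecure = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src") or ""
            if src.startswith("http://"):
                self.insecure.append(src)

finder = MixedContentFinder()
finder.feed('<img src="http://cdn.example.com/a.jpg">'
            '<img src="https://example.com/b.png">')
print(finder.insecure)  # → ['http://cdn.example.com/a.jpg']
```

A full checker would also cover srcset attributes, CSS background images, and scripts, but the idea is the same: flag any subresource whose URL starts with http://.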
There’s a lot to keep in mind when creating and optimizing your site’s content, from keyword research to overall content quality assessment. An SEO audit tool helps you catch the small issues that are difficult to track manually. While some of them are merely troublesome, others can do serious damage to your SEO. Therefore, running regular automated audits is an excellent way to ensure that you spot and fix issues in a timely manner.