Clean up your website
Bill says: “In 2023, we need to sit down and carve out time in our schedules to clean up our websites. Whether that’s optimising the current images on the site so they load faster, cleaning up widgets and website templates, or looking back at what content is getting indexed and what isn’t. For example, lots of large websites allow their internal search results pages to get indexed. However, do you really need all this extra thin content on your website?
Our main goals as SEOs are to optimise more pages, earn more links, and create more content. However, have you considered the content you already have? Always look through your existing content and update it accordingly - whether that’s refreshing page titles, adding FAQs, or other general cleanup tasks. There’s always lots to clean up, so why not address your current website in 2023 and ensure it’s at its best in all aspects? It’s not always about creating new content.”
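The image optimisation Bill mentions is one cleanup task that is easy to script. Below is a minimal sketch in Python using the Pillow library to batch-recompress JPEGs; the directory names and the quality setting are illustrative assumptions, not anything Bill prescribes.

```python
# Batch-recompress JPEGs so existing pages load faster.
# Sketch only: the images/ directory and quality=80 are assumptions.
from pathlib import Path

from PIL import Image  # pip install Pillow

SRC = Path("images")            # hypothetical source directory
DST = Path("images_optimised")  # hypothetical output directory
DST.mkdir(exist_ok=True)

for jpg in SRC.glob("*.jpg"):
    with Image.open(jpg) as im:
        # optimize=True re-encodes with better Huffman tables;
        # quality=80 is a common balance of file size vs. visible quality.
        im.save(DST / jpg.name, "JPEG", quality=80, optimize=True)
    print(f"{jpg.name}: {jpg.stat().st_size} -> "
          f"{(DST / jpg.name).stat().st_size} bytes")
```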
One of the things you mentioned is to look at what pages are indexed and not indexed. Is this a good place to start this cleanup exercise?
“Yes, that’s a very good place to start. There are several ways to look at Search Console data, and various other data as well. Some tools enable you to see which pages are indexed and which pages are not. Within Search Console, you can look at the coverage report and see which pages are crawled but currently not indexed.
There will be reasons why pages are not indexed, and sometimes pages get dropped out of the index for no apparent reason. If it’s an important page to you, go back and refresh it. If it’s an article or blog post written years ago that’s still relevant as evergreen content, you can refresh and resubmit it to get it ranking again. Google might already know about it, so it may just be a matter of refreshing it to get it back into the index. This is all part of the cleanup process.”
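If you want to check indexing status programmatically rather than page by page in the Search Console UI, the URL Inspection API exposes the same coverage information. A minimal sketch, assuming the google-api-python-client package and a service account with read access to the property; the site and URLs are placeholders:

```python
# Check indexing status for a handful of important URLs via the
# Search Console URL Inspection API.
# Sketch only: credential setup and the property URL are assumptions.
from google.oauth2 import service_account
from googleapiclient.discovery import build

SITE = "https://example.com/"  # your verified Search Console property
URLS = ["https://example.com/evergreen-guide/"]  # pages you care about

creds = service_account.Credentials.from_service_account_file(
    "service-account.json",
    scopes=["https://www.googleapis.com/auth/webmasters.readonly"],
)
service = build("searchconsole", "v1", credentials=creds)

for url in URLS:
    result = service.urlInspection().index().inspect(
        body={"inspectionUrl": url, "siteUrl": SITE}
    ).execute()
    status = result["inspectionResult"]["indexStatusResult"]
    # coverageState reads e.g. "Submitted and indexed" or
    # "Crawled - currently not indexed" - the cases Bill describes.
    print(url, "->", status.get("coverageState"))
```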
Why might Google deindex a page that has good content on it? Could it be that the page is no longer linked directly from the homepage or other significant pages on the site, so it’s no longer getting that deemed authority from your site?
“That’s one possibility but, in all honesty, we don’t know why certain pages just get dropped. It could be internal links, it could be fewer clicks from search results. Sometimes random pages get dropped as a hiccup in Google’s crawling and indexing processes. The main thing is: if it’s a page where you can go in and change one thing and resubmit it, that page could be indexed again within 10-15 minutes. In this event, it might not have been the quality of the page that caused it to fall but some other reason that Google came up with.
It’s important to do a bit more investigation into pages that are crawled but not indexed. There are some cases where certain pages simply don’t get indexed.”
Could it be as simple as updating your titles if they’re not deemed as relevant as other pages?
“Yes. By going back and systematically crawling your site, you can figure out whether there’s a reason why certain pages drop. Analytics comes into play too. If you have a list of URLs and there haven’t been any clicks to a page in a year or two, you need to determine why. Is it because it’s not indexed anymore, or is it because the content is just not relevant anymore?”
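One lightweight way to run the check Bill describes is to compare a full list of your URLs against a Pages export from Search Console’s Performance report. A minimal sketch, assuming a hypothetical urls.txt file and the standard export column headers (verify these against your own CSV):

```python
# Flag URLs that haven't earned a single click in the export window.
# Sketch only: assumes urls.txt (one URL per line) and a Pages.csv
# exported from the Performance report with "Top pages" and "Clicks"
# columns - check your export's actual headers.
import csv

with open("urls.txt") as f:
    all_urls = {line.strip() for line in f if line.strip()}

clicked = set()
with open("Pages.csv", newline="") as f:
    for row in csv.DictReader(f):
        if int(row["Clicks"]) > 0:
            clicked.add(row["Top pages"])

for url in sorted(all_urls - clicked):
    # No clicks recorded: is it deindexed, or just no longer relevant?
    print("No clicks:", url)
```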
How much content counts as thin nowadays? Is it a certain number of words, or is it just not answering a question sufficiently?
“When it comes to thin content, there could be a page that ranks very well and that’s very appropriate, but has just 100 words on it. If this answers the question for the user then that’s great. However, it’s always best to look for duplicate content issues.
Many people make the mistake of publishing a blog post without using the ‘more’ tag that splits off the first paragraph. With that tag in place, only the first paragraph appears on the main /blog page, and the full 1,000 words of the post appear on the blog post URL only. Without it, you could have a situation where all of the content appears on the blog post URL and all of the content appears on the main blog page. You’d have the same content in two places and could run into classic duplicate content issues that need cleaning up.
There’s a close correlation between thin content and duplicate content when you’re assessing which pages to remove.”
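The blog-index duplication Bill describes can be measured roughly by comparing overlapping word sequences (shingles) between a post and the /blog page. A minimal sketch using requests and BeautifulSoup, with hypothetical URLs:

```python
# Rough duplicate-content check: how much of a post's text also appears
# on the main /blog page? Sketch only - the URLs are hypothetical, and
# a high overlap suggests the excerpt/"more" split isn't being used.
import re

import requests
from bs4 import BeautifulSoup  # pip install beautifulsoup4

def shingles(url, n=8):
    """Return the set of n-word sequences in the page's visible text."""
    html = requests.get(url, timeout=10).text
    words = re.findall(r"\w+",
                       BeautifulSoup(html, "html.parser").get_text().lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

post = shingles("https://example.com/blog/some-post/")
index = shingles("https://example.com/blog/")

overlap = len(post & index) / max(len(post), 1)
print(f"{overlap:.0%} of the post's 8-word shingles also appear on /blog")
# Near 100% means the full post is duplicated on the index page.
```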
If you’ve got widgets on your blog, how can you tell if they’re having a detrimental impact on user experience and even the search engines’ perceptions of your site?
“Let’s say you have a site you created ten years ago and it still gets traffic, but there’s a widget on it that pulls data or information from another site. If that site goes down, you’re left with a widget that’s just a black rectangle when the page loads. This happens with anything that’s supposed to load information from somewhere else or pull a logo from another site.
It could be a StatCounter code or any other embedded code. This is all part of the cleanup. Another really important thing is who you’re linking out to. Let’s say you use a crawler that crawls all the internal pages on the site, and somewhere you have an old page (or several pages) linking out to another URL. For example, most sites are now HTTPS, so if you scroll through the external links you’re linking out to, you’ll often see an HTTP URL. Those sites are now HTTPS, so you’ll be linking out to a URL that’s a 301 redirect.
On the flip side, there’s a link building tactic: crawl your competitors’ websites and find everybody they’re linking out to. On most sites of 500 pages or more, you’ll find they’re linking out to a domain name that’s not registered anymore. In many cases, people will register that domain name to get the link, so that their competitor is now linking to them. Your competitors could be doing the same to you: you’re linking out to a domain name that doesn’t exist anymore, and a competitor buys that domain name to capture some of the traffic. It’s always important to take a look at these things and clean up.”
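Both of these outbound-link problems - HTTP URLs that now 301 to HTTPS, and links to domains that are no longer registered - can be surfaced with a short crawl of your own pages. A minimal sketch using requests and BeautifulSoup; the page URL is hypothetical, and a real audit would loop over every page found by a site crawl:

```python
# Audit outbound links on a page: flag HTTP URLs that now redirect and
# domains that no longer resolve at all - the two cleanup cases above.
# Sketch only; the page URL is a placeholder.
import socket
from urllib.parse import urlparse

import requests
from bs4 import BeautifulSoup

PAGE = "https://example.com/old-article/"
own_host = urlparse(PAGE).netloc

soup = BeautifulSoup(requests.get(PAGE, timeout=10).text, "html.parser")
external = {a["href"] for a in soup.find_all("a", href=True)
            if urlparse(a["href"]).netloc not in ("", own_host)}

for link in sorted(external):
    host = urlparse(link).netloc
    try:
        socket.getaddrinfo(host, None)
    except socket.gaierror:
        # NXDOMAIN: the domain may be unregistered (and buyable).
        print("Domain no longer resolves:", link)
        continue
    if link.startswith("http://"):
        try:
            resp = requests.head(link, allow_redirects=False, timeout=10)
        except requests.RequestException:
            print("Request failed:", link)
            continue
        if resp.status_code in (301, 308):
            print(f"{link} -> {resp.headers.get('Location')} "
                  f"({resp.status_code})")
```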
What shouldn’t SEOs be doing in 2023? What’s seductive in terms of time, but ultimately counterproductive?
“Fairly recently we’ve been thinking about AI, machine learning, and AI-generated content. You can essentially put in a list of keywords and generate various articles and content. Some tools are better at generating AI content than others, but it looks like a race between Google and the SEOs. As SEOs begin to generate AI content, how long will it be until Google fully understands what content is AI-generated and what isn’t?
We had this situation years ago when people would create hundreds or thousands of doorway pages. People were curious as to how long it would be until Google figured out these were doorway pages, and whether it would ignore them or penalise people accordingly.
There’s a time and a place for using AI. If you’re creating a site and your articles and content are all AI-generated, that’s something you shouldn’t do unless you’re creating content for another site. However, on your main site, you should take a hard look at how much AI-generated content is being used.
AI can be very helpful if you have 100,000 products on your eCommerce site that need product descriptions, meta description tags, and a few accompanying sentences. It could be very useful for drafting, rewriting, and generating those simple product descriptions.
What we shouldn’t do as SEOs is rely too much on AI-generated content. Use it sparingly and only when appropriate.”
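For the one use case Bill does endorse - bulk product descriptions - here is a minimal sketch assuming the official openai Python client and an OPENAI_API_KEY in the environment. The model name, prompt, and catalogue data are all placeholders, and every draft should still get human review:

```python
# Bulk-draft short product descriptions and meta descriptions.
# Sketch only: assumes the openai package (pip install openai) and an
# OPENAI_API_KEY environment variable; the catalogue is hypothetical.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

products = [  # hypothetical catalogue rows
    {"name": "Stainless travel mug", "features": "450ml, leak-proof lid"},
]

for p in products:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name - substitute your own
        messages=[{
            "role": "user",
            "content": (f"Write a two-sentence product description and a "
                        f"155-character meta description for: {p['name']} "
                        f"({p['features']})."),
        }],
    )
    # Print the draft for human review before it goes anywhere near the site.
    print(p["name"], "->", resp.choices[0].message.content)
```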
Bill Hartzer is the CEO of Hartzer Consulting and you can find him over at hartzer.com.