What Is Technical SEO? Top 5 Important Elements
Orhun Karadag
In this post, we will answer the question “What is technical SEO?” We will start with how search engine bots crawl our website and with the logic of indexing. Then we will talk about the robots.txt file and sitemaps, which are very important for bots to crawl and index our site.
After that, we will focus on HTTP response codes that inform the bots about the status of our pages, and finally, take a look at the page-speed optimizations and mobile-first indexing that are vital to technical SEO.
As a note, we are an Edinburgh-based digital marketing agency. We use cutting-edge techniques for all of our clients, taking into account all the latest improvements and updates, and this way we provide high-quality SEO services. If you have any queries, contact us!
Now, let’s delve into crucial technical SEO elements.
1) Crawling and Indexing
What is Crawling and Crawling Budget?
“Crawling” is the process by which search engine spiders (bots) review the pages of our site over a certain period of time. This review is carried out by following the links that the pages on our site give to each other (internal links) or by following the links given to our pages from outside our site (external links).
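To make the internal/external distinction concrete, here is a minimal sketch of how a crawler might sort the links it finds on a page. It uses only the Python standard library; the page URL, the sample HTML and the names `LinkCollector` and `split_links` are assumptions made up for illustration, and real search engine bots are far more sophisticated.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def split_links(page_url, html):
    """Return (internal, external) lists of absolute URLs found in the HTML."""
    collector = LinkCollector()
    collector.feed(html)
    site_host = urlparse(page_url).netloc
    internal, external = [], []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parts = urlparse(absolute)
        if parts.scheme not in ("http", "https"):
            continue  # skip mailto:, tel: and similar non-crawlable links
        if parts.netloc == site_host:
            internal.append(absolute)
        else:
            external.append(absolute)
    return internal, external


# Hypothetical page content, used only for illustration.
SAMPLE_HTML = """
<a href="/blog/">Blog</a>
<a href="contact.html">Contact</a>
<a href="https://other.example.org/page">Partner</a>
<a href="mailto:hello@example.com">Mail</a>
"""

internal, external = split_links("https://example.com/services/", SAMPLE_HTML)
print("internal:", internal)
print("external:", external)
```

A crawler would then queue the internal URLs for later visits, which is why a clear internal link structure helps bots discover our pages.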
It is not possible for bots to explore every single page on the internet. It is reasonable to assume that they assign a quality value to each site. In the SEO world, the value that bots give to our site is represented by the crawl budget.
The crawl budget does not have a specific numerical value, but by using search engine optimization (SEO) techniques, we can use this budget efficiently so that bots discover, crawl, and index our pages more frequently.
If the bots are able to crawl our page during the review described above, they associate the page with the keywords it contains and store it. We call this process indexing.
What is Indexing?
Indexing is the process by which search engines store our pages in a database, taking various metrics into account.
When a user searches for a keyword in the search engine, indexing is not performed in real time. Instead, Google (or any other search engine) sorts the groups of pages that it has previously indexed and associated with the search query (keyword) and returns the results most relevant to that query.
When search engines sort these page groups, they use many algorithmic factors and ranking signals; this process is called “ranking”. For a given search query, the first 10 organic results typically appear on the first page.
Just because we have a page does not necessarily mean that bots will crawl or index it. Therefore, if we want a page to be in the search engine database and show up in search results, it should be crawlable and indexable. At the same time, by optimizing the crawl budget with technical SEO, we can help the bots discover the pages we care about and value most.
2) Robots.txt and Sitemap
What is a robots.txt File and What Does It Do?
The robots.txt file is a text file located in the root directory of our site that lets us control how search engine spiders access and crawl our pages. It can be accessed via the URL “oursite.com/robots.txt”.
There are certain directives to be aware of when preparing your robots.txt file. These are User-agent, Allow, Disallow and Noindex.
The User-agent directive tells us which user-agents the rules in our robots.txt file apply to.
With the “User-agent: *” line, you can target and control all user-agents. If you want to address a bot from a specific source, all you have to do is give that bot's name as the user-agent. For example: “User-agent: Googlebot” or “User-agent: Yandex”.
The Allow directive permits the relevant user-agent to crawl our site or a specific page.
Adding the “Allow: /” line to our robots.txt file indicates that our entire site is crawlable by the relevant user-agent.
Adding “Disallow: /” to our robots.txt file indicates that our entire site cannot be crawled by the relevant user-agent. Note that a “Disallow:” line with no path after it blocks nothing.
The Noindex directive is still a topic of discussion in the SEO ecosystem, but I've seen it work in some of the tests I've done personally. The line “Noindex: /” or “Noindex: /sub-folder/” is meant to remove the entire site or a certain page path from the index for the relevant user-agents without any redirect. Keep in mind that Google never officially supported this directive and stopped honoring it on September 1, 2019, so a meta robots noindex tag is the safer choice today.
Of course, the Allow: and Disallow: lines in the robots.txt file do not always have to be site-wide. We may block or grant access for certain pages on our site. To do this, we specify the path of the page or folder, such as “/sub-folder/”, instead of using “/” after the directive. Robots.txt does not support full regular expressions, but major search engines do recognize a few pattern-matching characters, such as “*” (any sequence of characters) and “$” (end of the URL).
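As a quick way to check how such rules are read, here is a small sketch that uses Python's built-in `urllib.robotparser` module. The robots.txt content, the paths and the helper name `is_crawlable` are hypothetical and exist only for this example.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths are placeholders.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /private/

User-agent: *
Disallow: /admin/
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawlable(user_agent, url):
    """Return True if the parsed rules let this user-agent fetch the URL."""
    return parser.can_fetch(user_agent, url)


for agent, url in [
    ("Googlebot", "https://example.com/private/report.html"),
    ("Googlebot", "https://example.com/blog/"),
    ("Yandex", "https://example.com/admin/login"),
    ("Yandex", "https://example.com/blog/"),
]:
    print(agent, url, is_crawlable(agent, url))
```

Note that Python's parser applies the first matching rule within a group and does not implement the “*” and “$” wildcards, while Google uses the most specific matching rule, so treat the output as a rough check rather than a guarantee of how a given search engine will behave.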
What is a Sitemap and what does it do?
Sitemaps, as the name suggests, are files that communicate the link architecture of a site, in XML format, to the bots that regularly visit the site. Sitemaps can be created in different structures for URLs, images, videos or news within our site. A sitemap is generally served at “website.com/sitemap.xml”, but it can be presented to visitors and bots under a different name as well. You can check Google's guide to sitemaps for more detailed information.
The most important rule in creating sitemaps is that a sitemap should only contain URLs from our own site.
In order to use our crawl budget efficiently, we should only include URLs that return a 200 status code in our sitemap and leave out any URL with a non-200 response code.
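The rule above can be sketched in a few lines of Python. This is only an illustration under assumptions: the URL list and status codes are invented, `build_sitemap` is a hypothetical helper, and a real sitemap generator would take its data from a crawl or the CMS. The XML namespace is the one defined by the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Build sitemap XML from (url, status_code) pairs, keeping only 200s."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url, status in pages:
        if status != 200:
            continue  # redirects and errors do not belong in a sitemap
        url_element = ET.SubElement(urlset, "url")
        ET.SubElement(url_element, "loc").text = url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical crawl results, used only for illustration.
CRAWLED = [
    ("https://example.com/", 200),
    ("https://example.com/blog/", 200),
    ("https://example.com/old-page/", 301),
    ("https://example.com/missing/", 404),
]

sitemap_xml = build_sitemap(CRAWLED)
print(sitemap_xml)
```

Only the two URLs that return 200 end up in the output, so bots do not spend crawl budget on redirected or missing pages.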
We can continue our technical SEO article with response codes.
3) HTTP Response Codes and Redirects
An HTTP status code is the response the server sends when it has processed a request from the browser, whether or not the request succeeded. This is why we call them HTTP response codes or status codes.
HTTP response codes are one of the most obvious ways to see what happens between the browser and the server. That's why search engine spiders, like Googlebot, read these codes when they request a page from the server to see the health/status of that page. The most important response code classes on the SEO side are 2xx, 3xx, 4xx and 5xx.
2xx Response Codes
200 is the successful response code. The overall goal here is to serve visitors and bots a working web page. All codes in the 2xx range indicate some form of success.
3xx Response Codes
3xx response/status codes tell bots and visitors that a URL redirects to another address. The most important on the organic side are 301 and 302, although 307 is also seen frequently.
301: This is a permanent redirect. The 301 HTTP status is used when the requested address has permanently moved to a new address. This allows the old URL to be replaced by the new one in search engine indexes.
302: This is a temporary redirect. The 302 response code means that the requested address has temporarily moved to a new address. In a scenario where the original URL will continue to exist, we may prefer the 302 response code. In this case, the old URL will remain in the search engine indexes. Here are two quick examples:
Say we are permanently closing or deleting page A, and page B is the closest equivalent on the site. If we want to permanently transfer all the authority of page A, including its backlinks, to page B, the redirect code we use is 301.
If we are temporarily closing page A and plan to open it again after a while, we can temporarily redirect it to page B, which is the closest equivalent. With a 302 redirect, we inform the bots that page A will come back later and that there is no need to transfer any page authority.
4xx Response Codes
The HTTP error codes in the 4xx range seen most often are 403, 404 and 410. Of these, 404 is the most common; with it, the server tells search engine spiders and visitors that the requested page could not be found.
The presence of pages with a 404 response code on our site is not a problem in itself. If a page returning a 404 error has a relevant counterpart on our site, it makes sense to pass its potential authority on to that page. In that case, a permanent 301 redirect to the relevant page can be used as a solution in some scenarios. You can see 404 errors on your site in your Search Console account under the “Crawl Errors” section.
5xx Response Codes
The most common 5xx server error codes are 500 and 503. Typically, an error in this response code range comes directly from the server. It is generally the job of the developer or the hosting provider to eliminate the problem.
The most common of these is the 500 response code, which means “Internal Server Error”. It is a general error message indicating that the server has encountered an unexpected situation that prevents it from fulfilling the request.
503 means that the service is unavailable and indicates to bots and visitors that the server cannot fulfill the request at that time due to downtime or overload. It is possible to see all the server errors that bots encounter through Google Search Console as well.
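To tie the four classes together, here is a minimal sketch that maps a status code to its class and standard reason phrase using Python's built-in `http.HTTPStatus`. The function name `describe_status` and the class labels are assumptions chosen for this example.

```python
from http import HTTPStatus

# Labels for the first digit of the status code (naming is our own).
STATUS_CLASSES = {2: "success", 3: "redirect", 4: "client error", 5: "server error"}


def describe_status(code):
    """Return (class label, reason phrase) for an HTTP status code."""
    status_class = STATUS_CLASSES.get(code // 100, "other")
    try:
        phrase = HTTPStatus(code).phrase
    except ValueError:
        phrase = "Unknown"  # code is not registered in HTTPStatus
    return status_class, phrase


for code in (200, 301, 302, 404, 410, 500, 503):
    print(code, describe_status(code))
```

A check like this can be run over a list of crawled URLs to spot redirects and errors that deserve a closer look.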
4) Site Speed Improvements as a Part of Technical SEO
Site speed is an important factor for user experience, and it is crucial for technical SEO too. The loading speed of web pages has been an important ranking element since 2010. In addition, as mentioned in the announcement Google published on January 17, 2018, mobile site speed is now a ranking factor for mobile searches as well.
In this case, it is important for us to have high speed scores in all versions of our website, and especially in the mobile version, in order to rank better on search engine results pages. Accordingly, Google's speed tool PageSpeed Insights (PSI) has been updated to use the Chrome User Experience Report, which is real-world data.
Of course, Google PageSpeed Insights is not the only tool available for checking site speed. With Google Lighthouse, GTmetrix, Pingdom and SpeedCurve, you can test the speed of a single page or of your whole site. Two of these tools are distinguished from the others by certain features.
Unlike PageSpeed Insights, which takes advantage of Chrome's real-world data, Lighthouse runs on-demand tests. It tests the page using your current connection or the type of connection you choose.
SpeedCurve, which has been improving continuously in recent years, presents detailed graphs that visualize your page speed and the metrics involved in loading your page, as well as the third-party resources that slow down your page load.
What Can be Done to Increase Site Speed?
Site speed should be taken into consideration during web design. However, there are certain techniques aimed at increasing the page speed of existing websites as well. Let’s detail some of these techniques below.
- Reduce redirects. When a page redirects to another page, visitors wait additional time for the HTTP request-response cycle to complete.
- Improve server response time by using a high-quality hosting solution.
- Use a content delivery network (CDN). In this practice, copies of your site are stored at multiple, geographically diverse data centers.
- Optimize images. Be sure that your images are compressed for the web. Furthermore, they should be in the right file format. For instance, PNGs are better for graphics, while JPEGs are better for photos.
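To illustrate the first item in the list, here is a small sketch that counts the hops in a redirect chain. It works on a simulated response table instead of making real network requests; the URLs, the table and the function name `follow_redirects` are hypothetical.

```python
REDIRECT_CODES = (301, 302, 307, 308)


def follow_redirects(start_url, responses, max_hops=10):
    """Follow redirects in a simulated table of url -> (status, location).

    Returns the list of URLs visited and the final status code.
    """
    chain = [start_url]
    url = start_url
    while len(chain) <= max_hops:
        status, location = responses[url]
        if status not in REDIRECT_CODES or location is None:
            return chain, status  # reached a non-redirect response
        if location in chain:
            raise ValueError("redirect loop detected")
        chain.append(location)
        url = location
    raise ValueError("too many redirects")


# Hypothetical responses: http -> https -> trailing slash -> final page.
RESPONSES = {
    "http://example.com/a": (301, "https://example.com/a"),
    "https://example.com/a": (301, "https://example.com/a/"),
    "https://example.com/a/": (200, None),
}

chain, final_status = follow_redirects("http://example.com/a", RESPONSES)
print("hops:", len(chain) - 1, "final status:", final_status)
```

In this example the visitor makes two extra round trips before the page loads. Pointing the first URL straight at the final address would remove one of them.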
5) Mobile-First Indexing
Before Google switched to mobile-first indexing, both mobile and desktop search results were determined by using desktop pages. Factors such as mobile click-through rates and mobile site speed could result in slight differences between desktop and mobile rankings.
Starting in 2018, Google switched to mobile-first indexing. With this change, the position of desktop pages began to be determined based on the content offered by the mobile pages. Google still has a single index; there is no mobile index separate from the main index.
Here is a checklist below for making a site compatible with mobile-first indexing. It is vital to go through these steps meticulously.
- Are all category links also available on mobile pages?
- Are all in-site links included in the mobile version?
- Is the number of products/content items shown on the listing pages the same on mobile and desktop?
- Are category and product descriptions used properly on mobile pages?
- Can mobile user-agents see all of the structured data markup configured on the desktop version?
- Are breadcrumb links included in the mobile version?
- Is user-generated content (comment fields, reviews) included in the mobile version?
- Are CSS and JS resources open to crawling by mobile user-agents?
- Is metadata such as Open Graph tags, Twitter Cards and meta robots included in the mobile version?
- Are annotations such as canonical, prev/next and hreflang placed on mobile pages?
- Is the Sitemap accessible by Googlebot Mobile?
- Is the mobile site speed performance satisfactory?
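A few items on this checklist can be partly automated. The sketch below compares the links, canonical tag and meta robots tag of a desktop page and a mobile page using Python's standard `html.parser`. The two HTML snippets and the names `PageSignals` and `parity_report` are assumptions for illustration; in practice you would fetch both versions of a page with desktop and mobile user-agents.

```python
from html.parser import HTMLParser


class PageSignals(HTMLParser):
    """Collects links, the canonical URL and the meta robots value."""

    def __init__(self):
        super().__init__()
        self.links = set()
        self.canonical = None
        self.meta_robots = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.add(attrs["href"])
        elif tag == "link" and (attrs.get("rel") or "").lower() == "canonical":
            self.canonical = attrs.get("href")
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            self.meta_robots = attrs.get("content")


def extract(html):
    signals = PageSignals()
    signals.feed(html)
    return signals


def parity_report(desktop_html, mobile_html):
    """Compare the desktop and mobile versions of the same page."""
    desktop, mobile = extract(desktop_html), extract(mobile_html)
    return {
        "links_missing_on_mobile": sorted(desktop.links - mobile.links),
        "canonical_matches": desktop.canonical == mobile.canonical,
        "meta_robots_matches": desktop.meta_robots == mobile.meta_robots,
    }


# Hypothetical desktop and mobile markup for the same page.
DESKTOP = """
<link rel="canonical" href="https://example.com/shop/">
<meta name="robots" content="index,follow">
<a href="/shoes/">Shoes</a> <a href="/bags/">Bags</a> <a href="/sale/">Sale</a>
"""
MOBILE = """
<link rel="canonical" href="https://example.com/shop/">
<a href="/shoes/">Shoes</a> <a href="/bags/">Bags</a>
"""

report = parity_report(DESKTOP, MOBILE)
print(report)
```

Here the report flags that the “/sale/” link and the meta robots tag are missing from the mobile version, which are exactly the kinds of gaps the checklist is meant to catch.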
Thanks for reading our technical SEO guide.