Writing a robots.txt for WordPress

Hello! Today's topic is what a correct robots.txt for WordPress should look like. I wrote about what robots.txt is and how it is used two days ago; this post covers WordPress specifically. The file sets the basic indexing rules for a blog across search engines, and it can also give individual search bots different access rights.

As an example, I will show how to put together the right robots.txt for WordPress, based on the two main search engines, Yandex and Google. Note that Yandex prefers to be addressed separately, and the User-agent directive lets us do that. Bots read the file (like the source code of any page) from top to bottom, so each block of rules must begin with a User-agent line.

User-agent: *

If the value of the directive is an asterisk, all the rules that follow apply to any robot. You can also write separate rules for specific bots; for Yandex, for example, the line looks like this:

User-agent: yandex
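
As a rough illustration of how a parser picks a group by User-agent, here is a minimal Python sketch using the standard-library urllib.robotparser module. The rules, the example.com address and the Googlebot name are assumptions made up for the demonstration; they are not the file built in this post.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: a catch-all group plus a stricter group for Yandex.
RULES = """\
User-agent: *
Disallow: /wp-admin/

User-agent: Yandex
Disallow: /wp-admin/
Disallow: /tag/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())

URL = "https://example.com/tag/seo/"  # placeholder address

# Yandex has its own group, so only that group's rules apply to it.
yandex_allowed = parser.can_fetch("Yandex", URL)
# A bot without its own group falls back to the "*" group.
google_allowed = parser.can_fetch("Googlebot", URL)

print("Yandex may fetch:", yandex_allowed)
print("Googlebot may fetch:", google_allowed)
```

The bot name is matched without regard to case, so "yandex" and "Yandex" select the same group.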

Remember that WordPress, like any content management system (CMS), has its own administrative resources, admin folders and so on, which should not end up in the index. To keep such pages, which may contain personal data, logins and passwords, out of the index, disallow them with the following lines:

Disallow: /cgi-bin
Disallow: /wp-admin/
Disallow: /wp-includes/
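
Disallow values are compared against the beginning of the URL path, so a trailing slash changes what a rule covers. The sketch below checks a few paths against the three rules above with Python's urllib.robotparser; the example.com host, the sample paths and the blocked() helper are hypothetical and only serve the illustration.

```python
from urllib.robotparser import RobotFileParser

parser = RobotFileParser()
parser.parse([
    "User-agent: *",
    "Disallow: /cgi-bin",
    "Disallow: /wp-admin/",
    "Disallow: /wp-includes/",
])

def blocked(path):
    """Hypothetical helper: True if the path is closed to every bot."""
    return not parser.can_fetch("*", "https://example.com" + path)

for path in ["/wp-admin/options.php", "/wp-admin", "/cgi-bin/test.cgi",
             "/cgi-binary.html", "/my-post/"]:
    print(path, "->", "blocked" if blocked(path) else "allowed")
```

Without the trailing slash, /cgi-bin also catches unrelated paths that merely start with the same characters, while /wp-admin/ covers only what lies inside the folder.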

Theme files, plugins and the WordPress cache are hardly needed in the index either, so we apply the corresponding rules to them:

Disallow: /wp-content/plugins
Disallow: /wp-content/cache
Disallow: /wp-content/themes

The next rule for a correct robots.txt is to keep out of the index, and therefore out of the search results, pages that duplicate the main content and so reduce the uniqueness of content within the same domain.

Get rid of such pages as soon as possible, otherwise the site risks being hit by a search engine filter. Where does duplication occur on a WordPress blog? First of all in tags, comment pages, RSS feeds of comments, and the post archives of individual blog authors (even if there is only one author, the content is still duplicated at /author/author-name/), and so on.

Disallow: /wp-trackback
Disallow: /wp-feed
Disallow: /wp-comments
Disallow: /category/
Disallow: /author/
Disallow: /page/
Disallow: /tag/
Disallow: /feed/
Disallow: */feed
Disallow: */trackback
Disallow: */comments

Next, one more point. If your blog uses human-readable links (permalinks), pages with question marks in their URLs are usually redundant and very often duplicate the main content. They should therefore be disallowed as well:

Disallow: /*?
Disallow: /*?*
Disallow: /*.php

Note that individual files with the .php extension are disallowed too. This is because the same home page is reachable at several addresses, one of which is /index.php. The rule also covers administration files such as install.php, login.php and others.
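
To see which addresses the wildcard rules catch, here is a small Python sketch that translates a pattern into a regular expression. It is a simplified reading of the wildcard syntax (an asterisk stands for any run of characters, a trailing $ pins the end of the URL), not the exact algorithm of any search engine, and it ignores Allow rules and rule precedence. The function names and sample paths are hypothetical.

```python
import re

def pattern_to_regex(pattern):
    """Turn a robots.txt path pattern into a regex anchored at the start.

    '*' stands for any run of characters; a trailing '$' pins the end.
    """
    end_anchor = pattern.endswith("$")
    if end_anchor:
        pattern = pattern[:-1]
    # Escape the literal pieces and join them with ".*" where '*' stood.
    body = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.compile(body + ("$" if end_anchor else ""))

def is_disallowed(path, patterns):
    """True if any Disallow pattern matches the path (no Allow handling)."""
    return any(pattern_to_regex(p).match(path) for p in patterns)

PATTERNS = ["/*?", "/*.php", "*/feed"]

for path in ["/my-post/", "/my-post/?replytocom=5", "/index.php",
             "/my-post/feed/"]:
    verdict = "blocked" if is_disallowed(path, PATTERNS) else "allowed"
    print(path, "->", verdict)
```

Under this reading, /*?* matches the same addresses as /*?, because a pattern already matches whatever follows it.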

That is not the end of editing robots.txt. You can add further directives that improve the quality of indexing. One of them is Host, which sets the main mirror (this directive is taken into account only by Yandex; put your own blog address here, of course):

Host: fastandsocial.com

To make indexing of all pages faster and more complete, add the path to the sitemap (use your own address; I give mine as an example):

Sitemap: https://www.alert2web.com/sitemap.xml
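
A quick way to check that the Host and Sitemap lines are readable is to pull them back out of the file. The sketch below assumes Python 3.8 or newer, where urllib.robotparser offers site_maps(); the module does not expose Host, so a small hypothetical helper, read_directive(), collects it by hand.

```python
from urllib.robotparser import RobotFileParser

LINES = [
    "User-agent: Yandex",
    "Disallow: /wp-admin/",
    "Host: fastandsocial.com",
    "Sitemap: https://www.alert2web.com/sitemap.xml",
]

parser = RobotFileParser()
parser.parse(LINES)
# site_maps() returns a list of Sitemap URLs, or None if the file has none.
sitemaps = parser.site_maps()

def read_directive(lines, name):
    """Hypothetical helper: collect the values of one directive by name."""
    prefix = name.lower() + ":"
    return [
        line.split(":", 1)[1].strip()
        for line in lines
        if line.strip().lower().startswith(prefix)
    ]

hosts = read_directive(LINES, "Host")

print("Sitemaps:", sitemaps)
print("Host:", hosts)
```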

Putting all of the above together, I ended up with the following file:

User-agent: *
Disallow: /cgi-bin
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/cache/
Disallow: /wp-content/themes/
Disallow: /wp-trackback
Disallow: /wp-feed
Disallow: /wp-comments
Disallow: /category/
Disallow: /author/
Disallow: /page/
Disallow: /tag/
Disallow: /feed/
Disallow: */feed
Disallow: */trackback
Disallow: */comments
Disallow: /*?
Disallow: /*?*
Disallow: /*.php

User-agent: Yandex
Disallow: /cgi-bin
Disallow: /wp-admin/
Disallow: /wp-includes/
Disallow: /wp-content/plugins/
Disallow: /wp-content/cache/
Disallow: /wp-content/themes/
Disallow: /wp-trackback
Disallow: /wp-feed
Disallow: /wp-comments
Disallow: /category/
Disallow: /author/
Disallow: /page/
Disallow: /tag/
Disallow: /feed/
Disallow: */feed
Disallow: */trackback
Disallow: */comments
Disallow: /*?
Disallow: /*?*
Disallow: /*.php
Host: fastandsocial.com
Sitemap: https://www.alert2web.com/sitemap.xml
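
Because both groups carry the same list of rules, it is easy for them to drift apart when the file is edited by hand. One possible way around that, sketched here in Python, is to generate the file from a single list. The function names build_group() and build_robots() are hypothetical; the rules, host and sitemap address are the ones used in this post.

```python
# The shared Disallow rules, listed once.
COMMON_RULES = [
    "/cgi-bin", "/wp-admin/", "/wp-includes/",
    "/wp-content/plugins/", "/wp-content/cache/", "/wp-content/themes/",
    "/wp-trackback", "/wp-feed", "/wp-comments",
    "/category/", "/author/", "/page/", "/tag/", "/feed/",
    "*/feed", "*/trackback", "*/comments",
    "/*?", "/*?*", "/*.php",
]

def build_group(agent, rules, extra=()):
    """Return one User-agent group as text; extra lines go at the end."""
    lines = ["User-agent: " + agent]
    lines += ["Disallow: " + rule for rule in rules]
    lines += list(extra)
    return "\n".join(lines)

def build_robots(host, sitemap):
    """Assemble both groups, then the Sitemap line, as in the file above."""
    groups = [
        build_group("*", COMMON_RULES),
        build_group("Yandex", COMMON_RULES, ["Host: " + host]),
    ]
    return "\n\n".join(groups) + "\nSitemap: " + sitemap + "\n"

robots_txt = build_robots("fastandsocial.com",
                          "https://www.alert2web.com/sitemap.xml")
print(robots_txt)
```

A new rule then has to be added in one place only, and both groups pick it up.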

Remember: keep an eye on indexing at all times and adjust the robots.txt file promptly when needed, for WordPress and for any other site.
