Robots.txt for SEO: a complete guide

What is robots.txt?

Robots.txt is a simple text file that contains directives for search engine robots. It tells them which pages and files they may crawl and which they may not. This helps manage your site's SEO by keeping unwanted or sensitive sections out of the crawl. Note that robots.txt controls crawling rather than indexing as such: a blocked URL can still appear in search results if other pages link to it.

Be careful when changing robots.txt: a single incorrect directive can block sections of the site that you need in search.

Search engines regularly check sites' robots.txt file to see if it contains instructions. If a file is missing or does not have applicable directives, search engines will traverse the entire site.

How to create a robots.txt file?

Creating a robots.txt file is simple: open a text editor (Notepad, Notepad++, VS Code) and save the file under the name "robots.txt". Then upload it to your site's root directory so search engines can find it.

The file should be available at: https://yoursite.ru/robots.txt
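
Because the file always sits at the root, its address can be derived from any page URL on the same host. The helper below is a minimal sketch of that; the function name `robots_url` is hypothetical, and it assumes each scheme and host (including subdomains) serves its own robots.txt.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the robots.txt address for the host that serves page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the path is always /robots.txt.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("https://yoursite.ru/catalog/item?id=5"))
# prints https://yoursite.ru/robots.txt
```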

Robots.txt syntax

The syntax of robots.txt is simple: each group of rules begins with a "User-agent" directive, followed by one or more "Disallow" or "Allow" directives, each with the path to the page or file that should be blocked or allowed.

Example rules for robots.txt:

User-agent: *

Disallow: /private/

Allow: /public/

In this case, no crawler may crawl pages in the "private" directory, but all crawlers may crawl pages in the "public" directory.
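
One way to see these rules in action is Python's standard `urllib.robotparser` module. The sketch below feeds it the three lines from the example and asks about a few paths; the paths themselves are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /private/
Allow: /public/
"""


def is_allowed(path: str, agent: str = "*") -> bool:
    """Parse RULES and report whether `agent` may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(RULES.splitlines())
    return parser.can_fetch(agent, path)


for path in ("/private/report.html", "/public/index.html", "/blog/"):
    print(path, is_allowed(path))
# /private/report.html False
# /public/index.html True
# /blog/ True
```

A path that matches no rule, such as /blog/ here, is allowed by default.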

Symbol meanings:

* matches any sequence of characters, including an empty one.

$ marks the end of the URL.

# starts a comment; everything after it on the line is ignored.
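
To make the meaning of these symbols concrete, here is a rough sketch of a matcher that treats `*` and `$` the way described above. It is an illustration, not the exact algorithm of any search engine; the function names `strip_comment` and `rule_matches` are hypothetical.

```python
import re


def strip_comment(line: str) -> str:
    """Drop everything after '#' and trim surrounding whitespace."""
    return line.split("#", 1)[0].strip()


def rule_matches(pattern: str, path: str) -> bool:
    """Check a robots.txt path pattern against a URL path."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    # '*' becomes "any sequence of characters"; everything else is literal.
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        # '$' requires the pattern to cover the whole URL.
        return re.fullmatch(regex, path) is not None
    # Without '$' the pattern only has to match a prefix of the URL.
    return re.match(regex, path) is not None


print(rule_matches("/*.js$", "/assets/app.js"))      # True
print(rule_matches("/*.js$", "/assets/app.js?v=2"))  # False
print(rule_matches("/*.js", "/assets/app.js?v=2"))   # True
print(strip_comment("Disallow: /tmp/  # temp files"))
```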

Let's take a closer look at the directives used in robots.txt.

| Directive | Supported by | What it does |
|---|---|---|
| User-agent | Yandex, Google | Names the robot to which the rules that follow apply. |
| Disallow | Yandex, Google | Prohibits crawling of sections or individual pages of the site. |
| Allow | Yandex, Google | Allows crawling of sections or individual pages of the site. |
| Sitemap | Yandex, Google | Specifies the path to the Sitemap file located on the site. |
| Clean-param | Yandex | Tells the robot that the page URL contains parameters (for example, UTM tags) that should be ignored when indexing. |
| Crawl-delay | Yandex | Sets the minimum interval (in seconds) between the robot finishing loading one page and starting to load the next. |
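
The effect of Clean-param is easier to see on a concrete URL. The sketch below imitates it in Python: it takes the directive value in the form `param1&param2 /path-prefix/` and removes those parameters from matching URLs. This only illustrates the idea; the function name `apply_clean_param` and the sample URLs are hypothetical, and the real processing happens on Yandex's side.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def apply_clean_param(url: str, directive: str) -> str:
    """Strip the parameters named in a Clean-param value from url.

    directive is the text after 'Clean-param:', for example
    'utm_source&utm_medium /catalog/'. The path prefix is optional.
    """
    params_part, _, path_prefix = directive.partition(" ")
    ignored = set(params_part.split("&"))
    path_prefix = path_prefix.strip()

    parts = urlsplit(url)
    # The directive only applies to URLs under the given prefix.
    if path_prefix and not parts.path.startswith(path_prefix):
        return url

    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in ignored
    ]
    return urlunsplit(
        (parts.scheme, parts.netloc, parts.path, urlencode(kept), parts.fragment)
    )


print(apply_clean_param(
    "https://example.ru/catalog/item?id=5&utm_source=mail&utm_medium=email",
    "utm_source&utm_medium /catalog/",
))
# prints https://example.ru/catalog/item?id=5
```

With such a directive, both the tagged and the untagged address are treated as the same page.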

Yandex requirements for the Robots.txt file

  1. The file size should not exceed 500 KB.
  2. The file must be in TXT format and named robots.txt.
  3. The file must be placed in the root directory of the site.
  4. The file must be accessible to robots: the server hosting the site should respond with HTTP status 200 OK.
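
The size and status requirements from the list above can be checked automatically. The following is a minimal sketch, assuming that "500 KB" means 500 × 1024 bytes; the function names are hypothetical, and `fetch_and_check` is only one possible way to obtain the response.

```python
import urllib.error
import urllib.request

MAX_SIZE_BYTES = 500 * 1024  # assumption: 1 KB = 1024 bytes


def check_robots_response(status: int, body: bytes) -> list[str]:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    if status != 200:
        problems.append(f"expected HTTP 200, got {status}")
    if len(body) > MAX_SIZE_BYTES:
        problems.append(f"file is {len(body)} bytes, limit is {MAX_SIZE_BYTES}")
    return problems


def fetch_and_check(url: str) -> list[str]:
    """Download robots.txt from url and run the checks on the response."""
    try:
        with urllib.request.urlopen(url, timeout=10) as resp:
            return check_robots_response(resp.status, resp.read())
    except urllib.error.HTTPError as err:
        # urlopen raises for 4xx/5xx responses; report the status code.
        return check_robots_response(err.code, b"")


print(check_robots_response(200, b"User-agent: *\nDisallow:\n"))  # []
print(check_robots_response(404, b""))
```

Calling `fetch_and_check("https://yoursite.ru/robots.txt")` would run the same checks against a live site.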

Errors in robots.txt and their consequences

Blocking important pages

Errors in robots.txt can lead to important pages on your site being blocked, reducing your visibility in search engines. Make sure you check the rules in the file carefully to prevent such problems.

Conflicting rules

Conflicting rules in robots.txt can confuse robots, which will affect the indexing of your site. Make sure the rules in your robots.txt file are clear and consistent.
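
When an Allow and a Disallow rule both match the same URL, major search engines resolve the conflict by specificity: the rule with the longer path wins, and on a tie Allow is preferred. The sketch below shows that logic under this assumption; the `decide` function and the rule set are hypothetical, and real crawlers may differ in details.

```python
import re


def _matches(pattern: str, path: str) -> bool:
    """Match a robots.txt pattern ('*' wildcard, optional '$' anchor)."""
    anchored = pattern.endswith("$")
    if anchored:
        pattern = pattern[:-1]
    regex = ".*".join(re.escape(part) for part in pattern.split("*"))
    if anchored:
        return re.fullmatch(regex, path) is not None
    return re.match(regex, path) is not None


def decide(rules: list[tuple[str, str]], path: str) -> bool:
    """Return True if crawling `path` is allowed.

    rules is a list of (directive, pattern) pairs for one User-agent group.
    """
    best = None  # (pattern length, is_allow) of the winning rule so far
    for directive, pattern in rules:
        if pattern and _matches(pattern, path):
            candidate = (len(pattern), directive.lower() == "allow")
            # Longer pattern wins; on equal length True (Allow) > False.
            if best is None or candidate > best:
                best = candidate
    return True if best is None else best[1]


rules = [("Disallow", "/wp-"), ("Allow", "/wp-content/themes/")]
print(decide(rules, "/wp-content/themes/site/style.css"))  # True
print(decide(rules, "/wp-admin/"))                         # False
print(decide(rules, "/about/"))                            # True
```

Writing rules so that the intended one is clearly the most specific avoids relying on tie-breaking.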

Missing rules for crawlers

A crawler that has no group of its own falls back to the rules under "User-agent: *", while a crawler that does have its own group ignores the "*" group entirely. If you add a group for a particular crawler, make sure it repeats every rule that crawler still needs; otherwise sections you meant to close may be crawled.
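
How a crawler picks its group can be checked with Python's standard `urllib.robotparser`. In the sketch below (the rule set is invented for illustration) the general group closes /admin/, and a separate Yandex group closes only /search/.

```python
from urllib.robotparser import RobotFileParser

RULES = """\
User-agent: *
Disallow: /admin/

User-agent: Yandex
Disallow: /search/
"""

parser = RobotFileParser()
parser.parse(RULES.splitlines())


def allowed(agent: str, path: str) -> bool:
    """Report whether `agent` may fetch `path` under RULES."""
    return parser.can_fetch(agent, path)


print(allowed("Googlebot", "/admin/"))  # False: falls back to the * group
print(allowed("Yandex", "/search/"))    # False: its own group applies
print(allowed("Yandex", "/admin/"))     # True: the * group is not used
```

The last line is the typical surprise: once a crawler has its own group, rules from the general group no longer apply to it.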

You should not block JavaScript or CSS files using robots.txt. Bots may not display content correctly on your site if they don't have access to these resources.

How to check and fix errors in robots.txt

To check the robots.txt file for errors, you can use various tools, such as Google Search Console or Yandex.Webmaster. These tools provide information about the rules in your robots.txt file and help you determine if there are any problems.

Robots.txt file analysis tool from Yandex

https://webmaster.yandex.ru/tools/robotstxt/

Google's robots.txt file verification tool

https://support.google.com/webmasters/answer/6062598
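
Alongside these tools, a quick local check can catch an accidental block before the file is uploaded. The sketch below lists which of your important URLs a draft robots.txt would close. It relies on Python's standard `urllib.robotparser`, which matches rules by simple path prefix, so patterns with `*` or `$` are not evaluated the way search engines evaluate them. The function name and the URLs are hypothetical.

```python
from urllib.robotparser import RobotFileParser


def find_blocked(robots_txt: str, urls: list[str], agent: str = "*") -> list[str]:
    """Return the URLs from `urls` that `agent` is not allowed to fetch."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


DRAFT = """\
User-agent: *
Disallow: /personal/
Disallow: /search/
"""

IMPORTANT = [
    "https://example.ru/",
    "https://example.ru/catalog/",
    "https://example.ru/personal/cart/",
]

print(find_blocked(DRAFT, IMPORTANT))
# prints ['https://example.ru/personal/cart/']
```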

Cyrillic characters in robots.txt

Cyrillic characters are not allowed in the robots.txt file. To convert a Cyrillic domain name, use a Punycode converter.

Converting Cyrillic URLs

To convert Cyrillic paths, percent-encode them (URL encoding) with a URL encoder.

Wrong

User-agent: *

Disallow: /каталог/

Sitemap: https://вашсайт.рф/sitemap.xml

Right

User-agent: *

Disallow: /%D0%BA%D0%B0%D1%82%D0%B0%D0%BB%D0%BE%D0%B3/

Sitemap: https://xn--80aae4a1bi2b.xn--p1ai/sitemap.xml
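
Both conversions can also be done with Python's standard library instead of an online converter. In the sketch below, the Cyrillic inputs are the decoded forms of the host and path from the "Right" example above; the helper names are hypothetical.

```python
from urllib.parse import quote


def to_punycode(domain: str) -> str:
    """Convert an internationalized domain name to its ASCII (Punycode) form."""
    return domain.encode("idna").decode("ascii")


def encode_path(path: str) -> str:
    """Percent-encode a URL path as UTF-8, leaving '/' untouched."""
    return quote(path)


print(to_punycode("вашсайт.рф"))
# prints xn--80aae4a1bi2b.xn--p1ai
print(encode_path("/каталог/"))
# prints /%D0%BA%D0%B0%D1%82%D0%B0%D0%BB%D0%BE%D0%B3/
```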

Manuals on robots.txt from Yandex and Google

Yandex documentation on the robots.txt file

https://yandex.ru/support/webmaster/controlling-robot/robots-txt.html

Google documentation on the robots.txt file

https://developers.google.com/search/docs/crawling-indexing/robots/intro?hl=ru

Example robots.txt that closes the entire site to crawlers

User-agent: *

Disallow: /

Example robots.txt for WordPress

User-agent: *

Disallow: /cgi-bin

Disallow: */?

Disallow: /wp-

Disallow: *?s=

Disallow: *&s=

Disallow: /search

Disallow: /author/

Disallow: *?attachment_id=

Disallow: */trackback

Disallow: */feed

Disallow: */embed

Disallow: */page/

Allow: /wp-content/plugins/

Allow: /wp-content/themes/

Allow: /wp-content/cache/

Allow: /wp-includes/

Allow: */uploads

Allow: /*/*.js

Allow: /*/*.css

Allow: /wp-*.png

Allow: /wp-*.jpg

Allow: /wp-*.jpeg

Allow: /wp-*.gif

Sitemap: https://example.ru/sitemap.xml

Example robots.txt for Bitrix

User-Agent: *

Disallow: */index.php$

Disallow: /bitrix/

Disallow: /personal/

Disallow: */cgi-bin/

Disallow: /local/

Disallow: /test/

Disallow: /*show_include_exec_time=

Disallow: /*show_page_exec_time=

Disallow: /*show_sql_stat=

Disallow: /*bitrix_include_areas=

Disallow: /*clear_cache=

Disallow: /*clear_cache_session=

Disallow: /*ADD_TO_COMPARE_LIST

Disallow: /*ORDER_BY

Disallow: /*?print=

Disallow: /*?list_style=

Disallow: /*?sort=

Disallow: /*sort_by=

Disallow: /*?set_filter=

Disallow: /*?arrFilter=

Disallow: /*?order=

Disallow: /*&print=

Disallow: /*print_course=

Disallow: /*?action=

Disallow: /*&action=

Disallow: /*register=

Disallow: /*forgot_password=

Disallow: /*change_password=

Disallow: /*login=

Disallow: /*logout=

Disallow: /*auth=

Disallow: */auth/

Disallow: /*backurl=

Disallow: /*back_url=

Disallow: /*BACKURL=

Disallow: /*BACK_URL=

Disallow: /*back_url_admin*

Disallow: /*?utm_source=

Disallow: */order/

Disallow: /*download

Disallow: /test.php

Disallow: */filter/*/apply/

Disallow: /*setreg=

Disallow: /*logout

Disallow: */filter/

Disallow: /*sphrase_id

Disallow: */search/

Disallow: /*type=

Disallow: /*?product_id=

Disallow: /*?display=

Disallow: /*?view_mode=

Disallow: /*view=

Disallow: /*min_price=

Disallow: /*max_price=

Disallow: /*&page=

Disallow: /*?path=

Disallow: /*?route=

Disallow: /*?products_on_page=

Disallow: /*?PAGEN_1=1$

Disallow: /*?PAGEN_1=1/$

Disallow: /*?new=

Disallow: /*?edit=

Disallow: /*?preview=

Disallow: /*SHOWALL=

Disallow: /*SHOW_ALL=

Disallow: /*SHOWBY=

Disallow: /*SPHRASE_ID=

Disallow: /*TYPE=

Disallow: /*?utm*=

Disallow: /*&utm*=

Disallow: /*?VIEW=

Disallow: /*?SORT_TO=

Disallow: /*?SORT_FIELD=

Disallow: /*set_filter=

Disallow: */auth.php

Disallow: /*?alfaction=

Disallow: /*?oid=

Disallow: /*?name=

Disallow: /*?form_id=

Disallow: /*&form_id=

Disallow: /*?bxajaxid=

Disallow: /*&bxajaxid=

Disallow: /*?view_result=

Disallow: /*&view_result=

Disallow: */resize_cache/

Disallow: /*?linerow=

Disallow: /bitrix/panel/

Disallow: *?sort_ord=

Disallow: *?sort_dir=

Disallow: *?category_id=

Disallow: *?item_id=

Disallow: *?pn_pr=

Disallow: *?page=

Disallow: *?tab=

Disallow: *?display=

Disallow: *?linerow=

Disallow: *?year=

Disallow: *?oid=

Disallow: */filter/

Disallow: *showElements*

Disallow: *PAGEN_2*

Disallow: *?ORDER_ID=

Disallow: *how=*

Disallow: */form/?name=

Disallow: *?name=

Disallow: /*gclid*

Disallow: /*yclid*

Disallow: /*ymclid*

Disallow: /test*

Disallow: /404.php

Disallow: /api/*

Disallow: /*?RID*

Disallow: *?preview=

Disallow: *bitrix_*=

Disallow: *auth=

Disallow: /*?tag

Disallow: /*set_filter*

Disallow: /*?showElements=

Disallow: /*?tid*

Disallow: /*&tid*

Disallow: *?FILTER*=

Disallow: *?ei=

Disallow: *?p=

Disallow: *?q=

Disallow: *?tags=

Disallow: *B_ORDER=

Disallow: *BRAND=

Disallow: *CLEAR_CACHE=

Disallow: *SECTION_ID=

Disallow: *section_id=

Disallow: *SECTION[*]=

Disallow: *SHOW_ALL=

Disallow: *SHOWBY=

Disallow: *SORT=

Disallow: *SPHRASE_ID=

Disallow: *TYPE=

Disallow: /*?from*

Disallow: /*&from*

Disallow: /*block=*

Disallow: *r1=

Disallow: */?_ym_debug

Disallow: */apply/*

Disallow: *&by*

Disallow: *?by*

Disallow: *?id=*

Disallow: *?a=*

Disallow: *?amp*

Disallow: *IBLOCK_ID=*

Disallow: *RESULT_ID=*

Disallow: *PROPERTY=*

Disallow: *IN_STOCK=*

Disallow: *SECTION_CODE=*

Disallow: *SIZE=*

Disallow: *added=*

Disallow: *position=*

Disallow: *callibri=*

Disallow: *gtm_debug=*

Disallow: *placement=*

Disallow: *source=*

Disallow: *&adv=*

Disallow: *?adv=*

Disallow: *option=*

Disallow: *?hhtmFrom=*

Disallow: *?_r=*

Disallow: *sort_order=*

Allow: /upload/*

Allow: /bitrix/components/

Allow: /bitrix/cache/

Allow: /bitrix/js/

Allow: /bitrix/templates/

Allow: /bitrix/*.js

Allow: /bitrix/*.css

Allow: /local/components/

Allow: /local/cache/

Allow: /local/js/

Allow: /local/templates/

Allow: /local/*.js

Allow: /local/*.css

Allow: /local/*.jpg

Allow: /local/*.jpeg

Allow: /local/*.png

Allow: /local/*.gif

Sitemap: https://example.ru/sitemap.xml
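
Long rule lists like the one above tend to accumulate repeated lines (for instance, "Disallow: */filter/" appears twice). Duplicates are harmless to crawlers but make the file harder to maintain. The sketch below finds them; it assumes the file contains a single User-agent group, since the same rule may legitimately repeat across different groups. The function name is hypothetical.

```python
from collections import Counter


def duplicate_rules(robots_txt: str) -> list[str]:
    """Return rule lines that occur more than once, in order of first appearance."""
    # Strip comments and whitespace so formatting differences do not hide repeats.
    lines = [line.split("#", 1)[0].strip() for line in robots_txt.splitlines()]
    counts = Counter(line for line in lines if line)
    return [line for line, count in counts.items() if count > 1]


SAMPLE = """\
User-agent: *
Disallow: */filter/
Disallow: /test/
Disallow: */filter/
"""

print(duplicate_rules(SAMPLE))
# prints ['Disallow: */filter/']
```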

Robots.txt example for MODX

User-agent: *

Disallow: /cgi-bin

Disallow: /manager/

Disallow: /assets/

Disallow: /core/

Disallow: /connectors/

Disallow: /index.php

Disallow: *?

Allow: /assets/*.jpg

Allow: /assets/*.jpeg

Allow: /assets/*.gif

Allow: /assets/*.png

Allow: /assets/*.pdf

Allow: /assets/*.js

Allow: /assets/*.css

Allow: /assets/*.svg

Sitemap: https://example.ru/sitemap.xml

Example robots.txt for OpenCart

User-agent: *

Disallow: /*route=account/

Disallow: /*route=affiliate/

Disallow: /*route=checkout/

Disallow: /*route=product/search

Disallow: /index.php

Disallow: /admin

Disallow: /catalog

Disallow: /download

Disallow: /export

Disallow: /system

Disallow: /*?sort=

Disallow: /*&sort=

Disallow: /*?order=

Disallow: /*&order=

Disallow: /*?limit=

Disallow: /*&limit=

Disallow: /*?filter_name=

Disallow: /*&filter_name=

Disallow: /*?filter_sub_category=

Disallow: /*&filter_sub_category=

Disallow: /*?filter_description=

Disallow: /*&filter_description=

Disallow: /*?tracking=

Disallow: /*&tracking=

Disallow: /*?page=

Disallow: /*&page=

Disallow: /wishlist

Disallow: /login

Sitemap: http://example.ru/sitemap.xml

Robots.txt example for Joomla

User-agent: *

Disallow: /administrator/

Disallow: /bin/

Disallow: /cache/

Disallow: /cli/

Disallow: /components/

Disallow: /includes/

Disallow: /installation/

Disallow: /language/

Disallow: /layouts/

Disallow: /libraries/

Disallow: /logs/

Disallow: /media/

Disallow: /modules/

Disallow: /plugins/

Disallow: /templates/

Disallow: /tmp/

Disallow: /index.php* # Only if you have SEF enabled

Allow: /index.php?option=com_xmap&sitemap=1&view=xml

Sitemap: http://example.ru/sitemap.xml

Robots.txt example for Drupal

User-agent: *

# CSS, JS, Images

Allow: /core/*.css$

Allow: /core/*.css?

Allow: /core/*.js$

Allow: /core/*.js?

Allow: /core/*.gif

Allow: /core/*.jpg

Allow: /core/*.jpeg

Allow: /core/*.png

Allow: /core/*.svg

Allow: /profiles/*.css$

Allow: /profiles/*.css?

Allow: /profiles/*.js$

Allow: /profiles/*.js?

Allow: /profiles/*.gif

Allow: /profiles/*.jpg

Allow: /profiles/*.jpeg

Allow: /profiles/*.png

Allow: /profiles/*.svg

# Directories

Disallow: /core/

Disallow: /profiles/

# Files

Disallow: /README.txt

Disallow: /web.config

# Paths (clean URLs)

Disallow: /admin/

Disallow: /comment/reply/

Disallow: /filter/tips

Disallow: /node/add/

Disallow: /search/

Disallow: /user/register

Disallow: /user/password

Disallow: /user/login

Disallow: /user/logout

# Paths (no clean URLs)

Disallow: /index.php/admin/

Disallow: /index.php/comment/reply/

Disallow: /index.php/filter/tips

Disallow: /index.php/node/add/

Disallow: /index.php/search/

Disallow: /index.php/user/password

Disallow: /index.php/user/register

Disallow: /index.php/user/login

Disallow: /index.php/user/logout

Disallow: /drupal-9-migration

Disallow: /drupal-migration-services

Disallow: /drupal-7-end-of-life

Disallow: /drupal-migration-rescue

Sitemap: http://example.ru/sitemap.xml

Robots.txt example for Magento

User-agent: *

Disallow: /catalogsearch/

Disallow: /search/

Disallow: /customer/account/login/

Disallow: /*?SID=

Disallow: /*?PHPSESSID=

Disallow: /*?price=

Disallow: /*&price=

Disallow: /*?color=

Disallow: /*&color=

Disallow: /*?material=

Disallow: /*&material=

Disallow: /*?size=

Disallow: /*&size=

Sitemap: http://example.ru/sitemap.xml
