On Monday, Google announced that it has submitted a Request for Comments to the Internet Engineering Task Force (IETF) to formalize the Robots Exclusion Protocol specification.
What Was The Announcement About?
On its blog, Google wrote: “Together with the original author of the protocol, webmasters, and other search engines, we’ve documented how the REP is used on the modern web and submitted it to the IETF. The proposed REP draft reflects over 20 years of real-world experience of relying on robots.txt rules, used both by Googlebot and other major crawlers, as well as about half a billion websites that rely on REP.”
Nothing Is Going To Change
Gary Illyes of Google, who was part of the announcement, stated that Google has not changed anything: the draft documents how robots.txt is already used rather than introducing new rules.
Why Is This Being Done?
Because the Robots Exclusion Protocol has never been a formal internet standard, there has been no authoritative document to keep it updated or to define exactly which syntax must be supported. All major search engines have adopted robots.txt as a crawling directive, but it has never been an official standard. That is now set to change. A typical file looks like the sketch below.
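To make the discussion concrete, here is a small, hypothetical robots.txt file using the most common directives; the agent names and paths are invented for this illustration:

```
# Rules for all crawlers
User-agent: *
Disallow: /private/
Allow: /private/public-page.html

# Stricter rules for one specific crawler
User-agent: Googlebot
Disallow: /no-google/

# Widely supported extension pointing crawlers at the sitemap
Sitemap: https://www.example.com/sitemap.xml
```

It is exactly this kind of syntax, and how crawlers should interpret edge cases in it, that the proposed draft pins down.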
Google Open Sources Its robots.txt Parser
Google also announced that it is open sourcing the library its crawler uses to parse robots.txt files. You can view the library on GitHub if you want to.
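To get a feel for what a robots.txt parser does without pulling in Google's library, here is a minimal sketch using Python's standard-library urllib.robotparser instead; the rules and URLs are made up for this example, and this is not Google's open-sourced code:

```python
# Minimal sketch: checking URLs against robots.txt rules with Python's
# standard-library parser (urllib.robotparser), not Google's library.
# The rules, bot name, and URLs below are invented for illustration.
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """
User-agent: *
Disallow: /private/

User-agent: ExampleBot
Disallow: /
""".splitlines()

parser = RobotFileParser()
parser.parse(ROBOTS_TXT)

# Generic crawlers may fetch the homepage but not anything under /private/.
print(parser.can_fetch("*", "https://www.example.com/"))           # True
print(parser.can_fetch("*", "https://www.example.com/private/x"))  # False

# ExampleBot is disallowed everywhere.
print(parser.can_fetch("ExampleBot", "https://www.example.com/"))  # False
```

Google's open-sourced library answers the same kind of allowed-or-disallowed question, but with the matching behavior its own crawler uses in production.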
What Is The Significance of This Announcement?
Nothing in particular has changed since Monday, but the move to make the Robots Exclusion Protocol specification a formal standard signals that things will change. The internet has treated it as a de facto standard for the last 25 years even though it was never an official one, so it is still unclear exactly what might change in the future. As of now, though, if you are building your own crawler, you can make use of Google's open-sourced robots.txt parser.