Managing Search Robots



phpBB 3 introduced a new system for managing search indexer and bot accounts. It lets you identify these automated visitors by their IP address or by part of their user-agent, the string a client sends to identify itself (normally the user's browser). Once you have added a bot and it is recognized, phpBB no longer treats its session as anonymous but uses the bot account you created. Bots use the permissions set for the predefined Bots group. Identifying bots matters because phpBB can then serve them content that is better suited to search engines: links to pages without indexable content, such as posting pages and report post pages, are omitted. Bots also never receive a session ID in the URL, so session IDs do not appear in search results. You can also assign a specific style and language to bots.

You can easily check whether a specific search indexer has visited your site recently by looking at the Last Visit column on the bot list page.

Note

Bots do not use the permissions of the Guest group; they use the permissions of the Bots group. For more about predefined groups, please read Section 3.6.1, “Group types”.

Adding a bot

  • Bot name: This is the name of the bot as it will be shown on the forum. You will see it in the list of bots in the ACP and in the Who is Online lists.

  • Bot style: You can select the style served to the bot from the list of installed styles on the board.

  • Bot language: You can select the language served to the bot from the list of installed languages in the same way.

  • Bot active: A bot session is created only if the bot is active. If it is not active, the data entered in this form is not used anywhere.

  • Agent match: You can match a bot by its user-agent, its IP address, or both. Here you specify the part of the user-agent to look for. For example, the Google search indexer has "Googlebot" in its user-agent, so you would enter "Googlebot" here to detect when Google crawls your board.

  • Bot IP address: This field is also used to identify the bot. If a bot cannot be recognized by its user-agent, you can specify the IP address that should be used to identify it. Partial matches are allowed, so you can enter only the first two or three octets of the IP address if the rest changes dynamically. You can also enter multiple IP addresses separated by commas.

Note

If you enter both a user-agent and an IP address, both have to match to identify the bot.
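
The matching rules described above can be illustrated with a short Python sketch. This is not phpBB's actual code: the Bot class, the identify_bot and ip_matches functions, and all field names are hypothetical. The sketch also assumes that the user-agent comparison is case-insensitive and that a partial IP is compared against the start of the visitor's address. The IP addresses come from ranges reserved for documentation.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Bot:
    """Hypothetical stand-in for one entry on the bot list."""
    name: str
    agent: str = ""      # part of the user-agent to look for ("Agent match")
    ips: str = ""        # comma-separated full or partial IPs ("Bot IP address")
    active: bool = True  # "Bot active"


def ip_matches(ip_field: str, remote_ip: str) -> bool:
    """Return True if remote_ip starts with any of the comma-separated entries."""
    prefixes = [part.strip() for part in ip_field.split(",") if part.strip()]
    return any(remote_ip.startswith(prefix) for prefix in prefixes)


def identify_bot(bots: Iterable[Bot], user_agent: str, remote_ip: str) -> Optional[Bot]:
    """Return the first active bot whose configured criteria all match."""
    for bot in bots:
        if not bot.active:
            continue  # inactive bots are never used
        has_agent = bool(bot.agent)
        has_ips = bool(bot.ips.strip())
        if not has_agent and not has_ips:
            continue  # nothing to match against
        # Assumption: the user-agent comparison ignores case.
        if has_agent and bot.agent.lower() not in user_agent.lower():
            continue
        # If an IP is configured as well, it has to match too.
        if has_ips and not ip_matches(bot.ips, remote_ip):
            continue
        return bot
    return None


bots = [
    Bot(name="Google", agent="Googlebot"),
    Bot(name="Internal crawler", ips="192.0.2, 198.51.100.7"),
    Bot(name="Strict crawler", agent="ExampleCrawler", ips="203.0.113"),
]

match = identify_bot(bots, "Mozilla/5.0 (compatible; Googlebot/2.1)", "10.0.0.1")
print(match.name if match else "anonymous session")
```

In this sketch, a visitor whose user-agent contains "ExampleCrawler" but whose address does not begin with 203.0.113 is not identified as "Strict crawler", because both criteria are set for that entry.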