phpBB3 SEO Sitemap

Does it create the sitemap.xml?

by nhseacoast » Wed May 01, 2019 8:33 pm

I can't find it via FTP at my root, but when I browse to it, it's there. How does that work exactly? Thanks.
nhseacoast
Registered User
Posts: 521
Joined: Sun Sep 22, 2002 10:31 pm
Location: NH, USA

Re: Does it create the sitemap.xml?

by globetrotting » Sat Jan 25, 2020 2:46 am

nhseacoast wrote: I can't find it via FTP at my root, but when I browse to it, it's there. How does that work exactly? Thanks.
Same here. The browser seems to trigger the creation and displays it, but it does not get saved as root/sitemap.xml.
When I submitted the address root/sitemap.xml to Google, it couldn't see it either and seemingly did not trigger its generation.
But I found a different file on my server: /store/shredder/1.xml, which is obviously the file that my browser displayed under domain.tld/sitemap.xml.
So I copied 1.xml as sitemap.xml into root and hope that this will initiate the process.

But that does not seem to be the way this ext. is supposed to work, right?
Being changes consciousness
globetrotting
Registered User
Posts: 198
Joined: Thu Jan 15, 2004 8:14 pm
Location: globetrotting

Re: Does it create the sitemap.xml?

by KYPREO » Sat Jan 25, 2020 7:30 am

Sitemap.xml is created dynamically when you call site.com/app.php/sitemap.xml. This is why you can't find it saved on the server.

It is an index: it states the total number of URLs and links to each sub-sitemap file.

Under the sitemap standard, there is a maximum of 50,000 URLs per XML file, so larger sitemaps need to be split up. During sitemap generation, the extension creates each 50,000-URL sub-sitemap and saves them in store/shredder as 1.xml, 2.xml, and so on.
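
To illustrate the splitting step, here is a minimal Python sketch. It is not the extension's actual code (the extension is written in PHP). The function names `chunk_urls` and `write_sub_sitemaps` and the output directory argument are assumptions for illustration; only the 50,000 limit and the 1.xml, 2.xml naming come from the description above.

```python
from pathlib import Path
from xml.sax.saxutils import escape

MAX_URLS_PER_FILE = 50_000  # per-file limit from the sitemap protocol


def chunk_urls(urls, size=MAX_URLS_PER_FILE):
    """Yield consecutive slices of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def write_sub_sitemaps(urls, out_dir, size=MAX_URLS_PER_FILE):
    """Write 1.xml, 2.xml, ... into out_dir and return the file paths."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for number, chunk in enumerate(chunk_urls(urls, size), start=1):
        # escape() turns "&" into "&amp;", which phpBB-style URLs need
        entries = "\n".join(
            f"  <url><loc>{escape(url)}</loc></url>" for url in chunk
        )
        xml = (
            '<?xml version="1.0" encoding="UTF-8"?>\n'
            '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
            f"{entries}\n"
            "</urlset>\n"
        )
        path = out_dir / f"{number}.xml"
        path.write_text(xml, encoding="utf-8")
        paths.append(path)
    return paths
```

Under this scheme, a board with 165,000 URLs would end up with four files, 1.xml to 4.xml.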

When you go to site.com/sitemap.xml (in truth, a redirect to site.com/app.php/sitemap.xml), the extension dynamically creates the sitemap.xml index, which points to each sub-sitemap file: 1.xml, 2.xml, and so on.
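
As a rough picture of what such an index looks like, the following Python sketch builds one in the format defined by the sitemap protocol. The function name `build_sitemap_index` and the base URL used for the sub-sitemaps are hypothetical; the public URLs the extension actually uses for 1.xml, 2.xml, and so on are not stated here.

```python
from xml.sax.saxutils import escape


def build_sitemap_index(base_url, file_names):
    """Return sitemap index XML with one <sitemap> entry per sub-sitemap."""
    base = base_url.rstrip("/")
    entries = []
    for name in file_names:
        loc = escape(f"{base}/{name}")  # XML-escape the full URL
        entries.append(f"  <sitemap><loc>{loc}</loc></sitemap>")
    body = "\n".join(entries)
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{body}\n"
        "</sitemapindex>\n"
    )


# Hypothetical base URL, for illustration only
print(build_sitemap_index("https://site.com/sitemaps/", ["1.xml", "2.xml"]))
```

Because the index is just a short list of links, producing it on each request is cheap; the expensive part is generating the sub-sitemap files themselves.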

If you have specified a cache period and you are still within it, then accessing sitemap.xml will just point to the existing sub-sitemap files that have already been created.

If caching is disabled, or the cache period has expired, accessing sitemap.xml will trigger the extension to generate all the sub-sitemap files.
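
The cache rule in the two paragraphs above boils down to a single check. Here is a minimal Python sketch of that decision; the function name `needs_rebuild`, its parameters, and the convention that a period of 0 means "caching disabled" are assumptions for illustration, not the extension's real settings or code.

```python
import time


def needs_rebuild(last_built, cache_hours, now=None):
    """Return True when the sub-sitemap files should be regenerated.

    last_built  -- Unix timestamp of the last generation, or None if never built
    cache_hours -- cache period in hours; 0 is treated as "caching disabled"
    now         -- current Unix timestamp (defaults to time.time())
    """
    if now is None:
        now = time.time()
    if cache_hours <= 0 or last_built is None:
        return True  # no cache configured, or nothing generated yet
    return now - last_built >= cache_hours * 3600  # cache period expired?


# With a 72-hour cache, a request 71 hours after the last build reuses
# the existing files, while a request at 72 hours triggers a rebuild.
print(needs_rebuild(0, 72, now=71 * 3600))
print(needs_rebuild(0, 72, now=72 * 3600))
```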

This way, the extension is basically set and forget. You submit the sitemap URL to Google once, and Googlebot then accesses sitemap.xml every few days. Depending on your cache setting, those visits trigger the sitemap to be rebuilt periodically as board content changes.

I recommend caching for 72 hours. This is based on how often Googlebot accesses my sitemap. Bingbot never updates it! I hope this helps.
phpBB user since 2002
www.AusRotary.com
KYPREO
Registered User
Posts: 388
Joined: Fri Feb 02, 2018 9:56 am

Re: Does it create the sitemap.xml?

by globetrotting » Sun Jan 26, 2020 8:51 am

Thanks again for the clarification, KYPREO. It helped indeed.
I got worried when I submitted the sitemap link to Google Search Console and Google couldn't find it at first.
It has succeeded now, but after what you wrote, that is proof of a properly working extension rather than the result of my upload.

Concerning the cache setting:
If it's mainly Google that calls the link every few days, why use a cache at all?
Like the last-modified date, this seems to me like a setting that might only have negative effects, namely if Google arrives early, finds your sitemap unchanged, and adapts its algorithm to a lower visiting frequency.
globetrotting
Registered User
Posts: 198
Joined: Thu Jan 15, 2004 8:14 pm
Location: globetrotting

Re: Does it create the sitemap.xml?

by KYPREO » Sun Jan 26, 2020 9:08 am

It depends on the size of your forum and the frequency of posting. If you have a smallish database, sitemap generation doesn't take long. In that case, caching might not be necessary. Otherwise, caching saves server resources.

In my case, I have 165,000 submitted URLs and it takes 10 minutes to build the sitemap. During this time, a lot of memory and CPU are used. I'd rather that not happen frequently.

Even if Google were getting a refreshed sitemap more frequently, it only actually crawls most pages once every 3-14 days (in some cases much less often). IMO it is not necessary to update the sitemap any more frequently than individual pages are being crawled.

Perhaps start with zero caching, monitor Google Search Console for a while, then adjust from there.
KYPREO
Registered User
Posts: 388
Joined: Fri Feb 02, 2018 9:56 am