Improve schema.org Markup

rrlevering
Registered User
Posts: 4
Joined: Fri Mar 17, 2023 5:19 pm

Improve schema.org Markup

Post by rrlevering »

My name is Ryan Levering and I handle structured data ingestion at Google (this guy). I also hosted a couple of phpBB forums many, many years ago. I'm not sure what forum to post this in, but since it's a feature request, I'll stick it here.

Right now phpBB only provides so#BreadcrumbList markup on the webpages (put one of your URLs in http://validator.schema.org to see what I mean). Ideally the thread pages would have markup structured something like:
so#WebPage -> so#mainEntity -> so#DiscussionForumPosting (for OP) -> so#comment -> so#Comment for all the responses with so#author links pointing to profile pages for those authors. As a rough example, you can look at Reddit's markup in http://validator.schema.org. They have a more threaded model, but the same rough schema (with flat comments) would be appropriate for phpBB.
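
The nesting described above can be sketched as JSON-LD generated from a topic's posts. This is only an illustration, not phpBB's or Google's implementation: the function name, the input dictionary keys, and the example.com URLs are all made up for the example, and the output has not been checked against any validator.

```python
import json


def build_topic_markup(topic_url, title, posts):
    """Build WebPage -> mainEntity -> DiscussionForumPosting -> comment -> Comment."""

    def author(post):
        # author entry whose url points at the poster's profile page
        return {"@type": "Person", "name": post["author"], "url": post["profile_url"]}

    first, replies = posts[0], posts[1:]
    return {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "url": topic_url,
        "mainEntity": {
            "@type": "DiscussionForumPosting",
            "headline": title,
            "text": first["text"],
            "datePublished": first["date"],
            "author": author(first),
            # Flat model: every reply hangs directly off the first post.
            "comment": [
                {
                    "@type": "Comment",
                    "text": post["text"],
                    "datePublished": post["date"],
                    "author": author(post),
                }
                for post in replies
            ],
        },
    }


# Hypothetical topic data; names and URLs are placeholders.
posts = [
    {"author": "alice", "profile_url": "https://forum.example.com/member/alice",
     "text": "First post of the topic.", "date": "2023-03-17T17:19:00+00:00"},
    {"author": "bob", "profile_url": "https://forum.example.com/member/bob",
     "text": "A reply.", "date": "2023-03-18T09:05:00+00:00"},
]
markup = build_topic_markup("https://forum.example.com/topic/1", "Example topic", posts)
print(json.dumps(markup, indent=2))
```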

Anyway, I happen to know that a number of our systems at Google consume schema.org markup beyond what phpBB provides right now. Providing more valid schema.org markup could fix problems search engines have with these pages, such as detecting post dates, identifying which post is the main content of the URL, and answering questions based on comments in the threads. It is likely this would result in your content being promoted better in search engines, or at least in fixing accuracy problems.
[Dimetrodon]
Registered User
Posts: 438
Joined: Tue Aug 30, 2022 3:29 am
Location: Paleozoic Era

Re: Improve schema.org Markup

Post by [Dimetrodon] »

If we have someone at Google posting here, I feel like this warrants more attention.
Avatar by someone named AdmiralRA on Reddit. (No, I don't have a Reddit account)
When seeking support, please consider filling out the Support Request Template. It makes it easier for anyone trying to help.
AmigoJack
Registered User
Posts: 6108
Joined: Tue Jun 15, 2010 11:33 am
Location: グリーン ヒル ゾーン

Re: Improve schema.org Markup

Post by AmigoJack »

It's highly ironic that a Google guy advertises schema markup without using one tiny bit of formatting markup (BBCode) himself and without using the correct terms either (phpBB has no threads, only topics). Let alone linking the validator website right away with sample URLs (like https://validator.schema.org/#url=https ... dea%2F4071 and... don't know, because neither https://validator.schema.org/#url=https ... 2FphpBB%2F nor https://www.reddit.com/r/phpBB/comments ... _of_phpbb/ gives me any schema markup).

So, could someone else do it properly and at least link us to a Reddit example? Unless @rrlevering edits his post...
  • "The problem is probably not my English but you do not want to understand correctly. ... We will not come anybody anyway, nevertheless, it's best to shit this." Affin, 2018-11-20
  • "But this shit is not here for you. You can follow with your. Maybe the question, instead, was for you, who know, so you shoved us how you are." axe70, 2020-10-10
  • "My reaction is not to everyone, especially to you." Raptiye, 2021-02-28
rrlevering
Registered User
Posts: 4
Joined: Fri Mar 17, 2023 5:19 pm

Re: Improve schema.org Markup

Post by rrlevering »

We are releasing official docs in the upcoming months that will be very thorough; feel free to wait for those. This was just a heads-up for everyone, since it takes so long to roll out something like that. And sorry for the Reddit link; you're right, that doesn't work.

There are a number of forum software packages that have reasonable schema.org markup to use as examples (even without clear documentation). Here's Discourse for instance: https://validator.schema.org/#url=https ... ts%2F71471. They use inline Microdata, which is probably preferable if you can do it right from scratch, because it's mostly annotations around existing HTML. But injected JSON-LD is usually easier to work with as a standalone change.
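
To illustrate why injected JSON-LD works as a standalone change, here is a rough sketch that turns a dictionary of markup into a single script element that can be appended to a page without touching the existing HTML. The helper name is made up for this example. Escaping "<" is a precaution I am adding here so that post text containing "</script>" cannot close the element early.

```python
import json


def jsonld_script_tag(data):
    """Serialize a dict as one <script type="application/ld+json"> element."""
    payload = json.dumps(data, ensure_ascii=False)
    # "\u003c" is a JSON escape for "<"; parsers decode it back, but the
    # HTML tokenizer never sees a literal "</script>" inside the payload.
    payload = payload.replace("<", "\\u003c")
    return '<script type="application/ld+json">' + payload + "</script>"


# Hypothetical reply whose text would otherwise break out of the element.
example = {
    "@context": "https://schema.org",
    "@type": "Comment",
    "text": "see </script> here",
}
tag = jsonld_script_tag(example)
print(tag)
```

The rest of the template stays as it is, which is the trade-off against Microdata: nothing in the visible HTML has to be annotated, but the data is emitted a second time.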
rrlevering
Registered User
Posts: 4
Joined: Fri Mar 17, 2023 5:19 pm

Re: Improve schema.org Markup

Post by rrlevering »

As I promised, the guidelines and Search Console tooling have been released for Discussion Forums (your posting pages): https://developers.google.com/search/do ... sion-forum and Profile Pages: https://developers.google.com/search/do ... ofile-page. I would recommend you implement the schema.org markup as an option for your forums if they are interested in inclusion in any of the forum-oriented features (and traffic) we are building. We will be attempting to extract these things automatically at some point, but markup is always going to work more accurately.
Nekstati
Registered User
Posts: 30
Joined: Tue Mar 17, 2009 11:08 pm

Re: Improve schema.org Markup

Post by Nekstati »

Yes, we need to implement this, but --

Any HTML document is already structured data by its origin.

Google, by forcing us to add this so-called Structured Data, forces us to duplicate our data, just in a different form. This is ridiculous, this is absurd.

Code:

<title>phpBB • Improve schema.org Markup</title>

<li class="breadcrumbs">
    <span class="crumb"><a href="./index.php">Board index</a></span>
    <span class="crumb"><a href="./viewforum.php?f=451">Extensions Forums</a></span>
</li>

Why on Earth should this clean and absolutely self-explaining code be polluted by itemscopes, itemtypes, itemprops and other metadata? Why in hell can't Google, with all its claimed power of AI, understand this code as is?

Moreover - Google, as a search system, works worse and worse every year. No metadata seems to help it work properly and find exactly what people are asking for - like it worked 15 years ago.

But Google is actually a monopolist. Unfortunately, we must listen to them, whether we want it or not, and implement every absurdity they come up with.
rrlevering
Registered User
Posts: 4
Joined: Fri Mar 17, 2023 5:19 pm

Re: Improve schema.org Markup

Post by rrlevering »

We probably could extract that fine without markup. A large amount of the data behind a number of our features (including breadcrumbs) comes from extraction, and some of it comes from more sophisticated general models these days, like the ones you are alluding to.

But...
  1. sometimes there is metadata that is not included in the page contents (like time precision on post dates)
  2. model-based extraction is never 100% accurate the way a data feed is, and a data feed is sort of what markup is
  3. wouldn't you rather have some control over fixing your own search appearance rather than whatever we extract?
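
Point 1 can be sketched like this. It assumes the forum stores each post time as a Unix timestamp (the value below is invented), and the function name is made up for the example. The page shows a rounded date, while the markup can carry the full value.

```python
from datetime import datetime, timezone


def post_dates(unix_time):
    """Return the date shown to visitors and the one that markup can carry."""
    moment = datetime.fromtimestamp(unix_time, tz=timezone.utc)
    return {
        # What a visitor sees: minute precision, no time zone.
        "display": moment.strftime("%Y-%m-%d %H:%M"),
        # What datePublished can hold: ISO 8601 with seconds and UTC offset.
        "datePublished": moment.isoformat(),
    }


dates = post_dates(1679073582)  # hypothetical stored timestamp
print(dates["display"])
print(dates["datePublished"])
```
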
The web has changed dramatically in the past 15 years and it's harder to sort through the junk. But as a user myself, I really do think that the results have gotten better in the past year. This is partly because we're highlighting content in forums, like those that run on phpBB, that have less junk on them than generic, over-optimized webpages. So please help us.