The semantic web and schemas?

TL;DR: how RDFaL and HTML interact.

For a while now, I have been working on my Jekyll theme, structRDFaL, which is meant to make generating structured data in accordance with schema.org easier. As stated, improving SEO for your site is a secondary reason for paying attention to schemas. It should be about best practices and being a good netizen with just a little bit more effort.

The "a little bit more effort" is key to my current dilemma. schema.org has lots of information and examples, but it is presented in a way that more or less expects you to know what is going on. I have yet to find a bridge between the list of types and examples on a random schema page and the hardcore (?), mostly unreadable (to me) technical-standards type of documents. I would very much like to find a popsci-type explainer/implementation describer on this topic. I expect that part of the problem is that I chose to use RDFa Lite rather than JSON-LD, which would not (?) have the "envelope" problems I have encountered. Thus, I continue to be a cut-and-paste keyboard monkey and not a properly informed technician.

The problem with being a keyboard monkey is the endless and repetitive testing by doing and checking the results, instead of knowing and doing (and then testing to see if it's been properly implemented 🥸). In the midst of the Sisyphean coding by validation testing, I had a crisis of faith. Why was I doing this? If SEO is not the goal, then could/should I make do with proper structured HTML usage and call it a day? What did the Semantic Web want in the first place???

The W3C's Semantic Web Best Practices document was not helpful. More than half the links on the tutorial page are currently dead. I'm not trying to make my own technical vocabulary ... or am I???

In any case, I'm certainly not trying to help you (or anyone who might choose to use my structRDFaL Jekyll theme) write your own formal 'net recognized vocabulary. So, why am I doing this? schema.org schemas are not even proper standards type things! Part of it, perhaps, is suffering from sunk cost (fallacy?) problems. So much time already wasted trying to do this thing, so just finish it. Most schemas like Blog or Book were relatively easy to implement (just a little more effort). Trying to make a generic HowTo tool has been a bear, though. Are schemas actually useful to the semantic web or no? What am I even doing and why??

This is my existential crisis, however, and you should not care. 😅 So, let me share what I have discovered about RDFaL for my fellow keyboard monkeys and leave it at that, for now.

RDFaL and HTML interaction

HTML element tags envelope/contain schema structures in what I find to be an inconsistent way. Sometimes when you tag a schema property two layers down, it is not recognized as part of the same structure.

Modified example from HowTo
<div property="itemListElement" typeof="HowToDirection">
  <meta property="position" content="1"/>
  <a href="more.html">
    <p property="text">Turn on your hazard lights and set the
      flares.</p>
  </a>
</div>
      
https://validator.schema.org/ output
Detected               1 ERROR   0 WARNINGS  2 ITEMS
Unspecified Type       1 ERROR   0 WARNINGS  1 ITEM
HowTo                  0 ERRORS  0 WARNINGS  1 ITEM
      
The ERROR is on the <p> line and says
Unspecified Type (The @type is required and cannot be an empty string.)

The broken-ness is not always an error. Sometimes structures are recognized as separate fragments, such as a HowTo and 3 other HowToDirections, rather than a single HowTo with however many sections, steps, and directions inside of it.
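For reference, here is a minimal sketch of what a single nested structure looks like in RDFa Lite. In RDFa, an element that has a typeof but no property starts a new item that is not attached to anything above it, which is one way to end up with fragments. The name, the text, and the single HowToStep layer are placeholders for illustration, not code from my theme.

```html
<!-- Sketch only: name and text are placeholders. -->
<div vocab="https://schema.org/" typeof="HowTo">
  <h1 property="name">How to change a flat tire</h1>
  <!-- property attaches this step to the HowTo; typeof says what it is -->
  <div property="step" typeof="HowToStep">
    <!-- same pattern one layer down: property + typeof together -->
    <div property="itemListElement" typeof="HowToDirection">
      <meta property="position" content="1"/>
      <p property="text">Turn on your hazard lights and set the flares.</p>
    </div>
  </div>
</div>
```

Dropping property="step" from the HowToStep <div> (leaving only typeof) is the kind of change I would expect to split the result into separate fragments.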

Operating on the assumption that you must have schema tags in every layer in order to end up with a single complete structure instead of disparate fragments, I began testing. I wanted to know what to do if I had an image that could be expanded (<a>). What should I do if I were to tag a structure as a <figure>, and what happens if an image is a <picture> vs an <img>?

Additionally, in the official example code, there is

Tagging the img directly
<div property="itemListElement" typeof="HowToDirection">
  <meta property="position" content="1"/>
  <img alt="image showing positioning of jack" property="duringMedia" 
    src="position-jack.jpg" />
  <div property="text">Position the jack underneath the car, 
    next to the flat tire.</div>
</div>	
      
as well as
Tagging the img as an object
<div property="itemListElement" typeof="HowToDirection">
  <meta property="position" content="1"/>
  <div property="beforeMedia" typeof="ImageObject">
    <img alt="image showing car while still on the ground" 
      property="contentUrl" src="car-on-ground.jpg" />
  </div>
</div>
      
Are they both correct? Is one actually a mistake?!? 🤷🏻‍♀️

The following "work", but are they "correct"? At least they validate with schema.org's validator:

Sequence tagging an <img>
<div property="itemListElement" typeof="HowToDirection">
  <meta property="position" content="1"/>
  <img alt="..." property="duringMedia" src="position-jack.jpg" />
</div>
      
Generic <figure> tagging with sequence
<div property="itemListElement" typeof="HowToDirection">
  <meta property="position" content="2"/>
  <figure property="beforeMedia" typeof="MediaObject">
    <img alt="..." property="contentUrl" src="car-on-ground.jpg" />
  </figure>
</div>
      
Specific <figure> tagging with sequence
<div property="itemListElement" typeof="HowToDirection">
  <meta property="position" content="3"/>
  <figure property="beforeMedia" typeof="ImageObject">
    <img alt="..." property="image" src="car-on-ground.jpg" />
  </figure>
</div>
      
Tagging text vs images
<div property="itemListElement" typeof="HowToDirection">
  <meta property="position" content="4"/>
  <figure property="duringMedia" typeof="TextObject">
    <p property="text">...</p>
  </figure>
</div>
      
Media without a specified sequence can be used, but if for some reason you want/need a tag to fill in the layers, you can use associatedMedia as a generic non-sequence property. If your media is an image, then you can use typeof="MediaObject" if you don't care to get more specific, and the <img> will take care of the type. Unfortunately, that is not so in the case of text. There you will need to add both a typeof="TextObject" and a property="text".
<div property="itemListElement" typeof="HowToDirection">
  <meta property="position" content="5"/>
  <figure property="associatedMedia" typeof="TextObject">
    <p property="text">...</p>
  </figure>
</div>
      

Having struggled and wrangled to discover all that, I recently discovered that while an <a> layer disrupts the recognized structure of the schema, neither <figure> nor <picture> does. In fact, you can use both <figure> and <picture>, then just property-tag the <img> within, and it still works properly. 🤦🏻‍♀️

<div property="itemListElement" typeof="HowToDirection">
  <meta property="position" content="6"/>
  <figure>
    <picture>
      <img property="image" src="..." />
    </picture>
  </figure>
</div>
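My best guess at why <a> behaves differently, from what I can make out of the RDFa Core 1.1 processing rules: an element that carries an href but no property of its own becomes the new subject for everything inside it. That would mean the <p> in the first example ends up describing more.html (which has no type) rather than the HowToDirection. <figure> and <picture> carry no such attribute, so they are transparent. Assuming that reading is right, one workaround is to move the link inside the tagged element. This is a sketch of the first example rearranged, untested against the validator.

```html
<!-- Sketch: the link now sits inside the element that carries the property, -->
<!-- so the text should still belong to the HowToDirection. -->
<div property="itemListElement" typeof="HowToDirection">
  <meta property="position" content="1"/>
  <p property="text">
    <a href="more.html">Turn on your hazard lights and set the
      flares.</a>
  </p>
</div>
```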
    

If you are linking to an expanded image and you want the linking anchor (the image) to be recognized in the data structure, then sequence tag the link and give the image anchor the thumbnailUrl property.

Just image
<a property="beforeMedia" href="bigger-before.jpg">
  <img property="thumbnailUrl" alt="..." src="before-thumb.jpg" />
</a>
      
Unnecessarily tagged picture layer because I don't understand what the thumbnail property is doing (it's certainly not recognized here but doesn't break anything if it's included?).
<a property="duringMedia" href="bigger-during.jpg">
  <picture property="thumbnail">
    <img property="thumbnailUrl" alt="..." src="during-thumb.jpg" />
  </picture>
</a>
      
If you were to use them all in a single Direction, though, there is no particular recognition of association between a particular thumbnail and a particular media. Is this therefore "wrong"?? 🤷🏻‍♀️
Validator output. Even though the before/during/after media were coded in that order, the recognized structure doesn't list them that way. What does it mean??
itemListElement
   @type                      HowToDirection
   position                   2
   text                       some instructional text
   afterMedia                 after text
   beforeMedia                https://example-test.site/bigger-before.jpg
   thumbnailUrl               https://example-test.site/before-thumb.jpg
   duringMedia                https://example-test.site/bigger-during.jpg
   thumbnailUrl               https://example-test.site/during-thumb.jpg	
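
One way to make each thumbnail belong to its own media, sketched here under the assumption that nesting works the same way as in the official beforeMedia example: wrap each link in its own typed ImageObject, then put contentUrl on the link and thumbnailUrl on the image inside it. This is untested against the validator, and the file names are the same placeholders as above. As for the order, my understanding is that the extracted properties of an item are an unordered set, so source order is not preserved, which would be why position exists.

```html
<div property="itemListElement" typeof="HowToDirection">
  <meta property="position" content="2"/>
  <!-- Each media gets its own ImageObject, so its thumbnail stays with it -->
  <div property="beforeMedia" typeof="ImageObject">
    <a property="contentUrl" href="bigger-before.jpg">
      <img property="thumbnailUrl" alt="..." src="before-thumb.jpg" />
    </a>
  </div>
  <div property="duringMedia" typeof="ImageObject">
    <a property="contentUrl" href="bigger-during.jpg">
      <img property="thumbnailUrl" alt="..." src="during-thumb.jpg" />
    </a>
  </div>
</div>
```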
      
