The Semantic Web Rabbit Hole
So, I was working on implementing the Book schema and had gotten pages and posts working with test data off the top of my head. I had put a hook in for book lists and decided that this would be a good place to test out Jekyll collections.
Now, I could fabricate more data, but since the inspiration for wanting book lists in the first place was sitting right there, no need to imagine a list. I'd take a few books from the page, create them as collection records, test out the whole collection stuff, decide how to template/style a list entry (I was already thinking about how to activate/deactivate fields for a nice listing format), and Bob's your uncle! Of course, life is not so simple.
The second book (which, in retrospect, I should maybe have skipped?) was one of the more complicated types you're likely to encounter:
- there are multiple editions
- it is the translation of another book
- many elements have multilingual versions
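For the curious, here is a rough sketch of how those three complications might map onto schema.org properties (`workExample`, `bookEdition`, `translationOfWork`, `translator`, `inLanguage`). It is written in Python purely for illustration; the function name `build_book` and every title, name, and language code in it are placeholders, not the actual book or the markup this site ends up using.

```python
import json


def build_book():
    """Build a JSON-LD dict for a translated book with two editions (placeholder data)."""
    # The work this book was translated from.
    original = {
        "@type": "Book",
        "name": "Original Title",
        "inLanguage": "fr",
    }
    return {
        "@context": "https://schema.org",
        "@type": "Book",
        "name": "Translated Title",
        "inLanguage": "en",
        # Multilingual element: a JSON-LD value object tags a string with its language.
        "alternateName": {"@value": "Original Title", "@language": "fr"},
        # It is the translation of another book.
        "translationOfWork": original,
        "translator": {"@type": "Person", "name": "Translator Name"},
        # There are multiple editions: each one is an example of the same work.
        "workExample": [
            {"@type": "Book", "bookEdition": "First edition", "inLanguage": "en"},
            {"@type": "Book", "bookEdition": "Second edition", "inLanguage": "en"},
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_book(), indent=2, ensure_ascii=False))
```

Whether any given consumer of the markup actually reads language-tagged value objects is something I have not verified, so treat that part as the most speculative.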
Add more fields. Define them inside the schema and just for the page. The Book part, per se, was done, but then I had all this publication/publisher information that I wanted to retain. You can. You can define a publisher as an Organization, and then their address can be a PostalAddress and, and, and... but do you need to?
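To make the "and, and, and" concrete, here is a minimal sketch of a publisher as an Organization with an optional PostalAddress. The helper name `publisher_node` and its parameters are hypothetical; the property names (`address`, `streetAddress`, `addressLocality`, `addressCountry`) are the schema.org ones. The address is only attached when there is something to put in it, which is one way of not committing to more structure than the data supports.

```python
import json


def publisher_node(name, street=None, locality=None, country=None):
    """Return a schema.org Organization; add a PostalAddress only if any part is given."""
    node = {"@type": "Organization", "name": name}
    parts = {
        "streetAddress": street,
        "addressLocality": locality,
        "addressCountry": country,
    }
    # Keep only the address fields that were actually supplied.
    address = {key: value for key, value in parts.items() if value}
    if address:
        node["address"] = {"@type": "PostalAddress", **address}
    return node


if __name__ == "__main__":
    # Placeholder values, not a real publisher.
    book = {
        "@context": "https://schema.org",
        "@type": "Book",
        "name": "Translated Title",
        "publisher": publisher_node("Publisher Name", locality="City", country="CA"),
    }
    print(json.dumps(book, indent=2))
```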
Examples on schema.org use WorldCat and VIAF as resources. If I could point to an official record somewhere with all the publisher information, or even all the author information, then I wouldn't have to store or code it myself: go there if you're curious or need to know for whatever reason.
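The "just point at the official record" idea could look something like the sketch below, which uses schema.org's `sameAs` property to link out instead of storing the details locally. The helper names, the two URL prefixes, and the `VIAF_ID` / `OCLC_NUMBER` strings are all assumptions and placeholders: check the permalink shown on the actual record before copying either pattern.

```python
import json

# Assumed permalink prefixes; verify against the real record pages.
VIAF_BASE = "https://viaf.org/viaf/"
WORLDCAT_BASE = "https://www.worldcat.org/oclc/"


def author_ref(name, viaf_id):
    """Describe an author by name only and defer everything else to an authority record."""
    return {"@type": "Person", "name": name, "sameAs": VIAF_BASE + str(viaf_id)}


def book_ref(title, author, oclc_number):
    """Describe a book minimally and point at a catalogue record for the rest."""
    return {
        "@context": "https://schema.org",
        "@type": "Book",
        "name": title,
        "author": author,
        "sameAs": WORLDCAT_BASE + str(oclc_number),
    }


if __name__ == "__main__":
    # Placeholder identifiers, not real records.
    author = author_ref("Author Name", "VIAF_ID")
    print(json.dumps(book_ref("Translated Title", author, "OCLC_NUMBER"), indent=2))
```

This only helps, of course, if the record on the other end really does hold more than I would have stored myself.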
VIAF is hideously confusing at first glance. It was quick to narrow down from "author with this name" to "author of this book with this name". The record had tonnes of other entries, which I believe come from librarians at different institutions doing data entry ever so slightly differently, but not so much that the authority's software doesn't collate them together (without winnowing them down to a canonical one). But they only had one of her five books.
WorldCat, on the other hand, looked to have all of her books, but very few had cover images, in particular the non-English editions, which made me less certain of my interpretation. Also, in WorldCat, I was unable to winnow out the other authors with the same name, so... what to do???
I had hoped that the "official" sources of the world would save me from formally formatting a bunch of data and let me just point to the references, but I have more data than they do, and I am begrudgingly willing to encode it. If I go to the trouble, can I contribute it to the world's formatted data? VIAF and WorldCat are library-based places. Of course, you don't want random mischievous users polluting your rigorously formatted formal data pool, but you still need help, no? I found a document that mentions that VIAF takes "contributions" from Wikidata, which does take user contributions... should I?
The lazy (correct??) voice in my soul says that I could retain this extraneous information, unstructured, in the description of my book data and call it a day (or even not bother retaining it, who cares?). Does the publisher even exist anymore? Would/will anyone ever need to make use of the information? These are some of the thoughts that plague an info-pack rat.