Refactoring Bio With Einstein Part 4: Employment and Families

I'm sure regular readers of this series read part three and thought "nice, but do I have to wait five months for the next part again?" Well, I thought I'd buck the trend and just get on with part four :)

I thought I'd go back to the original paragraph about Einstein from Wikipedia:

Einstein was born at Ulm in Württemberg, Germany; about 100 km east of Stuttgart. His parents were Hermann Einstein, a featherbed salesman who later ran an electrochemical works, and Pauline, whose maiden name was Koch. They were married in Stuttgart-Bad Cannstatt. The family was Jewish (and non-observant); Albert attended a Catholic elementary school and, at the insistence of his mother, was given violin lessons.

I just rechecked Wikipedia and this paragraph has hardly changed, which is encouraging. Presumably wiki-consensus has been reached on that aspect of Einstein's life. Here's what I have so far, expressed in Turtle rather than RDF/XML.

# People

_:albert a foaf:Person ; foaf:name "Albert Einstein" ; bio:father _:hermann ; bio:mother _:pauline ; bio:event _:albert-birth ; bio:event _:albert-taking-violin-lessons ; bio:event _:albert-attending-elementary-school .

_:hermann a foaf:Person ; foaf:name "Hermann Einstein" ; bio:event _:hermann-and-pauline-marriage ; bio:event _:hermann-and-pauline-being-married .

_:pauline a foaf:Person ; foaf:name "Pauline Einstein" ; bio:event _:hermann-and-pauline-marriage ; bio:event _:hermann-and-pauline-being-married ; bio:condition _:pauline-married-name ; bio:condition _:pauline-maiden-name .


_:albert-birth a bio:Birth ; rdfs:label "The birth of Albert Einstein" ; bio:date "1879-03-14" ; bio:place "Ulm, Württemberg, Germany" ; time:intDuring _:hermann-and-pauline-being-married .

_:albert-attending-elementary-school a bio:Event ; rdfs:label "The event of Albert attending elementary school" ; time:intAfter _:albert-birth .

_:albert-taking-violin-lessons a bio:Event ; rdfs:label "The event of Albert taking violin lessons" ; time:intAfter _:albert-birth .

_:hermann-and-pauline-marriage a bio:Marriage ; rdfs:label "The marriage of Hermann Einstein and Pauline Koch" ; bio:place "Stuttgart-Bad Cannstatt" ; time:intMeets _:hermann-and-pauline-being-married .

_:hermann-and-pauline-being-married a bio:Event ; rdfs:label "The time during which Hermann and Pauline were married" ; time:intContains _:albert-birth .


_:pauline-maiden-name a bio:Condition ; time:intMetBy _:pauline-birth ; time:intMeets _:pauline-married-name ; foaf:name "Pauline Koch" .

_:pauline-married-name a bio:Condition ; time:intMetBy _:hermann-and-pauline-being-married ; foaf:name "Pauline Einstein" .

There are some anomalies, such as foaf:name still being applied to the foaf:Person, and some of those events should perhaps be replaced by or enhanced with conditions. But I want to examine the rest of the paragraph for what I haven't expressed yet. I can see:

  1. Ulm is about 100km east of Stuttgart
  2. Hermann was a featherbed salesman
  3. Hermann later ran an electrochemical works
  4. The family was Jewish
  5. Albert's elementary school was Catholic
  6. Albert's violin lessons were at the insistence of his mother

I'm going to dismiss 1, 5 and 6 as out of scope for Bio and move straight on to 2. Bearing in mind my earlier characterisation of an Event as something that brings about a change in condition of an individual, I can model it as a Condition and two Events. The Condition is that Hermann is employed:

_:hermann-as-featherbed-salesman
  a bio:Condition ;
  rdfs:label "Working as a featherbed salesman" .

Then there is the event of Hermann starting this period of employment:

_:hermann-starts-working-as-featherbed-salesman
  a bio:Event ;
  rdfs:label "Starts work as featherbed salesman" .

and the event of him finishing this employment:

_:hermann-stops-working-as-featherbed-salesman
  a bio:Event ;
  rdfs:label "Stops work as featherbed salesman" .

I can relate the three things together like this:

_:hermann-as-featherbed-salesman
  time:intMetBy _:hermann-starts-working-as-featherbed-salesman ;
  time:intMeets _:hermann-stops-working-as-featherbed-salesman .

_:hermann-starts-working-as-featherbed-salesman time:intMeets _:hermann-as-featherbed-salesman ; time:intBefore _:hermann-stops-working-as-featherbed-salesman .

_:hermann-stops-working-as-featherbed-salesman time:intMetBy _:hermann-as-featherbed-salesman ; time:intAfter _:hermann-starts-working-as-featherbed-salesman .

I can use the same model for item 3 on my list:

_:herman-as-electro-works-manager
  a bio:Condition ;
  rdfs:label "Running an electrochemical works" .

_:hermann-starts-running-electro-works a bio:Event ; rdfs:label "Starts running an electrochemical works" ; time:intMeets _:herman-as-electro-works-manager ; time:intBefore _:hermann-stops-running-electro-works .

_:hermann-stops-running-electro-works a bio:Event ; rdfs:label "Stops running an electrochemical works" ; time:intMetBy _:herman-as-electro-works-manager ; time:intAfter _:hermann-starts-running-electro-works .

I know that Hermann was running an electrochemical works after he was a featherbed salesman, but the description doesn't give me any more than that:

_:herman-as-electro-works-manager
  time:intAfter _:hermann-as-featherbed-salesman .

Since employment is a very common condition for a person to be in, it's probably appropriate to define a class for it:

_:hermann-as-featherbed-salesman
  a bio:Employment .

_:herman-as-electro-works-manager a bio:Employment .

I'd also like to be more explicit about the actual job Hermann was doing during those periods of employment. I've used rdfs:label to describe the events and conditions, but I'd like to structure this information in some way. My first guess is simply to introduce a bio:jobTitle like this:

_:hermann-as-featherbed-salesman
  bio:jobTitle "Featherbed Salesman" .

_:herman-as-electro-works-manager bio:jobTitle "Electrochemical Works Manager" .
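
As a rough sketch, the two new terms could be declared along the following lines. Everything here is an assumption made for illustration: the class and property are only proposals at this point, the labels and comment are invented, and treating bio:Employment as a subclass of bio:Condition simply follows the way it is used above.

```turtle
@prefix bio:  <http://purl.org/vocab/bio/0.1/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Proposed only: a kind of Condition covering a period of employment.
bio:Employment
  a owl:Class ;
  rdfs:subClassOf bio:Condition ;
  rdfs:label "Employment"@en ;
  rdfs:comment "A period during which a person is employed"@en .

# Proposed only: a plain literal naming the job held during that period.
bio:jobTitle
  a owl:DatatypeProperty ;
  rdfs:domain bio:Employment ;
  rdfs:label "job title"@en .
```

One consequence of the rdfs:domain statement is that anything given a bio:jobTitle would be inferred to be a bio:Employment, which may or may not be wanted.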

Now, these events and conditions are related temporally by the OWL-Time properties I'm using, but I think it would be useful to relate them causally too. Doing this modelling is helping me understand the relationships between the concepts I'm using. If a Condition is the state of being for an individual at a particular period of time and an Event is something that brings about a change in condition of an individual, then there's a causal relationship between the two concepts. An event brings about a new condition and a condition is a consequence of an event. Here's how I could express that latter idea:

_:hermann-starts-working-as-featherbed-salesman
  bio:consequence _:hermann-as-featherbed-salesman .

_:hermann-starts-running-electro-works bio:consequence _:herman-as-electro-works-manager .
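
A possible declaration for the property itself, offered only as a sketch: the domain and range below are assumptions read off the sentence "a condition is a consequence of an event", and the label and comment are invented.

```turtle
@prefix bio:  <http://purl.org/vocab/bio/0.1/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Proposed only: links an Event to a Condition that it brings about.
bio:consequence
  a owl:ObjectProperty ;
  rdfs:domain bio:Event ;
  rdfs:range bio:Condition ;
  rdfs:label "consequence"@en ;
  rdfs:comment "A condition brought about by this event"@en .
```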

Clearly events can have multiple consequences:

_:hermann-and-pauline-marriage
  a bio:Marriage ;
  bio:consequence _:hermann-married-to-pauline ;
  bio:consequence _:pauline-married-to-hermann .

_:hermann-married-to-pauline a bio:Condition ; rdfs:label "Being married to Pauline" .

_:pauline-married-to-hermann a bio:Condition ; rdfs:label "Being married to Hermann" .

But can conditions be the consequence of multiple events? I'm not sure. A literal reading of my definition of event suggests that every event brings about a change in condition of an individual without regard to any other events. I think I'm going to wait until I have more experience before deciding on this one.
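
To make the question concrete, here is what a condition with two contributing events might look like. The people and events are hypothetical placeholders, not taken from the Einstein paragraph, and bio:consequence is still only a proposed property.

```turtle
@prefix bio:  <http://purl.org/vocab/bio/0.1/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Hypothetical example: two separate events point at the same condition.
_:mother-death
  a bio:Death ;
  bio:consequence _:child-orphaned .

_:father-death
  a bio:Death ;
  bio:consequence _:child-orphaned .

_:child-orphaned
  a bio:Condition ;
  rdfs:label "Being an orphan" .
```

In this sketch the condition only begins once the later of the two events has happened, so neither event brings it about on its own. That is exactly the tension with a literal reading of the definition.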

So that's items 2 and 3 described, so on to 4, which is the family's faith. I could apply a new bio:faith property to a condition for each person, but the paragraph clearly says that the family's faith was Jewish (non-observant). It makes no mention of the individual family members' beliefs. Clearly people can be born into families of one faith without necessarily subscribing to it themselves. So perhaps I have to model a family unit. I could invent a bio:Family class and make Albert and his parents members of it. I don't want to assume any particular social structure, but a family seems to be a universal unit of social organisation. The extent of it and the degree of involvement do vary considerably between cultures, but the definition of bio:Family can be kept as unconstrained as possible.

It just so happens that another schema I've been involved with has a class that could be useful here. The Relationship schema, in which I had a minor and mostly editorial role, defines a Relationship class which is described as "A particular type of connection existing between people related to or having dealings with each other". I added that class hoping to use it for things like family units, marriages and partnerships. It was kept very open to enable use across all kinds of social situations. Here's my definition of a bio:Family class with a definition drawn from the Wikipedia page on Family:

bio:Family
  a owl:Class ;
  rdfs:subClassOf rel:Relationship ;
  rdfs:label "Family"@en ;
  rdfs:comment "a domestic group of people" .

The Relationship vocabulary provides the rel:participant property to declare involvement in a Relationship class. It can be used like this:

  a bio:Family ;
  rel:participant _:albert ;
  rel:participant _:hermann ;
  rel:participant _:pauline .

Now here's an interesting thing: there was a time when this Einstein family consisted only of Hermann and Pauline. Then Albert was born and he became a member too. So families have a state of being too which changes over time and those changes are triggered by events such as the birth of Albert. So perhaps I should be modelling the family in terms of Conditions too. Here's how it could work:

  a bio:Family ;
  bio:condition _:family-with-hermann-and-pauline ;
  bio:condition _:family-with-hermann-and-pauline-and-albert .

_:family-with-hermann-and-pauline a bio:Condition ; rel:participant _:hermann ; rel:participant _:pauline ; time:intMetBy _:hermann-and-pauline-marriage ; time:intMeets _:family-with-hermann-and-pauline-and-albert .

_:family-with-hermann-and-pauline-and-albert a bio:Condition ; rel:participant _:albert ; rel:participant _:hermann ; rel:participant _:pauline ; time:intMetBy _:albert-birth ; time:intMetBy _:family-with-hermann-and-pauline .

_:hermann-and-pauline-marriage time:intMeets _:family-with-hermann-and-pauline ; bio:consequence _:family-with-hermann-and-pauline .

_:albert-birth time:intMeets _:family-with-hermann-and-pauline-and-albert ; bio:consequence _:family-with-hermann-and-pauline-and-albert .

Here I'm saying that there is a family that has two conditions (states of being): one when it consisted of Hermann and Pauline and another when it included Albert too. Of course there may be others but I'm not considering those yet. The first condition started as soon as the marriage of Hermann and Pauline was complete and in fact was a consequence of that marriage (I'm simplifying here because they were a type of family before marriage too). This condition existed until the birth of Albert, whereupon a new condition arose. This new condition is a family consisting of Hermann, Pauline and Albert and is a consequence of Albert's birth. I think that makes sense.

Now what about that faith property I wanted to use, how do I do that? Well it seems obvious now that it has to be part of a condition. All I want to say is that at some point, and certainly when Albert was born, the family was of the Jewish faith:

  bio:condition _:family-faith .

_:family-faith bio:faith "Jewish (non-observant)" ; time:intContains _:albert-birth .

It doesn't give me a great deal of information about ordering of events. It would be good to anchor that to some event that defines the formation of the family:

  a bio:Event ;
  time:intBefore _:family-faith ;
  time:intBefore _:family-with-hermann-and-pauline .

But I have no way of saying that the family didn't exist before this event. I have the same problem with people. There could be some small value in having properties that represent the start and end events of a family or a person and adding constraints so that entities can have only a maximum of one of each type. For a start event like a birth, I'd like then to be able to say that there can be no events or conditions that start before that event - another area I believe is not supported by OWL yet.
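
The "maximum of one of each type" part can at least be sketched with what OWL already offers. The property names bio:startEvent and bio:endEvent below are hypothetical, invented for this illustration, and are not part of the Bio schema.

```turtle
@prefix bio:  <http://purl.org/vocab/bio/0.1/> .
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

# Hypothetical: the single event that begins a person's or family's existence.
bio:startEvent
  a owl:ObjectProperty , owl:FunctionalProperty ;
  rdfs:range bio:Event ;
  rdfs:label "start event"@en .

# Hypothetical: the single event that ends it.
bio:endEvent
  a owl:ObjectProperty , owl:FunctionalProperty ;
  rdfs:range bio:Event ;
  rdfs:label "end event"@en .

# Example use with the data from this post.
_:albert bio:startEvent _:albert-birth .
```

Because OWL makes no unique name assumption, a second bio:startEvent value would lead a reasoner to infer that the two events are the same rather than report an error, unless they are explicitly stated to be different. The sketch also says nothing about other events or conditions being unable to start before the start event, so that part of the problem remains open.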

But that's the end of the list and I think I've managed to model the entire paragraph of biographical information. It remains to be seen how useful this information really is and what holes there are in the sequencing of events. That's the end of this posting, but I already have plans for several more. Now I'm worried that I've jinxed them by mentioning them in public :) Remember also that all of the development I'm doing here is experimental. I haven't made any changes to the Bio schema as yet and I may choose not to. I'm still thinking this through and trying to understand where the limits of this all-in-the-rdf-model approach lie. I'm actually pretty pleased with it so far and I'm gaining confidence that it's going to end up expressive enough to answer the kinds of genealogical questions I have in mind.

P.S. While writing this it occurred to me that I forgot to mention another candidate for a temporal invariant in the previous post and it's one that already exists in the Bio schema - bio:olb. This property was the whole reason why Dave and I created Bio in the first place. It's a simple enough concept - a one line biography of the person. It's a potted history, a summary of that person's life achievements. Now, it may change over the lifetime of the person but it does so in a special way. It's a cumulative record of that person's life. It doesn't depend on a particular state of being of a person, but on all of them up to that point in time. It's not strictly invariant but it's not something that gets replaced over time - it simply grows.

See also: posts in the "Refactoring Bio" series: Part 1: First Steps, Part 2: Conditions, Part 3: Temporal Invariants, Part 4: Employment and Families

