Mental Lock-In

The other day I stumbled upon a web page defining some notions like generalization/specialization and inheritance. (Intentionally no link provided). Under the heading "Single/multiple inheritance" I found the following explanation (my translation):

"The two classes Student and Employee (at a University) are super classes of class StudentAssistant." etc.

I have seen similar statements in several textbooks.

Let me outline the discussion that followed and why I feel that such definitions do more harm than good.

Discussion

I reported my dissent to the authors of said web page, stating the problem of, e.g., hiring a Student as a StudentAssistant, which in this model would require deleting the Student instance and creating a new instance of type StudentAssistant.
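To make the problem concrete, here is a minimal Java sketch of the inheritance-based model. All class and method names (HiringDemo, hire, hoursPerWeek) are my own illustration and not taken from the criticized page, and the second super class Employee is left out because Java classes cannot have two super classes.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical classes illustrating the criticized model.
class Student {
    final String name;

    Student(String name) {
        this.name = name;
    }
}

class StudentAssistant extends Student {
    final int hoursPerWeek;

    StudentAssistant(String name, int hoursPerWeek) {
        super(name);
        this.hoursPerWeek = hoursPerWeek;
    }
}

class HiringDemo {
    // "Hiring" cannot upgrade the existing object: the class of an object
    // is fixed at creation, so a replacement instance has to be built.
    static StudentAssistant hire(Student s, int hoursPerWeek) {
        return new StudentAssistant(s.name, hoursPerWeek);
    }

    public static void main(String[] args) {
        Student alice = new Student("Alice");
        List<Student> seminar = new ArrayList<>();
        seminar.add(alice); // someone else holds a reference to alice

        StudentAssistant hired = hire(alice, 10);

        // The new object has a different identity ...
        System.out.println(hired == alice);
        // ... and existing references still point to the plain Student.
        System.out.println(seminar.get(0) instanceof StudentAssistant);
    }
}
```

Every reference to the old instance (here the seminar list) would have to be found and redirected by hand, which is exactly the bookkeeping the model silently demands.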

The reply I got was that this would not be needed, because one could simply keep the Student instance and additionally include it in the set of StudentAssistants (thus adding some properties to the existing object). The reply concluded by saying that this may not be supported by all OO languages, but that a terminology for OO modeling should have a broader scope in order to capture more real-world phenomena.

Skewed argumentation

This model is skewed in such a subtle way that many people will readily accept the following conclusions (my wording):

Generalization/Specialization is about modeling the real world, i.e., putting words into a graph which could be called a taxonomy.

In the real world, objects can change their classification, thus Gen/Spec should account for such dynamism too, which can be accommodated by considering classes as sets and instances as elements that can be added to/removed from sets.

A variant of Gen/Spec is multiple inheritance where a class can have multiple super classes.

If an object acquires new properties belonging to a class of which it previously was not a member, it should be re-classified to a sub class of its original class, where the sub class combines original and newly acquired properties (using multiple inheritance).

If you happen to use a programming language that does not support dynamic re-classification you are just not very lucky and need to work harder to make your program work.

Unfortunately, the designer of your favorite OO language will tell you that adding dynamic re-classification to his/her language will make any static type checking useless, because moving an object up in the inheritance hierarchy will result in illegally typed references (those still using the more specific type), with no chance that the type checker could give further assistance.

Mental Lock-In

From this line of reasoning people will naturally conclude that Gen/Spec and static type checking just don't match. Or, more generally, they will conclude that the world simply isn't perfect.

Alas, the solution is so simple. But once you've subscribed to the above reasoning, you're locked in, believing that this is already the best possible solution.

Proposed Solution

Readers knowledgeable in Object Teams certainly know the answer, but let me be very explicit for passers-by:

Using the instance_of relationship between an object and its class, GenSpec can be defined as the implication

  C Specialization_of B =
      o instance_of C => o instance_of B

If Specialization should be mapped to inheritance, and if inheritance should be useful for static type checking, we must require all involved relationships to be "rigid", i.e., once a Person, always a Person.

If, OTOH, Specialization in the broader sense should also include dynamic ("anti-rigid") relationships, we must add one link to the chain that supports dynamic addition/removal. In order not to spoil static type checking this link has to be a link between objects, not classes. We call this link "base" and the resulting relationship "Role_of" (or "playedBy" in Object Teams speak ;-) ) :

  R Role_of B =
      o instance_of R => o.base != null && o.base instance_of B

Now the object "o.base" is independent of the coming and going of role instances. We may still want to say: "Each R is a B" (each Student is a Person, each StudentAssistant is a Student) thus interpreting the relationship as a Specialization. But it's nothing that can soundly be mapped to inheritance!
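For passers-by without an Object Teams compiler at hand, the Role_of relationship can be approximated in plain Java. This is only a sketch under my own assumptions: it is not OT/J syntax, and the names StudentRole, University, enroll, exmatriculate and matriculationNo are hypothetical. The point it shows is that the role holds a non-null "base" link to an object whose own type never changes.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// The rigid part: once a Person, always a Person.
class Person {
    final String name;

    Person(String name) {
        this.name = name;
    }
}

// The anti-rigid part: R Role_of B, i.e. every role instance
// carries a non-null base of type B.
class StudentRole {
    final Person base;
    final String matriculationNo;

    StudentRole(Person base, String matriculationNo) {
        this.base = Objects.requireNonNull(base, "a role needs a base");
        this.matriculationNo = matriculationNo;
    }
}

// Hypothetical context that manages the coming and going of roles.
class University {
    private final Map<Person, StudentRole> students = new HashMap<>();

    StudentRole enroll(Person p, String matriculationNo) {
        StudentRole role = new StudentRole(p, matriculationNo);
        students.put(p, role);
        return role;
    }

    void exmatriculate(Person p) {
        students.remove(p);
    }

    boolean isStudent(Person p) {
        return students.containsKey(p);
    }
}

class RoleDemo {
    public static void main(String[] args) {
        Person bob = new Person("Bob");
        University uni = new University();

        StudentRole role = uni.enroll(bob, "12345");
        System.out.println(role.base == bob);   // same Person object as before

        uni.exmatriculate(bob);
        System.out.println(uni.isStudent(bob)); // role gone, bob untouched
    }
}
```

All references typed Person stay valid throughout, so the static type checker has nothing to complain about when a role is added or removed.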

Thus, in order to implement the two flavors of GenSpec we need two implementation relationships: a rigid and an anti-rigid one.

Saying GenSpec can be implemented using inheritance is only half the story. Once we refine our wording from "GenSpec" to "Inheritance" we have opted for a rigid relationship. If we don't mean that, we must refrain from inheritance and use roles instead.

Mental Lock-In revisited

The skewed argumentation presented above is dangerous because each single step seems to be (sufficiently) correct. Like Escher's endless staircase, each step of the argumentation convincingly takes you to the next step. At the end you have the uneasy feeling that something is not quite perfect, but since you "know" all steps were correct, you "know" that no better solution is possible. End of story.

(Exercise: can you name the little inaccuracies in the argumentation?)

Practitioner's view

When I presented a model of "Employee extends Person" to practitioners, I was told that the model is bullshit to begin with. Nobody would be so insane as to write such code. What followed was a description of tedious implementation details of how to maintain separate instances of Person and Job, with additional "Manager" entities safeguarding the consistency between instances, etc.

Good. It works. I promise you won't recognize any "is_a" relationship in this implementation.

Conclusion: modeling using GenSpec is for wimps only. Real programmers throw a handful of collections and managers at you to handle instances of several types that can directly be stored in an RDBMS.

Communication

Terminology is meant as a means to support communication. The above example shows how terminology can be used to preclude communication between concept enthusiasts and real programmers.

Remember: real programmers can write FORTRAN programs in any language, including the UML.

What's the converse? Perhaps: concept enthusiasts can define notions that are so broad that they span the whole universe and will prove any implementation attempt to be futile from the very outset.

Please: I don't mean to offend anybody. I just wish we could reset some bits in people's "understanding" so that they can appreciate the simplicity of a solution where conceptual relationships also respect a seemingly unrelated dimension: time. But can anything we do be unrelated to time?

Stephan