The World of Conceptual Data Modeling

The virtual world. 

The world of conceptual data modeling.

It has rules. Rules as undeniable and irrefutable as the laws of physics. 

There are ways that concepts can be assembled, and ways that they can't.

#SoftwareEngineering #DataModeling #Software #Data

Trouble is, most of us data modelers do not fully understand these rules. 

And so we keep trying to do magic.

It's important to understand these things about your data.

Is it a Reference Type?

Or is it a "scalar value" type? 

   The latter means that if an identical value of this conceptual type exists in two or more places, it exists as a copy (and, importantly, SHOULD exist as a copy).
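To make the distinction concrete, here is a minimal sketch in Python. The names (`Money`, `Customer`, `Order`, the in-memory `customers` dict) are made up for illustration and stand in for whatever your real store looks like. Each order carries its own copy of the scalar, while both orders point at the one shared customer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Money:
    """Scalar value type (hypothetical): defined entirely by its contents."""
    amount: int
    currency: str


@dataclass
class Customer:
    """Reference type (hypothetical): one shared thing, found by its id."""
    customer_id: str
    name: str


@dataclass
class Order:
    order_id: str
    customer_id: str  # reference: points at the single shared Customer
    total: Money      # scalar: each order holds its own copy


# Stand-in for a data store, keyed by identity.
customers = {"c-1": Customer("c-1", "Alice")}

o1 = Order("o-1", "c-1", Money(500, "USD"))
o2 = Order("o-2", "c-1", Money(500, "USD"))

# Changing the referenced customer is visible through every referrer.
customers["c-1"].name = "Alice Smith"
names = [customers[o.customer_id].name for o in (o1, o2)]
```

The two `Money` values compare equal but are separate copies, and nothing is lost by that. The customer, on the other hand, exists once, so the rename shows up through both orders.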

Is it a "primary" or "singular" concept, or is it a relation between two distinctly and independently referenceable concepts? 

Is it a "concrete" concept or an abstraction/type union? (This can apply to both singular and relation type concepts.)

"Child Entities" are a myth. At least, child entities that are "reference" type entities are a myth. If an entity is a reference entity, and can be directly referenced independent of its purported "parent" context, then it is an associated "peer" entity, not a "child" entity. True "child entities" must exist only as non-reference, "copy on read" types, because by definition they cannot be referenced or understood independent of their "parent" context.
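A rough sketch of that split, again in Python with invented names (`LineItem`, `Order`, `Product`, and the `products` dict are all assumptions for illustration). The line item has no identifier of its own and lives inside the order as an immutable value. The product is a peer, so the line item only holds its id.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class LineItem:
    """Child (hypothetical): no id of its own, meaningful only inside an Order."""
    product_id: str  # reference to a peer entity, not a copy of the product
    quantity: int


@dataclass(frozen=True)
class Order:
    order_id: str
    lines: Tuple[LineItem, ...]  # children are embedded as values


@dataclass
class Product:
    """Peer (hypothetical): independently referenceable, so it gets an id."""
    product_id: str
    name: str


products = {"p-1": Product("p-1", "Widget")}
order = Order("o-1", (LineItem("p-1", 2),))

# "Changing" a child means producing a new version of the parent;
# there is no shared child object for anyone else to be pointing at.
revised = replace(order, lines=(LineItem("p-1", 3),))
```

If something else ever needs to point directly at a line item, that is the signal that it was never a child in the first place.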

Peer Associations are Real. Don't hide them by copying something that SHOULD be referenced into every entity that references it and calling those observers "documents" (I'm looking at you, Mongo...).

Mongo is fine. And in many ways preferred, because we should not enforce foreign key constraints (doesn't scale). But it perhaps learned a lesson too well. We need to not forsake the concept of a "relationship" entity. The problem we needed to get away from was foreign keys.
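One way this could look, sketched in Python with hypothetical names (`Student`, `Course`, `Enrollment`, `resolve_course`). This is only an illustration of a relationship entity whose references are not enforced at write time, not a recommendation of any particular store. The relation is its own record holding two ids, and the reader decides what to do when a target is missing.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Student:
    student_id: str
    name: str


@dataclass
class Course:
    course_id: str
    title: str


@dataclass(frozen=True)
class Enrollment:
    """Relationship entity (hypothetical): links two independent peers."""
    student_id: str
    course_id: str
    role: str


students: Dict[str, Student] = {"s-1": Student("s-1", "Alice")}
courses: Dict[str, Course] = {"c-1": Course("c-1", "Data Modeling")}

# Nothing checks these ids on write; the second one points at a missing course.
enrollments: List[Enrollment] = [
    Enrollment("s-1", "c-1", "auditor"),
    Enrollment("s-1", "c-404", "credit"),
]


def resolve_course(enrollment: Enrollment) -> Optional[Course]:
    """Look up the referenced course, tolerating a dangling reference."""
    return courses.get(enrollment.course_id)


resolved = [resolve_course(e) for e in enrollments]
```

The relationship is still a first-class thing you can query and attach data to (`role` here). What went away is the constraint, not the concept.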

We also needed to not forsake the concepts of polymorphism, concept abstraction, and inherited obligation: declaring the burdens that an entity must bear and support if it is to satisfy a given abstract need.
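As a small illustration of "inherited obligation", here is a sketch using Python's standard `abc` module. The domain names (`Payable`, `Invoice`, `Subscription`, `Memo`, `total_due`) are invented for the example. The abstraction declares the burden, and any concrete type that wants to satisfy the abstract need has to carry it.

```python
from abc import ABC, abstractmethod
from typing import Iterable


class Payable(ABC):
    """Abstract need (hypothetical): anything that can be asked what it owes."""

    @abstractmethod
    def amount_due(self) -> int:
        """The obligation every concrete Payable must support."""


class Invoice(Payable):
    def __init__(self, total: int) -> None:
        self.total = total

    def amount_due(self) -> int:
        return self.total


class Subscription(Payable):
    def __init__(self, monthly: int, months: int) -> None:
        self.monthly = monthly
        self.months = months

    def amount_due(self) -> int:
        return self.monthly * self.months


class Memo(Payable):
    """Claims to be Payable but never takes on the obligation."""


def total_due(items: Iterable[Payable]) -> int:
    # Works across the whole type union without knowing the concrete types.
    return sum(item.amount_due() for item in items)


grand_total = total_due([Invoice(100), Subscription(10, 3)])
```

Trying to instantiate `Memo()` raises a `TypeError`, because it inherits the obligation without fulfilling it. That refusal is the point: the abstraction is a contract, not a label.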

These are hard things. Good data is hard, and most engineers do not know these things. 

Most engineers assume that we can do whatever we want with data. Model it however we want. Stuff it in any box we want. 

This is "magic" and it will fail at scale. 

I need to add that there is a place for foreign keys, ACID compliance, pessimistic locking, and such, but typically not in externally accessible (directly accessible) data.

This gets into another topic, though: separating asynchronous, business-domain processing compute from state-transfer, data-I/O-heavy compute. The same goes for schema and data store choices.


Oh, and scalars are Fungible Tokens. Reference types are Non-Fungible Tokens.
