Technology  >  Data Modeling Rules

Data Modeling Rules (DMR)

We defined the following Data Modeling Rules (DMR) with justification by Project Objectives (PO) and Project Principles (PP).

DMR 1 Never reference (FK) a meaningful value (PK/UK).
When primary key (PK) values or unique key (UK) values in a parent record are referenced by a foreign key (FK) in a child table, the PK/UK-value in the parent key can no longer be updated. This introduces an unacceptable penalty for end users: When they make typos when entering PK/UK-value in the parent table, the typos can't be corrected afterwards. The resolution is to "never reference (FK) a meaningful value (PK/UK)".
DMR 2

One meaningless column (with PK) and never changing value for each table. Since we can't reference a meaningful value (DMR 1), we need 1 meaningless column PK for each table. This is the only column to be used for referencing FK's.
Justification: PP 9

DMR 3 n meaningful columns (with UK) for each table. This implements meaningful business UK's.
If more meaningful business keys are available, implement one UK for each meaningful business key.
Justification: PP 4
DMR 4

Recordsets (tables, views etc.) get plural-name since they can have multiple records. The (meaningless) PK-column is alway's on the first postion en gets the singular name of the recordset.
Example:
create table cities
( city raw(16)     default sys_guid() not null primary key
name varchar2(80) not null unique
);

DMR 5

Choose sys_guid() instead of sequences for populating meaningless PK-columns.
Traditional sequences are used for populating an ID-column.
Sequences have several disadvantages:
• Sequences have to be created, maintained and deployed. This violates PP 4 (MAWND)
• Sequences need PL/SQL-code to be fetched (the ‘select into’ construction is obsolete in 11g, but defaulting a column with sequence.nextval still does not work). Again, maintaining PL/SQL violates PP 4 (MAWND)
• When you introduce triggers to fetch the .nextval value, you introduce significant undesired performance penalties
• When moving your data from data from one environment to another (e.g. copy PRODUCTION data to ACCEPTANCE), you have to synchronize the .LASTVAL of all sequences. Synchronizing does not conform to PP3
• Fetching sequences can be a performance-critical in big environment (bulk fetch in Data warehouse for instance) which has to be fixed by increasing the CACHE property
Use guid's to overcome the disadvantages of sequences:
create table employees
( employee raw(16) default sys_guid() not null primary key);

Justification: PP 2

DMR 6 Self documenting data-model
Applies to data model & software. Maximize the effort in choosing only the best table and column names, describing exactly what they mean. Consequently follow your naming conventions.
Justification: PP 2
DMR 7

Use unambiguous semantics in data model.
Consider the following parent/child tables:
create table cities
( city raw(16)     default sys_guid() not null primary key
, name varchar2(80) not null unique
);
The city table is referenced 3 times (all different meaning) by table persons
create table persons
( person     raw(16)      default sys_guid() not null primary key
, name       varchar2(80) not null unique
, born_in    raw(16)      references cities
, working_in raw(16)      references cities
, living_in  raw(16)      references cities
);
By embedding the semantic of FK-relation ship in the FK-column name, the semantics of the data model cannot be misunderstood. What we often see is that the semantics are hidden (redundant) in comments, documents and multiple software components. (Conflicts PP4 and PP7)
Restrict the usage of comments for those cases where the semantics are not 100% covered by table and column names. (Justifcation:PP7)
Self documentation is better then hiding the documentation in separate documents that are often not maintained consequently, if used and maintained at all.
Justification: PP 2

DMR 8

Use defaults where possible.
Defaults make the writing of PL/SQL unnecessary and keep the insert statements
simpler and more compact.
Example:
create table employees
( employee         raw(16)      default sys_guid()   not null primary key
name             varchar2(80)                      not null unique
, create_timestamp timestamp    default systimestamp not null
, create_user      varchar2(30) default sys_context('USERENV','OS_USER')||'@'||userenv('TERMINAL')                                         not null
);
(Justifcation:PP3,PP6)

DMR 9 Use guid-value tables in stead if code-value tables.
(Justifcation:DMR1)
DMR 10 Avoid generic guid-value tables (Justifcation:PP4, PP5)
For understanding generic guid-value tables you need additional semantic knowledge to understand/implement generic guid-value tables which violates (DMR6)
DMR 11 Use Oracle 11g virtual columns (Justifcation:PP6)
Use Oracle 11g virtual columns instead of views if possible
DMR 12

Used hard-coded (in stead of soft-coded) allowable values for domains when::
1) you are sure allowable values never change e.g. ('Yes','No') or ('Monday','Tuesday','Wednesday','Thursday','Friday','Saturday','Sunday')
2) Software relies on the allowable values e.g. ('Succeeded','Failed')
Justifcation:PP4, PP6
In all other cases, use soft coded implementation using a GUID-VALUE table per domain.
Background: when you soft code the allowable values of a domain, while software relies on it, soft coding gives suggestion that extra allowable value can easily be added (true) but of course you don't get the new software, matching with the new allowable value for free.

DMR 13 All meaningful UK-columns in the application should use the same column-name, preferably "NAME" (Justifcation:PP9)
This increases the application consistency.
The column-name "NAME" should be chosen as the default unless a better (more meaningful) column name is available.
DMR 14

One application parameter table, one column for each parameter (PP4, PP5)
create table system_parameters
system_parameter raw(16)      default '1AF787E2EF084C51A064490B8D30D2EC' not null
, log_level        varchar2(11) default '0-Emergency' not null
, parameter_xyz    integer      default 7             not null
, parameter_abc    date         default sysdate
, parameter_def    number       default 3.14
);
Use database constraints per parameter:
alter table system_parameters add(
constraint sys_par_one_row_chk
check(id in('1AF787E2EF084C51A064490B8D30D2EC')));
alter table system_parameters add(
constraint sys_par_log_level_chk
check(log_level in('7-Debug','6-Info','5-Notice','4-Warning','3-Error','2-Critical','1-Alert','0-Emergency')));
alter table system_parameters add(
constraint sys_par_parameter_xyz_chk
check(parameter_xyz between 6 and 9 )); 

DMR 15

One journaling table per application. (PP3)
Only purpose: Be able to simply query record-history for gui.
Journaling table is not suitable for querying historical data
Only journal changed columns
Much easier to build a query showing the changed column values
Use list partitioning on the journal table, one per base-table
create table application_journal
( application_journal raw(16)      default sys_guid()   not null primary key
, dml_timestamp       timestamp    default systimestamp not null
, dml_user            varchar2(30) default user         not null
, dml_type            varchar2(6)                       not null 
                      check(dml_type in('Insert','Update','Delete'))
, owner_name          varchar2(30)                      not null
, table_name    
      varchar2(30)                      not null
, column_name         varchar2(30)                      not null
, row_guid            raw(16)                           not null
, old_value           varchar2(4000)
, new_value           varchar2(4000)
, reason              varchar2(4000)

);
alter table application_journal add
constraint application_journal_uk1
unique(owner_name, table_name, column_name, row_guid, dml_timestamp)
enable validate;
Note: Column “Reason” can be used (automatically) for logging Business Rules in case of change-event rules

DMR 16 Use Oracle Flashback/Total Recall for querying historical data. (PP8)
Don’t build this yourself by introducing ”valid from” and ”valid_to” column, just use of the shelf Oracle Flashback/Total Recall solution.
DMR 17 Don’t use public synonyms. (PP9)
Strict object naming conventions for all objects ensure avoiding naming conflict in the public namespace. Grants implement security. Only if an object is very generic, a public synonym could be considered to minimize typing in the objects that reference the generic, public object. In such a case the public synonym name and object name should be 1:1.
DMR 18 Don’t use private synonyms. (PP 2)
DMR 19 Sequences are only allowed for generating quasi-meaningful data
The sequence generates a number that is meaningless at create time but gain’s meaning during it’s lifetime like account-numbers, client-numbers etc.
DMR 20 Apply all known constraints to all tables and views (PP4)
For documentation and to help the optimizer to do clever tricks, particularly in the area of query rewrite.
DMR 21 Don’t spoil valuable character positions
Oracle has 30 positions for embedding semantics in object-names. Don’t spoil these valuable positions with strings like ‘_PCK’ for packages or ‘_TRG’ for TRIGGERS. all_objects.object_type holds this information for you and tools like Toad use that information to properly group your objects
DMR 22 Don’t use entity-model (PP 2)
For discussing relational information model with client/end-user use the information model (=datamodel) together with representative data and an automatic generated gui. The automatic generated gui must incorporate all database constraints, hide all technical details like meaningless guid-columns and automatically replace meaningless values (PK) by meaningful values (UK). The disadvantage of using tooling like Oracle designer your business rules are defined on entities so you have no other choice but to maintain both the entity-model and data-model (manual redundancy) resulting in violation of PP 2.

 

 
Copyright (c) 2013 www.datajini.com