Data Modelling Patterns

From Glitchdata
Jump to navigation Jump to search

Relational Schema Patterns

Standard Relational Schema Patterns, ER designs. Some 6 design patterns for Relational database are:

  1. First Normal Form (1NF)
    • includes definition of a relation
  2. Second Normal Form (2NF)
    • defined in terms of functional dependencies
  3. Third Normal Form (3NF)
    • defined in terms of functional dependencies
    • Most OTLP systems are in 3NF
  4. Boyce-Codd Normal Form (BCNF)
    • defined in terms of functional dependencies
  5. Forth Normal Form (4NF)
    • defined using multivalued dependencies
  6. Fifth Normal Form (5NF) or Project Join Normal Form (PJNF)
    • defined using join dependencies

OLTP Patterns

Online Transactional Processing (OLTP) systems are characterized by a large number of short on-line transactions (INSERT, UPDATE, DELETE). The emphasis in OLTP Systems is on efficient and high-throughput query processing, strong data integrity in a multi-access environment. OLTP Systems are measured by number the number of transactions per second. OLTP database store detailed and current data. Transactional databases typically use the 3NF entity model. Master Data is typically stored in OLTP systems. Consider OTLP vs OLAP

  • Insert Only Pattern (eg. History Tables)

OLAP Patterns

Online Analytical Processing (OLAP) are the methods used to design systems and schemas widely used in Data Mining systems, and Business Intelligence. OLAP is characterized by relatively low volume of transactions. However, queries are often very complex and involve aggregations across multiple datasets. The performance of OLAP Systems is measured by response times. (to generate a report). The typical OLAP database stores aggregated, post-processed, historical data in a multi-dimensional schema. (usually star schema, snowflake). Consider OTLP vs OLAP

A common short-cut is to join all relational tables together into one large table, frequently resulting in a huge view/table. For mid-sized data warehouses, this may be feasible, but for very large data-warehouses, data marts are needed.

Data cubes are then created from this.

There are variants of OLAP. These are:

Star & Snowflake Schema Patterns

Spatial Schema Patterns

There are a variety of spatial database schemas available. This data is stored differently from Spatial file-based systems.

Predictive Analytics Schema

The ability to store simulation and forecast data. This is different from when forecasted data may be replaced with actual future data.

Ontology Schema Patterns

Ontology is the science of discovering relationship between entities. In this case data.

Several identified methods are:

  • String-based
  • Language-based
  • Linguistic-based (lexicons, thesaurus)
  • Constraint-based
  • Alignment-reuse
  • Upper-level (logic-based ontologies)
  • Graph-based
  • Taxonomy-based
  • Repository of structure-based
  • Model-based

Ontology schemas are developed with an industry in mind. Semantics depends on ontology to derive meaning.

Ontology Schema Patterns

  • ontology
  • owl_instances (value?)
  • owl_classes (needs subclasses)
  • owl_properties
  • owl_operators

Provenance Schema Patterns

Ways to determine how, when a certain dataset was derived. Provenance use caveats.

NoSQL Patterns

Here are some reusable Data Schemas

Process Management Schema

Ways of managing processes within an application environment. Consider design patterns:

Search Engine Schema

Data structures related to search

User Profile Schema

Standard user profile schemas for multi-user systems.

Spatial Schemas

  • See Oracle Spatial

Dataset Schemas

Specific schemas related to a dataset are noted with individual datasets. This is typically dependent on the Industry involved.