Data Integration
Data Integration centers on the transport of data across an organisation. This is generally performed by ETL (Extract, Transform, Load) tools.
- Data Ingestion / Data Extraction
- Data Transformation
- Data Loading
- Data Flow
- Process
- ETL Selection Criteria
- Data Integration Technology
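The extraction, transformation and loading stages listed above can be sketched as three small functions. This is only an illustrative sketch using the Python standard library: the `SOURCE_CSV` data, the `sales` table, and the function names are assumptions made for the example, not part of any particular ETL tool.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real job would read from a file, API, or database.
SOURCE_CSV = """id,name,amount
1, alice ,10.50
2,Bob,3
3,carol,7.25
"""

def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: cast types and normalise the name column."""
    return [
        (int(r["id"]), r["name"].strip().title(), float(r["amount"]))
        for r in rows
    ]

def load(conn, records):
    """Load: write the transformed records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(id INTEGER PRIMARY KEY, name TEXT, amount REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO sales VALUES (?, ?, ?)", records)
    conn.commit()

def run_etl(conn, text):
    """Run the three stages in order."""
    load(conn, transform(extract(text)))

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    run_etl(conn, SOURCE_CSV)
    print(conn.execute("SELECT * FROM sales ORDER BY id").fetchall())
```

Keeping each stage as a separate function makes it possible to test and replace the stages independently, which is roughly what dedicated ETL tools do at larger scale.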
Checks
- Data Cleansing
- Data Formats
- Errors - e.g. missing records
- Error Tolerance - at what point should an ETL job abort?
- Memory Usage
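One possible way to express an error tolerance check is to reject invalid records individually and abort the job only when the rejected share of a batch exceeds a threshold. The sketch below assumes a hypothetical validation rule (non-empty `id`, numeric `amount`) and a hypothetical `max_reject_ratio` parameter; real tolerances depend on the business requirements.

```python
class EtlAborted(Exception):
    """Raised when the share of rejected records exceeds the tolerance."""

def validate(record):
    """Hypothetical rule: a record needs a non-empty id and a numeric amount."""
    if not record.get("id"):
        return False
    try:
        float(record["amount"])
    except (KeyError, TypeError, ValueError):
        return False
    return True

def check_batch(records, max_reject_ratio=0.1):
    """Split a batch into good and rejected records; abort if too many fail."""
    good, rejected = [], []
    for record in records:
        (good if validate(record) else rejected).append(record)
    ratio = len(rejected) / len(records) if records else 0.0
    if ratio > max_reject_ratio:
        raise EtlAborted(f"{len(rejected)} of {len(records)} records rejected")
    return good, rejected
```

Rejected records are returned rather than discarded so that they can be logged or routed to a separate error table for later inspection.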
Challenges
- Data transfer volumes are growing exponentially.
- Disparate sources of data are becoming commonplace.
- ETL processes have to process large amounts of OLTP data.
- Some ETL processes are smarter and update incrementally, but generally this is not good enough.
- BI data structures are varied. ETL requirements are different for data warehouses, data marts, and for specific visualisation needs (e.g. analysis, reporting, dashboarding, scorecarding).
- Transformation needs are getting more complex.
- Data needs to be aggregated, parsed, computed, and statistically processed.
- BI is tending towards real time, so ETLs have to refresh data warehouses and data marts more frequently and within a smaller load time window.
- The off-peak ETL window is getting increasingly small.
- Incremental ETL transfer is becoming commonplace.
- Real-time ETLs are nice to have.
- Data Quality
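Incremental transfer, mentioned in the challenges above, is often implemented with a "high-water mark": each run copies only the rows changed since the newest timestamp already loaded. The following is a minimal sketch under stated assumptions: both databases are SQLite, the `orders` table and its `updated_at` column (ISO 8601 text) are hypothetical, and the source reliably updates `updated_at` on every change.

```python
import sqlite3

def incremental_load(source, target):
    """Copy only source rows modified since the last successful run."""
    target.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(id INTEGER PRIMARY KEY, total REAL, updated_at TEXT)"
    )
    # The watermark is the newest timestamp already present in the target.
    (watermark,) = target.execute(
        "SELECT COALESCE(MAX(updated_at), '') FROM orders"
    ).fetchone()
    # ISO 8601 timestamps stored as text compare correctly as strings.
    changed = source.execute(
        "SELECT id, total, updated_at FROM orders WHERE updated_at > ?",
        (watermark,),
    ).fetchall()
    # INSERT OR REPLACE overwrites rows that already exist in the target.
    target.executemany("INSERT OR REPLACE INTO orders VALUES (?, ?, ?)", changed)
    target.commit()
    return len(changed)
```

This approach shrinks the load window because unchanged rows are never transferred. It does not detect deleted rows, and the strict `>` comparison can miss rows that share the watermark timestamp but are written after a run, so production jobs usually need additional safeguards.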
Documentation
- ETL Selection Criteria
- ETL Architecture
- ETL Business Requirements
- ETL Functional Specifications
- ETL Technical Specifications
- Data Integration Selection Criteria
- Data Integration Architecture
- Data Integration Business Requirements
- Data Integration Functional Specifications
- Data Integration Technical Specifications