Data Provenance

Data Provenance loosely refers to where the data came from. This includes the source application, database type, database instance, database name, schema name, table name, and attribute name. More specifically, Data Provenance refers to the first instance of that data, or its source. Definition from DataVersity, quoting W3C: “Provenance is defined …
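As a rough illustration of the fields listed above, a provenance record could be modeled as a small immutable structure. This is only a sketch: the class name `ProvenanceRecord`, the `qualified_name` helper, its dotted-path convention, and every sample value are assumptions made for this example, not part of any standard or of the DataVersity/W3C definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProvenanceRecord:
    """Hypothetical record of where a single attribute originated."""
    source_application: str
    database_type: str
    database_instance: str
    database_name: str
    schema_name: str
    table_name: str
    attribute_name: str

    def qualified_name(self) -> str:
        # Dotted path from instance down to attribute (an assumed convention).
        return ".".join([
            self.database_instance,
            self.database_name,
            self.schema_name,
            self.table_name,
            self.attribute_name,
        ])


# All values below are made up for illustration.
record = ProvenanceRecord(
    source_application="billing-app",
    database_type="PostgreSQL",
    database_instance="pg-prod-01",
    database_name="billing",
    schema_name="public",
    table_name="invoices",
    attribute_name="invoice_total",
)
print(record.qualified_name())
```

Freezing the dataclass reflects the idea that the origin of a piece of data should not change after it is recorded.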

Change

Assumptions about Change when working with data: Always be flexible, because technology and vendors are constantly changing and evolving. Be a life-long learner: if you are not keeping up with the advancements in your industry, technology, and tools, you are being left behind.

Object Stores vs. RDBMS (Relational Database Management Systems)

Object Stores are better because: Cloud object storage (AWS S3, Azure Blob Storage, Google Cloud Storage) is widely used and has well-defined standards and protocols. Object Storage has the best cost structure of any data storage mode/method. Object Storage allows separating Data Storage from Compute, which eliminates inefficiencies when both share a …

Code

Assumptions about Code when working with data: Owning your code is better than renting, licensing, or subscribing to someone else’s code. Code built to perform an operation like ETL, based on patterns and driven by property files and parameters so that the same operation repeats across thousands of tables/files/objects, is more consistent and efficient, easier to …
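The pattern-driven approach described above might look something like the following minimal sketch, in which a single copy routine is applied to every table named in a property file. The property keys (`source_table`, `target_table`, `columns`), the table names, and the functions `load_table` and `run_all` are all hypothetical, and an in-memory SQLite database stands in for real source and target systems.

```python
import configparser
import sqlite3

# Hypothetical property file: one section per table, same keys for every table.
PROPERTIES = """
[customers]
source_table = src_customers
target_table = stg_customers
columns = id, name

[orders]
source_table = src_orders
target_table = stg_orders
columns = id, customer_id, amount
"""


def load_table(conn, source_table, target_table, columns):
    """Apply one copy pattern to any table described by its parameters."""
    cols = ", ".join(c.strip() for c in columns.split(","))
    # Identifiers come from a trusted property file in this sketch, not user input.
    conn.execute(
        f"CREATE TABLE IF NOT EXISTS {target_table} AS "
        f"SELECT {cols} FROM {source_table} WHERE 0"
    )
    conn.execute(
        f"INSERT INTO {target_table} ({cols}) SELECT {cols} FROM {source_table}"
    )
    return conn.execute(f"SELECT COUNT(*) FROM {target_table}").fetchone()[0]


def run_all(conn, properties_text):
    """Run the same load pattern for every section in the property text."""
    config = configparser.ConfigParser()
    config.read_string(properties_text)
    return {
        name: load_table(conn, **dict(config[name]))
        for name in config.sections()
    }


# Demo data, made up for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE src_customers (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE src_orders (id INTEGER, customer_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO src_customers VALUES (?, ?)", [(1, "Ada"), (2, "Lin")])
conn.executemany("INSERT INTO src_orders VALUES (?, ?, ?)", [(10, 1, 25.0)])

row_counts = run_all(conn, PROPERTIES)
print(row_counts)
```

Adding another table in this sketch means adding a section to the property text rather than writing new code, which is the consistency argument the post makes.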

Data Platform Vendor Agnostic

Assumptions about Platform/Vendor Agnostic standards, structures, tools, and languages: Open standard formats, structures, tools, and languages are better than proprietary, vendor-specific formats, structures, tools, and languages. The ability to move data, code, workflows, processes, and solutions between vendors and platforms without re-architecting or lengthy migration efforts provides the flexibility needed to consider and change …

Maximize Automation

Guiding Principle: Maximize Automation – Human Intervention is Inconsistent and Error Prone. Processes are not always started at the scheduled time, because vacations, illness, job changes, and personal availability affect who can kick off the process. Humans often miss key values or make data entry mistakes when choosing date ranges, account ranges, well identifiers, …
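One way to take a hand-entered date range out of the loop is to derive it from the run date. The sketch below assumes a job that always processes the previous calendar month; that scheduling rule and the function name `previous_month_range` are assumptions for illustration, not something the post specifies.

```python
from datetime import date, timedelta


def previous_month_range(run_date: date) -> tuple[date, date]:
    """Return the first and last day of the month before run_date."""
    first_of_this_month = run_date.replace(day=1)
    # Stepping back one day from the 1st lands on the last day of the prior month.
    last_of_previous = first_of_this_month - timedelta(days=1)
    return last_of_previous.replace(day=1), last_of_previous


# A scheduler would normally pass date.today(); a fixed date keeps the demo repeatable.
start, end = previous_month_range(date(2024, 3, 15))
print(start, end)
```

Because the range is computed, month lengths and leap years are handled the same way on every run, with no one typing dates.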
