At the Teradata conference, one of the items we discussed was metadata and best practices. We started off with a great example of what metadata is: the iPod. In the analogy, the iPod is the warehouse, the music and video are the data, and the listing and structure that let you access the music and video are the metadata.
With that out of the way, the real challenge has always been integrating the disparate systems that hold the valuable metadata we need to make our lives better from an analysis, support and insight point of view.
There are three types of metadata you want to track and leverage:
- Technical – the stuff inside the systems
- Business – business rules and the like
- Lineage – when, where, how the data got there and what has happened to it since it arrived.
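To make the three types a little more concrete, here is a minimal sketch of how one data element might carry all three kinds of metadata. The names (`MetadataKind`, `MetadataEntry`, `by_kind`) and the `sales.order_total` example are made up for illustration; nothing here comes from the conference material.

```python
from dataclasses import dataclass
from enum import Enum


class MetadataKind(Enum):
    TECHNICAL = "technical"  # the stuff inside the systems
    BUSINESS = "business"    # business rules and the like
    LINEAGE = "lineage"      # how the data got there and what happened since


@dataclass
class MetadataEntry:
    subject: str  # hypothetical: the table or column being described
    kind: MetadataKind
    detail: str


def by_kind(entries, kind):
    """Return only the entries of the requested metadata kind."""
    return [e for e in entries if e.kind is kind]


# Illustrative entries for a single, invented column.
entries = [
    MetadataEntry("sales.order_total", MetadataKind.TECHNICAL, "DECIMAL(12,2), not null"),
    MetadataEntry("sales.order_total", MetadataKind.BUSINESS, "Includes tax, excludes shipping"),
    MetadataEntry("sales.order_total", MetadataKind.LINEAGE, "Loaded nightly from the order system"),
]

lineage = by_kind(entries, MetadataKind.LINEAGE)
```

The point is only that the same subject usually has entries of all three kinds, and they tend to live in different systems.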
So we talked about 4 architectures that they see:
- Bridged Approach:
Leverage the data warehouse and the metadata repositories as the bridge and centralized view onto the disparate sources (copy the metadata to the EDW). To do this you build APIs that connect to outside data, tools, applications and glossaries. From the discussion, this appears to be the most popular.
- Physical Integration
- Integrated View:
Here you leverage a portal strategy to build feeds from the outside source metadata repositories. This is the one Project X leans toward most.
The challenge with all of these is that the best place to update and manage each individual metadata source is at that source, so it is never truly a symbiotic relationship.
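A rough sketch of that challenge, contrasting the bridged approach (copy the metadata centrally) with the integrated view (read from the sources at query time). All class names and the `etl_tool` source are hypothetical, and real repositories would sit behind vendor APIs rather than in-memory dictionaries; this is an illustration of the trade-off, not how any particular product works.

```python
class SourceRepository:
    """Hypothetical stand-in for one tool's own metadata repository."""

    def __init__(self, name, entries):
        self.name = name
        self.entries = dict(entries)

    def export(self):
        # Hand out a snapshot copy, as a bridge load would receive.
        return dict(self.entries)

    def lookup(self, key):
        return self.entries.get(key)


class BridgedStore:
    """Bridged approach: copy each source's metadata into one central store."""

    def __init__(self):
        self.copies = {}

    def load(self, source):
        self.copies[source.name] = source.export()

    def lookup(self, key):
        return {name: copy[key] for name, copy in self.copies.items() if key in copy}


class IntegratedView:
    """Integrated view: a portal that reads the sources at query time."""

    def __init__(self, sources):
        self.sources = list(sources)

    def lookup(self, key):
        results = {}
        for source in self.sources:
            value = source.lookup(key)
            if value is not None:
                results[source.name] = value
        return results


etl = SourceRepository("etl_tool", {"order_total": "sum of line items"})
bridge = BridgedStore()
bridge.load(etl)
portal = IntegratedView([etl])

# The definition is then maintained where it belongs: at the source.
etl.entries["order_total"] = "sum of line items plus tax"

stale = bridge.lookup("order_total")  # still the copied definition
live = portal.lookup("order_total")   # reflects the source update
```

The bridged copy stays out of date until the next load, while the portal sees the change immediately but depends on every source being reachable. Either way, the edit itself still happens at the source.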