You’ll know the world is a changed place when dictionaries sound exciting, when encyclopedias are required reading, when watching paint dry is made an Olympic event, and when metadata becomes popular.
"Not in my lifetime," you say. Well, think again, because metadata, that data-about-the-data thing you’ve been ignoring all these years, is coming into its own.
Metadata? It’s the blank form before you fill it out. It tells you what information to store and where to find it later. It’s the stuff no one thinks about until someone asks: "What does that mean exactly?"
We’ve ignored metadata for so long that it’s one big mess to clean up. It’s one of those jobs you’ll have trouble starting because you don’t know where to start. But now you’ve got vendors at your door waving "metadata products" in your face, and users screaming that they can’t find anything in your corporate databases because there’s no map, no directory, nothing to make the next big leap toward data nirvana: the self-serve information and knowledge management environment. As of this writing, none of these products meets expectations, because no product can be all things to all people. Data is so pervasive and resides in so many different formats that no one product will do it all.
Add to that the troubling fact that the data is scattered all over the place: on people’s hard drives, or in small databases that are never backed up. Or worse, you hear, "We keep that critical corporate data in a series of spreadsheets."
How do we identify, manage, administer, steward and govern this environment? Welcome to the information age. Metadata is to the information age what the Dewey Decimal System and the card catalogue are to library science. Looking for a book in your local library? Check the card catalogue before you enter the stacks. Looking for a piece of relevant data in the corporate data store? Check the metadata.
Like the card catalogue, if it’s done right, the effort to maintain the metadata becomes invisible to the process: a simple lookup instead of hours sifting through the volumes. Metadata will become the massive enabler of business and technical users.
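To make the card-catalogue comparison concrete, here is a minimal Python sketch of what "check the metadata" could look like. Everything in it is hypothetical: the catalogue entries, the system and table names, and the find_data helper are invented for illustration and don’t describe any particular product.

```python
# Hypothetical metadata catalogue: each entry describes a data element,
# not the data itself (just as a catalogue card describes a book).
CATALOGUE = {
    "customer_postal_code": {
        "description": "Postal code of the customer's mailing address",
        "system": "crm_database",      # assumed system name
        "table": "customer_address",   # assumed table name
        "steward": "Customer Care",
    },
    "mortgage_balance": {
        "description": "Outstanding mortgage principal, in dollars",
        "system": "loans_database",    # assumed system name
        "table": "mortgage_account",   # assumed table name
        "steward": "Lending",
    },
}


def find_data(keyword):
    """Return (name, location) pairs for elements that mention the keyword."""
    keyword = keyword.lower()
    hits = []
    for name, entry in sorted(CATALOGUE.items()):
        if keyword in name.lower() or keyword in entry["description"].lower():
            hits.append((name, entry["system"] + "." + entry["table"]))
    return hits


# One lookup tells you where to go, before you touch the data itself.
print(find_data("mortgage"))
```

The point of the sketch is the shape of the thing: the catalogue holds descriptions and locations, so the search never has to open the databases it describes.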
Take yourself, for example. You’ve got a first name and a last name, a street number, a street name, a postal code, a telephone number or two and maybe a cell phone number or two. Gender, height, weight, girth (several of those, at different heights), hair colour or no hair. Job title, bank account numbers, charge card numbers, license plate number(s), and a spouse who has all those things and can be associated with you. You might smoke, have a mortgage balance, be a diabetic, own one North American car and one European car. You might be a member of a political party, have a religious affiliation, belong to a club or two, have recreational interests outside work, and on and on.
And that’s just people metadata. What about products? Everything you buy has characteristics: a function, weight, dimensions, uses, colour, texture, and on and on. More and more metadata.
And all of this information is of interest to people who want to sell you more stuff. The people who want to sell you stuff don’t want to waste time trying to sell you stuff you are not going to buy. That costs them money. They want to know you, know what stuff you have and what stuff you don’t. Then they can figure out what you might buy and try to sell it to you.
All that takes data. All that data requires metadata. Lots of metadata. Who’s going to sort this mess out? Well, that’s where the big mess I spoke of earlier comes in. Right now, as a general rule, few of us are sorting this mess out. In most organizations a form of natural selection has taken hold. By this I mean that for any given business fact, a number of groups within an organization will each maintain a copy of that fact. Each will use that fact for its own purposes if it agrees with the group’s assumptions about the business. When a fact does not agree, or stops agreeing, with commonly held assumptions, it is ignored (selected against, in Darwinian terms). There is very little scientific method applied in business today, but there is a lot of natural selection going on: "These are the numbers I have, they’ll do."
To end on a pessimistic note, as if this article wasn’t pessimistic enough, I don’t see this situation changing any time soon. With so many versions of the truth out there, people will in general continue to be deluded by their own set of comfortable facts and will ignore, or be unaware of, correct information because they lack the skills to recognize it and turn it into meaningful information. Add political agendas to that mess and no one will support the concept of corporate-wide metadata as viable.
What I’m predicting is that it will be a few years before corporations get their metadata house in order. The lack of effective methods for dealing with disparate data stands in the way.
Another metadata analogy. If metadata is giving you a headache, consider the following. As you read the label on your Aspirin or Advil bottle, realize that what you are reading is meta-information about the contents of the bottle. The pills are the object of interest. The label on the bottle tells you what the pills are and how they can be used.
In the same way, we go looking for data that we can use to relieve our business intelligence pains. Before using the data, we need to understand how it can be used as information. Metadata (the data’s label) can tell us this.
The classic definition of metadata is: It’s the data about the data. This description always confused me.
The best analogy I’ve heard is this: in the real world, when we walk into a bank and apply for a loan, we are given a form to fill out. The bank could provide us with a blank piece of paper and ask us to supply all the relevant information, but instead they give us the form and ask that specific information be placed in specific boxes on the form. The information we enter into the boxes is the data; the labelled boxes on the application form are the metadata. They describe what information is inside each box.
Now, the form is an easy example because it’s fairly simple and relatively static. But in the data world there is a lot of information about data elements that needs to be captured. This is metadata. It is used to define and/or interpret the data in the database.
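The form analogy translates almost directly into code. The Python sketch below is only an illustration under assumed names: the LOAN_FORM fields, the sample application and the two helper functions are hypothetical, not taken from any real bank or database. The list of labelled boxes is the metadata; the filled-in values are the data.

```python
# Metadata: the labelled boxes on a hypothetical loan application form.
LOAN_FORM = [
    {"name": "last_name", "type": str, "label": "Applicant's last name"},
    {"name": "postal_code", "type": str, "label": "Postal code of home address"},
    {"name": "loan_amount", "type": float, "label": "Amount requested, in dollars"},
]

# Data: what one (made-up) applicant wrote in the boxes.
application = {"last_name": "Smith", "postal_code": "K1A 0B1", "loan_amount": 25000.0}


def describe(form, record):
    """Pair each value with the label that explains what it means."""
    return [f"{field['label']}: {record[field['name']]}" for field in form]


def problems(form, record):
    """List boxes that are empty or hold the wrong kind of value."""
    issues = []
    for field in form:
        value = record.get(field["name"])
        if value is None:
            issues.append(field["name"] + " is missing")
        elif not isinstance(value, field["type"]):
            issues.append(field["name"] + " should be " + field["type"].__name__)
    return issues


for line in describe(LOAN_FORM, application):
    print(line)
print(problems(LOAN_FORM, application))
```

Without LOAN_FORM, the value 25000.0 is just a number. With it, you know what the number means and can tell when a box has been left empty or filled in wrongly, which is the "define and/or interpret" job described above.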
I was wondering why it is called metadata. I guess that it might have something to do with the fact that one is looking not at the data itself but at a once-removed view of the data. Could you clear that up for me?