I ran across a great article today in DM Review: "Encrypting the Data Warehouse" by Mark Thiessen. In the article he breaks down the areas of importance and the drivers to consider.
I like how he asks whether it can be done and then breaks the two places of concern into their components:
- Data in Transit – generally accepted principles in play (SSL, VPN, …)
- Data at Rest – data that is not being used by an application and is either stored on disk or held in the OS cache.
So why the fuss?
- Past issues, which he cites
- Compliance with legislation (SOX, personal, financial, health)
And why is it so hard?
- RDBMS searches and tools no longer work optimally
- Increasing data volume, variety, and velocity (store data, transaction history,…)
- Need for performance – performance has always been an issue, and encryption slows things down further
I think you get the idea, and I do not want to paraphrase the whole article. He goes on to cover some great approaches to solving this problem, but they still bring a huge hit on performance. So unless users are willing to have things slow down and the IT warehouse team is up for spending more, I think you will still find CIOs being very selective about what they encrypt. Here are my thoughts on what to encrypt in the warehouse:
- Personal information – make the match to the person's details the last step in your query.
- Credit card information – same as above
- HR information – low-volume, low-access data
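To make the "last step in your query" idea concrete, here is a minimal sketch in Python using an in-memory SQLite database. Everything in it is an assumption for illustration: the `person` and `sales` tables, the `name_enc` column, and the `toy_encrypt`/`toy_decrypt` helpers are hypothetical, and the base64 encoding is only a stand-in for a real cipher (base64 is not encryption). The point is the ordering: aggregate and filter on the unencrypted surrogate key first, then decrypt only the few rows that survive.

```python
import base64
import sqlite3

decrypt_calls = 0  # counts how many values we actually had to decrypt


def toy_encrypt(plaintext: str) -> str:
    # Stand-in only: base64 is NOT encryption. Swap in a real cipher in practice.
    return base64.b64encode(plaintext.encode("utf-8")).decode("ascii")


def toy_decrypt(ciphertext: str) -> str:
    global decrypt_calls
    decrypt_calls += 1
    return base64.b64decode(ciphertext).decode("utf-8")


# Hypothetical schema: facts keyed by a plain surrogate key, names stored encrypted.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (person_id INTEGER PRIMARY KEY, name_enc TEXT)")
conn.execute("CREATE TABLE sales (person_id INTEGER, amount REAL)")

people = ["Ann", "Bob", "Cho", "Dee"]
conn.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [(i, toy_encrypt(name)) for i, name in enumerate(people, start=1)],
)
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 50.0), (1, 70.0), (2, 10.0), (3, 500.0), (4, 5.0), (3, 20.0)],
)


def top_spenders(limit: int):
    # Step 1: do the heavy lifting on unencrypted keys and measures only.
    rows = conn.execute(
        "SELECT person_id, SUM(amount) AS total FROM sales "
        "GROUP BY person_id ORDER BY total DESC LIMIT ?",
        (limit,),
    ).fetchall()
    # Step 2 (last): match to the person's details and decrypt just those rows.
    result = []
    for person_id, total in rows:
        (name_enc,) = conn.execute(
            "SELECT name_enc FROM person WHERE person_id = ?", (person_id,)
        ).fetchone()
        result.append((toy_decrypt(name_enc), total))
    return result


top = top_spenders(2)
print(top, "decrypted values:", decrypt_calls)
```

With four people in the table, only the two names in the final result are decrypted. On a real warehouse the same ordering keeps the decryption cost proportional to the size of the answer rather than the size of the table.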
So none of the above is new, right? Well, you're right. If you have high-volume data coming in and a large number of users accessing it, then we need to find a new way to secure the house, because otherwise compliance is going to cost way too much. Another thought that we at Project X suggest is to define your guiding principles around encryption and security.