As we approach the end of this series, it has become clear that I should have started with use cases. Not only are they the primary driver of interest in masking products, but many advanced features and deployment models only make sense in terms of particular use cases. The critical importance of clustered servers, and the necessity for post-masking validation for some applications, only become clear in light of particular usage scenarios. I will sort this out in the final paper, putting use cases first, which will help with the more complex later discussions. For now, here they are.
Companies understand that good data makes employees’ jobs easier. And employees are really crafty at procuring data to help with their day jobs, even if it’s against the rules. If salespeople can get the entire customer database to help meet their quotas, or quality assurance personnel think they need production data to test web applications, they usually find ways to get it. The same goes for decentralized organizations where regional offices need to be self-sufficient, or companies that need to share data with partners. The mental shift we see in enterprise environments is to stop fighting these internal user requirements and instead find a way to satisfy the demand safely. In some cases this means automated production of test data on a regular schedule, or self-service interfaces to produce masked content on demand. These platforms are effectively implementing a data security strategy for fast and efficient production of test data.
For compliance, masking is used to protect data with minimal modification to the systems or processes which use the (now masked) data. Masking provides consistent coverage across files and databases with very little adjustment. Many customers layer masking and encryption in combination, using encryption to secure data at rest and masking to secure data in use. Customers find masking better at maintaining relationships within databases; they also appreciate that it can be applied dynamically and causes fewer application side effects. In some cases encryption is deployed as part of the infrastructure, while other customers employ encryption as part of the data masking process, particularly to satisfy regulations that prescribe encryption. But the key difference is that masking offers full control over the data lifecycle from discovery to archival, whereas encryption is used in a more focused manner, often at multiple different points, to address specific risks. Masking platforms manage the compliance controls, including which columns of data are to be protected, how they are protected, and where the data resides.
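To make "maintaining relationships within databases" and per-column compliance controls concrete, here is a minimal Python sketch. It is not how any particular masking product works. The table names, column names, `MASKING_KEY`, `POLICY`, and helper functions are all hypothetical, and the keyed-hash digit substitution is only an illustration, not a vetted format-preserving scheme. The idea is that the same input always yields the same masked value, so a key that appears in two tables still joins after masking.

```python
import hashlib
import hmac

# Hypothetical secret; a real deployment would pull this from a key store.
MASKING_KEY = b"example-masking-key"
DIGITS = "0123456789"


def mask_digits(value, key=MASKING_KEY):
    """Deterministically replace each digit, keeping length and punctuation."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    out = []
    i = 0
    for ch in value:
        if ch in DIGITS:
            # Derive the replacement digit from the keyed hash of the whole value.
            out.append(str(digest[i % len(digest)] % 10))
            i += 1
        else:
            out.append(ch)
    return "".join(out)


def redact(value):
    """Replace every character with an asterisk."""
    return "*" * len(value)


# Hypothetical policy: which columns are protected, and how.
POLICY = {
    ("customers", "ssn"): mask_digits,
    ("customers", "name"): redact,
    ("orders", "customer_ssn"): mask_digits,
}


def mask_table(table, rows):
    """Apply the policy to every row of a table (rows are plain dicts)."""
    masked = []
    for row in rows:
        new_row = dict(row)
        for column, value in row.items():
            rule = POLICY.get((table, column))
            if rule is not None:
                new_row[column] = rule(value)
        masked.append(new_row)
    return masked


customers = [{"ssn": "123-45-6789", "name": "Alice"}]
orders = [{"customer_ssn": "123-45-6789", "total": 42.50}]

masked_customers = mask_table("customers", customers)
masked_orders = mask_table("orders", orders)
```

Because `mask_digits` is deterministic for a given key, `masked_customers[0]["ssn"]` and `masked_orders[0]["customer_ssn"]` are identical, so joins between the two masked tables still work. Columns not named in the policy, such as `total`, pass through unchanged.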
It is worth mentioning a few use cases I expected to drive customer adoption, but which failed to generate significant interest, at least among the customers we contacted during our research. One model that has gotten some attention over the last couple of years is masking data for data warehousing and analytics. When I asked about this, several companies complained that the common security strategy of “walling off” data warehouses through network segmentation, firewalls, and access control systems was seriously flawed at best; these users were looking for better ways to secure the data itself. Masking solutions led these evaluations over encryption and tokenization solutions, primarily because they can scale to very large data volumes and are better at securing data while maintaining complex relationships within the database. But this is still unusual. And it is simply too early to lump data warehouse protection demand with the need to secure “Big Data”, or to claim this segment is driven by NoSQL platforms like Hadoop. Customer demand is just not there, at least so far. Vendors are just now releasing tailored solutions to the market, getting ahead of customer demand in anticipation that NoSQL security issues will match or be more severe than those on a typical large analytical system.
Concerns about running databases in “the cloud”, or pushing data into multi-tenant environments, are not driven by “Big Data” or traditional warehouse applications at all. We spoke with a handful of enterprise masking customers who adopted masking for cloud databases, but they only moved test systems, mostly in IaaS (Infrastructure as a Service) deployments. In other cases customers decided to encrypt data prior to moving it into the cloud, or to leverage their SaaS vendor’s identity management and encryption capabilities. We were surprised by one use case: masking to secure streams of data, specifically digitized call records for XML streams, which came up a number of times but not often enough to constitute a major trend. Finally, we expected more customers to mention HIPAA as a requirement, and to mask in order to secure complex data sets. In practice, only one firm cited HIPAA as driving masking.
Next we will wrap up with a buying guide on what to look for, how to evaluate solutions, and pitfalls to watch out for.