How to Survey Data Security Outcomes?

I received a ton of great responses to my initial post asking what people want to see in a data security survey. The single biggest request is to research control effectiveness: which tools actually prevent incidents.

Surveys are hard to build, and while I have been involved with a bunch of them, I am definitely not about to call myself an expert. There are people who spend their entire careers building surveys. As I sit here trying to put the question set together, I’m struggling to find the best approach to assess outcome effectiveness, and figure it’s time to tap the wisdom of the crowd.

To provide context, this is the direction I’m headed in the survey design. My goal is to have the core question set take about 10-15 minutes to answer, which limits what I can do a bit.

Section 1: Demographics

The basics, much of which will be anonymized when we release the raw data.

Section 2: Technology and process usage

I’ll build a multi-select grid to determine which technologies are being considered or used, and at what scale. I took a similar approach in the Project Quant for Patch Management survey, and it seemed to work well. I also want to capture a little of why someone implemented a technology or process. Rather than listing all the elements, here is the general structure:

Technology/Process | Not Considering | Researching | Evaluating | Budgeted | Selected | Internal Testing | Proof of Concept | Initial Deployment | Protecting Some Critical Assets | Protecting Most Critical Assets | Limited General Deployment | General Deployment

And to capture the primary driver behind the implementation:

Technology/Process | Directly Required for Compliance (but not an audit deficiency) | Compliance Driven (but not required) | To Address Audit Deficiency | In Response to a Breach/Incident | In Response to a Partner/Competitor Breach or Incident | Internally Motivated (to improve security) | Cost Savings | Partner/Contractual Requirement

I know I need to tune these better and add some descriptive text, but as you can see I’m trying to characterize not only what people have bought, but what they are actually using, as well as to what degree and why. Technology examples will include things like network DLP, Full Drive Encryption, Database Activity Monitoring, etc. Process examples will include network segregation, data classification, and content discovery (I will tweak the stages here, because ‘deployment’ isn’t the best term for a process).
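
To make the "bought versus actually using" distinction concrete, here is a minimal Python sketch of how the stage grid could be encoded for analysis. It assumes the stages are ordered exactly as listed above, that a respondent may tick more than one stage for a technology, and that "in use" starts at Initial Deployment. The names `Stage`, `furthest_stage`, and `is_in_use` are hypothetical and not part of any existing survey tool, and the cutoff is my guess for illustration rather than a settled definition.

```python
from enum import IntEnum
from typing import Iterable, Optional


class Stage(IntEnum):
    """Adoption stages, numbered in the order they appear in the grid (assumed ordering)."""
    NOT_CONSIDERING = 0
    RESEARCHING = 1
    EVALUATING = 2
    BUDGETED = 3
    SELECTED = 4
    INTERNAL_TESTING = 5
    PROOF_OF_CONCEPT = 6
    INITIAL_DEPLOYMENT = 7
    PROTECTING_SOME_CRITICAL_ASSETS = 8
    PROTECTING_MOST_CRITICAL_ASSETS = 9
    LIMITED_GENERAL_DEPLOYMENT = 10
    GENERAL_DEPLOYMENT = 11


def furthest_stage(selected: Iterable[Stage]) -> Optional[Stage]:
    """Collapse a multi-select answer to the most advanced stage ticked."""
    selected = list(selected)
    return max(selected) if selected else None


def is_in_use(stage: Optional[Stage]) -> bool:
    """Assumed cutoff: anything from Initial Deployment onward counts as 'in use'."""
    return stage is not None and stage >= Stage.INITIAL_DEPLOYMENT


# Hypothetical answer for one technology: budgeted and currently in a proof of concept.
answer = [Stage.BUDGETED, Stage.PROOF_OF_CONCEPT]
print(furthest_stage(answer).name, is_in_use(furthest_stage(answer)))
```

An ordered encoding like this keeps the fine-grained stages in the raw data while still allowing a simple bought/using split later; the process rows would need their own stage names, since ‘deployment’ does not fit them well.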

Section 3: Control effectiveness

This is the tough one, where I need the most assistance and feedback (and I already appreciate those of you with whom I will be discussing this stuff directly). I’m inclined to structure this in a similar format, but instead of checkboxes use numerical input.

My concern with numerical entry is that a lot of people won’t have the numbers available. I can also use a multi-select with None, Some, or Many, but I really hate that level of fuzziness and hope we can avoid it. Or I can do a combination, with both numerical and ranges as options. We’ll also need a time scale: per day, week, month, or year.

Finally, one of the tougher areas is that we need to characterize the type of data, its sensitivity/importance, and the potential (or actual) severity of the incidents. This partially kills me, because there are fuzzy elements here I’m not entirely comfortable with, so I will try to constrain them as much as possible using definitions. I’ve been spinning through some design options, and capturing all this information without taking a billion hours of each respondent’s time isn’t easy. I’m leaning towards breaking severity out into four separate meta-questions, and dropping the low end to focus only on “sensitive” information, meaning information which if lost could result in a breach disclosure or other material business harm.

Major incidents with Personally Identifiable Information or regulated data (PII, credit cards, healthcare data, Social Security Numbers). A major incident is one that could result in a breach notification, material financial harm, or serious reputational damage. In other words, something that would trigger an incident response process and involve executive management.
Major incidents with Intellectual Property (IP). A major incident is one that could result in material financial harm due to loss of competitive advantage, public disclosure, contract violation, etc. Again, something that would trigger incident response, and involve executive management.
Minor incidents with PII/regulated data. A minor incident would not result in a disclosure, fines, or other serious harm. Something managed within IT, security, and the business unit without executive involvement.
Minor incidents with IP.

Within each of these categories, we will build our table question to assess the number of incidents and false positive/negative rates:

Technology | Incidents Detected | Incidents Blocked | Incidents Mitigated (incident occurred but loss mitigated) | Incidents Missed | False Positives Detected | Time scale: Per Day / Per Month / Per Year / N/A
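
To show how one row of this table might turn into rates, here is a minimal Python sketch. It rests on several assumptions that the survey text would need to pin down: that Incidents Detected counts only real incidents, that Incidents Missed counts real incidents the control did not catch, and that multiplying by 365 or 12 is an acceptable way to put answers on a yearly scale. The `EffectivenessRow` class, its field names, and the example numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed conversion factors for putting every answer on a per-year scale.
PERIODS_PER_YEAR = {"day": 365, "month": 12, "year": 1}


@dataclass
class EffectivenessRow:
    """One technology's answers within a single severity category (hypothetical layout)."""
    technology: str
    detected: int
    blocked: int
    mitigated: int
    missed: int
    false_positives: int
    period: Optional[str]  # "day", "month", "year", or None for N/A

    def annualized(self, count: int) -> Optional[int]:
        """Scale a count to a yearly figure; None when the respondent chose N/A."""
        if self.period is None:
            return None
        return count * PERIODS_PER_YEAR[self.period]

    def detection_rate(self) -> Optional[float]:
        """Share of real incidents that were detected: detected / (detected + missed)."""
        total = self.detected + self.missed
        return self.detected / total if total else None

    def false_positive_share(self) -> Optional[float]:
        """Share of all alerts that were false: false positives / (detected + false positives)."""
        alerts = self.detected + self.false_positives
        return self.false_positives / alerts if alerts else None


# Made-up numbers for illustration only, not survey results.
row = EffectivenessRow("network DLP", detected=9, blocked=6, mitigated=2,
                       missed=1, false_positives=3, period="month")
print(row.annualized(row.detected))   # 108
print(row.detection_rate())           # 0.9
print(row.false_positive_share())     # 0.25
```

Note that the miss rate here depends entirely on respondents knowing what they missed, which is exactly the kind of number many organizations may not have.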

There are some other questions I want to work in, but these are the meat of the survey and I am far from convinced I have it structured well. Parts are fuzzier than I’d like, I don’t know how many organizations are mature enough to even address outcomes, and I have a nagging feeling I’m missing something important.

So I could really use your feedback. I’ll fully credit everyone who helps, and you will all get the raw data to perform your own analyses.
