This article originally appeared in issue 9 of CREST Security Review.

At the heart of many scientific efforts to help security professionals is a mathematical challenge. One that has occupied the minds of biologists, sociologists, psychologists, and statisticians for decades. One that highlights both the power of cases and the limits of data. One that has no easy solution, though many have tried.

Its formal name is the ecological inference problem: the problem of making inferences about an individual from aggregate data (and data models). It is why evidence suggests that there is simultaneously no single terrorist ‘profile’ and yet common patterns across many terrorists’ lives. Patterns exist, but it is difficult to know which elements of those patterns are relevant to ‘that’ individual.

What is observed ‘on average’ speaks to what is true for only some. Fortunately, our uncanny ability to make sense of the noise in this ‘social data’ means that investigators are able to provide the necessary, nuanced perspective.

The consequence of being able to ‘predict some of the people some of the time’, as a series of social psychology papers described it in the 1980s, depends on the investigative task. Data work well if you want to predict data, that is, patterns across many cases. If your goal is to prioritise who to investigate or how much resource to allocate, then a statistical model that weights risk factors is likely better (and more ethical) than random investigation or random resource allocation.

Data are less powerful when you want to predict a datum, that is, a single case. The challenge was nicely illustrated in a recent review my colleagues and I undertook of deception detection methods. Typically, researchers in this area administer a technique, such as asking unanticipated questions or providing a ‘model’ statement for the interviewee to emulate, to one group of interviewees. They administer a standard questioning technique to a comparison group. They then compare the two groups to determine whether, on average, the new technique elicited more information than the standard technique. In the case of unanticipated questions and model statements, it does. On average, people give up more information and their deception is better detected when these techniques are used in the interview than when they are not.
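The group comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the detail counts are invented, not taken from the review, and the function names are hypothetical. It shows how a clear difference in group averages can sit alongside heavy overlap between individual interviewees.

```python
from statistics import mean

# Hypothetical counts of details reported per interviewee.
# These numbers are invented for illustration; they are not data from the review.
standard_technique = [10, 14, 18, 22, 26]
new_technique = [14, 18, 22, 26, 30]


def mean_difference(treated, comparison):
    """Difference between the two group averages."""
    return mean(treated) - mean(comparison)


def share_of_pairs_higher(treated, comparison):
    """Fraction of (treated, comparison) pairs in which the treated
    interviewee reported strictly more details."""
    wins = sum(t > c for t in treated for c in comparison)
    return wins / (len(treated) * len(comparison))


print("Average gain in details:", mean_difference(new_technique, standard_technique))
print("Share of pairs where the new technique wins:",
      share_of_pairs_higher(new_technique, standard_technique))
```

With these invented numbers the new technique gains four details on average, yet in only 60 per cent of pairings does the new-technique interviewee report more than the comparison interviewee. The average is real, but it says little about any one person.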

But what about a single case? When using these techniques, what criterion – what count of the details the interviewee provides – should I use to infer that you are lying?

Our review found wide variation in the criterion that worked best for each study. So much so that any single criterion would result in us exonerating all liars, or falsely accusing all truth-tellers, in at least one study. The reason?

Context is everything. What you’re describing, what you’ve experienced, and how well you are interviewed influence what you report far more than whether you are lying or telling the truth.
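A small Python sketch may help make the criterion problem concrete. It rests on loud assumptions: the three ‘studies’ and their detail counts are invented, the decision rule (judge ‘lying’ when the count falls below a threshold) is a simplification, and the function names are hypothetical. It is not the analysis used in the review.

```python
# Hypothetical detail counts for three studies (invented for illustration only).
# Each study has a different context, so its overall level of detail differs.
STUDIES = {
    "study_a": {"truth": [30, 34, 38, 41], "liar": [18, 22, 25, 28]},
    "study_b": {"truth": [12, 14, 16, 17], "liar": [5, 7, 9, 10]},
    "study_c": {"truth": [60, 66, 71, 75], "liar": [44, 48, 52, 55]},
}
CANDIDATE_THRESHOLDS = range(0, 101)


def error_rates(study, threshold):
    """Judge 'lying' when the detail count is below the threshold.

    Returns (share of truth-tellers falsely accused, share of liars exonerated).
    """
    falsely_accused = sum(c < threshold for c in study["truth"]) / len(study["truth"])
    exonerated = sum(c >= threshold for c in study["liar"]) / len(study["liar"])
    return falsely_accused, exonerated


def best_threshold(study):
    """Lowest candidate threshold with the smallest total error in this study."""
    return min(CANDIDATE_THRESHOLDS, key=lambda t: sum(error_rates(study, t)))


def fails_completely_somewhere(threshold):
    """True if, in at least one study, every truth-teller is accused
    or every liar is exonerated."""
    return any(1.0 in error_rates(study, threshold) for study in STUDIES.values())


for name, study in STUDIES.items():
    print(name, "best threshold:", best_threshold(study))

print("Every single threshold fails completely in some study:",
      all(fails_completely_somewhere(t) for t in CANDIDATE_THRESHOLDS))
```

In this invented example, truth-tellers out-report liars within every study, so each study has a criterion that separates them perfectly. Because the context shifts the overall level of detail, no one criterion carries across all three.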

In the deception detection world, a recognition of the ecological inference problem has led researchers to focus on information elicitation, recognising that the only way to determine veracity for sure is to elicit a checkable fact, which can be verified elsewhere.

In other domains, the solution is coming from the coalescing of three efforts. The first is the development of more precise inference models. In the deception field, baselining an individual’s behaviour or using a criterion that is culturally specific improves the accuracy of predictions. In work predicting risk of violence, layering contextual moderators into an assessment of individual-level push and pull factors tends to provide a more nuanced view of how each factor should be weighted for that individual. The difficulty of this approach is that models quickly become complex and data too sparse for meaningful development and ethically defensible use.

The second is the use of innovative, more discriminating indicators. This is one area where social science is uniquely positioned to contribute. Theory-driven models can inform what new data we look for. While the development of precise inference models looks to squeeze the most out of existing data, this solution encourages us to be informed consumers. Less is often more. Of the many models proposed for insider threat detection, often using hundreds of variables, one that has survived the most rigorous testing involves a single measure carefully derived to capture a unique aspect of insiders’ experience: the inability or unwillingness to maintain normal interpersonal behaviour with co-workers.

The third is to recognise how good humans can be at navigating ecological inferences. Despite all our unconscious biases, we are uniquely disposed to infer the ‘story’ that underlies data and to form hypotheses that allow us to test such stories. We’re good at finding ways to determine if the inference is correct on these occasions, so long as we receive feedback on the accuracy of our judgements over time.

One interesting way to encourage this positive aspect of human inferences is to provide investigators with a systematic way to capture and compare assessments among colleagues. One CREST-funded project, led by Professor Ashraf Labib, is researching precisely this.

Security and intelligence agencies will depend on their case officers even more because of, not in spite of, the increasing use of data.