Simple Rules for Smart IAM Solutions, Part 3: Making Sense of Data, Risk Detection and Intelligently Leveraging It
In part 3 of this series on Simple Rules for Smart IAM Solutions, we examine the most efficient ways to slice and dice the data collected from the various contexts, which is then used to compute the risk associated with each user.
In my last blog we covered the various contexts that should feed into a user behavior analytics (UBA) system so that it can efficiently identify patterns, based on which the identity and access management (IAM) solution can then determine appropriate access policies for a given identity and resource combination. Today we look at how to slice and dice that data before it is used to compute user risk. In other words, we need to ask ourselves: what is the minimal set of dimensions (a.k.a. features or attributes) that we can reduce our dataset to and still extract behavioral patterns reliably? In data science parlance, this is referred to as “dimensionality reduction”, and it is how we overcome the “curse of dimensionality”. So, let’s dive in.
How do I address the curse of dimensionality?
Picking up from where we left off last time, notice that I have added another box for this step of dimensionality reduction:
Naturally, the question that arises is why we would even want to go through with this. Isn’t having more dimensions going to give me more reliable behavioral patterns? These are perfectly valid questions, and the simple answer is that you don’t actually need all the features to cover the entire behavioral variance. Yes, “variance” is the key factor, and the goal is always to cover as much variance as possible with as few features as possible.
To explain what I mean by variance, let’s consider an example workforce. Say the workforce is limited to one geographical location, with most employees working Monday through Friday, between 8am and 6pm, on enterprise-provided devices (both mobile phones and workstations), with limited required travel. The variance (deviation from average/standard behavior) in the behavior of a typical employee of this organization would be much lower than that of a workforce which travels a lot, is spread across multiple geographical locations and has a more casual device policy. Establishing behavioral patterns for the first workforce would be possible with far fewer features than for the second.
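To make the idea of variance concrete, here is a minimal sketch using only Python’s standard library. The login hours below are made up for illustration, and treating the hour of day as a plain number (ignoring that 23:00 and 01:00 are close together) is a simplifying assumption.

```python
from statistics import mean, pvariance

# Hypothetical login hours (24h clock) for two illustrative workforces.
office_logins = [8, 9, 9, 8, 10, 9, 8, 9, 17, 9]         # single site, fixed hours
travelling_logins = [2, 9, 23, 14, 6, 19, 11, 3, 21, 16]  # distributed, frequent travel

def describe(hours):
    """Return (average, population variance) for a list of login hours."""
    return mean(hours), pvariance(hours)

office_avg, office_var = describe(office_logins)
travel_avg, travel_var = describe(travelling_logins)

print(f"office:     mean={office_avg:.1f}, variance={office_var:.1f}")
print(f"travelling: mean={travel_avg:.1f}, variance={travel_var:.1f}")
```

The second list spreads much further from its average, which is why a single feature such as login hour says a lot about the first workforce and far less about the second.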
There are various methods available for this, and one of the most popular is Principal Component Analysis (PCA). As the name suggests, it helps determine the “principal” components (PCs) required to discover patterns in a dataset. You can find more information on this here. The graph below illustrates how much variance is covered (Y-axis) by how many PCs (X-axis) for an example dataset.
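The variance-versus-PCs curve can be sketched as follows. This is a toy illustration, not a production approach: it uses only the standard library, finds the principal components with power iteration and deflation (in practice you would likely reach for a numerical library), and runs on synthetic data in which four features are driven by two hidden factors. All names and numbers are assumptions.

```python
import math
import random

def covariance_matrix(rows):
    """Sample covariance matrix of the columns in `rows`."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
            for i in range(d)]

def mat_vec(m, v):
    """Multiply a square matrix by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def top_eigenvalues(matrix, k, iterations=500, seed=0):
    """Largest k eigenvalues of a symmetric matrix (power iteration + deflation)."""
    rng = random.Random(seed)
    d = len(matrix)
    m = [row[:] for row in matrix]
    eigenvalues = []
    for _component in range(k):
        v = [rng.random() + 0.1 for _ in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _step in range(iterations):
            w = mat_vec(m, v)
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0:
                break
            v = [x / norm for x in w]
        eigval = sum(a * b for a, b in zip(v, mat_vec(m, v)))
        eigenvalues.append(eigval)
        # Deflate: remove the component just found so the next one can surface.
        for i in range(d):
            for j in range(d):
                m[i][j] -= eigval * v[i] * v[j]
    return eigenvalues

# Synthetic data: 4 observed features generated from 2 hidden factors plus noise.
rng = random.Random(42)
rows = []
for _ in range(200):
    a, b = rng.gauss(0, 1), rng.gauss(0, 1)
    rows.append([a, 2 * a + rng.gauss(0, 0.1), b, a + b + rng.gauss(0, 0.1)])

cov = covariance_matrix(rows)
total_variance = sum(cov[i][i] for i in range(len(cov)))
eigenvalues = top_eigenvalues(cov, k=4)

cumulative, running = [], 0.0
for eigval in eigenvalues:
    running += eigval / total_variance
    cumulative.append(running)

for n_pcs, covered in enumerate(cumulative, start=1):
    print(f"{n_pcs} PC(s) cover {covered:.1%} of the variance")
```

Because the four features here are really driven by two factors, the first two PCs cover almost all of the variance, which is the situation in which dropping dimensions costs very little.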
After identifying the principal components, the main goal is to establish baselines for the users’ normal behaviors and then use one of the many available methods and algorithms to detect deviations from that normal behavior. This is called anomaly detection, and you can go through some of the popular ML methods for applying anomaly detection to datasets here.
A good model should be able to adapt to changes in the characteristics of the data being ingested and adjust the severity of the risk dynamically. This means the model needs some concept of the degree of anomaly, as well as of the confidence with which it is able to detect risk at various levels.
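As one simple illustration of the baseline-then-deviation idea, the sketch below builds a per-user baseline from past login hours and scores new events by how many standard deviations they sit from it (a z-score). This is only one of many possible methods; the history, the threshold of 3.0 and the function names are assumptions made for the example.

```python
from statistics import mean, stdev

def build_baseline(history):
    """Summarise a user's normal behavior as a mean and standard deviation."""
    return {"mean": mean(history), "std": stdev(history)}

def anomaly_score(baseline, value):
    """Distance from the baseline, measured in standard deviations."""
    if baseline["std"] == 0:
        return 0.0 if value == baseline["mean"] else float("inf")
    return abs(value - baseline["mean"]) / baseline["std"]

# Hypothetical history: one user's login times, in hours since midnight.
history = [8.5, 9.0, 8.75, 9.25, 9.0, 8.5, 9.5, 9.0]
baseline = build_baseline(history)

THRESHOLD = 3.0  # assumed cut-off; a real deployment would tune this

for hour in (9.1, 2.0):
    score = anomaly_score(baseline, hour)
    verdict = "anomalous" if score >= THRESHOLD else "normal"
    print(f"login at {hour:>4}: score={score:.1f} -> {verdict}")
```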
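One way to picture such an adaptive model is a baseline that is updated with every event using exponentially weighted averages, so that older behavior gradually fades. The sketch below is an assumption-laden toy: the class name, the smoothing factor `alpha`, the severity cut-offs and the way confidence grows with the number of observations are all illustrative choices, not a prescribed design.

```python
import math

class AdaptiveBaseline:
    """Exponentially weighted mean/variance that drifts with the data."""

    def __init__(self, alpha=0.1, warmup=20):
        self.alpha = alpha    # how quickly old behavior is forgotten
        self.warmup = warmup  # events needed before we fully trust the baseline
        self.mean = None
        self.var = 0.0
        self.count = 0

    def score(self, value):
        """Degree of anomaly: distance from the mean in standard deviations."""
        if self.mean is None or self.var == 0.0:
            return 0.0
        return abs(value - self.mean) / math.sqrt(self.var)

    def confidence(self):
        """Crude confidence in [0, 1] based on how much data has been seen."""
        return min(1.0, self.count / self.warmup)

    def update(self, value):
        """Fold a new observation into the baseline."""
        self.count += 1
        if self.mean is None:
            self.mean = float(value)
            return
        delta = value - self.mean
        self.mean += self.alpha * delta
        self.var = (1 - self.alpha) * (self.var + self.alpha * delta * delta)

def severity(score):
    """Map the degree of anomaly to a severity level (assumed cut-offs)."""
    if score >= 6.0:
        return "high"
    if score >= 3.0:
        return "medium"
    return "low"

baseline = AdaptiveBaseline(alpha=0.1, warmup=20)

# The user logs in around 09:00 for a while...
for hour in [8.5, 9.0, 9.5, 9.0] * 10:
    baseline.update(hour)
score_at_shift = baseline.score(13.5)

# ...then moves to a new schedule around 14:00, and the baseline follows.
for hour in [13.5, 14.0, 14.5, 14.0] * 15:
    baseline.update(hour)
score_after_adapting = baseline.score(14.0)

print("first login on new schedule:", severity(score_at_shift))
print("same schedule after adapting:", severity(score_after_adapting))
print("confidence:", baseline.confidence())
```

The first login on the new schedule stands out sharply, but once the new pattern persists the model treats it as normal, which is the kind of dynamic adjustment described above.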
Automating Responses & Adapting to Risk
So, by now we have collected and cleaned the data, identified the key features and decided on the model to apply in order to detect risks. Now with every new data point, the solution is able to flag it as normal or attribute a risk flag to it with the right severity level. A good solution is able to take this and automate some of the following workflows and actions such that human/manual intervention is reduced to a minimum. Let’s look at some of these actions in the context of Identity and Access Management.
At a minimum the solution must be able to:
- Log the incident
- Notify relevant stakeholders through email/IM
- Step up access security for critical resources
- Define access policies for the various risk levels
- Provide a visual interface to visualize the threat
- Provide a sufficient audit trail for the admin to investigate the incident
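The minimum capabilities listed above can be sketched as a small policy table that maps each risk level to a set of actions, plus a handler that records every incident for later investigation. The action names, the policy contents and the in-memory audit trail are all hypothetical; a real solution would call its own logging, notification and authentication services.

```python
import json
from datetime import datetime, timezone

# Hypothetical access policy: which responses apply at each risk level.
RESPONSE_POLICY = {
    "low": ["log"],
    "medium": ["log", "notify", "step_up_auth"],
    "high": ["log", "notify", "step_up_auth", "block_critical_resources"],
}

audit_trail = []  # stand-in for a durable audit store

def respond(user, resource, severity, score):
    """Look up the actions for this severity and record the incident."""
    record = {
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "resource": resource,
        "severity": severity,
        "score": round(score, 2),
        "actions": RESPONSE_POLICY[severity],
    }
    audit_trail.append(record)
    return record

incident = respond("alice", "payroll-db", "medium", 4.2)
print(json.dumps(incident, indent=2))
```

Keeping the policy as data rather than code means an admin can change what “medium” triggers without touching the detection logic.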
Advanced solutions go further and address a variety of risk-based access control decisions and actions. They are able to:
- Integrate risk with access workflow management and other Identity Governance processes
- Generate dynamic policies based on the risk level and all impacted resources
- Provide internal and external workflows that can be invoked automatically once properly configured
- Provide sufficient knobs for admins to fine-tune the model in order to reduce false positives
- Continuously monitor user behavior and adjust the associated access policies appropriately
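As an example of one such knob, the sketch below picks the lowest alerting threshold that keeps the false positive rate under a target, using past anomaly scores that an admin has already labelled. The scores, labels, candidate thresholds and the 10% target are invented for illustration.

```python
def false_positive_rate(scores, labels, threshold):
    """Share of benign events that would have been flagged at this threshold."""
    benign = [s for s, is_threat in zip(scores, labels) if not is_threat]
    if not benign:
        return 0.0
    return sum(s >= threshold for s in benign) / len(benign)

def pick_threshold(scores, labels, candidates, max_fpr):
    """Lowest candidate threshold whose false positive rate is acceptable."""
    for threshold in sorted(candidates):
        if false_positive_rate(scores, labels, threshold) <= max_fpr:
            return threshold
    return max(candidates)

# Hypothetical history: anomaly scores and whether each was a real threat.
scores = [0.4, 1.2, 2.8, 3.1, 3.4, 0.9, 5.2, 7.5, 2.2, 6.1]
labels = [False, False, False, False, False, False, True, True, False, True]

chosen = pick_threshold(scores, labels, candidates=[2.0, 3.0, 4.0, 5.0], max_fpr=0.10)
print("chosen threshold:", chosen)
print("false positive rate:", false_positive_rate(scores, labels, chosen))
```

Choosing the lowest acceptable threshold, rather than the highest, keeps as many genuine threats flagged as possible while still cutting the noise.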
In the final part of this blog series we’ll dig deeper into the automation and orchestration use cases and why they are critical to delivering on the promise of Zero Trust without compromising the end-user experience. We’ll explore a couple of use cases to emphasize the importance of connecting seemingly disparate systems and workflows (both internal and external) in order to deliver this.