Improved Hunt Seeding with Specific Anomaly Scoring, presented at FloCon 2019

by Brenden Bishop

Summary: As the practice of hunting has spread through enterprise cyber security, interest in generalized anomaly detectors to seed hunts has also increased. The generally accepted premise seems to be that security events are rare and that rare events are almost always anomalous. Therefore, if one seeds hunts with anomalous events, the hunts are more likely to uncover activity of interest. However, implementing directed hunts in this manner requires the ability to define and detect anomalous behavior within complex systems. This is typically done probabilistically, and several available products employ machine learning approaches, such as neural networks, to define anomalous network activity. Unsupervised learning approaches, however, are plagued by high false alarm rates, which in turn lead to analyst alert fatigue. To decrease false alarm rates, one may tune such products to alert only on extremely anomalous events. But this still fails to address the heart of the problem: generalized anomaly detection for an entire network is probably not an optimal approach. Rather, defenders should develop a number of specific models, and doing so requires a scalable and repeatable framework. We present an open-source approach that lets cyber security experts and data scientists/statistical engineers collaboratively develop specific anomaly scoring models. Our approach uses a non-parametric kernel density estimator to estimate the distributions of fields in security logs. Once the desired distribution has been learned, analysts may use it to score records with a single-number, probabilistic measure of anomalousness. Logs can then be filtered on this anomalousness score, and rare events can be used to seed hunts. After sufficient validation, models may be transitioned to detectors that alert defenders when some criterion is met.
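
The scoring workflow described in the summary can be illustrated with a minimal sketch. It is not the presented tool: it assumes a single numeric log field, a Gaussian kernel, Silverman's rule-of-thumb bandwidth, and the negative log-density as the anomalousness score. The summary specifies none of these choices. The function names, the sample data, and the threshold are hypothetical.

```python
import math
import statistics


def fit_kde(samples):
    """Return a density function estimated from 1-D samples.

    Assumption: Gaussian kernel with Silverman's rule-of-thumb bandwidth.
    """
    n = len(samples)
    h = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    norm = 1.0 / (n * h * math.sqrt(2 * math.pi))

    def density(x):
        # Average of Gaussian bumps centred on each training sample.
        return norm * sum(math.exp(-0.5 * ((x - s) / h) ** 2) for s in samples)

    return density


def anomaly_score(density, x, floor=1e-300):
    """Negative log-density: larger means rarer under the learned distribution.

    The floor avoids log(0) when the density underflows far from the data.
    """
    return -math.log(max(density(x), floor))


# Hypothetical baseline: bytes transferred per session for one service.
baseline = [480, 510, 495, 530, 505, 470, 520, 500, 515, 490]
density = fit_kde(baseline)

# Score new records, then keep the rarest ones as hunt seeds.
records = [502, 525, 5000]
scores = {r: anomaly_score(density, r) for r in records}
threshold = 10.0  # hypothetical cut-off, to be tuned during validation
hunt_seeds = [r for r, s in scores.items() if s > threshold]
print(hunt_seeds)
```

Because each model covers one specific field or behavior, the threshold can be validated per model before it is promoted from a hunt-seeding filter to an alerting detector.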