TY - JOUR
T1 - Machine Learning for NetFlow Anomaly Detection with Human-Readable Annotations
AU - Krishnamurthy, Prashanth
AU - Khorrami, Farshad
AU - Schmidt, Steve
AU - Wright, Kevin
N1 - Funding Information:
Manuscript received August 14, 2020; revised December 31, 2020 and April 7, 2021; accepted April 12, 2021. Date of publication April 26, 2021; date of current version June 10, 2021. This work was supported in part by DARPA under Space and Naval Warfare Systems Center, Pacific (SSC Pacific) contract N66001-18-C-4037. The associate editor coordinating the review of this article and approving it for publication was M. Conti. (Corresponding author: Farshad Khorrami.) Prashanth Krishnamurthy and Farshad Khorrami are with the Control/Robotics Research Laboratory, Department of Electrical and Computer Engineering, New York University Tandon School of Engineering, Brooklyn, NY 11201 USA.
Publisher Copyright:
© 2004-2012 IEEE.
PY - 2021/6
Y1 - 2021/6
AB - We propose a framework for anomaly detection in communication network logs along with automated extraction of human-readable annotations that explain the decision logic underlying each anomaly detection. For this purpose, we develop a machine learning methodology formulated in terms of a model comprised of an OR-combination of multiple Boolean logic based sentences. Each sentence is an empirically learned set of inequality conditions involving subsets of features. The feature set, which comprises the 'alphabet' for human-readable annotations, is constructed using dynamic graph based spatio-temporal aggregation to extract human-understandable aggregates of network activity. These aggregates are constructed both in terms of computers (nodes in dynamic graph) and communications between computers (edges in dynamic graph). From the alphabet, the learned model identifies subsets of features that relate to each anomaly type and the combinations of conditions in terms of the feature subsets for detection of the specific anomaly type. Given a data point that the learned model detects as anomalous, the model identifies the specific features and their combinations related to the anomaly detection. These human-readable annotations provide a cyber-security analyst a transparent view into the decision logic underlying an anomaly detection.
KW - Anomaly detection
KW - human-readable annotations
KW - intrusion detection
KW - networks
KW - security
UR - http://www.scopus.com/inward/record.url?scp=85105104085&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85105104085&partnerID=8YFLogxK
U2 - 10.1109/TNSM.2021.3075656
DO - 10.1109/TNSM.2021.3075656
M3 - Article
AN - SCOPUS:85105104085
SN - 1932-4537
VL - 18
SP - 1885
EP - 1898
JO - IEEE Transactions on Network and Service Management
JF - IEEE Transactions on Network and Service Management
IS - 2
M1 - 9416281
ER -