Continual Learning for Cyber Security

With the steady evolution of networking technologies, the number of heterogeneous devices (ranging from digital to mechanical) connected to the Internet has increased, generating large volumes of data. The ever-growing dimensionality and diversity of this data lead to more vulnerabilities, which significantly enlarge the attack surface. Of particular concern is Zero-Day Exploitation (ZDE), i.e., taking advantage of vulnerabilities unknown to the users or creators of a system.

In the Machine Learning (ML) and Data Mining (DM) communities, the task of identifying ZDE is analogous to anomaly detection, which also appears in several other domains such as fraud detection and medical diagnosis. Typically, Intrusion Detection Systems (IDS) are used to identify anomalies in networks. In particular, anomaly-based IDS are widely used in the Network Security (NS) domain and rely on ML approaches to identify these adversaries/anomalies. These systems try to carve out boundaries that distinguish benign (normal) data from anomalies. The effectiveness of a detection system depends on its ability to adjust these boundaries automatically, which in turn increases its robustness. In a nutshell, these detection systems should exhibit lifelong learning mechanisms to stay up to date.
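
As a rough illustration of a boundary that adjusts itself, the sketch below keeps a running mean and variance of a single traffic feature and flags values that fall too far from the mean. It is a toy under stated assumptions, not the system described on this page: the class name `RunningBoundary`, the single-feature setting, and the width `k` are all hypothetical, and it assumes the samples passed to `update` are benign.

```python
import math


class RunningBoundary:
    """Hypothetical one-feature anomaly detector with an adaptive boundary."""

    def __init__(self, k=3.0):
        self.k = k        # boundary width in standard deviations (assumed value)
        self.n = 0        # number of benign samples seen so far
        self.mean = 0.0   # running mean of the feature
        self.m2 = 0.0     # running sum of squared deviations (Welford's algorithm)

    def update(self, x):
        """Fold one benign sample into the running mean and variance."""
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def is_anomaly(self, x):
        """Flag x if it lies more than k standard deviations from the mean."""
        if self.n < 2:
            return False  # not enough data to place a boundary yet
        std = math.sqrt(self.m2 / (self.n - 1))
        return abs(x - self.mean) > self.k * std
```

After a few updates with values near 10, a value of 50 is flagged. If the benign traffic later drifts and the detector is updated with many values near 50, the boundary moves and 50 is no longer flagged, without any stored history being replayed.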

Fig: Architecture of continual learning based anomaly network intrusion detection system


In an offline learning setting, ML approaches (random forests, decision trees, multilayer perceptrons, etc.) may yield high accuracy in detecting known anomalies, but they remain susceptible to unknown anomalies. To induce additional knowledge into these detection systems (i.e., to alter the boundaries), the system has to be retrained, which requires the entire dataset (old and new data) and is time-consuming. To alleviate these problems, the following design goals are considered while building detection systems:

  1. Detection systems should accumulate new knowledge easily without interfering with existing knowledge.
  2. Detection systems should be less susceptible to unknown anomalies.
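
One common family of techniques for the first goal is rehearsal: keep a small memory of past samples and mix them into each new training batch, so the full old dataset is not needed. The sketch below shows such a memory using reservoir sampling. It is only an illustration and is not claimed to be the method used in this project; `ReplayBuffer`, `capacity`, and `mixed_batch` are hypothetical names.

```python
import random


class ReplayBuffer:
    """Hypothetical fixed-size memory of past samples (reservoir sampling)."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity        # maximum number of stored samples
        self.samples = []               # the memory itself
        self.seen = 0                   # total samples offered so far
        self.rng = random.Random(seed)  # local RNG for reproducibility

    def add(self, sample):
        """Offer one sample; every sample seen so far is kept with equal probability."""
        self.seen += 1
        if len(self.samples) < self.capacity:
            self.samples.append(sample)
        else:
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.samples[j] = sample

    def mixed_batch(self, new_batch, k):
        """Return the new batch followed by up to k samples drawn from memory."""
        replayed = self.rng.sample(self.samples, min(k, len(self.samples)))
        return list(new_batch) + replayed
```

A training loop would call `add` on each incoming sample and train on `mixed_batch(new_batch, k)` instead of `new_batch` alone, so the memory cost stays bounded by `capacity` regardless of how much traffic has been seen.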

Currently, we are working on a Continual Learning (CL) framework towards realizing the above design goals. This domain has many open challenges, such as demonstrating catastrophic forgetting, understanding the impact of task ordering on the performance of CL algorithms, and studying the effect of imbalanced datasets on CL algorithms.
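
Catastrophic forgetting is often quantified from a matrix of accuracies recorded after each task. The sketch below uses one commonly used definition (the best earlier accuracy on a task minus its accuracy after the last task, averaged over all but the last task). The function name and the numbers in the usage note are purely illustrative assumptions, not results from this project.

```python
def average_forgetting(acc):
    """Average forgetting over all tasks except the last one.

    acc[i][j] is the accuracy on task j measured after training on
    tasks 0..i. Only entries with j <= i are read; at least two tasks
    are required.
    """
    last = len(acc) - 1
    drops = []
    for j in range(last):
        # Best accuracy task j ever reached before the final task was learned.
        best_earlier = max(acc[i][j] for i in range(j, last))
        drops.append(best_earlier - acc[last][j])
    return sum(drops) / len(drops)
```

For a made-up three-task run with `acc = [[0.9, 0.0, 0.0], [0.6, 0.8, 0.0], [0.5, 0.7, 0.9]]`, the first task drops from 0.9 to 0.5 and the second from 0.8 to 0.7, giving an average forgetting of about 0.25. Repeating the experiment with the tasks presented in a different order and comparing the resulting values is one simple way to probe the effect of task ordering.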


PhD students: Suresh Kumar Amalapuram
Undergraduate students (Current): Danda Sarat Chandra sai, Uppala Sehouriey, Nisha M
Undergraduate students (Past): Akash Tadwai, Reetu Vinta, Thushara Reddy Tippireddy
Publications:

[C2] Suresh Kumar Amalapuram, Akash Tadwai, Reetu Vinta, Sumohana S Channappayya, Bheemarjuna Reddy Tamma, "Continual Learning for Anomaly based Network Intrusion Detection", in Proc. of IEEE International Conference on Communication Systems and Networks (COMSNETS), January 2022 (To appear).

[C1] Suresh Kumar Amalapuram, Thushara Reddy Tippireddy, Sumohana S Channappayya, Bheemarjuna Reddy Tamma, "On Handling Class Imbalance in Continual Learning based Network Intrusion Detection System", in Proc. of The ACM First International Conference on AI-ML-Systems, October 2021. [DOI Link]