A Normalizing Flow-Based Semi-Supervised Method for Imbalanced Network Intrusion Detection

Authors

  • Chaoqun Guo, School of Software Engineering, Beijing Jiaotong University, China
  • Shunjie Yang, School of Software Engineering, Beijing Jiaotong University, China
  • Jubao Cheng, School of Software Engineering, Beijing Jiaotong University, China
  • Dalin Zhang, Intelligent Systems and Security Laboratory of BJTU; School of Cyberspace Security, Beijing Jiaotong University, China

DOI:

https://doi.org/10.15837/ijccc.2025.4.6890

Keywords:

intrusion detection systems, normalizing flows, semi-supervised learning, data imbalance

Abstract

Intrusion Detection Systems (IDS) are integral to network security. In practice, however, network traffic data is often heavily imbalanced, and the imbalance affects both the labeled and the unlabeled data distributions. Such imbalance notably degrades the performance of existing intrusion detection methods, particularly in semi-supervised settings, where traditional approaches struggle to make effective use of large amounts of unlabeled data. This paper introduces a semi-supervised learning approach based on normalizing flows to mitigate the data imbalance problem in network intrusion detection. Normalizing flows construct flexible, invertible probabilistic models that can accurately capture and generate complex, high-dimensional network traffic data distributions. Specifically, the method uses a small amount of labeled data for initial training and then incorporates manifold learning and self-training on unlabeled data to adapt the model to the imbalance in the unlabeled data distribution, thereby improving overall detection performance. Experimental results demonstrate that the method outperforms traditional approaches in addressing data imbalance in intrusion detection. The proposed method not only improves detection accuracy and recall but also significantly reduces reliance on data distribution assumptions, demonstrating robustness and generalization across diverse network traffic datasets.
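
The invertibility mentioned in the abstract is what lets a normalizing flow assign exact densities through the change-of-variables rule, log p_X(x) = log p_Z(f(x)) + log |det J_f(x)|. The following is a minimal sketch of that rule for a single one-dimensional affine transform, not the architecture used in the paper. The function names and the `scale` and `shift` parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def standard_normal_logpdf(z):
    """Log density of the base distribution N(0, 1)."""
    return -0.5 * (z ** 2 + np.log(2.0 * np.pi))


def affine_flow_forward(x, scale, shift):
    """Map data x to latent z = (x - shift) / scale.

    Returns z and log|dz/dx|, the log-determinant of the Jacobian.
    (Hypothetical one-layer flow, assumed here for illustration.)
    """
    z = (x - shift) / scale
    log_det = -np.log(np.abs(scale)) * np.ones_like(x)
    return z, log_det


def affine_flow_inverse(z, scale, shift):
    """Map latent z back to data space; exact inverse of the forward map."""
    return z * scale + shift


def flow_log_density(x, scale, shift):
    """Change of variables: log p_X(x) = log p_Z(f(x)) + log|det J_f(x)|."""
    z, log_det = affine_flow_forward(x, scale, shift)
    return standard_normal_logpdf(z) + log_det


# Illustrative values only; these are not taken from the paper.
x = np.array([-1.0, 0.5, 3.0])
scale, shift = 2.0, 0.5

log_px = flow_log_density(x, scale, shift)
z, _ = affine_flow_forward(x, scale, shift)
x_back = affine_flow_inverse(z, scale, shift)
```

With these parameters the flow density coincides with a Gaussian of mean 0.5 and standard deviation 2.0, which gives a simple way to check the log-determinant term. Practical flows stack many such invertible layers with learned parameters.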

Published

2025-07-01
