Drift-Resilient Fraud Classification via Continual Learning with Replay-Guided Synthetic Minority Oversampling

Elena Rossi

doi:10.54097/9t0znt37

Keywords:

Continual learning, Concept drift, Fraud detection, Synthetic minority oversampling, Class imbalance, Experience replay, Non-stationary data streams

Abstract

Financial fraud detection systems operate on non-stationary data, where evolving fraud patterns and severe class imbalance jointly undermine the reliability of statically trained classifiers. This paper presents a drift-resilient fraud classification framework that integrates continual learning (CL) with a replay-guided module based on the synthetic minority oversampling technique (SMOTE), addressing both challenges in a single training loop. The system maintains an episodic memory buffer of selected minority-class exemplars that serve two roles: replay anchors that prevent catastrophic forgetting, and neighborhood seeds for generating synthetic samples across temporal periods. A drift-aware buffer management policy prioritizes minority instances that lie close to the decision boundary or were recently misclassified, so that a fixed memory capacity stays as informative as possible across successive temporal periods. Experimental evaluation on two benchmark fraud datasets under simulated concept drift of varying magnitude shows that the framework achieves a 12.4% improvement in area under the precision-recall curve (AUPRC) over standard gradient-boosted classifiers and a 9.7% gain over naive replay baselines. Ablation experiments confirm that each component contributes independently, with drift-aware buffer management identified as the single most impactful design choice. The results indicate that tightly coupling memory-guided oversampling with experience replay is a principled and effective strategy for robust, long-horizon fraud detection in non-stationary data streams.
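As an illustration of the cross-temporal oversampling idea described in the abstract, the following is a minimal sketch, not the paper's implementation. It assumes that synthetic minority samples are formed by standard SMOTE-style linear interpolation between a current-period minority instance and one of its nearest exemplars in the memory buffer. The function name `replay_guided_smote`, its parameters, and the Euclidean neighbor search are hypothetical choices made for this example; the abstract does not specify them.

```python
import numpy as np


def replay_guided_smote(current_minority, buffer_minority, n_synthetic, k=3, rng=None):
    """Interpolate between current minority points and buffered exemplars.

    Hypothetical sketch: the pairing rule (k nearest buffered exemplars by
    Euclidean distance) is an assumption, not taken from the paper.
    """
    rng = np.random.default_rng(rng)
    current_minority = np.asarray(current_minority, dtype=float)
    buffer_minority = np.asarray(buffer_minority, dtype=float)
    k = min(k, len(buffer_minority))

    # Pairwise Euclidean distances, shape (n_current, n_buffer).
    dists = np.linalg.norm(
        current_minority[:, None, :] - buffer_minority[None, :, :], axis=2
    )
    # Indices of the k nearest buffered exemplars for each current point.
    neighbours = np.argsort(dists, axis=1)[:, :k]

    # For each synthetic sample, pick a random seed and one of its neighbours.
    seed_idx = rng.integers(0, len(current_minority), size=n_synthetic)
    nbr_choice = rng.integers(0, k, size=n_synthetic)
    seeds = current_minority[seed_idx]
    partners = buffer_minority[neighbours[seed_idx, nbr_choice]]

    # SMOTE interpolation: seed + lam * (partner - seed), with lam in [0, 1).
    lam = rng.random((n_synthetic, 1))
    return seeds + lam * (partners - seeds)


# Toy data: two current-period fraud cases and three buffered exemplars.
current = [[0.0, 0.0], [1.0, 1.0]]
buffer_ = [[2.0, 2.0], [3.0, 3.0], [0.0, 1.0]]
synthetic = replay_guided_smote(current, buffer_, n_synthetic=50, k=3, rng=0)
```

In a training loop of the kind the abstract describes, the synthetic rows would presumably be combined with the current batch and the replayed buffer exemplars before each update. How the buffer itself is refreshed under the drift-aware policy is outside the scope of this sketch.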
