Journal of Shanghai University (Natural Science Edition) ›› 2025, Vol. 31 ›› Issue (4): 719-734. doi: 10.12066/j.issn.1007-2861.2594

• Information Engineering •

Few-shot encrypted traffic classification model incorporating MAML and contrastive learning

JIN Yanliang1,2, FANG Jie1,2, GAO Yuan1,2

  1. School of Communication and Information Engineering, Shanghai University, Shanghai 200444, China;
  2. Shanghai Institute for Advanced Communication and Data Science, Shanghai University, Shanghai 200444, China
  • Received: 2024-01-29 Online: 2025-08-31 Published: 2025-09-16
  • Corresponding author: JIN Yanliang (b. 1973), associate professor and doctoral supervisor; research interests: big data and network security, artificial intelligence. E-mail: jinyanliang@staff.shu.edu.cn
  • Funding:
    Project funded by the Science and Technology Commission of Shanghai Municipality (XTCK-KJ-2022-68); Innovation Program project funded by the Science and Technology Commission of Shanghai Municipality (22511103202)

Abstract: To address the scarcity of labeled encrypted traffic and to adapt quickly to classification tasks for emerging traffic, this paper proposes a few-shot encrypted traffic classification model that combines model-agnostic meta-learning (MAML) with contrastive learning. Specifically, a supervised contrastive loss is introduced into the inner-loop optimization of MAML, so that the embeddings produced when session flows pass through the feature encoding network are more distinguishable in the label space, which yields general meta-knowledge shared across multiple tasks. With this meta-knowledge, adapting to a new task requires only a small amount of labeled data to learn quickly and perform well on the target task. Experiments on the public dataset ISCXVPN-NonVPN2016 and on a private dataset show that the proposed method outperforms existing few-shot classification methods. On the 2way-10shot task, it achieves 97.46% accuracy and a 97.12% F1-score on the public dataset, and 95.19% accuracy and a 94.96% F1-score on the private dataset. The proposed model also alleviates the inter-class similarity and intra-class variation problems that MAML struggles to handle: on the 5way-10shot task of the public dataset, its accuracy and F1-score improve on MAML by 3.62% and 3.70%, respectively.

Key words: encrypted traffic classification, few-shot, MAML, meta-learning, contrastive learning
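
The supervised contrastive loss that the abstract adds to the MAML inner loop can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function name, the temperature of 0.1, the averaging over anchors, and the toy embeddings are all assumptions made for illustration, and the abstract does not say how the loss is weighted against the classification loss.

```python
import math


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """SupCon-style loss for one batch of embeddings, e.g. a task's support set.

    Illustrative sketch only: the name, the default temperature and the
    averaging over anchors are assumptions, not details from the paper.
    Expects at least two non-zero embeddings and at least one class that
    appears twice in `labels`.
    """
    # L2-normalize each embedding so dot products become cosine similarities.
    unit = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec))
        unit.append([x / norm for x in vec])

    n = len(unit)
    per_anchor = []
    for i in range(n):
        # Temperature-scaled similarity of anchor i to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(unit[i], unit[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Log of the softmax denominator, computed stably (log-sum-exp).
        top = max(sims.values())
        log_denom = top + math.log(sum(math.exp(s - top) for s in sims.values()))
        # Positives are the other samples that share the anchor's label.
        positives = [j for j in sims if labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        mean_log_prob = sum(sims[j] - log_denom for j in positives) / len(positives)
        per_anchor.append(-mean_log_prob)
    return sum(per_anchor) / len(per_anchor)


# Four toy 2-D embeddings that form two tight clusters (hypothetical data).
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
aligned = supervised_contrastive_loss(toy, [0, 0, 1, 1])   # labels follow the clusters
shuffled = supervised_contrastive_loss(toy, [0, 1, 0, 1])  # labels cut across the clusters
print(aligned < shuffled)
```

The loss is small when samples of the same class sit close together and away from other classes, and large when the labels cut across the clusters. Minimizing it therefore pushes the encoder toward embeddings that are easier to separate by label, which is the effect the abstract attributes to the modified inner loop.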

CLC number: