Identification Method of Encrypted Data Flow Based on Chain-Building Information
Affiliation:

1. Command and Control Engineering College, Army Engineering University of PLA, Nanjing 210007, China; 2. North China Institute of Computer Technology, Beijing 100083, China

Fund Project:

National Natural Science Foundation of China (6207625).
    Abstract:

    To address the difficulty of identifying encrypted traffic, a method based on chain-building (connection-establishment) information is proposed, which uses a neural network to extract the characteristics of encrypted traffic from the data exchanged while the connection is being established. First, the traffic exchanged between client and server during the establishment phase of the encrypted connection is captured, and its first 1024 bytes are converted into a grayscale image. A convolutional neural network then extracts the image features, from which the category features of the encrypted data flow are obtained. Because the category information can be extracted during the establishment phase, the method supports early identification, which allows the identification and the control of encrypted traffic to be closely combined. In addition, to address the unbounded attribute set of background traffic and the resulting incompleteness of the training data, an approximate-completeness method is proposed that adds random data to the background traffic for data augmentation. Tests in a real environment show that the method reaches an accuracy of 95.4% with an identification time of 0.1 ms, clearly outperforming the comparison algorithms.
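    The two preprocessing ideas in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the 32 x 32 image shape, the zero padding of short flows, and the `ratio` parameter (a hypothetical stand-in for the ratio R studied in Fig.9, whose exact definition is not given here) are all assumptions made for illustration.

```python
import random
from typing import List

IMAGE_SIDE = 32                      # assumption: 32 x 32 = 1024 pixels
N_BYTES = IMAGE_SIDE * IMAGE_SIDE    # the first 1024 bytes named in the abstract


def flow_to_gray_image(payload: bytes) -> List[List[int]]:
    """Map the first 1024 bytes of a flow to a 32 x 32 grid of 0-255 pixel values."""
    head = payload[:N_BYTES]
    # Assumption: flows shorter than 1024 bytes are right-padded with zeros.
    head = head + bytes(N_BYTES - len(head))
    return [list(head[r * IMAGE_SIDE:(r + 1) * IMAGE_SIDE])
            for r in range(IMAGE_SIDE)]


def augment_background(background: List[bytes], ratio: float,
                       seed: int = 0) -> List[bytes]:
    """Append uniformly random byte strings to the background class.

    `ratio` is a hypothetical knob: the number of random samples added
    per real background sample.
    """
    rng = random.Random(seed)
    n_random = round(ratio * len(background))
    random_flows = [bytes(rng.getrandbits(8) for _ in range(N_BYTES))
                    for _ in range(n_random)]
    return background + random_flows
```

    Each byte becomes one pixel, so the resulting grid can be fed to a convolutional network as a single-channel image; the random samples are labeled as background so that the classifier sees inputs outside the known traffic classes.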

    Table 1 Non-categorical features of TCP packets and their locations
    Table 2 Structure and parameters of the neural network
    Table 3 Details of the data sets
    Table 4 Comparison of experiment settings
    Table 5 Index comparison of the two experiments
    Table 6 Experiment results of the best cases in the groups without and with random data
    Table 7 Comparison of experimental results between our method and the baseline method
    Fig.1 Two stages of an encrypted connection
    Fig.2 Network flow and session
    Fig.3 Data packet structure of the TCP/IP protocol
    Fig.4 Overall process of encrypted traffic identification
    Fig.5 Process of generating grayscale images from traffic files
    Fig.6 Structure of the convolutional neural network
    Fig.7 Background traffic data augmentation using random data
    Fig.8 Working principles of Shadowsocks and v2ray
    Fig.9 Influence of the ratio R on the recognition effect
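Cite this article

JIANG Kaolin, BAI Wei, REN Chuanlun, ZHANG Lei, CHEN Jun, PAN Zhisong, GUO Shize. Identification Method of Encrypted Data Flow Based on Chain-Building Information[J]. Journal of Data Acquisition and Processing, 2021, 36(3): 595-604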

History
  • Received: 2020-11-07
  • Last revised: 2021-01-10
  • Published online: 2021-05-25