Data Collection and Feature Analysis of Server Energy Consumption in Data Center
(Original title: 数据中心内服务器能耗数据采集及特征分析)

Affiliation:

1. School of Computer Science and Technology, Nanjing University of Posts and Telecommunications, Nanjing 210023, China; 2. Jiangsu Key Laboratory of Big Data Security and Intelligent Processing, Nanjing 210023, China

Fund Project:

Supported by the National Key Research and Development Program of China (2018YFB1003702) and the Scientific Research and Practice Innovation Fund (KYCX20_0760).

    Abstract:

    The high energy consumption and low energy efficiency of data centers have attracted extensive attention and research. However, no public dataset of server energy consumption in data centers is available to researchers, and existing filter-based feature selection does not meet the needs of operations and maintenance engineers. To address this, a simulation environment architecture is proposed that emulates the running state of servers in a data center. Based on this architecture, multiple performance metrics and energy consumption data are collected while the server runs various types of tasks. Causality-based feature selection is then applied to the feature analysis of the energy consumption datasets, yielding interpretable feature subsets and energy consumption prediction results. Experimental results show that the causal feature subset is about 1/3 to 1/6 the size of the filter feature subset, and that models trained on the causal feature subset achieve the best prediction accuracy in 75% of cases.
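    For readers unfamiliar with the filter-style baseline that the abstract contrasts with causal feature selection, the following is a minimal sketch of ranking features by absolute Pearson correlation with power draw. It is not the authors' implementation and does not reproduce their causal method; the function name `pearson_filter`, the three synthetic metrics, and the linear power model are all assumptions made purely for illustration.

```python
import numpy as np

def pearson_filter(X, y, k):
    """Rank features by |Pearson r| against y and return the top-k indices.

    Hypothetical helper for illustration; returns (top_k_indices, all_scores).
    """
    Xc = X - X.mean(axis=0)          # center each feature column
    yc = y - y.mean()                # center the target
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant columns have zero variance; score them 0 instead of NaN.
    safe = np.where(denom > 0, denom, 1.0)
    r = np.where(denom > 0, (Xc * yc[:, None]).sum(axis=0) / safe, 0.0)
    order = np.argsort(-np.abs(r), kind="stable")
    return order[:k], r

# Synthetic stand-in for collected server metrics (assumed, not real data).
rng = np.random.default_rng(0)
n = 500
cpu_util = rng.uniform(0, 100, n)      # strongly related to power
mem_util = rng.uniform(0, 100, n)      # weakly related to power
noise_metric = rng.normal(size=n)      # unrelated to power
X = np.column_stack([cpu_util, mem_util, noise_metric])
power = 100 + 1.5 * cpu_util + 0.6 * mem_util + rng.normal(scale=2.0, size=n)

selected, scores = pearson_filter(X, power, k=2)
print("selected feature indices:", selected.tolist())
print("Pearson scores:", np.round(scores, 3).tolist())
```

    A filter of this kind scores each feature independently of the others, so correlated but redundant metrics can all be retained. That is one plausible reason the filter subsets reported in the abstract are several times larger than the causal ones.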

    Table 1 Server configuration information
    Table 2 Feature naming and physical meaning
    Table 3 Statistics of raw dataset
    Table 4 Statistics of processed dataset
    Table 5 Size of feature subset
    Table 6 Causal feature subset of ffmpeg data set
    Table 7 Model forecast results
    Fig. 1 Feature types and their relationship
    Fig. 2 Architecture of simulation environment
    Fig. 3 Fitting curves based on all features
    Fig. 4 Fitting curves based on Pearson features
    Fig. 5 Fitting curves based on Spearman features
    Fig. 6 Fitting curves based on Chaos features
    Fig. 7 Fitting curves based on Hiton-PC features
    Fig. 8 Loss curves of LSTM on wc98-67 data set
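Cite this article

周清, 张諝晟, 沈子钰, 李云. Data Collection and Feature Analysis of Server Energy Consumption in Data Center [J]. 数据采集与处理 (Journal of Data Acquisition and Processing), 2021, 36(5): 986-995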

History
  • Received: 2020-11-11
  • Last revised: 2021-09-15
  • Accepted:
  • Published online: 2021-09-25