A Prediction Model for Advertising Click Conversion Rate Based on Feature Engineering
Affiliation:

1. School of Applied Mathematics, Guangdong University of Technology, Guangzhou, 510520, China; 2. Mininglamp Technology, Guangzhou, 510300, China; 3. School of Computers, Guangdong University of Technology, Guangzhou, 510006, China

Fund Project:

Supported by the National Natural Science Foundation of China (61673122) and the Natural Science Foundation of Guangdong Province (2019A1515010548).

    Abstract:

    With the rapid growth of the global online advertising industry in the big-data era, computational advertising has attracted increasing attention. Computational advertising aims to deliver ads to specific audiences: it analyzes the advertising context and user characteristics and selects the best-matching ad from a candidate ad library. Its core problem is click conversion rate prediction, which identifies the ads a user is most likely to click. Accurate prediction of the ad click conversion rate bears directly on the interests of three parties: publishers, advertisers and users. Using real advertising data provided by the TrackMaster platform, this study takes a feature-engineering perspective and analyzes features from four angles: user information features, ad information features, context features and statistical features. It thereby identifies the features that most strongly affect the ad click conversion rate, constructs and trains a layered click conversion rate prediction model, and uses the LightGBM algorithm to rank the features by importance. The experimental results show that with a feature selection threshold of 0.95, 19 selected features and 100 trees, the area under the receiver operating characteristic curve (AUC) reaches its maximum and the model's logarithmic loss is about 0.1368; the model performs best under this setting. The prediction model and the feature ranking can help enterprises formulate optimal advertising strategies.
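The abstract names two evaluation metrics (AUC and logarithmic loss) and a feature selection threshold of 0.95. The sketch below illustrates how such quantities are commonly computed, in plain Python with no external libraries. It is not the authors' implementation. In particular, reading the 0.95 threshold as a cumulative share of normalized feature importance is an assumption, and the function names and the feature names in the usage note are hypothetical.

```python
import math


def log_loss(y_true, y_prob, eps=1e-15):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)


def auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    This pairwise form is O(n_pos * n_neg), fine for small illustrations.
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def select_by_cumulative_importance(importances, threshold=0.95):
    """Keep top-ranked features until their normalized importance sum
    reaches `threshold`.

    ASSUMPTION: this interprets the abstract's "feature selection threshold"
    as a cumulative-importance cutoff; the paper may define it differently.
    `importances` maps feature name -> non-negative importance score.
    """
    total = sum(importances.values())
    ranked = sorted(importances.items(), key=lambda kv: kv[1], reverse=True)
    selected, cumulative = [], 0.0
    for name, value in ranked:
        selected.append(name)
        cumulative += value / total
        if cumulative >= threshold:
            break
    return selected
```

For example, `select_by_cumulative_importance({"user_age": 50, "ad_id": 30, "hour": 16, "os": 4})` keeps the three highest-ranked (hypothetical) features, since together they account for 96% of the total importance. In practice the importance scores would come from a trained gradient-boosting model rather than being typed in by hand.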

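Cite this article

DENG Xiuqin, XIE Weihuan, LIU Fuchun, ZHANG Yifei, FAN Juan. A Prediction Model for Advertising Click Conversion Rate Based on Feature Engineering[J]. Journal of Data Acquisition and Processing, 2020, 35(5): 842-849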

History
  • Received: 2020-02-16
  • Last revised: 2020-04-27
  • Accepted:
  • Published online: 2020-09-25