User Identity Resolution Across Heterogeneous Social Platforms

Affiliation:

School of Computer Science and Engineering, Shenyang Jianzhu University, Shenyang 110168, China

Fund Project:

National Natural Science Foundation of China (62073227); National Key Research and Development Program of China (2021YFF0306303); Natural Science Foundation of Liaoning Province (2019-MS-264); Liaoning Provincial Department of Education Project (LJKZ0582); China Academic Degrees and Graduate Education Research Project (2020MSA40).

    Abstract:

    User identity resolution across social platforms is an important research direction in social network analysis because it integrates information about the same user from different platforms. Most existing work targets platforms of similar type, where the information on each platform is relatively symmetric, and decides whether two accounts belong to the same user by comparing the similarity of profile attributes, spatial locations, network relationships, and other information. On two heterogeneous social platforms, however, user information is asymmetric, and the corresponding attributes needed for identity resolution are difficult to obtain directly. This paper studies user identity resolution between comment-type and activity-type platforms. To address the asymmetry of user information attributes between the two types of platform, user information is organized into three types: profile attributes, semantic sequences, and feature word sequences. The corresponding information is extracted from each platform to establish mapping relationships, and an integrated matching algorithm that combines the three types of information is proposed. To account for the time offset in user activities, back propagation learning is used to obtain time offset weights, and a similarity measure for semantic sequences and feature word sequences based on back propagation learning is proposed. An overall similarity measure is also designed for user identity resolution. Experiments on real datasets show that the proposed user identity resolution algorithm is effective.
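The ideas of a time-offset-weighted sequence similarity and an overall similarity can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names, the Jaccard overlap, the best-match rule over offsets, the linear combination, and all weight values are assumptions made for illustration. In particular, the offset weights are hand-picked here, whereas the paper learns them by back propagation.

```python
def jaccard(a, b):
    """Jaccard overlap of two word sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def sequence_similarity(seq_a, seq_b, offset_weights):
    """Compare two feature word sequences while tolerating time offsets.

    seq_a, seq_b: dict mapping a time slot (int) to a set of feature words.
    offset_weights: dict mapping a time offset (int) to a weight.
    Hypothetical scheme: each slot of seq_a takes its best weighted match
    among the offset slots of seq_b; the result is averaged over seq_a.
    """
    if not seq_a:
        return 0.0
    total = 0.0
    for slot, words_a in seq_a.items():
        best = 0.0
        for offset, weight in offset_weights.items():
            words_b = seq_b.get(slot + offset)
            if words_b is not None:
                best = max(best, weight * jaccard(words_a, words_b))
        total += best
    return total / len(seq_a)


def overall_similarity(profile_sim, semantic_sim, word_sim,
                       weights=(1 / 3, 1 / 3, 1 / 3)):
    """Assumed linear combination of the three component similarities."""
    w_profile, w_semantic, w_word = weights
    return (w_profile * profile_sim
            + w_semantic * semantic_sim
            + w_word * word_sim)


# Toy data: a comment on day 0 matches an activity on day 1 (offset +1).
comments = {0: {"concert", "jazz"}, 3: {"hiking"}}
activities = {1: {"concert", "jazz"}, 3: {"hiking", "trail"}}
offset_weights = {0: 1.0, 1: 0.5, -1: 0.5}  # hand-picked, not learned

word_sim = sequence_similarity(comments, activities, offset_weights)
score = overall_similarity(0.8, 0.5, word_sim, weights=(0.5, 0.25, 0.25))
```

A pair of accounts would then be treated as the same user when the score exceeds a chosen threshold; the threshold, like the weights, would have to be tuned on labeled data.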

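Cite this article

Liu Junling, Liu Ying, Ma Chenxu, Zhao Qiaona, Sun Huanliang, Xu Jingke. User identity resolution across heterogeneous social platforms[J]. Journal of Data Acquisition and Processing, 2022, 37(5): 1101-1116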

History
  • Received: 2021-08-23
  • Last revised: 2022-01-21
  • Published online: 2022-09-25