Citation: ZHANG Kejun, YANG Bingru, ZHAO Geng, QU Wenlong, LI Xin. Construction and algorithms of distributed web usage pattern mining[J]. Chinese Journal of Engineering, 2006, 28(9): 896-901. DOI: 10.13374/j.issn1001-053x.2006.09.020

Construction and algorithms of distributed web usage pattern mining

  • Abstract: A distributed Web log mining system model, DWLMS, is presented. Based on an analysis of the procedure and algorithms for mining frequent Web access patterns, two incremental updating algorithms built on DWLMS are proposed for discovering frequent access paths in a distributed database system: LFP, which updates local frequent paths, and GFP, which updates global frequent paths. The algorithms ease the difficulties that remotely stored Web access information, its real-time growth, and the communication volume of distributed algorithms cause for pattern analysis. The method was given a simple implementation in the laboratory and tested on real-world Web log data; the results indicate that the algorithms are effective.
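The abstract names LFP and GFP but does not describe their steps, so the following is only a generic illustration of the local/global idea, not the paper's algorithms. It assumes that each site holds sessions as lists of page URLs, that a "path" is a contiguous run of pages, and that support is the fraction of sessions containing a path. The function names, the `max_len` limit, and the sample data are all hypothetical. The sketch also ships every local count to the merge step, so it makes no attempt to reduce communication volume.

```python
from collections import Counter


def contiguous_paths(session, max_len):
    """Yield every contiguous sub-path of a session, up to max_len pages."""
    for start in range(len(session)):
        last = min(start + max_len, len(session))
        for end in range(start + 1, last + 1):
            yield tuple(session[start:end])


def local_path_counts(sessions, max_len=3):
    """Per-site step: count how many sessions contain each path."""
    counts = Counter()
    for session in sessions:
        # set() so a path repeated inside one session is counted once
        counts.update(set(contiguous_paths(session, max_len)))
    return counts


def global_frequent_paths(site_counts, total_sessions, min_support):
    """Merge step: sum per-site counts, keep paths meeting min_support."""
    merged = Counter()
    for counts in site_counts:
        merged.update(counts)
    return {
        path: count
        for path, count in merged.items()
        if count / total_sessions >= min_support
    }


# Hypothetical sessions stored at two sites.
site_a = [["/", "/a", "/b"], ["/", "/a"]]
site_b = [["/", "/a", "/c"], ["/a", "/b"]]

counts_a = local_path_counts(site_a)
counts_b = local_path_counts(site_b)
frequent = global_frequent_paths([counts_a, counts_b], 4, 0.75)

# Incremental update: only the new batch at site A is scanned,
# and its counts are added to the counts already held for that site.
new_batch = [["/a", "/b"]]
counts_a.update(local_path_counts(new_batch))
updated = global_frequent_paths([counts_a, counts_b], 5, 0.6)
```

Under these assumptions, `frequent` holds the paths that appear in at least three of the four sessions, and `updated` reflects the new batch without rescanning the older logs at either site.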

     
