Title: Massive Fishing Website URL Parallel Filtering Method
Authors: Xu, Dongliang; Pan, Jingchang; Du, Xiaojiang; Wang, Bailing; Liu, Meng; Kang, Qinma
Author affiliations: [Xu, Dongliang; Pan, Jingchang; Liu, Meng; Kang, Qinma] Shandong Univ, Sch Mech Elect & Informat Engn, Weihai 264209, Peoples R China.; [Du, Xiaojia
Corresponding author: Du, XJ
Corresponding author address: [Du, XJ] Temple Univ, Dept Comp & Informat Sci, Philadelphia, PA 19122 USA.
Source: IEEE ACCESS
Publication year: 2018
Volume: 6
Pages: 2378-2388
DOI: 10.1109/ACCESS.2017.2782847
Keywords: URL filtering; randomized fingerprint model; GRFP-WM
Abstract: A randomized fingerprint model is proposed, which can effectively reduce the false positive rate by generating a unique fingerprint for each URL. The model is also used to improve the Wu and Manber (WM) algorithm, which is a multi-string matching algorithm; as a result, a randomized fingerprint WM (RFP-WM) algorithm is proposed. Furthermore, a Graphics Processing Unit (GPU)-based parallel randomized fingerprint algorithm (GRFP-WM) is implemented. Experimental results indicate that, for a massive pattern set containing more than a million URLs, the efficiency of the RFP-WM algorithm is 20% higher than that of the WM algorithm. The WM algorithm's efficiency is approximately 7% higher than that of the Aho and Corasick (AC) algorithm, which is also a multi-string matching algorithm. The efficiency and speedup of the GRFP-WM algorithm are higher than those of the GPU-based WM and the GPU-based AC algorithms. These results indicate that the randomized fingerprint model can effectively reduce the collision rate and improve the efficiency of the algorithm.
Indexed in: SCIE
Resource type: Journal article