Title: A deep learning-based RNNs model for automatic security audit of short messages
Authors: You, Lina; Li, Yujun; Wang, Yue; Zhang, Jie; Yang, Yang
Author affiliation: [You, Lina; Li, Yujun; Wang, Yue; Zhang, Jie; Yang, Yang] School of Information Science and Engineering, Shandong University, Jinan, China
Conference name: 16th International Symposium on Communications and Information Technologies, ISCIT 2016
Conference date: 26 September 2016 through 28 September 2016
Source: 2016 16th International Symposium on Communications and Information Technologies, ISCIT 2016
Keywords: deep learning; recurrent neural networks (RNNs); short message security audit; word2vec
Abstract: Traditional text classification methods usually follow this process: first, a sentence is treated as a bag of words (BOW), then transformed into a sentence feature vector that is classified by methods such as maximum entropy (ME), Naive Bayes (NB), or support vector machines (SVM). However, these methods usually do not yield ideal results when applied to text classification. The main reason is that the semantic relations between words are very important for text categorization, yet the traditional methods cannot capture them. Sentiment classification, a special case of text classification, is a binary task (positive or negative). Inspired by sentiment analysis, we use a novel deep learning-based recurrent neural networks (RNNs) model for automatic security audit of short messages from prisons, which classifies short messages as secure or non-secure. In this paper, the features of short messages are extracted by word2vec, which captures word order information, and each sentence is mapped to a feature vector. In particular, words with similar meanings are mapped to nearby positions in the vector space, and the resulting vectors are then classified by RNNs. RNNs are now widely used, and their network structure makes them well suited to processing sequence data. We preprocess short messages, extract typical features from existing secure and non-secure short messages via word2vec, and classify short messages through RNNs, which accept a fixed-sized vector as input and produce a fixed-sized vector as output. The experimental results show that the RNNs model achieves an average accuracy of 92.7%, which is higher than that of SVM. © 2016 IEEE.
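
The pipeline described in the abstract (word vectors fed step by step into a recurrent network, followed by a binary decision) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the architecture, so a plain Elman-style recurrence is assumed here, and the vocabulary, dimensions, random embedding table (standing in for trained word2vec vectors), untrained weights, and the names `rnn_score` and `classify` are all hypothetical.

```python
import math
import random

random.seed(0)

EMBED_DIM = 4    # assumed toy embedding size
HIDDEN_DIM = 3   # assumed hidden-state size

# Hypothetical embedding table standing in for trained word2vec vectors.
VOCAB = ["meet", "at", "the", "gate", "tonight", "hello", "mother"]
EMBEDDINGS = {w: [random.uniform(-1, 1) for _ in range(EMBED_DIM)] for w in VOCAB}
UNKNOWN = [0.0] * EMBED_DIM  # out-of-vocabulary words map to a zero vector


def rand_matrix(rows, cols):
    """Small random matrix; a real model would learn these weights."""
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Untrained parameters (assumption: training is outside this sketch).
W_xh = rand_matrix(HIDDEN_DIM, EMBED_DIM)   # input-to-hidden
W_hh = rand_matrix(HIDDEN_DIM, HIDDEN_DIM)  # hidden-to-hidden
b_h = [0.0] * HIDDEN_DIM
w_out = [random.uniform(-0.5, 0.5) for _ in range(HIDDEN_DIM)]
b_out = 0.0


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def rnn_score(tokens):
    """Run an Elman-style recurrence over the tokens and return a score in (0, 1)."""
    h = [0.0] * HIDDEN_DIM
    for tok in tokens:
        x = EMBEDDINGS.get(tok, UNKNOWN)
        pre = [a + b + c for a, b, c in zip(matvec(W_xh, x), matvec(W_hh, h), b_h)]
        h = [math.tanh(p) for p in pre]  # hidden state carries earlier words forward
    logit = sum(w * hi for w, hi in zip(w_out, h)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid over the final hidden state


def classify(message, threshold=0.5):
    """Label a message; the labels follow the abstract, the threshold is assumed."""
    tokens = message.lower().split()
    return "non-secure" if rnn_score(tokens) >= threshold else "secure"


if __name__ == "__main__":
    print(classify("meet at the gate tonight"))
    print(classify("hello mother"))
```

Because the hidden state is updated one word at a time, the same words in a different order generally produce a different score, which a bag-of-words feature vector cannot express. With untrained weights the printed labels carry no meaning; they only show the data flow.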