A Survey of Research on Fine-grained Sentiment Analysis in Chinese

2017-11-14 11:05 Yimeng Tang, Youwei Yu

西部论丛, 2017, Issue 6

Yimeng Tang, Youwei Yu

Abstract: This paper reviews research progress in fine-grained sentiment analysis and its classification methods (namely, machine learning classification and classification based on dependency syntax and lexicons). Finally, the application prospects of fine-grained text analysis are introduced. This study helps readers understand the key issues and key methods of current research on fine-grained sentiment analysis.

Keywords: Fine-grained sentiment analysis; Evaluation word extraction; Attribute word

*Corresponding Author: Yimeng Tang (921154624@163.com)

1. Introduction

The internet has become an important communication platform. While promoting online communication, it has also produced a large volume of commentary, which creates a demand for sentiment analysis of the text generated on internet platforms. Public opinion monitoring technology includes text clustering, topic extraction, and the rapid generation of briefings, charts, and other analysis results, which provide a basis for grasping trends in online public opinion and guiding it correctly. However, the sentiment analysis of massive amounts of text cannot be handled manually. Analyzing participants' sentiment accurately and quickly from the massive text data on internet platforms has therefore become a hot research topic.

2. Definition of Sentiment Analysis

Sentiment analysis is also called opinion mining. Traditional text sentiment analysis is mostly coarse-grained and no longer meets practical needs, so researchers have proposed fine-grained sentiment analysis methods for text. At present, research on sentiment analysis in China focuses mainly on fine-grained sentiment analysis. This article reviews the sentiment classification methods for text in the current literature on fine-grained sentiment analysis, and focuses on the main issues and methods at the fine-grained level.

3. Process of Sentiment Analysis

There are two approaches to sentiment analysis: analysis based on dependency grammar and dictionaries, and analysis based on machine learning. The former is roughly divided into extracting subjective sentences and syntax rules, identifying sentiment words in sentences, and calculating sentiment scores from a sentiment lexicon to obtain sentiment tendency and sentiment strength. The latter consists of extracting features, selecting features, and obtaining classification results.

3.1 Analysis Based on Machine Learning

Classification based on machine learning trains on a large number of labeled samples to extract effective features, constructs a classification model, and then performs sentiment classification [1]. Because sentiment analysis requires many training samples, Su et al. [2] combined a naive Bayes model with Latent Dirichlet Allocation (LDA) to build suitable sentiment dictionaries and to perform sentiment tendency analysis without a labeled corpus. Fan et al. [3] proposed a topic and sentiment analysis method for text based on a hybrid model. Other researchers have proposed hybrid models that combine deep learning with sentiment dictionaries, or machine learning with a sentiment lexicon. Ding [4] found that combining a dictionary with LDA yields higher accuracy than using the dictionary alone. Reported results also show that a double-layer CRF model improves the sentiment entity recognition rate relative to a single-layer linear-chain CRF model. Hybrid models can thus combine the advantages of machine learning and dictionaries, and they outperform approaches that use only deep learning or only machine learning.
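As a minimal sketch of the machine-learning route described above (labeled samples, features, model, classification), the following trains a multinomial naive Bayes classifier with add-one smoothing on bag-of-words features. The toy training data, the labels, and all function names are invented for illustration and are not taken from [1] or [2]; reviews are assumed to be already segmented into tokens.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(samples):
    """samples: list of (tokens, label). Returns the counts needed for prediction."""
    label_counts = Counter()            # number of documents per label
    word_counts = defaultdict(Counter)  # word frequencies per label
    vocab = set()
    for tokens, label in samples:
        label_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return label_counts, word_counts, vocab


def predict(model, tokens):
    """Return the label with the highest log posterior for the given tokens."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy, hand-made training data (hypothetical, not from any cited corpus).
TRAIN = [
    (["screen", "clear", "good"], "pos"),
    (["battery", "good", "durable"], "pos"),
    (["screen", "blurry", "bad"], "neg"),
    (["battery", "bad", "weak"], "neg"),
]
model = train_naive_bayes(TRAIN)
```

On this toy data, `predict(model, ["good", "screen"])` returns `"pos"`. A practical system would add the feature selection step mentioned above and train on a far larger labeled corpus.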

Some researchers use deep learning methods to perform sentiment analysis on feature vectors generated from words. Jiang [5] obtained word vector features, fed them into a Long Short-Term Memory (LSTM) network, and used distant supervision to generate a large number of samples to mitigate overfitting. The approach was compared with the MIML-SF model, which combines a classifier with distant supervision, and with the CNN-SF model, which is built on a convolutional neural network. The results show that LSTM has advantages in handling sequential information and in performance. Although neural networks perform well in many fields, they generally require large volumes of data, have many parameters, and place high demands on the hardware they run on. Therefore, few researchers use deep learning methods alone.

3.2 Sentiment Analysis Based on Dependency Syntax and Dictionary

Sentiment analysis based on dependency syntax and dictionaries is mainly divided into the following steps: building a sentiment dictionary, extracting subjective sentences, dependency parsing, and combining dictionary resources with syntax for fine-grained calculation.
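The dictionary-based steps can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the three small lexicons are hypothetical, sentences are assumed to be already tokenized, and a left-to-right scan for negation and degree words stands in for real dependency parsing, which a full system would use to attach modifiers to the correct sentiment word.

```python
# Hypothetical toy lexicons; a real system would load a full sentiment dictionary.
SENTIMENT = {"good": 1.0, "excellent": 2.0, "bad": -1.0, "terrible": -2.0}
DEGREE = {"very": 1.5, "slightly": 0.5}
NEGATION = {"not", "never"}


def is_subjective(tokens):
    """Treat a sentence as subjective if it contains any sentiment word."""
    return any(tok in SENTIMENT for tok in tokens)


def score_sentence(tokens):
    """Return a signed score: the sign is the polarity, the magnitude the strength."""
    score = 0.0
    weight = 1.0  # accumulated modifier for the next sentiment word
    for tok in tokens:
        if tok in NEGATION:
            weight *= -1.0          # negation flips polarity
        elif tok in DEGREE:
            weight *= DEGREE[tok]   # degree adverbs scale strength
        elif tok in SENTIMENT:
            score += weight * SENTIMENT[tok]
            weight = 1.0            # modifiers are consumed by this sentiment word
    return score


def score_text(sentences):
    """Sum sentence scores over subjective sentences only."""
    return sum(score_sentence(s) for s in sentences if is_subjective(s))
```

For example, `score_sentence(["the", "screen", "is", "very", "good"])` gives `1.5` and `score_sentence(["not", "good"])` gives `-1.0`. The linear scan misattributes modifiers in longer sentences, which is the gap dependency parsing is meant to close.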

3.2.1 Sentiment Word Extraction

Sentiment word extraction based on sentiment knowledge uses an existing sentiment dictionary to assign sentiment values to the words or evaluation units in the text that carry sentiment, and then calculates the sentiment tendency of the whole text. The same word can carry different sentiment in different domains. For example, "high" is negative in "the high energy consumption of such a car" but positive in "the high visibility of the light stick at night". Therefore, research in a given field needs to expand the dictionary for that field. Some scholars have proposed cross-language sentiment classification, that is, using a more complete English sentiment dictionary for Chinese sentiment analysis. Tang and Liu [6] proposed a cross-language fine-grained sentiment analysis algorithm based on dependency syntax. Compared with the original method for extracting sentiment evaluation units, this method improves extraction efficiency to some extent. It first extracts the sentiment evaluation units and then translates them, which reduces the dependence on machine translation and makes effective use of the richer English vocabulary resources. Machine translation also tends to map Chinese sentiment units to higher-frequency basic English vocabulary. The method combines the advantages of synonym merging and an extended sentiment lexicon, which is especially useful for languages that lack corpus resources, such as some minority languages. Combining it with a synonym dictionary merges words with similar meanings, so the dimension of the word vector is reduced.
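The dimension reduction from synonym merging can be illustrated with a small sketch. The synonym dictionary and the documents below are hypothetical, and the bag-of-words representation is an assumption chosen for simplicity; it is not claimed to be the representation used in [6].

```python
# Hypothetical synonym dictionary: each word maps to a canonical representative.
SYNONYMS = {"great": "good", "nice": "good", "awful": "bad", "poor": "bad"}


def normalize(tokens):
    """Replace each token with its canonical synonym, if one is defined."""
    return [SYNONYMS.get(t, t) for t in tokens]


def build_vocab(docs):
    """Sorted list of the distinct tokens across all documents."""
    return sorted({t for doc in docs for t in doc})


def vectorize(tokens, vocab):
    """Bag-of-words count vector over the given vocabulary."""
    return [tokens.count(w) for w in vocab]


docs = [
    ["great", "screen"],
    ["nice", "screen"],
    ["awful", "battery"],
    ["poor", "battery"],
]
raw_vocab = build_vocab(docs)                 # vocabulary before merging
merged_docs = [normalize(d) for d in docs]
merged_vocab = build_vocab(merged_docs)       # vocabulary after merging
```

Here the vocabulary shrinks from 6 entries to 4 after merging, so each document vector is correspondingly shorter, and "great screen" and "nice screen" map to the same vector.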

3.2.2 Evaluation Object Extraction

An ontology is a formal representation of concepts and the relationships between them. Product reviews generally focus on the attributes of the product itself. A product feature is a product attribute that a user evaluates in a comment. Ontology attribute extraction is the core part of comment mining and includes explicit product feature extraction and implicit product feature extraction. Implicit feature extraction is more difficult and has produced fewer research results, but implicit features also have a major impact on sentiment analysis. Lu [7] uses semantic grammar to describe texts containing attribute knowledge and parses sentences deeply to achieve syntactic and semantic analysis. That is, pattern matching is used to extract the implicit features. However, some common words match many features, which makes features hard to identify and reduces accuracy. A lack of corpus data can also lead to inaccurate results. The same word can carry different sentiment in different contexts. For example, "high" is derogatory when describing "price" and complimentary when describing "price/performance ratio". Therefore, one of the next research directions is to study sentiment expression in different situations.
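One simple way to picture context-dependent polarity is to look up the (attribute, opinion word) pair before falling back to the polarity of the opinion word alone. The sketch below is only an illustration: the two tables are hypothetical, their entries reuse the examples given in this survey, and the pairs are assumed to have been extracted already.

```python
# Hypothetical table of (attribute, opinion word) -> polarity.
# -1 = derogatory, 1 = complimentary.
PAIR_POLARITY = {
    ("price", "high"): -1,
    ("price/performance ratio", "high"): 1,
    ("energy consumption", "high"): -1,
    ("visibility", "high"): 1,
}

# Hypothetical context-free polarities; "high" is neutral on its own.
DEFAULT_POLARITY = {"high": 0, "good": 1, "bad": -1}


def polarity(attribute, opinion):
    """Look up the pair first; fall back to the opinion word's context-free polarity."""
    if (attribute, opinion) in PAIR_POLARITY:
        return PAIR_POLARITY[(attribute, opinion)]
    return DEFAULT_POLARITY.get(opinion, 0)
```

With these tables, `polarity("price", "high")` is `-1` while `polarity("price/performance ratio", "high")` is `1`. A pair that is not listed, such as `("screen", "high")`, falls back to the neutral default, which is where domain-specific dictionary expansion would be needed.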

4. Conclusions and Future Work

This paper summarizes the development trends and research hotspots in this field by discussing the research methods and latest developments in Chinese fine-grained sentiment analysis in recent years. The best method is not a single model or algorithm, but a combination of multiple algorithms and dictionaries. At the same time, expanding the sentiment dictionary is also imperative. Future research directions include cross-domain sentiment analysis, the resolution of semantic ambiguity across domains, and implicit sentiment object extraction.

(This article has been abridged due to space limitations; the full text is available from the authors.)

References

[1] R. Liu, M. Nian, Z. Fan. Emotional tendency analysis of online review of teaching materials [J]. Application of computer system, 10 (2017) 144-149.

[2] Y. Su, Y. Hu, B. Hu, X. Tu. Sentiment analysis based on Naive Bayes and latent Dirichlet distribution [J]. Computer application, 06 (2016) 1613-1618.

[3] N. Fan, W. Cai, Y. Zhao. Text topic emotion analysis method based on hybrid model [J]. Journal of Huazhong University of Science and Technology (Natural Science Edition), 01 (2010) 31-34.

[4] W. Ding. Emotional analysis based on dictionaries and machine learning combinations [D]. Xi'an University of Posts and Telecommunications (2017).

[5] H. Jiang. Research on attribute extraction based on deep learning [D]. Zhejiang University (2017).

[6] X. Tang, Y. Liu. Cross language fine grained sentiment analysis based on dependency syntax [J]. Information theory and Practice, 06 (2018) 124-129.

http://kns.cnki.net/kcms/detail/11.1762.G3.20180315.1523.004.html

[7] Y. Lu. Attribute knowledge acquisition based on semantic grammar [D]. Jiangsu University of Science and Technology (2016).