Big Data Sentiment Analysis based on PLSA and its Application
Ebenezer Komla Gavua1, Seth Okyere-Dankwa2, Gignarta Kasaye Soka3
1Ebenezer Komla Gavua, Koforidua Technical University, Ghana.
2Seth Okyere-Dankwa, Koforidua Technical University, Ghana.
3Gignarta Kasaye Soka, Omaha, Nebraska, USA.
Manuscript received on August 21, 2017. | Revised Manuscript received on August 28, 2017. | Manuscript published on September 05, 2017. | PP: 35-42 | Volume-7 Issue-4, September 2017. | Retrieval Number: D3047097417/2017©BEIESP
©The Authors. Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Abstract: Posting reviews online has become an increasingly popular way for people to express opinions and sentiments about products they have bought or services they have received. Analyzing the large volume of available online reviews can produce actionable knowledge of economic value to vendors and other interested parties. This paper presents a case study in the movie domain and tackles the problem of mining reviews to predict product sales performance. Based on an analysis of the complex nature of sentiments, the paper studies Sentiment PLSA (S-PLSA), in which a blog entry is viewed as a document generated by a number of hidden sentiment factors. Training an S-PLSA model on the blog data yields a succinct summary of the sentiment information embedded in the blogs. The study then presents SAAR, a sentiment-aware autoregressive model that uses the sentiment information captured by S-PLSA to predict product sales performance. Extensive experiments were conducted on a movie data set, comparing SAAR with alternative models that do not take sentiment information into account and with a model that uses different feature selection methods. The experiments confirm that the approach studied is effective and superior to these alternatives.
Keywords: Blog mining, Hadoop, MapReduce.