Developing a product rating prediction model based on customer text feedback using machine learning algorithms

Authors:
Ha Dien Thi Hong
This study predicts product rating scores from customer text feedback, a task that combines the challenges of natural language processing (NLP) with the class imbalance commonly found in real-world datasets. Using the Amazon Fine Food Reviews dataset, the work covers preprocessing, linguistic feature extraction, and the construction of a range of machine learning and deep learning models whose predictive performance is then compared. The models are Logistic Regression, Decision Tree, Random Forest, CatBoost, LightGBM, XGBoost, FastText, CNN, and LSTM. The experiments reveal a clear trade-off between classification accuracy and ordinal prediction capability: Logistic Regression achieved the highest classification accuracy, while CNN gave the best ordinal prediction quality, with the highest Quadratic Weighted Kappa (QWK) and the lowest Mean Absolute Error (MAE). FastText stands out as a strong baseline, offering competitive performance with the fastest training time, whereas the boosting-based models deliver high accuracy at a greater computational cost. No single model outperforms all others on every metric, which underlines the importance of choosing an algorithm that matches the objectives of the application. These findings clarify the role of linguistic features in user feedback and offer practical guidance for building rating prediction systems in e-commerce.
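The trade-off between classification accuracy and ordinal quality can be made concrete with a small sketch. The code below is not taken from the study; it is a minimal pure-Python illustration of the three metrics named in the abstract, assuming integer ratings from 1 to 5. The function names and the toy labels are hypothetical. It shows two sets of predictions with the same accuracy where one is much worse in ordinal terms.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of ratings predicted exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average distance, in stars, between true and predicted ratings."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def quadratic_weighted_kappa(y_true, y_pred, min_rating=1, max_rating=5):
    """Cohen's kappa with quadratic weights w_ij = (i - j)^2 / (K - 1)^2.

    Assumes integer ratings in [min_rating, max_rating].
    """
    n = len(y_true)
    k = max_rating - min_rating + 1
    observed = Counter(zip(y_true, y_pred))  # confusion counts O_ij
    true_hist = Counter(y_true)              # row totals
    pred_hist = Counter(y_pred)              # column totals

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(min_rating, max_rating + 1):
        for j in range(min_rating, max_rating + 1):
            w = (i - j) ** 2 / (k - 1) ** 2
            weighted_observed += w * observed[(i, j)]
            # Expected count under independence of the two marginals.
            weighted_expected += w * true_hist[i] * pred_hist[j] / n

    if weighted_expected == 0:
        # Happens when both label sets contain one identical class only.
        raise ValueError("QWK is undefined for a single shared class")
    return 1.0 - weighted_observed / weighted_expected


# Toy labels (hypothetical, not from the Amazon Fine Food Reviews data).
y_true = [1, 2, 3, 4, 5, 5, 5, 5]
pred_far = [1, 2, 3, 4, 5, 5, 1, 1]   # two errors, each 4 stars off
pred_near = [1, 2, 3, 4, 5, 5, 4, 4]  # two errors, each 1 star off

for name, pred in [("far", pred_far), ("near", pred_near)]:
    print(
        name,
        accuracy(y_true, pred),
        mean_absolute_error(y_true, pred),
        round(quadratic_weighted_kappa(y_true, pred), 4),
    )
```

Both prediction sets get 6 of 8 ratings exactly right, so accuracy cannot separate them, while MAE and QWK both penalize the predictions that miss by four stars more heavily. This is the sense in which a model can lead on accuracy yet trail on ordinal metrics.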
Vinh University Journal of Science

Tạp chí khoa học Trường Đại học Vinh

ISSN: 1859-2228

Governing body: Vinh University

  • Address: 182 Le Duan - Vinh City - Nghe An province
  • Phone: (+84) 238.3855.452 - Fax: (+84) 238.3855.269
  • Email: vinhuni@vinhuni.edu.vn
  • Website: https://vinhuni.edu.vn

 

License: 163/GP-BTTTT issued by the Minister of Information and Communications on May 10, 2023

Open Access License: Creative Commons CC BY NC 4.0

 

CONTACT

Editor-in-Chief: Assoc. Prof., Dr. Tran Ba Tien
Email: tientb@vinhuni.edu.vn

Deputy Editor-in-Chief: Assoc. Prof., Dr. Phan Van Tien
Email: vantienkxd@vinhuni.edu.vn

Sub-Editor: Dr. Do Mai Trang
Email: domaitrang@vinhuni.edu.vn

Editorial assistants: MSc. Le Tuan Dung, MSc. Phan The Hoa, MSc. Pham Thi Quynh Nga, MSc. Tran Thi Thai

  • Address: 4th Floor, Executive Building, No. 182, Le Duan street, Vinh city, Nghe An province.
  • Phone: (+84) 238-385-6700 | Hotline: (+84) 97-385-6700
  • Email: editors@vujs.vn
  • Website: https://vujs.vn
