This study addresses the prediction of product rating scores from customer review text, a task that combines the challenges of natural language processing (NLP) with the class imbalance common in real-world datasets. Using the Amazon Fine Food Reviews dataset, the work covers preprocessing, linguistic feature extraction, and the construction of machine learning and deep learning models whose predictive performance is then compared. The models are Logistic Regression, Decision Tree, Random Forest, CatBoost, LightGBM, XGBoost, FastText, CNN, and LSTM. The experiments reveal a clear trade-off between classification accuracy and ordinal prediction quality: Logistic Regression achieved the highest classification accuracy, while CNN achieved the best ordinal prediction quality, with the highest Quadratic Weighted Kappa (QWK) and the lowest Mean Absolute Error (MAE). FastText stands out as a strong baseline, offering competitive performance with the fastest training time, whereas the boosting-based models reach high accuracy but at greater computational cost. No single model outperforms all others on every metric, which highlights the importance of choosing an algorithm that matches the objectives of the application. These findings clarify the role of linguistic features in user feedback and offer practical guidance for building rating prediction systems in e-commerce.
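
To make the trade-off between classification accuracy and ordinal quality concrete, the following minimal Python sketch computes accuracy, MAE, and QWK for two sets of predictions on a 1 to 5 rating scale. It uses the standard textbook definition of quadratic weighted kappa; whether the study computed QWK in exactly this way is an assumption. The function names and the toy labels (`y_true`, `pred_majority`, `pred_near_miss`) are invented for illustration and are not taken from the study's data or results.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of ratings predicted exactly."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_absolute_error(y_true, y_pred):
    """Average distance, in rating points, between prediction and truth."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def quadratic_weighted_kappa(y_true, y_pred, min_rating=1, max_rating=5):
    """Standard QWK: 1 - (weighted observed disagreement / weighted expected disagreement).

    Disagreements are penalised by the squared distance between ratings,
    so predicting 1 for a true 5 costs far more than predicting 4.
    """
    n = len(y_true)
    k = max_rating - min_rating + 1
    observed = Counter(zip(y_true, y_pred))  # confusion counts
    true_hist = Counter(y_true)              # marginal counts of true ratings
    pred_hist = Counter(y_pred)              # marginal counts of predictions
    num = 0.0
    den = 0.0
    for i in range(min_rating, max_rating + 1):
        for j in range(min_rating, max_rating + 1):
            weight = (i - j) ** 2 / (k - 1) ** 2
            num += weight * observed[(i, j)]
            # Expected count if truth and prediction were independent.
            den += weight * true_hist[i] * pred_hist[j] / n
    if den == 0:
        # Undefined when both sequences use one and the same rating.
        return float("nan")
    return 1.0 - num / den


# Toy, imbalanced labels (invented): most reviews are 5 stars.
y_true = [1, 2, 3, 4, 5, 5, 5, 5, 5, 5]
# Hypothetical model A: always predicts the majority class.
pred_majority = [5, 5, 5, 5, 5, 5, 5, 5, 5, 5]
# Hypothetical model B: fewer exact hits, but never off by more than one star.
pred_near_miss = [2, 3, 4, 5, 4, 4, 5, 5, 5, 4]

for name, pred in [("majority", pred_majority), ("near-miss", pred_near_miss)]:
    print(
        name,
        "accuracy=%.3f" % accuracy(y_true, pred),
        "MAE=%.3f" % mean_absolute_error(y_true, pred),
        "QWK=%.3f" % quadratic_weighted_kappa(y_true, pred),
    )
```

On these toy labels the majority-class predictor has the higher accuracy, while the near-miss predictor has the lower MAE and the higher QWK. This mirrors, in miniature, why one model can win on accuracy and another on the ordinal metrics.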