Le Hong Hanh, Nguyen Ngoc Nam, Nguyen Thuy Linh, Nguyen Linh Diep, Nguyen Ngoc Hai

Abstract

Few studies in Vietnam have applied text mining to finance or to Vietnamese language processing. This study grew out of one of the leading studies that use machine learning to analyze text from four well-known Vietnamese online newspapers and forecast, one day in advance, whether the VN-Index will rise, fall, or remain neutral. Nearly 70,000 articles from four reputable and reliable online newspapers in Vietnam served as input data for four machine learning models: decision trees, random forests, k-nearest neighbors (KNN), and support vector machines (SVM). After the best-performing model (SVM) and the best-performing dataset (Vietstock) were selected, further refinement techniques raised the forecasting accuracy to 60.1%. The result is solid evidence that financial and stock market news in the popular press affects the price movements of the VN-Index and the Vietnamese stock market.

Keywords: Machine learning, text mining, stock market, decision tree, random forest, KNN, SVM, VN-Index.
