8th ISF

COMP030 - Reinforcement Learning-Based AI for Sustainable Stock Trading System in the Stock Exchange of Thailand



In an environment of rising inflation and low interest rates on bank deposits, investing has become crucial for growing savings, achieving financial independence, and securing future wealth. However, the stock market entails significant risks, with around 80% of individual investors losing money, often due to emotional biases such as greed and fear. AI-driven investment systems offer a promising way to mitigate these risks and improve investment outcomes. Although stock investment systems have been widely researched, the application of reinforcement learning (RL) to stock investment strategies, particularly in the Stock Exchange of Thailand (SET), remains relatively unexplored. This paper proposes an RL-based stock investment system designed specifically for Thailand's market, drawing inspiration from DeepMind's AlphaGo. The system employs AI agents that make daily buy or sell decisions based on stock data, including prices, moving averages, and various technical indicators. By interacting repeatedly with historical market data, the agents learn through trial and error and progressively develop an understanding of market dynamics that can lead to improved investment strategies and returns. The proposed system incorporates five RL algorithms (A2C, DDPG, TD3, PPO, and SAC) along with Bayesian hyperparameter optimization to enhance performance. It is trained on two datasets with different input features: one with six inputs (MACD, moving averages, and RSI values) and another with nine inputs (adding indicators such as Bollinger Bands, CCI, and DX). We also expanded the set of tradable stocks from the top 100 stocks in the SET (70 usable stocks) to the top 200 stocks in the SET (130 usable stocks). Our experiments show that while the number of input variables had little impact on overall returns, hyperparameter tuning was crucial for improving performance.
We also explored ensemble methods, in which multiple AI agents jointly make trading decisions, and incorporated a stop-loss mechanism that reduces risk by limiting potential losses. Notably, a PPO agent optimized through hyperparameter tuning and trained on 130 stocks with nine input features achieved an average annual return of 40.98%, underscoring the potential of AI to optimize stock investment strategies.
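
The abstract does not say how the agents' decisions are combined or how the stop-loss is triggered, so the following is only a minimal Python sketch of one possible reading. The majority-vote rule, the tie-breaking to "hold", the 10% loss threshold, and the names `ensemble_vote` and `apply_stop_loss` are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def ensemble_vote(actions):
    """Majority vote over per-agent actions such as 'buy', 'sell', 'hold'.

    Assumption for this sketch: an empty list or a tie resolves to 'hold'.
    """
    if not actions:
        return "hold"
    counts = Counter(actions).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return "hold"  # no clear majority
    return counts[0][0]


def apply_stop_loss(action, entry_price, current_price, max_loss=0.10):
    """Force a 'sell' once an open position has lost at least max_loss.

    entry_price is None when no position is open; max_loss=0.10 is a
    hypothetical threshold, not a value reported in the abstract.
    """
    if entry_price is not None and current_price <= entry_price * (1 - max_loss):
        return "sell"
    return action


# Three hypothetical agents vote, then the stop-loss check runs last.
decision = ensemble_vote(["buy", "buy", "sell"])
final = apply_stop_loss(decision, entry_price=100.0, current_price=88.0)
```

In this sketch the stop-loss runs after the vote, so a position that has fallen past the threshold is sold even if the agents agree on buying. Whether the real system orders the two steps this way is not stated in the abstract.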


Name: Pannawish Tanthawichian, Than Rattanakij

Email: prompong.p@kvis.ac.th

Advisor: Prompong Pakawanwong, Dr. Kanes Sumetpipat

School: Kamnoetvidya Science Academy

