Abstract
This study examines how financial institutions choose stock trading strategies in a complex, rapidly changing environment. These decisions are made with limited and often inconsistent information, and they depend on the current and future strategies of both the institution and its competitors. The authors develop a dynamic game model that accounts for this imperfect information and for the evolving nature of decision-making. To model reward transitions, they combine t-copula simulation of a non-stationary Markov chain, probabilistic fuzzy regression, and chaos optimization algorithms. They then apply a deep Q-network (DQN), a deep reinforcement learning method, to keep the chosen strategy effective during ongoing decision-making. The approach is relevant both to researchers across disciplines and to practitioners in the finance industry.
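
As a rough illustration of the deep Q-network step mentioned above, the sketch below shows the one-step Q-learning target that such a network is typically trained to match. It is not the authors' implementation: the function names, the discount factor, the learning rate, and the three-action example (buy, hold, sell) are all assumptions made for illustration, and a real DQN would replace the scalar update with a gradient step on a neural network.

```python
def td_target(reward, next_q_values, gamma=0.99, done=False):
    """One-step Q-learning target: r + gamma * max over a' of Q(s', a').

    `next_q_values` holds the estimated value of each action in the next
    state (hypothetical actions here: buy, hold, sell).
    """
    if done:
        # No future reward once the episode has ended.
        return reward
    return reward + gamma * max(next_q_values)


def q_update(q_value, target, lr=0.1):
    """Move the current estimate a fraction `lr` of the way to the target."""
    return q_value + lr * (target - q_value)


# Hypothetical numbers: reward from the last trade and the estimated
# values of buy / hold / sell in the next state.
target = td_target(reward=1.0, next_q_values=[0.5, 2.0, -1.0], gamma=0.9)
new_q = q_update(q_value=0.0, target=target, lr=0.5)
```

In a deep Q-network, the difference between `target` and the current estimate becomes the training loss, so the network's prediction is pulled toward the target in the same way `q_update` does here.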