
Context-Integrated Adversarial Learning for Predictive Modelling of Stock Price Dynamics

DOI : https://doi.org/10.5281/zenodo.18815037



A. Lazanas, S. Christodoulou

Department of Mechanical Engineering and Aeronautics, University of Patras, Patras, Greece

S. Karpouzis

FICC Market Risk, BNP Paribas CIB, London, UK

Abstract Forecasting equity prices in fast-moving financial markets is a challenging task, and it becomes even more difficult when the predictive signal draws on non-homogeneous information channels. Classical statistical methods, notably Autoregressive Integrated Moving Average (ARIMA) models, are limited by linearity assumptions that prevent them from capturing complex temporal dynamics. Deep neural networks such as Long Short-Term Memory (LSTM) networks are adept at capturing sequential interaction effects, yet they tend to degrade in the face of abrupt volatility shifts and changing distributions. In this paper we introduce a context-sensitive adversarial learning model for equity price prediction that combines distribution-based generative modelling with sentiment-based auxiliary information obtained through Natural Language Processing (NLP). The architecture uses adversarial training to model future price movements and incorporates contextual sentiment features derived from financial textual data. By jointly exploiting quantitative market indicators and auxiliary contextual cues, the framework aims to enhance forecast reliability during periods of heightened volatility and regime change. Empirical evaluation on a sample of U.S. equities shows that the proposed approach outperforms the traditional ARIMA and LSTM baselines across a range of error measures. These findings suggest that the context-sensitive adversarial paradigm is an effective instrument for improving stock price prediction in complex financial environments characterized by uncertainty and structural change.

Keywords Stock price modelling, Context-aware forecasting, Deep neural networks, Market volatility analysis, Data-driven prediction.

  1. INTRODUCTION

    Accurate forecasting of stock price dynamics in environments characterized by structural transitions, noise, and temporal irregularities continues to represent a central challenge in machine learning research. Financial markets exhibit nonlinear dependencies and evolving statistical properties, while relevant predictive signals often originate from distinct information channels. Traditional statistical approaches, including ARIMA models, are grounded in assumptions of linearity and stationarity, which constrain their capacity to represent complex market behavior [1, 2]. Although such models remain useful benchmarking tools, their representational flexibility is limited in settings involving volatility clustering and abrupt state transitions.

    The expansion of deep learning methodologies has provided alternative mechanisms for modeling sequential data without imposing strict parametric assumptions. Recurrent neural architectures, particularly Long Short-Term Memory (LSTM) networks, have demonstrated the ability to learn extended temporal dependencies in time-series contexts [3, 4]. Applications in financial forecasting have shown that LSTM-based systems frequently outperform classical statistical baselines in capturing nonlinear temporal interactions [5, 6]. Nevertheless, these architectures are primarily optimized for discriminative prediction tasks and may exhibit reduced stability when confronted with substantial distributional shifts or extreme volatility conditions.

    Generative modelling offers a complementary perspective by focusing on the approximation of underlying data distributions rather than solely minimizing pointwise prediction error. Generative Adversarial Networks (GANs), introduced as an adversarial training framework between a generator and a discriminator [7], have shown significant potential in learning complex high-dimensional representations. Subsequent research has extended GAN-based approaches to sequential and financial time-series settings, highlighting their capacity to enhance predictive robustness and generalization [8-10]. These developments indicate that adversarial distribution learning can provide advantages in environments where market dynamics are irregular and noise-driven.

    In parallel, the increasing availability of textual financial content has motivated the integration of Natural Language Processing (NLP) techniques into forecasting pipelines. Market-relevant information embedded in news articles, analyst commentary, and social media discourse cannot be directly inferred from numerical price series alone. Sentiment extraction methods enable the quantification of qualitative market perceptions and collective behavior patterns [11-14]. Despite growing interest in multimodal forecasting, many existing approaches either treat numerical and textual signals independently or rely on simple fusion mechanisms that do not fully exploit their interaction.

    To address these limitations, this study develops a context-integrated adversarial forecasting framework that combines numerical market observations with sentiment-derived auxiliary features within a unified learning structure. The proposed model employs adversarial training to approximate the conditional distribution of future stock prices while incorporating sentiment signals as modulating context information. Financial market data are selected as the empirical domain due to their volatility, susceptibility to exogenous information flows, and structural variability. Comparative experiments against ARIMA and LSTM baselines are conducted to evaluate predictive performance under realistic out-of-sample conditions.

    More broadly, the present work aligns with research efforts focused on intelligent data-driven modelling systems that integrate heterogeneous information sources within predictive architectures [15]. The study contributes to the ongoing exploration of adversarial learning mechanisms and context-aware modelling strategies for financial time-series prediction.

    The remainder of the paper is organized as follows: Section 2 reviews prior research on statistical forecasting methods, adversarial modelling, and NLP-enhanced predictive systems. Section 3 describes the formulation and design of the proposed framework. Section 4 details the experimental setup and evaluation protocol. Section 5 presents empirical findings and comparative analyses. Section 6 discusses implications and future research directions.

  2. RELATED WORK

    Time-series prediction research has moved beyond traditional statistical modelling toward data-driven machine learning and deep learning methods, driven by the growing complexity, nonlinearity, and non-stationarity of real-world sequential data. This section reviews earlier research relevant to the proposed hybrid GAN-NLP framework, with a particular focus on statistical foundations, deep learning networks for sequential prediction, GANs, and the integration of textual data through NLP.

    1. Classical statistical models have served as basic tools for time-series forecasting for many years. Among them, ARIMA models continue to be used widely because of their interpretability and well-known theoretical properties [1]. Later extensions, such as Autoregressive Conditional Heteroskedasticity (ARCH) models, accommodate time-varying volatility and better describe data exhibiting volatility clustering [2]. Although these models are useful as benchmarks, they rest on assumptions of linearity and stationarity, which in many cases make them ineffective in complex, noisy settings.

    2. The shortcomings of statistical models motivated the adoption of machine learning methods able to model nonlinear dependencies. Early machine learning methods such as support vector machines (SVMs) and ensemble-based methods proved more flexible and more accurate in their predictive power than purely statistical models [23]. Nevertheless, these methods usually involve substantial feature engineering and do not scale easily to high-dimensional sequential data.

    3. Deep learning has greatly improved time-series modelling by allowing automatic representation learning from raw time-series data. Recurrent neural networks (RNNs), and especially the Long Short-Term Memory (LSTM) architecture, were developed to overcome the vanishing gradient phenomenon and to learn long-term temporal dependencies [3, 4]. LSTM-based models have been used to good effect for a variety of sequential prediction tasks, including financial time-series forecasting, often outperforming classical statistical baselines [5, 6, 34].

    4. Despite their success, LSTM models are inherently discriminative and optimized for point estimation. As a result, they may capture little of the distributional properties of the data, especially in highly volatile or regime-switching settings. This limitation has renewed interest in generative modelling approaches that learn the underlying data distribution rather than producing only conditional point predictions.

    5. Generative Adversarial Networks (GANs), introduced by Goodfellow et al. [7], provide a powerful framework for generative modelling based on adversarial training between a generator and a discriminator. GANs have been highly effective in learning high-dimensional distributions with complex structure and have been studied extensively on sequential and time-series data. An early GAN-based time-series forecasting model showed that adversarial learning can yield a more robust and generalizable predictor with a good ability to model temporal dynamics and noise characteristics [8]. Subsequent works have adapted GAN-based algorithms to sequential prediction using time- and architecture-specific loss functions. A notable example is Fin-GAN, which employs economics-driven loss functions to improve the forecasting and classification of financial time series, producing forecasts that are more distributionally realistic and directionally accurate [9]. Extensive surveys demonstrate the growing applicability of GANs in time-series modelling, particularly in non-stationary and noisy settings [10].

    6. Alongside numerical time-series modelling, the increased availability of unstructured textual data has motivated investigation of how Natural Language Processing (NLP) techniques can be incorporated into predictive systems. News media and social media texts encode contextual and sentiment-based information that cannot be measured directly from numerical series, yet can influence sequential dynamics.

    7. Empirical research has established that social media sentiment correlates with, and in certain instances anticipates, market trends. Applied studies have shown that aggregated sentiment measures derived from online content are connected to future shifts in market indices [11]. Subsequent studies found that sentiment cues improve the predictive behaviour of time-series forecasts, particularly when combined with numerical features [12, 13].

    8. Recent studies have begun investigating hybrid architectures that combine deep learning with NLP-derived features to boost predictive performance. Text-based sentiment representations have been shown to be effective in models that combine recurrent neural networks with purely numerical methods [24, 25]. Nevertheless, many existing techniques rely on shallow fusion approaches or handle the textual and numerical data streams separately.

      Text-guided forecasting studies further confirm the importance of directly including textual information (e.g., news messages and channel descriptions) and of learning cross-modal interactions through attention-based fusion. Xu et al. formalize Text-Guided Time Series Forecasting and show performance improvements when textual cues are combined with time-series representations through cross-attention mechanisms [26]. Similarly, Emami et al. introduce a modality-aware Transformer that combines categorical text with numerical time series via feature-level and inter-modal attention, emphasizing the importance of structured multimodal fusion for forecasting [27]. Finally, language models are becoming increasingly central to modern NLP pipelines for deriving sentiment embeddings from longer-format financial text (e.g., analyst reports), which can serve as informative auxiliary predictors for price trend prediction [28]. Despite the independent research on GAN-based models and NLP-enhanced predictors, little work has addressed single architectures that combine adversarial generative learning with sentiment-aware textual representations for time-series prediction. This gap motivates the current research, which proposes a hybrid GAN-NLP system that combines temporal structure and contextual information in a single learning process.

  3. PROBLEM FORMULATION AND PROPOSED GAN-NLP FRAMEWORK

    Our work aims to create a predictive framework for stock market data that can operate in non-stationary, noisy, and volatile conditions while exploiting contextual information carried by unstructured text. Stock prices are influenced not only by historical numerical patterns but also by exogenous information, including market sentiment and collective perception, which cannot be directly reflected by numerical indicators. The purpose of the proposed approach is, therefore, to combine numerical stock market data and sentiment-based contextual information in a single learning model.

    Let xt ∈ R^d denote a multivariate numerical observation of a financial asset at trading day t, where d corresponds to the number of numerical attributes describing the stock. In this study, these attributes include price-related and trading-related variables, namely {Open, High, Low, Close, Adjusted Close, Volume}, as summarized in Table 1:

    TABLE I. Numerical stock market attributes used as model inputs and their role in price prediction

    Attribute | Description | Functionality in Prediction
    Open | The opening price of the stock on a specific day | Indicates investor sentiment at the start of trading. A higher than usual open might suggest buying pressure and potentially a rising price.
    High | The highest price the stock reached during the day | Indicates buying pressure and potential upside. Resistance levels can be identified around previous highs, which may be difficult to break through.
    Low | The lowest price the stock reached during the day | Indicates selling pressure and potential downside. Support levels can be identified around previous lows, which may be difficult to fall below.
    Close | The price at which the stock last traded on a specific day | Often considered the most important price point as it reflects the final sentiment of the intraday trading.
    Adjusted Close | The closing price adjusted for stock splits and dividends | Provides a more accurate picture of price movements over time, especially when comparing periods with corporate actions.
    Volume | The number of shares traded on a specific day | High volume can indicate increased interest in the stock, potentially leading to higher volatility and price changes. Low volume can suggest a lack of investor interest.

    Given a historical window of length L, Xt = {xt-L+1, ..., xt}, the prediction task consists of estimating the next-day stock observation xt+1 (or a short prediction horizon) conditioned on historical market behaviour. Simultaneously, unstructured text that is time-aligned with the trading period is processed to generate sentiment-based representations. The numerical input Xt is built as a moving multivariate window of historical stock market observations, with each component xt associated with the set of price- and trading-related attributes outlined in Table 1. The window moves chronologically through the trading history, maintaining the time sequence of market data and enabling the model to capture short- and medium-term correlations in stock behaviour. This representation allows the learning model to exploit the joint dynamics of many numerical attributes without requiring the underlying market process to be stationary or linear.

    Also, let st denote a sentiment feature vector obtained by applying Natural Language Processing methods, such as sentiment analysis, to social media posts. Such features encode contextual information related to market perception and investor behaviour, which can influence price formation but cannot be observed directly in numerical stock indicators. The learning objective is therefore formulated as modelling the conditional distribution p(xt+1 | Xt, st), instead of minimizing point-wise prediction error only. In the proposed framework, the sentiment vector is treated as a built-in auxiliary conditioning signal rather than an independent predictive input. This design choice permits the learning process to augment the numerical market dynamics with context information and maintains robustness when textual information becomes sparse, noisy, or weakly informative.
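    To make the construction concrete, a minimal sketch of the windowed input and its sentiment alignment is given below. NumPy is used for illustration; the function name, window length, and the convention of pairing each window with the sentiment of its final day are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

def build_windows(prices: np.ndarray, sentiment: np.ndarray, L: int):
    """Build (X_t, s_t, x_{t+1}) training triples from aligned daily data.

    prices    : (T, d) array of numerical attributes (Open, High, ..., Volume)
    sentiment : (T,)   array of daily sentiment scores aligned to trading days
    L         : window length
    Returns X of shape (N, L, d), s of shape (N,), y of shape (N, d),
    where N = T - L and y[i] is the observation on the day after window i.
    """
    T, d = prices.shape
    X, s, y = [], [], []
    for t in range(L, T):
        X.append(prices[t - L:t])      # X_t: the last L observations
        s.append(sentiment[t - 1])     # s_t: sentiment on the window's last day
        y.append(prices[t])            # x_{t+1}: the next-day target
    return np.stack(X), np.array(s), np.stack(y)

# Toy example: 10 trading days, 6 attributes, window of length 4
prices = np.arange(60, dtype=float).reshape(10, 6)
sent = np.linspace(-1, 1, 10)
X, s, y = build_windows(prices, sent, L=4)
print(X.shape, s.shape, y.shape)   # (6, 4, 6) (6,) (6, 6)
```

    Sliding the window one day at a time preserves chronology, so no future observation ever appears inside the inputs of an earlier target.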

    This formulation favours a generative modelling approach suited to capturing the uncertainty, regime changes, and distributional characteristics typically encountered in financial markets. Accordingly, the proposed framework employs a Generative Adversarial Network (GAN), in which a generator G is trained to produce realistic future stock observations given the historical numerical data together with sentiment-based contextual features.

    At a high level, G applies a mapping of the form G: (Xt, st) → x̂t+1, in contrast to a discriminator D, which assesses the plausibility of an observation x by estimating D(x | Xt, st) ∈ [0, 1], indicating whether the input corresponds to a real or generated sample conditioned on historical market behavior and sentiment information. Through adversarial feedback provided by D, the generator progressively improves its ability to produce context-aware and temporally coherent stock predictions. This concept gives the model the ability to jointly exploit temporal market information and situational sentiment information, without the constraining linearity and stationarity assumptions of classical statistical models. Moreover, because the framework incorporates sentiment information as an auxiliary conditioning signal rather than an independent predictor, it remains resilient during periods when textual signals are scarce, noisy, or poorly informative. Fig. 1 offers a conceptual description of the above and illustrates the main aspects of the hybrid Generator/Discriminator/Sentiment interaction on which the training of the GAN is based.

    Fig. 1. The composite formulation of our hybrid GAN model allows the model to share and collectively explore temporal dependencies, contextual sentiment cues and adversarial feedback during training.
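    For illustration, the conditional mappings above can be sketched at the shape level as follows. This is a deliberately simplified NumPy stand-in with random, untrained single-hidden-layer networks: it shows only the interfaces G: (Xt, st, z) → x̂t+1 and D(x | Xt, st) ∈ (0, 1) and states the adversarial objective; the layer sizes, latent noise input z, and weight shapes are illustrative assumptions, not the proposed model's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
L, d = 20, 6          # window length and number of numerical attributes
z_dim, h = 8, 32      # latent noise size and hidden width (illustrative)

# Random-weight networks stand in for the real, trained G and D.
Wg1 = rng.normal(size=(L * d + 1 + z_dim, h)); Wg2 = rng.normal(size=(h, d))
Wd1 = rng.normal(size=(L * d + 1 + d, h));     Wd2 = rng.normal(size=(h, 1))

def G(X, s, z):
    """Generator: (X_t, s_t, z) -> candidate next-day observation."""
    inp = np.concatenate([X.ravel(), [s], z])
    return np.tanh(np.tanh(inp @ Wg1) @ Wg2)   # outputs in (-1, 1), matching the scaling

def D(x, X, s):
    """Discriminator: plausibility of x given (X_t, s_t), in (0, 1)."""
    inp = np.concatenate([X.ravel(), [s], x])
    logit = np.tanh(inp @ Wd1) @ Wd2
    return 1.0 / (1.0 + np.exp(-logit[0]))     # sigmoid

X = rng.normal(size=(L, d)); s = 0.3; z = rng.normal(size=z_dim)
x_fake = G(X, s, z)
p_fake = D(x_fake, X, s)
# Adversarial objective (per sample): D maximizes log D(x_real) + log(1 - D(x_fake)),
# while G minimizes log(1 - D(x_fake)) (or, equivalently, maximizes log D(x_fake)).
print(x_fake.shape, 0.0 < p_fake < 1.0)   # (6,) True
```

    In a real implementation both networks would be trained alternately by gradient descent on the objective stated in the comment; only the conditioning interface matters for the discussion here.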

  4. EXPERIMENTAL EVALUATION

      1. Data Sources and Preprocessing

        The assessment covers large-cap U.S. equities, which are typically associated with high liquidity and strong market effects. More precisely, the analysis considers seven representative stocks: Apple (AAPL), Microsoft (MSFT), Amazon (AMZN), Alphabet (GOOGL), Meta Platforms (META), Tesla (TSLA), and Nvidia (NVDA). The evaluation design relies on numerical market data and textual sentiment information, both described below:

        1. Numerical market data: The stock market data are obtained at daily frequency from Yahoo Finance1 and are composed of the attributes described in Table 1 (Section 3). The records are sorted chronologically and include trading days only. Redundant records are eliminated, and missing values are identified and handled to maintain temporal continuity. Feature selection retains the price- and trading-related variables that are important for prediction. Before model training, numerical features are scaled with Min-Max normalization; the scaling parameters are estimated only on the training subset and then applied to the validation and test data to avoid information leakage.
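          The leakage-free scaling step can be sketched as follows (a minimal NumPy version of what a library scaler such as scikit-learn's MinMaxScaler performs; the example series is synthetic):

```python
import numpy as np

def minmax_fit(train: np.ndarray):
    """Estimate per-feature min/max on the training subset only."""
    return train.min(axis=0), train.max(axis=0)

def minmax_apply(data, lo, hi):
    """Scale to [0, 1] using parameters fitted on the training data."""
    return (data - lo) / (hi - lo)

series = np.column_stack([np.linspace(100, 200, 50),    # e.g. Close
                          np.linspace(1e6, 2e6, 50)])   # e.g. Volume
train, test = series[:40], series[40:]                  # chronological split
lo, hi = minmax_fit(train)                              # fit on train ONLY
train_s = minmax_apply(train, lo, hi)
test_s = minmax_apply(test, lo, hi)
# Test values may legitimately exceed 1.0: the scaler never saw future prices,
# which is exactly what prevents information leakage.
print(train_s.min(), train_s.max(), test_s.max() > 1.0)
```

          Fitting the scaler on the full series instead would leak the (future) test-period extremes into the training representation.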

        2. Textual and sentiment data: The textual data for the stocks under study are gathered from social media posts (tweets) retrieved through the Twitter (X) platform2. The raw text is first cleaned through normalization and noise-elimination procedures. Sentiment is then extracted with the VADER (Valence Aware Dictionary and sEntiment Reasoner) lexicon-based approach, storing only the compound sentiment score, which is matched to the corresponding trading day. When several text entries exist for the same day, they are combined into one daily sentiment representation to ensure compatibility with the daily frequency of the numerical data.

          All preprocessing procedures are applied uniformly across assets and evaluation settings to guarantee reproducibility and fair comparison.
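          The daily aggregation step can be sketched as follows (a pandas example with placeholder compound scores; in practice each score would come from VADER's SentimentIntensityAnalyzer, and the choice of the mean as the daily aggregate is an illustrative assumption):

```python
import pandas as pd

# In practice each compound score would come from VADER, e.g.
#   SentimentIntensityAnalyzer().polarity_scores(text)["compound"]
# Here the scores are hard-coded placeholders for illustration.
tweets = pd.DataFrame({
    "date": ["2023-05-01", "2023-05-01", "2023-05-01", "2023-05-02"],
    "compound": [0.62, -0.10, 0.35, -0.44],
})

# Collapse multiple tweets per day into one daily sentiment value,
# matching the daily frequency of the numerical market data.
daily = tweets.groupby("date")["compound"].mean()
print(daily.round(4).to_dict())   # {'2023-05-01': 0.29, '2023-05-02': -0.44}
```

          The resulting daily series can then be joined to the price data on the trading-day index before windowing.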

      2. Evaluation Protocol

    The evaluation of the proposed framework follows a systematic process that makes the modelling and evaluation decisions explicit. The key points of the assessment plan, which compares our approach against two baseline models (ARIMA and LSTM), are outlined below:

    1. Temporal data partitioning: in all models, the stock price observations are arranged in chronological order to maintain temporal causality. No random shuffling is applied at any point of the assessment. Different out-of-sample partitioning strategies are used depending on model type, as discussed below, reflecting the different modelling needs and evaluation practices of each approach.

    2. Model-specific data splits: a) for the ARIMA baseline, around 90% of the available observations are used to fit the model, and the remaining 10%, the most recent portion of the time series, is set aside for out-of-sample testing; b) for the LSTM model, the time-ordered data are separated into training and test sets in the ratio of 70% and 30%, respectively; and c) for our GAN-based model, the final 20 observations of each time series are held out for evaluation, and the rest of the data are used to train the model. This decision allows the generated and observed values to be compared directly over a predetermined prediction period. These partitioning methods ensure that all evaluation is done in a strictly out-of-sample manner and corresponds to realistic forecasting conditions.

    3. Prediction task definition: all models are evaluated on a one-step-ahead forecasting task. For each trading day t, the available information (up to time t) is used to generate predictions for the next trading day. All models share the same prediction horizon so that predictive accuracy can be compared directly.

    4. Models under evaluation: The analysis considers three types of models: a) a statistical ARIMA-based baseline, which captures linear time dependencies in stock prices [1]; the ARIMA orders are chosen by combining an information criterion with automated order selection and stationarity tests; b) a deep learning baseline based on a Long Short-Term Memory (LSTM) network, which learns the nonlinear sequential behaviour of the numerical stock data; and c) our proposed GAN-based model with sentiment conditioning, which approximates the conditional distribution of future stock prices based on historical numerical data and contextual sentiment analysis. Each model is tested on temporally aligned data over the same stock universe.

      1 https://finance.yahoo.com/
      2 https://x.com/
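      The information-criterion-based order selection used for the ARIMA baseline can be illustrated in simplified form below. Note that this is a plain autoregressive AR(p) fit chosen by AIC, a deliberately reduced stand-in for a full ARIMA routine (which would additionally handle differencing and moving-average terms, e.g. via statsmodels); the simulated series and candidate order range are illustrative:

```python
import numpy as np

def fit_ar(y, p):
    """OLS fit of an AR(p) model; returns coefficients, residual variance, AIC."""
    Y = y[p:]
    X = np.column_stack([np.ones(len(Y))] + [y[p - k:-k] for k in range(1, p + 1)])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ beta
    sigma2 = resid.var()
    n, k = len(Y), p + 1
    aic = n * np.log(sigma2) + 2 * k      # Gaussian AIC up to a constant
    return beta, sigma2, aic

rng = np.random.default_rng(1)
# Simulate an AR(2)-like series so the order selection has structure to find
y = np.zeros(500)
for t in range(2, 500):
    y[t] = 0.6 * y[t - 1] - 0.3 * y[t - 2] + rng.normal()

best = min(range(1, 6), key=lambda p: fit_ar(y, p)[2])   # pick order by AIC
beta, _, _ = fit_ar(y, best)
forecast = beta[0] + sum(beta[k] * y[-k] for k in range(1, best + 1))  # one-step-ahead
print("selected order:", best)
```

      A production pipeline would delegate this search (including d and q) to an established ARIMA implementation rather than hand-rolled OLS.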

    5. Model training configuration: The neural network models are trained with fixed configurations aligned with the original study. For the LSTM model, early stopping and learning-rate scheduling are used during training to enhance convergence stability [29, 30], while for the GAN-based model the numerical price values are scaled to the range (-1, 1). Training uses a fixed batch size of 5 and a prediction horizon of one trading day. Scaling parameters are obtained from the training data and reused at evaluation.
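      The early-stopping and learning-rate-scheduling logic can be sketched schematically as follows (the loss sequence, patience, decay factor, and plateau-triggered decay rule are illustrative assumptions; a real implementation would wrap an actual LSTM training loop, e.g. via framework callbacks):

```python
def train_with_early_stopping(val_losses, patience=3, lr=1e-3, decay=0.5):
    """Schematic training loop: decay the learning rate when validation loss
    stalls, and stop when it fails to improve for `patience` epochs.
    `val_losses` stands in for the per-epoch validation loss a real LSTM
    training run would produce.
    """
    best, wait = float("inf"), 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - 1e-6:    # improvement: reset the patience counter
            best, wait = loss, 0
        else:                     # no improvement
            wait += 1
            lr *= decay           # learning-rate schedule on plateau
            if wait >= patience:  # early stopping
                break
    return best, epoch, lr

losses = [1.00, 0.70, 0.55, 0.54, 0.54, 0.54, 0.54, 0.53]
best, stopped_at, final_lr = train_with_early_stopping(losses)
print(best, stopped_at)   # 0.54 6
```

      Stopping at epoch 6 means the final (lower) loss at epoch 7 is never seen, which is the intended trade-off: the run halts before it can start overfitting to late, marginal fluctuations.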

    6. Evaluation metrics: Predictive performance is determined through standard regression-based measures widely used in stock price forecasting:

      1. Mean Absolute Error (MAE)

      2. Mean Squared Error (MSE)

      3. Root Mean Squared Error (RMSE)

      4. Mean Absolute Percentage Error (MAPE), where applicable.

    All metrics are calculated only on held-out evaluation data and reported uniformly across stocks and models. Figure 2 summarizes the evaluation workflow adopted in this study.
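    The four error measures can be computed as follows (a minimal NumPy sketch with a toy forecast; the example values are invented for illustration):

```python
import numpy as np

def mae(y, yhat):  return np.mean(np.abs(y - yhat))
def mse(y, yhat):  return np.mean((y - yhat) ** 2)
def rmse(y, yhat): return np.sqrt(mse(y, yhat))
def mape(y, yhat):
    """Mean absolute percentage error; undefined when any actual value is 0,
    which is why it is reported only 'where applicable'."""
    return np.mean(np.abs((y - yhat) / y))

y    = np.array([100.0, 102.0, 101.0, 105.0])   # actual closes (toy)
yhat = np.array([101.0, 101.0, 103.0, 104.0])   # forecasts (toy)
print(mae(y, yhat))    # 1.25
print(rmse(y, yhat))   # ~1.3229
```

    MSE and RMSE penalize large deviations disproportionately, which is why they separate models most clearly on volatile assets, while MAPE normalizes errors by price level and so is comparable across stocks of different magnitudes.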

    Fig. 2. Overview of the evaluation workflow adopted in this study, illustrating data preprocessing, temporal partitioning, model assessment, and performance evaluation.

  5. RESULTS AND PERFORMANCE ANALYSIS

    This section presents the empirical findings derived from the evaluation protocol in Section 4. All evaluation tests share the same preprocessing steps, temporal data partitioning strategies, and one-step-ahead forecasting setup to ensure strictly out-of-sample evaluation and comparison across models and assets. The analysis is carried out consistently for the seven large-cap U.S. equities examined, using the same evaluation metrics.

    1. Comparative Forecasting Performance

      The proposed sentiment-conditioned GAN framework is compared with two baselines: a statistical ARIMA model and a deep learning LSTM model. Predictive performance is evaluated only on held-out test data using standard regression measures (MAE, MSE, RMSE, and MAPE where applicable), enabling a systematic comparison of forecasting performance and robustness under realistic market conditions. Table 2 presents the predictive performance of all models under these conventional regression statistics. The findings suggest that no model performs equally well across all assets, but our sentiment-conditioned GAN model shows consistently high performance in several instances, especially for stocks with higher volatility and stronger sentiment-driven dynamics. Here, the inclusion of sentiment information appears to improve the model's capacity to reflect short-term market dynamics that would otherwise be inaccessible to purely numerical inputs.

      The LSTM baseline is competitive on stocks with a more predictable temporal structure, and in many cases it outperforms the statistical ARIMA model in terms of error size. Although ARIMA is an appropriate model for linear dependencies, it produces relatively large prediction errors compared to the neural methods, especially during periods of high market variability. It should be noted that, in the case of NVDA, results are reported only for the ARIMA and LSTM models because our sentiment-conditioned GAN framework was not tested on this asset.

      TABLE II. Comparative Forecasting Performance

      Stock     | Model | MAE    | MSE      | RMSE   | MAPE
      Google    | ARIMA | 14.04  | 276.17   | 16.62  | 0.11
      Google    | LSTM  | –      | –        | 6.97   | –
      Google    | GAN   | –      | –        | 13.42  | –
      Amazon    | ARIMA | 16.83  | 435.67   | 20.87  | 0.13
      Amazon    | LSTM  | –      | –        | 3.35   | –
      Amazon    | GAN   | –      | –        | 7.05   | –
      Apple     | ARIMA | 9.38   | 139.97   | 11.83  | 0.05
      Apple     | LSTM  | –      | –        | 6.24   | –
      Apple     | GAN   | –      | –        | 7.02   | –
      Meta      | ARIMA | 128.89 | 21350.1  | 146.11 | 0.47
      Meta      | LSTM  | –      | –        | 11.21  | –
      Meta      | GAN   | –      | –        | 8.24   | –
      Microsoft | ARIMA | 34.75  | 1698.81  | 41.22  | 0.10
      Microsoft | LSTM  | –      | –        | 14.76  | –
      Microsoft | GAN   | –      | –        | 27.07  | –
      Nvidia    | ARIMA | 155.24 | 33139.52 | 182.04 | 0.39
      Nvidia    | LSTM  | –      | –        | 118.30 | –
      Tesla     | ARIMA | 25.30  | 942.59   | 30.70  | 0.13
      Tesla     | LSTM  | –      | –        | 13.21  | –
      Tesla     | GAN   | –      | –        | 9.33   | –

      (Dashes denote values not reported; for the LSTM and GAN models a single error value per asset is available.)

      As shown, the quantitative findings of Table 2 demonstrate that our sentiment-conditioned GAN model can provide quantifiable gains in prediction accuracy under realistic market behavior, while also pointing to the complementary strength of LSTM/recurrent models on assets with smoother price behavior.

    2. Qualitative Forecasting Comparison

      To complement the quantitative analysis in Section 5.1, this subsection provides a detailed qualitative analysis of representative predicted-versus-actual price paths. The objective is not only to visualize forecast accuracy, but to understand how the various modelling paradigms react to different market features, such as volatility, trend persistence, or sensitivity to external sentiment-driven information. Each figure refers to held-out test data and represents purely out-of-sample behaviour.

      Meta is representative of stocks with a high level of volatility whose dynamics are largely sentiment-driven. As Figure 3 demonstrates, the predictions made by our sentiment-conditioned GAN model track sudden price fluctuations and short-term directional shifts with higher capability. Specifically, the model adjusts faster to drastic changes in the price series, minimizing the lag effects that are common in purely sequential models. This behavior implies that sentiment conditioning is an informative contextual signal that enables the generative process to anticipate market responses to external events, news cycles, or changes in investor perception that are not directly encoded in historical prices.

      Fig. 4. Predicted vs. actual closing prices for Alphabet Inc. (Google) with our GAN modeling framework, illustrating reduced forecasting accuracy for assets with smoother temporal dynamics.

      Another example that demonstrates the complementary qualities of the analysed models is Tesla (TSLA). Tesla poses a challenging forecasting problem because of its volatility and price fluctuations driven by announcements, social media, and speculative trading activity. As shown in Figure 5, our sentiment-conditioned GAN model reacts more strongly to abrupt shifts in price direction than the sequential-only approaches. Although certain deviations are unavoidable given how unpredictable the asset is, the generative model adapts its response to sharp changes, supporting the hypothesis that sentiment-sensitive modelling improves adaptability in highly dynamic market environments.

      Fig. 3. Predicted versus actual closing prices for Meta (META) on held-out test data using our sentiment-conditioned GAN model.

      Conversely, assets with more continuous price dynamics and more stable temporal patterns reveal weaknesses in our sentiment-conditioned GAN model. Figure 4 shows this behaviour by presenting the predicted-versus-actual price path of a relatively stable asset, Alphabet Inc. (Google), under the GAN-based model. The forecasts generated in this case exhibit greater variability with respect to the actual price path, especially in capturing slow trends and long-term directional movements. This qualitative mismatch suggests that, when market dynamics are largely determined by historical regularities rather than sudden sentiment signals, the GAN process is less reliable in maintaining correct long-term alignment. In line with the quantitative findings of Section 5.1, the LSTM-based model performs better in these scenarios by virtue of its recurrent nature, which is better suited to learning long-term temporal correlations.

Fig. 5. Predicted vs. actual closing prices for Tesla (TSLA) with our sentiment-conditioned GAN model, illustrating its stronger responsiveness to abrupt shifts in price direction.

The qualitative evidence strengthens and extends the quantitative results of Section 5.1. The graphical comparisons confirm that no single modelling method is best for all assets; rather, model effectiveness depends strongly on the underlying market regime. Recurrent neural networks such as the LSTM are well suited to assets with predictable temporal variations, whereas our sentiment-conditioned GAN architecture offers specific benefits in environments where price dynamics are strongly influenced by exogenous information and investor mood. These observations highlight the importance of model selection and of hybrid modelling approaches in practical stock price forecasting.

    3. Aggregate Robustness and Comparative Performance Analysis

To complement the per-asset analysis of the preceding subsections, this section provides a more aggregate, robustness-oriented assessment of forecasting performance across all assets considered. Although stock-specific measures offer a detailed view of individual behaviour, aggregate measures are needed to determine the overall consistency, stability, and relative effectiveness of the tested models in a heterogeneous market.

Table 3 presents aggregate performance statistics based on the RMSE values of Section 4: mean RMSE, median RMSE, and the number of wins (the number of assets for which each model produces the lowest prediction error). Reporting both the mean and the median RMSE allows overall accuracy to be assessed while limiting the influence of extreme errors from highly volatile assets. The sentiment-conditioned GAN results exclude NVDA, for which no GAN-NLP evaluation was performed.
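The aggregation just described (mean RMSE, median RMSE, and per-asset win counts, with the GAN evaluated on a reduced asset set) can be sketched as follows. The per-asset RMSE values below are illustrative placeholders, not the paper's actual figures, and the asset and model names are assumptions made for the example:

```python
from statistics import mean, median

# Illustrative per-asset RMSE values (placeholders, not the paper's data).
# The GAN entry omits NVDA, mirroring the missing GAN-NLP evaluation.
rmse = {
    "ARIMA": {"META": 55.0, "TSLA": 80.0, "GOOGL": 40.0, "NVDA": 30.7},
    "LSTM":  {"META": 20.0, "TSLA": 35.0, "GOOGL": 9.0,  "NVDA": 11.2},
    "GAN":   {"META": 12.0, "TSLA": 10.0, "GOOGL": 14.0},
}

def aggregate(rmse_by_model):
    """Return {model: (mean RMSE, median RMSE, wins)}, where a 'win' counts
    the assets on which the model achieves the lowest RMSE among the models
    that evaluated that asset."""
    assets = {a for scores in rmse_by_model.values() for a in scores}
    wins = {m: 0 for m in rmse_by_model}
    for asset in assets:
        contenders = {m: s[asset] for m, s in rmse_by_model.items() if asset in s}
        wins[min(contenders, key=contenders.get)] += 1
    return {
        m: (mean(s.values()), median(s.values()), wins[m])
        for m, s in rmse_by_model.items()
    }

summary = aggregate(rmse)
for model, (mu, med, w) in summary.items():
    print(f"{model}: mean={mu:.2f} median={med:.2f} wins={w}")
```

Pairing the mean with the median in this way is what limits the leverage of a single highly volatile asset on the comparison, while the win count preserves the per-asset view.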

TABLE III. Comparative Forecasting Performance

Model   Mean RMSE   Median RMSE   Wins (Lowest RMSE)
ARIMA   64.2        30.7          0
LSTM    24.86       11.21         4
GAN     12.02       8.79          2

The aggregate findings show that the sentiment-conditioned GAN model attains the lowest mean and median RMSE over the assets on which it was evaluated, whereas the LSTM-based model achieves the largest number of per-asset wins, a strong indicator of overall consistency in capturing the temporal structure of stock price fluctuations. The GAN model thus wins on fewer individual assets but remains highly competitive in aggregate terms, underscoring its usefulness for the subset of assets with greater volatility and sentiment sensitivity. The ARIMA baseline shows substantially larger aggregate errors, confirming its weaknesses in modelling complex, non-linear market dynamics.

Overall, the evaluation results support the asset-conditional character of model performance and provide a robustness-oriented summary that complements the more detailed stock-level analysis above. These aggregate findings give a quantitative basis for the comparative and qualitative discussion of the next section.

  6. DISCUSSION AND FUTURE WORK DIRECTIONS

The empirical assessment shows that stock price forecasting performance is conditional on asset specifics and prevailing market regimes rather than governed by any single modelling paradigm. This finding is in line with recent insights on machine learning-based financial forecasting that highlight the importance of adaptive modelling techniques over one-size-fits-all predictors [31]. The results reported here should therefore be read not as a ranking of architectures but as indicators of how different learning mechanisms respond to different informational environments. Sequential deep learning architectures are most effective where price dynamics follow relatively stable temporal patterns. Recurrent memory models are well adapted to such dependencies, as established in the original work on LSTM networks [3] and subsequently applied to financial forecasting tasks [6, 5]. Historical price data appear adequate for accurate short-term forecasting when market forces are largely endogenous, which restricts the marginal value of external contextual information.

By contrast, incorporating sentiment information is most effective for stocks whose price action is more susceptible to external narratives, news cycles, and investor perceptions. Previous research has demonstrated that sentiment measures derived from textual data can capture elements of market psychology that are not directly expressed in quantitative price data [11, 12, 37, 38]. Conditioning generative models on these cues allows the prediction procedure to react to sudden informational shocks, especially in high-volatility settings. Notably, the selectivity of this advantage supports the view that sentiment is a regime-specific indicator rather than a universally predictive feature.

These results have broader implications for multimodal data fusion in financial forecasting. The literature has shown that naive fusion approaches, such as simple feature concatenation, typically fail to fully exploit the complementary characteristics of heterogeneous data sources [13, 28]. Our findings indicate that conditioning mechanisms, in which contextual information actively shapes the generative or predictive process, are better suited to capturing distributional uncertainty. This is in line with current developments in conditional generative modelling and sentiment-directed forecasting [25, 9].

The comparatively poor results of traditional statistical methods underscore the inadequacy of linear assumptions in modern financial markets. Although autoregressive models remain valuable for interpretability and benchmarking [16], their representational limitations leave them ill-equipped to handle nonlinear dependence, regime change, and exogenous information flow. Recent comparative surveys have reached similar conclusions, highlighting the widening performance gap between classical and deep learning architectures in complex market settings [31-33]. Beyond predictive considerations, our findings also bear on the broader debate about market behaviour and information efficiency. The fact that sentiment-conditioned models performed better in certain situations aligns with theories of behavioural finance, which emphasize the role of investor sentiment and bounded rationality in short-term price dynamics [18, 19]. At the same time, these findings do not imply persistent inefficiencies or predictability, since short-lived informational effects can quickly fade as markets adapt. This reading is consistent with a moderate view of market efficiency, in which temporary aberrations are compatible with long-term adaptive behaviour [17].

Several limitations should be considered when interpreting the current findings. The analysis relies on daily data and sentiment derived from social media, which may not reflect intra-day changes or longer-term information effects. Previous research has indicated that sentiment signals are noisy, platform-specific, and potentially biased unless placed in context [22, 21]. Further, even sophisticated architectures struggle under extreme volatility, which underscores the uncertainty of financial markets and the challenge of achieving predictive performance that remains stable in highly turbulent conditions.

Finally, future research can extend this study by investigating regime-aware architectures that dynamically adjust their reliance on numerical and textual inputs. Another attractive direction is the integration of alternative contextual sources, e.g. macroeconomic indicators, analyst reports, or option-implied measures [26, 36]. Furthermore, explainability methods could be incorporated to make the approach more transparent and practically relevant, especially in high-stakes financial applications. Together, these directions demonstrate that context-sensitive deep-learning models can serve financial prediction while accounting for the structural uncertainty of market dynamics.

REFERENCES

  1. D. Billings and J.-S. Yang, Application of the ARIMA models to urban roadway travel time prediction: a case study, In: Proc. of IEEE International Conference on Systems, Man and Cybernetics, pp. 2529–2534, 2006.

  2. R. F. Engle, Autoregressive conditional heteroscedasticity with estimates of the variance of United Kingdom inflation, Econometrica, Vol. 50, No. 4, pp. 987–1007, 1982.

  3. S. Hochreiter and J. Schmidhuber, Long short-term memory, Neural Computation, Vol. 9, No. 8, pp. 1735–1780, 1997.

  4. Y. Bengio, P. Simard, and P. Frasconi, Learning long-term dependencies with gradient descent is difficult, IEEE Transactions on Neural Networks, Vol. 5, No. 2, pp. 157–166, 1994.

  5. T. Fischer and C. Krauss, Deep learning with long short-term memory networks for financial market predictions, European Journal of Operational Research, Vol. 270, No. 2, pp. 654–669, 2018.

  6. D. M. Q. Nelson, A. C. M. Pereira, and R. A. de Oliveira, Stock market's price movement prediction with LSTM neural networks, In: Proc. of International Joint Conference on Neural Networks, pp. 1419–1426, 2017.

  7. I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, Generative adversarial networks, arXiv preprint, 2014.

  8. K. Zhang, G. Zhong, J. Dong, S. Wang, and Y. Wang, Stock market prediction based on generative adversarial network, Procedia Computer Science, Vol. 147, pp. 400–406, 2019.

  9. M. Vuletić, F. Prenzel, and M. Cucuringu, Fin-GAN: forecasting and classifying financial time series via generative adversarial networks, SSRN Electronic Journal, 2023.

  10. E. Brophy, Z. Wang, Q. She, and T. E. Ward, Generative adversarial networks in time series: a systematic literature review, ACM Computing Surveys, Vol. 55, No. 10, 2023.

  11. J. Bollen, H. Mao, and X. Zeng, Twitter mood predicts the stock market, Journal of Computational Science, Vol. 2, No. 1, pp. 1–8, 2011.

  12. A. Derakhshan and H. Beigy, Sentiment analysis on stock social media for stock price movement prediction, Engineering Applications of Artificial Intelligence, Vol. 85, pp. 569–578, 2019.

  13. K. Puh and M. Bagić Babac, Predicting stock market using natural language processing, American Journal of Business, Vol. 38, No. 2, pp. 41–61, 2023.

  14. N. M. Tuan, P. Meesad, and N. H. Son, On a stock prediction aligned to natural language sentiments, In: Proc. of International Conference on Natural Language Processing and Information Retrieval, Article 111, 2024.

  15. S. Dedotsi, A. Lazanas, I. Siachos, D. D. Teloni, and A. G. Telonis, Discrete clusters formulation through the exploitation of optimized k-modes algorithm for hypotheses validation in social work research: the case of Greek social workers working with refugees, BOHR International Journal of Internet of Things, Artificial Intelligence and Machine Learning, Vol. 2, No. 1, pp. 11–18, 2023.

  16. G. E. P. Box, G. M. Jenkins, G. C. Reinsel, and G. M. Ljung, Time Series Analysis: Forecasting and Control, 5th ed., Wiley, Hoboken, NJ, 2015.

  17. E. F. Fama, Efficient capital markets: a review of theory and empirical work, The Journal of Finance, Vol. 25, No. 2, pp. 383–417, 1970.

  18. D. Kahneman and A. Tversky, Prospect theory: an analysis of decision under risk, Econometrica, Vol. 47, No. 2, pp. 263–291, 1979.

  19. R. J. Shiller, Irrational Exuberance, 3rd ed., Princeton University Press, Princeton, NJ, 2015.

  20. P. D. Turney, Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews, In: Proc. of 40th Annual Meeting of the Association for Computational Linguistics (ACL), pp. 417–424, 2002.

  21. C. M. Whissell, The dictionary of affect in language, In: R. Plutchik and H. Kellerman (eds.), The Measurement of Emotions, Elsevier, pp. 113–131, 1989.

  22. P. D. Turney and M. L. Littman, Measuring praise and criticism, ACM Transactions on Information Systems, Vol. 21, No. 4, pp. 315–346, 2003.

  23. C. K. S. Leung, R. K. MacKinnon, and Y. Wang, A machine learning approach for stock price prediction, In: Proc. of 18th International Database Engineering and Applications Symposium, pp. 274–277, 2014.

  24. J. Shen and M. O. Shafiq, Short-term stock market price trend prediction using a comprehensive deep learning system, Journal of Big Data, Vol. 7, No. 1, 2020.

  25. P. Sonkiya, V. Bajpai, and A. Bansal, Stock price prediction using BERT and GAN, arXiv preprint, 2021.

  26. Z. Xu, Y. Bian, J. Zhong, X. Wen, and Q. Xu, Beyond trend and periodicity: guiding time series forecasting with textual cues, arXiv preprint, 2024.

  27. H. Emami, X.-H. Dang, Y. Shah, and P. Zerfos, Modality-aware transformer for financial time series forecasting, arXiv preprint, 2023.

  28. A. Moreno and J. Ordieres-Meré, Predicting stock price trends using language models to extract the sentiment from analyst reports: evidence from IBEX 35-listed companies, Economics Letters, 2025.

  29. C. Kim, S. Kim, J. Kim, D. Lee, and S. Kim, Automated learning rate scheduler for large-batch training, arXiv preprint, 2021.

  30. L. Prechelt, Early stopping — but when? In: G. Montavon, G. B. Orr, and K.-R. Müller (eds.), Neural Networks: Tricks of the Trade, 2nd ed., Springer, pp. 53–67, 2012.

  31. C. Zhang, N. N. A. Sjarif, and R. Ibrahim, Deep learning models for price forecasting of financial time series: a review of recent advancements (2020–2022), WIREs Data Mining and Knowledge Discovery, Vol. 14, No. 1, 2024.

  32. N. Rouf et al., Stock market prediction using machine learning techniques: a decade survey on methodologies, recent developments, and future directions, Applied Sciences, Vol. 10, No. 21, 2021.

  33. K. Benidis et al., Deep learning for time series forecasting: tutorial and literature survey, ACM Computing Surveys, 2022.

  34. T. Julian, T. Devrison, V. Anora, and K. M. Suryaningrum, Stock price prediction model using deep learning optimization based on technical analysis indicators, Procedia Computer Science, Vol. 227, pp. 939–947, 2023.

  35. S. Li, Enhancing stock price prediction using GANs and transformer-based attention mechanisms, Empirical Economics, 2025.

  36. J. Woo et al., Towards time series generation conditioned on unstructured natural language descriptions, arXiv preprint, 2025.

  37. P. Azar, The wisdom of Twitter crowds: predicting stock market reactions to FOMC meetings via Twitter feeds, SSRN Electronic Journal, 2016.

  38. I. Goodfellow, NIPS 2016 tutorial: Generative Adversarial Networks, arXiv preprint, 2016.