Stabilarity Hub

FLAI: An Intelligent System for Social Media Trend Prediction Using Recurrent Neural Networks with Dynamic Exogenous Variable Injection

Posted on March 25, 2026 · Anticipatory Intelligence · Academic Research · Article 18 of 19
Authors: Dmytro Grybeniuk, Oleh Ivchenko

Academic Citation: Grybeniuk, Dmytro & Ivchenko, Oleh (2026). FLAI: An Intelligent System for Social Media Trend Prediction Using Recurrent Neural Networks with Dynamic Exogenous Variable Injection. Odessa National Polytechnic University, Department of Economic Cybernetics.
DOI: 10.5281/zenodo.19226414[1]  ·  View on Zenodo (CERN)
5,060 words · 0% fresh refs · 4 diagrams · 14 references


Abstract #

Social media platforms — foremost TikTok and Instagram — generate billions of interaction events daily, creating stochastic, high-velocity Big Data streams whose trend trajectories prove notoriously difficult to forecast with classical statistical models. This paper presents FLAI, an intelligent information-analytical system for predicting the behaviour of social-network objects, with emphasis on music-trend reposts on TikTok. The core technical contribution is an original Recurrent Neural Network (RNN) architecture augmented with a novel Injection Layer — a mechanism that dynamically incorporates exogenous variables (viral news, influencer activity, platform seasonality, and social events) directly into the network’s memory state at every inference step. Unlike standard ARIMA or vanilla LSTM models that require full retraining to absorb distributional shifts, FLAI’s adaptive recalibration predictor updates synapse weights online, achieving sub-one-hour response to “black swan” events. The mathematical model is fully formalised through nine first-chapter equations and five validation expressions covering the daily-rise metric DR(n), the interpolation-aware state parameter dS(n), the dynamically generated graph weights GW(n), the self-correcting error term DRFE(n), and the forward forecast FR(n+1). Trained and validated on a dataset of 2.7 million publicly available engagement records from TikTok and Instagram, the system achieves 99.76% accuracy on the test set, R² = 0.988, and demonstrates a 19–24% MAPE improvement over ARIMA and standard LSTM baselines. Approximation error decreases from Ā₁₋₂₅ = 5.20% to Ā₂₆₋₅₀ = 4.25% across sequential training windows, confirming successful online learning. The microservice platform — built on .NET Core MVC and a Python/Flask AI service — has been validated commercially and attracted wide media coverage across Ukrainian technology outlets.

1. Introduction #

The Creator Economy has transformed social media from passive broadcasting infrastructure into a dynamic competitive marketplace where trends emerge and collapse within hours. On TikTok alone, more than one billion active users generate an uninterrupted stream of music trends, viral challenges, influencer collaborations, and audio-driven content cycles. For music labels, content strategists, and talent agencies, the ability to anticipate which audio track will cross the virality threshold 24–48 hours in advance translates directly into competitive advantage: early promotion amplifies organic reach, secures brand partnerships, and optimises advertising spend.

The core forecasting challenge is structural. Repost counts — the engagement metric used as the primary target variable in this research — exhibit stochastic, non-stationary behaviour with frequent sudden discontinuities. These discontinuities, colloquially termed “black swan” events by practitioners, arise from external shocks: a celebrity endorsement, a breaking news cycle that hijacks a trending sound, an algorithm update that elevates a previously invisible account, or a geopolitical event that redirects audience attention. Classical time-series approaches such as ARIMA are analytically rigorous but assume stationarity and cannot accommodate exogenous signals at inference time. Facebook’s Prophet model adds holiday regressors but lacks a mechanism for real-time, arbitrary external signal injection. Standard LSTM architectures improve on both but still require complete retraining to ingest distributional shifts — a process that takes 24–48 hours and leaves the model blind to fast-moving events.

The FLAI project, initiated in 2023 at the Department of Economic Cybernetics and Information Technologies of Odessa National Polytechnic University within Research Project No. 267-68 (“Theoretical and Applied Problems of Implementing Intelligent Information Technologies and Model Applications in Business System Management”), addresses these limitations through a purpose-built RNN architecture with a dynamic Injection Layer. The system was co-authored by Oleh Ivchenko and Dmytro Hrybeniuk, with scientific supervision by Professor Zoya Sokolovska (D.Sc., Economics).

Research Questions #

RQ1: How can exogenous variables be dynamically injected into RNN architectures to improve social media trend prediction accuracy?

RQ2: What is the comparative performance advantage of the Injection Layer approach versus standard ARIMA and LSTM baselines?

RQ3: How does the adaptive recalibration mechanism handle “black swan” events in real-time trend data?


Data Sources and Legal Compliance #

This research relies exclusively on publicly available data that was accessible on the open internet at the time of collection. No proprietary, restricted, or private data sources were used. All data collection adhered to the terms of service of the respective platforms and applicable data protection regulations.

Data Provenance #

The dataset of 2.7 million records was assembled from publicly visible social media content — specifically, publicly posted music trend statistics, repost counts, and engagement metrics that any internet user could observe by browsing the platforms. The data represents aggregate, non-personal engagement metrics (repost counts, trend trajectories) rather than individual user behaviour or personally identifiable information. No private profiles, direct messages, or restricted content were accessed.

Legal and Ethical Framework #

  • Public data only: All data points used in this research were publicly visible on the open internet at the time of collection. No authentication-restricted, private, or behind-paywall data was used.
  • No personal data: The dataset contains aggregate engagement metrics (repost counts, trend indices) — not personal user data, names, or identifiers. No personally identifiable information (PII) was collected, stored, or processed.
  • Academic research purpose: This work was conducted under Research Project No. 267-68 at the Department of Economic Cybernetics and Information Technologies of Odessa National Polytechnic University, a registered academic institution. The research serves a legitimate scientific purpose: advancing forecasting methodology for stochastic social media data.
  • Reproducibility from public sources: The analytical methodology described in this paper can be independently reproduced by any researcher using the same publicly available data sources. The mathematical model (Equations 1.1–1.9, 3.1–3.5) and the evaluation framework (MAPE, R², accuracy) are fully specified to enable replication.
  • No data redistribution: Raw data is not redistributed. Only aggregate statistics, model outputs, and derived analytical results are presented in this publication.

Project Timeline #

gantt
    title FLAI Project Timeline
    dateFormat YYYY-MM
    axisFormat %b %Y
    section Research
        Literature review and methodology design     :2023-01, 2023-06
        Mathematical model formalisation (Eq. 1.1-1.9) :2023-03, 2023-08
        University registration (Project No. 267-68) :milestone, 2023-02, 0d
    section Data Collection
        Public data collection and ETL pipeline      :2023-04, 2024-01
        Data cleaning and interpolation validation   :2023-09, 2024-02
    section Development
        RNN architecture with Injection Layer        :2023-06, 2024-03
        Adaptive recalibration predictor (DRFE)      :2023-10, 2024-04
        Microservice platform (.NET Core + Flask)    :2024-01, 2024-06
    section Validation
        Model training and baseline comparison       :2024-03, 2024-08
        Validation expressions (Eq. 3.1-3.5)         :2024-05, 2024-09
        Commercial pilot and media coverage          :2024-06, 2024-12
    section Publication
        Academic paper preparation                    :2024-10, 2025-03
        Zenodo DOI registration                      :milestone, 2026-03, 0d
        Stabilarity Research Hub publication          :milestone, 2026-03, 0d

2. Existing Approaches (2026 State of the Art) #

The landscape of time-series forecasting for social media data has evolved considerably over the past decade. We survey the principal families of approaches and characterise their limitations relative to the demands of real-time trend prediction.

2.1 Classical Statistical Models: ARIMA #

AutoRegressive Integrated Moving Average (ARIMA) and its seasonal extension SARIMA have long served as the workhorse of time-series analysis [1][2]. ARIMA decomposes a series into autoregressive, differencing, and moving-average components and is well-suited to stationary or trend-stationary processes. However, social media repost series are neither stationary nor Gaussian: they exhibit fat-tailed spikes, abrupt regime changes, and strong exogenous dependency. Fitting an ARIMA model requires the analyst to specify the (p, d, q) order in advance and provides no mechanism for injecting real-time external signals after training. When a black-swan event occurs, the model’s predictions diverge rapidly from observed values until the next retraining cycle — typically 24–48 hours later.

2.2 Prophet and Additive Decomposition Models #

Meta’s Prophet framework [2][3] improves on ARIMA by decomposing the series into trend, seasonality, and holiday components, and by supporting custom regressors. For social media applications, Prophet handles weekly and daily seasonality reasonably well. However, the regressor mechanism requires the analyst to enumerate, engineer, and supply exogenous features at prediction time — it does not learn to weight them dynamically from data. Sudden, unscheduled events (viral news, celebrity deaths, geopolitical crises) that are not captured in a pre-specified regressor set remain invisible to Prophet. Furthermore, Prophet’s piecewise linear trend model may not capture the exponential growth dynamics common to viral content.

2.3 Standard LSTM and Transformer Models #

Long Short-Term Memory networks [3][4] represent the current industry baseline for sequence modelling in social media analytics. LSTM’s gating mechanism (input, forget, output gates) allows the network to retain relevant long-range dependencies and forget noise. Recent extensions — including attention mechanisms [4][5] and Temporal Fusion Transformers [5][6] — further improve representational capacity. Nevertheless, all these architectures share a fundamental limitation for the trend-prediction use case: exogenous variables must be embedded as static input features at training time. If the distribution of those features shifts during inference — as invariably happens when viral events occur — the model’s implicit priors become miscalibrated. Correcting this requires batch retraining on updated data, imposing latency that is unacceptable for sub-hourly trend markets.

2.4 Comparative Overview #

The diagram below summarises the capability comparison across the four principal approaches evaluated in this work.

graph TD
    subgraph "Forecasting Approaches — Capability Matrix"
        A["ARIMA\n• Stationary series\n• No exogenous injection\n• Full retrain on shift\n• 24-48h adaptation lag"]
        B["Prophet\n• Additive decomposition\n• Static regressors only\n• Moderate exogenous support\n• No real-time injection"]
        C["Standard LSTM\n• Long-range dependencies\n• Exogenous as static features\n• Batch retrain required\n• No black-swan response"]
        D["FLAI Injection Layer RNN\n• Dynamic exogenous injection\n• Online weight recalibration\n• Sub-1h black-swan response\n• 19-24% MAPE improvement"]
    end
    A -->|"Add seasonality"| B
    B -->|"Add nonlinearity"| C
    C -->|"Add Injection Layer + DRFE"| D
    style D fill:#f3f3f3,stroke:#000,stroke-width:2px

The key differentiator of the FLAI approach is the ability to modify internal network weights between training epochs in response to observed forecast error, without triggering a full backpropagation cycle on historical data.


3. Quality Metrics & Evaluation Framework #

To provide a rigorous and reproducible evaluation, the study adopts a three-metric assessment framework aligned with standard practice in time-series forecasting literature.

3.1 Metric Definitions #

Mean Absolute Percentage Error (MAPE) measures the average magnitude of forecast error as a percentage of actual values:

MAPE = (100% / N) · Σₙ |R(n) − FR(n)| / R(n),  n = 1, …, N

MAPE is scale-invariant, which is essential when comparing forecasts across music trends with vastly different repost magnitudes (from emerging micro-trends with hundreds of daily reposts to viral phenomena with millions).

Mean Squared Error (MSE) penalises large deviations quadratically and is used as the loss function during neural network training:

MSE = (1/N) · Σₙ ( R(n) − FR(n) )²

Coefficient of Determination (R²) quantifies the proportion of variance in actual reposts explained by the model’s predictions:

R² = 1 − Σₙ ( R(n) − FR(n) )² / Σₙ ( R(n) − R̄ )²,

where R̄ is the mean observed repost count over the evaluation window.

An R² approaching 1.0 indicates the model’s forecast trajectory closely tracks actual trend dynamics.
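These three metric definitions can be made concrete in a few lines of pure Python (a minimal sketch; the function and variable names are illustrative, not taken from the FLAI codebase):

```python
def mape(actual, forecast):
    """Mean Absolute Percentage Error, in percent (actual values must be non-zero)."""
    n = len(actual)
    return 100.0 / n * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))

def mse(actual, forecast):
    """Mean Squared Error: quadratic penalty, used as the training loss."""
    n = len(actual)
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n

def r_squared(actual, forecast):
    """Coefficient of determination: share of variance explained by the forecast."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - f) ** 2 for a, f in zip(actual, forecast))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot
```

Because MAPE normalises each error by the actual value, it compares fairly across trends of very different repost magnitudes, which is why it is the headline metric here.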

3.2 Quality Thresholds #

| Metric | Acceptable | Good | Excellent | FLAI Result |
| --- | --- | --- | --- | --- |
| MAPE | < 15% | < 8% | < 5% | 4.69% |
| R² | > 0.70 | > 0.85 | > 0.95 | 0.988 |
| Overall Accuracy (1 − MSE) | > 85% | > 95% | > 99% | 99.76% |
| Black-swan response latency | < 48h | < 12h | < 1h | < 1h |

3.3 Evaluation Framework #

flowchart LR
    A[Raw Public Data\n2.7M+ records] --> B[ETL Pipeline\nCleaning · Normalisation · Interpolation]
    B --> C[Train / Validation / Test Split\nChronological · 70/15/15]
    C --> D[Model Training\nRNN + Injection Layer\nOnline weight update]
    D --> E[Inference\nFR n+1 generated daily]
    E --> F{Quality Gates}
    F --> G[MAPE < 5%?\nR² > 0.95?\nAccuracy > 99%?]
    G -- Pass --> H[Production Deployment\nFlai Platform]
    G -- Fail --> D
    F --> I[Approximation Analysis\nĀ₁₋₂₅ vs Ā₂₆₋₅₀\nLearning curve validation]

4. Mathematical Model #

The mathematical model of FLAI is formalised in two layers: a structural description of model variables (Equations 1.1–1.9) and a set of validation expressions derived during implementation (Equations 3.1–3.5). The formulation adopts a discrete-time dynamic programming perspective in which the model state at each iteration n is updated both from observed data and from the self-correcting error signal generated by the previous prediction.

4.1 Core Notation #

Let i index an individual social-network object (a music trend, influencer, or video) and n index the simulation date within the period n = 1, …, N. The key model variables are:

  • R_i(n) — the observed repost count for object i on day n; a stochastic time series
  • dS_i(n) — the number of days skipped in data collection for object i at day n
  • bW_i(0) — the initial base weight for object i, representing prior success level ∈ [0, 1]
  • DRF_i(0) — the initial repost forecast at simulation start (e.g., 0.1)
  • DRFE_i(0) — the initial forecast error correction coefficient ∈ [0, 1]

4.2 Controlling Boolean and Constraints #

The system first evaluates whether historical data is available for the object, a controlling Boolean that can be written as:

B_i = 1 if past observations R_i(n) exist for object i, and B_i = 0 otherwise.

To handle the inherent irregularity of social-media data collection — gaps in collection, platform outages, or holidays — the model enforces an interpolation constraint: the number of skipped days dS_i must not exceed half the total simulation period,

dS_i(n) ≤ dS_max = N / 2,

where dS_i(n) is the day-index difference between the current observation and the most recent preceding one, and N is the total number of simulation days.

This constraint ensures that interpolation remains statistically valid: filling gaps spanning more than 50% of the series would introduce more noise than it removes. A second constraint governs the self-correcting predictor, keeping the correction coefficient strictly positive,

DRFE_i(n) > 0,

ensuring that the correction signal never becomes degenerate. If this condition would be violated, the base weight bW_i(0) is used to inject a non-zero floor value, preserving numerical stability.

4.3 Dynamic Forecasting Equation #

The central forecasting target is the repost count at the next time step, expressed as a function of the observed series history, the exogenous context, and the accumulated correction term:

FR_i(n+1) = F( R_i(1), …, R_i(n); X_i(n); DRFE_i(n) )

where X_i(n) is the set of exogenous variables at time n — the Injection Layer’s input signal — which may include: publication time (hour, day-of-week), content category, platform identifier, seasonality indicators, and real-time contextual signals such as trending hashtags or breaking news metadata.

This equation is the mathematical expression of the Injection Layer concept: the forecast is not merely a function of the target series history R(n) but also depends on the live external context X(n) and the accumulated self-correction term DRFE(n).

4.4 Sigmoid Activation Function #

The RNN architecture uses the Sigmoid function as the neuron activation, providing probabilistic interpretation of outputs in the range (0, 1):

f(x) = 1 / (1 + e^(−x))

The weighted input to each neuron is:

x = Σ_i X_i · GW_i + b,

where X_i are the input signals, GW_i are the synapse weights, and b is the bias term. The argument x determines the neuron’s activation level, and f(x) ∈ (0, 1) is interpreted as the probability of a repost-growth event exceeding the trend threshold.
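As a toy illustration of this activation step (variable names mirror the paper’s notation; this is not FLAI production code):

```python
import math

def sigmoid(x):
    # f(x) = 1 / (1 + e^(-x)), mapping any real input into (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def neuron_activation(inputs, weights, bias):
    # Weighted input: x = sum(X_i * GW_i) + b
    x = sum(xi * gwi for xi, gwi in zip(inputs, weights)) + bias
    # Interpreted as the probability of repost growth exceeding the trend threshold
    return sigmoid(x)
```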

4.5 Daily Rise and Interpolation #

The absolute indicator of daily repost change — DailyRise (DR) — is computed with explicit interpolation correction for missing days (Equation 1.8):

DR_i(n) = ( R_i(n) − R_i(n − dS_i(n)) ) / dS_i(n)

When dS_i(n) > 1, the formula automatically normalises the repost delta by the number of elapsed days, preventing spurious spikes in the DR series after multi-day gaps. This is a qualitatively important design choice: standard LSTM implementations typically pad missing values with zeros or mean-imputed values, both of which corrupt the local derivative of the series.
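A minimal sketch of this gap-aware computation, assuming observations arrive as parallel lists of counts and day indices (the list layout is an assumption for illustration, not the FLAI data schema):

```python
def daily_rise(reposts, days):
    """Compute the DR series from irregularly spaced observations.

    reposts[k] is the observed repost count on day days[k]; the day
    indices may contain gaps. The repost delta is normalised by the
    number of elapsed days dS(n) instead of zero-padding missing days,
    which would corrupt the local derivative of the series.
    """
    dr = []
    for k in range(1, len(reposts)):
        ds = days[k] - days[k - 1]  # dS(n): elapsed days, >= 1
        dr.append((reposts[k] - reposts[k - 1]) / ds)
    return dr
```

With a three-day gap, the delta is divided by 3, so the DR series shows a steady rate of change rather than a spurious spike after the gap.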

4.6 Repost Prediction and Synapse Weight #

The prediction of today’s repost count DRF_i(n) (DailyRiseForecasting) is formed by multiplying the daily rise metric by the current synapse weight:

DRF_i(n) = DR_i(n) · GW_i(n)

The synapse weight GW_i(n) is updated at each iteration through the backpropagation process, encoding the network’s current “belief” about how predictive the historical rate of change is for the next step.

4.7 Error Correction and Forward Forecast (Validation Equations) #

During validation, five additional expressions (Equations 3.1–3.5) govern the model’s self-correction cycle.

The forecast error for the previous day captures the relative discrepancy between prediction and observation (Equation 3.1):

DRFE_i(n) = ( FR_i(n) − R_i(n) ) / R_i(n)

The forward forecast (predicted total reposts for the next day) integrates the current observed count with the predicted daily rise (Equation 3.2):

FR_i(n+1) = R_i(n) + DRF_i(n)

The absolute forecast error at each step is (Equation 3.3):

e_i(n) = | R_i(n) − FR_i(n) |

The Mean Squared Error across all validation points (Equation 3.4):

MSE = (1/N) · Σₙ ( R(n) − FR(n) )²

The coefficient of determination confirming model fit quality (Equation 3.5):

R² = 1 − Σₙ ( R(n) − FR(n) )² / Σₙ ( R(n) − R̄ )²

The iterative loop implementing these equations constitutes the adaptive recalibration predictor — the mechanism that answers RQ3. At each daily step, the model compares FR(n) against R(n), computes DRFE(n), updates GW(n+1), incorporates the current exogenous context X(n), and produces FR(n+1) — all without reloading historical training data or triggering a full gradient descent epoch over the complete dataset.
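The daily cycle can be sketched as a single step function. This is a deliberately simplified toy: the real system updates GW(n) through backpropagation and conditions it on the exogenous vector X(n), whereas here a scalar weight is nudged by a hypothetical learning rate `eta` (an assumption, not a published FLAI parameter):

```python
def recalibration_step(fr, r_prev, r_now, gw, eta=0.1):
    """One daily step of the adaptive recalibration predictor (toy sketch).

    fr      : yesterday's forward forecast FR(n) for today
    r_prev  : observed reposts R(n-1)
    r_now   : observed reposts R(n)
    gw      : current synapse weight GW(n)
    eta     : illustrative learning rate for the online correction
    Returns (fr_next, gw_next, drfe).
    """
    drfe = (fr - r_now) / r_now         # relative forecast error DRFE(n)
    gw_next = gw * (1.0 - eta * drfe)   # online weight correction, no batch retrain
    dr = r_now - r_prev                 # daily rise DR(n), assuming dS(n) = 1
    fr_next = r_now + dr * gw_next      # forward forecast FR(n+1)
    return fr_next, gw_next, drfe
```

The key property this sketch preserves is that the update uses only the current day’s observation and error signal: no pass over historical data is needed, which is what makes sub-hourly adaptation feasible.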


5. System Architecture #

The FLAI platform adopts a microservice architecture that cleanly separates the user-facing web application from the AI inference service, enabling independent scaling, hot-swappable model updates, and enterprise-grade security.

5.1 Architectural Layers #

Front-end (HTML / CSS / JavaScript) The user interface provides four primary analytical sections: Sounds, Videos, Influencers, and Trends. Each section exposes interactive dashboards with cumulative and growth charts, historical repost time series, AI prediction overlays (green dotted line), and downloadable CSV reports. Users apply filters by geography, content category, hashtag, and date range.

Back-end (.NET Core MVC / C#) The .NET layer handles: (1) user authentication and authorisation via JWT tokens compliant with RFC 7519; (2) configuration storage in an MSSQL Server database; (3) HTTP gateway routing requests to the AI microservice; (4) result caching and delivery. .NET Core was selected for enterprise-grade reliability, horizontal scalability, and ecosystem maturity in handling concurrent users at production volume.

AI Microservice (Python / Flask) The Python service hosts the RNN with Injection Layer, implemented in TensorFlow. Supporting libraries include: Pandas and NumPy for data processing, SciPy for scientific computation, StatsModels for ARIMA baseline comparison, Scikit-learn for regression validation, and Matplotlib for chart generation. The service exposes a REST endpoint consumed by the .NET back-end via HTTP requests. The architecture allows the ML model to be updated without service interruption (hot-swap pattern).

Database (MSSQL Server) Stores user configuration, model parameters, synapse weight checkpoints, and forecast result history. The logical schema includes tables for: User, Content, Trend, Sound, Video, Influencer, and Administrator entities.

5.2 Information Flow #

flowchart TB
    subgraph External["External Data Sources"]
        T["TikTok Public Data\n(sounds, videos, hashtags,\nlikes, reposts, trends)"]
        I["Instagram Public Data\n(posts, reels, engagement)"]
    end

    subgraph ETL["ETL Pipeline"]
        C["Data Collection\nIngestion · deduplication"]
        N["Normalisation & Cleaning\nDuplicate removal · Bot detection\n95%+ accuracy"]
        INT["Interpolation\ndS(n) correction for gaps\ndSmax = N/2"]
        CSV["Structured CSV Export\nDate · Reposts · Features"]
    end

    subgraph AI["AI Microservice (Python / Flask)"]
        CL["Clustering\nGroup similar trends"]
        TS["Time Series Analysis\nDR(n) · dS(n) computation"]
        RNN["RNN + Injection Layer\nGW(n) update · DRFE(n)\nFR(n+1) generation"]
        VIZ["Matplotlib\nChart generation"]
    end

    subgraph BE["Back-end (.NET Core MVC)"]
        JWT["JWT Auth · RFC 7519"]
        GW2["Service Gateway"]
        DB[("MSSQL Server\nConfig · Weights · Results")]
    end

    subgraph FE["Front-end (HTML / JS / CSS)"]
        DASH["Dashboard\nTrends · Sounds · Videos · Influencers"]
        CHART["Interactive Charts\nHistorical + AI Prediction overlay"]
        REPORT["CSV / PDF Reports"]
    end

    T --> C
    I --> C
    C --> N --> INT --> CSV
    CSV --> CL --> TS --> RNN --> VIZ
    RNN --> GW2
    VIZ --> GW2
    GW2 --> JWT --> DB
    GW2 --> DASH --> CHART --> REPORT

    style AI fill:#f9f9f9,stroke:#000
    style BE fill:#f3f3f3,stroke:#000

5.3 Big Data Processing #

The raw input streams of public TikTok data exhibit all five characteristics of Big Data: Volume (billions of daily events), Velocity (real-time streaming), Variety (text, metrics, hashtags, audio metadata), Veracity (bot activity, spam, duplicates), and Value (trend-prediction signal). The ETL pipeline addresses Veracity through a bot-detection module achieving 95%+ identification accuracy, and addresses Variety by normalising heterogeneous data formats into a unified time-series format (date, repostcount, contentid, feature_vector).

The interpolation algorithm — governed by Equations 1.2, 1.7, and 1.8 — ensures that data gaps of up to dS_max = N/2 days are filled through normalised daily-rise estimation rather than zero-padding or mean substitution, preserving local trend dynamics.


6. Results and Validation #

6.1 Dataset #

Training and validation used a dataset of 2.7 million publicly available records collected from TikTok, spanning music trends, influencer accounts, and video content tracked over the period January 2023 – June 2024. An “eleventh class” of randomly sampled posts was added to balance the class distribution, preventing the model from developing a majority-class bias. Data was split chronologically: 70% training, 15% validation, 15% test — preserving temporal ordering to prevent data leakage.
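A leakage-free chronological split of this kind takes only a few lines (the record layout and function name are illustrative assumptions):

```python
def chronological_split(records, train=0.70, val=0.15):
    """Split time-ordered records 70/15/15 without shuffling.

    A shuffled split would leak future observations into training;
    sorting by date and slicing preserves temporal ordering, so the
    test set is strictly later than validation, which is strictly
    later than training.
    """
    records = sorted(records, key=lambda r: r["date"])
    n = len(records)
    i = int(n * train)
    j = int(n * (train + val))
    return records[:i], records[i:j], records[j:]
```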

6.2 Approximation Analysis and Learning Curve #

The neural network’s online learning capability is demonstrated by computing the approximation level Ā across sequential time windows:

| Window | Time Points | Approximation Error (Ā) |
| --- | --- | --- |
| Initial | n = 1–25 | 5.20% |
| Trained | n = 26–50 | 4.25% |
| Overall | n = 1–50 | 4.69% |

The decrease from Ā₁₋₂₅ = 5.20% to Ā₂₆₋₅₀ = 4.25% confirms that online weight updates (DRFE-driven synapse corrections) produce measurable accuracy improvement across sequential training windows — validating the adaptive recalibration mechanism without full retraining.

6.3 Comparative Results #

| Model | MAPE | R² | Overall Accuracy | Black-Swan Response |
| --- | --- | --- | --- | --- |
| ARIMA | ~24% | ~0.71 | ~76% | 24–48h |
| Standard LSTM | ~20% | ~0.81 | ~80% | 24–48h (retrain) |
| Prophet | ~22% | ~0.75 | ~78% | Not supported |
| FLAI (Injection Layer RNN) | 4.69% | 0.988 | 99.76% | < 1h |

The improvement over ARIMA baseline is 19–24 percentage points in MAPE, consistent across the full validation dataset and across individual trend categories (music sounds, video challenges, influencer growth).

6.4 Machine Learning Model Benchmarks #

Beyond the primary RNN, exploratory regression models were evaluated on the structured feature dataset:

| Model | R² (InfluencerSubscribers) |
| --- | --- |
| Random Forest | 0.9289 |
| Gradient Boosting (XGBoost) | 0.9496 |
| FLAI RNN + Injection Layer | 0.988 |

Correlation analysis of input features revealed that VideoLikes, VideoShares, and AuthorSubscribers exhibit the highest Pearson coefficients with the target variables (VideoPlayCount and InfluencerSubscribers), confirming their selection as primary input features. SoundDuration and VideoParseDate showed low linear correlation but may encode non-linear patterns relevant to the RNN.
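Feature screening of this kind reduces to computing Pearson’s r for each candidate column against the target; a self-contained sketch (function names and the dictionary layout are assumptions for illustration, not the study’s actual pipeline):

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rank_features(features, target):
    """Sort candidate features by |r| against the target variable."""
    scored = {name: pearson_r(col, target) for name, col in features.items()}
    return sorted(scored.items(), key=lambda kv: abs(kv[1]), reverse=True)
```

Note that a low Pearson coefficient only rules out a linear relationship, which is why features such as SoundDuration are retained for the RNN despite ranking low here.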

6.5 Operational Performance #

| Metric | Value |
| --- | --- |
| Forecast generation time | < 1 second |
| Model recalibration latency | < 1 hour |
| System uptime | 99.9% |
| Service response time | < 500ms |
| Concurrent users supported | 1,000+ |
| Prediction horizon | 24 hours (90%+ accuracy) |

7. Methodological Note #

§1 — Dataset Construction #

The primary dataset was assembled from publicly available TikTok data. The collection process gathered publicly visible video data (sounds, likes, comments), trend data (popularity of sounds, hashtags), and influencer metrics — all of which were publicly accessible on the platform. The dataset is organised as a set of individual time series — one per social-network object — with fields: Date, Reposts, ContentType, AuthorId, HashtagList, SoundId. Total volume: 2.7 million records. A balanced random sample of 11th-class posts (volume equal to the sum of all other classes combined) was added to prevent class imbalance from biasing the model.

§2 — Comparison Methodology #

Baseline models (ARIMA, Prophet, standard LSTM) were trained on identical train splits under identical feature sets. MAPE was computed on the held-out chronological test set using the formula in Section 3.1. The FLAI Injection Layer RNN was trained online (step-by-step optimisation), updating weight coefficients after each daily iteration. Stochastic gradient descent was used for backpropagation, with the MSE loss function (Equation 3.4).

§3 — Results Summary #

| Model | Test MAPE | R² | Δ vs ARIMA |
| --- | --- | --- | --- |
| ARIMA | ~24% | 0.71 | — |
| Prophet | ~22% | 0.75 | −2 pp |
| Standard LSTM | ~20% | 0.81 | −4 pp |
| FLAI | 4.69% | 0.988 | −19 to −24 pp |

§4 — Primary Publication #

The mathematical model and empirical results documented in this paper were first reported in the peer-reviewed journal article:

Ivchenko I.Yu., Haidaienko O.V., Knyryk N.R., Morozova H.S., Hrybeniuk D.M. Forecasting the behaviour of social network objects. Khmelnytskyi National University. Technical Sciences, 341(5), pp. 317–321. [6][7]

Supporting publications from the research group include:

Ivchenko I.Yu., Linhur L.M., Ivchenko O.I. Forecasting social network time series based on a neural network approach. Infrastruktura rynku, Issue 70, 2023, pp. 194–198. [7][8]

Ivchenko I.Yu., Ivchenko O.I., Radkevich I.O. Using internet technologies for diagnostics and forecasting of business decision-making processes. Prychornomorski ekonomichni studii, Vol. 27, Issue 81, 2023, pp. 224–228. [8]


8. Novelty and Originality #

8.1 What Makes the Injection Layer Novel #

Standard approaches to incorporating exogenous variables into LSTM or RNN architectures treat external signals as static input features concatenated to the input vector at training time [9]. This approach has two fundamental limitations: (1) it requires the analyst to enumerate all relevant exogenous signals before training; (2) it cannot respond to signals whose statistical character changes after training — a routine occurrence in fast-moving social media environments.

The FLAI Injection Layer differs in three qualitatively important ways:

Dynamic weight generation. At each inference step n, the layer computes new synapse weights GW(n) from the current input state R(n) and the exogenous signal X(n), rather than using weights frozen at training time. The update rule is governed by the DRFE(n) error correction term, which ensures the weight revision is proportional to the magnitude of the previous prediction error.

Online recalibration without full retraining. The weight update cycle (Equations 1.7–1.9 and 3.1–3.2) executes at inference time, requiring no gradient computation over historical batches. This achieves sub-one-hour adaptation — contrasted with 24–48 hours for batch-retrained baselines.

Interpolation-aware exogenous signal processing. The dS(n) correction (Equation 1.8) normalises the daily-rise signal by the actual number of elapsed days, preventing data gaps from corrupting the exogenous signal gradient. Standard implementations do not apply this correction.
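The three properties above can be combined into a toy sketch of one online recalibration step. This is an illustration only: the authors' actual update rules are Equations 1.7–1.9 and 3.1–3.2, which are not reproduced in this section. Only two relationships are taken from the text — DR(n) is normalised by the elapsed days dS(n), and DRF(n) = DR(n) × GW(n); the specific form of the weight revision (proportional to the previous relative error DRFE(n), with a learning-rate factor `lr`) is an assumption.

```python
def injection_step(R_prev, R_curr, gw, days_skipped, drf_prev, lr=0.5):
    """One illustrative online recalibration step (not the paper's exact
    Equations 1.7-1.9 / 3.1-3.2). Returns (new_gw, drf), where
    drf = DR(n) * GW(n) per the glossary definition."""
    # Interpolation-aware daily rise: normalise by the actual elapsed days dS(n)
    dr = (R_curr - R_prev) / max(days_skipped, 1)
    # Relative error of the previous forecast, standing in for DRFE(n)
    drfe = 0.0 if drf_prev is None or dr == 0 else (dr - drf_prev) / dr
    # Weight revision proportional to the magnitude of the previous error
    new_gw = gw * (1.0 + lr * drfe)
    return new_gw, dr * new_gw
```

Because the update needs only the current observation and the previous forecast, it runs at inference time with no gradient computation over historical batches, which is the property the sub-one-hour adaptation claim rests on.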

8.2 Comparison with Related Work #

Recent literature on social media trend forecasting has explored hybrid ARIMA-LSTM models [10], attention mechanisms for event detection in Twitter streams [11], and graph neural networks for influence propagation modelling. None of these works, to the authors’ knowledge, proposes a mechanism that (a) dynamically recomputes synapse weights at inference time, (b) conditions those weights on a self-correcting error signal DRFE(n), and (c) applies interpolation correction to the exogenous input. The specific combination constitutes the original contribution of the FLAI system.

8.3 Zenodo Archive #

The methodological note for this work has been deposited and archived at Zenodo: [12]

8.4 Media Recognition and Commercial Validation #

The novelty of the FLAI system was recognised by the Ukrainian technology media community upon the platform’s public launch in 2023. Coverage appeared in:

  • dev.ua — Ukraine’s leading technology news platform, featuring the FLAI launch as an AI startup analysis tool for TikTok trends
  • bazilik.media — reporting 90% 24-hour prediction accuracy and the startup’s practical utility for content creators
  • detector.media — Ukraine’s media-monitoring publication, highlighting the system’s influencer and trend analysis capabilities
  • weekend.zone — lifestyle-technology coverage of the Odesa-founded startup
  • vsviti.com.ua and ukr.net — broad Ukrainian digital media aggregators

The system was further mentioned in Estonian online podcasts and attracted international attention consistent with the WebSummit Alpha startup ecosystem. Commercial deployment serves content creators, music labels, brand managers, and talent agencies seeking data-driven viral content strategy.


9. Author Contributions #

| Contributor | Contributions |
|---|---|
| Oleh Ivchenko | Mathematical model formalisation (Equations 1.1–1.9, 3.1–3.5); Injection Layer architecture design; research methodology; scientific supervision of the project; academic publications preparation; integration with Research Project No. 267-68 |
| Dmytro Hrybeniuk | System implementation (.NET Core MVC, Python/Flask AI microservice); public data collection from TikTok; dataset construction (2.7M+ records); neural network training pipeline; system testing and validation; deployment and performance optimisation |

The research was conducted under scientific supervision of Prof. Zoya Sokolovska (D.Sc., Economics) within Research Project No. 267-68 at the Department of Economic Cybernetics and Information Technologies, Odessa National Polytechnic University (2025–2030).


10. Glossary #

| Term | Definition |
|---|---|
| R(n) | Quantitative indicator (repost count) for a social-network object at time step n; the primary target time series |
| dS(n) | daysSkipped — the number of days missing between consecutive data points; used in interpolation correction |
| DR(n) | DailyRise — absolute indicator of daily repost change, normalised by dS(n) to account for data gaps |
| DRF(n) | DailyRiseForecasting — predicted daily repost count for the current step, computed as DR(n) × GW(n) |
| DRFE(n) | DailyRiseForecastingError — relative forecast error from the previous iteration; drives online weight recalibration |
| FR(n+1) | ForecastingRepost — total predicted repost count for the next day; the primary model output |
| GW(n) | GraphWeight — dynamically generated synapse weight at iteration n; reflects the network’s current prediction confidence |
| bW(0) | BaseWeight — initial weight for object i, representing the object’s prior success level ∈ [0, 1] |
| pV | pastValue — Boolean flag: 1 if historical data is available for the object, 0 if not |
| dSmax | Maximum tolerated data gap = N/2 (half the simulation period); beyond this threshold, interpolation is considered unreliable |
| X(n) | Set of exogenous variables at time n: publication time, content category, seasonality, viral event signals; the Injection Layer’s input |
| Injection Layer | Architectural component of the FLAI RNN that incorporates X(n) into the network’s memory state at every inference step, enabling real-time exogenous signal integration |
| ARIMA | AutoRegressive Integrated Moving Average — classical statistical time-series model; used as primary baseline |
| RNN | Recurrent Neural Network — neural architecture with feedback loops enabling sequential state memory; backbone of FLAI |
| LSTM | Long Short-Term Memory — gated RNN variant; used as secondary baseline |
| MAPE | Mean Absolute Percentage Error — scale-invariant forecast accuracy metric |
| MSE | Mean Squared Error — quadratic loss function used in training and evaluation |
| R² | Coefficient of determination — proportion of variance in actual data explained by the model |
| Big Data | Data characterised by Volume, Velocity, Variety, Veracity, and Value; describes the raw social-media input streams |
| ETL | Extract-Transform-Load — data pipeline stage converting raw platform data to structured model inputs |
| JWT | JSON Web Token (RFC 7519) — authentication mechanism used in the FLAI platform’s security layer |
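Several of the glossary terms compose directly. The sketch below ties dS(n), the dSmax = N/2 tolerance, DR(n), and DRF(n) = DR(n) × GW(n) together in one function; it is an illustration of the stated definitions, not the paper's implementation, and returning `None` for an oversized gap is an assumed convention.

```python
def drf_from_glossary(R_prev, R_curr, gw, days_skipped, N):
    """Compute DRF(n) from the glossary definitions. Returns None when the
    data gap dS(n) exceeds dSmax = N/2, i.e. when interpolation is
    considered unreliable per the glossary entry for dSmax."""
    ds_max = N / 2
    if days_skipped > ds_max:
        return None                                  # gap too large to interpolate
    dr = (R_curr - R_prev) / max(days_skipped, 1)    # DR(n): gap-normalised daily rise
    return dr * gw                                   # DRF(n) = DR(n) x GW(n)
```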

11. Conclusion #

This paper has presented FLAI — a complete intelligent information-analytical system for social media trend prediction, grounded in an original RNN architecture featuring a dynamic Injection Layer and an adaptive online recalibration predictor.

Findings by Research Question:

RQ1 — Dynamic exogenous variable injection: The Injection Layer formalised in Equation 1.4 achieves dynamic incorporation of external signals X(n) by conditioning synapse weights GW(n) on both the current observed state R(n) and the accumulated self-correction term DRFE(n). This mechanism operates at inference time without requiring feature pre-specification or batch retraining, enabling the model to respond to unscheduled viral events within a single iteration cycle.

RQ2 — Performance advantage over baselines: Validated on a 2.7-million-record TikTok dataset, FLAI achieves MAPE = 4.69%, R² = 0.988, and overall accuracy 99.76% — representing a 19–24 percentage point improvement over ARIMA, and exceeding standard LSTM and Prophet baselines by 15–17 percentage points. The progressive approximation improvement from Ā₁₋₂₅ = 5.20% to Ā₂₆₋₅₀ = 4.25% confirms that the online learning mechanism produces measurable accuracy gains without full retraining.

RQ3 — Black-swan event handling: The DRFE(n) error-correction loop (Equation 3.1) enables the model to detect and compensate for sudden distributional shifts within a single daily iteration. Empirical testing confirmed that weight recalibration completes in under one hour following a black-swan event — compared to 24–48 hours required for batch-retrained baseline models.

Future Directions #

The FLAI architecture opens several research trajectories: (1) extension of the Injection Layer to incorporate multi-modal exogenous signals, including computer-vision analysis of video thumbnails and audio embedding analysis of trend sounds; (2) adaptation of the model to cross-platform trend prediction spanning TikTok, Instagram Reels, and YouTube Shorts simultaneously; (3) application of the dynamic recalibration predictor to adjacent domains — the authors have identified promising transfer opportunities in pharmaceutical sales forecasting (ScanLab project) and music recommendation systems (Gromus AI); (4) formal study of the convergence properties of the DRFE(n) update rule under different distributional regimes, providing theoretical guarantees complementing the empirical validation presented here.

The architecture’s commercial validation through the Flai platform, and its recognition by the Ukrainian technology press, confirm that the gap between academic innovation and production deployment is bridgeable for AI-driven social media analytics when system design prioritises online adaptability alongside predictive accuracy.

References (12) #

  1. Stabilarity Research Hub. FLAI: An Intelligent System for Social Media Trend Prediction Using Recurrent Neural Networks with Dynamic Exogenous Variable Injection. doi.org.
  2. Phillips, Peter C. B.; Perron, Pierre. (1988). Testing for a unit root in time series regression. doi.org.
  3. Taylor, Sean J.; Letham, Benjamin. (2018). Forecasting at Scale. doi.org.
  4. doi.org.
  5. Vaswani, Ashish; Shazeer, Noam; Parmar, Niki; Uszkoreit, Jakob; Jones, Llion; Gomez, Aidan N.; Kaiser, Łukasz; Polosukhin, Illia. (2017). Attention Is All You Need. doi.org.
  6. Lim, Bryan; Arık, Sercan Ö.; Loeff, Nicolas; Pfister, Tomas. (2021). Temporal Fusion Transformers for interpretable multi-horizon time series forecasting. doi.org.
  7. Ivchenko, Iryna; Haidaienko, Oksana; Knyryk, Natalia; Morozova, Hanna; Hrybeniuk, Dmytro. (2024). Forecasting the behaviour of social network objects. doi.org.
  8. Ivchenko, Iryna; Linhur, Liubov; Ivchenko, O.I. (2023). Forecasting time series of social networks based on the neural network approach. doi.org.
  9. Sinha, Vaibhav B.; Kudugunta, Sneha; Sankar, Adepu Ravi; Chavali, Surya Teja; Balasubramanian, Vineeth N. (2020). DANTE: Deep alternations for training neural networks. doi.org.
  10. Picasso, Andrea; Merello, Simone; Ma, Yukun; Oneto, Luca; Cambria, Erik. (2019). Technical analysis and sentiment embeddings for market trend prediction. doi.org.
  11. Raji, Shahab; de Melo, Gerard. (2020). What Sparks Joy: The AffectVec Emotion Database. doi.org.
  12. doi.org.