This document is the workflow specification for the project:
An Explainable Multimodal Neural Framework for Financial Risk Management
This file documents the active workflow. The current implementation does not include fundamentals.
Older versions of the project architecture contained components that are now intentionally excluded from the active workflow.
The project may still contain historical SEC fundamentals data and documentation because those datasets were collected and processed earlier. However, the current model pipeline must not silently include fundamentals unless they are explicitly reintroduced later.
The active implementation uses four modelling families:
Fundamentals are treated as historical project work, not an active model input.
The project does not attempt to solve financial risk management using a single monolithic black-box model.
Instead, it uses a distributed, modular, explainable architecture:
Specialised modules → risk-aware synthesis → hybrid fusion → final decision + explanations
The system is built around the following principles:
| Principle | Meaning in This Project |
|---|---|
| Specialisation | Each model performs a bounded task: sentiment, news, volatility, drawdown, contagion, regime, etc. |
| Multimodality | The system combines market time series, text, macro data, and graph relations. |
| Explainability | Each module emits its own explanation trace; fusion also explains the combined decision. |
| Risk-first design | Risk modules are not decorative; they control sizing, gating, and final approval. |
| Chronological discipline | All training, validation, testing, normalisation, and PCA fitting must respect time order. |
| Buildability | The project prioritises a working, defensible system over unnecessary complexity. |
| Auditability | Every final decision should be traceable to module outputs and rule-barrier effects. |
The research argument is that transparency improves when a large decision problem is decomposed into specialised, inspectable components rather than hidden inside one oversized neural network.
INPUT DATA
├── Market data
│ ├── OHLCV
│ ├── returns
│ ├── engineered technical features
│ └── liquidity features
│
├── Financial text data
│ ├── SEC textual filings
│ ├── section-level text chunks
│ ├── filing metadata
│ └── FinBERT embeddings
│
├── Macro / regime data
│ ├── FRED interest-rate series
│ ├── yield curve features
│ ├── credit spread features
│ └── regime stress indicators
│
└── Cross-asset relation data
├── rolling correlation snapshots
├── ticker universe
├── sector metadata
├── beta estimates
└── graph edge lists
ENCODERS
├── Temporal Encoder
│ └── 256-dimensional market embeddings
│
└── FinBERT Encoder
└── 256-dimensional financial text embeddings
ANALYST MODULES
├── Technical Analyst
├── Sentiment Analyst
├── News Analyst
├── Qualitative Analyst
└── Quantitative Analyst
RISK ENGINE
├── Volatility Model
├── Drawdown Risk Model
├── Historical VaR Module
├── CVaR / Expected Shortfall Module
├── StemGNN Contagion Risk Module
├── Liquidity Risk Module
├── MTGNN Regime Risk Module
└── Position Sizing Engine
SYNTHESIS
├── Qualitative branch
│ └── Sentiment + News → daily qualitative signal
│
├── Quantitative branch
│ └── Technical + Risk + Position Sizing → trained risk-attention output
│
└── Fusion Engine
├── Layer 1: learned fusion weighting
└── Layer 2: user-defined rule barrier
FINAL DECISION
└── Buy / Hold / Sell + confidence + position size + explanation
The system is trained and evaluated using chronological chunks. This protects against look-ahead bias and creates realistic out-of-sample validation periods.
| Chunk | Train Period | Validation Period | Test Period | Purpose |
|---|---|---|---|---|
| Chunk 1 | 2000–2004 | 2005 | 2006 | Early historical regime and first full pipeline integration |
| Chunk 2 | 2007–2014 | 2015 | 2016 | Financial crisis/post-crisis period and mid-sample robustness |
| Chunk 3 | 2017–2022 | 2023 | 2024 | Recent market period and final current-era evaluation |
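The chunk boundaries above can be sketched as a simple year-based splitter. The function and column names below are illustrative, not part of the codebase:

```python
import pandas as pd

# Chronological chunk boundaries from the table above:
# (train start, train end), validation year, test year.
CHUNKS = {
    1: (("2000", "2004"), "2005", "2006"),
    2: (("2007", "2014"), "2015", "2016"),
    3: (("2017", "2022"), "2023", "2024"),
}

def split_chunk(panel: pd.DataFrame, chunk: int):
    """Split a panel with a datetime 'date' column into chronological
    train/val/test partitions for the given chunk."""
    (t0, t1), val_y, test_y = CHUNKS[chunk]
    years = panel["date"].dt.year
    train = panel[(years >= int(t0)) & (years <= int(t1))]
    val = panel[years == int(val_y)]
    test = panel[years == int(test_y)]
    return train, val, test
```

Because the split is purely by calendar year, every training row strictly precedes every validation and test row, which is the look-ahead protection the chunk design is meant to provide.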
Market tables carry ticker and date keys for downstream joins. Primary market files are stored under:
data/yFinance/processed/
Important files include:
returns_panel_wide.csv
returns_long.csv
ohlcv_final.csv
features_temporal.csv
liquidity_features.csv
The market pipeline produced a complete panel covering approximately 2,500 tickers and 6,286 trading days. This data drives:
Financial text comes primarily from SEC filings. The text pipeline includes:
FinBERT outputs live under:
outputs/embeddings/FinBERT/
Expected files:
chunk{n}_train_embeddings.npy
chunk{n}_train_metadata.csv
chunk{n}_val_embeddings.npy
chunk{n}_val_metadata.csv
chunk{n}_test_embeddings.npy
chunk{n}_test_metadata.csv
FRED macro/regime data lives under:
data/FRED_data/outputs/
Important file:
macro_features_trading_days_clean.csv
This file contains trading-day-aligned macro features such as:
The active regime model uses these features alongside graph and embedding signals.
Cross-asset relation data lives under:
data/graphs/
Important subfolders:
data/graphs/snapshots/
data/graphs/metadata/
data/graphs/returns/
data/graphs/combined/
The relation graph data includes rolling correlation edge snapshots and metadata such as sector mapping, beta estimates, market-cap proxies, and universe membership. These outputs support:
The Temporal Encoder is the shared market representation module.
It converts market sequences into 256-dimensional embeddings. These embeddings are reused by multiple downstream market/risk modules.
Market data is sequential. A single day of OHLCV is not enough to understand trend, momentum, volatility context, or pre-drawdown behaviour. The Temporal Encoder learns a reusable representation of recent market behaviour.
outputs/embeddings/TemporalEncoder/chunk{n}_{split}_embeddings.npy
outputs/embeddings/TemporalEncoder/chunk{n}_{split}_manifest.csv
The .npy file contains the embedding matrix. The manifest maps each row to a (ticker, date) pair.
chunk1_train_embeddings.npy: 3,065,000 × 256
chunk1_val_embeddings.npy: 555,000 × 256
chunk1_test_embeddings.npy: 552,500 × 256
chunk2_train_embeddings.npy: 4,960,000 × 256
chunk2_val_embeddings.npy: 555,000 × 256
chunk2_test_embeddings.npy: 555,000 × 256
chunk3_train_embeddings.npy: 3,700,000 × 256
chunk3_val_embeddings.npy: 550,000 × 256
chunk3_test_embeddings.npy: 547,500 × 256
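Loading a split and verifying row alignment might look like the sketch below. The helper name and `root` default are illustrative; the paths follow the layout documented above, and `mmap_mode` avoids pulling several million 256-dimensional rows into RAM at once:

```python
import numpy as np
import pandas as pd

def load_embeddings(chunk: int, split: str,
                    root: str = "outputs/embeddings/TemporalEncoder"):
    """Load an embedding matrix and its row-aligned manifest.
    Illustrative helper; paths follow the documented layout."""
    emb = np.load(f"{root}/chunk{chunk}_{split}_embeddings.npy", mmap_mode="r")
    manifest = pd.read_csv(f"{root}/chunk{chunk}_{split}_manifest.csv")
    # Row i of the matrix must correspond to row i of the manifest.
    assert len(manifest) == emb.shape[0], "manifest rows must match embedding rows"
    return emb, manifest
```

The alignment assertion is cheap and catches the most common corruption mode: a manifest regenerated without its matrix (or vice versa).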
The FinBERT Encoder is the shared text representation module.
It turns SEC filing text chunks into 256-dimensional financial text embeddings.
Financial text contains qualitative information that cannot be captured from price data alone:
outputs/embeddings/FinBERT/chunk{n}_{split}_embeddings.npy
outputs/embeddings/FinBERT/chunk{n}_{split}_metadata.csv
outputs/embeddings/FinBERT/chunk{n}_{split}_manifest.json
The pipeline extracts 768-dimensional FinBERT embeddings first, then projects them to 256 dimensions using IncrementalPCA.
The PCA must be fitted on the train split only and then applied unchanged to the validation and test splits.
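A minimal sketch of the train-only fit, assuming scikit-learn's IncrementalPCA and in-memory arrays (function and argument names are hypothetical):

```python
import numpy as np
from sklearn.decomposition import IncrementalPCA

def fit_project(train_768, val_768, test_768, dim=256, batch=4096):
    """Fit IncrementalPCA on the train split only, then apply the
    same frozen projection to val and test (no refitting)."""
    ipca = IncrementalPCA(n_components=dim)
    for start in range(0, len(train_768), batch):
        ipca.partial_fit(train_768[start:start + batch])
    return (ipca.transform(train_768),
            ipca.transform(val_768),
            ipca.transform(test_768))
```

Fitting on train only keeps validation and test statistics out of the learned projection, preserving the point-in-time discipline the workflow requires.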
The analyst layer converts encoder outputs and risk outputs into interpretable domain-specific signals.
outputs/embeddings/FinBERT/chunk{n}_{split}_embeddings.npy
outputs/embeddings/FinBERT/chunk{n}_{split}_metadata.csv
outputs/results/analysts/sentiment/chunk{n}_{split}_predictions.csv
outputs/embeddings/analysts/sentiment/chunk{n}_{split}_sentiment_embeddings.npy
Important columns:
sentiment_score
sentiment_confidence
sentiment_uncertainty
sentiment_magnitude
The module provides the sentiment component of the qualitative branch.
FinBERT embeddings and text metadata.
outputs/results/analysts/news/chunk{n}_{split}_news_predictions.csv
outputs/embeddings/analysts/news/chunk{n}_{split}_news_embeddings.npy
Important columns:
news_event_impact
news_importance
risk_relevance
volatility_spike
drawdown_risk
news_uncertainty
The module interprets event relevance and qualitative risk from text.
Temporal Encoder embeddings and manifest.
outputs/results/TechnicalAnalyst/technical_analysis_chunk{n}_{split}.csv
Important columns:
trend_score
momentum_score
timing_confidence
technical_confidence
The module converts temporal market embeddings into directional technical signals.
Event-level:
outputs/results/QualitativeAnalyst/qualitative_events_chunk{n}_{split}.csv
Daily ticker-level:
outputs/results/QualitativeAnalyst/qualitative_daily_chunk{n}_{split}.csv
Important columns:
qualitative_score
qualitative_risk_score
qualitative_confidence
qualitative_recommendation
event_count
dominant_qualitative_driver
xai_summary
The module creates the qualitative branch consumed by Fusion.
Qualitative daily outputs are sparse compared with quantitative market rows. Fusion must handle missing qualitative rows by inserting a neutral no-text state.
The Quantitative Analyst consumes:
outputs/results/QuantitativeAnalyst/quantitative_analysis_chunk{n}_{split}.csv
The final Fusion-ready Quantitative Analyst output must contain:
attention_pooled_risk_score
top_attention_risk_driver
risk_attention_volatility
risk_attention_drawdown
risk_attention_var_cvar
risk_attention_contagion
risk_attention_liquidity
risk_attention_regime
If these columns are absent, the file is from the older non-attention schema and must not be used for final Fusion training.
Important columns:
quantitative_recommendation
risk_adjusted_quantitative_signal
technical_direction_score
quantitative_risk_score
quantitative_risk_state
quantitative_confidence
quantitative_action_strength
recommended_capital_fraction
attention_pooled_risk_score
top_attention_risk_driver
The module acts as the market/risk synthesis branch before final Fusion.
The Risk Engine is the control centre of the project. It estimates different types of financial risk, then constrains decision confidence and position size.
Volatility measures uncertainty and instability in price movement. Higher volatility means future price movement is less stable, so position sizing and confidence should be more conservative.
Important columns:
vol_10d
vol_30d
volatility_risk_score
volatility_regime_label
garch_vol
recent_vol
volatility_confidence
Consumed by:
Drawdown risk estimates how far an asset could fall from a recent or future local peak. It captures path-dependent downside risk, not just volatility.
Important columns:
expected_drawdown_10d
expected_drawdown_30d
drawdown_risk_10d
drawdown_risk_30d
drawdown_risk_score
recovery_days_10d
recovery_days_30d
confidence_10d
confidence_30d
Drawdown risk caps position size and influences Quantitative Analyst risk attention.
Value at Risk estimates a threshold loss under a historical return distribution.
Example interpretation:
VaR 95% = -0.03
means that, historically, the asset lost more than 3% on about 5% of days in the rolling window.
Important columns:
var_95
var_99
VaR contributes to tail-risk scoring and position sizing.
CVaR estimates the average loss beyond the VaR threshold. It is more informative than VaR when the tail loss is severe.
Important columns:
cvar_95
cvar_99
tail_ratio_95
tail_ratio_99
CVaR is used by Position Sizing and the Quantitative Analyst as a tail-risk component.
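Both statistics can be computed from a rolling window of daily returns. A minimal numpy sketch using the sign convention from the example above (the function name is illustrative):

```python
import numpy as np

def var_cvar(returns: np.ndarray, level: float = 0.95):
    """Historical VaR and CVaR at the given confidence level.
    Returns are daily; losses are negative, so VaR 95% = -0.03 means
    losses exceeded 3% on roughly 5% of days in the window."""
    var = np.quantile(returns, 1.0 - level)   # e.g. the 5th percentile
    tail = returns[returns <= var]            # days beyond the threshold
    cvar = tail.mean() if tail.size else var  # average loss in the tail
    return var, cvar
```

Because CVaR averages over the tail rather than reading off a single quantile, it is always at least as negative as VaR, which is why it is the preferred signal when tail losses are severe.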
Contagion risk estimates whether risk is spreading through cross-asset relationships. It captures systemic or relational risk that a single-asset time-series model may miss.
StemGNN is used for cross-asset contagion modelling.
outputs/results/StemGNN/contagion_scores_chunk{n}_{split}.csv
Columns include multi-horizon contagion scores such as:
Important columns:
contagion_5d
contagion_20d
contagion_60d
Contagion risk is one of the dominant drivers in Quantitative Analyst risk attention and Position Sizing caps.
Liquidity risk estimates whether a trade can be executed safely. An asset may look attractive but still be dangerous if volume is weak or execution cost is high.
Important columns:
liquidity_score
slippage_estimate_pct
days_to_liquidate_1M
tradable
The Fusion rule barrier may force HOLD and zero position when liquidity is too low or tradable == False.
Regime risk identifies the market environment: calm, volatile, crisis, or rotation.
Sentiment and news can explain why a regime may be changing, but the regime model captures the actual market-behaviour state through graph, macro, and embedding features.
Important columns:
regime_label
regime_confidence
prob_calm
prob_volatile
prob_crisis
prob_rotation
graph_density
avg_degree_norm
learned_graph_stress
macro_stress_score
Regime outputs influence:
Position sizing converts risk outputs into a recommended capital allocation.
| Mode | Maximum per stock |
|---|---|
| Conservative | 5% |
| Moderate / Default | 10% |
| Aggressive | 15% |
Additional crisis constraints:
| Situation | Maximum exposure |
|---|---|
| Crisis regime, short/default horizon | 5% |
| Crisis regime, long horizon | 3% |
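The mode caps and crisis overrides in the two tables combine as a simple minimum. A sketch (the function and argument names are illustrative):

```python
# Per-stock caps from the mode table above.
MODE_CAPS = {"conservative": 0.05, "moderate": 0.10, "aggressive": 0.15}

def max_exposure(mode: str, regime: str, horizon: str = "default") -> float:
    """Maximum capital fraction per stock given sizing mode and regime.
    Crisis regime tightens the cap regardless of mode."""
    cap = MODE_CAPS[mode]
    if regime == "crisis":
        crisis_cap = 0.03 if horizon == "long" else 0.05
        cap = min(cap, crisis_cap)  # crisis constraint always binds downward
    return cap
```

Taking the minimum means a crisis regime can only tighten exposure, never loosen it, matching the risk-first principle stated earlier.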
outputs/results/PositionSizing/position_sizing_chunk{n}_{split}.csv
Important columns:
Important columns:
recommended_capital_fraction
recommended_capital_pct
position_fraction_of_max
binding_cap_source
hard_cap_applied
size_bucket
risk_budget_used
size_reduction_reasons
Fusion may reduce the Position Sizing recommendation, but should not increase it beyond the risk-approved exposure.
The qualitative branch compresses text-derived evidence into daily ticker-level signals.
FinBERT → Sentiment Analyst
→ News Analyst
→ Qualitative Analyst
→ qualitative_daily_chunk{n}_{split}.csv
Qualitative branch output meaning:
| Field | Meaning |
|---|---|
| qualitative_score | Directional text-based signal, usually from negative to positive |
| qualitative_risk_score | Text-derived risk or uncertainty |
| qualitative_confidence | Confidence based on event count and model uncertainty |
| event_count | Number of matched events for a ticker-date |
| dominant_qualitative_driver | Main qualitative explanation driver |
The quantitative branch compresses technical and risk-engine outputs into a dense market/risk decision signal.
Technical Analyst
Risk Engine modules
Position Sizing
↓
Quantitative Analyst
↓
quantitative_analysis_chunk{n}_{split}.csv
Quantitative branch output meaning:
| Field | Meaning |
|---|---|
| risk_adjusted_quantitative_signal | Directional signal after risk adjustment |
| quantitative_risk_score | Learned/rule-combined risk intensity |
| quantitative_confidence | Confidence in the quantitative signal |
| recommended_capital_fraction | Position sizing recommendation |
| top_attention_risk_driver | Highest-weight risk source |
| risk_attention_* | Learned attention weights over risk modules |
The Fusion Engine is hybrid.
It has two layers:
Layer 1: Learned Fusion Model
Layer 2: User Rule Barrier
The learned layer combines quantitative and qualitative evidence.
It learns:
Inputs include:
The rule barrier is the final safety layer.
The learned model proposes. The rule barrier approves, caps, vetoes, or modifies.
Rules include:
If tradable == False, force HOLD and position 0.
The final position should follow:
final_position = min(
position_sizing_recommendation,
learned_position_suggestion,
user_rule_cap
)
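A runnable sketch of this minimum rule plus the liquidity veto described earlier (the function and argument names are illustrative):

```python
def apply_rule_barrier(action: str, tradable: bool,
                       sizing_cap: float, learned_size: float,
                       user_cap: float):
    """Final safety layer: the learned model proposes an action and size;
    the barrier vetoes or caps but never increases exposure."""
    if not tradable:
        # Liquidity veto: force HOLD with zero position.
        return "HOLD", 0.0
    # The final position is the tightest of the three constraints.
    final_size = min(sizing_cap, learned_size, user_cap)
    return action, final_size
```

Taking the minimum of the three candidates guarantees the barrier can only shrink the risk-approved exposure, never expand it.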
Expected output:
outputs/results/FusionEngine/fused_decisions_chunk{n}_{split}.csv
Important columns:
Important columns:
final_recommendation
final_fusion_signal
final_fusion_risk_score
final_fusion_confidence
final_position_fraction
final_position_pct
learned_recommendation
learned_quantitative_weight
learned_qualitative_weight
rule_changed_action
rule_barrier_reasons
fusion_xai_summary
The Final Trade Approver consumes Fusion output and emits the final user-facing decision.
Final output should include:
Buy / Hold / Sell
Confidence Score
Position Size Recommendation
Risk Summary
Module-wise Explanation Trace
Final Explanation
In the current architecture, Fusion already performs much of the final decision logic. A thin final approver may still be useful for:
A future production inference day should follow this order:
1. Load latest market data
2. Update market features
3. Generate temporal embedding
4. Load latest text/filings/news if available
5. Generate FinBERT embedding
6. Run Sentiment Analyst
7. Run News Analyst
8. Run Technical Analyst
9. Run risk modules
10. Run Position Sizing
11. Run Qualitative Analyst
12. Run Quantitative Analyst
13. Run Fusion Engine
14. Run Final Trade Approver
15. Export final decision + XAI report
For missing text on a given day, the system should use a neutral qualitative state:
qualitative_score = 0.0
qualitative_risk_score = 0.5
qualitative_confidence = 0.0
event_count = 0
dominant_qualitative_driver = no_text_event
This prevents missing text from becoming falsely bullish or bearish.
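One way to implement the neutral fill with pandas; the helper is an assumption, but the join keys (ticker, date) and the neutral values follow the fields documented above:

```python
import pandas as pd

# Neutral no-text state as specified above.
NEUTRAL = {
    "qualitative_score": 0.0,
    "qualitative_risk_score": 0.5,
    "qualitative_confidence": 0.0,
    "event_count": 0,
    "dominant_qualitative_driver": "no_text_event",
}

def merge_with_neutral(quant: pd.DataFrame, qual: pd.DataFrame) -> pd.DataFrame:
    """Left-join sparse qualitative rows onto the dense quantitative
    panel, filling missing ticker-dates with the neutral state."""
    merged = quant.merge(qual, on=["ticker", "date"], how="left")
    return merged.fillna(NEUTRAL)
```

The left join keeps every quantitative row, so ticker-dates with no filing or news end up neutral rather than silently dropped or skewed.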
code/encoders/temporal_encoder.py
code/encoders/finbert_encoder.py
outputs/embeddings/TemporalEncoder/
outputs/embeddings/FinBERT/
outputs/models/TemporalEncoder/
outputs/models/FinBERT/
code/analysts/sentiment_analyst.py
code/analysts/news_analyst.py
code/analysts/technical_analyst.py
code/analysts/qualitative_analyst.py
code/analysts/quantitative_analyst.py
outputs/results/analysts/
outputs/results/TechnicalAnalyst/
outputs/results/QualitativeAnalyst/
outputs/results/QuantitativeAnalyst/
code/riskEngine/volatility.py
code/riskEngine/drawdown.py
code/riskEngine/var_cvar_liquidity.py
code/riskEngine/regime_gnn.py
code/riskEngine/position_sizing.py
outputs/results/risk/
outputs/results/StemGNN/
outputs/results/MTGNNRegime/
outputs/results/PositionSizing/
code/gnn/stemgnn_contagion.py
code/gnn/stemgnn_base_model.py
code/gnn/mtgnn_regime.py
code/fusion/fusion_layer.py
code/fusion/final_fusion.py
outputs/results/FusionEngine/
outputs/models/FusionEngine/
outputs/codeResults/FusionEngine/
Fusion must not consume old Quantitative files.
A valid Fusion-ready Quantitative output must satisfy:
attention_schema=True
old_schema=False
Required columns:
top_attention_risk_driver
risk_attention_volatility
risk_attention_drawdown
risk_attention_var_cvar
risk_attention_contagion
risk_attention_liquidity
risk_attention_regime
attention_pooled_risk_score
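This schema gate can be enforced with a simple column check before Fusion training (the function name is illustrative):

```python
import pandas as pd

# Required attention-schema columns, per the list above.
REQUIRED = [
    "attention_pooled_risk_score", "top_attention_risk_driver",
    "risk_attention_volatility", "risk_attention_drawdown",
    "risk_attention_var_cvar", "risk_attention_contagion",
    "risk_attention_liquidity", "risk_attention_regime",
]

def assert_attention_schema(df: pd.DataFrame) -> None:
    """Raise if a Quantitative Analyst file predates the attention schema."""
    missing = [c for c in REQUIRED if c not in df.columns]
    if missing:
        raise ValueError(f"old non-attention schema, missing: {missing}")
```

Failing fast here is preferable to letting Fusion train silently on the older non-attention outputs.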
Before Fusion training, confirm:
QuantitativeAnalyst:
quantitative_analysis_chunk{n}_train.csv
quantitative_analysis_chunk{n}_val.csv
quantitative_analysis_chunk{n}_test.csv
QualitativeAnalyst:
qualitative_daily_chunk{n}_train.csv
qualitative_daily_chunk{n}_val.csv
qualitative_daily_chunk{n}_test.csv
Each module should produce:
This project depends heavily on point-in-time correctness.
The following are forbidden:
At the time this replacement workflow was written, the project had reached the final integration stage.
Completed or largely completed:
Remaining final-stage work:
This project proposes An Explainable Multimodal Neural Framework for Financial Risk Management. Instead of relying on a monolithic black-box model, the system decomposes the decision process into specialised modules for market sequence understanding, textual analysis, risk estimation, graph-based contagion modelling, regime detection, position sizing, and final fusion. Each module produces interpretable intermediate outputs and explanation traces. A hybrid Fusion Engine combines learned evidence weighting with a user-controlled rule barrier, ensuring that the final decision remains risk-aware, auditable, and practically controllable. The architecture is evaluated using chronological train/validation/test chunks to reduce look-ahead bias and improve realism.
This workflow is the active project workflow.
Do not reintroduce fundamentals, Bull/Bear debaters, or monolithic agent orchestration unless explicitly approved.
The current priority is not adding more architectural complexity. The priority is:
complete integration → validate outputs → audit XAI → evaluate results → document clearly