Project: An Explainable Multimodal Neural Framework for Financial Risk Management
Component: MTGNN-style learned graph builder
Primary implementation context: code/gnn/mtgnn_regime.py
Scope of this document: only the MTGNN graph-building usage inside the current project
Not covered here: full regime classifier documentation, regime training details, final regime-risk interpretation
This document explains the MTGNN graph-building component as it is used in the current fin-glassbox system.
The project does not use the full original MTGNN architecture as a general forecasting model. Instead, it uses the graph construction idea from MTGNN: learning a sparse, feature-aware adjacency matrix between assets. That learned graph is then converted into graph properties and passed into the regime module.
Therefore, this document intentionally focuses only on the graph-building usage described above. The full regime-risk model, classifier, labels, macro integration, and final regime outputs should be documented separately in the Regime Risk Module documentation.
In this project, MTGNN is used as a feature-aware graph builder, not as a full MTGNN forecasting model.
The implemented component learns an adjacency matrix from node features. Each node is a stock. Each node feature vector combines temporal market embeddings and text context embeddings.
The core output is:
learned adjacency matrix: [batch, nodes, nodes]
This adjacency matrix is summarised into graph-level properties such as density, average degree, mean edge weight, entropy, and learned graph stress.
Temporal Encoder embeddings
+
FinBERT text embeddings
↓
MTGNN-style graph learner
↓
learned sparse adjacency
↓
graph properties
↓
Regime Risk Module
↓
Position Sizing / Quantitative Analyst / Fusion
The MTGNN graph builder is not exposed as a standalone final decision module. It is an internal structural component of the regime module.
The project uses only the MTGNN-style graph builder because this keeps module boundaries clean:
Temporal Encoder → market sequence representation
StemGNN → contagion risk
MTGNN graph builder → learned market connectivity graph for regime analysis
Path pattern:
outputs/embeddings/TemporalEncoder/chunk{N}_{split}_embeddings.npy
outputs/embeddings/TemporalEncoder/chunk{N}_{split}_manifest.csv
The manifest supplies:
ticker,date
The embedding file supplies a 256-dimensional temporal embedding per ticker-date row.
For a given graph snapshot date, the dataset finds the latest available temporal embedding for each selected node ticker at or before that date.
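A minimal sketch of this as-of lookup, assuming the manifest rows align one-to-one with the embedding rows and that `date` is already parsed as a datetime (the helper name is illustrative, not the project's actual function):

```python
# Hypothetical sketch: for each ticker, pick the most recent embedding row
# dated at or before the snapshot date.
import numpy as np
import pandas as pd

def latest_embedding_per_ticker(manifest: pd.DataFrame,
                                embeddings: np.ndarray,
                                snapshot_date: pd.Timestamp) -> dict[str, np.ndarray]:
    """manifest columns: ticker, date; row i describes embeddings[i]."""
    manifest = manifest.assign(row=np.arange(len(manifest)))
    eligible = manifest[manifest["date"] <= snapshot_date]
    # keep the latest-dated row per ticker
    latest = eligible.sort_values("date").groupby("ticker").tail(1)
    return {t: embeddings[r] for t, r in zip(latest["ticker"], latest["row"])}
```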
Path pattern:
outputs/embeddings/FinBERT/chunk{N}_{split}_embeddings.npy
outputs/embeddings/FinBERT/chunk{N}_{split}_metadata.csv
The text context is aggregated over a lookback window, default:
text_lookback_days = 30
The text embedding contributes a 256-dimensional representation. When a ticker mapping exists, the module aggregates text by ticker. If ticker mapping is unavailable, it falls back to broadcasting a global market-level text vector.
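The aggregation might look like the following sketch; mean pooling, the column names, and the exact fallback behaviour are assumptions, and metadata rows are assumed to align one-to-one with embedding rows:

```python
# Hypothetical sketch of text-context aggregation over the lookback window.
import numpy as np
import pandas as pd

TEXT_LOOKBACK_DAYS = 30  # default text_lookback_days

def text_context(meta: pd.DataFrame, emb: np.ndarray,
                 snapshot_date: pd.Timestamp, tickers: list[str]) -> dict[str, np.ndarray]:
    start = snapshot_date - pd.Timedelta(days=TEXT_LOOKBACK_DAYS)
    rows = meta[meta["date"].between(start, snapshot_date)]
    # market-level fallback vector pooled over the whole window
    global_vec = emb[rows.index].mean(axis=0) if len(rows) else np.zeros(emb.shape[1])
    if "ticker" not in rows.columns:
        return {t: global_vec for t in tickers}  # no mapping: broadcast globally
    out = {}
    for t in tickers:
        t_rows = rows[rows["ticker"] == t]
        out[t] = emb[t_rows.index].mean(axis=0) if len(t_rows) else global_vec
    return out
```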
Path:
data/graphs/snapshots/edges_YYYY-MM-DD.csv
These snapshots come from the Cross-Asset Relation Data Builder. They are used to define the snapshot dates and to compute existing graph features such as density and correlation strength.
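For illustration, the snapshot dates can be recovered directly from the file names (a sketch, not the project's actual loader):

```python
# Sketch: enumerate available snapshot dates from the edge files.
from pathlib import Path

snapshot_dates = sorted(
    p.stem.removeprefix("edges_")  # "edges_2020-03-16" -> "2020-03-16"
    for p in Path("data/graphs/snapshots").glob("edges_*.csv")
)
```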
Path:
data/FRED_data/outputs/macro_features_trading_days_clean.csv
Default selected macro columns include:
yield_spread_10y2y
yield_spread_10y3m
credit_spread_baa_aaa
regime_yield_inverted
DFF
DGS10
DGS2
DGS3MO
BAA10Y
AAA10Y
Macro features are not part of the graph builder itself, but they are concatenated with graph properties later inside the regime module.
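A sketch of that later concatenation; the `date` column name, the sorted index, and the as-of row selection are assumptions:

```python
# Hypothetical sketch: align one macro row with graph properties per snapshot.
import numpy as np
import pandas as pd

MACRO_COLS = ["yield_spread_10y2y", "yield_spread_10y3m", "credit_spread_baa_aaa",
              "regime_yield_inverted", "DFF", "DGS10", "DGS2", "DGS3MO",
              "BAA10Y", "AAA10Y"]

macro = (pd.read_csv("data/FRED_data/outputs/macro_features_trading_days_clean.csv",
                     parse_dates=["date"])
         .set_index("date")
         .sort_index())

def regime_input(graph_props: np.ndarray, snapshot_date: pd.Timestamp) -> np.ndarray:
    # take the latest macro row at or before the snapshot date
    macro_row = macro.loc[:snapshot_date, MACRO_COLS].iloc[-1].to_numpy()
    return np.concatenate([graph_props, macro_row])  # 7 graph props + 10 macro cols
```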
Each node is a stock ticker.
For each selected node ticker at a snapshot date, the module constructs:
node_feature = [temporal_embedding, text_embedding]
Default dimensions:
| Component | Dimension | Source |
|---|---|---|
| Temporal embedding | 256 | Temporal Encoder |
| Text embedding | 256 | FinBERT aggregation |
| Combined node feature | 512 | Concatenation |
So for a graph with N selected stocks:
node_features: [N, 512]
During training/inference batches:
node_features: [batch, N, 512]
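A minimal sketch of the concatenation, reusing the per-ticker lookups sketched earlier:

```python
# Sketch: stack [temporal | text] per ticker into the node-feature matrix.
import numpy as np

def build_node_features(tickers, temporal, text):
    """temporal/text: dicts of ticker -> 256-d vectors (see the lookup sketches)."""
    return np.stack([np.concatenate([temporal[t], text[t]]) for t in tickers])  # [N, 512]

# Stacking several snapshot dates then yields [batch, N, 512].
```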
The default node_limit is 2,500, but HPO and fast runs may use a smaller node limit such as 768 or 512.
The node universe is selected from the temporal manifest and graph snapshots. The module tries to select tickers that are available in the embeddings and appear in the graph snapshots.
Important controls:
| Parameter | Meaning |
|---|---|
| node_limit | Maximum number of stock nodes retained |
| hpo_node_limit | Smaller node count used during HPO |
| top_k | Number of outgoing learned graph neighbours per node |
The purpose is to keep graph learning computationally manageable while preserving enough market coverage for regime structure.
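A sketch of that selection rule (the function name is illustrative):

```python
# Sketch: keep tickers present in both the temporal manifest and the graph
# snapshots, capped at node_limit to keep graph learning tractable.
def select_nodes(manifest_tickers: set[str], snapshot_tickers: set[str],
                 node_limit: int = 2500) -> list[str]:
    return sorted(manifest_tickers & snapshot_tickers)[:node_limit]
```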
The actual graph builder class is:
class MTGNNGraphLearner(nn.Module)
It contains:
node_encoder
query projection
key projection
top-k sparse adjacency construction
Each node feature is passed through an encoder:
Linear(input_dim → hidden_dim)
LayerNorm
GELU
Dropout
Linear(hidden_dim → graph_dim)
LayerNorm
GELU
This produces a graph-space node representation.
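In PyTorch, the stack above might look like the following; the hidden and graph dimensions are assumptions, not the project's defaults:

```python
# Sketch of the node encoder, following the layer order listed above.
import torch.nn as nn

def make_node_encoder(input_dim=512, hidden_dim=256, graph_dim=128, dropout=0.1):
    return nn.Sequential(
        nn.Linear(input_dim, hidden_dim),
        nn.LayerNorm(hidden_dim),
        nn.GELU(),
        nn.Dropout(dropout),
        nn.Linear(hidden_dim, graph_dim),
        nn.LayerNorm(graph_dim),
        nn.GELU(),
    )
```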
The graph learner projects node representations into query and key vectors:
q = W_q h
k = W_k h
Then computes pairwise scores:
score_ij = (q_i · k_j) / sqrt(graph_dim)
The diagonal is masked out so a node does not connect to itself.
Scores are passed through a sigmoid:
weight_ij = sigmoid(score_ij)
Then for each node only the top k neighbours are retained:
adjacency_i = top_k(weight_i)
All other edge weights are set to zero.
The final output is a sparse directed adjacency matrix:
adjacency: [batch, N, N]
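A minimal PyTorch sketch of the scoring and top-k steps; the real `MTGNNGraphLearner` lives in `code/gnn/mtgnn_regime.py`, and everything here beyond the stated shapes (e.g. `top_k=16`) is an assumption:

```python
# Sketch: query/key scoring, self-loop masking, sigmoid, top-k sparsification.
import torch
import torch.nn as nn

class GraphLearnerSketch(nn.Module):
    def __init__(self, graph_dim: int = 128, top_k: int = 16):
        super().__init__()
        self.w_q = nn.Linear(graph_dim, graph_dim, bias=False)
        self.w_k = nn.Linear(graph_dim, graph_dim, bias=False)
        self.graph_dim, self.top_k = graph_dim, top_k

    def forward(self, h: torch.Tensor) -> torch.Tensor:  # h: [batch, N, graph_dim]
        q, k = self.w_q(h), self.w_k(h)
        scores = q @ k.transpose(-1, -2) / self.graph_dim ** 0.5  # [batch, N, N]
        eye = torch.eye(h.size(1), dtype=torch.bool, device=h.device)
        scores = scores.masked_fill(eye, float("-inf"))  # mask the diagonal
        weights = torch.sigmoid(scores)                  # sigmoid(-inf) == 0
        topk = weights.topk(self.top_k, dim=-1)          # per-node neighbours
        adj = torch.zeros_like(weights).scatter_(-1, topk.indices, topk.values)
        return adj                                       # sparse directed [batch, N, N]
```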
The learned adjacency is converted into 7 graph-level properties by:
graph_properties_from_adjacency(adj)
| Index | Property | Meaning |
|---|---|---|
| 0 | density | Fraction of possible edges retained |
| 1 | mean_degree_norm | Average normalised node degree |
| 2 | std_degree_norm | Dispersion of node degree |
| 3 | mean_weight | Average learned edge strength |
| 4 | max_weight | Strongest learned edge weight |
| 5 | graph_entropy | How dispersed/uncertain edge distributions are |
| 6 | graph_stress | Simple stress proxy: 0.5 × density + 0.5 × mean_weight |
These properties are compact, explainable summaries of the learned market graph.
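A sketch of how the seven properties could be computed from the learned adjacency; only the `graph_stress` formula is given above, so the other normalisations are assumptions:

```python
# Sketch of graph_properties_from_adjacency, following the table above.
import torch

def graph_properties_sketch(adj: torch.Tensor) -> torch.Tensor:  # adj: [batch, N, N]
    n = adj.size(-1)
    mask = (adj > 0).float()
    n_edges = mask.sum((-1, -2))
    density = n_edges / (n * (n - 1))                 # diagonal already masked out
    degree = mask.sum(-1) / (n - 1)                   # normalised out-degree
    mean_w = adj.sum((-1, -2)) / n_edges.clamp(min=1)
    p = adj / adj.sum(-1, keepdim=True).clamp(min=1e-8)  # row-wise edge distribution
    entropy = -(p * (p + 1e-8).log()).sum(-1).mean(-1)
    stress = 0.5 * density + 0.5 * mean_w             # as defined in the table
    return torch.stack([density, degree.mean(-1), degree.std(-1),
                        mean_w, adj.amax((-1, -2)), entropy, stress], dim=-1)
```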
The module uses two graph concepts.
The first is the existing snapshot graph, precomputed by build_cross_asset_graph.py:
data/graphs/snapshots/edges_YYYY-MM-DD.csv
They provide realised correlation-based graph features such as existing density, average degree, mean/max absolute correlation, and sector concentration.
The second is the learned graph, generated inside the model from node features:
Temporal Encoder embedding + FinBERT text context → learned adjacency
It provides feature-aware dynamic edge weights, top-k learned links, graph properties for regime classification, and XAI top edges.
The existing graph is data-engineered from realised returns. The learned graph is model-generated from multimodal node embeddings.
RegimeSnapshotDataset
├── loads Temporal Encoder embeddings
├── aggregates FinBERT text context
├── loads graph snapshot files
├── loads FRED macro rows
└── produces node_features + macro_features + label metadata
MTGNNGraphLearner
├── node encoder
├── query/key scoring
└── top-k adjacency
graph_properties_from_adjacency
└── converts learned adjacency into 7 graph properties
MTGNNRegimeModel
├── calls MTGNNGraphLearner
├── extracts graph properties
└── passes graph properties to downstream classifier
This document focuses on the graph-building parts, not the full classifier.
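For orientation only, a hypothetical wiring of these pieces that reuses the sketches above; the classifier head is a stub, since the real head is out of scope for this document:

```python
# Hypothetical wiring of the component diagram above.
import torch.nn as nn

class RegimeModelSketch(nn.Module):
    def __init__(self, input_dim: int = 512, n_classes: int = 3):  # n_classes assumed
        super().__init__()
        self.encoder = make_node_encoder(input_dim)   # from the encoder sketch
        self.graph_learner = GraphLearnerSketch()     # from the learner sketch
        self.classifier = nn.Linear(7, n_classes)     # stub downstream head

    def forward(self, node_features):                 # [batch, N, 512]
        h = self.encoder(node_features)
        adj = self.graph_learner(h)
        props = graph_properties_sketch(adj)          # [batch, 7]
        return self.classifier(props), adj, props
```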
The build-graph command writes:
outputs/results/MTGNNRegime/graph_summary_chunk{N}_{split}.csv
These files summarise snapshot-level graph and stress metrics.
Typical columns include:
date
label
macro_stress_score
graph_stress_score
market_vol_21d
market_ret_21d
market_drawdown_63d
xsec_dispersion
existing_edges
existing_density
existing_avg_degree_norm
existing_mean_abs_corr
existing_max_abs_corr
sector_concentration
When full regime prediction is run, learned graph properties are saved in:
outputs/results/MTGNNRegime/predictions_chunk{N}_{split}.csv
Relevant graph-builder columns include:
graph_density
avg_degree_norm
std_degree_norm
mean_edge_weight
max_edge_weight
graph_entropy
learned_graph_stress
XAI files are saved under:
outputs/results/MTGNNRegime/xai/
Graph-builder-relevant XAI includes Level 1 graph properties, Level 2 top learned edges, graph-diff records showing added/removed top edges between snapshots, and counterfactual behaviour when node/text features are altered.
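As an illustration of the Level 2 output, the top learned edges can be read straight off a snapshot's adjacency (a sketch; the real XAI writer may differ):

```python
# Sketch: extract the k strongest learned edges with their ticker names.
import torch

def top_edges(adj: torch.Tensor, tickers: list[str], k: int = 10):
    """adj: [N, N] learned adjacency for a single snapshot."""
    vals, idx = adj.flatten().topk(k)
    rows, cols = idx // adj.size(1), idx % adj.size(1)
    return [(tickers[r], tickers[c], v)
            for r, c, v in zip(rows.tolist(), cols.tolist(), vals.tolist())]
```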
python -m py_compile code/gnn/mtgnn_regime.py
python code/gnn/mtgnn_regime.py inspect --repo-root .
python code/gnn/mtgnn_regime.py smoke --repo-root . --device cuda
python code/gnn/mtgnn_regime.py build-graph --repo-root . --chunk 1 --split train --device cuda --node-limit 512 --max-snapshots 5
This command is the cleanest graph-builder-specific run.
python code/gnn/mtgnn_regime.py build-graph --repo-root . --chunk 1 --split val --device cuda --node-limit 768
python code/gnn/mtgnn_regime.py build-graph --repo-root . --chunk 1 --split test --device cuda --node-limit 768
A successful graph-builder run should show:
snapshots > 0
nodes > 0
macro_cols > 0
node_feature_finite = 1.000000 or very close
label_counts printed for snapshot records
graph_summary_chunk{N}_{split}.csv saved
Example style of expected output:
Building chunk1_train: snapshots=63, nodes=768, macro_cols=10
chunk1_train: 61 samples | label_counts={...} | node_feature_finite=1.000000
Saved graph summary: outputs/results/MTGNNRegime/graph_summary_chunk1_train.csv
Validation command:
python - <<'PY'
import pandas as pd
from pathlib import Path
for p in sorted(Path('outputs/results/MTGNNRegime').glob('graph_summary_chunk*_*.csv')):
    df = pd.read_csv(p)
    print('\n', p)
    print('shape:', df.shape)
    print('columns:', list(df.columns))
    print(df.head().to_string(index=False))
PY
This MTGNN documentation is intentionally limited. The current project does not use MTGNN as a full time-series forecasting model. Therefore:
The graph builder is used only to construct a learned graph representation and graph-property features for regime analysis.
The full regime-risk module documentation should separately explain regime labels, classifier training, macro stress interpretation, prediction outputs, regime XAI, and how regime risk affects position sizing.
The MTGNN-style graph builder gives the project a learned, multimodal market-connectivity graph. It combines:
Temporal Encoder features
FinBERT text context
Top-k learned adjacency
Graph property extraction
This is the correct use of MTGNN for the current architecture because it avoids adding another redundant forecasting model and instead uses MTGNN where it is most useful: adaptive graph construction.
Final role: the MTGNN graph builder is an internal graph construction layer for regime risk, not a standalone final predictive module.
This keeps the architecture clean, explainable, and consistent with the project’s distributed modular design.