File: code/analysts/qualitative_analyst.py
Module name: Qualitative Analyst
Project: An Explainable Multimodal Neural Framework for Financial Risk Management
Layer: Qualitative synthesis branch
Status: Trainable downstream analyst module
Consumes: Sentiment Analyst outputs + News Analyst outputs
Produces: Event-level and daily ticker-level qualitative signals for the Fusion Engine
Fundamentals used: No
The Qualitative Analyst is the trained synthesis module that combines text-side information from the Sentiment Analyst and News Analyst. It does not perform raw text encoding and it does not read SEC filings directly. Raw financial text is already encoded upstream by FinBERT, then processed by specialist downstream text analysts. This module learns how to combine those specialist outputs into a structured qualitative branch signal.
The module is intentionally separated from the Quantitative Analyst. The Quantitative Analyst uses technical, risk, and position-sizing outputs. The Qualitative Analyst uses text-derived outputs only.
The final system follows a modular, distributed design rather than a single monolithic model. The Qualitative Analyst exists to summarise textual market evidence into a form that can be merged later with the quantitative branch.
Its role is to answer:
“What does the available financial text evidence suggest about this ticker/date, and how risky or reliable is that text evidence?”
It does not answer:
“Should the final system buy or sell?”
That final action is handled later by the Fusion Engine and Final Trade Approver.
The simplified location of this module is:
FinBERT Encoder
↓
Sentiment Analyst + News Analyst (parallel text analysts)
↓
Qualitative Analyst
↓
Fusion Engine
↓
Final Trade Approver
The first design possibility was a purely rule-based qualitative synthesis layer. However, training is important in the project architecture because each major analyst module should learn a calibrated mapping rather than only applying fixed rules.
The implemented design therefore uses a lightweight trainable MLP calibration model.
The model learns from structured upstream analyst outputs, not from raw text. This is important because the raw language understanding is already handled by FinBERT and the text-specific analysts.
The Qualitative Analyst is trained using weak-supervised targets derived from sentiment and news outputs. This makes the module trainable even when no manually labelled qualitative synthesis dataset exists.
This is defensible because:
The module consumes two upstream prediction families.
Expected locations for Sentiment Analyst outputs include:
outputs/results/analysts/sentiment/
outputs/results/SentimentAnalyst/
The script automatically searches for chunk/split-specific sentiment prediction files.
Expected file naming patterns include:
chunk{chunk}_{split}_predictions.csv
sentiment_predictions_chunk{chunk}_{split}.csv
predictions_chunk{chunk}_{split}.csv
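The search over locations and naming patterns can be sketched as follows. This is a minimal illustration, not the module's actual implementation: the function name `find_prediction_file` and the search order (directories first, then patterns) are assumptions.

```python
from pathlib import Path
from typing import Optional

# Directories and patterns copied from the lists above.
# Assumption: they are tried in the order shown.
SENTIMENT_DIRS = [
    "outputs/results/analysts/sentiment",
    "outputs/results/SentimentAnalyst",
]
SENTIMENT_PATTERNS = [
    "chunk{chunk}_{split}_predictions.csv",
    "sentiment_predictions_chunk{chunk}_{split}.csv",
    "predictions_chunk{chunk}_{split}.csv",
]


def find_prediction_file(repo_root, dirs, patterns, chunk, split) -> Optional[Path]:
    """Return the first existing candidate file, or None if nothing matches."""
    for rel_dir in dirs:
        for pattern in patterns:
            candidate = Path(repo_root) / rel_dir / pattern.format(chunk=chunk, split=split)
            if candidate.is_file():
                return candidate
    return None
```

Returning `None` instead of raising lets the caller decide whether a missing upstream file is an error or simply means that no text evidence exists for that chunk/split.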
Typical sentiment columns include:
ticker
filing_date or date
doc_id
accession
form_type
source_name
predicted_sentiment_score
predicted_sentiment_class
predicted_sentiment_confidence
predicted_sentiment_uncertainty
predicted_sentiment_magnitude
The module supports flexible column aliases. For example, predicted_sentiment_score, sentiment_score, and pred_sentiment_score can all be mapped to the internal sentiment_score field.
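A minimal sketch of alias resolution, assuming a simple "first alias found wins" rule. The table name `COLUMN_ALIASES` and the function `resolve_aliases` are hypothetical; only the three `sentiment_score` aliases come from the text above.

```python
# Only the sentiment_score aliases are documented; other internal fields
# would need their own alias lists.
COLUMN_ALIASES = {
    "sentiment_score": [
        "predicted_sentiment_score",
        "sentiment_score",
        "pred_sentiment_score",
    ],
}


def resolve_aliases(columns, aliases=COLUMN_ALIASES):
    """Map each internal field to the first of its aliases present in `columns`."""
    available = set(columns)
    resolved = {}
    for internal_name, candidates in aliases.items():
        for candidate in candidates:
            if candidate in available:
                resolved[internal_name] = candidate
                break
    return resolved


print(resolve_aliases(["ticker", "date", "pred_sentiment_score"]))
# prints {'sentiment_score': 'pred_sentiment_score'}
```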
Expected locations for News Analyst outputs include:
outputs/results/analysts/news/
outputs/results/NewsAnalyst/
Expected news file naming patterns include:
chunk{chunk}_{split}_news_predictions.csv
news_predictions_chunk{chunk}_{split}.csv
predictions_chunk{chunk}_{split}.csv
Typical news columns include:
ticker
filing_date or date
doc_id
accession
form_type
source_name
predicted_news_event_impact
predicted_news_importance
predicted_risk_relevance
predicted_volatility_spike
predicted_drawdown_risk
predicted_news_uncertainty
The News Analyst is treated as an event-impact and event-risk source, not only a direction source.
The module converts all available sentiment and news columns into a fixed feature vector.
sentiment_score
sentiment_confidence
sentiment_uncertainty
sentiment_magnitude
sentiment_event_present
news_event_impact
news_importance
risk_relevance
volatility_spike
drawdown_risk
news_uncertainty
news_event_present
has_both_sources
These 13 features form the model input vector.
| Feature | Meaning |
|---|---|
| sentiment_score | Directional sentiment score, normally from -1 to +1 |
| sentiment_confidence | Confidence of the Sentiment Analyst prediction |
| sentiment_uncertainty | Uncertainty of the sentiment output |
| sentiment_magnitude | Strength/magnitude of the sentiment signal |
| sentiment_event_present | Binary indicator that sentiment evidence exists |
| news_event_impact | Directional market impact estimated by News Analyst |
| news_importance | Importance or relevance of the news/filing event |
| risk_relevance | How relevant the news event is to financial risk |
| volatility_spike | Estimated likelihood or severity of volatility increase |
| drawdown_risk | Estimated downside/drawdown risk from the news event |
| news_uncertainty | Uncertainty of the News Analyst output |
| news_event_present | Binary indicator that news evidence exists |
| has_both_sources | Binary indicator that both sentiment and news evidence exist |
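The construction of the 13-dimensional input can be sketched as below. Two details are assumptions, not documented behaviour: missing values are filled with 0.0, and a source counts as "present" when its main score (`sentiment_score` or `news_event_impact`) is available.

```python
import math

FEATURE_NAMES = [
    "sentiment_score", "sentiment_confidence", "sentiment_uncertainty",
    "sentiment_magnitude", "sentiment_event_present",
    "news_event_impact", "news_importance", "risk_relevance",
    "volatility_spike", "drawdown_risk", "news_uncertainty",
    "news_event_present", "has_both_sources",
]
PRESENCE_FLAGS = {"sentiment_event_present", "news_event_present", "has_both_sources"}


def _value(row, key):
    """Return a float for `key`, or None when the value is missing or NaN."""
    raw = row.get(key)
    if raw is None:
        return None
    value = float(raw)
    return None if math.isnan(value) else value


def build_feature_vector(row):
    """Turn one merged sentiment/news row (a dict) into the 13 model inputs."""
    values = {n: _value(row, n) for n in FEATURE_NAMES if n not in PRESENCE_FLAGS}
    # Assumption: presence is inferred from the main score of each source.
    sentiment_present = values["sentiment_score"] is not None
    news_present = values["news_event_impact"] is not None
    # Assumption: missing inputs are filled with 0.0.
    features = {n: (0.0 if v is None else v) for n, v in values.items()}
    features["sentiment_event_present"] = float(sentiment_present)
    features["news_event_present"] = float(news_present)
    features["has_both_sources"] = float(sentiment_present and news_present)
    return [features[n] for n in FEATURE_NAMES]
```

Keeping the presence flags as explicit inputs lets the model distinguish a genuinely neutral score of 0.0 from a filled-in missing value.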
The model predicts three continuous outputs:
qualitative_score
qualitative_risk_score
qualitative_confidence
qualitative_score
Range:
-1 to +1
Interpretation:
-1 = strongly negative qualitative evidence
0 = neutral or no clear qualitative direction
+1 = strongly positive qualitative evidence
qualitative_risk_score
Range:
0 to 1
Interpretation:
0 = low text-driven risk
1 = severe text-driven risk
This is separate from direction. A text event can be positive but still risky.
qualitative_confidence
Range:
0 to 1
Interpretation:
0 = unreliable or missing text evidence
1 = strong, confident qualitative evidence
Because there is no manually labelled qualitative synthesis dataset, the module constructs transparent weak-supervised targets from the upstream analyst outputs.
sentiment_direction = sentiment_score
news_direction = news_event_impact × news_importance
qualitative_score_target = 0.55 × sentiment_direction + 0.45 × news_direction
The result is clipped to:
[-1, +1]
Sentiment receives slightly higher weight because it is already a direct polarity signal, while news impact is event-based and may be noisier.
qualitative_risk_target =
0.35 × risk_relevance
+ 0.30 × volatility_spike
+ 0.25 × drawdown_risk
+ 0.10 × news_uncertainty
The result is clipped to:
[0, 1]
This makes the model sensitive to text that warns of tail risk, instability, volatility, or downside pressure.
qualitative_confidence_target =
0.40 × sentiment_confidence
+ 0.25 × (1 - sentiment_uncertainty)
+ 0.20 × news_importance
+ 0.15 × (1 - news_uncertainty)
The result is clipped to:
[0, 1]
If neither sentiment nor news evidence exists, confidence is forced to 0.
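The three target formulas can be written out directly. The sketch below uses the weights stated above; the function names and the dict-based input are illustrative assumptions, as is the choice to treat absent keys as 0.0.

```python
def clip(value, low, high):
    """Clamp `value` to the closed interval [low, high]."""
    return max(low, min(high, value))


def weak_targets(f):
    """Compute the weak-supervised targets from a dict of internal features."""
    get = lambda name: float(f.get(name, 0.0))  # assumption: absent means 0.0

    sentiment_direction = get("sentiment_score")
    news_direction = get("news_event_impact") * get("news_importance")
    score = clip(0.55 * sentiment_direction + 0.45 * news_direction, -1.0, 1.0)

    risk = clip(
        0.35 * get("risk_relevance")
        + 0.30 * get("volatility_spike")
        + 0.25 * get("drawdown_risk")
        + 0.10 * get("news_uncertainty"),
        0.0, 1.0,
    )

    confidence = clip(
        0.40 * get("sentiment_confidence")
        + 0.25 * (1.0 - get("sentiment_uncertainty"))
        + 0.20 * get("news_importance")
        + 0.15 * (1.0 - get("news_uncertainty")),
        0.0, 1.0,
    )
    # With no evidence at all, confidence is forced to 0.
    if not get("sentiment_event_present") and not get("news_event_present"):
        confidence = 0.0

    return {
        "qualitative_score_target": score,
        "qualitative_risk_target": risk,
        "qualitative_confidence_target": confidence,
    }
```

As a worked example, sentiment_score = -0.35, news_event_impact = -0.22 and news_importance = 0.5 give 0.55 × (-0.35) + 0.45 × (-0.11) = -0.242. The forced-zero rule matters under the zero-fill assumption: with both uncertainties filled as 0, the confidence formula alone would return 0.25 + 0.15 = 0.40 for a row with no evidence.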
The Qualitative Analyst uses a small MLP.
Input: 13-dimensional feature vector
↓
Linear layer(s)
↓
Tanh activation
↓
Dropout
↓
Linear output head
↓
Output transformations
The output transformations are:
qualitative_score = tanh(raw_output_0)
qualitative_risk_score = sigmoid(raw_output_1)
qualitative_confidence = sigmoid(raw_output_2)
This ensures valid output ranges:
qualitative_score ∈ [-1, +1]
qualitative_risk_score ∈ [0, 1]
qualitative_confidence ∈ [0, 1]
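The output transformations can be illustrated without any deep-learning library. This is a plain-Python sketch of the same squashing functions, not the module's tensor code; `transform_outputs` is a hypothetical name, and the risk output is keyed as `qualitative_risk_score` to match the output column.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def transform_outputs(raw):
    """Map the three raw head outputs to their bounded ranges."""
    return {
        "qualitative_score": math.tanh(raw[0]),
        "qualitative_risk_score": sigmoid(raw[1]),
        "qualitative_confidence": sigmoid(raw[2]),
    }
```

Because the bounds are enforced by the transformations themselves, the range checks applied to prediction files should hold for any finite raw output.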
Default architecture:
input_dim = 13
hidden_dim = 64
n_layers = 2
dropout = 0.15
activation = tanh
output_dim = 3
The hidden activation is tanh, consistent with the project preference for MLP hidden layers.
The module uses a weighted multi-output regression loss.
score_loss = MSE(predicted_qualitative_score, qualitative_score_target)
risk_loss = MSE(predicted_qualitative_risk, qualitative_risk_target)
conf_loss = MSE(predicted_qualitative_confidence, qualitative_confidence_target)
Final loss:
loss = 0.40 × score_loss + 0.35 × risk_loss + 0.25 × conf_loss
The score and risk outputs receive higher emphasis because they are more directly used in the Fusion Engine.
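A small numeric sketch of the weighted loss, using plain lists in place of tensors. The helper names and the dict layout are assumptions made for illustration.

```python
LOSS_WEIGHTS = {"score": 0.40, "risk": 0.35, "confidence": 0.25}


def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def weighted_loss(pred, target):
    """Combine the per-output MSE terms with the documented weights.

    `pred` and `target` map 'score', 'risk' and 'confidence' to lists of floats.
    """
    return sum(w * mse(pred[k], target[k]) for k, w in LOSS_WEIGHTS.items())
```

For example, a score MSE of 1.0, a risk MSE of 0.0 and a confidence MSE of 0.25 give 0.40 × 1.0 + 0.35 × 0.0 + 0.25 × 0.25 = 0.4625.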
The module supports Optuna TPE hyperparameter search.
Search parameters:
hidden_dim: [32, 64, 96, 128]
n_layers: 1 to 3
dropout: 0.05 to 0.40
learning_rate: 1e-4 to 3e-3
weight_decay: 1e-7 to 1e-3
batch_size: [256, 512, 1024]
The HPO study is stored in an SQLite database:
outputs/codeResults/QualitativeAnalyst/hpo_chunk{chunk}.db
Best parameters are saved as:
outputs/codeResults/QualitativeAnalyst/best_params_chunk{chunk}.json
The HPO command is resumable unless --fresh is used.
For each chunk, the module saves:
outputs/models/QualitativeAnalyst/chunk{chunk}/best_model.pt
outputs/models/QualitativeAnalyst/chunk{chunk}/final_model.pt
outputs/models/QualitativeAnalyst/chunk{chunk}/scaler.npz
outputs/models/QualitativeAnalyst/chunk{chunk}/training_history.csv
outputs/models/QualitativeAnalyst/chunk{chunk}/model_freezed/model.pt
outputs/models/QualitativeAnalyst/chunk{chunk}/model_unfreezed/model.pt
The model checkpoint contains:
state_dict
config
feature_names
target_names
best_val_loss
The scaler contains train-fitted feature mean and standard deviation. The scaler is fitted only on the training split to avoid validation/test leakage.
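The leakage-safe scaling pattern can be sketched as below: statistics are computed from the training rows only and then reused unchanged for validation and test rows. This is a plain-Python illustration. Replacing a zero standard deviation with 1.0 is an assumption, not documented behaviour.

```python
def fit_scaler(train_rows):
    """Compute per-feature mean and standard deviation from training rows only."""
    n = len(train_rows)
    dims = len(train_rows[0])
    mean = [sum(row[j] for row in train_rows) / n for j in range(dims)]
    std = []
    for j in range(dims):
        variance = sum((row[j] - mean[j]) ** 2 for row in train_rows) / n
        # Assumption: constant columns use std = 1.0 to avoid division by zero.
        std.append(variance ** 0.5 if variance > 0 else 1.0)
    return mean, std


def apply_scaler(rows, mean, std):
    """Standardise rows from any split with the train-fitted statistics."""
    return [[(v - m) / s for v, m, s in zip(row, mean, std)] for row in rows]
```

Validation and test rows never pass through `fit_scaler`. They only pass through `apply_scaler` with the stored mean and standard deviation.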
The module produces two prediction outputs per chunk/split.
Event-level output path:
outputs/results/QualitativeAnalyst/qualitative_events_chunk{chunk}_{split}.csv
Purpose:
Preserve individual text-event evidence for explanation and audit.
Important columns:
ticker
date
chunk
split
doc_id
accession
form_type
source_name
sentiment_score
sentiment_confidence
sentiment_uncertainty
sentiment_magnitude
news_event_impact
news_importance
risk_relevance
volatility_spike
drawdown_risk
news_uncertainty
sentiment_direction_score
news_direction_score
qualitative_score
qualitative_risk_score
qualitative_confidence
qualitative_recommendation
dominant_qualitative_driver
xai_summary
Daily output path:
outputs/results/QualitativeAnalyst/qualitative_daily_chunk{chunk}_{split}.csv
Purpose:
Provide ticker/date aligned qualitative signals for the Fusion Engine.
Important columns:
ticker
date
chunk
split
event_count
sentiment_event_count
news_event_count
qualitative_score
qualitative_risk_score
qualitative_confidence
qualitative_recommendation
max_event_risk_score
mean_event_risk_score
mean_sentiment_score
mean_news_impact_score
mean_news_importance
dominant_qualitative_driver
xai_summary
The event-level output is aggregated by:
ticker, date
daily_qualitative_score = confidence-weighted mean of event qualitative_score
daily_qualitative_risk_score =
0.60 × max(event_qualitative_risk_score)
+ 0.40 × weighted_mean(event_qualitative_risk_score)
This prevents severe risk disclosures from being averaged away by many neutral events.
daily_qualitative_confidence = mean(event_confidence) × min(1, log1p(event_count) / log(5))
This rewards repeated supporting evidence but caps the boost.
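The three aggregation rules can be sketched for a single ticker/date group as follows. Two points are assumptions: the `weighted_mean` in the risk formula is taken to be confidence-weighted like the score, and equal weights are used when every event confidence is zero.

```python
import math


def aggregate_daily(events):
    """Aggregate the event rows (dicts) of one ticker/date into a daily signal."""
    weights = [e["qualitative_confidence"] for e in events]
    total = sum(weights)
    if total <= 0:
        # Assumption: fall back to equal weights when all confidences are zero.
        weights = [1.0] * len(events)
        total = float(len(events))

    def weighted_mean(key):
        return sum(w * e[key] for w, e in zip(weights, events)) / total

    risks = [e["qualitative_risk_score"] for e in events]
    mean_confidence = sum(e["qualitative_confidence"] for e in events) / len(events)
    # Evidence factor: grows with the number of events and is capped at 1.
    evidence_factor = min(1.0, math.log1p(len(events)) / math.log(5))

    return {
        "event_count": len(events),
        "qualitative_score": weighted_mean("qualitative_score"),
        "qualitative_risk_score": 0.60 * max(risks)
        + 0.40 * weighted_mean("qualitative_risk_score"),
        "qualitative_confidence": mean_confidence * evidence_factor,
    }
```

With this formula a single event has an evidence factor of log(2) / log(5) ≈ 0.43, and the factor reaches its cap of 1 at four events, so further events no longer raise the daily confidence.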
The module converts qualitative score, risk, and confidence into a qualitative recommendation.
BUY:
qualitative_score > +0.20
qualitative_risk_score < 0.70
qualitative_confidence ≥ 0.40
SELL:
qualitative_score < -0.20
OR qualitative_risk_score > 0.85
HOLD:
otherwise
This conservative logic ensures that the qualitative branch does not overreact to weak or uncertain text signals.
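The thresholds above translate into a short decision function. The sketch assumes that all three BUY conditions must hold together and that either SELL condition is sufficient on its own, which is how the rule block reads. The function name is hypothetical.

```python
def qualitative_recommendation(score, risk, confidence):
    """Map qualitative score, risk and confidence to BUY, SELL or HOLD."""
    # Assumption: the BUY conditions are combined with AND.
    if score > 0.20 and risk < 0.70 and confidence >= 0.40:
        return "BUY"
    # Either a clearly negative score or very high risk triggers SELL.
    if score < -0.20 or risk > 0.85:
        return "SELL"
    return "HOLD"
```

Under this reading BUY and SELL cannot both apply to the same row, because BUY requires a risk below 0.70 and a positive score. A positive score with low confidence, or with risk between 0.70 and 0.85, falls through to HOLD.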
The module provides both row-level and file-level XAI.
Each event and daily row contains an xai_summary column.
Example:
SELL: qualitative_score=-0.421, risk=0.881, confidence=0.624; driver=drawdown_risk; sentiment=-0.350; news_impact=-0.220.
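A string in this shape can be produced with ordinary f-string formatting. The function below is a hypothetical helper that reproduces the example above with three decimal places. The module's actual formatting code may differ.

```python
def format_xai_summary(recommendation, score, risk, confidence,
                       driver, sentiment, news_impact):
    """Build a one-line, human-readable explanation for an output row."""
    return (
        f"{recommendation}: qualitative_score={score:.3f}, risk={risk:.3f}, "
        f"confidence={confidence:.3f}; driver={driver}; "
        f"sentiment={sentiment:.3f}; news_impact={news_impact:.3f}."
    )
```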
Each event row includes:
dominant_qualitative_driver
Possible values include:
sentiment
news_impact
risk_relevance
volatility_spike
drawdown_risk
news_uncertainty
This indicates which upstream factor most influenced the qualitative output.
The module computes input-gradient feature importance for a sample of rows.
XAI output path:
outputs/results/QualitativeAnalyst/xai/qualitative_chunk{chunk}_{split}_xai_summary.json
The JSON contains:
module
chunk
split
config
model metadata
event row count
daily row count
ticker count
date range
event recommendation counts
daily recommendation counts
dominant driver counts
gradient feature importance
top positive event examples
top negative event examples
highest risk event examples
plain-English explanation
summary statistics
The Fusion Engine should consume the daily qualitative file:
outputs/results/QualitativeAnalyst/qualitative_daily_chunk{chunk}_{split}.csv
It should merge on:
ticker, date
with:
outputs/results/QuantitativeAnalyst/quantitative_analysis_chunk{chunk}_{split}.csv
outputs/results/PositionSizing/position_sizing_chunk{chunk}_{split}.csv
Important interpretation rule:
Missing qualitative row = no text evidence available
It should not be interpreted as positive or negative sentiment.
Recommended Fusion defaults for missing qualitative evidence:
qualitative_score = 0.0
qualitative_confidence = 0.0
qualitative_risk_score = 0.5 (or another neutral default)
This ensures that absent text does not falsely strengthen the decision.
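The merge-with-defaults behaviour can be sketched as a left join keyed on (ticker, date). This is an illustration on plain dicts, not Fusion Engine code: `attach_qualitative` and the `has_qualitative_evidence` flag are hypothetical, and 0.5 is used as the neutral risk default.

```python
MISSING_QUALITATIVE_DEFAULTS = {
    "qualitative_score": 0.0,
    "qualitative_confidence": 0.0,
    "qualitative_risk_score": 0.5,  # neutral default, as suggested above
}


def attach_qualitative(quant_rows, daily_qual_rows):
    """Left-join daily qualitative signals onto quantitative rows."""
    lookup = {(r["ticker"], r["date"]): r for r in daily_qual_rows}
    merged = []
    for row in quant_rows:
        qual = lookup.get((row["ticker"], row["date"]))
        out = dict(row)
        for name, default in MISSING_QUALITATIVE_DEFAULTS.items():
            out[name] = qual[name] if qual is not None else default
        # Hypothetical flag so downstream logic can tell "neutral" from "absent".
        out["has_qualitative_evidence"] = qual is not None
        merged.append(out)
    return merged
```

A left join keeps every quantitative row, so ticker/date pairs without text evidence still reach the Fusion Engine with zero confidence rather than being dropped.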
This module intentionally excludes fundamentals.
It does not read:
fundamental embeddings
fundamental analyst outputs
SEC tabular fundamentals
valuation ratios
balance-sheet strength outputs
The current qualitative branch is limited to:
Sentiment Analyst + News Analyst
This preserves the current approved architecture.
The module follows these anti-leakage rules:
The weak-supervised targets are derived from same-row upstream analyst outputs, so they do not introduce future price information.
The Qualitative Analyst has several strengths:
This version has limitations that should be acknowledged in the report.
These limitations are acceptable because this module is a synthesis component, not a primary text encoder.
cd ~/fin-glassbox && python -m py_compile code/analysts/qualitative_analyst.py
cd ~/fin-glassbox && python code/analysts/qualitative_analyst.py inspect --repo-root .
cd ~/fin-glassbox && python code/analysts/qualitative_analyst.py smoke --repo-root . --device cuda
cd ~/fin-glassbox && python code/analysts/qualitative_analyst.py hpo --repo-root . --chunk 1 --trials 30 --device cuda --fresh
cd ~/fin-glassbox && python code/analysts/qualitative_analyst.py train-best --repo-root . --chunk 1 --device cuda --fresh
cd ~/fin-glassbox && python code/analysts/qualitative_analyst.py predict --repo-root . --chunk 1 --split test --device cuda
cd ~/fin-glassbox && python code/analysts/qualitative_analyst.py validate --repo-root . --chunk 1 --split test
Before launching the full run, execute a small sanity run:
cd ~/fin-glassbox && python -m py_compile code/analysts/qualitative_analyst.py && python code/analysts/qualitative_analyst.py inspect --repo-root . && python code/analysts/qualitative_analyst.py smoke --repo-root . --device cuda
Then run one chunk with a small number of trials:
cd ~/fin-glassbox && python code/analysts/qualitative_analyst.py hpo --repo-root . --chunk 1 --trials 3 --device cuda --fresh && python code/analysts/qualitative_analyst.py train-best --repo-root . --chunk 1 --device cuda --fresh && python code/analysts/qualitative_analyst.py predict --repo-root . --chunk 1 --split val --device cuda && python code/analysts/qualitative_analyst.py predict --repo-root . --chunk 1 --split test --device cuda && python code/analysts/qualitative_analyst.py validate --repo-root . --chunk 1 --split test
If this passes, run the full version.
Use this after sentiment and news outputs exist for all chunks.
cd ~/fin-glassbox && python code/analysts/qualitative_analyst.py hpo --repo-root . --chunk 1 --trials 30 --device cuda --fresh && python code/analysts/qualitative_analyst.py train-best --repo-root . --chunk 1 --device cuda --fresh && python code/analysts/qualitative_analyst.py predict-all --repo-root . --chunks 1 --splits val test --device cuda && python code/analysts/qualitative_analyst.py hpo --repo-root . --chunk 2 --trials 30 --device cuda --fresh && python code/analysts/qualitative_analyst.py train-best --repo-root . --chunk 2 --device cuda --fresh && python code/analysts/qualitative_analyst.py predict-all --repo-root . --chunks 2 --splits val test --device cuda && python code/analysts/qualitative_analyst.py hpo --repo-root . --chunk 3 --trials 30 --device cuda --fresh && python code/analysts/qualitative_analyst.py train-best --repo-root . --chunk 3 --device cuda --fresh && python code/analysts/qualitative_analyst.py predict-all --repo-root . --chunks 3 --splits val test --device cuda
After FinBERT has been fully rerun, the complete qualitative branch (Sentiment Analyst, News Analyst, then Qualitative Analyst) can be retrained and refreshed with this command:
cd ~/fin-glassbox && python code/analysts/sentiment_analyst.py hpo --repo-root . --chunk 1 --trials 30 --device cuda --fresh && python code/analysts/sentiment_analyst.py train-best --repo-root . --chunk 1 --device cuda --fresh && python code/analysts/sentiment_analyst.py predict --repo-root . --chunk 1 --split val --device cuda && python code/analysts/sentiment_analyst.py predict --repo-root . --chunk 1 --split test --device cuda && python code/analysts/news_analyst.py hpo --repo-root . --chunk 1 --trials 30 --device cuda --fresh && python code/analysts/news_analyst.py train-best --repo-root . --chunk 1 --device cuda --fresh && python code/analysts/news_analyst.py predict --repo-root . --chunk 1 --split val --device cuda && python code/analysts/news_analyst.py predict --repo-root . --chunk 1 --split test --device cuda && python code/analysts/qualitative_analyst.py hpo --repo-root . --chunk 1 --trials 30 --device cuda --fresh && python code/analysts/qualitative_analyst.py train-best --repo-root . --chunk 1 --device cuda --fresh && python code/analysts/qualitative_analyst.py predict-all --repo-root . --chunks 1 --splits val test --device cuda && python code/analysts/sentiment_analyst.py hpo --repo-root . --chunk 2 --trials 30 --device cuda --fresh && python code/analysts/sentiment_analyst.py train-best --repo-root . --chunk 2 --device cuda --fresh && python code/analysts/sentiment_analyst.py predict --repo-root . --chunk 2 --split val --device cuda && python code/analysts/sentiment_analyst.py predict --repo-root . --chunk 2 --split test --device cuda && python code/analysts/news_analyst.py hpo --repo-root . --chunk 2 --trials 30 --device cuda --fresh && python code/analysts/news_analyst.py train-best --repo-root . --chunk 2 --device cuda --fresh && python code/analysts/news_analyst.py predict --repo-root . --chunk 2 --split val --device cuda && python code/analysts/news_analyst.py predict --repo-root . --chunk 2 --split test --device cuda && python code/analysts/qualitative_analyst.py hpo --repo-root . --chunk 2 --trials 30 --device cuda --fresh && python code/analysts/qualitative_analyst.py train-best --repo-root . --chunk 2 --device cuda --fresh && python code/analysts/qualitative_analyst.py predict-all --repo-root . --chunks 2 --splits val test --device cuda && python code/analysts/sentiment_analyst.py hpo --repo-root . --chunk 3 --trials 30 --device cuda --fresh && python code/analysts/sentiment_analyst.py train-best --repo-root . --chunk 3 --device cuda --fresh && python code/analysts/sentiment_analyst.py predict --repo-root . --chunk 3 --split val --device cuda && python code/analysts/sentiment_analyst.py predict --repo-root . --chunk 3 --split test --device cuda && python code/analysts/news_analyst.py hpo --repo-root . --chunk 3 --trials 30 --device cuda --fresh && python code/analysts/news_analyst.py train-best --repo-root . --chunk 3 --device cuda --fresh && python code/analysts/news_analyst.py predict --repo-root . --chunk 3 --split val --device cuda && python code/analysts/news_analyst.py predict --repo-root . --chunk 3 --split test --device cuda && python code/analysts/qualitative_analyst.py hpo --repo-root . --chunk 3 --trials 30 --device cuda --fresh && python code/analysts/qualitative_analyst.py train-best --repo-root . --chunk 3 --device cuda --fresh && python code/analysts/qualitative_analyst.py predict-all --repo-root . --chunks 3 --splits val test --device cuda
After a successful full run, the following files should exist:
outputs/models/QualitativeAnalyst/chunk1/best_model.pt
outputs/models/QualitativeAnalyst/chunk2/best_model.pt
outputs/models/QualitativeAnalyst/chunk3/best_model.pt
outputs/results/QualitativeAnalyst/qualitative_events_chunk1_val.csv
outputs/results/QualitativeAnalyst/qualitative_events_chunk1_test.csv
outputs/results/QualitativeAnalyst/qualitative_events_chunk2_val.csv
outputs/results/QualitativeAnalyst/qualitative_events_chunk2_test.csv
outputs/results/QualitativeAnalyst/qualitative_events_chunk3_val.csv
outputs/results/QualitativeAnalyst/qualitative_events_chunk3_test.csv
outputs/results/QualitativeAnalyst/qualitative_daily_chunk1_val.csv
outputs/results/QualitativeAnalyst/qualitative_daily_chunk1_test.csv
outputs/results/QualitativeAnalyst/qualitative_daily_chunk2_val.csv
outputs/results/QualitativeAnalyst/qualitative_daily_chunk2_test.csv
outputs/results/QualitativeAnalyst/qualitative_daily_chunk3_val.csv
outputs/results/QualitativeAnalyst/qualitative_daily_chunk3_test.csv
outputs/results/QualitativeAnalyst/xai/qualitative_chunk1_val_xai_summary.json
outputs/results/QualitativeAnalyst/xai/qualitative_chunk1_test_xai_summary.json
outputs/results/QualitativeAnalyst/xai/qualitative_chunk2_val_xai_summary.json
outputs/results/QualitativeAnalyst/xai/qualitative_chunk2_test_xai_summary.json
outputs/results/QualitativeAnalyst/xai/qualitative_chunk3_val_xai_summary.json
outputs/results/QualitativeAnalyst/xai/qualitative_chunk3_test_xai_summary.json
A qualitative run is considered successful if:
1. Training completes and best_model.pt is saved.
2. scaler.npz exists for each chunk.
3. Event-level outputs exist for validation and test splits.
4. Daily outputs exist for validation and test splits.
5. XAI JSON outputs exist for validation and test splits.
6. qualitative_score is always within [-1, +1].
7. qualitative_risk_score is always within [0, 1].
8. qualitative_confidence is always within [0, 1].
9. Recommendation labels are only BUY, HOLD, SELL.
10. Fusion can merge daily qualitative outputs on ticker/date.
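Criteria 6 to 10 can be checked mechanically on the rows of a daily output file. The sketch below works on rows already loaded as dicts with numeric values. The function name is hypothetical, and treating duplicate (ticker, date) keys as a merge problem is an assumption about what criterion 10 requires.

```python
VALID_LABELS = {"BUY", "HOLD", "SELL"}
RANGES = {
    "qualitative_score": (-1.0, 1.0),
    "qualitative_risk_score": (0.0, 1.0),
    "qualitative_confidence": (0.0, 1.0),
}


def check_daily_rows(rows):
    """Return a list of human-readable problems; an empty list means success."""
    problems = []
    seen_keys = set()
    for index, row in enumerate(rows):
        for column, (low, high) in RANGES.items():
            value = row[column]
            # NaN fails this comparison, so it is reported as out of range too.
            if not (low <= value <= high):
                problems.append(f"row {index}: {column}={value} outside [{low}, {high}]")
        label = row["qualitative_recommendation"]
        if label not in VALID_LABELS:
            problems.append(f"row {index}: unexpected label {label!r}")
        key = (row["ticker"], row["date"])
        # Assumption: a clean ticker/date merge needs unique keys per file.
        if key in seen_keys:
            problems.append(f"row {index}: duplicate key {key}")
        seen_keys.add(key)
    return problems
```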
The Qualitative Analyst is a trainable synthesis module that combines outputs from the Sentiment Analyst and News Analyst into a daily ticker-level qualitative signal. Rather than directly reprocessing raw text, it learns a calibrated mapping from upstream text-specialist predictions into qualitative direction, qualitative risk, and qualitative confidence. The module is trained using transparent weak-supervised targets derived from sentiment polarity, news impact, risk relevance, volatility-spike likelihood, drawdown-risk likelihood, and uncertainty. It produces event-level outputs for auditability and ticker-date daily outputs for integration into the Fusion Engine. Explainability is provided through row-level explanations, dominant-driver attribution, and gradient-based feature importance.
The Qualitative Analyst completes the qualitative branch of the architecture. It transforms sparse, event-level text analysis into structured, trainable, explainable daily qualitative signals. It is designed to be lightweight, reproducible, auditable, and directly compatible with the upcoming Fusion Engine.
Its main contribution is:
Sentiment + News → trained qualitative direction/risk/confidence → Fusion-ready daily signal
This keeps the full system modular, explainable, and defensible.