Neural Networks in Trading: An Ensemble of Agents with Attention Mechanisms (Final Part)

Portfolio management thrives on timely reallocation—shifting capital across assets to maximize return while managing risk. A recent study, “Developing an attention-based ensemble learning framework for financial portfolio optimisation,” proposes MASAAT: a multi-agent, attention-driven framework that blends time-series modeling with cross-asset analysis. Its core idea is to deploy agents that read markets at multiple granularities, rebalance continuously, and seek a robust profit–risk trade-off in volatile regimes.

From Directional Filters to Attention Tokens

Each agent uses directional movement filters with different thresholds to detect meaningful price swings, extracting trend signatures from price series. MASAAT introduces a sequence tokenization scheme that powers two attention modules:

  • Cross-Sectional Attention (CSA): tokens derived from per-asset indicators to capture asset-to-asset dependencies.
  • Temporal Analysis (TA): tokens built from temporal patterns to capture time-to-time relationships.

Agents then fuse CSA and TA outputs via attention to uncover per-asset dependencies across all time points in the observation window.
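The two token views above can be illustrated with a minimal NumPy sketch (all sizes are hypothetical, chosen only for shape-checking): CSA tokens treat each asset as a token with features along the time axis, while TA tokens are the same tensor viewed through a transpose of the last two dimensions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_agents, n_assets, n_steps = 3, 5, 16   # hypothetical sizes
x = rng.normal(size=(n_agents, n_assets, n_steps))

# CSA tokens: one token per asset, features along the time axis
csa_tokens = x                      # shape (agents, assets, time)

# TA tokens: one token per time step, features along the asset axis —
# the same data viewed through a transpose of the last two dims
ta_tokens = x.transpose(0, 2, 1)    # shape (agents, time, assets)

assert csa_tokens.shape == (n_agents, n_assets, n_steps)
assert ta_tokens.shape == (n_agents, n_steps, n_assets)
```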

Same Architecture, Different Perspective: CSA vs. TA

CSA and TA are nearly isomorphic: they analyze the same data from orthogonal perspectives. That symmetry suggests a practical shortcut—reuse the CSA machinery for the TA module by transposing the input tensor. In MASAAT, data arrive as a 3D tensor [Agent, Asset, Time]. For TA, the last two dimensions must be swapped. Instead of building a new OpenCL kernel, the implementation uses three existing layers in sequence:

  1. Flatten and apply a 2D transpose: [Agent, [Asset, Time]] → [[Asset, Time], Agent].
  2. 3D transpose of the first two dims: [Asset, Time, Agent] → [Time, Asset, Agent].
  3. Another 2D transpose to restore the agent dimension to front: [[Time, Asset], Agent] → [Agent, [Time, Asset]].

This pipeline is encapsulated in CNeuronTransposeVRCOCL. The TA module itself (CNeuronTemporalAnalysis) simply transposes the input and feeds it into the CSA module. Because the source time series is represented as multi-scale piecewise-linear segments (three values per directed segment), the TA window is tripled while the sequence length is reduced by a factor of three to preserve logical grouping.
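The equivalence of this three-step chain to a direct swap of the last two axes can be checked with a small NumPy sketch (tensor sizes are hypothetical; the reshape/transpose pairs stand in for the 2D and 3D transpose layers):

```python
import numpy as np

rng = np.random.default_rng(1)
a, s, t = 2, 4, 6                      # agents, assets, time (hypothetical)
x = rng.normal(size=(a, s, t))

# Step 1: 2D transpose of [Agent, Asset*Time] -> [Asset*Time, Agent],
# reinterpreted as the 3D tensor [Asset, Time, Agent]
step1 = x.reshape(a, s * t).T.reshape(s, t, a)

# Step 2: 3D transpose of the first two dims -> [Time, Asset, Agent]
step2 = step1.transpose(1, 0, 2)

# Step 3: 2D transpose of [Time*Asset, Agent] -> [Agent, Time*Asset],
# reinterpreted as [Agent, Time, Asset]
step3 = step2.reshape(t * s, a).T.reshape(a, t, s)

# The chain is equivalent to swapping the last two dimensions directly
assert np.array_equal(step3, x.transpose(0, 2, 1))
```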

Fusing Views to Build Portfolios

CSA and TA produce tensors enriched with cross-asset and cross-time dependencies. The CNeuronPortfolioGenerator combines them via attention:

  • TA output is used twice—original and transposed—so a 3D transpose layer is instantiated internally.
  • CSA output is multiplied by transposed TA output, then normalized with Softmax. Normalization is per asset, per agent (i.e., one head per asset–agent pair).
  • These coefficients are applied to the original TA output to form asset embeddings per agent.
  • A final projection aggregates all agents into a unified state embedding.
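The fusion steps above can be sketched in NumPy (a simplified view with hypothetical shapes; the real layer operates on OpenCL buffers, and the embedding dimension here is arbitrary):

```python
import numpy as np

def softmax(z, axis=-1):
    # numerically stable softmax along the given axis
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(2)
a, s, t, d = 2, 4, 8, 8   # agents, assets, time steps, embed dim (hypothetical)
csa = rng.normal(size=(a, s, d))   # CSA output: per-asset embeddings
ta  = rng.normal(size=(a, t, d))   # TA output: per-time-step embeddings

# Scores: CSA output times transposed TA output, one head per asset–agent pair
scores = csa @ ta.transpose(0, 2, 1)          # (agents, assets, time)
weights = softmax(scores, axis=-1)            # normalized per asset, per agent

# Apply the coefficients to the original TA output -> asset embeddings per agent
asset_emb = weights @ ta                      # (agents, assets, d)
assert asset_emb.shape == (a, s, d)
```

A final learned projection (omitted here) would then flatten and aggregate the per-agent embeddings into the unified state embedding.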

While the original MASAAT paper formulates a direct portfolio weight vector (summing to 1), this implementation slightly reframes the output. The end goal is an agent action vector (direction, size, SL/TP). Because account state is not in the input, the system outputs a rich hidden-state embedding capturing market context, leaving execution logic to downstream components.

From Blocks to System: The CNeuronMASAAT Class

With all blocks in place, CNeuronMASAAT assembles the pipeline and inherits CNeuronPortfolioGenerator as its parent (serving as the final stage). Key steps:

  • Input transposition prepares data for multi-view analysis.
  • Multi-agent transformation (CNeuronPLRMultiAgentsOCL) builds multi-scale piecewise-linear representations.
  • Concatenate the transformed outputs with the original series; then feed to both CSA and TA.
  • Pass CSA and TA outputs to the inherited portfolio generator for spatiotemporal fusion and projection.

Two practical notes shape initialization. First, because of the piecewise-linear encoding (three points per segment), the effective sequence length is divided by three. Second, the number of agents equals the number of threshold levels, plus one extra agent that processes the raw series; prior studies show benefits from combining raw and transformed views.
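The initialization arithmetic is simple enough to state as a sketch (threshold values and the raw length are hypothetical placeholders):

```python
# Hypothetical initialization arithmetic: the piecewise-linear encoding
# stores 3 values per directed segment, so the effective sequence length
# shrinks 3x while the TA window widens 3x to keep segments grouped.
raw_len, window = 96, 1               # hypothetical raw series length / window
thresholds = [0.001, 0.002, 0.005]    # hypothetical directional filter levels

n_agents = len(thresholds) + 1        # +1 agent processes the raw series
ta_seq_len = raw_len // 3             # three values per segment
ta_window = window * 3

assert n_agents == 4 and ta_seq_len == 32 and ta_window == 3
```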

Forward and Backward, Without the Boilerplate

The forward pass chains the modules in order and reuses tensors where needed (both the transposed original and the concatenated tensors are consumed twice, driving careful buffer management). Backprop follows the reverse pathways, splitting and summing gradients across the twin flows, handling Softmax and activation derivatives, and using auxiliary buffers in the transpose layer to reconcile gradients arriving from two routes.
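The twin-flow gradient handling can be illustrated with a toy NumPy example (all names hypothetical; this shows only the sum-of-routes rule, not the actual OpenCL buffers):

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=(4, 4))                    # a tensor consumed twice
w1 = rng.normal(size=(4, 4))
w2 = rng.normal(size=(4, 4))

# Forward: x feeds two downstream paths
y1, y2 = x @ w1, x @ w2
loss = y1.sum() + y2.sum()

# Backward: each route contributes d(loss)/dx; an auxiliary buffer holds
# one contribution while the other is computed, then the two are summed
g1 = np.ones_like(y1) @ w1.T
g2 = np.ones_like(y2) @ w2.T
grad_x = g1 + g2
assert grad_x.shape == x.shape
```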

Training Setup and Results

The implemented MASAAT variant was integrated into an Actor model and trained on EURUSD (H1). The 2023 period served as training data, using default indicator settings. A bootstrap-and-refresh routine periodically updated the dataset to track the evolving policy. Final testing ran on January 2024 data with fixed parameters.

Headline results:

  • 16 trades executed over the test window.
  • A bit over one-third of trades were profitable, yet the maximum win was 2.5x the maximum loss.
  • Average profit per trade was about 3x the average loss, yielding a steadily rising equity curve.

Why It Matters

MASAAT’s strength is architectural: an ensemble of agents reading the market at multiple scales; dual attention views (cross-asset and cross-time) to reduce bias; and a fusion step that translates structure into actionable embeddings. Our implementation diverges from the paper in output interpretation, focusing on a general hidden state suitable for policy heads rather than direct weights. Still, the underlying approach—tokenized sequences, attention-driven correlations, and spatiotemporal fusion—proved effective in testing.

The full source code, configuration examples, and model architectures are available in the accompanying materials. In the next phase, we’ll broaden the asset universe and stress-test across regimes to assess generalization and robustness.
