Research paper · Finance & Machine Learning

Forecasting Returns in Thin Markets: A Machine Learning Approach to the Zagreb Stock Exchange

Mislav Šagovac · Luka Šikić · Petra Palić

Zagreb Stock Exchange (ZSE) · sample period 2000–2024

1,100+ predictors

4 ML models

24 years of data

1.58 full-universe Sharpe ratio

Abstract

Purpose. This paper investigates the out-of-sample predictability of weekly stock returns on the Zagreb Stock Exchange (ZSE), a thin frontier market characterized by low liquidity and concentrated ownership, and asks whether machine learning can extract predictive signal in this environment.

Methodology. We construct over 1,100 predictors from daily OHLCV data (2000–2024) — technical indicators, time-series features, and wavelet decompositions — and evaluate four models (Elastic Net, Random Forest, XGBoost, and a shallow neural network) within a rigorous nested rolling-window cross-validation framework, assessed via statistical metrics and a realistic portfolio backtest.
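
As an illustration of what a nested rolling-window scheme can look like, the sketch below generates index splits for weekly observations. It is a minimal example under stated assumptions, not the paper's actual procedure: the function names, the window lengths, and the choice of a single hold-out validation block at the end of each training window are all hypothetical.

```python
from typing import Iterator, Tuple

import numpy as np


def rolling_windows(
    n_obs: int, train_size: int, test_size: int
) -> Iterator[Tuple[np.ndarray, np.ndarray]]:
    """Yield (train_idx, test_idx) for fixed-length windows that roll forward."""
    start = 0
    while start + train_size + test_size <= n_obs:
        train_idx = np.arange(start, start + train_size)
        test_idx = np.arange(start + train_size, start + train_size + test_size)
        yield train_idx, test_idx
        start += test_size  # advance by one test block, so test blocks never overlap


def nested_rolling_splits(
    n_obs: int, train_size: int, val_size: int, test_size: int
) -> Iterator[Tuple[np.ndarray, np.ndarray, np.ndarray]]:
    """Outer loop: out-of-sample test blocks.

    Inner split (assumed here): the last `val_size` observations of each
    training window are held out for hyperparameter tuning.
    """
    if not 0 < val_size < train_size:
        raise ValueError("val_size must be positive and smaller than train_size")
    for train_idx, test_idx in rolling_windows(n_obs, train_size, test_size):
        yield train_idx[:-val_size], train_idx[-val_size:], test_idx


# Hypothetical sizes: 20 weeks of data, 10-week window, 3-week validation, 5-week test.
splits = list(nested_rolling_splits(n_obs=20, train_size=10, val_size=3, test_size=5))
```

Every index in a validation block precedes the test block, and every inner-training index precedes the validation block, so no future information reaches model fitting or tuning.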

Results. Directional accuracy is modest (46–53%), with nonlinear ensembles and neural networks outperforming the linear benchmark. A key finding is a strong monotonic liquidity gradient: portfolio Sharpe ratios rise from 0.17 for the most liquid stocks to 1.58 for the full universe (up to 1.97 for the best individual model).
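
For readers unfamiliar with the two headline metrics, the sketch below shows one common way to compute them. The definitions are assumptions, since this summary does not state the paper's exact formulas: directional accuracy is taken as the share of forecasts whose sign matches the realized return, and the Sharpe ratio is annualized from weekly returns with 52 periods per year and a zero risk-free rate.

```python
import numpy as np


def directional_accuracy(predicted, realized) -> float:
    """Share of observations where forecast and realized return have the same sign."""
    predicted = np.asarray(predicted, dtype=float)
    realized = np.asarray(realized, dtype=float)
    return float(np.mean(np.sign(predicted) == np.sign(realized)))


def annualized_sharpe(weekly_returns, periods_per_year: int = 52) -> float:
    """Mean over sample standard deviation, scaled by sqrt(periods per year).

    Assumes a zero risk-free rate; returns NaN when the returns have no variation.
    """
    r = np.asarray(weekly_returns, dtype=float)
    sd = r.std(ddof=1)
    if sd == 0:
        return float("nan")
    return float(r.mean() / sd * np.sqrt(periods_per_year))


# Made-up numbers for illustration only, not data from the paper.
acc = directional_accuracy([0.1, -0.2, 0.3, -0.1], [0.2, 0.1, 0.1, -0.3])
sharpe = annualized_sharpe([0.01, 0.03, -0.01, 0.01])
```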

Conclusion. Machine learning generates economically significant signals in frontier markets, but predictability concentrates in thinly traded stocks where transaction costs and market depth constrain practical implementation.

Read online (HTML) · Download PDF

Key findings

Modest but real predictability. Directional accuracy of 46–53%, exceeding random chance and a random-walk benchmark.
Nonlinear models win. Random Forest, XGBoost, and the neural network beat the penalized-linear Elastic Net in portfolio terms.
Ensembles add value. Forecast combinations (mean / median) match or exceed the best individual model.
Liquidity gradient. Risk-adjusted performance rises monotonically as the universe widens to include less-liquid stocks: the Sharpe ratio climbs from 0.17 to 1.58.
Implementation caveat. The highest gross returns come from illiquid stocks, where transaction costs bite hardest, so backtest figures are an upper bound.
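
The forecast combinations mentioned in the findings above can be sketched as a cross-model mean or median of per-stock forecasts. This is a minimal illustration, assuming each model produces one return forecast per stock for a given week; the function name and array layout are hypothetical and not taken from the paper's code.

```python
import numpy as np


def combine_forecasts(forecasts, method: str = "mean") -> np.ndarray:
    """Combine forecasts across models.

    `forecasts` has shape (n_models, n_stocks); the result has shape (n_stocks,).
    """
    f = np.asarray(forecasts, dtype=float)
    if f.ndim != 2:
        raise ValueError("forecasts must be a 2-D array of shape (n_models, n_stocks)")
    if method == "mean":
        return f.mean(axis=0)
    if method == "median":
        return np.median(f, axis=0)
    raise ValueError(f"unknown method: {method!r}")


# Made-up forecasts from three models for two stocks.
model_forecasts = [
    [0.01, -0.02],
    [0.03, 0.00],
    [0.02, 0.10],
]
mean_combo = combine_forecasts(model_forecasts, "mean")
median_combo = combine_forecasts(model_forecasts, "median")
```

The median is less sensitive than the mean to a single model producing an extreme forecast, as the second stock in this toy example illustrates.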

Earlier version

Croatian · prior version

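Primjena modela strojnog učenja za predviđanje očekivanih prinosa dionica u RH (Application of Machine Learning Models to Forecasting Expected Stock Returns in the Republic of Croatia)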

An earlier, Croatian-language version of this research line on machine learning for ZSE return prediction.

Read online (HTML) · PDF · DOCX

Reproducibility: the main paper's tables are self-contained and render in full. Two figures reference chart images not bundled in this repository — add fig3_cropped.png and fig4_cropped.png to the paper/ folder and re-render to display them. The earlier Croatian paper depends on proprietary data, so its analysis code is shown but not executed.