Title: Geometric-Entropic Optimization: Integrating Optimal Transport with Riemannian Gradient Methods for Neural Network Training
Author: Ferrara, Massimiliano
Type: Article
Date available: 2026-04-21
Date issued: 2026-04-21
Year: 2026
ISSN: 0022-3239; 1573-2878
DOI: 10.1007/s10957-026-02958-8
Scopus ID: 2-s2.0-105033413245
Handle: https://hdl.handle.net/123456789/9026
URL: https://doi.org/10.1007/s10957-026-02958-8
Language: en
Rights: info:eu-repo/semantics/openAccess
Keywords: Riemannian Optimization; Optimal Transport; Neural Network Training; Sinkhorn Algorithm; Fisher Information Metric; Geometric Dynamics

Abstract: We introduce Geometric-Entropic Optimization (GEO), an algorithm for neural network training that integrates Riemannian gradient methods with entropy-regularized optimal transport. The algorithm operates on a parameter manifold equipped with a combined Fisher-Wasserstein metric and incorporates Sinkhorn-type projections to enforce distributional constraints on layer activations. We establish convergence guarantees showing that GEO achieves an $O(1/\sqrt{T}) + O(\rho^{2K})$ rate, where the first term reflects Riemannian gradient descent and the second captures the contraction of Sinkhorn iterations. Computational experiments on continuous control tasks and language modeling demonstrate consistent improvements over standard optimizers, with performance gains of approximately 20% on benchmark tasks. The theoretical framework unifies recent architectural innovations in deep learning, including manifold-constrained connections and orthogonality-preserving updates, within a coherent optimization-theoretic perspective rooted in the geometric dynamics tradition.
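The $O(\rho^{2K})$ term in the abstract comes from the linear contraction of Sinkhorn iterations. As a minimal sketch of that standard building block (not the paper's GEO-specific projection onto layer activations; all names, the regularization value, and the toy data are illustrative assumptions):

```python
import numpy as np

def sinkhorn(cost, a, b, eps=0.1, n_iters=1000):
    """Standard entropy-regularized optimal transport via Sinkhorn scaling.

    cost: (n, m) cost matrix; a, b: source/target marginals (sum to 1).
    Returns the transport plan diag(u) @ K @ diag(v).
    """
    K = np.exp(-cost / eps)                 # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):                # alternating marginal scaling
        v = b / (K.T @ u)                   # match column marginals
        u = a / (K @ v)                     # match row marginals
    return u[:, None] * K * v[None, :]

# Toy example: transport between two small point clouds.
rng = np.random.default_rng(0)
x, y = rng.normal(size=(5, 2)), rng.normal(size=(5, 2))
C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
C = C / C.max()                             # normalize cost for stability
P = sinkhorn(C, np.full(5, 0.2), np.full(5, 0.2))
```

Each iteration contracts toward the regularized optimum at a geometric rate (the $\rho$ of the abstract), so a fixed small number of inner iterations $K$ contributes only an $O(\rho^{2K})$ error to the outer Riemannian descent.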