** We present a new systematic approach to constructing spherical codes in dimensions $2^k$, based on Hopf foliations. Using the fact that a sphere $S^{2n-1}$ is foliated by manifolds $S_{\cos\eta}^{n-1} \times S_{\sin\eta}^{n-1}$, $\eta\in[0,\pi/2]$, we distribute points in dimension $2^k$ via a recursive algorithm from a basic construction in $\mathbb{R}^4$. Our procedure outperforms some current constructive methods in several small-distance regimes and constitutes a compromise between achieving a large number of codewords for a given minimum distance and effective constructiveness with low encoding computational cost. Bounds for the asymptotic density are derived and compared with other constructions. The encoding process has storage complexity $O(n)$ and time complexity $O(n \log n)$. We also propose a sub-optimal decoding procedure, which does not require storing the codebook and has time complexity $O(n \log n)$. **
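
The foliation fact used above can be checked numerically: if $x, y \in S^{n-1}$ are unit vectors, then $(\cos\eta\, x, \sin\eta\, y)$ has unit norm and so lies on $S^{2n-1}$. The sketch below illustrates only this identity for one torus leaf in $\mathbb{R}^4$. It is not the paper's point-distribution algorithm, and the function names `hopf_point`, `circle_points` and `torus_layer`, as well as the uniform grid on each circle, are assumptions made for illustration.

```python
import math

def hopf_point(eta, x, y):
    """Combine unit vectors x, y in R^n into the point (cos(eta) x, sin(eta) y) on S^{2n-1}."""
    c, s = math.cos(eta), math.sin(eta)
    return [c * xi for xi in x] + [s * yi for yi in y]

def circle_points(m):
    """m equally spaced points on the unit circle S^1 (illustrative choice)."""
    return [[math.cos(2 * math.pi * j / m), math.sin(2 * math.pi * j / m)]
            for j in range(m)]

def torus_layer(eta, m1, m2):
    """Grid of m1 * m2 points on the leaf S^1_{cos eta} x S^1_{sin eta} inside S^3."""
    return [hopf_point(eta, x, y)
            for x in circle_points(m1)
            for y in circle_points(m2)]

if __name__ == "__main__":
    pts = torus_layer(math.pi / 4, 4, 6)
    # Every point should have squared norm cos^2(eta) + sin^2(eta) = 1.
    print(len(pts), max(abs(sum(c * c for c in p) - 1.0) for p in pts))
```

Applying `hopf_point` to pairs of points that already lie on lower-dimensional spheres doubles the dimension at each step, which is the kind of recursion the abstract describes for dimensions $2^k$.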

** We propose a novel scheme for efficient Dirac mixture modeling of distributions on unit hyperspheres. A so-called hyperspherical localized cumulative distribution (HLCD) is introduced as a local and smooth characterization of the underlying continuous density in hyperspherical domains. Based on HLCD, a manifold-adapted modification of the Cram\'er-von Mises distance (HCvMD) is established to measure the statistical divergence between two Dirac mixtures of arbitrary dimensions. Given a (source) Dirac mixture with many components representing an unknown hyperspherical distribution, a (target) Dirac mixture with fewer components is obtained via matching the source in the sense of least HCvMD. As the number of target Dirac components is configurable, the underlying distribution is represented in a more efficient and informative way. Based upon this hyperspherical Dirac mixture reapproximation (HDMR), we derive a density estimation method and a recursive filter. For density estimation, a maximum likelihood method is provided to reconstruct the underlying continuous distribution in the form of a von Mises-Fisher mixture. For recursive filtering, we introduce the hyperspherical reapproximation discrete filter (HRDF) for nonlinear hyperspherical estimation of dynamic systems under unknown system noise of arbitrary form. Simulations show that the HRDF delivers superior tracking performance over filters using sequential Monte Carlo and parametric modeling. **

** In this paper, we investigate the application of mathematical optimization to the construction of cubature formulas on Wiener space, a weak approximation method for stochastic differential equations introduced by Lyons and Victoir (Cubature on Wiener Space, Proc. R. Soc. Lond. A 460, 169--198). After giving a brief review of the cubature theory on Wiener space, we show that a cubature formula of general dimension and degree can be obtained through Monte Carlo sampling and linear programming. This paper also includes an extension of the stochastic Tchakaloff theorem, which technically yields the proof of our primary result. **

** Numerical models of weather and climate critically depend on long-term stability of integrators for systems of hyperbolic conservation laws. While such stability is often obtained from (physical or numerical) dissipation terms, physical fidelity of such simulations also depends on properly preserving conserved quantities, such as energy, of the system. To address this apparent paradox, we develop a variational integrator for the shallow water equations that conserves energy, but dissipates potential enstrophy. Our approach follows the continuous selective decay framework [F. Gay-Balmaz and D. Holm. Selective decay by Casimir dissipation in inviscid fluids. Nonlinearity, 26(2):495, 2013], which enables dissipating an otherwise conserved quantity while conserving the total energy. We use this in combination with the variational discretization method [D. Pavlov, P. Mullen, Y. Tong, E. Kanso, J. Marsden and M. Desbrun. Structure-preserving discretization of incompressible fluids. Physica D: Nonlinear Phenomena, 240(6):443-458, 2011] to obtain a discrete selective decay framework. This is applied to the shallow water equations, both in the plane and on the sphere, to dissipate the potential enstrophy. The resulting scheme significantly improves the quality of the approximate solutions, enabling long-term integrations to be carried out. **

** In this paper we compute the spherical Fourier expansion coefficients for the restriction of the generalised Wendland functions from $d$-dimensional Euclidean space to the $(d-1)$-dimensional unit sphere. The development required to derive these coefficients relies heavily upon known asymptotic results for hypergeometric functions, and the final result shows that they can be expressed in closed form as a multiple of a certain $_{3}F_{2}$ hypergeometric function. Using the closed form expressions we are able to provide the precise asymptotic rates of decay for the spherical Fourier coefficients, which we observe have a close connection to the asymptotic decay rate of the corresponding Euclidean Fourier transform. **

** List-decodable codes have been an active topic in theoretical computer science since the seminal papers of M. Sudan and V. Guruswami in 1997-1998. List-decodable codes are also considered in rank-metric, subspace metric, cover-metric, pair metric and insdel metric settings. In this paper we show that rates, list-decodable radii and list sizes are closely related to the classical topic of covering codes. We prove new, simple but strong general upper bounds for list-decodable codes in general finite metric spaces, based on various covering codes of finite metric spaces. The general covering code upper bounds also apply when the volumes of the balls depend on their centers, not only on the radius. Consequently, any good upper bound on the covering radius or on the size of a covering code implies a good upper bound on the size of list-decodable codes. Our results give exponential improvements on the recent generalized Singleton upper bound in STOC 2020 for Hamming metric list-decodable codes, when the code lengths are large. Even for the list size $L=1$ case, our covering code upper bounds give highly non-trivial upper bounds on the sizes of codes with the given minimum distance. The generalized Singleton upper bound for average-radius list-decodable codes is given. The asymptotic forms of the covering code bounds can partially recover the Blinovsky bound and the combinatorial bound of Guruswami-H{\aa}stad-Sudan-Zuckerman in the Hamming metric setting. We also suggest studying combinatorial covering list-decodable codes as a natural generalization of combinatorial list-decodable codes. We apply our general covering code upper bounds to list-decodable rank-metric codes, list-decodable subspace codes, list-decodable insertion codes and list-decodable deletion codes. Some new and improved results about non-list-decodability of rank-metric codes and subspace codes are obtained. **
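
The link between covering codes and list-decodable codes rests on an elementary counting step, sketched here under stated assumptions; the precise theorems of the paper may be stronger or stated differently. Assume $(X,d)$ is a finite metric space, $C \subseteq X$ is $(r,L)$-list-decodable in the sense that every ball $B(x,r)$ contains at most $L$ codewords of $C$, and $C' \subseteq X$ is a covering code whose balls of radius $r$ cover $X$. Then every codeword of $C$ lies in some ball centered at a point of $C'$, so

$$
C = \bigcup_{c' \in C'} \bigl(C \cap B(c', r)\bigr)
\quad\Longrightarrow\quad
|C| \;\le\; \sum_{c' \in C'} \bigl|C \cap B(c', r)\bigr| \;\le\; L \cdot |C'| .
$$

The argument never uses the volume of a ball, which is consistent with the bounds remaining valid when ball volumes depend on the center.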

** We present a new algorithmic framework for grouped variable selection that is based on discrete mathematical optimization. While there exist several appealing approaches based on convex relaxations and nonconvex heuristics, we focus on optimal solutions for the $\ell_0$-regularized formulation, a problem that is relatively unexplored due to computational challenges. Our methodology covers both high-dimensional linear regression and nonparametric sparse additive modeling with smooth components. Our algorithmic framework consists of approximate and exact algorithms. The approximate algorithms are based on coordinate descent and local search, with runtimes comparable to popular sparse learning algorithms. Our exact algorithm is based on a standalone branch-and-bound (BnB) framework, which can solve the associated mixed integer programming (MIP) problem to certified optimality. By exploiting the problem structure, our custom BnB algorithm can solve to optimality problem instances with $5 \times 10^6$ features and $10^3$ observations in minutes to hours -- over $1000$ times larger than what is currently possible using state-of-the-art commercial MIP solvers. We also explore statistical properties of the $\ell_0$-based estimators. We demonstrate, theoretically and empirically, that our proposed estimators have an edge over popular group-sparse estimators in terms of statistical performance in various regimes. We provide an open-source implementation of our proposed framework. **

** Factorization of matrices where the rank of the two factors diverges linearly with their sizes has many applications in diverse areas such as unsupervised representation learning, dictionary learning or sparse coding. We consider a setting where the two factors are generated from known component-wise independent prior distributions, and the statistician observes a (possibly noisy) component-wise function of their matrix product. In the limit where the dimensions of the matrices tend to infinity, but their ratios remain fixed, we expect to be able to derive closed form expressions for the optimal mean squared error on the estimation of the two factors. However, this remains a very involved mathematical and algorithmic problem. A related, but simpler, problem is extensive-rank matrix denoising, where one aims to reconstruct a matrix with extensive but usually small rank from noisy measurements. In this paper, we approach both these problems using high-temperature expansions at fixed order parameters. This allows us to clarify how previous attempts at solving these problems failed to find an asymptotically exact solution. We provide a systematic way to derive the corrections to these existing approximations, taking into account the structure of correlations particular to the problem. Finally, we illustrate our approach in detail on the case of extensive-rank matrix denoising. We compare our results with known optimal rotationally-invariant estimators, and show how exact asymptotic calculations of the minimal error can be performed using extensive-rank matrix integrals. **

** The recently introduced polar codes constitute a breakthrough in coding theory due to their capacity-achieving property. This goes hand in hand with quasilinear construction, encoding, and successive cancellation list decoding procedures based on the Plotkin construction. The decoding algorithm can be applied with slight modifications to Reed-Muller or eBCH codes, both of which achieve the capacity of erasure channels, although the list size needed for good performance grows too fast to make the decoding practical even for moderate block lengths. The key ingredient for proving the capacity-achieving property of Reed-Muller and eBCH codes is their group of symmetries. It can be plugged into the concept of Plotkin decomposition to design various permutation decoding algorithms. Although such techniques make it possible to outperform the straightforward polar-like decoding, the complexity stays impractical. In this paper, we show that although invariance under a large automorphism group is valuable in a theoretical sense, it also ensures that the list size needed for good performance grows exponentially. We further establish the bounds that arise if we sacrifice some of the symmetries. Although the theoretical analysis of the list decoding algorithm remains an open problem, our result provides an insight into the factors that impact the decoding complexity. **
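
For readers unfamiliar with the Plotkin construction mentioned above, the following minimal sketch shows the $(u \mid u+v)$ combination over GF(2) and a recursive transform built from it. It illustrates the recursive structure only and is not the encoder or decoder analyzed in the paper. The function names `plotkin` and `plotkin_transform` are hypothetical, and bit-ordering conventions differ between references.

```python
def plotkin(u, v):
    """Plotkin (u | u+v) combination of two equal-length bit lists over GF(2)."""
    if len(u) != len(v):
        raise ValueError("u and v must have the same length")
    return u + [a ^ b for a, b in zip(u, v)]

def plotkin_transform(bits):
    """Apply the Plotkin combination recursively to a bit list of length 2^k.

    Each recursion level performs len(bits) / 2 XORs, so the whole transform
    costs O(N log N) bit operations for block length N.
    """
    n = len(bits)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(bits)
    half = n // 2
    return plotkin(plotkin_transform(bits[:half]),
                   plotkin_transform(bits[half:]))

if __name__ == "__main__":
    print(plotkin_transform([1, 0, 0, 0]))
```

Because the underlying $2 \times 2$ kernel squares to the identity over GF(2), applying `plotkin_transform` twice returns the input, which gives a quick sanity check for any implementation along these lines.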

** The Variational Auto-Encoder (VAE) is one of the most used unsupervised machine learning models. But although the default choice of a Gaussian distribution for both the prior and posterior represents a mathematically convenient distribution often leading to competitive results, we show that this parameterization fails to model data with a latent hyperspherical structure. To address this issue we propose using a von Mises-Fisher (vMF) distribution instead, leading to a hyperspherical latent space. Through a series of experiments we show how such a hyperspherical VAE, or $\mathcal{S}$-VAE, is more suitable for capturing data with a hyperspherical latent structure, while outperforming a normal, $\mathcal{N}$-VAE, in low dimensions on other data types. **