
Rdimtools - Dimension Reduction and Estimation Methods
We provide linear and nonlinear dimension reduction techniques. Intrinsic dimension estimation methods for exploratory analysis are also provided. For more details on the package, see the paper by You and Shung (2022) <doi:10.1016/j.simpa.2022.100414>.
dimension-estimation dimension-reduction manifold-learning subspace-learning openblas cpp openmp
8.80 score 56 stars 8 dependents 226 scripts 1.0k downloads
T4transport - Tools for Computational Optimal Transport
Transport theory has seen much success in many fields of statistics and machine learning. We provide a variety of algorithms to compute Wasserstein distances, barycenters, and related quantities. See Peyré and Cuturi (2019) <doi:10.1561/2200000073> for a general exposition of computational optimal transport.
openblas cpp openmp
6.28 score 7 stars 1 dependents 13 scripts 259 downloads
ADMM - Algorithms using Alternating Direction Method of Multipliers
Provides algorithms to solve popular optimization problems in statistics, such as regression or denoising, based on the Alternating Direction Method of Multipliers (ADMM). See Boyd et al. (2010) <doi:10.1561/2200000016> for a complete introduction to the method.
openblas cpp openmp
6.21 score 7 stars 9 dependents 17 scripts 665 downloads
maotai - Tools for Matrix Algebra, Optimization and Inference
The matrix is a universal and sometimes primary object in applied mathematics and statistics. We provide a number of algorithms for selected problems in optimization and statistical inference. For a general exposition of the topic with a focus on the statistical context, see the book by Banerjee and Roy (2014, ISBN:9781420095388).
openblas cpp openmp
6.13 score 8 stars 9 dependents 24 scripts 1.0k downloads
NetworkDistance - Distance Measures for Networks
Networks are a prevalent form of data structure in many fields. With networks as objects of analysis, many distance or metric measures have been proposed to define the concept of similarity between two networks. We provide a number of distance measures for networks. See Jurman et al. (2011) <doi:10.3233/978-1-60750-692-8-227> for an overview of the spectral class of inter-graph distance measures.
distance network network-analysis openblas cpp openmp
5.59 score 9 stars 1 dependents 29 scripts 374 downloads
Riemann - Learning with Data on Riemannian Manifolds
We provide a variety of algorithms for manifold-valued data, including Fréchet summaries, hypothesis testing, clustering, visualization, and other learning tasks. See Bhattacharya and Bhattacharya (2012) <doi:10.1017/CBO9781139094764> for a general exposition of statistics on manifolds.
openblas cpp openmp
5.43 score 12 stars 20 scripts 11k downloads
SHT - Statistical Hypothesis Testing Toolbox
We provide a collection of statistical hypothesis testing procedures, ranging from classical to modern methods, for non-trivial settings such as high-dimensional scenarios. For a general treatment of statistical hypothesis testing, see the book by Lehmann and Romano (2005) <doi:10.1007/0-387-27605-X>.
openblas cpp openmp
5.11 score 6 stars 1 dependents 48 scripts 379 downloads
mclustcomp - Measures for Comparing Clusters
Given a set of data points, a clustering is defined as a partition of the points into disjoint sets, so that no two sets in the partition share an element. This package provides 25 measures that play a role similar to a distance or metric, quantifying the similarity of two clusterings (partitions). For a more detailed description, see Meila, M. (2005) <doi:10.1145/1102351.1102424>.
cpp
4.88 score 2 stars 12 dependents 21 scripts 831 downloads
graphon - A Collection of Graphon Estimation Methods
Provides a not-so-comprehensive list of methods for estimating a graphon, a symmetric measurable function, from one or multiple observed networks. For a detailed introduction to graphons and popular estimation techniques, see the paper by Orbanz, P. and Roy, D.M. (2014) <doi:10.1109/TPAMI.2014.2334607>. It also contains several auxiliary functions for generating sample networks using various network models and graphons.
4.87 score 8 stars 2 dependents 31 scripts 341 downloads
CovTools - Statistical Tools for Covariance Analysis
Covariance is of universal prevalence across various disciplines within statistics. We provide a rich collection of geometric and inferential tools for convenient analysis of covariance structures, covering topics including distance measures, mean covariance estimators, covariance hypothesis tests for one-sample and two-sample cases, and covariance estimation. For an introduction to covariance in multivariate statistical analysis, see Schervish (1987) <doi:10.1214/ss/1177013111>.
covariance covariance-estimation openblas cpp
4.63 score 14 stars 61 scripts 347 downloads
T4cluster - Tools for Cluster Analysis
Cluster analysis is one of the most fundamental problems in data science. We provide a variety of algorithms, from clustering to learning on the space of partitions. See Hennig, Meila, and Rocci (2016, ISBN:9781466551886) for a general exposition of cluster analysis.
openblas cpp openmp
4.54 score 7 stars 3 dependents 11 scripts 290 downloads
RiemBase - Functions and C++ Header Files for Computation on Manifolds
We provide a number of algorithms to estimate fundamental statistics, including the Fréchet mean and geometric median, for manifold-valued data. C++ header files are also included that implement elementary operations on manifolds such as Sphere, Grassmann, and others. See Bhattacharya and Bhattacharya (2012) <doi:10.1017/CBO9781139094764> if you are interested in statistics on manifolds, and Absil et al. (2007, ISBN:9780691132983) on computational aspects of optimization on matrix manifolds.
openblas cpp openmp
4.26 score 4 stars 1 dependents 30 scripts 293 downloads
Rlinsolve - Iterative Solvers for (Sparse) Linear System of Equations
Solving a system of linear equations is one of the most fundamental computational problems in many fields of mathematical study, such as regression problems in statistics or numerical partial differential equations. We provide basic stationary iterative solvers such as the Jacobi, Gauss-Seidel, Successive Over-Relaxation, and SSOR methods. Nonstationary methods, also known as Krylov subspace methods, are also provided. Sparse matrix computation is supported as well, so that solving large and sparse linear systems becomes manageable using the 'Matrix' package along with 'RcppArmadillo'. For a more detailed description, see the book by Saad (2003) <doi:10.1137/1.9780898718003>.
openblas cpp openmp
3.72 score 4 stars 1 dependents 44 scripts 448 downloads
tvR - Total Variation Regularization
Provides tools for denoising noisy signals and images via Total Variation Regularization. Reducing the total variation of a given signal is known to remove spurious detail while preserving essential structural details. For the seminal work on the topic, see Rudin et al. (1992) <doi:10.1016/0167-2789(92)90242-F>.
openblas cpp openmp
3.60 score 4 stars 7 scripts 218 downloads
ROptSpace - Matrix Reconstruction from a Few Entries
Matrix reconstruction, also known as matrix completion, is the task of inferring missing entries of a partially observed matrix. This package provides a method called OptSpace, which was proposed by Keshavan, R.H., Oh, S., and Montanari, A. (2009) <doi:10.1109/ISIT.2009.5205567> for the case where the matrix is assumed to be low-rank.
cpp
3.38 score 2 stars 4 dependents 6 scripts 380 downloads
filling - Matrix Completion, Imputation, and Inpainting Methods
Filling in the missing entries of partially observed data is one of the fundamental problems in various disciplines of mathematical science. In many cases, the data of interest take the canonical form of a matrix, so the problem is posed as filling in the missing entries of a matrix under preset assumptions and models. We provide a collection of methods from multiple disciplines under Matrix Completion, Imputation, and Inpainting. See Davenport and Romberg (2016) <doi:10.1109/JSTSP.2016.2539100> for an overview of the topic.
openblas cpp openmp
3.02 score 4 stars 26 scripts 320 downloads
TDAkit - Toolkit for Topological Data Analysis
Topological data analysis studies the structure and shape of data using topological features. We provide a variety of algorithms to learn with persistent homology of the data based on functional summaries for clustering, hypothesis testing, visualization, and others. We refer to Wasserman (2018) <doi:10.1146/annurev-statistics-031017-100045> for a statistical perspective on the topic.
openblas cpp openmp
2.95 score 3 stars 1 dependents 5 scripts 281 downloads
Zseq - Integer Sequence Generator
Generates well-known integer sequences. The 'gmp' package is used for computing with arbitrarily large numbers. Every function has a hyperlink to its corresponding entry in the OEIS (The On-Line Encyclopedia of Integer Sequences) on its help page. For interested readers, see Sloane and Plouffe (1995, ISBN:978-0125586306).
2.63 score 43 scripts 414 downloads
SBmedian - Scalable Bayes with Median of Subset Posteriors
Median-of-means is a generic yet powerful framework for scalable and robust estimation. A framework for Bayesian analysis, called the M-posterior, estimates a median of subset posterior measures. For a general exposition of the topic, see the paper by Minsker (2015) <doi:10.3150/14-BEJ645>.
openblas cpp openmp
2.00 score 1 scripts 229 downloads
repsim - Measures of Representational Similarity Across Models
Provides a collection of methods for quantifying representational similarity between learned features or multivariate data. The package offers an efficient 'C++' backend, designed for applications in machine learning, computational neuroscience, and multivariate statistics. See Klabunde et al. (2025) <doi:10.1145/3728458> for a comprehensive overview of the topic.
cpp
1.00 score 4 scripts 180 downloads