kisungyou

Rdimtools - Dimension Reduction and Estimation Methods

We provide linear and nonlinear dimension reduction techniques. Intrinsic dimension estimation methods for exploratory analysis are also provided. For more details on the package, see the paper by You and Shung (2022) <doi:10.1016/j.simpa.2022.100414>.

dimension-estimation, dimension-reduction, manifold-learning, subspace-learning, openblas, cpp, openmp

8.80 score 56 stars 8 dependents 226 scripts 1.0k downloads

T4transport - Tools for Computational Optimal Transport

Transport theory has seen much success in many fields of statistics and machine learning. We provide a variety of algorithms to compute the Wasserstein distance, barycenters, and related quantities. See Peyré and Cuturi (2019) <doi:10.1561/2200000073> for a general exposition of computational optimal transport.

openblas, cpp, openmp

6.28 score 7 stars 1 dependents 13 scripts 259 downloads
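
The Wasserstein distance named in the T4transport description has a simple form in one special case, which may help make the idea concrete. The sketch below is not the package's implementation; it assumes two one-dimensional samples of equal size with uniform weights, where the 1-Wasserstein distance reduces to the mean absolute difference between sorted values. The function name `wasserstein_1d` is hypothetical.

```python
def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size 1D samples.

    Assumes uniform weights on every point. In one dimension the optimal
    transport plan matches the i-th smallest x to the i-th smallest y.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    xs_sorted = sorted(xs)
    ys_sorted = sorted(ys)
    # Average cost of moving each sorted x onto its matched sorted y.
    return sum(abs(x - y) for x, y in zip(xs_sorted, ys_sorted)) / len(xs)


# Shifting every point by 5 costs exactly 5 per unit of mass.
distance = wasserstein_1d([3.0, 0.0, 1.0], [6.0, 8.0, 5.0])
```

General cases (unequal sizes, non-uniform weights, higher dimensions) require solving a transport problem and are not covered by this shortcut.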

ADMM - Algorithms using Alternating Direction Method of Multipliers

Provides algorithms to solve popular optimization problems in statistics, such as regression or denoising, based on the Alternating Direction Method of Multipliers (ADMM). See Boyd et al (2010) <doi:10.1561/2200000016> for a complete introduction to the method.

openblas, cpp, openmp

6.21 score 7 stars 9 dependents 17 scripts 665 downloads

maotai - Tools for Matrix Algebra, Optimization and Inference

The matrix is a universal and sometimes primary object/unit in applied mathematics and statistics. We provide a number of algorithms for selected problems in optimization and statistical inference. For a general exposition of the topic with a focus on the statistical context, see the book by Banerjee and Roy (2014, ISBN:9781420095388).

openblas, cpp, openmp

6.13 score 8 stars 9 dependents 24 scripts 1.0k downloads

NetworkDistance - Distance Measures for Networks

Networks are a prevalent form of data structure in many fields. Many distance or metric measures have been proposed to define the concept of similarity between two networks. We provide a number of distance measures for networks. See Jurman et al (2011) <doi:10.3233/978-1-60750-692-8-227> for an overview of the spectral class of inter-graph distance measures.

distance, network, network-analysis, openblas, cpp, openmp

5.59 score 9 stars 1 dependents 29 scripts 374 downloads

Riemann - Learning with Data on Riemannian Manifolds

We provide a variety of algorithms for manifold-valued data, including Fréchet summaries, hypothesis testing, clustering, visualization, and other learning tasks. See Bhattacharya and Bhattacharya (2012) <doi:10.1017/CBO9781139094764> for a general exposition of statistics on manifolds.

openblas, cpp, openmp

5.43 score 12 stars 20 scripts 11k downloads

SHT - Statistical Hypothesis Testing Toolbox

We provide a collection of statistical hypothesis testing procedures, ranging from classical to modern methods for non-trivial settings such as high-dimensional scenarios. For a general treatment of statistical hypothesis testing, see the book by Lehmann and Romano (2005) <doi:10.1007/0-387-27605-X>.

openblas, cpp, openmp

5.11 score 6 stars 1 dependents 48 scripts 379 downloads

mclustcomp - Measures for Comparing Clusters

Given a set of data points, a clustering is defined as a partition into disjoint sets, so that no two sets in the partition share an element. This package provides 25 methods that play a role somewhat similar to a distance or metric by measuring the similarity of two clusterings, or partitions. For a more detailed description, see Meila, M. (2005) <doi:10.1145/1102351.1102424>.

cpp

4.88 score 2 stars 12 dependents 21 scripts 831 downloads
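
To illustrate what a similarity measure between two clusterings looks like, the sketch below computes the Rand index, a classical measure of this kind. It is offered as an example of the general idea in the mclustcomp description, not as the package's code or interface; the function name `rand_index` is hypothetical. The index is the fraction of point pairs on which the two clusterings agree, that is, pairs placed together in both or apart in both.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Rand index between two clusterings given as label sequences.

    Returns a value in [0, 1]; 1 means the partitions are identical
    up to a renaming of the cluster labels.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have equal length")
    if len(labels_a) < 2:
        raise ValueError("need at least two points to form a pair")
    pairs = list(combinations(range(len(labels_a)), 2))
    # A pair agrees when both clusterings make the same together/apart call.
    agreements = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


same_partition = rand_index([0, 0, 1, 1], [1, 1, 0, 0])  # labels renamed only
crossed_partition = rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

Because only co-membership of pairs matters, relabeling the clusters does not change the score.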

graphon - A Collection of Graphon Estimation Methods

Provides a not-so-comprehensive list of methods for estimating a graphon, a symmetric measurable function, from one or more observed networks. For a detailed introduction to graphons and popular estimation techniques, see the paper by Orbanz, P. and Roy, D.M. (2014) <doi:10.1109/TPAMI.2014.2334607>. It also contains several auxiliary functions for generating sample networks using various network models and graphons.

4.87 score 8 stars 2 dependents 31 scripts 341 downloads

CovTools - Statistical Tools for Covariance Analysis

Covariance is of universal prevalence across various disciplines within statistics. We provide a rich collection of geometric and inferential tools for convenient analysis of covariance structures, with topics including distance measures, the mean covariance estimator, covariance hypothesis tests for one-sample and two-sample cases, and covariance estimation. For an introduction to covariance in multivariate statistical analysis, see Schervish (1987) <doi:10.1214/ss/1177013111>.

covariance, covariance-estimation, openblas, cpp

4.63 score 14 stars 61 scripts 347 downloads

T4cluster - Tools for Cluster Analysis

Cluster analysis is one of the most fundamental problems in data science. We provide a variety of algorithms, from clustering to learning on the space of partitions. See Hennig, Meila, and Rocci (2016, ISBN:9781466551886) for a general exposition of cluster analysis.

openblas, cpp, openmp

4.54 score 7 stars 3 dependents 11 scripts 290 downloads

RiemBase - Functions and C++ Header Files for Computation on Manifolds

We provide a number of algorithms to estimate fundamental statistics, including the Fréchet mean and geometric median, for manifold-valued data. C++ header files are also included that implement elementary operations on manifolds such as Sphere, Grassmann, and others. See Bhattacharya and Bhattacharya (2012) <doi:10.1017/CBO9781139094764> if you are interested in statistics on manifolds, and Absil et al (2007, ISBN:9780691132983) on computational aspects of optimization on matrix manifolds.

openblas, cpp, openmp

4.26 score 4 stars 1 dependents 30 scripts 293 downloads

Rlinsolve - Iterative Solvers for (Sparse) Linear System of Equations

Solving a system of linear equations is one of the most fundamental computational problems in many fields of mathematical study, such as regression problems in statistics or numerical partial differential equations. We provide basic stationary iterative solvers such as the Jacobi, Gauss-Seidel, Successive Over-Relaxation, and SSOR methods. Nonstationary methods, also known as Krylov subspace methods, are also provided. Sparse matrix computation is supported as well, so that large and sparse linear systems can be solved using the 'Matrix' package along with 'RcppArmadillo'. For a more detailed description, see the book by Saad (2003) <doi:10.1137/1.9780898718003>.

openblas, cpp, openmp

3.72 score 4 stars 1 dependents 44 scripts 448 downloads
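
As a rough picture of what a stationary iterative solver does, here is a minimal sketch of the Jacobi method named in the Rlinsolve description. It is a dense, pure-Python illustration and not the package's code: the function name `jacobi`, its parameters, and the stopping rule are assumptions made for this example. Each sweep solves the i-th equation for the i-th unknown using the previous iterate for all other unknowns.

```python
def jacobi(A, b, tol=1e-10, max_iter=1000):
    """Approximately solve A x = b by Jacobi iteration.

    A is a list of rows with nonzero diagonal entries. Convergence is
    guaranteed for strictly diagonally dominant A, not in general.
    """
    n = len(b)
    x = [0.0] * n  # initial guess
    for _ in range(max_iter):
        # Update every component from the previous iterate only.
        x_new = [
            (b[i] - sum(A[i][j] * x[j] for j in range(n) if j != i)) / A[i][i]
            for i in range(n)
        ]
        # Stop once successive iterates are nearly identical.
        if max(abs(x_new[i] - x[i]) for i in range(n)) < tol:
            return x_new
        x = x_new
    return x  # best effort if max_iter is reached


# Diagonally dominant system whose exact solution is x = (1, 2).
A = [[4.0, 1.0], [2.0, 5.0]]
b = [6.0, 12.0]
solution = jacobi(A, b)
```

Gauss-Seidel differs only in using already-updated components within the same sweep, which often speeds up convergence.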

tvR - Total Variation Regularization

Provides tools for denoising noisy signals and images via Total Variation Regularization. Reducing the total variation of the given signal is known to remove spurious detail while preserving essential structural details. For the seminal work on the topic, see Rudin et al (1992) <doi:10.1016/0167-2789(92)90242-F>.

openblas, cpp, openmp

3.60 score 4 stars 7 scripts 218 downloads

ROptSpace - Matrix Reconstruction from a Few Entries

Matrix reconstruction, also known as matrix completion, is the task of inferring missing entries of a partially observed matrix. This package provides a method called OptSpace, which was proposed by Keshavan, R.H., Oh, S., and Montanari, A. (2009) <doi:10.1109/ISIT.2009.5205567> for a case under low-rank assumption.

cpp

3.38 score 2 stars 4 dependents 6 scripts 380 downloads

filling - Matrix Completion, Imputation, and Inpainting Methods

Filling in the missing entries of partially observed data is one of the fundamental problems in various disciplines of mathematical science. In many cases, the data of interest take the canonical form of a matrix, so the problem is posed as filling in the missing entries of a matrix under preset assumptions and models. We provide a collection of methods from multiple disciplines under Matrix Completion, Imputation, and Inpainting. See Davenport and Romberg (2016) <doi:10.1109/JSTSP.2016.2539100> for an overview of the topic.

openblas, cpp, openmp

3.02 score 4 stars 26 scripts 320 downloads

TDAkit - Toolkit for Topological Data Analysis

Topological data analysis studies the structure and shape of data using topological features. We provide a variety of algorithms to learn with persistent homology of the data based on functional summaries for clustering, hypothesis testing, visualization, and others. We refer to Wasserman (2018) <doi:10.1146/annurev-statistics-031017-100045> for a statistical perspective on the topic.

openblas, cpp, openmp

2.95 score 3 stars 1 dependents 5 scripts 281 downloads

Zseq - Integer Sequence Generator

Generates well-known integer sequences. The 'gmp' package is used for computing with arbitrarily large numbers. Every function has a hyperlink to its corresponding item in OEIS (The On-Line Encyclopedia of Integer Sequences) in the function help page. For interested readers, see Sloane and Plouffe (1995, ISBN:978-0125586306).

2.63 score 43 scripts 414 downloads

SBmedian - Scalable Bayes with Median of Subset Posteriors

Median-of-means is a generic yet powerful framework for scalable and robust estimation. One such framework for Bayesian analysis, called the M-posterior, estimates a median of subset posterior measures. For a general exposition of the topic, see the paper by Minsker (2015) <doi:10.3150/14-BEJ645>.

openblas, cpp, openmp

2.00 score 1 scripts 229 downloads
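
The median-of-means idea behind the SBmedian description can be shown on plain numbers. The sketch below is not the M-posterior and not the package's code; it only illustrates the scalar estimator, with a hypothetical function name `median_of_means`. The data are split into blocks, each block is averaged, and the median of the block means is reported. For simplicity the blocks are taken in the given order and any leftover values are dropped, which is an assumption of this example.

```python
import statistics


def median_of_means(values, n_blocks):
    """Median-of-means estimate of the mean of `values`.

    Splits the data into n_blocks consecutive blocks of equal size,
    averages each block, and returns the median of those averages.
    """
    if not 1 <= n_blocks <= len(values):
        raise ValueError("n_blocks must be between 1 and len(values)")
    block_size = len(values) // n_blocks
    block_means = [
        statistics.fmean(values[k * block_size:(k + 1) * block_size])
        for k in range(n_blocks)
    ]
    # A single corrupted block cannot drag the median far.
    return statistics.median(block_means)


data = [1.0, 2.0, 3.0, 4.0, 5.0, 1000.0]  # one gross outlier
robust_estimate = median_of_means(data, n_blocks=3)
plain_mean = statistics.fmean(data)
```

Here the block means are 1.5, 3.5, and 502.5, so the estimate is 3.5, while the ordinary mean is pulled to about 169 by the outlier.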

repsim - Measures of Representational Similarity Across Models

Provides a collection of methods for quantifying representational similarity between learned features or multivariate data. The package offers an efficient 'C++' backend, designed for applications in machine learning, computational neuroscience, and multivariate statistics. See Klabunde et al. (2025) <doi:10.1145/3728458> for a comprehensive overview of the topic.

cpp

1.00 score 4 scripts 180 downloads