## Thursday, April 28, 2016

### Video: Sparse Identification of Nonlinear Dynamics (SINDy)

Hi Igor,

I am attaching a link to a YouTube video abstract of our recent paper on sparse identification of nonlinear dynamics (SINDy) in PNAS.  Hope you enjoy it, and please feel free to share it with anyone who may be interested.

Also, I saw that you mentioned our algorithm on your blog — thanks very much!!  It is awesome to hear that others like the work.

Paper [open access]:  http://www.pnas.org/content/113/15/3932.abstract

Best Regards,
Steve
Thanks Steve ! Here is the video:
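For readers who want to peek under the hood: SINDy regresses measured time derivatives onto a library of candidate nonlinearities and prunes small coefficients via sequentially thresholded least squares. Here is a minimal NumPy sketch — the quadratic library and the threshold value below are illustrative choices, not the paper's exact setup:

```python
import numpy as np

def sindy(X, X_dot, threshold=0.5, n_iter=10):
    """Sequentially thresholded least squares: fit X_dot ~ Theta(X) @ Xi
    with a sparse coefficient matrix Xi."""
    n = X.shape[1]
    # Candidate library Theta(X): constant, linear, and quadratic terms.
    cols = [np.ones((X.shape[0], 1)), X]
    for i in range(n):
        for j in range(i, n):
            cols.append((X[:, i] * X[:, j])[:, None])
    Theta = np.hstack(cols)
    Xi, *_ = np.linalg.lstsq(Theta, X_dot, rcond=None)
    for _ in range(n_iter):
        small = np.abs(Xi) < threshold          # prune small coefficients...
        Xi[small] = 0.0
        for k in range(X_dot.shape[1]):         # ...and refit the survivors
            big = ~small[:, k]
            if big.any():
                Xi[big, k], *_ = np.linalg.lstsq(Theta[:, big], X_dot[:, k],
                                                 rcond=None)
    return Theta, Xi
```

The nonzero pattern of `Xi` then names the active terms of the identified governing equations.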

Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

### Agnostic Estimation of Mean and Covariance

The reason we have a zoo of matrix factorizations stems in part from the need to deal with different kinds of adversarial noise. From the paper:

The Achilles heel of algorithms for generative models is the assumption that data is exactly from the model. This is crucial for known guarantees, and relaxations of it are few and specialized, e.g., in ICA, data could be noisy, but the noise itself is assumed to be Gaussian. Assumptions about rank and sparsity are made in a technique that is now called Robust PCA [CSPW11, CLMW11]. There have been attempts [Kwa08, MT+11] at achieving robustness by L1 minimization, but they don’t give any error bounds on the output produced. A natural, important and wide open problem is estimating the parameters of generative models in the presence of arbitrary, i.e., malicious noise, a setting usually referred to as agnostic learning. The simplest version of this problem is to estimate a single Gaussian in the presence of malicious noise. Alternatively, this can be posed as the problem of finding a best-fit Gaussian to data or agnostically learning a single Gaussian.

Agnostic Estimation of Mean and Covariance  by Kevin A. Lai, Anup B. Rao, Santosh Vempala

We consider the problem of estimating the mean and covariance of a distribution from iid samples in $\mathbb{R}^n$, in the presence of an $\eta$ fraction of malicious noise; this is in contrast to much recent work where the noise itself is assumed to be from a distribution of known type. The agnostic problem includes many interesting special cases, e.g., learning the parameters of a single Gaussian (or finding the best-fit Gaussian) when $\eta$ fraction of data is adversarially corrupted, agnostically learning a mixture of Gaussians, agnostic ICA, etc. We present polynomial-time algorithms to estimate the mean and covariance with error guarantees in terms of information-theoretic lower bounds. As a corollary, we also obtain an agnostic algorithm for Singular Value Decomposition.
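To see why the agnostic setting is delicate: the empirical mean is ruined by even a single far-away corrupted point. A naive baseline — prune the points farthest from the running mean, then average — already helps against gross outliers. The paper's algorithms are considerably more sophisticated and come with information-theoretic guarantees; this sketch is only for intuition:

```python
import numpy as np

def pruned_mean(X, eta):
    """Naive robust mean estimate: iteratively discard roughly an eta
    fraction of the points farthest from the running mean, then average
    the remainder. (Only a baseline, not the Lai-Rao-Vempala algorithm.)"""
    X = np.asarray(X, dtype=float)
    n_remove = int(eta * len(X))
    for _ in range(n_remove):
        mu = X.mean(axis=0)
        worst = np.argmax(np.linalg.norm(X - mu, axis=1))
        X = np.delete(X, worst, axis=0)   # drop the farthest point
    return X.mean(axis=0)
```

With 5% of the samples replaced by a distant cluster, the naive mean drifts far from the truth while the pruned mean stays close; subtler corruptions (planted along a direction, at moderate distance) defeat this baseline, which is exactly where the paper's machinery is needed.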

and, previously, the Recursive Fourier PCA algorithm:

Max vs Min: Tensor Decomposition and ICA with nearly Linear Sample Complexity by Santosh S. Vempala, Ying Xiao

We present a simple, general technique for reducing the sample complexity of matrix and tensor decomposition algorithms applied to distributions. We use the technique to give a polynomial-time algorithm for standard ICA with sample complexity nearly linear in the dimension, thereby improving substantially on previous bounds. The analysis is based on properties of random polynomials, namely the spacings of an ensemble of polynomials. Our technique also applies to other applications of tensor decompositions, including spherical Gaussian mixture models.


## Wednesday, April 27, 2016

### A Tutorial on Libra: R package for the Linearized Bregman Algorithm in High Dimensional Statistics - implementation -

Here is an implementation in R of an $\ell_1$ solver. All compressive sensing solver implementations show up under the CS and implementation tags.

A Tutorial on Libra: R package for the Linearized Bregman Algorithm in High Dimensional Statistics by Jiechao Xiong, Feng Ruan, Yuan Yao

The R package, Libra, stands for the LInearized BRegman Algorithm in high dimensional statistics. The Linearized Bregman Algorithm is a simple iterative procedure that generates sparse regularization paths of model estimation; it was first discovered in applied mathematics for image restoration and is particularly suitable for parallel implementation in large-scale problems. The limit of such an algorithm is a sparsity-restricted gradient descent flow, called the Inverse Scale Space, evolving along a parsimonious path of sparse models from the null model to overfitting ones. In sparse linear regression, the dynamics with early stopping regularization can provably meet the unbiased Oracle estimator under nearly the same condition as LASSO, while the latter is biased. Despite their successful applications, the statistical consistency theory of such dynamical algorithms remains largely open except for some recent progress on linear regression. In this tutorial, algorithmic implementations in the package are discussed for several widely used sparse models in statistics, including linear regression, logistic regression, and several graphical models (Gaussian, Ising, and Potts). Besides the simulation examples, various application cases are demonstrated, with real-world datasets from diabetes, publications of COPSS award winners, as well as social networks of two Chinese classic novels, Journey to the West and Dream of the Red Chamber.

The package is here: https://cran.r-project.org/web/packages/Libra/index.html
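For intuition, the basic linearized Bregman iteration for the sparse linear model alternates a gradient step on the residual with a soft-thresholding step; Libra itself covers logistic and graphical models as well. A minimal NumPy sketch of one common parameterization (the `kappa` and step-size choices below are illustrative; large `kappa` pushes the path toward a minimum-$\ell_1$ solution):

```python
import numpy as np

def linearized_bregman(A, b, kappa=10.0, alpha=None, n_iter=2000):
    """Linearized Bregman iteration: a gradient step on the residual in a
    dual variable z, followed by soft-thresholding back to the primal x."""
    if alpha is None:
        # Step size: alpha * kappa * ||A||^2 must stay small for stability.
        alpha = 1.0 / (kappa * np.linalg.norm(A, 2) ** 2)
    z = np.zeros(A.shape[1])
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        z += alpha * (A.T @ (b - A @ x))                    # gradient step
        x = kappa * np.sign(z) * np.maximum(np.abs(z) - 1.0, 0.0)  # shrink
    return x
```

Stopping the loop early yields the sparse regularization path the abstract describes: iterates move from the null model toward denser fits.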


### An information theoretic formulation of the Dictionary Learning and Sparse Coding Problems on Statistical Manifolds

An information theoretic formulation of the Dictionary Learning and Sparse Coding Problems on Statistical Manifolds by Rudrasis Chakraborty, Monami Banerjee, Victoria Crawford, Baba C. Vemuri

In this work, we propose a novel information theoretic framework for dictionary learning (DL) and sparse coding (SC) on a statistical manifold (the manifold of probability distributions). Unlike the traditional DL and SC framework, our new formulation {\it does not explicitly incorporate any sparsity inducing norm in the cost function but yet yields SCs}. Moreover, we extend this framework to the manifold of symmetric positive definite matrices, $\mathcal{P}_n$. Our algorithm approximates the data points, which are probability distributions, by the weighted Kullback-Leibler center (KL-center) of the dictionary atoms. The KL-center is the minimizer of the maximum KL-divergence between the unknown center and members of the set whose center is being sought. Further, {\it we prove that this KL-center is a sparse combination of the dictionary atoms}. Since the data reside on a statistical manifold, the data fidelity term cannot be as simple as in the case of vector-space data. We therefore employ the geodesic distance between the data and a sparse approximation of the data element. This cost function is minimized using an accelerated gradient descent algorithm. An extensive set of experimental results shows the effectiveness of our proposed framework. We present several experiments involving a variety of classification problems in Computer Vision applications. Further, we demonstrate the performance of our algorithm by comparing it to several state-of-the-art methods, both in terms of classification accuracy and sparsity.


## Tuesday, April 26, 2016

### Paris Machine Learning Meetup #12 Season 3: ML Hardware

Mozilla France is hosting us, and Mathworks is sponsoring the networking event afterwards. Thank you to both.

Hardware is becoming key to performing computations at scale and to specific techniques such as Deep Learning. Today's Paris Machine Learning meetup focuses on the hardware and software combinations that currently run, or may eventually run, this Machine Learning/Deep Learning/AI infrastructure.
Streaming video is below:

Right now, these are the presentations we expect tonight. Undoubtedly, we will have more meetups on this topic in the future.
Arjun Bansal, Nervana Systems, Nervana and the Future of Computing
"Nervana delivers fast, scalable AI on demand using cloud computing, deep learning and custom hardware. These trends have recently been identified as the basis for the future of computing in a recent issue of The Economist. In this talk, I will provide an overview of Nervana’s technology."

Marc Wolff, Amine El Helou, Mathworks, A distributed Random Forest algorithm for credit risk computations
The Random Forest algorithm consists in training several decision trees, generally in parallel. One well-known limitation of this method is the need to load, or even replicate, the entire dataset in memory. For large data volumes, this can become problematic. One possible alternative is to develop a distributed decision-tree algorithm relying on SPMD (Single Program Multiple Data) parallelism and the MPI (Message Passing Interface) API. This approach makes the best use of the available processing power (multi-core machine or compute cluster) and operates on data distributed in memory. We will present an implementation of this kind and its application to a credit risk analysis for a banking group (the Operational Risk Group of Crédit Agricole).

Olivier Guillaune, Which computer for my machine learning?
"When the learning phase begins to take too long, what are the current hardware options for accelerating my computations, and which components should go into a reliable, powerful compute server at the best cost?"

Igor Carron, LightOn.io, Approximating kernels at the speed of light

We will talk about how we use optics to perform some specific operations of interest in Machine Learning. The talk is based on this preprint.

André Reinald, Mozilla

As an introduction, we will explain that the link between machine learning and Mozilla is personal data, to which we pay close attention and which we would like to see handled with a certain ethics. We will cover three topics:
1. a look back at our attempts in "content services",
2. projects in connected objects (limited to home automation for now),
3. the peerstorage.org project


## Monday, April 25, 2016

### Sketching and Neural Networks

Two subjects we rarely used to see in the same field. Interesting!

Sketching and Neural Networks by Amit Daniely, Nevena Lazic, Yoram Singer, Kunal Talwar

High-dimensional sparse data present computational and statistical challenges for supervised learning. We propose compact linear sketches for reducing the dimensionality of the input, followed by a single layer neural network. We show that any sparse polynomial function can be computed, on nearly all sparse binary vectors, by a single layer neural network that takes a compact sketch of the vector as input. Consequently, when a set of sparse binary vectors is approximately separable using a sparse polynomial, there exists a single-layer neural network that takes a short sketch as input and correctly classifies nearly all the points. Previous work has proposed using sketches to reduce dimensionality while preserving the hypothesis class. However, the sketch size has an exponential dependence on the degree in the case of polynomial classifiers. In stark contrast, our approach of using improper learning, using a larger hypothesis class allows the sketch size to have a logarithmic dependence on the degree. Even in the linear case, our approach allows us to improve on the pesky $O({1}/{{\gamma}^2})$ dependence of random projections, on the margin $\gamma$. We empirically show that our approach leads to more compact neural networks than related methods such as feature hashing at equal or better performance.
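The building block here is a compact linear sketch of the sparse input. A count-sketch-style hash, shown below with explicit random tables for clarity (real implementations use cheap hash functions), is linear in the input — which is what lets a downstream network consume the short sketch in place of the full vector:

```python
import numpy as np

def count_sketch(indices, values, width, n=10000, seed=0):
    """Hash a sparse n-dimensional vector (given as its nonzero indices and
    values) down to `width` signed buckets. The map is linear, and inner
    products are preserved in expectation over the random hashes."""
    rng = np.random.default_rng(seed)
    bucket = rng.integers(0, width, size=n)   # which bucket each coordinate hits
    sign = rng.choice([-1.0, 1.0], size=n)    # random sign per coordinate
    s = np.zeros(width)
    for i, v in zip(indices, values):
        s[bucket[i]] += sign[i] * v
    return s
```

Because the sketch is linear, sketching a sum of vectors equals summing their sketches, and a single layer taking `s` as input can emulate functions of the original sparse coordinates.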


## Thursday, April 21, 2016

### Compact all-CMOS spatiotemporal compressive sensing video camera with pixel-wise coded exposure

Some new Compressive Sensing hardware, woohoo !

Compact all-CMOS spatiotemporal compressive sensing video camera with pixel-wise coded exposure by Jie Zhang, Tao Xiong, Trac Tran, Sang Chin, and Ralph Etienne-Cummings
Abstract
We present a low power all-CMOS implementation of temporal compressive sensing with pixel-wise coded exposure. This image sensor can increase video pixel resolution and frame rate simultaneously while reducing data readout speed. Compared to previous architectures, this system modulates pixel exposure at the individual photo-diode electronically, without external optical components. Thus, the system provides a reduction in size and power compared to previous optics-based implementations. The prototype image sensor (127 × 90 pixels) can reconstruct 100 fps videos from coded images sampled at 5 fps. With a 20× reduction in readout speed, our CMOS image sensor consumes only 14 μW to provide 100 fps videos.
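The forward model is easy to simulate: each pixel integrates light over its own short exposure window within a block of high-speed frames, so one coded readout summarizes many frames. A toy NumPy sketch (random single-window codes per pixel; the sensor's actual code design may differ — reconstruction, not shown, then exploits the sparsity of the video):

```python
import numpy as np

def coded_exposure(video, exposure_len, seed=0):
    """Pixel-wise coded exposure forward model: each pixel opens once, for
    `exposure_len` consecutive high-speed frames starting at a random time,
    and integrates. Returns the coded frame and the per-pixel exposure mask."""
    T, H, W = video.shape
    rng = np.random.default_rng(seed)
    start = rng.integers(0, T - exposure_len + 1, size=(H, W))
    mask = np.zeros_like(video)
    for t in range(T):
        # Pixel (i, j) is exposed at time t iff start <= t < start + length.
        mask[t] = (start <= t) & (t < start + exposure_len)
    return (mask * video).sum(axis=0), mask
```

Reading the coded frame at 5 fps while the window positions vary per pixel is what lets reconstruction recover the 100 fps sequence described in the abstract.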


## Wednesday, April 20, 2016

### Heavy hitters via cluster-preserving clustering

Interesting, clustering the most frequent elements in a turnstile model:

Heavy hitters via cluster-preserving clustering by Kasper Green Larsen, Jelani Nelson, Huy L. Nguyen, Mikkel Thorup

In turnstile $\ell_p$ $\varepsilon$-heavy hitters, one maintains a high-dimensional $x\in\mathbb{R}^n$ subject to $\texttt{update}(i,\Delta)$ causing $x_i\leftarrow x_i + \Delta$, where $i\in[n]$, $\Delta\in\mathbb{R}$. Upon receiving a query, the goal is to report a small list $L\subset[n]$, $|L| = O(1/\varepsilon^p)$, containing every "heavy hitter" $i\in[n]$ with $|x_i| \ge \varepsilon \|x_{\overline{1/\varepsilon^p}}\|_p$, where $x_{\overline{k}}$ denotes the vector obtained by zeroing out the largest $k$ entries of $x$ in magnitude.
For any $p\in(0,2]$ the CountSketch solves $\ell_p$ heavy hitters using $O(\varepsilon^{-p}\log n)$ words of space with $O(\log n)$ update time, $O(n\log n)$ query time to output $L$, and whose output after any query is correct with high probability (whp) $1 - 1/poly(n)$. Unfortunately the query time is very slow. To remedy this, the work [CM05] proposed for $p=1$ in the strict turnstile model, a whp correct algorithm achieving suboptimal space $O(\varepsilon^{-1}\log^2 n)$, worse update time $O(\log^2 n)$, but much better query time $O(\varepsilon^{-1}poly(\log n))$.
We show this tradeoff between space and update time versus query time is unnecessary. We provide a new algorithm, ExpanderSketch, which in the most general turnstile model achieves optimal $O(\varepsilon^{-p}\log n)$ space, $O(\log n)$ update time, and fast $O(\varepsilon^{-p}poly(\log n))$ query time, and whp correctness. Our main innovation is an efficient reduction from the heavy hitters to a clustering problem in which each heavy hitter is encoded as some form of noisy spectral cluster in a much bigger graph, and the goal is to identify every cluster. Since every heavy hitter must be found, correctness requires that every cluster be found. We then develop a "cluster-preserving clustering" algorithm, partitioning the graph into clusters without destroying any original cluster.
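For reference, the CountSketch baseline the abstract starts from fits in a few lines: d rows of w signed counters, with the median across rows as the estimator. The explicit random tables below stand in for hash functions, purely for illustration:

```python
import numpy as np

class CountSketch:
    """Textbook CountSketch: each of d rows hashes coordinate i to one of w
    counters with a random sign; the estimate of x_i is the median over rows
    of the signed counter. Heavy coordinates survive the median."""
    def __init__(self, d, w, n, seed=0):
        rng = np.random.default_rng(seed)
        self.bucket = rng.integers(0, w, size=(d, n))  # hash tables (explicit
        self.sign = rng.choice([-1, 1], size=(d, n))   # for illustration only)
        self.C = np.zeros((d, w))

    def update(self, i, delta):
        """Turnstile update: x_i <- x_i + delta."""
        for r in range(self.C.shape[0]):
            self.C[r, self.bucket[r, i]] += self.sign[r, i] * delta

    def estimate(self, i):
        """Median-of-rows estimate of x_i."""
        return np.median([self.sign[r, i] * self.C[r, self.bucket[r, i]]
                          for r in range(self.C.shape[0])])
```

As the abstract notes, answering a heavy hitters query with this structure means estimating all n coordinates, hence the slow $O(n \log n)$ query time that ExpanderSketch removes.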


## Tuesday, April 19, 2016

### Deep Online Convex Optimization with Gated Games

David sent me the following a while back:

Dear Igor,

I’ve enjoyed following your blog for a long time, and think the following preprint may be of interest: “Deep Online Convex Optimization with Gated Games,” http://arxiv.org/abs/1604.01952. It analyzes the convergence of gradient-based methods on rectifier neural networks, which are neither smooth nor convex, using ideas from game theory.

Abstract:
Methods from convex optimization are widely used as building blocks for deep learning algorithms. However, the reasons for their empirical success are unclear, since modern convolutional networks (convnets), incorporating rectifier units and max-pooling, are neither smooth nor convex. Standard guarantees therefore do not apply. This paper provides the first convergence rates for gradient descent on rectifier convnets. The proof utilizes the particular structure of rectifier networks which consists in binary active/inactive gates applied on top of an underlying linear network. The approach generalizes to max-pooling, dropout and maxout. In other words, to precisely the neural networks that perform best empirically. The key step is to introduce gated games, an extension of convex games with similar convergence properties that capture the gating function of rectifiers. The main result is that rectifier convnets converge to a critical point at a rate controlled by the gated-regret of the units in the network. Corollaries of the main result include: (i) a game-theoretic description of the representations learned by a neural network; (ii) a logarithmic-regret algorithm for training neural nets; and (iii) a formal setting for analyzing conditional computation in neural nets that can be applied to recently developed models of attention.

cheers,
David

Thanks David !

Image Credit: NASA/JPL-Caltech/Space Science Institute
W00097359.jpg was taken on April 14, 2016 and received on Earth April 14, 2016. The camera was pointing toward SATURN, and the image was taken using the MT2 and CL2 filters. This image has not been validated or calibrated.


## Monday, April 18, 2016

### The 2016 Workshop on Algorithms for Modern Massive Data Sets (MMDS 2016) is now open

Michael just let me know that registration for MMDS (the 2016 Workshop on Algorithms for Modern Massive Data Sets, MMDS 2016) is now open. (Previous MMDS editions were covered under the MMDS tag.)

Registration for the 2016 Workshop on Algorithms for Modern Massive Data Sets (MMDS 2016) is now available at:

Early registration is available until May 1, 2016.

In addition to four days of talks on algorithmic and statistical aspects of modern large-scale data analysis, MMDS 2016 will have a contributed poster session one evening.  The registration fee is waived for student poster presenters.  You may apply to present a poster at the link above.

Event: MMDS 2016: Workshop on Algorithms for Modern Massive Data Sets
Dates: June 21-24, 2016
Location: UC Berkeley, Berkeley, CA
Website: http://mmds-data.org
Contact: organizers@mmds-data.org

Synopsis: The 2016 Workshop on Algorithms for Modern Massive Data Sets (MMDS 2016) will address algorithmic, mathematical, and statistical challenges in modern statistical data analysis. The goals of MMDS 2016 are to explore novel techniques for modeling and analyzing massive, high-dimensional, and nonlinearly-structured scientific and internet data sets, and to bring together computer scientists, statisticians, mathematicians, and data analysis practitioners to promote cross-fertilization of ideas.

Organizers: Michael Mahoney (UC Berkeley), Alex Shkolnik (Stanford), and Petros Drineas (RPI)
