Wednesday, January 18, 2017

An NlogN Parallel Fast Direct Solver for Kernel Matrices

When Matrix Factorization meets Machine Learning:



Kernel matrices appear in machine learning and non-parametric statistics. Given N points in d dimensions and a kernel function that requires O(d) work to evaluate, we present an O(dNlogN)-work algorithm for the approximate factorization of a regularized kernel matrix, a common computational bottleneck in the training phase of a learning task. With this factorization, solving a linear system with a kernel matrix can be done with O(NlogN) work. Our algorithm only requires kernel evaluations and does not require that the kernel matrix admits an efficient global low rank approximation. Instead our factorization only assumes low-rank properties for the off-diagonal blocks under an appropriate row and column ordering. We also present a hybrid method that, when the factorization is prohibitively expensive, combines a partial factorization with iterative methods. As a highlight, we are able to approximately factorize a dense 11M×11M kernel matrix in 2 minutes on 3,072 x86 "Haswell" cores and a 4.5M×4.5M matrix in 1 minute using 4,352 "Knights Landing" cores.
ASKIT is available here: http://padas.ices.utexas.edu/libaskit/
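
To make the objects in the abstract concrete, here is a small dense sketch in Python (NumPy). It is neither ASKIT nor the paper's O(NlogN) algorithm. It forms a regularized kernel matrix explicitly, solves a linear system with a plain O(N^3) Cholesky factorization, and checks that an off-diagonal block has low numerical rank once the points are ordered. The Gaussian kernel, the bandwidth `h`, the regularization `lam` and the problem size are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
N, d = 400, 1          # illustrative sizes (assumption)
h, lam = 0.1, 1e-3     # kernel bandwidth and regularization (assumptions)

# Sorting 1-D points gives a row/column ordering in which far-apart
# points end up in the off-diagonal blocks.
X = np.sort(rng.uniform(0.0, 1.0, (N, d)), axis=0)

# Dense Gaussian kernel matrix K[i, j] = exp(-|x_i - x_j|^2 / (2 h^2)).
sq_dist = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
K = np.exp(-sq_dist / (2.0 * h ** 2))

# Regularized system (lam * I + K) w = u, solved by dense Cholesky.
A = K + lam * np.eye(N)
L = np.linalg.cholesky(A)
u = rng.standard_normal(N)
w = np.linalg.solve(L.T, np.linalg.solve(L, u))
residual = np.linalg.norm(A @ w - u) / np.linalg.norm(u)

# Numerical rank of the top-right off-diagonal block.
block = K[: N // 2, N // 2 :]
s = np.linalg.svd(block, compute_uv=False)
offdiag_rank = int(np.sum(s > 1e-6 * s[0]))

print(f"relative residual: {residual:.2e}")
print(f"off-diagonal block: {block.shape}, numerical rank {offdiag_rank}")
```

The low rank of the off-diagonal block is the structure the abstract says the factorization relies on. A fast solver would exploit it hierarchically instead of forming `K` densely as done here.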




Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

Monday, January 16, 2017

Edward: Deep Probabilistic Programming - implementation -

Dustin mentioned it on his Twitter feed:



Deep Probabilistic Programming by Dustin Tran, Matthew D. Hoffman, Rif A. Saurous, Eugene Brevdo, Kevin Murphy, David M. Blei

We propose Edward, a Turing-complete probabilistic programming language. Edward builds on two compositional representations---random variables and inference. By treating inference as a first class citizen, on a par with modeling, we show that probabilistic programming can be as flexible and computationally efficient as traditional deep learning. For flexibility, Edward makes it easy to fit the same model using a variety of composable inference methods, ranging from point estimation, to variational inference, to MCMC. In addition, Edward can reuse the modeling representation as part of inference, facilitating the design of rich variational models and generative adversarial networks. For efficiency, Edward is integrated into TensorFlow, providing significant speedups over existing probabilistic systems. For example, on a benchmark logistic regression task, Edward is at least 35x faster than Stan and PyMC3.
from the Edward page:

A library for probabilistic modeling, inference, and criticism.

Edward is a Python library for probabilistic modeling, inference, and criticism. It is a testbed for fast experimentation and research with probabilistic models, ranging from classical hierarchical models on small data sets to complex deep probabilistic models on large data sets. Edward fuses three fields: Bayesian statistics and machine learning, deep learning, and probabilistic programming.
It supports modeling with
  • Directed graphical models
  • Neural networks (via libraries such as Keras and TensorFlow Slim)
  • Conditionally specified undirected models
  • Bayesian nonparametrics and probabilistic programs
It supports inference with
  • Variational inference
    • Black box variational inference
    • Stochastic variational inference
    • Inclusive KL divergence: $\text{KL}(p\|q)$
    • Maximum a posteriori estimation
  • Monte Carlo
    • Hamiltonian Monte Carlo
    • Stochastic gradient Langevin dynamics
    • Metropolis-Hastings
  • Compositions of inference
    • Expectation-Maximization
    • Pseudo-marginal and ABC methods
    • Message passing algorithms
It supports criticism of the model and inference with
  • Point-based evaluations
  • Posterior predictive checks
Edward is built on top of TensorFlow. It enables features such as computational graphs, distributed training, CPU/GPU integration, automatic differentiation, and visualization with TensorBoard.

Authors

Edward is led by Dustin Tran with guidance by David Blei. The other developers are
We are open to collaboration, and welcome researchers and developers to contribute. Check out the contributing page for how to improve Edward’s software. For broader research challenges, shoot one of us an e-mail.
Edward has benefited enormously from the helpful feedback and advice of many individuals: Jaan Altosaar, Eugene Brevdo, Allison Chaney, Joshua Dillon, Matthew Hoffman, Kevin Murphy, Rajesh Ranganath, Rif Saurous, and other members of the Blei Lab, Google Brain, and Google Research.

Citation

We appreciate citations for Edward.
Dustin Tran, Alp Kucukelbir, Adji B. Dieng, Maja Rudolph, Dawen Liang, and David M. Blei. 2016. Edward: A library for probabilistic modeling, inference, and criticism. arXiv preprint arXiv:1610.09787.





Thesis: Privacy-aware and Scalable Recommender Systems using Sketching Techniques by Raghavendran Balu


Congratulations Dr. Balu !


Privacy-aware and Scalable Recommender Systems using Sketching Techniques by Raghavendran Balu


In this thesis, we aim to study and evaluate the privacy and scalability properties of recommender systems using sketching techniques and propose scalable privacy preserving personalization mechanisms. Hence, the thesis is at the intersection of three different topics: recommender systems, differential privacy and sketching techniques. On the privacy aspects, we are interested in both new privacy preserving mechanisms and the evaluation of such mechanisms. We observe that the primary parameter ε in differential privacy is a control parameter and are motivated to find techniques that can assess the privacy guarantees. We are also interested in proposing new mechanisms that are privacy preserving and get along well with the evaluation metrics. On the scalability aspects, we aim to solve the challenges arising in user modeling and item retrieval. User modeling with evolving data poses difficulties, to be addressed, in storage and adapting to new data. Also, addressing the retrieval aspects finds applications in various domains other than recommender systems. We evaluate the impact of our contributions through extensive experiments conducted on benchmark real datasets and, through the results, we surmise that our contributions address the privacy and scalability challenges very well.







Thursday, January 12, 2017

NIPS 2016 Tutorial: Generative Adversarial Networks / Learning in Implicit Generative Models


Last night at the Paris Machine Learning meetup, we had a presentation on GANs designed to produce images of cracks (yes, GANs on cracks has a good sound to it Julien !). Here is a short insight for readers of Nuit Blanche as written by Eric Jang in a recent blog entry (that you should read in its entirety by the way, it's all good !):

For example, if we wanted to minimize some error for image compression/reconstruction, often what we find is that a naive choice of error metric (e.g. euclidean distance to the ground truth label) results in qualitatively bad results. The design flaw is that we don’t have good perceptual similarity metrics for images that are universally applicable for the space of all images. GANs use a second “adversarial” network to learn an optimal implicit distance function (in theory).
 Here is a tutorial by Ian Goodfellow and a paper on the subject.

This report summarizes the tutorial presented by the author at NIPS 2016 on generative adversarial networks (GANs). The tutorial describes: (1) Why generative modeling is a topic worth studying, (2) how generative models work, and how GANs compare to other generative models, (3) the details of how GANs work, (4) research frontiers in GANs, and (5) state-of-the-art image models that combine GANs with other methods. Finally, the tutorial contains three exercises for readers to complete, and the solutions to these exercises.
Ian's slides from NIPS are here.

Learning in Implicit Generative Models by Shakir Mohamed, Balaji Lakshminarayanan
Generative adversarial networks (GANs) provide an algorithmic framework for constructing generative models with several appealing properties: they do not require a likelihood function to be specified, only a generating procedure; they provide samples that are sharp and compelling; and they allow us to harness our knowledge of building highly accurate neural network classifiers. Here, we develop our understanding of GANs with the aim of forming a rich view of this growing area of machine learning---to build connections to the diverse set of statistical thinking on this topic, of which much can be gained by a mutual exchange of ideas. We frame GANs within the wider landscape of algorithms for learning in implicit generative models--models that only specify a stochastic procedure with which to generate data--and relate these ideas to modelling problems in related fields, such as econometrics and approximate Bayesian computation. We develop likelihood-free inference methods and highlight hypothesis testing as a principle for learning in implicit generative models, using which we are able to derive the objective function used by GANs, and many other related objectives. The testing viewpoint directs our focus to the general problem of density ratio estimation. There are four approaches for density ratio estimation, one of which is a solution using classifiers to distinguish real from generated data. Other approaches such as divergence minimisation and moment matching have also been explored in the GAN literature, and we synthesise these views to form an understanding in terms of the relationships between them and the wider literature, highlighting avenues for future exploration and cross-pollination.
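
The abstract mentions density ratio estimation with a classifier that distinguishes real from generated data. Here is a minimal sketch of that idea in Python (NumPy), under assumptions of my own: "real" data drawn from N(0, 1), "generated" data drawn from N(1, 1), balanced classes, and a logistic regression classifier fitted with Newton's method. With balanced classes the classifier's logit estimates log p(x)/q(x). For these two Gaussians the true log ratio is 0.5 - x, so the fit can be checked against it.

```python
import numpy as np

def fit_logistic(x, y, iters=25):
    """Fit P(y=1 | x) = sigmoid(w * x + b) with Newton's method."""
    X = np.column_stack([x, np.ones_like(x)])
    theta = np.zeros(2)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ theta))
        grad = X.T @ (p - y)
        # Small ridge term keeps the Hessian safely invertible.
        hess = X.T @ (X * (p * (1.0 - p))[:, None]) + 1e-8 * np.eye(2)
        theta -= np.linalg.solve(hess, grad)
    return theta  # (w, b)

rng = np.random.default_rng(0)
n = 5000
real = rng.normal(0.0, 1.0, n)   # samples from p (label 1)
fake = rng.normal(1.0, 1.0, n)   # samples from q (label 0)

x = np.concatenate([real, fake])
y = np.concatenate([np.ones(n), np.zeros(n)])
w, b = fit_logistic(x, y)

def log_ratio(t):
    """Estimated log p(t)/q(t): the classifier's logit (balanced classes)."""
    return w * t + b

print(f"fitted log ratio: {w:.3f} * x + {b:.3f}  (true: -1 * x + 0.5)")
```

The classifier never sees either density, only samples, which is what makes this usable for implicit models.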





Wednesday, January 11, 2017

Paris Machine Learning Meetup #5 Season 4: LIME 'Why should I trust you', Apache SAMOA, GAN for Cracks, Opps, NIPS2016

  Video streaming of the event is here:
 


Mobiskill is hosting us in its new offices. Here is the program for the meetup; if you have announcements or even an additional presentation, feel free to fill out this form.


The room has a capacity of 120 people. Doors will open before 7:00 pm.

We will talk about making sense of data that is too large, the use of GANs (which were all the rage at NIPS), crowdsourcing opportunities and, if we have time, what happened at NIPS in Barcelona.
Marco and Albert are likely to speak English while Julien, Daniel and Igor should be speaking French.


Marco Tulio Ribeiro, "Why Should I Trust You?": Explaining the Predictions of Any Classifier
[code] arxiv link (short video presentation and longer KDD presentation)
Despite widespread adoption, machine learning models remain mostly black boxes. Understanding the reasons behind predictions is, however, quite important in assessing trust, which is fundamental if one plans to take action based on a prediction, or when choosing whether to deploy a new model. Such understanding also provides insights into the model, which can be used to transform an untrustworthy model or prediction into a trustworthy one. In this work, we propose LIME, a novel explanation technique that explains the predictions of any classifier in an interpretable and faithful manner, by learning an interpretable model locally around the prediction. We also propose a method to explain models by presenting representative individual predictions and their explanations in a non-redundant way, framing the task as a submodular optimization problem. We demonstrate the flexibility of these methods by explaining different models for text (e.g. random forests) and image classification (e.g. neural networks). We show the utility of explanations via novel experiments, both simulated and with human subjects, on various scenarios that require trust: deciding if one should trust a prediction, choosing between models, improving an untrustworthy classifier, and identifying why a classifier should not be trusted.
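
The core idea of "learning an interpretable model locally around the prediction" can be sketched in a few lines of Python (NumPy). This is a from-scratch illustration, not the LIME package and not the authors' exact procedure: it perturbs the instance with Gaussian noise, weights the perturbed samples with an exponential proximity kernel, and fits a weighted linear model. The function name `explain_locally`, the toy `black_box`, and the `scale` and `width` values are assumptions. The paper's interpretable representations and sparse feature selection are left out.

```python
import numpy as np

def explain_locally(predict, x, n_samples=2000, scale=0.1, width=0.25, seed=0):
    """Fit a proximity-weighted linear surrogate to `predict` around `x`."""
    rng = np.random.default_rng(seed)
    # 1. Sample perturbed points in the neighbourhood of x.
    Z = x + scale * rng.standard_normal((n_samples, x.size))
    y = predict(Z)
    # 2. Weight each sample by its proximity to x.
    sq_dist = np.sum((Z - x) ** 2, axis=1)
    weights = np.exp(-sq_dist / width ** 2)
    # 3. Weighted least squares on [Z - x, 1].
    A = np.column_stack([Z - x, np.ones(n_samples)])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef[:-1], coef[-1]   # local feature weights, intercept

def black_box(Z):
    """Toy model standing in for any classifier score (assumption)."""
    return Z[:, 0] ** 2 + 3.0 * Z[:, 1]

local_weights, intercept = explain_locally(black_box, np.array([1.0, 0.0]))
print("local feature weights:", np.round(local_weights, 2))
```

Around the point (1, 0) the toy model behaves like 2*x0 + 3*x1, so the surrogate's weights should come out close to 2 and 3. Those local weights are the "explanation" of that one prediction.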


Albert Bifet, Telecom-Paristech, "Apache SAMOA" (Github repo)

In this talk, we present Apache SAMOA, an open-source platform for mining big data streams with Apache Flink, Storm and Samza. Real time analytics is becoming the fastest and most efficient way to obtain useful knowledge from what is happening now, allowing organizations to react quickly when problems appear or to detect new trends helping to improve their performance. Apache SAMOA includes algorithms for the most common machine learning tasks such as classification and clustering. It provides a pluggable architecture that allows it to run on Apache Flink, but also on several other distributed stream processing engines such as Storm and Samza.

Julien Launay, "Cracking Crack Mechanics: Using GANs to replicate and learn more about fracture patterns" (the version without animation is here)

When modeling transfers through a medium in civil engineering, knowing the precise influence of cracks is often complicated, doubly so since the transfer and fracture problems are often heavily linked. I will present a new way to generate “fake” cracking patterns using GANs, and will then expand on how such novel techniques can be used to learn more about fracture mechanics.

Daniel Benoilid, foulefactory.com, 5 min talk "Man + Machine: Crowdsourcing opportunities"

How you can leverage crowdsourcing to save time during learning phases and provide a fallback in real time when the confidence interval isn't good.

Igor Carron, "So what happened at NIPS2016 ?"







Tuesday, January 10, 2017

Job: Summer 2017 IBM Social Good Fellowship (Undergraduates, Graduates, NGOs)

Kush (also @krvarshney on Twitter) asked me to disseminate the following, so I do. I also note the following at the end of the announcement, which states:

If you are an NGO, or a social enterprise, we are currently scoping projects for our 2017 cycle. If you have an idea how we can help, drop us an email, and we will follow up. 

Summer 2017 IBM Social Good Fellowship
http://ibm.biz/socialgoodfellowship
Apply by February 1, 2017 
The IBM Social Good Fellowship is an opportunity for undergraduate and graduate students to develop their skills and develop data science solutions that benefit humanity. Mentored by leading IBM Research scientists and engineers at the T. J. Watson Research Center in Yorktown Heights, NY (north of New York City), fellows use data mining, machine learning, statistics, operations research, cloud computing, user experience design, and mobile computing methods to complete projects with social impact. Working closely with non-governmental organizations and other mission-driven partners, fellows take on real-world problems in health, energy, environment, education, international development, equality, justice, and more. We are currently seeking outstanding candidates for our summer 2017 program.


CSjobs: two postdoctoral researchers, "C-SENSE: Exploiting low dimensional signal models for sensing, computation and processing" and "CS for Radar and Electronic Surveillance", Edinburgh, Scotland

Mike just sent me the following:

Hi Igor 
I am currently trying to recruit two postdoctoral researchers, one on the theory side (compressed sensing theory, sketching, concentration of measure) to work on my ERC grant "C-SENSE: Exploiting low dimensional signal models for sensing, computation and processing", and one on the applications side (CS for Radar and Electronic Surveillance) to work on our Defence signal processing project. Details of the vacancies can be found at:
https://www.vacancies.ed.ac.uk/pls/corehrrecruit/erq_jobspec_version_4.jobspec?p_id=038321
and
https://www.vacancies.ed.ac.uk/pls/corehrrecruit/erq_jobspec_version_4.jobspec?p_id=038322
I would be very grateful if you could advertise these on your blog and encourage people to apply.
Many thanks
Mike

--
Mike Davies
Professor of Signal and Image Processing
Institute for Digital Communications (IDCOM),
School of Engineering,
University of Edinburgh,
The King's buildings, Edinburgh, EH9 3JL
Director of the University Defence Research Collaboration (UDRC) in Sensor Signal Processing
http://mod-udrc.org/
email: mike.davies@ed.ac.uk
web: http://www.research.ed.ac.uk/portal/mdavies4

The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.



FINN: A Framework for Fast, Scalable Binarized Neural Network Inference / Scaling Binarized Neural Networks on Reconfigurable Logic

Michaela just provided me with some of the latest results on optimizing Binarized Neural networks on FPGAs. This is quite interesting and impressive.





Research has shown that convolutional neural networks contain significant redundancy, and high classification accuracy can be obtained even when weights and activations are reduced from floating point to binary values. In this paper, we present FINN, a framework for building fast and flexible FPGA accelerators using a flexible heterogeneous streaming architecture. By utilizing a novel set of optimizations that enable efficient mapping of binarized neural networks to hardware, we implement fully connected, convolutional and pooling layers, with per-layer compute resources being tailored to user-provided throughput requirements. On a ZC706 embedded FPGA platform drawing less than 25 W total system power, we demonstrate up to 12.3 million image classifications per second with 0.31 µs latency on the MNIST dataset with 95.8% accuracy, and 21906 image classifications per second with 283 µs latency on the CIFAR-10 and SVHN datasets with respectively 80.1% and 94.9% accuracy. To the best of our knowledge, ours are the fastest classification rates reported to date on these benchmarks.
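
Why do binary weights and activations map so well to hardware? The abstract does not spell out the arithmetic, so take the following as an assumption: a common trick for binarized networks is to store each ±1 value as a single bit, after which a dot product reduces to an XOR (or XNOR) followed by a popcount. The Python sketch below checks that identity against an ordinary dot product. The helper names `pack` and `binary_dot` are hypothetical and are not part of FINN.

```python
import random

def pack(values):
    """Encode a sequence of +1/-1 values as an int (+1 -> bit 1, -1 -> bit 0)."""
    bits = 0
    for i, v in enumerate(values):
        if v > 0:
            bits |= 1 << i
    return bits

def binary_dot(a_bits, b_bits, n):
    """Dot product of two packed +1/-1 vectors of length n.

    Matching bits contribute +1 and differing bits contribute -1,
    so dot = n - 2 * popcount(a XOR b).
    """
    mismatches = bin(a_bits ^ b_bits).count("1")
    return n - 2 * mismatches

random.seed(0)
n = 64
a = [random.choice((-1, 1)) for _ in range(n)]
b = [random.choice((-1, 1)) for _ in range(n)]

reference = sum(x * y for x, y in zip(a, b))
fast = binary_dot(pack(a), pack(b), n)
print(reference, fast)
```

On an FPGA the XOR and the popcount are cheap bit-level logic, which is consistent with the classification rates quoted above.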


Scaling Binarized Neural Networks on Reconfigurable Logic by Nicholas J. Fraser, Yaman Umuroglu, Giulio Gambardella, Michaela Blott, Philip Leong, Magnus Jahre and Kees Vissers

Binarized neural networks (BNNs) are gaining interest in the deep learning community due to their significantly lower computational and memory cost. They are particularly well suited to reconfigurable logic devices, which contain an abundance of fine-grained compute resources and can result in smaller, lower power implementations, or conversely in higher classification rates. Towards this end, the Finn framework was recently proposed for building fast and flexible field programmable gate array (FPGA) accelerators for BNNs. Finn utilized a novel set of optimizations that enable efficient mapping of BNNs to hardware and implemented fully connected, non-padded convolutional and pooling layers, with per-layer compute resources being tailored to user-provided throughput requirements. However, FINN was not evaluated on larger topologies due to the size of the chosen FPGA, and exhibited decreased accuracy due to lack of padding. In this paper, we improve upon Finn to show how padding can be employed on BNNs while still maintaining a 1-bit datapath and high accuracy. Based on this technique, we demonstrate numerous experiments to illustrate flexibility and scalability of the approach. In particular, we show that a large BNN requiring 1.2 billion operations per frame running on an ADM-PCIE-8K5 platform can classify images at 12 kFPS with 671 µs latency while drawing less than 41W board power and classifying CIFAR-10 images at 88.7% accuracy. Our implementation of this network achieves 14.8 trillion operations per second. We believe this is the fastest classification rate reported to date on this benchmark at this level of accuracy.






Monday, January 09, 2017

A Matrix Factorization Approach for Learning Semidefinite-Representable Regularizers

Thanks to N Krishnaswami for the heads-up !



A Matrix Factorization Approach for Learning Semidefinite-Representable Regularizers by Yong Sheng Soh, Venkat Chandrasekaran

Regularization techniques are widely employed in optimization-based approaches for solving ill-posed inverse problems in data analysis and scientific computing. These methods are based on augmenting the objective with a penalty function, which is specified based on prior domain-specific expertise to induce a desired structure in the solution. We consider the problem of learning suitable regularization functions from data in settings in which precise domain knowledge is not directly available. Previous work under the title of `dictionary learning' or `sparse coding' may be viewed as learning a regularization function that can be computed via linear programming. We describe generalizations of these methods to learn regularizers that can be computed and optimized via semidefinite programming. Our framework for learning such semidefinite regularizers is based on obtaining structured factorizations of data matrices, and our algorithmic approach for computing these factorizations combines recent techniques for rank minimization problems along with an operator analog of Sinkhorn scaling. Under suitable conditions on the input data, our algorithm provides a locally linearly convergent method for identifying the correct regularizer that promotes the type of structure contained in the data. Our analysis is based on the stability properties of Operator Sinkhorn scaling and their relation to geometric aspects of determinantal varieties (in particular tangent spaces with respect to these varieties). The regularizers obtained using our framework can be employed effectively in semidefinite programming relaxations for solving inverse problems.
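
For readers who have not met Sinkhorn scaling before, here is the classical matrix version in Python (NumPy). To be clear, this is not the operator analog used in the paper, only the textbook iteration it generalizes: alternately rescale the rows and columns of a positive matrix until it is doubly stochastic. The function name `sinkhorn_scale` and the test matrix are assumptions for illustration.

```python
import numpy as np

def sinkhorn_scale(A, iters=500, tol=1e-12):
    """Classical Sinkhorn scaling of an entrywise positive square matrix.

    Finds positive vectors r, c such that diag(r) @ A @ diag(c) has
    all row sums and column sums equal to 1.
    """
    A = np.asarray(A, dtype=float)
    r = np.ones(A.shape[0])
    c = np.ones(A.shape[1])
    for _ in range(iters):
        r = 1.0 / (A @ c)      # make the row sums equal to 1
        c = 1.0 / (A.T @ r)    # make the column sums equal to 1
        S = r[:, None] * A * c[None, :]
        if np.abs(S.sum(axis=1) - 1.0).max() < tol:
            break
    return r[:, None] * A * c[None, :]

rng = np.random.default_rng(0)
A = rng.uniform(0.1, 1.0, size=(4, 4))
S = sinkhorn_scale(A)
print("row sums:   ", np.round(S.sum(axis=1), 6))
print("column sums:", np.round(S.sum(axis=0), 6))
```

For entrywise positive matrices this alternating normalization converges linearly.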


