Page Views on Nuit Blanche since July 2010


You can also join the Google+ Community (420), the CompressiveSensing subreddit (122), the LinkedIn Compressive Sensing group (2320) or the Matrix Factorization (682) and post there !
Reference pages include The Big Picture in Compressive Sensing, the Advanced Matrix Factorization Jungle Page and the Reproducible Research page

Monday, June 17, 2013

Sunday Morning Insight: A Quick Panorama of Sensing from Direct Imaging to Machine Learning

What is a sensor and how does sensing fit with Machine Learning ?  This is a common question that I think arise out of the fact that our fields are becoming too specific thereby attenuating our common sense of what the big picture entails. Be forewarned, the following brief exposition will be making numerous shortcuts and abuse of notation.

Everything starts with the simple notion that what you see is what you get:

x = x

In other words, on one side, you have x and on the other you "sense" x. This picture applies directly to what we call direct imaging: if you want to image something, you have to just sample it. A more mathy expression for this process is the simple:

x = I x

where I is the identity (yes, let's talk about matrices or operators when the explanation requires it.) Let us note that sometimes, when we remove certain rows from I by applying a restriction operator R, then we are undersampling. Reconstructing the full x from that subsampled version (RIx) is called interpolation or inpainting. Let us call IN that inpainting operator so that we have:

x = IN (R(Ix)) 




So far so good. Let us also note that in optics we generally do the following:

x = (BA) x

(BA) generally represents the lenses of the camera. In other words, the engineering behind lens design (using tools like Zemax) is geared towards designing BA (also called the PSF) to be as close as possible to I, the identity. The sensor or the measurement process obtains (BA)x. Both A and B are linear operators. The process is actually a little more complex see [1].

credit NASA

Then there is the case of indirect imaging as attempted since the early 1960s with coded aperture for instance. 

x = L(Ax)

L and A are linear operators. The sensors can only measure Ax (not x) but you are a matrix multiplication away from getting x back. Famous examples include coded aperture for X-ray imaging for orbital observatories, but also CT and MRI.  Encryption also fits that model.



With compressive sensing, things get a little more complicated, it is indirect imaging but with a twist. There we have

x = N(Ax) or even x = N(A(Bx))

with A or A and B are linear operators, x is sparse (or compressible) in the canonical basis or in some other bases, N is a nonlinear operator. Here, as in indirect imaging, what is sensed is Ax or A(Bx). The difference between this new approach and traditional indirect imaging lies the nonlinearity of the reconstruction operator N. In compressive sensing, we seek a rectangular A (subsampling) and the nonlinearity of the N comes from the fact that we seek only sparse x's. Historically, N was initially tied to linear convex programming or greedy methods, but there are other different approaches such as IHT and eventually AMP. Additional insight can be found in [2].

Muthu (see [3]) points out that there ought to be other quantities we could extract from a linear sensing mechanism so that instead of:

x = N(Ax)

we ought to look at:

Hx = N(Ax)


The sensor obtains Ax, N is a nonlinear reconstruction solver of sorts that reconstruct Hx which is  an hopefully important feature of x (in compressive sensing, that feature is the sparsity pattern of x). Instead of the full reconstruction, we aim at detecting items of interest. In optics we sometimes talk about task specific imagers or detection in signal manifold processing, in MRI, we call it fingerprinting while in Theoretical Computer Science, we are concerned about the statistics of x since in the streaming model, we cannot even store x. There the process goes by the name of Heavy Hitters and related algorithms. One could also argue that current conventional imaging also falls in that category. 

Now let us go further by making the first sensing operator a nonlinear one. This is the approach that is typically seen in nonlinear compressive sensing and also in machine learning. There we have:

x = N2(N1(x))






In machine learning, in particular, with neural networks, we have:

x = N4(N3(N2(N1(x))))

This sort of very successful construct is known as autoencoders in deep learning. Each operator N is nonlinear. A specific choice of regularization for the layers and nodes can even make it look like a compressive sensing approach. In this case. we talk about sparse autoencoders, i.e. roughly speaking , there is sparse set of weights for every images that are part of the training set on which the neural network was trained. In traditional sensing, there are also natural nonlinear sampling schemes such as the one introduced by quantization (see Petros' resource on Quantization), FROG or simply those found in phase retrieval  problems. 


And then there is the traditional area of Machine Learning that does classification:

Hx = N4(N3(N2(N1(x))))

where Hx represents whether x belongs to a class (+1/-1).

As we have seen recently, the connection between Compressive Sensing and Machine Learning, stems in large part in the nonlinearities involved in the sampling stage and specifically how to grab the features of interest. Can we be guided on the preferable nonlinearities ? When one comes back to a linear case, there are interesting information that might eventually be useful for designing nonlinearities in the sampling system, from [4]: 
We show that the suggested instance optimality is equivalent to a generalized null space property of M and discuss possible relations with generalized restricted isometry properties.
There are also other lessons to be learned from [5] but this time it involves low rank issues. We are just at the beginning of seeing these two fields merging.



[1] Let us note that this is not entirely true in conventional imaging, a 2D scene is transformed into a 2D scene whereas a 3D scene is transformed into a 2D scene. So AB is not really the identity but some sort of the projection of it Let us further note that in conventional direct imaging, things are really more complicated as AB is really two nonlinear operators that include the nonlinearities of the FPA, quantization issues and compression (JPEG, ...). This is one the reason why Mark Neifeld ells us that we are already using compressive systems at the Duke-AFRL Meetingin a tlak entitled Adaptation for Task-Specific Compressive Imaging



[2] One should note that the rapid convergence of some of these algorithms makes one think that they could be parallel to neural networks. There are conditions on A, x, N although those are better acknowledged than in the direct imaging case. But there is really no restriction due to the concept of sparsity, indeed other solvers N using l_infty, l_p and other norms as regularizers can find a signal with a certain structure within the set of signals that were not destroyed by A (or A(B.)). With an underdetermined system (rectangular A), what we are saying in essence is that by performing the measurement Ax, we are not losing information on x.

[3] From Muthu's 2010 Massive Data Streams Research: Where to Go:

Compressed Functional Sensing. Streaming and compressed sensing brought two groups of researchers (CS and signal processing) together on common problems of what is the minimal amount of data to be sensed or captured or stored, so data sources can be reconstructed, at least approximately. This has been a productive development for research with fundamental insights into geometry of high dimensional spaces as well as the Uncertainty Principle. In addition, Engineering and Industry has been impacted significantly with analog to information paradigm. This is however just the beginning. We need to extend compressed sensing to functional sensing, where we sense only what is appropriate to compute different function (rather than simply reconstructing) and furthermore, extend the theory to massively distributed and continual framework to be truly useful for new massive data applications above.

[4] Compressed Sensing and Best Approximation from Unions of Subspaces: Beyond Dictionaries by Tomer Peleg, Rémi Gribonval, Mike E. Davies
We propose a theoretical study of the conditions guaranteeing that a decoder will obtain an optimal signal recovery from an underdetermined set of linear measurements. This special type of performance guarantee is termed instance optimality and is typically related with certain properties of the dimensionality-reducing matrix M. Our work extends traditional results in sparse recovery, where instance optimality is expressed with respect to the set of sparse vectors, by replac- ing this set with an arbitrary finite union of subspaces. We show that the suggested instance optimality is equivalent to a generalized null space property of M and discuss possible relations with generalized restricted isometry properties.
[5] Learning non-parametric basis independent models from point queries via low-rank methodsTyagi, Hemant; Cevher, Volkan

Friday, June 14, 2013

This Week in Review: A Seinfeldian problem, Meet-Ups, GSI2013, Around the blogs in 78 hours.


This week we had quite a few interesting threads on the interwebs about signals acquisition and making sense of them. While the former seems to be feasible thanks to Moore's law, the latter is, in view of the different results mentioned here and elsewhere, likely substandard. The real issue is: do policymakers understand there is a difference between acquiring data and making sense of them in the same way that there is a difference between a rental car company taking a reservation and making that reservation a reality by delivering a car ?

   

If you are reading this blog and happen to be coming to (or living in) Paris and want to make a presentation in front of a technical audience, there is the possibility of doing so at a variety of Meet-ups. Lately, a good resource to find (or create) Meet-ups in Paris (and elsewhere) is Meetup.com. At the end of this entry, you'll find some of the meet-ups that I tend to attend to when time permits. In the past few weeks, there were some talks of interest:

But I am sure I missed some. There are no technical meetup groups in College Station as far as I can tell. There ought to be one, I am sure there are enough hardware hackers around there.

Frank Nielsen wants me to mention the Geometric Science of Information conference (www.gsi2013.org ) that will have an 800 pages proceedings and will take place at the end of August in Paris! The page of the conference reads:

The objective of this SEE Conference hosted by MINES ParisTech, is to bring together pure/applied mathematicians and engineers, with common interest for Geometric tools and their applications for Information analysis, with active participation of young researchers for deliberating emerging areas of collaborative research on “Information Geometry Manifolds and Their Advanced Applications”.
Current and ongoing uses of Information Geometry Manifolds in applied mathematics are the following: Advanced Signal/Image/Video Processing, Complex Data Modeling and Analysis, Information Ranking and Retrieval, Coding, Cognitive Systems, Optimal Control, Statistics on Manifolds, Machine Learning, Speech/sound recognition, natural language treatment, etc., which are also substantially relevant for the industry....This conference will be an interdisciplinary event and will federate skills from Geometry, Probability and Information Theory to address the following topics among others
  • Computational Information Geometry
  • Hessian/Symplectic Information Geometry
  • Optimization on Matrix Manifolds
  • Probability on Manifolds
  • Optimal Transport Geometry 
  • Divergence Geometry & Ancillarity
  • Machine/Manifold/Topology Learning
  • Tensor-Valued Mathematical Morphology
  • Differential Geometry in Signal Processing
  • Geometry of Audio Processing 
  • Geometry for Inverse Problems
  • Shape Spaces: Geometry and Statistic
  • Geometry of Shape Variability
  • Relational Metric
  • Discrete Metric Spaces

Of related interest is a recently published book on Matrix Information Geometry.

Here what you could read around the interwebs this past week:

Larry
Greg
Brian
2physics
  • Quantum experiment preludes the endgame for local realism – photonic Bell violation closes the fair-sampling loophole 
Danny
Dustin
Patrick

John

Gregory:

Sebastien
Roger
John
Ivan

While on Nuit Blanche,

The meet-ups of interest around Paris are:

Thursday, June 13, 2013

CSJob: PostDoc, Research Associate in Compressive Sensing and Distributed Compressive Sensing for Energy Harvesting Wireless Sensor Networks, London


Research Associate in Compressive Sensing and Distributed Compressive Sensing for Energy Harvesting Wireless Sensor Networks

EPSRC funded project

We wish to recruit a Research Associate from 1st October 2013 or as soon as possible thereafter to work on an EPSRC funded project, Efficient Energy Management in Energy Harvesting Wireless Sensor Networks: An Approach Based on Distributed Compressive Sensing.
The overarching objective of this research project is to develop transformative sensing mechanisms, which can be used in conjunction with current or upcoming energy harvesting capabilities, in order to enable the deployment of energy neutral or nearly energy neutral wireless sensor networks with practical network lifetime and data gathering rates up to two orders of magnitude higher than the current state-of-the-art. In particular, the project capitalizes on the emerging paradigms of compressive sensing and distributed compressive sensing as well as energy- and information-optimal data acquisition and transmission protocols in order to tightly the energy demand to the energy supply in wireless sensor networks in order to achieve the breakthroughs in data gathering capability.
This 3-years research project, which will also be carried out in collaboration with the University of Cambridge, involves both theoretical and practical work as well as proof-of-concept deployment via the various institutional and industrial partners. This research project will also explore the application of the technology in various domains such as the smart infrastructure and construction industry, the digital economy industry, healthcare systems and future cities.
The proposed research is also supported by the Cambridge Innovation and Knowledge Centre for Smart Infrastructure and Construction, Thales, Fujitsu Laboratories of Europe, STMicroelectronics and AquaMW.
The post is initially funded for a period of two years .Further funding to support the post may be available.
Candidates should hold a PhD degree in electrical engineering, computer science or mathematics with significant research experience in the area of compressive sensing or distributed compressive sensing or signal and information processing theory and methods.
The salary scales are as follows:
Research Associate: Grade 7, salary £32,375-£39,132 per annum, inclusive of £2834 London Allowance per annum
Interested applicants are encouraged to make informal enquiries to the Project Principal Investigator of the research project Dr Miguel Rodrigues, m.rodrigues@ucl.ac.uk, +44 (0)20 7679 3957.
Further details can be found here:
All applications should be submitted via UCL online recruitment system here:
Post reference 1340734
You will need to upload the following documents with your application:
  • CV (including the names and contact details of at least two referees who may be contacted prior to interview)
  • An up to date list of publications
If you experience any problems please contact Vicky Coombes at v.coombes@ucl.ac.uk quoting reference 1340734



Join the CompressiveSensing subreddit or the Google+ Community and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

PyHST2: an hybrid distributed code for high speed tomographic reconstruction with iterative reconstruction and a priori knowledge capabilities = implementation-




We present the PyHST2 code which is in service at ESRF for phase-contrast and absorption tomography. This code has been engineered to sustain the high data flow typical of the third generation synchrotron facilities (10 terabytes per experiment) by adopting a distributed and pipelined architecture. The code implements, beside a default filtered backprojection reconstruction, iterative reconstruction techniques with a-priori knowledge. These latter are used to improve the reconstruction quality or in order to reduce the required data volume and reach a given quality goal. The implemented a-priori knowledge techniques are based on the total variation penalisation and a new recently found convex functional which is based on overlapping patches.
We give details of the different methods and their implementations while the code is distributed under free license.
We provide methods for estimating, in the absence of ground-truth data, the optimal parameters values for a-priori techniques.
PyHST2 is available here at: http://forge.epn-campus.eu/projects/pyhst2

I note three items that are bound to be part of our collective experience as we are surfing Moore's law. From this outstanding paper, one can read:  

  • "....10 terabytes per experiment..." Memory is cheap, if people can record it, they will. 
  • "...The total data volume is often larger than the available memory...." if one were to perform these experiment often, no make that very very often, then we would have to go into streaming algorithms. Right now, the science is such that performing these computation is a day's worth and a few of these computations are probably behind one of two thesis' work. In ten year's time, that experiment could be done every ten seconds. At that point, the algorithms will have to catch up and while memory will still be cheap, we will require data reconstruction at a faster pace.
  • "...If the ground truth is not available, statistical methods such as the discrepancy principle [19] or generalized cross-validation [20] can be used to select the optimal regularization parameter. Such methods could be implemented in future versions of PyHST...." At some point, we will have to go to a grammar based exploratory approach. The choosing of parameters while worth it, is a reminder that we don't know very well the underlying structure of the signal.  

Join the CompressiveSensing subreddit or the Google+ Community and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.