Almost Lossless Universal Source Coding

This research line proposes contributions at the interplay between learning, decision, and information theory, in the context of a novel variation of universal source coding on countably infinite alphabets. Pursuing the seminal work of Han (2000), we explore the idea of relaxing the lossless block-wise assumption by introducing a non-zero distortion, with the objective that the corresponding weak source coding formulation reduces to a learning criterion that becomes feasible for the whole family of finite-entropy memoryless sources on countably infinite alphabets. Our conjecture is that, by relaxing the problem, we can achieve universality for the whole family of memoryless distributions. In this weak, almost lossless coding scenario, we are interested in formalizing the equivalent learning problem and in deriving a new information complexity measure that quantifies the difficulty of this learning task. With this new complexity metric, we will study the possibility of obtaining weak universal source coding schemes.
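To build intuition for trading a vanishing error probability against feasibility, the toy sketch below truncates a geometric source on the naturals to its first k symbols and treats any escape from that set as a coding error. The function name and the geometric parameter q are illustrative choices, not the formulation of the paper.

```python
import math

def truncation_tradeoff(q, k):
    """Truncate a geometric source on {0, 1, 2, ...} to its first k symbols,
    declaring a coding error whenever the source emits a symbol outside.
    Returns (error probability, conditional entropy of kept symbols, in nats)."""
    p = [(1 - q) * q ** i for i in range(k)]      # P(X = i) for i < k
    mass = sum(p)                                 # probability of no error
    h = -sum(pi / mass * math.log(pi / mass) for pi in p)
    return 1.0 - mass, h

# Growing the truncation level drives the error probability to zero while the
# conditional entropy approaches the full source entropy (2*ln 2 nats for q = 1/2).
err5, _ = truncation_tradeoff(0.5, 5)
err20, h20 = truncation_tradeoff(0.5, 20)
```

The error probability here is exactly q^k, so an almost lossless scheme can spend its code length only on a finite, growing sub-alphabet.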
Contributions:
  1. J. F. Silva and P. Piantanida, “Almost Lossless Variable-Length Source Coding on Countably Infinite Alphabets,” in IEEE International Symposium on Information Theory, Barcelona, Spain, July, 2016.

Compressibility of Random Sequences

Introducing notions of compressibility for a stochastic process, meaning that, with high probability, realizations of the process can be well approximated by their best k-term sparse versions, is an important topic of active research. Quantifying the compressibility of random sequences and identifying compressible and sparse distributions (priors) are relevant problems given the recent development of Compressed Sensing. Our group has contributed to the characterization of compressible random sequences, extending this analysis to the family of stationary and ergodic processes. In particular, we have established a necessary and sufficient condition for a stationary and ergodic process to be l_p-compressible.
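As a minimal illustration of the best k-term approximation notion (not the ergodic-process analysis of the contributions below), the sketch computes the relative l_p error left after keeping the k largest-magnitude entries of a realization; an i.i.d. heavy-tailed sequence serves as a stand-in for a compressible process.

```python
import numpy as np

def best_k_term_error(x, k, p=2):
    """Relative l_p error of the best k-term approximation of x:
    keep the k largest-magnitude entries and zero out the rest."""
    x = np.asarray(x, dtype=float)
    idx = np.argsort(np.abs(x))[::-1]     # indices by decreasing magnitude
    residual = x.copy()
    residual[idx[:k]] = 0.0               # what the k-sparse version misses
    return np.linalg.norm(residual, p) / np.linalg.norm(x, p)

# Heavy tails make realizations compressible: the error decays quickly in k.
rng = np.random.default_rng(0)
x = rng.standard_cauchy(10_000)
errors = [best_k_term_error(x, k) for k in (100, 1_000, 5_000)]
```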
Contributions:
  1. Jorge F. Silva and Milan S. Derpich, "On the Characterization of l_p-Compressible Ergodic Sequences," IEEE Transactions on Signal Processing, vol. 63, no. 11, pp. 2915-2928, June, 2015.
  2. J. F. Silva and M. S. Derpich, “Precise best k-term Approximation Error Analysis of Ergodic Processes,” in the IEEE International Symposium on Information Theory, Honolulu, Hawaii, USA, 2014.
  3. J. F. Silva and E. Pavez, "Compressibility of Infinite Sequences and its Interplay with Compressed Sensing Recovery," in APSIPA, Los Angeles, USA, 2012.

Compressed Sensing applied to Inverse Problems in Geo-statistics and Radio Interferometry

The reconstruction of images from scarce measurements is at the core of inverse problems in geostatistics, as well as in some areas of astronomy and the geosciences. The focus is on image reconstruction problems motivated by applications in the mining and oil & gas industries; in particular, our short-term research focus is the reconstruction of geological facies models from well or drillhole measurements. The conventional approach to the lack of data in this application is to incorporate prior information in the form of a statistical model that captures the geological structure of an image or family of images. In this work we depart from this classical approach and propose to explore a new avenue provided by generalized sampling theorems for image reconstruction. In particular, we are working on establishing concrete connections with the performance-guarantee theorems of the RIPless theory of compressed sensing recently elaborated by Candès and Plan.
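For intuition on the recovery step that such guarantees analyze, here is a generic sparse-recovery sketch: iterative soft thresholding (ISTA) for the l_1-regularized least-squares problem with a Gaussian sensing matrix. This is not the facies-reconstruction pipeline of the contributions below; all sizes and parameters are illustrative.

```python
import numpy as np

def ista(A, y, lam=0.1, n_iter=500):
    """ISTA for min_x 0.5 * ||Ax - y||^2 + lam * ||x||_1, the basic
    sparse-recovery program analyzed by compressed sensing theory."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = x - A.T @ (A @ x - y) / L      # gradient step on the quadratic term
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # shrinkage
    return x

rng = np.random.default_rng(1)
n, m, s = 200, 80, 5                       # signal length, measurements, sparsity
x_true = np.zeros(n)
x_true[rng.choice(n, s, replace=False)] = 3.0 * rng.standard_normal(s)
A = rng.standard_normal((m, n)) / np.sqrt(m)   # Gaussian sensing matrix
y = A @ x_true                             # scarce, noiseless measurements
x_hat = ista(A, y, lam=0.01, n_iter=2000)
```

With m well above the sparsity level, the l_1 solution lands close to the true sparse signal, which is the kind of behavior the RIPless theorems certify with high probability.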

Contributions:
  1. Hernán Calderon, Jorge F. Silva, Julián M. Ortiz, and Alvaro Egaña, "Reconstruction of Multichannel Facies based on RIPless Compressed Sensing," Computers and Geosciences, vol. 77, pp. 54-65, April, 2015.
  2. H. Calderon, F. Santibanez, J. F. Silva, J. M. Ortiz, and A. Egana, "Channelized Facies Recovery based on Weighted Compressed Sensing," in 2016 IEEE Sensor Array and Multichannel Signal Processing Workshop (SAM), Sao Paulo, Brazil, July, 2016.


Optimal Filter Bank Selection for Pattern Recognition

This research line explores the problem of optimal filter bank selection for pattern recognition and its applications to speech recognition and texture indexing. Here we are interested in the rich collection of Wavelet Packet (WP) filter banks. This is a learning setting with an estimation-approximation error tradeoff, which we formulate to dynamically select the WP filter bank structure that offers the best scale-frequency characteristics for discrimination. As has been shown in other applications (for instance, lossy compression based on a fidelity criterion), the idea is to take advantage of the WP tree structure to propose dynamic programming solutions for the problem, where connections with the rich literature on tree-structured optimization problems are considered, in particular tree-structured vector quantization for lossy compression, regression, and classification.
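The tree-structured idea can be sketched as a small bottom-up dynamic program over a binary WP tree: a subband is split only when its children's best total cost beats representing it as a single leaf. The cost table and depth limit below are hypothetical placeholders for whatever discrimination measure is actually used.

```python
def prune(costs, node=0, depth=0, max_depth=3):
    """Bottom-up DP over a binary wavelet-packet tree.
    `costs` maps (depth, node) -> cost of keeping that subband as a leaf.
    Returns (best total cost, list of chosen leaf subbands).
    Illustrative sketch of the pruning recursion only."""
    own = costs[(depth, node)]
    if depth == max_depth:                       # deepest allowed decomposition
        return own, [(depth, node)]
    left_cost, left_leaves = prune(costs, 2 * node, depth + 1, max_depth)
    right_cost, right_leaves = prune(costs, 2 * node + 1, depth + 1, max_depth)
    if left_cost + right_cost < own:             # splitting pays off
        return left_cost + right_cost, left_leaves + right_leaves
    return own, [(depth, node)]                  # keep the parent as a leaf
```

Because each subtree's optimum depends only on its own costs, the recursion finds the globally optimal pruned tree in one pass, the same structural property exploited by tree-structured vector quantization.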

Contributions:
  1. Eduardo Pavez and Jorge F. Silva, "Analysis and Design of Wavelet-Packet Cepstral Coefficients for Automatic Speech Recognition," ELSEVIER Speech Communication, January, 2012.
  2. Jorge Silva and Shrikanth S. Narayanan, "Discriminative Wavelet Packet Filter Bank Selection for Pattern Recognition," IEEE Transactions on Signal Processing, vol. 57, no. 5, pp. 1796-1810, May, 2009.
  3. Jorge F. Silva and Shrikanth S. Narayanan, "On Signal Representations within the Bayes Decision Framework," ELSEVIER Pattern Recognition, vol 45, issue 5, pp. 1853–1865, 2012.

Information Measures Estimation

This research line explores the problem of universal estimation of information-theoretic quantities (divergence, mutual information, differential entropy) based on a data-driven, histogram-based approach. This problem is largely unexplored: only classical results are available, based on product-type histogram constructions and kernel plug-in estimates. In particular, we are interested in exploring the goodness of non-product partition schemes, which has been demonstrated in other statistical learning problems (classification, regression, and density estimation). Our basic hypothesis is that data-dependent partitions can offer an improvement over the kernel, plug-in, and non-adaptive histogram-based estimation techniques conventionally adopted for this task.
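As a point of comparison, a simple plug-in mutual information estimator whose bins are data-dependent (marginal quantiles) can be sketched as follows. This is only a basic instance of the adaptive-partition idea, not the tree-structured schemes developed in the contributions below.

```python
import numpy as np

def mi_adaptive_hist(x, y, bins=10):
    """Plug-in MI estimate (in nats) from a 2-D histogram whose bin edges
    are the empirical marginal quantiles, so every marginal bin holds
    roughly the same number of samples. Minimal sketch."""
    ex = np.quantile(x, np.linspace(0.0, 1.0, bins + 1))  # data-dependent edges
    ey = np.quantile(y, np.linspace(0.0, 1.0, bins + 1))
    counts, _, _ = np.histogram2d(x, y, bins=[ex, ey])
    p = counts / counts.sum()                  # joint cell probabilities
    px = p.sum(axis=1, keepdims=True)          # marginal of x
    py = p.sum(axis=0, keepdims=True)          # marginal of y
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / (px @ py)[nz])).sum())

rng = np.random.default_rng(2)
n = 50_000
x = rng.standard_normal(n)
y_ind = rng.standard_normal(n)                         # independent of x
y_dep = 0.9 * x + np.sqrt(1 - 0.81) * rng.standard_normal(n)  # rho = 0.9
mi_ind = mi_adaptive_hist(x, y_ind)            # should be near zero
mi_dep = mi_adaptive_hist(x, y_dep)            # should be clearly positive
```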

Contributions:

  1. Jorge F. Silva and Shrikanth S. Narayanan, "Complexity-Regularized Tree-Structured Partition for Mutual Information Estimation," IEEE Transactions on Information Theory, vol. 58, no. 3, pp. 1940-1952, March, 2012.
  2. Jorge Silva and Shrikanth S. Narayanan, "Information Divergence Estimation based on Data-Dependent Partitions," ELSEVIER Journal of Statistical Planning and Inference, vol. 140, pp. 3130-3198, November 2010.
  3. Jorge Silva and Shrikanth S. Narayanan, "Non-Product Data-Dependent Partitions for Mutual Information Estimation: Strong Consistency and Applications," IEEE Transactions on Signal Processing, vol. 58, no. 7, pp. 3497-3511, July, 2010.

Convergence Properties of the Shannon Entropy

This research direction focuses on the unexplored interplay between information measure estimation, density estimation, and the convergence properties of information quantities, with a focus on the Shannon differential entropy. The objective is to stipulate a set of necessary and sufficient conditions that connect the aforementioned learning problems. On the application side, this work contextualizes these findings in two important histogram-based schemes: the data-driven partitions originally proposed by Lugosi and Nobel, and the Barron density estimator. The objective here is to find new universal consistency results and schemes to estimate densities (in information divergence) as well as the Shannon differential entropy.
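For reference, the basic histogram plug-in estimate of differential entropy that such consistency results refine can be sketched as below; this is a non-adaptive sketch, not the data-driven or Barron schemes discussed above.

```python
import numpy as np

def entropy_hist(x, bins=50):
    """Histogram plug-in estimate of differential entropy h(X), in nats:
    h_hat = -sum_i p_i * log(p_i / w_i), with p_i the empirical mass of
    cell i and w_i its width. Minimal non-adaptive sketch."""
    counts, edges = np.histogram(x, bins=bins)
    widths = np.diff(edges)                    # cell widths w_i
    p = counts / counts.sum()                  # empirical cell masses p_i
    nz = p > 0                                 # skip empty cells (0 * log 0 = 0)
    return float(-(p[nz] * np.log(p[nz] / widths[nz])).sum())

# Sanity checks against known values: h(N(0,1)) = 0.5*ln(2*pi*e), h(U[0,1]) = 0.
rng = np.random.default_rng(3)
h_gauss = entropy_hist(rng.standard_normal(200_000))
h_unif = entropy_hist(rng.uniform(0.0, 1.0, 200_000))
```

The interplay studied here is visible even in this sketch: the estimate is the entropy of a density estimate, so conditions under which the density estimate converges (in information divergence) control when the entropy estimate converges.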

Contributions:

  1. Jorge F. Silva and Patricio Parada, "On the Convergence of Shannon Differential Entropy, and its Connections with Density and Entropy Estimation," ELSEVIER Journal of Statistical Planning and Inference, vol. 142, issue 7, pp. 1716-1732, July, 2012.
  2. J. F. Silva and P. Parada, "Shannon Entropy Estimation from Convergence Results in the Countable Alphabet Case," in IEEE Information Theory Workshop (ITW2013), Seville, Spain, 2013.
  3. J. F. Silva and P. Parada, "Shannon Entropy Convergence Results in the Countable Infinite Case," in IEEE International Symposium on Information Theory (ISIT2012), Boston, USA, 2012.