
    Vincent Tan

    We consider state-dependent memoryless channels with general state available at both encoder and decoder. We establish the ε-capacity and the optimistic ε-capacity. This allows us to prove a necessary and sufficient condition for the strong converse to hold. We also provide a simpler sufficient condition on the first- and second-order statistics of the state process that ensures that the strong converse holds.
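    For orientation, here is a standard way to state the connection between the ε-capacity and the strong converse; the notation below is assumed for illustration and is not quoted from the paper.

    ```latex
    % eps-capacity: the largest rate achievable with asymptotic
    % (average) error probability at most eps:
    C_\varepsilon \triangleq \sup\bigl\{ R : \exists\,(n, \lceil 2^{nR}\rceil)\text{-codes with } \limsup_{n\to\infty} P_{\mathrm{e}}^{(n)} \le \varepsilon \bigr\}.
    % The strong converse holds iff C_eps does not depend on eps:
    \text{strong converse} \iff C_\varepsilon = C \ \text{ for all } \varepsilon \in (0,1),
    \qquad C \triangleq \lim_{\varepsilon \downarrow 0} C_\varepsilon .
    ```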
    We study the (almost lossless) joint source-channel coding problem from the moderate deviations perspective, where the bandwidth expansion ratio tends towards the ratio of the channel capacity and source entropy at a rate larger than n^{-1/2} (n being the channel blocklength) and the error probability decays subexponentially. We consider the stationary ergodic Markov (SEM) source as well as discrete memoryless and additive SEM channels. We also discuss the loss due to separation in the moderate deviations setting.
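    A sketch of the scaling regime just described, in assumed notation; the exact constant in the exponent is characterized in the paper and is not reproduced here.

    ```latex
    % rho_n: bandwidth expansion ratio; C: channel capacity;
    % H: source entropy rate; eps_n: the moderate-deviations gap.
    \rho_n = \frac{C}{H} + \epsilon_n, \qquad \epsilon_n \to 0, \qquad \sqrt{n}\,\epsilon_n \to \infty,
    % so the gap closes more slowly than n^{-1/2}, and the error
    % probability decays subexponentially, on the order of
    P_{\mathrm{e}}^{(n)} = e^{-\Theta(n\epsilon_n^2)}.
    ```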
    This paper establishes information-theoretic limits in estimating a finite field low-rank matrix given random linear measurements of it. These linear measurements are obtained by taking inner products of the low-rank matrix with random sensing matrices. Necessary and sufficient conditions on the number of measurements required are provided. It is shown that these conditions are sharp and the minimum-rank decoder is asymptotically optimal. The reliability function of this decoder is also derived by appealing to de Caen's lower bound on the probability of a union. The sufficient condition also holds when the sensing matrices are sparse, a scenario that may be amenable to efficient decoding. More precisely, it is shown that if the n × n sensing matrices contain, on average, Ω(n log n) entries, the number of measurements required is the same as that when the sensing matrices are dense and contain entries drawn uniformly at random from the field. Analogies are drawn between the above results and rank-metric codes in the coding theory literature. In fact, we are also strongly motivated by understanding when minimum rank distance decoding of random rank-metric codes succeeds. To this end, we derive distance properties of equiprobable and sparse rank-metric codes. These distance properties provide a precise geometric interpretation of the fact that the sparse ensemble requires as few measurements as the dense one. Finally, we provide a non-exhaustive procedure to search for the unknown low-rank matrix.
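    A toy sketch of the measurement model and the minimum-rank decoder over GF(2), for illustration only: the brute-force search below is feasible solely for tiny dimensions and is not the paper's (non-exhaustive) search procedure; all parameter values are arbitrary.

    ```python
    # Measurement model y_i = <A_i, X> over GF(2) and a brute-force
    # minimum-rank decoder on a tiny instance.
    import itertools
    import numpy as np

    def gf2_rank(M):
        """Rank of a 0/1 matrix over GF(2) by Gaussian elimination."""
        M = M.copy() % 2
        rank = 0
        rows, cols = M.shape
        for c in range(cols):
            pivot = next((r for r in range(rank, rows) if M[r, c]), None)
            if pivot is None:
                continue
            M[[rank, pivot]] = M[[pivot, rank]]      # swap rows
            for r in range(rows):
                if r != rank and M[r, c]:
                    M[r] ^= M[rank]                  # eliminate column c
            rank += 1
        return rank

    rng = np.random.default_rng(0)
    n, r, k = 3, 1, 7      # matrix size, true rank, number of measurements

    # Ground truth: rank-(at most r) matrix X = U V over GF(2).
    U, V = rng.integers(0, 2, (n, r)), rng.integers(0, 2, (r, n))
    X = (U @ V) % 2

    # Dense uniform sensing matrices A_i and measurements y_i = <A_i, X>.
    A = rng.integers(0, 2, (k, n, n))
    y = np.array([int(np.sum(Ai * X)) % 2 for Ai in A])

    # Minimum-rank decoder: among all matrices consistent with the
    # measurements, return one of smallest GF(2) rank.
    best, best_rank = None, n + 1
    for bits in itertools.product((0, 1), repeat=n * n):
        Z = np.array(bits).reshape(n, n)
        if all(int(np.sum(Ai * Z)) % 2 == yi for Ai, yi in zip(A, y)):
            rk = gf2_rank(Z)
            if rk < best_rank:
                best, best_rank = Z, rk

    print("decoder recovered X:", np.array_equal(best, X))
    ```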
    We study the problem of learning a latent tree graphical model where samples are available only from a subset of variables. We propose two consistent and computationally efficient algorithms for learning minimal latent trees, that is, trees without any redundant hidden nodes. Unlike many existing methods, the observed nodes (or variables) are not constrained to be leaf nodes. Our first algorithm, recursive grouping, builds the latent tree recursively by identifying sibling groups using so-called information distances. One of the main contributions of this work is our second algorithm, which we refer to as CLGrouping. CLGrouping starts with a pre-processing procedure in which a tree over the observed variables is constructed. This global step groups the observed nodes that are likely to be close to each other in the true latent tree, thereby guiding subsequent recursive grouping (or equivalent procedures) on much smaller subsets of variables. This results in more accurate and efficient learning of latent trees. We also present regularized versions of our algorithms that learn latent tree approximations of arbitrary distributions. We compare the proposed algorithms to other methods by performing extensive numerical experiments on various latent tree graphical models such as hidden Markov models and star graphs. In addition, we demonstrate the applicability of our methods on real-world datasets by modeling the dependency structure of monthly stock returns in the S&P index and of the words in the 20 newsgroups dataset.
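    A minimal sketch of the sibling test at the heart of recursive grouping, assuming additive information distances on a tree (e.g., d_ij = -log|ρ_ij| for Gaussian models). The example tree and the helper function are illustrative, not taken from the paper's code.

    ```python
    import numpy as np

    def sibling_test(d, i, j, others, tol=1e-9):
        """i and j belong to the same family (siblings or parent/child)
        iff Phi(i,j;k) = d[i,k] - d[j,k] is constant over all k not in {i,j}."""
        phis = [d[i, k] - d[j, k] for k in others]
        return max(phis) - min(phis) < tol

    # Example tree: edges 0-1, 0-2, 2-3, all of unit length.
    # Pairwise additive tree distances:
    d = np.array([[0, 1, 1, 2],
                  [1, 0, 2, 3],
                  [1, 2, 0, 1],
                  [2, 3, 1, 0]], dtype=float)

    # Nodes 1 and 2 both neighbor 0, but node 3 hangs off 2, so
    # Phi(1,2;k) is 0 at k=0 and 2 at k=3: not constant -> not a family.
    print(sibling_test(d, 1, 2, [0, 3]))   # False
    # Nodes 0 and 1: Phi is -1 at both k=2 and k=3 -> same family;
    # |Phi| equals d[0,1], which flags a parent-child relationship.
    print(sibling_test(d, 0, 1, [2, 3]))   # True
    ```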
    The problem of learning forest-structured discrete graphical models from i.i.d. samples is considered. An algorithm based on pruning of the Chow-Liu tree through adaptive thresholding is proposed. It is shown that this algorithm is both structurally consistent and risk consistent, and that the error probability of structure learning decays faster than any polynomial in the number of samples under a fixed model size. For the high-dimensional scenario where the size of the model d and the number of edges k scale with the number of samples n, sufficient conditions on (n,d,k) are given for the algorithm to satisfy structural and risk consistencies. In addition, the extremal structures for learning are identified; we prove that the independent (resp. tree) model is the hardest (resp. easiest) to learn using the proposed algorithm in terms of error rates for structure learning.
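    A sketch of the pruning idea: fit a Chow-Liu tree by maximum-weight spanning tree on empirical mutual informations, then delete edges whose weight falls below a threshold, yielding a forest. The threshold value and data below are illustrative; the paper's adaptive threshold schedule is not reproduced here.

    ```python
    import numpy as np
    import networkx as nx

    def empirical_mi(x, y):
        """Empirical mutual information of two discrete samples (nats)."""
        mi = 0.0
        for a in np.unique(x):
            for b in np.unique(y):
                p_ab = np.mean((x == a) & (y == b))
                if p_ab > 0:
                    mi += p_ab * np.log(p_ab / (np.mean(x == a) * np.mean(y == b)))
        return mi

    def learn_forest(samples, eps_n):
        """samples: (n, d) array of discrete observations."""
        n, d = samples.shape
        G = nx.Graph()
        G.add_nodes_from(range(d))
        for i in range(d):
            for j in range(i + 1, d):
                G.add_edge(i, j, weight=empirical_mi(samples[:, i], samples[:, j]))
        tree = nx.maximum_spanning_tree(G)          # Chow-Liu tree
        forest = nx.Graph(tree)
        forest.remove_edges_from(
            [(u, v) for u, v, w in tree.edges(data="weight") if w < eps_n])
        return forest

    rng = np.random.default_rng(1)
    # Two mutually independent correlated pairs: (0, 1) and (2, 3).
    z1, z2 = rng.integers(0, 2, (2, 2000))
    x = np.column_stack([z1, z1 ^ (rng.random(2000) < 0.1),
                         z2, z2 ^ (rng.random(2000) < 0.1)])
    print(sorted(learn_forest(x, eps_n=0.05).edges()))   # expect [(0, 1), (2, 3)]
    ```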
    Sparse graphical models have proven to be a flexible class of multivariate probability models for approximating high-dimensional distributions. In this paper, we propose techniques to exploit this modeling ability for binary classification by discriminatively learning such models from labeled training data, i.e., using both positive and negative samples to optimize for the structures of the two models. We motivate why it is difficult to adapt existing generative methods, and propose an alternative method consisting of two parts. First, we develop a novel method to learn tree-structured graphical models which optimizes an approximation of the log-likelihood ratio. We also formulate a joint objective to learn a nested sequence of optimal forest-structured models. Second, we construct a classifier by using ideas from boosting to learn a set of discriminative trees. The final classifier can be interpreted as a likelihood ratio test between two models with a larger set of pairwise features. We use cross-validation to determine the optimal number of edges in the final model. The algorithm presented in this paper also provides a method to identify a subset of the edges that are most salient for discrimination. Experiments show that the proposed procedure outperforms generative methods such as Tree Augmented Naïve Bayes and Chow-Liu as well as their boosted counterparts.
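    One way to write the resulting classifier as a likelihood ratio test, using the standard tree factorization of each class-conditional model; the symbols below (p, q, E_p, E_q) are assumed notation, not quoted from the paper.

    ```latex
    % p and q: tree-factorized class models with edge sets E_p and E_q.
    g(x) = \operatorname{sign}\Biggl(
      \sum_{i} \log\frac{p_i(x_i)}{q_i(x_i)}
      + \sum_{(i,j)\in E_p} \log\frac{p_{ij}(x_i,x_j)}{p_i(x_i)\,p_j(x_j)}
      - \sum_{(i,j)\in E_q} \log\frac{q_{ij}(x_i,x_j)}{q_i(x_i)\,q_j(x_j)}
    \Biggr),
    % so the pairwise features range over the union of the two edge sets.
    ```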
    In this work, achievable dispersions for the discrete memoryless interference channel (DM-IC) are derived. In other words, we characterize the backoff from the Han-Kobayashi (HK) achievable region, the largest inner bound known to date for the DM-IC. In addition, we also characterize the backoff from Sato's region in the strictly very strong interference regime, and the backoff from Costa and El Gamal's region in the strong interference regime. To do so, Feinstein's lemma is first generalized to be applicable to the interference channel. Making use of the generalized Feinstein's lemma, it is found that the dispersions for the DM-IC can be represented by the information variances of eight information densities when HK message splitting is involved, and of six information densities for another encoding strategy. We also derive an outer bound that leverages a known dispersion result for channels with random state by Ingber and Feder. It is shown that, in the strictly very strong interference regime, the inner and outer bounds have similar algebraic forms.
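    For context, this is the generic single-rate shape of a dispersion (second-order) backoff; the DM-IC result is a multivariate analogue driven by the eight (resp. six) information densities mentioned above, and the notation here is assumed for illustration.

    ```latex
    R(n,\varepsilon) \approx I(X;Y) - \sqrt{\frac{V}{n}}\, Q^{-1}(\varepsilon),
    \qquad V \triangleq \operatorname{Var}\bigl[\imath(X;Y)\bigr],
    \qquad \imath(x;y) \triangleq \log\frac{P_{Y|X}(y\mid x)}{P_Y(y)},
    % where Q^{-1} is the inverse of the Gaussian complementary CDF.
    ```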
    This paper proposes a consistent and computationally efficient FFT-based algorithm for inferring the network topology where each node in the network is associated with a wide-sense stationary, ergodic, Gaussian process. Each edge of the tree network is characterized by a linear, time-invariant dynamical system and additive white Gaussian noise. The proposed algorithm uses Bartlett's procedure to produce periodogram estimates of the cross power spectral densities between processes. Under appropriate assumptions, we prove that the number of vector-valued samples from a single sample path required for consistent estimation is polylogarithmic in the number of nodes in the network; thus, the sample complexity is low. Our proof uses properties of spectral estimates and analysis techniques for learning tree-structured graphical models.
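    A minimal sketch of Bartlett's procedure for the cross power spectral density between two jointly WSS processes: split the sample paths into non-overlapping blocks, FFT each block, and average the cross-periodograms. Windowing, normalization, and the toy edge model below are illustrative rather than the paper's exact choices.

    ```python
    import numpy as np

    def bartlett_cross_psd(x, y, block_len):
        """Average of per-block cross-periodograms X_b(f) * conj(Y_b(f))."""
        n_blocks = len(x) // block_len
        acc = np.zeros(block_len, dtype=complex)
        for b in range(n_blocks):
            seg = slice(b * block_len, (b + 1) * block_len)
            acc += np.fft.fft(x[seg]) * np.conj(np.fft.fft(y[seg])) / block_len
        return acc / n_blocks

    rng = np.random.default_rng(2)
    n = 1 << 14
    x = rng.standard_normal(n)
    # y: x passed through a simple LTI system (two-tap moving average)
    # plus noise, matching the edge model "LTI dynamics + AWGN".
    y = np.convolve(x, [0.5, 0.5], mode="same") + 0.1 * rng.standard_normal(n)

    S_xy = bartlett_cross_psd(x, y, block_len=256)
    # |S_xy| tracks the filter response: large at low frequencies,
    # near zero at Nyquist (index 128 for a 256-point block).
    print(abs(S_xy[:4]).round(2), abs(S_xy[126:130]).round(2))
    ```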
    We consider the problem of learning the structure of ferromagnetic Ising models Markov on sparse Erdos-Renyi random graphs. We propose simple local algorithms and analyze their performance in the regime of correlation decay. We prove that an algorithm based on a set of conditional mutual information tests is consistent for structure learning throughout the regime of correlation decay. This algorithm requires the number of samples to scale as ω(log n) and has a computational complexity of O(n^4). A simpler algorithm based on correlation thresholding outputs a graph within a constant edit distance of the original graph when there is correlation decay, and the number of samples required is Ω(log n). Under a more stringent condition, correlation thresholding is consistent for structure estimation. Finally, we prove that Ω(c log n) samples are also needed by any algorithm to reconstruct random graphs consistently with positive probability, where c is the average degree. Thus, we establish that consistent structure estimation is possible with almost order-optimal sample complexity throughout the regime of correlation decay.
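    A sketch of the correlation-thresholding estimator: declare an edge between i and j when the empirical correlation exceeds a threshold. The threshold value and the toy chain model below are illustrative; the analysis ties the threshold to the correlation-decay parameters of the Ising model.

    ```python
    import numpy as np

    def correlation_threshold_graph(samples, tau):
        """samples: (n, d) array of +/-1 spins; returns the estimated edge set."""
        corr = np.corrcoef(samples.T)
        d = corr.shape[0]
        return {(i, j) for i in range(d) for j in range(i + 1, d)
                if abs(corr[i, j]) > tau}

    # Toy Ising-like data on the chain 0-1-2-3: each spin copies its
    # left neighbor and flips with small probability.
    rng = np.random.default_rng(3)
    n, d, flip = 4000, 4, 0.2
    s = np.empty((n, d), dtype=int)
    s[:, 0] = rng.choice([-1, 1], n)
    for j in range(1, d):
        s[:, j] = s[:, j - 1] * np.where(rng.random(n) < flip, -1, 1)

    # Chain edges have correlation ~ (1 - 2*flip) = 0.6, while the
    # non-edge pair (0, 2) has ~ 0.36, so tau = 0.5 separates them.
    print(sorted(correlation_threshold_graph(s, tau=0.5)))
    # expect [(0, 1), (1, 2), (2, 3)]
    ```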
