Yann LeCun

Abstract Many successful models for scene or object recognition transform low-level descriptors (such as Gabor filter responses, or SIFT descriptors) into richer representations of intermediate complexity. This process can often be broken down into two steps: (1) a coding step, which performs a pointwise transformation of the descriptors into a representation better adapted to the task, and (2) a pooling step, which summarizes the coded features over larger neighborhoods.

Publication Date: Jun 13, 2010
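
The two steps named in the abstract above (coding, then pooling) can be illustrated with a minimal sketch. The codebook, the hard nearest-codeword assignment, and the choice of max or average pooling are assumptions made here for illustration; the abstract does not specify them.

```python
# Sketch only: codebook, hard assignment, and pooling modes are illustrative
# assumptions, not the paper's exact method.

def hard_code(descriptor, codebook):
    """Coding step: one-hot vector marking the nearest codeword."""
    dists = [sum((d - c) ** 2 for d, c in zip(descriptor, word))
             for word in codebook]
    nearest = dists.index(min(dists))
    return [1.0 if k == nearest else 0.0 for k in range(len(codebook))]

def pool(codes, mode="max"):
    """Pooling step: summarize the codes of one neighborhood, per dimension."""
    columns = list(zip(*codes))
    if mode == "max":
        return [max(col) for col in columns]
    return [sum(col) / len(col) for col in columns]

# Hypothetical 2-D descriptors from one image neighborhood.
codebook = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
neighborhood = [[0.1, 0.0], [0.9, 1.1], [0.2, 0.1]]

codes = [hard_code(d, codebook) for d in neighborhood]
max_pooled = pool(codes, "max")      # which codewords appear at all
avg_pooled = pool(codes, "average")  # how often each codeword appears
```

Max pooling records only whether a codeword was activated in the neighborhood, while average pooling records how often.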

Abstract Energy-Based Models (EBMs) capture dependencies between variables by associating a scalar energy to each configuration of the variables. Inference consists in clamping the value of observed variables and finding configurations of the remaining variables that minimize the energy. Learning consists in finding an energy function in which observed configurations of the variables are given lower energies than unobserved ones.

Publication Date: Aug 19, 2006
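
Inference as described in the abstract above (clamp the observed variable, then search for the lowest-energy value of the remaining one) might look like the following sketch. The quadratic energy function, the parameter `w`, and the discrete candidate set are hypothetical choices made only for illustration.

```python
# Sketch only: the energy function and candidate grid are assumptions.

def energy(x, y, w):
    """Scalar energy of the configuration (x, y); lower means more compatible."""
    return (y - w * x) ** 2

def infer(x, candidates, w):
    """Clamp x and return the candidate y with the lowest energy."""
    return min(candidates, key=lambda y: energy(x, y, w))

candidates = [0.0, 0.5, 1.0, 1.5, 2.0]
best_y = infer(x=2.0, candidates=candidates, w=0.5)
```

With these values the energy is zero at `y = 1.0`, so that candidate is returned. Learning, in the abstract's terms, would adjust `w` so that observed `(x, y)` pairs receive lower energy than unobserved ones.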

ABSTRACT We have used information-theoretic ideas to derive a class of practical and nearly optimal schemes for adapting the size of a neural network. By removing unimportant weights from a network, several improvements can be expected: better generalization, fewer training examples required, and improved speed of learning and/or classification. The basic idea is to use second-derivative information to make a tradeoff between network complexity and training set error.

Journal Name: Advances in neural information processing systems 2, NIPS 1989

Publication Date: Feb 1990
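
One way to use second-derivative information to decide which weights are unimportant is to score each weight by a saliency of the form 0.5 * h * w^2, where h is a diagonal second-derivative estimate. The sketch below assumes those second derivatives are already available and uses made-up numbers; it illustrates the idea in the abstract above rather than reproducing the paper's full procedure.

```python
# Sketch only: assumes a diagonal second-derivative estimate is given.

def saliencies(weights, hessian_diag):
    """Estimated increase in training error if each weight is set to zero."""
    return [0.5 * h * w * w for w, h in zip(weights, hessian_diag)]

def prune(weights, hessian_diag, n_remove):
    """Zero out the n_remove weights with the smallest saliency."""
    s = saliencies(weights, hessian_diag)
    order = sorted(range(len(weights)), key=lambda k: s[k])
    removed = set(order[:n_remove])
    return [0.0 if k in removed else w for k, w in enumerate(weights)]

# Hypothetical weights and second derivatives.
weights = [0.8, -0.05, 1.2, 0.3]
hessian_diag = [2.0, 4.0, 0.01, 1.0]
pruned = prune(weights, hessian_diag, n_remove=2)
```

In this example the largest weight (1.2) is removed because its curvature is tiny, which is the difference between a second-derivative criterion and pruning by magnitude alone.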

Abstract Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing.

Journal Name: Proceedings of the IEEE

Publication Date: Nov 1998

We introduce a new approach for on-line recognition of handwritten words written in unconstrained mixed style. The preprocessor performs a word-level normalization by fitting a model of the word structure using the EM algorithm. Words are then coded into low resolution "annotated images" where each pixel contains information about trajectory direction and curvature. The recognizer is a convolution network that can be spatially replicated. From the network output, a hidden Markov model produces word scores.

Journal Name: Neural Computation

Publication Date: Nov 1995

ABSTRACT This paper compares the performance of several classifier algorithms on a standard database of handwritten digits. We consider not only raw accuracy, but also training time, recognition time, and memory requirements. When available, we report measurements of the fraction of patterns that must be rejected so that the remaining patterns have misclassification rates less than a given threshold.

Journal Name: Neural networks: the statistical mechanics perspective

Publication Date: 1995

Publication Date: 1995

Abstract Back-propagation has proven to be a robust algorithm for difficult connectionist learning problems. However, as with many gradient based optimization methods, it converges slowly. We describe an extension of the back-propagation algorithm which uses a simple approximation to the second derivative terms. This method is shown to reduce the required number of iterations to learn a random classification problem, with only a small increase in the complexity of each iteration.

Journal Name: Proceedings of the 1988 connectionist models summer school

Publication Date: Sep 1988

Abstract We present a method for training a similarity metric from data. The method can be used for recognition or verification applications where the number of categories is very large and not known during training, and where the number of training samples for a single category is very small. The idea is to learn a function that maps input patterns into a target space such that the L1 norm in the target space approximates the "semantic" distance in the input space. The method is applied to a face verification task.

Publication Date: Jun 20, 2005
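
The verification step implied by the abstract above (map both inputs into a target space, then compare them with the L1 norm) can be sketched as follows. The linear map `W`, its hand-picked values, and the decision threshold are hypothetical; in the method described, the mapping is learned from data.

```python
# Sketch only: W is hand-picked here, whereas the abstract describes a
# learned mapping. The threshold is also an assumption.

def embed(x, W):
    """Map an input pattern into the target space with a linear map W."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def l1_distance(a, b):
    """L1 norm of the difference between two target-space points."""
    return sum(abs(p - q) for p, q in zip(a, b))

def same_category(x1, x2, W, threshold):
    """Accept the pair when the target-space distance is below threshold."""
    return l1_distance(embed(x1, W), embed(x2, W)) < threshold

W = [[1.0, 0.0, -1.0],
     [0.5, 0.5, 0.0]]
a = [1.0, 2.0, 0.0]
b = [1.0, 2.0, 0.5]   # close to a in the target space
c = [3.0, 0.0, -2.0]  # far from a in the target space

match_ab = same_category(a, b, W, threshold=1.0)
match_ac = same_category(a, c, W, threshold=1.0)
```

Because the decision depends only on distances in the target space, new categories can be handled without retraining, which fits the setting of many categories with few samples each.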

Abstract An application of back-propagation networks to handwritten zip code recognition is presented. Minimal preprocessing of the data is required, but the architecture of the network is highly constrained and specifically designed for the task. The input of the network consists of size-normalized images of isolated digits. The performance on zip code digits provided by the US Postal Service is 92% recognition, 1% substitution, and 7% rejects.

Publication Date: Jun 16, 1990

Abstract This paper compares the performance of several classifier algorithms on a standard database of handwritten digits. We consider not only raw accuracy, but also training time, recognition time, and memory requirements. When available, we report measurements of the fraction of patterns that must be rejected so that the remaining patterns have misclassification rates less than a given threshold.

Publication Date: Oct 9, 1994
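
The rejection measurement mentioned in the abstract above can be computed along these lines. The sketch assumes each pattern comes with a scalar confidence score and that the least confident patterns are rejected first; both the scores and the data below are made up.

```python
# Sketch only: rejecting by ascending confidence is an assumption.

def min_reject_fraction(confidences, correct, max_error):
    """Smallest fraction of lowest-confidence patterns to reject so that
    the error rate on the remaining patterns is below max_error."""
    order = sorted(range(len(confidences)), key=lambda i: confidences[i])
    n = len(order)
    for n_rejected in range(n):
        kept = order[n_rejected:]
        errors = sum(1 for i in kept if not correct[i])
        if errors / len(kept) < max_error:
            return n_rejected / n
    return 1.0  # nothing left to keep

# Hypothetical classifier outputs for ten patterns.
confidences = [0.9, 0.4, 0.8, 0.3, 0.95, 0.6, 0.7, 0.99, 0.85, 0.5]
correct = [True, False, True, False, True, True, True, True, True, True]

fraction = min_reject_fraction(confidences, correct, max_error=0.05)
```

Here the two misclassified patterns are also the two least confident, so rejecting 20% of the patterns brings the error on the rest under the 5% threshold.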

The ability of multilayer back-propagation networks to learn complex, high-dimensional, nonlinear mappings from large collections of examples makes them obvious candidates for image recognition or speech recognition tasks (see PATTERN RECOGNITION AND NEURAL NETWORKS). In the traditional model of pattern recognition, a hand-designed feature extractor gathers relevant information from the input and eliminates irrelevant variabilities.

Journal Name: The handbook of brain theory and neural networks

Publication Date: 1995

Abstract We present an unsupervised method for learning a hierarchy of sparse feature detectors that are invariant to small shifts and distortions. The resulting feature extractor consists of multiple convolution filters, followed by a feature-pooling layer that computes the max of each filter output within adjacent windows, and a point-wise sigmoid non-linearity. A second level of larger and more invariant features is obtained by training the same algorithm on patches of features from the first level.

Publication Date: Jun 17, 2007
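
The feature extractor outlined in the abstract above (convolution filters, a max over adjacent windows, then a point-wise sigmoid) can be sketched in one dimension. The filter values, the window size, and the use of non-overlapping windows are assumptions; in the method described, the filters are learned without supervision.

```python
import math

# Sketch only: 1-D signals, a hand-set filter, non-overlapping windows.

def filter_valid(signal, kernel):
    """Slide the kernel over the signal without flipping it
    (cross-correlation, as is usual in convolutional networks)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def max_pool(values, window):
    """Max of each non-overlapping window of the filter output."""
    return [max(values[i:i + window])
            for i in range(0, len(values) - window + 1, window)]

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def extract(signal, kernels, window):
    """One feature map per filter: filter -> max-pool -> sigmoid."""
    return [[sigmoid(m) for m in max_pool(filter_valid(signal, k), window)]
            for k in kernels]

kernels = [[0.5, 0.5]]
signal_a = [0, 1, 0, 0, 0, 0, 0, 0]
signal_b = [0, 0, 1, 0, 0, 0, 0, 0]  # same pattern shifted by one sample

features_a = extract(signal_a, kernels, window=3)
features_b = extract(signal_b, kernels, window=3)
```

The two signals differ by a one-sample shift, yet they yield identical features because the shift stays inside one pooling window. This is the kind of invariance to small shifts the abstract refers to.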

We describe a system which can recognize digits and uppercase letters handprinted on a touch terminal. A character is input as a sequence of [x(t), y(t)] coordinates, subjected to very simple preprocessing, and then classified by a trainable neural network. The classifier is analogous to “time delay neural networks” previously applied to speech recognition. The network was trained on a set of 12,000 digits and uppercase letters, from approximately 250 different writers, and tested on 2500 such characters from other writers.

Journal Name: Pattern Recognition

Publication Date: Dec 31, 1991

Abstract The architecture, implementation, and applications of a special-purpose neural network processor are described. The chip performs over 2000 multiplications and additions simultaneously. Its data path is particularly suitable for the convolutional topologies that are typical in classification networks, but can also be configured for fully connected or feedback topologies. Resources can be multiplexed to permit implementation of networks with several hundreds of thousands of connections on a single chip.

Journal Name: Solid-State Circuits, IEEE Journal of

Publication Date: Dec 1991

Abstract One long-term goal of machine learning research is to produce methods that are applicable to highly complex tasks, such as perception (vision, audition), reasoning, intelligent control, and other artificially intelligent behaviors. We argue that in order to progress toward this goal, the Machine Learning community must endeavor to discover algorithms that can learn highly complex functions, with minimal need for prior knowledge, and with minimal human intervention.

Journal Name: Large-Scale Kernel Machines

Publication Date: 2007

Abstract Two novel methods for achieving handwritten digit recognition are described. The first method is based on a neural network chip that performs line thinning and feature extraction using local template matching. The second method is implemented on a digital signal processor and makes extensive use of constrained automatic learning. Experimental results obtained using isolated handwritten digits taken from postal zip codes, a rather difficult data set, are reported and discussed.

Journal Name: Communications Magazine, IEEE

Publication Date: Nov 1989

Abstract We assess the applicability of several popular learning methods for the problem of recognizing generic visual categories with invariance to pose, lighting, and surrounding clutter. A large dataset comprising stereo image pairs of 50 uniform-colored toys under 36 azimuths, 9 elevations, and 6 lighting conditions was collected (for a total of 194,400 individual images). The objects were 10 instances of 5 generic categories: four-legged animals, human figures, airplanes, trucks, and cars.

Publication Date: Jun 27, 2004
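
The dataset size quoted in the abstract above follows from its stated factors, as this short check shows. The only figure not spelled out in the abstract is that a stereo pair contributes two individual images.

```python
# Counts taken from the abstract; two images per stereo pair by definition.
instances_per_category = 10
categories = 5
azimuths = 36
elevations = 9
lighting_conditions = 6
images_per_stereo_pair = 2

toys = instances_per_category * categories
stereo_pairs = toys * azimuths * elevations * lighting_conditions
individual_images = stereo_pairs * images_per_stereo_pair
```

This gives 50 toys, 97,200 stereo pairs, and the 194,400 individual images reported.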

The convergence of back-propagation learning is analyzed so as to explain common phenomenon observed by practitioners. Many undesirable behaviors of backprop can be avoided with tricks that are rarely exposed in serious technical publications. This paper gives some of those tricks, and offers explanations of why they work. Many authors have suggested that second-order optimization methods are advantageous for neural net training. It is shown that most “classical” second-order methods are impractical for large neural networks.

Journal Name: Neural networks: Tricks of the trade

Publication Date: 1998

Abstract Signal processing and pattern recognition algorithms make extensive use of convolution. In many cases, computational accuracy is not as important as computational speed. In feature extraction, for instance, the features of interest in a signal are usually quite distorted. This form of noise justifies some level of quantization in order to achieve faster feature extraction.

Journal Name: Advances in Neural Information Processing Systems (NIPS 1999)

Publication Date: Jul 20, 1999

Abstract Discusses coding standards for still images and motion video. We first briefly discuss standards already in use, including: Group 3 and Group 4 for bilevel fax images; JPEG for still color images; and H.261, H.263, MPEG-1, and MPEG-2 for motion video. We then cover newly emerging standards such as JBIG1 and JBIG2 for bilevel fax images, JPEG-2000 for still color images, and H.263+ and MPEG-4 for motion video.

Journal Name: Circuits and Systems for Video Technology, IEEE Transactions on

Publication Date: Nov 1998

Abstract Several recently-proposed architectures for high-performance object recognition are composed of two main stages: a feature extraction stage that extracts locally-invariant feature vectors from regularly spaced image patches, and a somewhat generic supervised classifier.

Publication Date: Jun 20, 2009

Abstract Unsupervised learning algorithms aim to discover the structure hidden in the data, and to learn representations that are more suitable as input to a supervised machine than the raw input. Many unsupervised methods are based on reconstructing the input from the representation, while constraining the representation to have certain desirable properties (e.g., low dimension, sparsity, etc.). Others are based on approximating density by stochastically reconstructing the input from the representation.

Journal Name: Advances in neural information processing systems

Publication Date: 2007

Abstract We present a feed-forward network architecture for recognizing an unconstrained handwritten multi-digit string. This is an extension of previous work on recognizing isolated digits. In this architecture a single digit recognizer is replicated over the input. The output layer of the network is coupled to a Viterbi alignment module that chooses the best interpretation of the input. Training errors are propagated through the Viterbi module.

Journal Name: Advances in Neural Information Processing Systems (NIPS 1992)

Publication Date: 1993

Abstract In order to generalize from a training set to a test set, it is desirable that small changes in the input space of a pattern do not change the output components. This can be done by forcing this behavior as part of the training algorithm. This is done in double backpropagation by forming an energy function that is the sum of the normal energy term found in backpropagation and an additional term that is a function of the Jacobian.

Journal Name: Neural Networks, IEEE Transactions on

Publication Date: Nov 1992
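
The energy function described in the abstract above (the normal backpropagation term plus a term that depends on the Jacobian) can be sketched for a single linear unit. The specific penalty used here, half the squared norm of the error gradient with respect to the input, and the weighting factor `lam` are assumptions; the abstract says only that the extra term is a function of the Jacobian.

```python
# Sketch only: single linear unit, squared error, assumed penalty form.

def output(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def input_gradient(w, x, target):
    """Gradient of the normal energy 0.5 * (y - target)^2 w.r.t. the input."""
    err = output(w, x) - target
    return [err * wi for wi in w]

def double_backprop_energy(w, x, target, lam):
    """Normal energy plus lam times the penalty on input sensitivity."""
    err = output(w, x) - target
    normal = 0.5 * err ** 2
    grad = input_gradient(w, x, target)
    penalty = 0.5 * sum(g * g for g in grad)
    return normal + lam * penalty

# Hypothetical values.
w = [1.0, -2.0]
x = [3.0, 1.0]
total = double_backprop_energy(w, x, target=0.0, lam=0.1)
```

Minimizing the extra term pushes the model toward outputs that change little under small input perturbations, which is the behavior the abstract ties to better generalization.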

Abstract We describe a novel method for simultaneously detecting faces and estimating their pose in real time. The method employs a convolutional network to map images of faces to points on a low-dimensional manifold parametrized by pose, and images of non-faces to points far away from that manifold. Given an image, detecting a face and estimating its pose is viewed as minimizing an energy function with respect to the face/non-face binary variable and the continuous pose parameters.

Journal Name: The Journal of Machine Learning Research

Publication Date: May 1, 2007

Abstract An interesting property of connectionist systems is their ability to learn from examples. Although most recent work in the field concentrates on reducing learning times, the most important feature of a learning machine is its generalization performance. It is usually accepted that good generalization performance on real-world problems cannot be achieved unless some a priori knowledge about the task is built into the system.

Journal Name: Connectionism in perspective

Publication Date: Jun 1989

In pattern recognition, statistical modeling, or regression, the amount of data is a critical factor affecting the performance. If the amount of data and computational resources are unlimited, even trivial algorithms will converge to the optimal solution. However, in the practical case, given limited data and other resources, satisfactory performance requires sophisticated methods to regularize the problem by introducing a priori knowledge.

Journal Name: Neural networks: tricks of the trade

Publication Date: 1998

The ability of learning networks to generalize can be greatly enhanced by providing constraints from the task domain. This paper demonstrates how such constraints can be integrated into a backpropagation network through the architecture of the network. This approach has been successfully applied to the recognition of handwritten zip code digits provided by the US Postal Service. A single network learns the entire recognition operation, going from the normalized image of the character to the final classification.

Journal Name: Neural computation

Publication Date: Dec 1989

A method for measuring the capacity of learning machines is described. The method is based on fitting a theoretically derived function to empirical measurements of the maximal difference between the error rates on two separate data sets of varying sizes. Experimental measurements of the capacity of various types of linear classifiers are presented.

Journal Name: Neural Computation

Publication Date: Sep 1994

Publication Date: 1998

Publication Name: IEEE Transactions on Circuits and Systems for Video Technology

Research Interests: Unsupervised Learning, Feature Extraction, Bit Error Rate, Shift Invariant, Scale Invariant Feature Transform, and Compression Ratio

Research Interests: Handwritten Character Recognition using Neural Network

Research Interests: Neural Network, Spatial Representation, hidden Markov model, and EM algorithm

Research Interests: Image compression, Document Analysis, Real Time, and High Resolution

Research Interests: Computer Vision, System on Chip, and Silicon on Insulator

Research Interests: Pattern Recognition

Research Interests: Software Architecture, Wavelets, Segmentation, and Compression Ratio

Research Interests: Character Recognition, Image recognition, Optical Character Recognition, High Speed, Neural Net, and Hardware architecture

Research Interests: Cognitive Science, FPGA, and Neurosciences

Research Interests: Mechanical Engineering, Long Range, Field Robotics, Field, and Electrical And Electronic Engineering

Research Interests: Low Energy Buildings, Boolean Satisfiability, and Taguchi Loss Function

Research Interests: Neural Network, Object Recognition, and Shape Recognition

Research Interests: Neural

Research Interests: Neural Network, Systems, Optical Character Recognition, Digital Signal Processor, and Neural Net

Research Interests: Cognitive Science, Neural Net, and Electrical And Electronic Engineering

Research Interests: Graph Transformation

Research Interests: Signal Processing, Fundamental Frequency, Real Time, Front end, and Pitch Tracking

Research Interests: Computer Vision, Image Processing, Machine Learning, Image Classification, Human Performance, and New record

Research Interests: Field-Programmable Gate Arrays, Face Detection, Low Power, Field Programmable Gate Array, Programmable Logic, and Feed-Forward

Research Interests: Collaborative Filtering, Computational Efficiency, Domain Knowledge, and Matrix factorization

Research Interests: Cognitive Science, Pattern Recognition, Mixture of Gaussians, Electrical And Electronic Engineering, Feature Space, and Generic model

Research Interests: Mechanical Engineering, Field Robotics, Field, Electrical And Electronic Engineering, and Robot Navigation

Research Interests: Online Learning, Supervised Learning, Field Experiment, Long Range, and Obstacle Detection
