ABSTRACT Enriching linear videos by offering continuative and related information via, e.g., audio streams, web pages, as well as other videos, is typically hampered by its demand for massive editorial work. While a large number of analysis techniques that extract knowledge automatically from video content exist, the raw data they produce are typically not of interest to the end user. In this paper, we review our analysis efforts as defined within the LinkedTV project and present the recent advances in core technologies for automatic speech recognition and object re-detection. Furthermore, we introduce our approach for an automatically generated localized person identification database. Finally, the processing of the raw data into a linked resource available in a web-compliant format is described.
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Jun 23, 1998
2002 11th European Signal Processing Conference, 2002
Image similarity is a key issue for many multimedia applications. Video summarization is no exception. We have recently proposed a number of methodologies for creating visually significant summaries of videos. Our approach relies heavily on the metric which decides whether two video key-frames are similar or not. In this paper, we compare a number of histogram representations and possible ...
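As an illustration of a histogram-based key-frame similarity test, here is a minimal sketch. The abstract is truncated and does not say which histogram representations or metrics were compared, so everything below is an assumption chosen for illustration: the grayscale histogram, the histogram-intersection measure, the function names, the bin count and the threshold.

```python
def gray_histogram(pixels, bins=8, max_value=256):
    """Normalized histogram of integer gray values in [0, max_value)."""
    if not pixels:
        raise ValueError("a key-frame needs at least one pixel")
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    total = len(pixels)
    return [c / total for c in counts]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means the normalized histograms are identical."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def are_similar(frame_a, frame_b, threshold=0.8):
    """Decide whether two key-frames (flat lists of gray values) are similar.

    The threshold is a hypothetical tuning parameter, not a value from the paper.
    """
    score = histogram_intersection(gray_histogram(frame_a), gray_histogram(frame_b))
    return score >= threshold
```

Histogram intersection ignores where pixels sit in the frame, so two frames with the same gray-level distribution but different layouts score as identical. That limitation is one reason to compare several representations.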
ABSTRACT For the fourth year, we participated in the high-level feature extraction task and pursued our effort on the fusion of classifier outputs. Unfortunately, only a single run was submitted for evaluation this year, due to a lack of computational resources during the limited time available for training and tuning the entire system. This year's run is based on an SVM classification scheme. Localised color and texture features were extracted from shot key-frames. Then, SVM classifiers were built per concept on the training data set. The fusion of classifier outputs is finally provided by a multilayer neural network. In BBC rushes exploitation, we explore the description of rushes through a visual dictionary. A set of non-redundant images is segmented into blocks. These blocks are clustered into a small number of classes to create a visual dictionary. We can then describe each image by the number of blocks of each class. Next, we evaluate the power of this visual dictionary for retrieving images from rushes: if we use one or more blocks from an image as a query, are we able to retrieve the original image, and at which position in the result list? Finally, we organize and present video using this visual dictionary.
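The visual-dictionary idea described above (segment images into blocks, cluster the blocks into a few classes, describe each image by its per-class block counts) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract names neither the block features nor the clustering algorithm, so raw gray values, plain k-means with farthest-point seeding, and all function names here are hypothetical choices.

```python
def split_into_blocks(image, block_size):
    """Cut a 2-D list of gray values into non-overlapping square blocks (flattened)."""
    blocks = []
    for r in range(0, len(image) - block_size + 1, block_size):
        for c in range(0, len(image[0]) - block_size + 1, block_size):
            blocks.append([image[r + i][c + j]
                           for i in range(block_size)
                           for j in range(block_size)])
    return blocks


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(block, centroids):
    """Index of the dictionary class whose centroid is closest to the block."""
    return min(range(len(centroids)),
               key=lambda k: squared_distance(block, centroids[k]))


def build_dictionary(blocks, k, iterations=10):
    """Cluster blocks into k classes with plain k-means (assumed algorithm)."""
    if not 1 <= k <= len(blocks):
        raise ValueError("k must be between 1 and the number of blocks")
    # Deterministic farthest-point seeding: start from the first block, then
    # repeatedly add the block farthest from all centroids chosen so far.
    centroids = [list(blocks[0])]
    while len(centroids) < k:
        farthest = max(blocks, key=lambda b: min(squared_distance(b, c)
                                                 for c in centroids))
        centroids.append(list(farthest))
    for _ in range(iterations):
        members = [[] for _ in range(k)]
        for b in blocks:
            members[nearest(b, centroids)].append(b)
        for idx, group in enumerate(members):
            if group:  # an empty class keeps its previous centroid
                centroids[idx] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def describe(image, centroids, block_size):
    """Describe an image by the number of its blocks falling in each class."""
    counts = [0] * len(centroids)
    for b in split_into_blocks(image, block_size):
        counts[nearest(b, centroids)] += 1
    return counts
```

Retrieval with a block query could then rank images by how many of their blocks fall in the query block's class, which is one possible reading of the evaluation the abstract describes.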
... Department, Institut EURECOM, BP 193, 06904 Sophia-Antipolis, FRANCE {Itheri.Yahiaoui, Bernard.Merialdo, Benoit.Huet, Fabrice.Souvannavong ... Here, we generalize the Maximum Recollection Principle we were employing and we show how this same principle can be used ...
Multimedia indexing is about developing techniques that allow people to find media effectively. Content-based methods become necessary when dealing with large databases. Current technology allows exploring the emotional space, which is known to carry very interesting semantic information. In this paper we state the need for an integrated method which extracts reliable affective information and attaches this semantic information to the medium itself. We present a list of possible applications and advantages that emotional information can bring, together with a framework called SAMMI and the preliminary results of this newly initiated research work.