The paper describes a deep neural network-based detector dedicated to ball and player detection in high-resolution, long-shot video recordings of soccer matches. The detector, dubbed FootAndBall, has an efficient fully convolutional architecture and can operate on an input video stream of arbitrary resolution. It produces a ball confidence map encoding the position of the detected ball, a player confidence map, and a player bounding box tensor encoding the players' positions and bounding boxes. The network uses the Feature Pyramid Network design pattern, where lower-level features with higher spatial resolution are combined with higher-level features with a larger receptive field. This improves the discriminability of small objects (the ball), as a larger visual context around the object of interest is taken into account for classification. Due to its specialized design, the network has two orders of magnitude fewer parameters than a generic deep neural network-based object detector, such as ...
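The Feature Pyramid Network merge step mentioned in the abstract above can be illustrated with a minimal NumPy sketch. This is a generic top-down merge (1x1 lateral projection, 2x nearest-neighbour upsampling, element-wise sum), not the FootAndBall implementation; the function names, channel counts, and weight matrices are assumptions made for illustration.

```python
import numpy as np

def lateral_1x1(features, weights):
    """Apply a 1x1 convolution: features (C_in, H, W), weights (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weights, features)

def upsample_nearest_2x(features):
    """Double the spatial resolution by repeating each pixel along H and W."""
    return features.repeat(2, axis=1).repeat(2, axis=2)

def fpn_merge(fine, coarse, w_fine, w_coarse):
    """Combine a high-resolution map with a coarser, more contextual one.

    fine:   (C_fine, 2H, 2W) lower-level features, small receptive field
    coarse: (C_coarse, H, W) higher-level features, large receptive field
    Both are projected to the same channel count before being summed.
    """
    top_down = upsample_nearest_2x(lateral_1x1(coarse, w_coarse))
    return lateral_1x1(fine, w_fine) + top_down

# Hypothetical shapes: 16-channel fine map, 32-channel coarse map, 8 output channels.
rng = np.random.default_rng(0)
merged = fpn_merge(
    rng.normal(size=(16, 8, 8)),
    rng.normal(size=(32, 4, 4)),
    rng.normal(size=(8, 16)),
    rng.normal(size=(8, 32)),
)
```

The merged map keeps the spatial resolution of the fine input, while every location also carries context from the coarser level, which is the property the abstract credits for better detection of small objects.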
2017 Signal Processing: Algorithms, Architectures, Arrangements, and Applications (SPA)
In this paper we propose a novel, efficient method of characteristic image point detection based on the fractional-order derivative. The approach, called FSIFT (Fractional-SIFT), is inspired by the Scale-Invariant Feature Transform (SIFT) proposed by Lowe and can be viewed as a certain generalization of it. The classical SIFT detector is implemented efficiently by applying a difference of Gaussians (DoG) to the image in order to identify potential interest points. This difference is an approximation of the Laplacian of Gaussian (LoG) operator, which can be treated as the sum of the second-order derivatives of the Gaussian image. In our method we take advantage of the fractional-order derivative computed on the Gaussian images. In order to extract distinctive invariant features, we omit the step of calculating DoG images and instead apply fractional derivatives of different orders close to two. We have chosen a robust and efficient method of calculating the fractional-order derivative in the Fourier domain. Many practical experiments have been performed. The promising results of our approach have been compared with the results of well-known algorithms such as SURF and SIFT.
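As a rough illustration of computing a fractional-order derivative in the Fourier domain, the sketch below multiplies the spectrum of a periodic, uniformly sampled signal by (i omega)^alpha. It is a generic one-axis sketch under those assumptions, not the FSIFT code; the function name and parameters are hypothetical.

```python
import numpy as np

def fractional_derivative(signal, alpha, spacing=1.0, axis=-1):
    """Order-alpha derivative along one axis via the FFT.

    Assumes the signal is periodic along `axis` and sampled every `spacing`.
    The multiplier (i*omega)**alpha is written as magnitude and phase so that
    negative frequencies get the complex-conjugate factor and the result of a
    real input stays real.
    """
    signal = np.asarray(signal, dtype=float)
    n = signal.shape[axis]
    omega = 2.0 * np.pi * np.fft.fftfreq(n, d=spacing)
    multiplier = np.abs(omega) ** alpha * np.exp(
        1j * alpha * np.pi / 2.0 * np.sign(omega)
    )
    shape = [1] * signal.ndim
    shape[axis] = n  # broadcast the multiplier along the chosen axis
    spectrum = np.fft.fft(signal, axis=axis)
    return np.real(np.fft.ifft(spectrum * multiplier.reshape(shape), axis=axis))

# Orders slightly below and above two, as the abstract suggests.
x = np.linspace(0.0, 2.0 * np.pi, 256, endpoint=False)
dx = x[1] - x[0]
wave = np.sin(3.0 * x)
responses = {a: fractional_derivative(wave, a, dx) for a in (1.8, 2.0, 2.2)}
```

For alpha = 2 this reduces to the ordinary second derivative, so orders near two behave like smoothly varying relatives of the Laplacian-type response that DoG approximates.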
2019 16th International Conference on Machine Vision Applications (MVA)
The paper describes a deep network-based system specialized for ball detection in long-shot videos. The system comprises a flexible detector and classical particle tracking. The core contribution is the incorporation of the hypercolumn concept into the processing pipeline, achieving real-time tracking on 12 MPx videos. The system achieves state-of-the-art results on the ISSIA-CNR Soccer Dataset, and its feasibility has been tested on a four-camera prototype system.
Numerous computer vision applications rely on local feature descriptors, such as SIFT, SURF or FREAK, for image matching. Although their local character makes image matching more robust to occlusions, it often leads to geometrically inconsistent keypoint matches that need to be filtered out, e.g. using RANSAC. In this paper we propose a novel, more discriminative descriptor that includes not only a local feature representation, but also information about the geometric layout of neighbouring keypoints. To that end, we use a Siamese architecture that learns a low-dimensional feature embedding of the keypoint constellation by maximizing the distances between non-corresponding pairs of matched image patches, while minimizing them for correct matches. The 48-dimensional floating-point descriptor that we train, built on top of the state-of-the-art FREAK descriptor, achieves a significant performance improvement over its competitors on the challenging TUM dataset.
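The training objective described above (pull matching pairs together, push non-matching pairs apart) is commonly expressed as a margin-based contrastive loss. The abstract does not name the exact loss, so the sketch below is an assumption: a standard contrastive loss over pairs of embeddings, with a hypothetical function name and margin value.

```python
import numpy as np

def contrastive_loss(emb_a, emb_b, is_match, margin=1.0):
    """Margin-based contrastive loss over a batch of embedding pairs.

    emb_a, emb_b: (N, D) embeddings produced by the two Siamese branches
    is_match:     (N,) array with 1 for corresponding pairs, 0 otherwise
    Matching pairs are penalised by their squared distance; non-matching
    pairs are penalised only while they are closer than `margin`.
    """
    emb_a = np.asarray(emb_a, dtype=float)
    emb_b = np.asarray(emb_b, dtype=float)
    is_match = np.asarray(is_match, dtype=float)
    dist = np.linalg.norm(emb_a - emb_b, axis=1)
    pull = is_match * dist ** 2
    push = (1.0 - is_match) * np.maximum(0.0, margin - dist) ** 2
    return 0.5 * np.mean(pull + push)

# Two toy pairs in a 2-D embedding space (the paper's descriptor is 48-D).
loss = contrastive_loss(
    [[0.0, 0.0], [0.0, 0.0]],
    [[3.0, 4.0], [0.6, 0.0]],
    [1, 0],
)
```

Non-matching pairs that are already farther apart than the margin contribute nothing, so the embedding is only pushed where pairs are still confusable.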
This paper presents a framework designed for multi-object detection and adapted to product search on market shelves. The framework uses a single feedback loop and a pattern resizing mechanism to demonstrate the effectiveness of state-of-the-art local features. A high detection rate with a low false detection rate can be achieved using only one pattern per object and no manual parameter adjustments. The method combines well-known local features and a basic matching process to create a reliable voting space. Further steps comprise metric transformations, a graphical vote space representation, a two-phase vote aggregation process and a cascade of verifying filters.
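The idea of a voting space built from keypoint matches can be sketched as a simple Hough-style accumulator: each match between a pattern keypoint and a scene keypoint votes for where the pattern centre would lie in the scene. This is a generic single-pass sketch assuming pure translation plus a known scale, not the two-phase aggregation of the paper; all names and the cell size are hypothetical.

```python
import numpy as np

def cast_center_votes(pattern_pts, scene_pts, pattern_center, scale=1.0):
    """One vote per match: where the pattern centre would be in the scene.

    pattern_pts, scene_pts: (N, 2) arrays of matched (x, y) keypoints.
    """
    pattern_pts = np.asarray(pattern_pts, dtype=float)
    scene_pts = np.asarray(scene_pts, dtype=float)
    offsets = np.asarray(pattern_center, dtype=float) - pattern_pts
    return scene_pts + scale * offsets

def strongest_cell(votes, image_shape, cell=16):
    """Accumulate votes on a grid and return the best (row, col) and its count."""
    height, width = image_shape
    inside = (
        (votes[:, 0] >= 0) & (votes[:, 0] < width)
        & (votes[:, 1] >= 0) & (votes[:, 1] < height)
    )
    cols = (votes[inside, 0] // cell).astype(int)
    rows = (votes[inside, 1] // cell).astype(int)
    # Ceiling division so partially covered border cells are included.
    accumulator = np.zeros((-(-height // cell), -(-width // cell)), dtype=int)
    np.add.at(accumulator, (rows, cols), 1)
    row, col = np.unravel_index(np.argmax(accumulator), accumulator.shape)
    return (int(row), int(col)), int(accumulator[row, col])

# Four consistent matches (pattern shifted by (100, 60)) and one outlier.
pattern = [[0, 0], [10, 0], [0, 10], [10, 10], [0, 0]]
scene = [[100, 60], [110, 60], [100, 70], [110, 70], [300, 200]]
votes = cast_center_votes(pattern, scene, pattern_center=(5, 5))
best_cell, count = strongest_cell(votes, image_shape=(240, 320))
```

Geometrically consistent matches land in the same cell, while the outlier votes elsewhere, so a peak in the accumulator indicates a likely object position.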
The paper presents simple graph features based on well-known image keypoints. We discuss the extraction method and the geometrical properties that can be used. Selected methods are tested in KNN tasks for almost 1000 object classes. The approach addresses problems in applications that cannot use learning methods explicitly, such as real-time tracking, selected object detection scenarios and structure from motion. The results suggest that the idea is worth further research for selected systems.
Optics, Photonics and Digital Technologies for Imaging Applications IV, 2016
This paper presents a system designed for multi-object detection and adapted to product search on market shelves. The system uses well-known binary keypoint detection algorithms to find characteristic points in the image. One of the main ideas is object recognition based on the Implicit Shape Model method. The authors propose many improvements to the algorithm. Originally, fiducial points are matched with a very simple function. This limits the number of object parts that can be successfully separated, while various classification methods may be validated in order to achieve higher performance. Such an extension implies research on a training procedure able to deal with many object categories. The proposed solution opens new possibilities for many algorithms that demand fast and robust multi-object recognition.