About

Location

76 (CAR) 2214, 54 Lombard Memorial Dr., Rochester, NY 14623

Ashish Gupta

Case Western Reserve University, Biomedical Engineering, Post-Doc

Other Affiliations:
Indian Institute of Technology, Kanpur, Electrical Engineering, AlumnusOhio State University, Dept. of Civil, Env. and Geodetic Engineering, Post-Doc, and 4 more
Rochester Institute of Technology, Chester F. Carlson Center for Imaging Science, Post-DocUniversité de Caen Normandie, Department of Informatics, Post-DocUniversity of Mumbai, Instrumentation Engineering, AlumnusUniversity of Surrey, Electronics Engineering, Alumnus
add
Research Interests:
Machine Learning, Image Processing, Computer Vision, Signal Processing, Pattern Recognition, Computer Science, and 8 moreArtificial Intelligence, Human Computer Interaction, Image Analysis, Statistical machine learning, Robotics, Generative AI, Large language models, and Deep Learningedit
About:
I am a multi-disciplinary scientist and software developer, researching the technology domains of machine and deep le... moreI am a multi-disciplinary scientist and software developer, researching the technology domains of machine and deep learning, computer vision, data science, signal and image processing.
I am a professional member of the IEEE, ACM, AAAI and review papers for the journals CVIU, IEEE-TIP, PE&RS and several technical conferences.
My driving vision is designing thinking machines that function symbiotically with high dimensional and dynamic data in complex systems.
I wear several hats including scientist, engineer, teacher, and entrepreneur. Besides co-founding Ubihere, an intelligent geo-spatial localization solutions company, I serve as technical adviser and collaborator on multiple technology ventures. In academia, I mentor graduate students and also collaborate on research grant proposals to funding organizations including NSF, NGA, NASA, NOAA, and DoD.
My research focus at present is development of Deep Learning methods, especially Adversarial Networks towards media forensic analysis on a DARPA funded project. I have ongoing collaborations in the indoor navigation and automated driving technology space.
I look forward to opportunities to work with technologists in academia, industry or start-up space.edit
Supervisors:
Alper Yilmaz, Frederic Jurie, Josef Kittler, K S Venkatesh, Christopher Kanan, Chistye Sisson, Pingbo Tang, Marc Kozakedit

Papers

Publication Date: 2012

Publication Name: 2012 19th IEEE International Conference on Image Processing

Research Interests:
Mathematics, Computer Science, Artificial Intelligence, Computer Vision, Image Processing, and 6 moreFuzzy Logic, Pattern Recognition, Fuzzy Clustering, Image Classification, Cluster Analysis, and Codebook

Download (.pdf)

Publication Date: 2011

Research Interests:
Mathematics, Computer Science, Artificial Intelligence, Visualization, Computational Modeling, and 11 morePrincipal Component Analysis, Machine Vision, Entropy, Dimensionality Reduction, Measurement, Vectors, Euclidean Distance, Histograms, Curse of Dimensionality, Rényi Entropy, and Feature vector

Download (.pdf)

Publisher: Elsevier BV

Publication Date: 2017

Publication Name: Automation in Construction

Research Interests:
Engineering, Automation and robotics in Construction, and Construction Automation

Download (.pdf)

Digital Image Processing is an emerging field finding applications in different domains of science and engineering. Image processing forms the basis for pattern recognition and machine learning technologies. Color Image Processing deals with color spaces and models for performing operations on an image of one type and transform it to another model for efficient analysis and feature manipulation. In this paper, a new image enhancement method is established for better visual perception and improving image quality. Histogram of an image is useful in determining the contents in the image and its distribution. Histogram equalization is the technique used to improve the dynamic range of an image and help in distribution of parameter wrights over the entire domain range rather than it getting concentrated over specific regions. This paper proposes an effective approach to enhance image quality by histogram equalization without compromising on mean brightness aspect of the image.

Publication Date: 2017

Download (.pdf)

According to the embodiments provided herein, a trajectory determination device for geo-localization can include one or more relative position sensors, one or more processors, and memory. The one or more processors can execute machine readable instructions to receive the relative position signals from the one or more relative position sensors. The relative position signals can be transformed into a sequence of relative trajectories. Each of the relative trajectories can include a distance and directional information indicative of a change in orientation of the trajectory determination device. A progressive topology can be created based upon the sequence of relative trajectories; this progressive topology can be compared to map data. A geolocation of the trajectory determination device can be determined.

DOI: 10101466B2

Publication Date: 2018

Publication Name: US Patent

Research Interests:
SLAM, Autonomous navigation, GNSS applications, Indoor Localization, and Geographic Information Systems (GIS)

Download (.pdf)

A B S T R A C T Nuclear power plant (NPP) outages are challenging construction projects. Delays in NPP outage processes can cause significant economic losses. In typical NPP outages, extremely busy outage schedules, large crew sizes, dynamic workspaces and zero tolerance of accidents pose challenges to ensuring the resilience of outage control, which should rapidly and proactively respond to delays, errors, or unexpected tasks added during outages. Two mutually interacting practical problems impede NPPs from achieving such resilient outage control: 1) how to control errors and wastes effectively during the " handoffs " between tasks, and 2) how to respond to numerous contingencies in NPP outage workflows in a responsive and proactive manner. A resilient NPP outage control system should address these two practical problems through " Human-Centered Automation (HCA), " which is improving the control process automation while fully considering human factors. Previous studies examined two categories of technologies that potentially enable HCA in outage control: 1) computational modeling and simulation methods for predicting states of field operations and workflows; 2) data collection and processing methods for capturing the reality and thus providing feedback to computational models. Unfortunately, limited studies systematically synthesize technological challenges related to these practical problems and underlying HCA principles. This paper identifies the domain requirements, challenges, and potential solutions of achieving the HCA system that effectively supports resilient NPP outage control. This proposed system aims at significantly improving the performance of handoff monitoring/control and responding to contingencies during the outage. Firstly, the authors identified information acquisition and modeling challenges of achieving human-center automation for outage control. The rest of the paper then synthesizes potential techniques available in the domains of computer science, cognitive science, system science, and construction engineering that can potentially address these challenges. The authors concluded this literature and technological review with a research roadmap for achieving HCA in construction.

Download (.pdf)

Nuclear power plant (NPP) outages involve maintenance and repair activities of a large number of workers in limited workspaces, while having tight schedules and zero-tolerance for accidents. During an outage, thousands of workers will be working around the NPP. Extremely high outage costs and expensive delays in maintenance projects (around $1.5 million per day) require tight outage schedules (typically 20 days). In such packed workspaces, real-time human behavior monitoring is critical for ensuring safe collaboration among workers, minimal wastes of time and resources due to the lack of situational awareness, and timely project control. Current methods for detailed human behavior monitoring on construction sites rely on manual imagery data collection and analysis, which is tedious and error-prone. This paper presents a framework of automatic imagery data analysis that enables real-time detection and diagnosis of anomalous human behaviors during outages, through the integration of 4D construction simulation and object tracking algorithms.

Download (.pdf)

Images have become the most popular type of mul-timedia in the Big Data era. Widely used applications like automatic CBIR underscore the importance of image understanding, especially in terms of the semantically meaningful information. Typically, high dimensional image descriptors are embedded to a subspace using a simple linear projection. However, semantic information has a complex distribution in feature space that requires a non-linear projection. We first estimate an intrinsic dimensionality of image data. Next we build a measure of visual information in embedded subspace. We compare several linear and non-linear projection methods. We use multiple image databases towards a comprehensive evaluation. We report results in terms of information content, consequent recognition rates, and computational cost. This paper is relevant for researchers interested in dimensionality reduction for large scale image understanding that is both quick and preserves semantically relevant information.

DOI: 10.1109/BigMM.2017.75

Location: Los Angeles, CA

More Info: A. Gupta and A. Yilmaz, "Subspace Projection Methods for Large Scale Image Data Analysis," 2017 IEEE Third International Conference on Multimedia Big Data (BigMM), Laguna Hills, CA, 2017, pp. 260-267.

Publisher: IEEE

Event Date: Apr 14, 2017

Journal Name: Proceedings of IEEE BigMM

Publication Date: 2017

Publication Name: Proceedings of IEEE BigMM 2017

Research Interests:
Computer Vision, Machine Learning, Object Recognition (Computer Vision), Dimensionality Reduction, Big Data, and Subspace Methods

Download (.pdf)

Rapidly growing technologies like autonomous navigation require accurate geo-localization in both outdoor and indoor environments. GNSS based outdoor localization has limitation of accuracy, which deteriorates in urban canyons, forested region and is unavailable indoors. Technologies like RFID, UWB, WiFi are used for indoor localization. These suffer limitations of high infrastructure costs, and signal transmission issues like multi-path, and frequent replacement of transciever batteries. We propose an alternative to localize an individual or a vehcile that is moving inside or outside a building. Instead of mobile RF transceivers, we utilize a sensor suite that includes a video camera and an inertial measurement unit. We estimate a motion trajectory of this sensor suite using Visual Odometery. Instead of pre-installed transceivers, we use GIS map for outdoors, or a BIM model for indoors. The transport layer in GIS map or navigable paths in BIM are abstracted as a graph structure. The geo-location of the mobile platform is inferred by first localizing its trajectory. We introduce an adaptive probabilistic inference approach to search for this trajectory in the entire map with no initialization information. Using an effective graph traversal spawn-and-prune strategy, we can localize the mobile platform in real-time. In comparison to other technologies, our approach requires economical sensors and the required map data is typically available in the public domain. Additionally, unlike other technologies which function exclusively indoors or outdoors, our approach functions in both environments. We demonstrate our approach on real world examples of both indoor and outdoor locations.

Location: San Francisco, CA

Publisher: ACM Press

Event Date: Oct 31, 2016

Organization: ACM

Publication Date: Nov 2016

Publication Name: Proceedings of Indoor Spatial Awareness 2016

Research Interests:
Geo-spatial analysis with GIS and GPS, Digital mapping, Geospatial Analysis, Visual SLAM for mobile robots, and Visual Odometry

Download (.pdf)

Indoor positioning system is a rapidly emerging technology. Unlike outdoor positioning, which uses triangulation from satellites in line-of-sight, current indoor positioning methods attempt triangulation using Received Signal Strength Indicator (RSSI) from indoor transmitters, like WiFi and RFID. These methods, however, are not accurate and suffer from issues like multi-path and absorption by walls and other objects. In this paper we propose an alternate and novel approach to indoor positioning, that combines signals from multiple sensors. In particular, we focus on visual and inertial sensors that are ubiquitously found in mobile devices. We utilize a Building Information Model (BIM) of the indoor environment as a guideline for navigable paths. The sensor suite signals are processed to generate a trajectory of device moving through the indoor environment. We compute features on this trajectory in real-time and data mine pre-computed features on BIM's navigable paths to determine the location of the device in real-time. We demonstrate our approach on BIM in our university campus. The key benefit of our approach is that unlike previous methods that require installation of a wireless sensor network of several transmitters spanning the indoor environment, we only require a floor-plan BIM and cheap ubiquitous sensor suite on board a mobile device for indoor positioning.

Location: Orlando FL

Publisher: IEEE

Event Date: Oct 31, 2016

Journal Name: Proceedings of IEEE SENSORS 2016

Publication Date: Nov 2016

Research Interests:
Geo-spatial analysis with GIS and GPS, Graph/Network Algorithms, Indoor Localization, and Visual odometry, SLAM, obstacle detection

Download (.pdf)

The primary method for geo-localization is based on GPS which has issues of localization accuracy, power consumption, and unavailability. This paper proposes a novel approach to geo-localization in a GPS-denied environment for a mobile platform. Our approach has two principal components: public domain transport network data available in GIS databases or OpenStreetMap; and a trajectory of a mobile platform. This trajectory is estimated using visual odometry and 3D view geometry. The transport map information is abstracted as a graph data structure, where various types of roads are modeled as graph edges and typically intersections are modeled as graph nodes. A search for the trajectory in real time in the graph yields the geo-location of the mobile platform. Our approach uses a simple visual sensor and it has a low memory and computational footprint. In this paper, we demonstrate our method for trajectory estimation and provide examples of geolocalization using public-domain map data. With the rapid proliferation of visual sensors as part of automated driving technology and continuous growth in public domain map data, our approach has the potential to completely augment, or even supplant, GPS based navigation since it functions in all environments.

Location: Prague, Czech Republic

Event Date: Jul 17, 2016

Organization: ISPRS

Publication Date: Jul 2016

Publication Name: Annals of ISPRS

Research Interests:
Geo-spatial analysis with GIS and GPS, Geolocation, Visual SLAM for mobile robots, and Visual odometry, SLAM, obstacle detection

Download (.pdf)

This paper presents a novel adaptation of fuzzy clustering and feature encoding for image classification. Visual word ambiguity has recently been successfully modelled by kernel codebooks to provide improvement in classification performance over the standard `Bag-of-Features'(BoF) approach, which uses hard partitioning and crisp logic for assignment of features to visual words. Motivated by this progress we utilize fuzzy logic to model the ambiguity and combine it with clustering to discover fuzzy visual words. The feature descriptors of an image are encoded using the learned fuzzy membership function associated with each word. The codebook built using this fuzzy encoding technique is demonstrated to provide superior performance over BoF. We use the Gustafson-Kessel algorithm which is an improvement over Fuzzy C-Means clustering and can adapt to local distributions. We evaluate our approach on several popular datasets and demonstrate that it consistently provides superior performance to the BoF approach.

Location: Orlando, FL

Publisher: IEEE

Conference End Date: Oct 3, 2012

Conference Start Date: Sep 30, 2012

Research Interests:
Computer Vision, Image Processing, and Pattern Recognition

Download (.pdf)

"Visual category recognition is a difficult task of significant interest to the machine learning and vision community. One of the principal hurdles is the high dimensional feature space. This paper evaluates several linear and non-linear dimensionality reduction techniques. A novel evaluation metric, the renyi entropy of the inter-vector Euclidean distance distribution, is introduced. This information theoretic measure judges the techniques on their preservation of structure in lower-dimensional sub-space. The popular dataset, Caltech-101 is utilized in the experiments. The results indicate that the techniques which preserve local neighbourhood structure performed best amongst the techniques evaluated in this paper.
"

Location: Barcelona, Spain

More Info: Co-authored with Prof.Richard Bowden, published in proceedings of European Signal Processing Conference, 2011.

Publisher: EURASIP

Research Interests:
Computer Vision and Machine Learning

Download (.pdf)

This paper presents a novel approach to learning a codebook for visual categorization, that resolves the key issue of intra-category appearance variation found in complex real world datasets. The codebook of visual-topics (semantically equivalent descriptors) is made by grouping visual-words (syntactically equivalent descriptors) that are scattered in feature space. We analyze the joint distribution of images and visual-words using information theoretic co-clustering to discover visual-topics. Our approach is compared with the standard `Bag-of-Words' approach. The statistically significant performance improvement in all the datasets utilized (Pascal VOC 2006; VOC 2007; VOC 2010; Scene-15) establishes the efficacy of our approach.

Location: Rome, Italy

More Info: editor : Csurka, Gabriela and Braz, José

Publisher: SciTePress

Publication Date: May 25, 2012

Publication Name: VISAPP 2012

Research Interests:
Computer Vision, Image Processing, and Pattern Recognition

Download (.pdf)

This paper presents a novel approach to learning a visual dictionary from sub-manifolds, using co-clustering, where each sub-manifold is associated with a semantically relevant part of a visual category. The standard dictionary learning technique, called `Bag-of-Features' is limited by problems of high-dimensionality, sparsity, and noise associated with affine invariant feature descriptors. Our approach draws inspiration from the relation between object part-based models; semantic topic models; non-negative matrix factorization of multivariate data; and sub-spaces in feature space, to resolve these issues in learning a dictionary. We use co-clustering, which performs simultaneous clustering and dimensionality reduction in an optimal way, to discover multiple semantically relevant sub-spaces. We use an information-theoretic and Euclidean divergence based co-clustering. Our approach is comprehensively evaluated on several popular datasets. This work constitutes a principled first step towards a semantically
meaningful dictionary, with regards to correspondence between object parts and multiple sub-manifolds, and is not intended to compete with state-of-the-art methods like sparse coding. It is specially pertinent for the future for learning a dictionary with increasing complexity of visual categories.

Location: Edinburgh, Scotland

Journal Name: Proceedings of ICML 2012

Publication Date: Jun 30, 2012

Publication Name: Sparsity, Dictionaries and Projections in Machine Learning and Signal Processing, ICML Workshop

Research Interests:
Computer Vision and Machine Learning

Download (.pdf)

Books

[Draft version: to appear as chapter in Encyclopedia of GIS 2nd Ed., Springer.]
Using Computer Vision towards geospatial localization in GPS-denied or degraded environment.

Research Interests:
Computer Vision, Geospatial Analysis, and Geographic Information Systems (GIS)

Download (.pdf)

In this thesis, realistic looking isolated character images indistinguishable from a writer’s individualistic writing were pseudo randomly generated by using a statistical model which learns that writer’s characteristic handwriting style. Hitherto research focus had been on modeling the human writing process or analyzing dynamic handwriting data, neither of which are viable approaches for widespread application in the growing human computer interaction technology. A writer specific statistical model of the most influential handwriting features was trained from multiple samples of each letter written by that writer from an optimal handwriting sampling text passage. Each sample letter was analyzed and new letters synthesized as a sequence of connected sub-strokes, using control-point extraction and clustering by correspondence search of multiple samples, followed by stroke curve synthesis using spline interpolation functions.

Several significant algorithms were tested for each stage of the synthesis procedure to find the techniques optimal for static handwriting data: entropy based threshold for character image extraction; Kuwahara filter for de-noising; Zhang-Suen algorithm for skeletonization; distance transform for control-point selection; shape-context descriptor for control-point correspondence search; thin plate splines for control-point transformation; and interpolating splines for generating stroke curves. Empirical results indicate that the novel combination of handwriting specific algorithms in this thesis can generate realistic synthetic handwriting in a given writer’s unique style, of satisfactory quality.

Location: Kanpur, India

More Info: Gupta, A. (2006). Writer Dependent Handwriting Synthesis. Indian Institute of Technology Kanpur, India.

Organization: Indian Institute of Technology, Kanpur

Page Numbers: 142

Publication Date: Jul 15, 2006

Research Interests:
Computer Vision, Image Processing, Applied Statistics, Image Analysis, Handwriting Recognition (Computer Vision), and 3 moreShape Analysis (Computer Science), Document Image Analysis, and Morphological Image Processing

Download (.pdf)

This thesis deals with the problem of estimating structure in data due to the semantic relations between data elements and leveraging this information to learn a visual model for category recognition. A visual model consists of dictionary learning, which computes a succinct representation of training data by partitioning feature space, and feature encoding, which learns a representation of each image as a combination of dictionary elements. Besides variations in lighting and pose, a key challenge of classifying a category is intra-category appearance variation. The key idea in this thesis is that feature data describing a category has latent structure due to visual content idiomatic to a category. However, popular algorithms in literature disregard this structure when computing a visual model.

Towards incorporating this structure in the learning algorithms, this thesis analyses two facets of feature data to discover relevant structure. The first is structure amongst the sub-spaces of the feature descriptor. Several sub-space embedding techniques that use global or local information to compute a projection function are analysed. A novel entropy based measure of structure in the embedded descriptors suggests that relevant structure has local extent. The second is structure amongst the partitions of feature space. Hard partitioning of feature space leads to ambiguity in feature encoding. To address this issue, novel fuzzy logic based dictionary learning and feature encoding algorithms are employed that are able to model the local feature vectors distributions and provide performance benefits.

To estimate structure amongst sub-spaces, co-clustering is used with a training descriptor data matrix to compute groups of sub-spaces. A dictionary learnt on feature vectors embedded in these multiple sub-manifolds is demonstrated to model data better than a dictionary learnt on feature vectors embedded in a single sub-manifold computed using principal components. In a similar manner, co-clustering is used with encoded feature data matrix to compute groups of dictionary elements - referred to as `topics'. A topic dictionary is demonstrated to perform better than a regular dictionary of comparable size. Both these results suggest that the groups of sub-spaces and dictionary elements have semantic relevance.

All the methods developed here have been viewed from the unifying perspective of matrix factorization, where a data matrix is decomposed to two factor matrices which are interpreted as a dictionary matrix and a co-efficient matrix. Sparse coding methods, which are currently enjoying much success, can be viewed as matrix factorization with a regularization constraint on the vectors of the dictionary or co-efficient matrices. With regards to sub-space embedding, the sparse principal component analysis is one such method that induces sparsity amongst the sub-spaces selected to represent each descriptor. Similarly, Lasso is used to induce sparsity in feature encoding by using only a sub-set of dictionary elements to represent each image. While these methods are effective, they disregard structure in the data matrix. To improve on this, structured sparse principal component analysis is used in conjunction with co-clustered groups of sub-spaces to induce sparsity at group level. The resultant structured sparse sub-
manifold dictionary is demonstrated to provide performance benefits. In a similar manner, group Lasso is used with co-clustered groups of dictionary elements to induce sparsity in terms of topics. The structured sparse encoding is demonstrated to improve aggregate performance in comparison to a regular sparse coding.

In conclusion, this thesis estimates structure in descriptor sub-spaces and learnt dictionary, uses co-clustering to compute semantically relevant sub-manifolds and topic dictionary, and finally incorporates the estimated structure in sparse coding methods, demonstrating performance gain for visual category recognition.

Location: Guildford, U.K.

More Info: PhD Thesis, University of Surrey

Event Date: Jul 5, 2013

Organization: University of Surrey

Publication Date: Sep 18, 2013

Research Interests:
Computer Vision, Image Processing, Machine Learning, Clustering and Classification Methods, Fuzzy Clustering, and Sparse Respresentation

Download (.pdf)

More Info: Paperback: 124 pages; Language: English; ISBN-10: 383836399X; ISBN-13: 978-3838363998

Publisher: LAP LAMBERT Academic Publishing

Publication Date: Aug 8, 2010

Research Interests:
Pattern Recognition and Handwriting Recognition (Computer Vision)

Talks

Automated detection of diseased plants using visual analysis. This was a weekend hackathon event on developing an app towards processing mobile device picture for possible diseases in plants.

Location: Columbus, OH

Organization: Rev1 Ventures, Columbus StartUp

Conference End Date: May 15, 2016

Conference Start Date: May 13, 2016

Research Interests:
Computer Vision, Image Processing, and Image segmentation

Download (.pdf)

Learning a semantically relevant visual dictionary using semantic group discovery by co-clustering.

Research Interests:
Computer Vision, Pattern Recognition, Clustering and Classification Methods, Object Recognition (Computer Vision), and Coclustering

Download (.pdf)

Learning a Structured Model for Visual Category Recognition

Abstract:
This thesis deals with the problem of estimating structure in data due to the semantic relations between data elements and leveraging this information to learn a visual model for category recognition. A visual model consists of dictionary learning, which computes a succinct representation of training data by partitioning feature space, and feature encoding, which learns a representation of each image as a combination of dictionary elements. Besides variations in lighting and pose, a key challenge of classifying a category is intra-category appearance variation. The key idea in this thesis is that feature data describing a category has latent structure due to visual content idiomatic to a category. However, popular algorithms in literature disregard this structure when computing a visual model.

Towards incorporating this structure in the learning algorithms, this thesis analyses two facets of feature data to discover relevant structure. The first is structure amongst the sub-spaces of the feature descriptor. Several sub-space embedding techniques that use global or local information to compute a projection function are analysed. A novel entropy based measure of structure in the embedded descriptors suggests that relevant structure has local extent. The second is structure amongst the partitions of feature space. Hard partitioning of feature space leads to ambiguity in feature encoding. To address this issue, novel fuzzy logic based dictionary learning and feature encoding algorithms are employed that are able to model the local feature vectors distributions and provide performance benefits.

To estimate structure amongst sub-spaces, co-clustering is used with a training descriptor data matrix to compute groups of sub-spaces. A dictionary learnt on feature vectors embedded in these multiple sub-manifolds is demonstrated to model data better than a dictionary learnt on feature vectors embedded in a single sub-manifold computed using principal components. In a similar manner, co-clustering is used with encoded feature data matrix to compute groups of dictionary elements - referred to as `topics'. A topic dictionary is demonstrated to perform better than a regular dictionary of comparable size. Both these results suggest that the groups of sub-spaces and dictionary elements have semantic relevance.

All the methods developed here have been viewed from the unifying perspective of matrix factorization, where a data matrix is decomposed to two factor matrices which are interpreted as a dictionary matrix and a co-efficient matrix. Sparse coding methods, which are currently enjoying much success, can be viewed as matrix factorization with a regularization constraint on the vectors of the dictionary or co-efficient matrices. With regards to sub-space embedding, the sparse principal component analysis is one such method that induces sparsity amongst the sub-spaces selected to represent each descriptor. Similarly, Lasso is used to induce sparsity in feature encoding by using only a sub-set of dictionary elements to represent each image. While these methods are effective, they disregard structure in the data matrix. To improve on this, structured sparse principal component analysis is used in conjunction with co-clustered groups of sub-spaces to induce sparsity at group level. The resultant structured sparse sub-manifold dictionary is demonstrated to provide performance benefits. In a similar manner, group Lasso is used with co-clustered groups of dictionary elements to induce sparsity in terms of topics. The structured sparse encoding is demonstrated to improve aggregate performance in comparison to a regular sparse coding.

In conclusion, this thesis estimates structure in descriptor sub-spaces and learnt dictionary, uses co-clustering to compute semantically relevant sub-manifolds and topic dictionary, and finally incorporates the estimated structure in sparse coding methods, demonstrating performance gain for visual category recognition.

Time: 11 AM to 12 PM

Location: Guildford, U.K.

Event Date: Jul 5, 2013

Organization: CVSSP, University of Surrey

Research Interests:
Computer Vision, Machine Learning, Clustering and Classification Methods, Object Recognition (Computer Vision), Image Analysis, and 8 moreStatistical machine learning, Biclustering, Fuzzy Clustering, Dimensionality Reduction, Visual Categorisation, Subspace Methods, Coclustering, and Sparse Respresentation

Download (.pdf)

In this thesis, realistic looking isolated character images indistinguishable from a writer’s individualistic writing were pseudo randomly generated by using a statistical model which
learns that writer’s characteristic handwriting style. Hitherto research focus had been on modeling the human writing process or analyzing dynamic handwriting data, neither of which are viable approaches for widespread application in the growing human computer interaction technology. A writer specific statistical model of the most influential handwriting features was trained from multiple samples of each letter written by that writer from an optimal handwriting sampling text passage. Each sample letter was analyzed and new letters synthesized as a sequence of connected sub-strokes, using control-point extraction and clustering by correspondence search of multiple samples, followed by stroke curve synthesis using spline interpolation functions.
Several significant algorithms were tested for each stage of the synthesis procedure to find the techniques optimal for static handwriting data: entropy based threshold for character
image extraction; Kuwahara filter for de-noising; Zhang-Suen algorithm for skeletonization; distance transform for control-point selection; shape-context descriptor for control-point correspondence search; thin plate splines for control-point transformation; and interpolating
splines for generating stroke curves. Empirical results indicate that the novel combination of handwriting specific algorithms in this thesis can generate realistic synthetic handwriting in a given writer’s unique style, of satisfactory quality.

Location: Kanpur, India

Organization: Indian Institute of Technology, Kanpur

Research Interests:
Computer Vision, Image Processing, Handwriting Recognition (Computer Vision), Digital Image Processing, and Document Image Analysis

Download (.pdf)

Drafts

Objective: To describe the use and experiences with external fetal monitoring devices among obstetrical providers.
Methods: Nurse, midwife, and physician obstetrical providers at an academic center were surveyed in this cross-sectional study regarding their experiences with the external fetal monitoring device utilized by their hospital system in the outpatient, inpatient, and labor and delivery (L&D) settings. The 217 providers were invited to participate between April and July, 2017. Associations between provider characteristics, device use, perception of challenging patients, and potential usefulness of an improved system were assessed with Fisher’s exact tests.
Results: Of 137 respondents, providers reported difficulties monitoring specific patient populations including obese women (98.5%), multiple gestation pregnancies (90.5%), and early gestational ages (71.5%). Most providers (84.6%) noted that patients find current devices uncomfortable. Over half (59.5%) of L&D nurses reported spending greater than 1-hour during a typical 12-hour shift interacting with these devices. Most respondents believed that an automated system would be moderately or very useful (92.6%). We found no statistically significant associations identified between provider age, years of experience, or time spent utilizing the devices with perception of challenging patient types or reporting patient discomfort.
Conclusion: This study reveals widespread views among obstetrical providers that current external fetal monitoring devices have shortcomings and that an improved system may provide benefits. These perceptions remained regardless of provider experience or time utilizing these devices. Nurses reported significant amounts of time operating the devices, representing an opportunity to reduce time and costs with an improved device.

Research Interests:
Gynecology and Obstetrics and Maternal-Fetal Medicine

Download (.pdf)

Video analysis with the aim of discovering social relations between the people in that video is an important and unexplored topic with significant benefit towards a higher level understanding of videos. This article focuses on the inference of two social groups in each video where members of each group share friendly relations with each other and have an adversarial relation with members of the other social group. Using low-level audiovisual features and motion trajectories we compute a measure of expression of social relation in each scene in video. The occurrence of actors in each scene is computed using face recognition with LBP descriptor. The actor-scene forms a 2-model social network, which we use to compute a 1-mode network of actors. The leaders of each group, which are the actors with greater social impact are estimated using Eigencentrality. We demonstrate our approach on several Hollywood films, which span genres of action, adventure, drama, sci-fi, thriller, historic, and fantasy. This approach is successful at using video content analysis to infer the two social groups and typically the principal protagonist and antagonist in the films as well.

Research Interests:
Computer Vision, Face Recognition, Social Network Analysis (SNA), Video Analysis, and Video summarization

Download (.pdf)

Conference Presentations

The task is estimating geolocation utilizing sensors and databases that reliably fun ction in GPS degraded environments and are compatible with the computing and c ommunication resources available onboard a mobile platform. Objective: Real-time geospatial localization of a mobile platform in GPS-denied e nvironment using static maps obtained from GIS database and trajectory estimatio n of onboard video camera using visual odometry. Significance: Foundational framework for coalescing information rich domains of GIS and Computer Vision technologies towards automated and ubiquitous navigati on. Approach: 3D localization of onboard camera-by computing visual features an d camera video inter-frame view geometry-is used to generate a geospatial traj ectory. This trajectory is utilized as a shape based query in a static map.

Location: Columbus, OH

Organization: Consortium of Ohio Universities on Navigation and Timekeeping

Publication Date: Apr 6, 2016

Conference End Date: Apr 6, 2016

Conference Start Date: Apr 5, 2016

Research Interests:
GNSS and GPS, GPS Denied Indoor Navigation, Visual odometry, SLAM, obstacle detection, and Geographic Information Systems (GIS)

Related Authors

Indonesian Journal of Electrical Engineering and Computer Science

Ashish Gupta

Publication Date: 2012

Publication Name: 2012 19th IEEE International Conference on Image Processing

Publication Date: 2011

Publisher: Elsevier BV

Publication Date: 2017

Publication Name: Automation in Construction

Research Interests: Engineering, Automation and robotics in Construction, and Construction Automation<div>()</div>

Publication Date: 2017

DOI: 10101466B2

Publication Date: 2018

Publication Name: US Patent

Research Interests: SLAM, Autonomous navigation, GNSS applications, Indoor Localization, and Geographic Information Systems (GIS)<div>()</div>

DOI: 10.1109/BigMM.2017.75

Location: Los Angeles, CA

More Info: A. Gupta and A. Yilmaz, "Subspace Projection Methods for Large Scale Image Data Analysis," 2017 IEEE Third International Conference on Multimedia Big Data (BigMM), Laguna Hills, CA, 2017, pp. 260-267.

Publisher: IEEE

Event Date: Apr 14, 2017

Journal Name: Proceedings of IEEE BigMM

Publication Date: 2017

Publication Name: Proceedings of IEEE BigMM 2017

Research Interests: Computer Vision, Machine Learning, Object Recognition (Computer Vision), Dimensionality Reduction, Big Data, and Subspace Methods<div>()</div>

Location: San Francisco, CA

Publisher: ACM Press

Event Date: Oct 31, 2016

Organization: ACM

Publication Date: Nov 2016

Publication Name: Proceedings of Indoor Spatial Awareness 2016

Research Interests: Geo-spatial analysis with GIS and GPS, Digital mapping, Geospatial Analysis, Visual SLAM for mobile robots, and Visual Odometry<div>()</div>

Location: Orlando FL

Publisher: IEEE

Event Date: Oct 31, 2016

Journal Name: Proceedings of IEEE SENSORS 2016

Publication Date: Nov 2016

Research Interests: Geo-spatial analysis with GIS and GPS, Graph/Network Algorithms, Indoor Localization, and Visual odometry, SLAM, obstacle detection<div>()</div>

Location: Prague, Czech Republic

Event Date: Jul 17, 2016

Organization: ISPRS

Publication Date: Jul 2016

Publication Name: Annals of ISPRS

Research Interests: Geo-spatial analysis with GIS and GPS, Geolocation, Visual SLAM for mobile robots, and Visual odometry, SLAM, obstacle detection<div>()</div>

Location: Orlando, FL

Publisher: IEEE

Conference End Date: Oct 3, 2012

Conference Start Date: Sep 30, 2012

Research Interests: Computer Vision, Image Processing, and Pattern Recognition<div>()</div>

Location: Barcelona, Spain

More Info: Co-authored with Prof.Richard Bowden, published in proceedings of European Signal Processing Conference, 2011.

Publisher: EURASIP

Research Interests: Computer Vision and Machine Learning<div>()</div>

Location: Rome, Italy

More Info: editor : Csurka, Gabriela and Braz, José

Publisher: SciTePress

Publication Date: May 25, 2012

Publication Name: VISAPP 2012

Research Interests: Computer Vision, Image Processing, and Pattern Recognition<div>()</div>

Location: Edinburgh, Scotland

Journal Name: Proceedings of ICML 2012

Publication Date: Jun 30, 2012

Publication Name: Sparsity, Dictionaries and Projections in Machine Learning and Signal Processing, ICML Workshop

Research Interests: Computer Vision and Machine Learning<div>()</div>

Research Interests: Computer Vision, Geospatial Analysis, and Geographic Information Systems (GIS)<div>()</div>

Location: Kanpur, India

More Info: Gupta, A. (2006). Writer Dependent Handwriting Synthesis. Indian Institute of Technology Kanpur, India.

Organization: Indian Institute of Technology, Kanpur

Page Numbers: 142

Publication Date: Jul 15, 2006

Location: Guildford, U.K.

More Info: PhD Thesis, University of Surrey

Event Date: Jul 5, 2013

Organization: University of Surrey

Publication Date: Sep 18, 2013

Research Interests: Computer Vision, Image Processing, Machine Learning, Clustering and Classification Methods, Fuzzy Clustering, and Sparse Respresentation<div>()</div>

More Info: Paperback: 124 pages; Language: English; ISBN-10: 383836399X; ISBN-13: 978-3838363998

Publisher: LAP LAMBERT Academic Publishing

Publication Date: Aug 8, 2010

Research Interests: Pattern Recognition and Handwriting Recognition (Computer Vision)<div>()</div>

Location: Columbus, OH

Organization: Rev1 Ventures, Columbus StartUp

Conference End Date: May 15, 2016

Research Interests:
Engineering, Automation and robotics in Construction, and Construction Automation

Research Interests:
SLAM, Autonomous navigation, GNSS applications, Indoor Localization, and Geographic Information Systems (GIS)

Research Interests:
Computer Vision, Machine Learning, Object Recognition (Computer Vision), Dimensionality Reduction, Big Data, and Subspace Methods

Research Interests:
Geo-spatial analysis with GIS and GPS, Digital mapping, Geospatial Analysis, Visual SLAM for mobile robots, and Visual Odometry

Research Interests:
Geo-spatial analysis with GIS and GPS, Graph/Network Algorithms, Indoor Localization, and Visual odometry, SLAM, obstacle detection

Research Interests:
Geo-spatial analysis with GIS and GPS, Geolocation, Visual SLAM for mobile robots, and Visual odometry, SLAM, obstacle detection

Research Interests:
Computer Vision, Image Processing, and Pattern Recognition

Research Interests:
Computer Vision and Machine Learning

Research Interests:
Computer Vision, Image Processing, and Pattern Recognition

Research Interests:
Computer Vision and Machine Learning

Research Interests:
Computer Vision, Geospatial Analysis, and Geographic Information Systems (GIS)

Research Interests:
Computer Vision, Image Processing, Machine Learning, Clustering and Classification Methods, Fuzzy Clustering, and Sparse Respresentation

Research Interests:
Pattern Recognition and Handwriting Recognition (Computer Vision)

Research Interests:
Computer Vision, Image Processing, and Image segmentation

Research Interests:
Computer Vision, Pattern Recognition, Clustering and Classification Methods, Object Recognition (Computer Vision), and Coclustering

Research Interests:
Computer Vision, Image Processing, Handwriting Recognition (Computer Vision), Digital Image Processing, and Document Image Analysis

Research Interests:
Gynecology and Obstetrics and Maternal-Fetal Medicine

Research Interests:
Computer Vision, Face Recognition, Social Network Analysis (SNA), Video Analysis, and Video summarization

Research Interests:
GNSS and GPS, GPS Denied Indoor Navigation, Visual odometry, SLAM, obstacle detection, and Geographic Information Systems (GIS)