PurposeNet is an ontology based on the principle that every artifact (man-made object) exists for a purpose, and that all its features and its relations with other entities are governed by that purpose. We present two instances of ontology creation from scratch in the PurposeNet architecture, for two different domains: the MMTS domain and the recipe domain. The creation methodology differed completely between the two. The MMTS ontology was more computationally oriented, while the recipe ontology required a post-processing step after the data had been entered manually. This step uses hierarchical clustering to group very similar actions. The MMTS ontology is further used to build a simple template-based QA system, and its results are compared with those of a database system for the same domain.
Recent studies in machine translation indicate that multi-model systems perform better than individual models. In this paper, we describe a Hindi-to-English statistical machine translation system and improve over the baseline using multiple translation models. We consider phrase-based as well as hierarchical models and improve on both of these baselines using a regression model. The system is trained on textual as well as syntactic features extracted from the source and target sides of these translations. Our system shows significant improvement over the baseline systems in both automatic and human evaluations. The proposed methodology is generic and can be extended to other language pairs.
Hindi is the lingua franca of India. Although all non-native speakers can communicate well in Hindi, only a few can read and write it. In this work, we aim to bridge this gap by building transliteration systems that transliterate Hindi into at least 7 other Indian languages. The systems are developed as a reading aid for non-Hindi readers and are trained on transliteration pairs extracted automatically from a parallel corpus. All the transliteration systems perform well enough for a non-Hindi reader to understand a Hindi text.
Artifacts are man-made objects taken as a whole. This paper presents various kinds of artifacts, classified according to different division criteria, along with methods for creating a list of artifacts. We discuss several such methods and show how some of them can be applied to specific kinds of text to create an exhaustive list of artifacts.
I propose to work on the Petfinder.my Adoption Prediction challenge on Kaggle (https://www.kaggle.com/c/petfinder-adoption-prediction). PetFinder.my has been Malaysia’s leading animal welfare platform since 2008, with a database of more than 150,000 animals.
PurposeNet is a semantic knowledge base of artifacts, developed with purpose as the underlying principle of design. The principle is based on the observation that human beings not only tend to organize and categorize the physical entities around them intuitively by purpose, but also that the morphology, anatomy and physiology of an artifact, as well as its relations with the other artifacts around it, are purpose-based. We aim to extract different semantic relations (descriptive properties) of artifacts from Wikipedia and then develop ontologies automatically using the Web Ontology Language (OWL). We also devise methods to populate the list of artifacts, which are the basic units of this system, and improve the PurposeNet architecture by incorporating several new concepts.
Hypergraphs are important data structures used to represent and model concepts in various areas of Computer Science and Discrete Mathematics. So far, an adjacency matrix representation and a bipartite incidence representation have been proposed for implementing them. The present paper proposes two novel methods for hypergraph representation using adjacency lists. A comparison with the existing representations shows that the proposed approach is better in terms of time complexity. Various graph algorithms, such as breadth-first search, depth-first search, strongly connected components, and Dijkstra's shortest path algorithm, are implemented and studied in detail using the proposed representation for hypergraphs.
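As a rough illustration of what an adjacency-list hypergraph can look like, the sketch below stores one list per hyperedge (its member vertices) and one list per vertex (its incident hyperedges), and runs breadth-first search over that layout. The class name `Hypergraph`, its attributes, and the undirected treatment of hyperedges are assumptions made for this example; the abstract does not specify the two proposed representations, so this should not be read as the paper's method.

```python
from collections import defaultdict, deque


class Hypergraph:
    """Undirected hypergraph kept as two adjacency lists (hypothetical layout)."""

    def __init__(self):
        self.edge_members = {}                 # hyperedge id -> list of member vertices
        self.vertex_edges = defaultdict(list)  # vertex -> list of incident hyperedge ids

    def add_edge(self, edge_id, vertices):
        """Register a hyperedge that connects all the given vertices."""
        self.edge_members[edge_id] = list(vertices)
        for v in self.edge_members[edge_id]:
            self.vertex_edges[v].append(edge_id)

    def bfs(self, source):
        """Return the vertices reachable from `source` in breadth-first order."""
        visited = {source}
        expanded_edges = set()  # each hyperedge is expanded at most once
        order = []
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for e in self.vertex_edges.get(v, []):
                if e in expanded_edges:
                    continue
                expanded_edges.add(e)
                for u in self.edge_members[e]:
                    if u not in visited:
                        visited.add(u)
                        queue.append(u)
        return order


h = Hypergraph()
h.add_edge("e1", ["a", "b", "c"])
h.add_edge("e2", ["c", "d"])
h.add_edge("e3", ["e", "f"])
print(h.bfs("a"))  # vertices in the component containing "a"
```

Because every hyperedge is expanded only once, this traversal touches each vertex-hyperedge incidence a bounded number of times, which is the kind of saving over a dense matrix that an adjacency-list layout is usually chosen for.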
This paper presents the use of surface text patterns (STPs) together with a POS tagger to automatically extract three properties (color, state and shape) of artifacts, i.e. man-made entities, from a corpus. The approach is compared with one that uses a dependency parser for relation extraction, and the efficiency of both approaches is examined. The paper also discusses the issues we encountered while using STPs for information extraction.
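To make the idea of a surface text pattern concrete, here is a minimal sketch that extracts the color property with two regular-expression patterns. The color lexicon `COLORS`, the two patterns, and the function `extract_colors` are all hypothetical and chosen for illustration; the sketch also omits the POS-tagging step described in the abstract, which would be needed to check that the word in the artifact slot is actually a noun.

```python
import re

# Hypothetical, deliberately tiny color lexicon.
COLORS = ("red", "green", "blue", "black", "white", "yellow")
COLOR_ALT = "|".join(COLORS)

# Two illustrative surface patterns (assumptions, not the paper's pattern set).
PATTERNS = [
    # "<determiner> <color> <artifact>", e.g. "a red car"
    re.compile(r"\b(?:a|an|the)\s+(?P<value>%s)\s+(?P<artifact>[a-z]+)\b" % COLOR_ALT),
    # "the <artifact> is/was <color>", e.g. "the door was white"
    re.compile(r"\bthe\s+(?P<artifact>[a-z]+)\s+(?:is|was)\s+(?P<value>%s)\b" % COLOR_ALT),
]


def extract_colors(sentence):
    """Return (artifact, color) pairs matched by the surface patterns."""
    text = sentence.lower()
    pairs = []
    for pattern in PATTERNS:
        for match in pattern.finditer(text):
            pairs.append((match.group("artifact"), match.group("value")))
    return pairs


print(extract_colors("The red car stopped."))
print(extract_colors("The door was white."))
```

Patterns like these are cheap to run but match on word order alone, so a phrase such as "the red one" would yield a spurious artifact; filtering candidates by POS tag is one way to reduce that kind of error.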