Kaizhong Zhang

Papers by Kaizhong Zhang

An algorithm for detecting homologues of known structured rnas in genomes

Proceedings. 2004 IEEE Computational Systems Bioinformatics Conference, 2004. CSB 2004.

Improving the Sensitivity and Specificity of Protein Homology Search by Incorporating Predicted Secondary Structures

Journal of Bioinformatics and Computational Biology, 2006

In this paper, we improve homology search performance by combining predicted protein secondary structures with protein sequences. Previous research suggested that a straightforward combination of predicted secondary structures did not improve homology search performance, mostly because of errors in the structure prediction. We solved this problem by taking into account the confidence scores output by the prediction programs.

An Improved Algorithm for Tree Edit Distance Incorporating Structural Linearity

Lecture Notes in Computer Science

Combinatorial pattern discovery for scientific data

ACM SIGMOD Record, 1994

Suppose you are given a set of natural entities (e.g., proteins, organisms, weather patterns, etc.) that possess some important common externally observable properties. You also have a structural description of the entities (e.g., sequence, topological, or geometrical data) and a distance metric. Combinatorial pattern discovery is the activity of finding patterns in the structural data that might explain these common properties based on the metric. This paper presents an example of combinatorial pattern discovery: the discovery of patterns in protein databases. The structural representations we consider are strings, and the distance metric is string edit distance permitting variable length don't cares. Our techniques incorporate string matching algorithms and novel heuristics for discovery and optimization, most of which generalize to other combinatorial structures. Experimental results of applying the techniques to both generated data and functionally related protein families obt...
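
As an illustration of the distance metric mentioned in this abstract, here is a minimal sketch of string edit distance with variable length don't cares. It is not taken from the paper: the function name `vldc_edit_distance`, the use of `*` as the don't-care symbol, and the unit costs are all assumptions. The don't care is taken to match any substring of the text, including the empty one, at zero cost.

```python
def vldc_edit_distance(pattern: str, text: str, wildcard: str = "*") -> int:
    """Edit distance between a pattern and a text, where each wildcard in the
    pattern may absorb any (possibly empty) run of text characters for free.
    Unit costs for insert, delete, and substitute are an assumption."""
    m, n = len(pattern), len(text)
    # dp[i][j] = minimum cost of aligning pattern[:i] with text[:j]
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for j in range(1, n + 1):
        dp[0][j] = j  # empty pattern: every text character is an insertion
    for i in range(1, m + 1):
        p = pattern[i - 1]
        # Against empty text, a wildcard costs nothing; any other symbol is deleted.
        dp[i][0] = dp[i - 1][0] if p == wildcard else dp[i - 1][0] + 1
        for j in range(1, n + 1):
            if p == wildcard:
                # Wildcard matches nothing, or absorbs one more text character.
                dp[i][j] = min(dp[i - 1][j], dp[i][j - 1])
            else:
                substitute = 0 if p == text[j - 1] else 1
                dp[i][j] = min(
                    dp[i - 1][j] + 1,               # delete pattern symbol
                    dp[i][j - 1] + 1,               # insert text character
                    dp[i - 1][j - 1] + substitute,  # match or substitute
                )
    return dp[m][n]


if __name__ == "__main__":
    print(vldc_edit_distance("a*c", "abbbc"))
    print(vldc_edit_distance("a*c", "abbbd"))
```

Without any wildcard in the pattern, the recurrence reduces to ordinary unit-cost edit distance.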

Complexities and Algorithms for Glycan Structure Sequencing Using Tandem Mass Spectrometry

Proceedings of the 5th Asia-Pacific Bioinformatics Conference, 2007

RNA Secondary Structure Prediction Via Energy Density Minimization

Lecture Notes in Computer Science, 2006

SPIDER: software for protein identification from sequence tags with de novo sequencing error

Proceedings. 2004 IEEE Computational Systems Bioinformatics Conference, 2004. CSB 2004.

Theoretical Computer Science, 2002

The longest common subsequence problem for arc-annotated sequences

Journal of Discrete Algorithms, 2004

An effective algorithm for peptide de novo sequencing from MS/MS spectra

Journal of Computer and System Sciences, 2005

RNA–RNA Interaction Prediction and Antisense RNA Target Search

Journal of Computational Biology, 2006

A General Edit Distance between RNA Structures

Journal of Computational Biology, 2002

Locality and Gaps in RNA Comparison

Journal of Computational Biology, 2007

Perfect Phylogenetic Networks with Recombination

Journal of Computational Biology, 2001

An improved algorithm for tree edit distance with applications for RNA secondary structure comparison

Journal of Combinatorial Optimization, 2012

On the Editing Distance Between Undirected Acyclic Graphs

International Journal of Foundations of Computer Science, 1996

We consider the problem of comparing CUAL graphs (Connected, Undirected, Acyclic graphs with nodes being Labeled). This problem is motivated by the study of information retrieval for bio-chemical and molecular databases. Suppose we define the distance between two CUAL graphs G1 and G2 to be the weighted number of edit operations (insert node, delete node and relabel node) to transform G1 to G2. By reduction from exact cover by 3-sets, one can show that finding the distance between two CUAL graphs is NP-complete. In view of the hardness of the problem, we propose a constrained distance metric, called the degree-2 distance, by requiring that any node to be inserted (deleted) have no more than 2 neighbors. With this metric, we present an efficient algorithm to solve the problem. The algorithm runs in time O(N1 N2 D^2) for general weighting edit operations and in time [Formula: see text] for integral weighting edit operations, where Ni, i=1, 2, is the number of nodes in Gi, D=min{d1, d2} a...
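
To make the degree-2 constraint in this abstract concrete, here is a small sketch of a constrained delete operation on a labeled, undirected, acyclic graph. It is one plausible reading of the abstract, not the paper's algorithm: the adjacency-set representation, the name `delete_node_degree2`, and the choice to join the two neighbors of a deleted degree-2 node are assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

Adjacency = Dict[str, Set[str]]  # node id -> set of neighbor ids (hypothetical layout)
Labels = Dict[str, str]          # node id -> label


def delete_node_degree2(
    adj: Adjacency, labels: Labels, node: str
) -> Tuple[Adjacency, Labels]:
    """Return a copy of the graph with `node` deleted, allowing the deletion
    only when the node has at most 2 neighbors (the degree-2 constraint).
    If it has exactly 2 neighbors, they are joined so the graph stays
    connected and acyclic (an assumed interpretation)."""
    neighbors = adj[node]
    if len(neighbors) > 2:
        raise ValueError(f"node {node!r} has more than 2 neighbors")
    # Copy the graph without the deleted node and without edges to it.
    new_adj = {u: set(vs) - {node} for u, vs in adj.items() if u != node}
    if len(neighbors) == 2:
        a, b = neighbors
        new_adj[a].add(b)
        new_adj[b].add(a)
    new_labels = {u: lab for u, lab in labels.items() if u != node}
    return new_adj, new_labels


if __name__ == "__main__":
    # Path x - y - z: deleting y joins x and z.
    path = {"x": {"y"}, "y": {"x", "z"}, "z": {"y"}}
    path_labels = {"x": "C", "y": "O", "z": "N"}
    print(delete_node_degree2(path, path_labels, "y"))
```

Under this reading, the center of a star with three or more leaves cannot be deleted directly, which is the kind of operation the constraint rules out.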