[go: up one dir, main page]

Skip to content
/ pHash Public

Software to identify known plasmid sequence from metagenomic assembly using Minhash

License

Notifications You must be signed in to change notification settings

haradama/pHash

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

26 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

pHash

pHash is a software to identify known plasmid from metagenomic assembly with the very lightweight database.

Installation

pHash is available in release page:(https://github.com/haradama/pHash/releases)

Usage

Please download the plasmid database file on Zenodo: (http://doi.org/10.5281/zenodo.1991549)

Identifier of plasmid using database

Usage:
  pHash identify [flags]

Flags:
  -d, --db string       Database
  -h, --help            help for identify
  -i, --in string       Input FASTA file
  -k, --kmer int        Length of k-mer (default 17)
  -o, --out string      Output FASTA file
  -p, --paralell int    Number of parallel processing (default 4)
  -s, --sketch int      Sketch size (default 1024)
  -t, --threshold int   Threshold of probability (default 10)

for example,

pHash identify -d PLASMID_DATABASE -i YOUR_METAGEMOMIC_DATA

If you want to build your own database, please execute the following command.

pHash makedb -i YOUR_PLASMID_DATA -o YOUR_DATABASE_NAME

Test

sh ./tests/install_test_data.sh
pHash identify -d plasmidDB11062018.phash -i testData.fna

License

GNU General Public License v3.0