Indian Journal of Pure & Applied Biosciences (IJPAB)
Year: 2014, Volume 2, Issue 1
Page No.: 35-39
Article doi: http://dx.doi.org/10.18782

Extracting Database Properties for Sequence Alignment and Secondary Structure Prediction

Maulika S Patel¹* and Himanshu S Mazumdar²

¹Head, Department of Computer Engineering, G H Patel College of Engineering & Technology,
Vallabh Vidyanagar, Gujarat, India, http://www.gcet.ac.in
²Head, Research & Development Center, Dharmsinh Desai University, Nadiad, India, http://www.ddu.ac.in
*Corresponding Author E-mail: Maulika.sandip@gmail.com, hsmazumdar@hotmail.com

ABSTRACT

Genomic and proteomic data continue to grow rapidly, and computational tools are vital for
research in these areas. Biologists who identify new sequences or genes often want to compare
their findings with existing data sets locally. In this paper, we present a set of utilities that
help researchers conveniently extract the fields of interest from public protein databases.
UniRef100 is a large, comprehensive set of unique, non-redundant protein sequences. The utilities
described are used to index and sort the UniRef100 database, access its records randomly, and
extract its properties. The derived properties are used to create a synthetic bio-random
database for further research in sequence analysis and secondary structure prediction.
Keywords: amino acid pair; amino acid trio; protein database; protein secondary structure prediction.
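
The abstract names three operations: indexing records, accessing them randomly, and extracting properties such as amino acid pair and trio frequencies. The paper's own utilities are not reproduced on this page, so the following is only a minimal Python sketch of how such steps might look, assuming FASTA-formatted input. The function names (`build_offset_index`, `read_sequence`, `kmer_counts`) and the sample records are hypothetical and are not taken from the paper or from UniRef100.

```python
import io
from collections import Counter


def build_offset_index(handle):
    """Map each record ID to the byte offset of its '>' header line."""
    index = {}
    while True:
        offset = handle.tell()
        line = handle.readline()
        if not line:
            break
        if line.startswith(b">"):
            tokens = line[1:].split()
            if tokens:
                # The ID is the first whitespace-delimited token after '>'.
                index[tokens[0].decode("ascii")] = offset
    return index


def read_sequence(handle, offset):
    """Seek directly to one record and return its sequence as a string."""
    handle.seek(offset)
    handle.readline()  # skip the header line
    parts = []
    for line in iter(handle.readline, b""):
        if line.startswith(b">"):
            break  # next record starts here
        parts.append(line.strip().decode("ascii"))
    return "".join(parts)


def kmer_counts(sequence, k):
    """Count overlapping windows of length k (k=2: pairs, k=3: trios)."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Made-up records standing in for a real database file (assumption).
SAMPLE = (
    b">UniRef100_X1 hypothetical example\nMKVLA\nAGK\n"
    b">UniRef100_X2 hypothetical example\nMKMK\n"
)

handle = io.BytesIO(SAMPLE)
index = build_offset_index(handle)       # one pass over the file
sorted_ids = sorted(index)               # IDs in sorted order
seq = read_sequence(handle, index["UniRef100_X2"])  # random access
pairs = kmer_counts(seq, 2)
trios = kmer_counts(seq, 3)
```

Storing byte offsets rather than the sequences themselves keeps the index small, which matters for a database the size of UniRef100; with a real file, `handle` would be opened in binary mode so that `tell()` and `seek()` operate on exact byte positions.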

Cite this article:

Int. J. Pure App. Biosci. 2 (1): 35-39 (2014)
