Online Journal of Bioinformatics© Volume 4: 96-105, 2003
Moreno FB1, Facó F1, Ceccatto VM2, Sampaio AH3, Costa ASB1, Freitas JLT1, Nogueira LL1, Lima ME4, Lima-Filho JL5, Cavada BS1*
1Departamento de Bioquímica e Biologia Molecular, Universidade Federal do Ceará, Fortaleza-CE, Campus do Pici S/N, CEP: 60451-970, Caixa Postal 6033, Brasil. 2Universidade Estadual do Ceará (UECE). 3Departamento de Engenharia de Pesca, Universidade Federal do Ceará, Fortaleza-CE. 4LIKA – Universidade Federal do Pernambuco (UFPE). 5Cin – Universidade Federal do Pernambuco (UFPE). *Correspondence: bscavada@ufc.br
Moreno FB, Facó F, Ceccatto VM, Sampaio AH, Costa ASB, Freitas JLT, Nogueira LL, Lima ME, Lima-Filho JL, Cavada BS, Matching carbohydrate-binding domains in Arabidopsis thaliana genome: development of a lectin database, Online J Bioinformatics, 4: 96-105, 2003.

Processing the databases used for homology searching requires great computational power. Processing time can be reduced by integrating databases. PERL was used to filter specific sequences from a non-redundant protein database and to count and classify the sequences taxonomically. Selection relied on matching a regular expression against a text string. The script was used to build the first lectin database, with 1,639 sequence entries in FASTA format. The program was then applied to the analysis of the Arabidopsis thaliana genome. All the unclassified open reading frames from this genome were catalogued and analyzed by homology searching. Six candidate proteins containing carbohydrate-binding domains were found. The proposed lectin database and PERL scripts could be used as a generic proteomic tool.
KEYWORDS: lectin, Arabidopsis, database, genome, proteome
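
The filtering step summarized in the abstract (regular-expression matching on a protein database in FASTA format, followed by counting per taxon) can be illustrated with a minimal sketch. It is written in Python for illustration and is not the authors' PERL script. The search pattern `lectin|agglutinin`, the convention that the organism name appears in square brackets at the end of the header, the function names `read_fasta` and `filter_lectins`, and the three toy records are all assumptions made for this example; none of them come from the paper.

```python
import re
from collections import Counter

# Assumed pattern: the abstract does not state the actual regular expression.
LECTIN_PATTERN = re.compile(r"lectin|agglutinin", re.IGNORECASE)
# Assumed header convention: organism name in square brackets at the end.
TAXON_PATTERN = re.compile(r"\[([^\[\]]+)\]\s*$")


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_lectins(lines):
    """Return the records whose header matches, plus a count per organism."""
    kept, counts = [], Counter()
    for header, seq in read_fasta(lines):
        if LECTIN_PATTERN.search(header):
            kept.append((header, seq))
            match = TAXON_PATTERN.search(header)
            counts[match.group(1) if match else "unclassified"] += 1
    return kept, counts


# Invented toy records, used only to exercise the functions above.
EXAMPLE = """\
>seq1 putative lectin [Arabidopsis thaliana]
MKTAYIAK
QRQISFVK
>seq2 hypothetical kinase [Arabidopsis thaliana]
MSTNPKPQ
>seq3 mannose-binding agglutinin [Galanthus nivalis]
MAKASLLI
""".splitlines()

if __name__ == "__main__":
    records, per_taxon = filter_lectins(EXAMPLE)
    for header, seq in records:
        print(f">{header}\n{seq}")
    print(dict(per_taxon))
```

Matching on the header line alone keeps the filter to a single pass over the file, which fits the abstract's aim of reducing the amount of data passed to homology searching. A production filter would likely need a more careful pattern and a header parser suited to the actual database release.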