  1. Benchmark
    Full Name of the Resource : Protein classification benchmark collection: training/test sets for machine learning
    Resource Category : Databases -> Protein Sequence Databases -> Protein Domain Databases (Protein Classification)

    Brief Description : The Protein Classification Benchmark collection was created to provide standard datasets on which the performance of machine learning methods can be compared. It is primarily meant for method developers and for users interested in comparing methods under standardized conditions. The collection contains datasets of sequences and structures, and each set is subdivided into positive/negative and training/test sets in several ways. There is a total of 6405 classification tasks, 3297 on protein sequences, 3095 on protein structures and 10 on protein coding regions in DNA. Typical tasks include the classification of structural domains in the SCOP and CATH databases based on their sequences or structures, as well as various functional and taxonomic classification problems. For hierarchical classification schemes, the classification tasks can be defined at various levels of the hierarchy (such as classes, folds and superfamilies). For each dataset, distance matrices are available that contain all-vs-all comparisons of the data based on various sequence or structure comparison methods, together with a set of classification performance measures computed with various classifier algorithms.
    Subject Area : Protein Classification

    Institute/s :
    International Centre for Genetic Engineering and Biotechnology, Italy
    Address of Institute/s :
    International Centre for Genetic Engineering and Biotechnology, Area Science Park, 34012 Trieste, Italy
    Country : Italy

    Associated Institutes :

    • Protein Structure and Bioinformatics Group, International Centre for Genetic Engineering and Biotechnology, Padriciano 99, 34012 Trieste, Italy
    • Research Group on Artificial Intelligence of the Hungarian Academy of Sciences and University of Szeged, Aradi vértanúk tere 1, H-6720 Szeged, Hungary
    • Institute of Chemistry, Eötvös Loránd University, Pázmány Péter sétány 1/A, H-1117 Budapest, Hungary
    • Bioinformatics Group, Biological Research Centre, Hungarian Academy of Sciences, Temesvári krt. 62, H-6701 Szeged, Hungary
    • Laboratory of Bioinformatics, Wageningen University and Research Centre, PO Box 8128, 6700 ET Wageningen, The Netherlands

    Associated Country : Italy; Hungary; Netherlands

    Authors/Contributors : Pongor, S
    Contact Email :
    Year : 2007
    Language : English

    Keywords : Algorithms; Artificial Intelligence; Databases, Protein; Internet; Protein Structure, Tertiary; Proteins / chemistry / classification; Reproducibility of Results; Sequence Analysis, Protein; User-Computer Interface
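
The Brief Description above mentions three ingredients for each task: positive/negative labels, a training/test split, and an all-vs-all distance matrix. The sketch below shows one way those pieces could fit together, using a 1-nearest-neighbour scorer and a rank-based ROC AUC. It is only an illustration under stated assumptions: this record does not specify the collection's file formats, classifiers, or performance measures, so the function names, the toy data, and the choice of nearest neighbour and AUC are all hypothetical stand-ins.

```python
from typing import List, Sequence


def nearest_neighbor_scores(
    dist: Sequence[Sequence[float]],
    train_idx: Sequence[int],
    train_labels: Sequence[int],
    test_idx: Sequence[int],
) -> List[float]:
    """Score each test item from a precomputed distance matrix.

    Score = (distance to nearest negative training item)
          - (distance to nearest positive training item),
    so a larger score means "looks more positive".
    """
    scores = []
    for t in test_idx:
        d_pos = min(dist[t][j] for j, y in zip(train_idx, train_labels) if y == 1)
        d_neg = min(dist[t][j] for j, y in zip(train_idx, train_labels) if y == 0)
        scores.append(d_neg - d_pos)
    return scores


def roc_auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """ROC AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative test item")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data (hypothetical): six items placed on a line, so the all-vs-all
# distance matrix is just the absolute difference of their coordinates.
coords = [0.0, 1.0, 0.5, 5.0, 6.0, 5.5]
labels = [1, 1, 1, 0, 0, 0]  # 1 = positive class, 0 = negative class
dist = [[abs(a - b) for b in coords] for a in coords]

# Hypothetical training/test split over the item indices.
train_idx = [0, 1, 3, 4]
test_idx = [2, 5]
train_labels = [labels[i] for i in train_idx]
test_labels = [labels[i] for i in test_idx]

test_scores = nearest_neighbor_scores(dist, train_idx, train_labels, test_idx)
auc = roc_auc(test_scores, test_labels)
print("test scores:", test_scores)
print("ROC AUC:", auc)
```

Working from a precomputed matrix keeps the classifier independent of how the distances were obtained, so the same scoring code could in principle be reused with any sequence or structure comparison method that yields pairwise distances.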