Showing search results (1–1 of 1):

  1. iProLINK (View Publication)
    Full Name of the Resource : Annotated literature sources for protein features and names
    Resource Category : Databases -> Protein Sequence Databases -> Protein properties

    Brief Description : iProLINK (integrated Protein Literature, INformation and Knowledge) is a resource to facilitate text mining research in the area of literature-based database curation, named entity recognition, and protein ontology development. This collection of annotated data sources can be utilized by computational and biological researchers to explore literature information on proteins and their features or properties (Hu et al., 2004). The data sets for bibliography mapping and feature evidence attribution include mapped citations (PubMed ID to protein entry and feature line mapping) and annotation-tagged literature corpora. The latter include ~800 abstracts and/or full-text articles in which text evidence was tagged for ~1200 experimentally validated post-translational modifications (PTMs) annotated in the PIR protein sequence database (PIR-PSD). The data sets for entity recognition and ontology development include protein name dictionaries, word token dictionaries, protein name-tagged literature corpora along with tagging guidelines, and a protein ontology based on PIRSF protein family names. All data sets are freely accessible and can be downloaded at http://pir.georgetown.edu/iprolink/.
    Subject Area : Protein Literature


    Institute/s :
    Georgetown University Medical Center, 3900 Reservoir Road, NW, Washington, DC 20057, USA.
    Address of Institute/s :
    Georgetown University Medical Center, 3900 Reservoir Road, NW, Washington, DC 20057, USA.
    Country : United States

    Associated Institutes :

    • Georgetown University Medical Center, 3900 Reservoir Road, NW, Washington, DC 20057, USA

    Associated Country : USA


    Authors/Contributors : Wu, C.H.
    Contact Email : wuc@georgetown.edu
    Year : 2004
    Language : English

    Keywords : Computational Biology; Databases, Bibliographic; Databases, Protein; Information Services; Internet; Proteins / chemistry / classification / genetics; PubMed; Systems Integration