Top List Curated by Listnerd
  • Public list
  • Nov 27th 2012
  • 316 views
  • 94 votes
  • 94 voters
  • 6%
Best SPARQL endpoint provided by Bio2RDF of All Time

More about Best SPARQL endpoint provided by Bio2RDF of All Time:

Best SPARQL endpoint provided by Bio2RDF of All Time is a public top list created by Listnerd on rankly.com on November 27th 2012. Items on the Best SPARQL endpoint provided by Bio2RDF of All Time top list are added by the rankly.com community and ranked using our secret ranking sauce. Best SPARQL endpoint provided by Bio2RDF of All Time has received 316 views and gathered 94 votes from 94 voters.


Items just added

    1
    OMIM : Online Mendelian Inheritance in Man

    This database is a catalog of human genes and genetic disorders authored and edited by Dr. Victor A. McKusick and his colleagues at Johns Hopkins and elsewhere, and developed for the World Wide Web by NCBI, the National Center for Biotechnology Information. The database contains textual information and references. It also contains copious links to MEDLINE and sequence records in the Entrez system, and links to additional related resources at NCBI and elsewhere. http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?db=OMIM
    8.20
    5 votes
    2
    UniSTS : Integrating Markers and Maps

    UniSTS is a comprehensive database of sequence tagged sites (STSs) derived from STS-based maps and other experiments. STSs are defined by PCR primer pairs and are associated with additional information such as genomic position, genes, and sequences.
    5.57
    7 votes
    3
    Gene: Database of genes from NCBI

    Gene integrates information from a wide range of species. A record may include nomenclature, Reference Sequences (RefSeqs), maps, pathways, variations, phenotypes, and links to genome-, phenotype-, and locus-specific resources worldwide.
    7.80
    5 votes
    4
    ChEBI

    Chemical Entities of Biological Interest, also known as ChEBI, is a database and ontology of molecular entities focused on 'small' chemical compounds, that is part of the Open Biomedical Ontologies effort. The term "molecular entity" refers to any "constitutionally or isotopically distinct atom, molecule, ion, ion pair, radical, radical ion, complex, conformer, etc., identifiable as a separately distinguishable entity". The molecular entities in question are either products of nature or synthetic products used to intervene in the processes of living organisms. Molecules directly encoded by the genome, such as nucleic acids, proteins and peptides derived from proteins by proteolytic cleavage, are not as a rule included in ChEBI. ChEBI uses nomenclature, symbolism and terminology endorsed by the International Union of Pure and Applied Chemistry (IUPAC) and Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (NC-IUBMB). All data in the database is non-proprietary or is derived from a non-proprietary source. It is thus freely accessible and available to anyone. In addition, each data item is fully traceable and explicitly referenced to the original
    7.20
    5 votes
    5
    BioCyc : Collection of Pathway/Genome Databases

    BioCyc is a collection of 371 Pathway/Genome Databases. Each Pathway/Genome Database in the BioCyc collection describes the genome and metabolic pathways of a single organism, with the exception of the MetaCyc database, which is a reference source on metabolic pathways from many organisms. The BioCyc databases are divided into three tiers, based on their quality.

    BioCyc Tier 1: Intensively Curated Databases
    • EcoCyc: Escherichia coli K-12
    • MetaCyc: Metabolic pathways and enzymes from more than 900 organisms
    The BioCyc Open Chemical Database is also an intensively curated database. It is an open database of chemical compounds from other BioCyc databases. Because it contains chemical compounds only, it is not a Pathway/Genome Database.

    BioCyc Tier 2: Computationally-Derived Databases Subject to Moderate Curation
    20 databases are available.

    BioCyc Tier 3: Computationally-Derived Databases Subject to No Curation
    349 databases are available and ready for adoption by interested scientists for curation and updating. PGDBs in Tier 3 were produced as a collaboration between the groups of Peter D. Karp at SRI International and Christos Ouzounis at the European Bioinformatics Institute.
    7.00
    5 votes
    6
    MGI : Mouse genome database (MGD) from Mouse Genome Informatics (MGI)

    The Mouse Genome Informatics (MGI) Database provides integrated access to data on the genetics, genomics and biology of the laboratory mouse. The projects contributing to this resource are:

    Mouse Genome Database (MGD) Project
    MGD includes data on gene characterization, nomenclature, mapping, gene homologies among mammals, sequence links, phenotypes, allelic variants and mutants, and strain data.

    Gene Expression Database (GXD) Project
    GXD integrates different types of gene expression information from the mouse and provides a searchable index of published experiments on endogenous gene expression during development.

    Mouse Tumor Biology (MTB) Database Project
    MTB integrates data on the frequency, incidence, genetics, and pathology of neoplastic disorders, emphasizing data on tumors that develop characteristically in different genetically defined strains of mice.

    Gene Ontology (GO) Project at MGI
    The Mouse Genome Informatics group is a founding member of the Gene Ontology Consortium (www.geneontology.org). MGI fully incorporates the GO in the database and provides a GO browser.
    10.00
    3 votes
    7
    Protein Data Bank (RCSB PDB)

    The Protein Data Bank (PDB) is a repository for the 3-D structural data of large biological molecules, such as proteins and nucleic acids. The data, typically obtained by X-ray crystallography or NMR spectroscopy and submitted by biologists and biochemists from around the world, are freely accessible on the Internet via the websites of its member organisations (PDBe, PDBj, and RCSB). The PDB is overseen by an organization called the Worldwide Protein Data Bank, wwPDB. The PDB is a key resource in areas of structural biology, such as structural genomics. Most major scientific journals, and some funding agencies, such as the NIH in the USA, now require scientists to submit their structure data to the PDB. If the contents of the PDB are thought of as primary data, then there are hundreds of derived (i.e., secondary) databases that categorize the data differently. For example, both SCOP and CATH categorize structures according to type of structure and assumed evolutionary relations; GO categorizes structures based on genes. Two forces converged to initiate the PDB: 1) a small but growing collection of sets of protein structure data determined by
    7.00
    4 votes
    8
    GL : KEGG Ligand Database for Carbohydrate Structure

    Functional information of genes and proteins is organized in KEGG as ortholog groups, called KEGG Orthology (KO) groups, to cover all organisms. The KO groups for glycosyltransferases are finely classified ortholog groups distinguishing known substrate specificity, which can be viewed as functional hierarchies of the KEGG BRITE database.
    8.67
    3 votes
    9
    IProClass : Integrated Protein Knowledgebase

    The iProClass database provides value-added information reports for UniProtKB and unique NCBI Entrez protein sequences in UniParc, with links to over 90 biological databases, including databases for protein families, functions and pathways, interactions, structures and structural classifications, genes and genomes, ontologies, literature, and taxonomy. iProClass combines both data warehouse and hypertext navigation methods for integrating data, providing a comprehensive picture of protein properties that may lead to novel prediction and functional inference for previously uncharacterized "hypothetical" proteins and protein groups.
    6.75
    4 votes
    10
    SID : PubChem Substance

    The PubChem Substances Database contains descriptions of chemical samples, from a variety of sources, and links to PubMed citations, protein 3D structures, and biological screening results that are available in PubChem BioAssay.
    7.33
    3 votes
    11
    PubMed

    PubMed is a free database accessing primarily the MEDLINE database of references and abstracts on life sciences and biomedical topics. The United States National Library of Medicine (NLM) at the National Institutes of Health maintains the database as part of the Entrez information retrieval system. PubMed was first released in January 1996. In addition to MEDLINE, PubMed provides access to: Many PubMed records contain links to full text articles, some of which are freely available, often in PubMed Central and local mirrors such as UK PubMed Central. Information about the journals indexed in PubMed is found in the NLM Catalog. As of 20 September 2012, PubMed has over 22.1 million records going back to 1966, selectively to the year 1865, and very selectively to 1809; about 500,000 new records are added each year; 12.38 million of these articles are listed with their abstracts, and 12.81 million articles have links to full-text (of which 3.54 million articles are available full-text for free for any user). To see the current size of the database type "1800:2100[dp]" or "all[sb]" into the PubMed search window. Simple searches on PubMed can be carried out by entering key
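The size query quoted in the PubMed entry can also be sent programmatically. The sketch below only builds the request URL, assuming the NCBI E-utilities ESearch service and its `db`, `term` and `rettype=count` parameters. None of these are mentioned in the entry itself, and the function name `pubmed_count_url` is made up for illustration.

```python
from urllib.parse import urlencode

# Assumption: the NCBI E-utilities ESearch endpoint (not named in the list entry).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_count_url(term: str) -> str:
    """Build an ESearch URL asking PubMed only for the number of matching records."""
    params = {"db": "pubmed", "term": term, "rettype": "count"}
    # urlencode percent-encodes the colon and square brackets in the search term.
    return f"{ESEARCH_URL}?{urlencode(params)}"


# The date-range query from the entry: every record published between 1800 and 2100.
url = pubmed_count_url("1800:2100[dp]")
print(url)
```

Fetching the URL (for example with `urllib.request.urlopen`) should return a small XML document containing the count. No network call is made here, so the sketch stays runnable offline.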
    9.00
    2 votes
    12
    DBpedia

    DBpedia is a project aiming to extract structured content from the information created as part of the Wikipedia project. This structured information is then made available on the World Wide Web. DBpedia allows users to query relationships and properties associated with Wikipedia resources, including links to other related datasets. DBpedia has been described by Tim Berners-Lee as one of the more famous parts of the Linked Data project. The project was started by people at the Free University of Berlin and the University of Leipzig, in collaboration with OpenLink Software, and the first publicly available dataset was published in 2007. It is made available under free licences, allowing others to reuse the dataset. Wikipedia articles consist mostly of free text, but also include structured information embedded in the articles, such as "infobox" tables, categorisation information, images, geo-coordinates and links to external Web pages. This structured information is extracted and put in a uniform dataset which can be queried. As of September 2011, the DBpedia dataset describes more than 3.64 million things, out of which 1.83 million are classified in a consistent ontology, including
    6.67
    3 votes
    13
    ProDom : A protein domain database

    ProDom is a comprehensive database of protein domain families generated from the global comparison of all available protein sequences. Recent improvements include the use of three-dimensional (3D) information from the SCOP database; a completely redesigned web interface (http://www.toulouse.inra.fr/prodom.html); visualization of ProDom domains on 3D structures; coupling of ProDom analysis with the Geno3D homology modelling server; Bayesian inference of evolutionary scenarios for ProDom families. In addition, we have developed ProDom-SG, a ProDom-based server dedicated to the selection of candidate proteins for structural genomics.
    6.67
    3 votes
    14
    MeSH : Medical Subject Headings

    MeSH is the National Library of Medicine's controlled vocabulary thesaurus. It consists of sets of terms naming descriptors in a hierarchical structure that permits searching at various levels of specificity. MeSH descriptors are arranged in both an alphabetic and a hierarchical structure. At the most general level of the hierarchical structure are very broad headings such as "Anatomy" or "Mental Disorders." More specific headings are found at more narrow levels of the eleven-level hierarchy, such as "Ankle" and "Conduct Disorder." There are 24,767 descriptors in 2008 MeSH. In addition to these headings, there are more than 172,000 headings called Supplementary Concept Records (formerly Supplementary Chemical Records) within a separate thesaurus. There are also over 97,000 entry terms that assist in finding the most appropriate MeSH Heading, for example, "Vitamin C" is an entry term to "Ascorbic Acid."
    8.50
    2 votes
    15
    Affymetrix : Alliance for Cellular Signaling

    Affymetrix' GeneChip® technology was invented in the late 1980s by a team of scientists led by Stephen P.A. Fodor, Ph.D. The theory behind their work was revolutionary: a notion that semiconductor manufacturing techniques could be united with advances in combinatorial chemistry to build vast amounts of biological data on a small glass chip. This technology became the basis of a new company, Affymetrix, formed as a division of Affymax, N.V. in 1991. Affymetrix began operating independently in 1992.
    6.33
    3 votes
    16
    Taxonomy : NEWT is the taxonomy database maintained by the UniProt

    NEWT is the taxonomy database maintained by the UniProt group. It integrates taxonomy data compiled in the NCBI database and data specific to the UniProt Knowledgebase. Species with protein sequences stored in the UniProt Knowledgebase are named according to UniProt nomenclature. We endeavour to maintain a list of manually curated species names for which protein sequence data is available. In particular, we have adopted a systematic convention for naming viral and bacterial strains and isolates.

    Links to external sites are chosen by the UniProt taxonomy team and show pictures and various scientific data of interest (taxonomy, biology, physiology, ...). Due to the sheer volume of data present on the world-wide web, it is unfortunately not possible to contact each site individually. Should you wish to have your site linked to NEWT, or would prefer us to have the link to your site removed, please do not hesitate to contact us.

    Query the database by keywords (species name) or NCBI taxonomic identifier. NOTE: search by keywords is case-insensitive, and scientific as well as common names in plain English can be used; you may use asterisks as wildcards anywhere in the query.

    Why is it called NEWT? Because we like that name. It also happens that newt is the English translation of the French word salamandre, which is the name of a cute little animal that slides through the most impenetrable looking walls. French speakers may enjoy reading la Salamandre.
    6.33
    3 votes
    17
    PubChem Compound database

    The PubChem Compounds Database contains validated chemical depiction information provided to describe substances in PubChem Substance.

    Structures stored within PubChem Compounds are pre-clustered and cross-referenced by identity and similarity groups. Additionally, calculated properties and descriptors are available for searching and filtering of chemical structures.
    8.00
    2 votes
    18
    PROSITE : A protein domain and family database

    PROSITE consists of documentation entries describing protein domains, families and functional sites as well as associated patterns and profiles to identify them.
    PROSITE is complemented by ProRule, a collection of rules based on profiles and patterns, which increases the discriminatory power of profiles and patterns by providing additional information about functionally and/or structurally critical amino acids.
    6.00
    3 votes
    19
    DR : KEGG Ligand Database for Drug

    KEGG DRUG is a chemical structure based information resource for all approved drugs in Japan and the U.S.A. Each chemical structure is identified by the D number, and is associated with generic names, trade names, efficacy, target information, etc. KEGG DRUG is maintained in the KEGG LIGAND relational database.
    • DBGET search
    • LIGAND relational database search
    10.00
    1 vote
    20
    PID : Pathway Interaction Database

    The Pathway Interaction Database is a highly-structured, curated collection of information about known biomolecular interactions and key cellular processes assembled into signaling pathways. It is a collaborative project between the US National Cancer Institute (NCI) and Nature Publishing Group (NPG), and is an open access online resource.
    10.00
    1 vote
    21
    UniParc : UniProt Archive a non-redundant archive of protein sequences extracted from Swi

    UniProt Archive (UniParc) is part of the UniProt project. It is a non-redundant archive of protein sequences extracted from the public databases UniProtKB/Swiss-Prot, UniProtKB/TrEMBL, PIR-PSD, EMBL, EMBL WGS, Ensembl, IPI, PDB, RefSeq, FlyBase, WormBase, H-Invitational Database, TROME database, European Patent Office proteins, United States Patent and Trademark Office proteins (USPTO) and Japan Patent Office proteins.

    UniParc contains only protein sequences. All other information about the protein must be retrieved from the source databases using the database cross-references. Each unique sequence is stored only once with a stable identifier. The format of the identifier is UPI followed by ten hexadecimal digits, e.g. UPI000000000A.
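The identifier format described above is easy to check mechanically. The following is a minimal sketch, not UniProt code: the function name is invented for illustration, and it assumes the hexadecimal part is written in upper case, as in the example UPI000000000A.

```python
import re

# "UPI" followed by exactly ten hexadecimal characters.
# Assumption: upper-case hex only, matching the example in the entry.
UPI_PATTERN = re.compile(r"UPI[0-9A-F]{10}")


def is_uniparc_id(text: str) -> bool:
    """Return True if text has the shape of a UniParc identifier."""
    return UPI_PATTERN.fullmatch(text) is not None


print(is_uniparc_id("UPI000000000A"))  # the example from the entry
print(is_uniparc_id("UPI00000000A"))   # only nine characters after the prefix
```

A match only says the string is well-formed; whether the identifier actually exists still has to be checked against UniParc itself.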

    UniParc proteins are linked to their source databases by database cross-references. Each cross-reference links one protein in UniParc to an accession number in a source database. The database cross-reference is active as long as the sequence identified by the source accession number remains unchanged. When the sequence is modified or removed in the source database, the cross-reference from UniParc becomes inactive. Active cross-references can be used to directly access the source databases, but inactive cross-references can only be used to access sequence archives, such as the Sequence Version Archive.

    UniParc is available for text- and sequence-based searches. Sequences that are no longer part of any source database are excluded from sequence-based searches, but they are available for text-based SRS searches. Performing a similarity search against UniParc is equivalent to performing the same search against all databases cross-referenced in UniParc, as UniParc contains all proteins from its source databases. Sequence similarity searches can be done using FASTA, BLAST or Mpsrch.
    10.00
    1 vote
    22
    UniRef : UniProt Non-redundant Reference Databases

    The UniProt NREF (UniProt Reference Clusters) database.

    The two major objectives of UniRef are:
    (i) to facilitate sequence merging in UniProt, and
    (ii) to allow faster and more informative sequence similarity searches.

    Although the UniProt Knowledgebase is much less redundant than UniParc, it still contains a certain level of redundancy because it is not possible to use fully automatic merging without risking merging of similar sequences from different proteins. However, such automatic procedures are extremely useful in compiling the UniRef databases to obtain complete coverage of sequence space while hiding redundant sequences (but not their descriptions) from view.

    A high level of redundancy results in several problems, including slow database searches and long lists of similar or identical alignments that can obscure novel matches in the output. Thus, a more even sampling of sequence space is advantageous. This can be addressed by clustering closely similar sequences to yield a representative subset of sequences. Therefore, we have created various non-redundant databases with different sequence identity cut-offs. In the UniRef90 and UniRef50 databases no pair of sequences in the representative set has >90% or >50% mutual sequence identity. The UniRef100 database presents identical sequences and sub-fragments as a single entry with protein IDs, sequences, bibliography, and links to protein databases.
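The idea of picking representatives so that no two of them exceed an identity cut-off can be illustrated with a toy greedy clustering. This is a sketch under loud assumptions and not the procedure UniRef actually uses: the `identity` function simply compares characters position by position without any alignment, and both function names are invented for this example.

```python
def identity(a: str, b: str) -> float:
    """Toy identity: fraction of matching positions, measured against the longer sequence.

    Assumption: no alignment is performed; real pipelines align sequences first.
    """
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / longest


def greedy_cluster(seqs, threshold):
    """Group sequences under representatives; no two representatives exceed threshold."""
    clusters = {}  # representative sequence -> list of cluster members
    # Longest sequences are considered first, so they become the representatives.
    for seq in sorted(seqs, key=len, reverse=True):
        for rep in clusters:
            if identity(seq, rep) > threshold:
                clusters[rep].append(seq)
                break
        else:
            # Not similar enough to any existing representative: start a new cluster.
            clusters[seq] = [seq]
    return clusters


seqs = ["MKTAYIAKQR", "MKTAYIAKQK", "GGGGGGGGGG"]
for rep, members in greedy_cluster(seqs, threshold=0.5).items():
    print(rep, members)
```

A sequence becomes a new representative only when its identity to every existing representative is at or below the threshold. That is the property the entry states for UniRef90 and UniRef50 with cut-offs of 90% and 50%.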
    4.50
    4 votes
    23
    BioPAX : Biological Pathways Exchange

    BioPAX is a collaborative effort to create a data exchange format for biological pathway data.

    BioPAX Level 3 covers metabolic pathways, molecular interactions, signaling pathways (including molecular states and generics), gene regulation and genetic interactions. BioPAX Level 3 is currently under development and review by pathway databases and is scheduled for release by mid-2008.
    7.50
    2 votes
    24
    PC : Pathway Commons

    Pathway Commons is a collection of publicly available pathways from multiple organisms. It provides researchers with convenient access to a comprehensive collection of pathways from multiple sources represented in a common language. Access is via a web portal for browsing, query and download. Database providers can share their pathway data via a common repository and avoid duplication and reduce software development costs. Bioinformatics software developers can increase efficiency by sharing software components. Pathways include biochemical reactions, complex assembly, transport and catalysis events, and physical interactions involving proteins, DNA, RNA, small molecules and complexes.
    7.50
    2 votes
    25
    7.00
    2 votes
    26
    CPD : KEGG Ligand Database for Chemical Compound

    KEGG COMPOUND is a chemical structure database for metabolic compounds and other chemical substances that are relevant to biological systems. Each entry is identified by the C number, such as C00047 for L-lysine, and contains various links to other KEGG databases and also to outside databases.
    • DBGET search
    • LIGAND relational database search
    • Compounds with biological roles
    5.33
    3 votes
    27
    INOH : Pathway Database

    INOH (Integrating Network Objects with Hierarchies) is a pathway database of model organisms including human, mouse, rat and others. In INOH, the term pathway refers to higher order functional knowledge such as relationships among multiple bio-molecules that constitute signal transduction pathways or biological events in general. As most of this knowledge resides in scientific articles, the database focuses on curating and encoding textual knowledge into a machine-processable form. The system contains a number of unique features to encode this type of knowledge. Biological terms such as protein names typically represent abstract, conceptual molecules that are used for unspecified organisms. Biologists interpret the name as a specific instance of a protein using background knowledge. For example, the term "MAP-kinase" indicates ERK1 of a human, JNK1 of a mouse, p38alpha of a rat, etc.
    These abstract names are collected from the literature and are organized into an ontology to annotate abstract objects in pathways. In addition, each term has links to databases such as SWISS-PROT and Gene Ontology (GO). The system provides pathway information as a composite of biological events, since functional knowledge is usually described as a set of fragmented processes. Each event is annotated with entries of an event ontology, which also has links to GO.
    9.00
    1 vote
    28
    PubChem : The PubChem Project

    PubChem provides information on the biological activities of small molecules. It is a component of NIH's Molecular Libraries Roadmap Initiative. If you would like to learn more about how to use the PubChem resources, please go to our help page.
    9.00
    1 vote
    29
    RN : KEGG Ligand Database for Chemical Reaction

    KEGG REACTION contains all reactions taken from KEGG ENZYME and additional reactions taken from the KEGG metabolic pathways. Each reaction is identified by the R number, such as R00259 for the acetylation of L-glutamate. Reactions are linked to ortholog groups of enzymes as defined by the KEGG ORTHOLOGY database, enabling integrated analysis of genomic (enzyme genes) and chemical (compound pairs) information.
    • DBGET search
    • IUBMB reactions
    • IUBMB reaction hierarchy
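The KEGG entries in this list identify drugs by a D number, compounds by a C number (C00047) and reactions by an R number (R00259). A small lookup can route an identifier to the database it belongs to. This is a hedged sketch: the five-digit length is inferred from the two examples given (no D number example appears in the list), and the function name is hypothetical.

```python
import re

# Prefix letter -> KEGG database, as described in the list entries.
KEGG_PREFIXES = {"D": "KEGG DRUG", "C": "KEGG COMPOUND", "R": "KEGG REACTION"}

# Assumption: one prefix letter followed by five digits, inferred from C00047 and R00259.
KEGG_ID = re.compile(r"([DCR])[0-9]{5}")


def kegg_database(identifier: str):
    """Return the KEGG database name for an identifier, or None if it is not recognised."""
    match = KEGG_ID.fullmatch(identifier)
    if match is None:
        return None
    return KEGG_PREFIXES[match.group(1)]


print(kegg_database("C00047"))  # L-lysine, per the KEGG COMPOUND entry
print(kegg_database("R00259"))  # acetylation of L-glutamate, per the KEGG REACTION entry
```

Returning None for unrecognised strings keeps the caller in charge of how to handle identifiers from other KEGG databases, which this sketch does not cover.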
    6.50
    2 votes
    30
    BioCarta : Charting pathways of life

    Observe how genes interact in dynamic graphical models. Our online maps depict molecular relationships from areas of active research. In an "open source" approach, this community-fed forum constantly integrates emerging proteomic information from the scientific community. It also catalogs and summarizes important resources providing information for over 120,000 genes from multiple species. Find both classical pathways as well as current suggestions for new pathways.
    8.00
    1 vote
    31
    GO : Gene Ontology

    The Gene Ontology, or GO, is a major bioinformatics initiative to unify the representation of gene and gene product attributes across all species. More specifically, the project aims to: The GO is part of a larger classification effort, the Open Biomedical Ontologies (OBO). There is no universal standard terminology in biology and related domains, and term usages may be specific to a species, research area or even a particular research group. This makes communication and sharing of data more difficult. The Gene Ontology project provides an ontology of defined terms representing gene product properties. The ontology covers three domains: cellular component, molecular function, and biological process. Each GO term within the ontology has a term name, which may be a word or string of words; a unique alphanumeric identifier; a definition with cited sources; and a namespace indicating the domain to which it belongs. Terms may also have synonyms, which are classed as being exactly equivalent to the term name, broader, narrower, or related; references to equivalent concepts in other databases; and comments on term meaning or usage. The GO ontology is structured as a directed acyclic graph, and each term has defined relationships to one or more other
    6.00
    2 votes
    32
    Path : KEGG PATHWAY Database

    KEGG PATHWAY is a collection of manually drawn pathway maps representing our knowledge on the molecular interaction and reaction networks for:

    1. Metabolism
        Carbohydrate   Energy   Lipid   Nucleotide   Amino acid   Other amino acid
        Glycan   PK/NRP   Cofactor/vitamin   Secondary metabolite   Xenobiotics
    2. Genetic Information Processing
    3. Environmental Information Processing
    4. Cellular Processes
    5. Human Diseases
    and also on the structure relationships (KEGG drug structure maps) in:

    6. Drug Development
    4.33
    3 votes
    33
    EC : The Enzyme Commission

    ENZYME is a repository of information relative to the nomenclature of enzymes. It is primarily based on the recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (IUBMB) and it describes each type of characterized enzyme for which an EC (Enzyme Commission) number has been provided.
    7.00
    1 vote
    34
    CellMap : The Cancer Cell Map

    The Cancer Cell Map is a selected set of human cancer focused pathways. Biologists can browse and search the Cancer Cell Map pathways. View gene expression data on any pathway. Computational biologists can download all pathways in BioPAX format for global analysis. Software developers can build software on top of the Cancer Cell Map using the web service API. Download and install the cPath pathway database software to create a local mirror of the Cancer Cell Map. All data is freely available.
    6.00
    1 vote
    35
    InterPro : Integrated resource of protein families, domains and functional sites

    What is the history of the InterPro project, who established it and when?
    The InterPro database was established in 1999 when the InterPro Consortium was formed between the SWISS-PROT group at EBI and SIB, and the founding member databases Prints, PROSITE, Pfam and ProDom. The first release was later that year. There are several publications on InterPro, please see:

    R.Apweiler, T.K.Attwood, A.Bairoch, A.Bateman, E.Birney, M.Biswas, P.Bucher, L.Cerutti, F.Corpet, M.D.R.Croning, R.Durbin, L.Falquet, W.Fleischmann, J.Gouzy, H.Hermjakob, N.Hulo, I.Jonassen, D.Kahn, A.Kanapin, Y.Karavidopoulou, R.Lopez, B.Marx, N.J.Mulder, T.M.Oinn, M.Pagni, F.Servant, C.J.A.Sigrist, E.M.Zdobnov. The InterPro database, an integrated documentation resource for protein families, domains and functional sites. Nucleic Acids Research vol 29(1):37-40 (2001).

    Mulder N.J., Apweiler R., Attwood T.K., Bairoch A., Bateman A., Binns D., Biswas M., Bradley P., Bork P., Bucher P., Copley R., Courcelle E., Durbin R., Falquet L., Fleischmann W., Gouzy J., Griffith-Jones S., Haft D., Hermjakob H., Hulo N., Kahn D., Kanapin A., Krestyaninova M., Lopez R., Letunic I., Pagni M., Peyruc D., Ponting C.P., Servant F. and Sigrist C.J.A. InterPro - An integrated documentation resource for protein families, domains and functional sites. Briefings in Bioinformatics 3(3):285-295 (2002). In this issue there are all papers related to InterPro.

    What is InterPro? What is the difference between member databases and matches?
    InterPro is a consortium of member databases (PROSITE, Pfam, Prints, ProDom, SMART and TIGRFAMs). Each member database devises methods that can be applied computationally to assign a score for a protein according to how well it matches a given signature. For some types of methods, the classification is binary (i.e. hit or miss), in other cases a numerical value is produced and a cut off point chosen to separate hits from misses.
    Different member databases create methods/signatures in different ways: some groups build them from alignments studied manually, others use automatic processes with some human input and correction, while ProDom use
    5.00
    1 vote
    36
    HGNC : Human Gene Nomenclature Database

    Authority and Responsibilities
    For each known human gene we approve a gene name and symbol (short-form abbreviation). All approved symbols are stored in the HGNC database. Each symbol is unique and we ensure that each gene is only given one approved gene symbol. It is necessary to provide a unique symbol for each gene so that we and others can talk about them; it also facilitates electronic data retrieval from publications. In preference each symbol maintains parallel construction in different members of a gene family and can also be used in other species, especially the mouse.

    We have already approved over 24,000 symbols; the vast majority of these are for protein-coding genes, but they also include symbols for pseudogenes, non-coding RNAs, phenotypes and genomic features (see HGNC Search). Our current priority is assigning nomenclature to genes submitted to us from the Human Genome Project. In addition to this, individual new symbols are requested by scientists, journals (e.g. Genomics, Nature Genetics) and databases (e.g. Ensembl, Entrez Gene, MGD, RGD and OMIM), and groups of new symbols by those working on gene families, chromosome segments or whole chromosomes. In all cases considerable efforts are made to use a symbol acceptable to workers in the field.

    History
    Problems of nomenclature in human genetics were recognised as early as the 1960s and in 1979 full guidelines for human gene nomenclature were presented at the Edinburgh Human Genome Meeting (HGM).  Since then we have continued to strike a compromise between the convenience and simplicity required for the everyday use of human gene nomenclature and the need for adequate definition of the concepts involved.

    The committee has grown from a single force (Dr Phyllis J. McAlpine) to a team of post-docs and bioinformaticians. For eleven years, from 1996 to 2007, the HGNC was chaired by Professor Sue Povey and based at University College London (UCL). In September 2007 the HGNC relocated to the European Bioinformatics Institute (EBI), to join the PANDA (Protein and Nucleotide Database) group. We regularly attend international meetings such as American Society of Human Genetics (ASHG) and Human Genome Meeting (HGM), and sometimes hold workshops in conjunction with these. This ensures that we are approving gene names in line with the needs of the scientific community (see previous workshops).

    Organisation
    We are a non-profit making body which is jointly funded by the US National Human Genome Research Institute (NHGRI) and the Wellcome Trust (UK).  We operate under the auspices of HUGO, with key policy advice from an International Advisory Committee (IAC).  We also use a team of specialist advisors who provide support on specific gene family nomenclature issues, and work in close collaboration with staff at MGNC.

    Confidentiality
    All enquiries are handled confidentially and unpublished information is never disclosed without explicit permission from the submitters. 

    Other Activities

    • Current Nomenclature Guidelines are available online
    • Recent Publications
    • Nome News, our bimonthly newsletter
    • Forthcoming Meetings we will be attending
    • HUMOT Human and Mouse Orthologous Gene Nomenclature
    0.00
    0 votes
    37
    KEGG : Kyoto Encyclopedia of Genes and Genomes

    A grand challenge in the post-genomic era is a complete computer representation of the cell, the organism, and the biosphere, which will enable computational prediction of higher-level complexity of cellular processes and organism behaviors from genomic and molecular information. Towards this end we have been developing a bioinformatics resource named KEGG as part of the research projects of the Kanehisa Laboratories in the Bioinformatics Center of Kyoto University and the Human Genome Center of the University of Tokyo.
    0.00
    0 votes
    38
    OBO Foundry

    The Open Biomedical Ontologies (OBO) Foundry (now The Open Biological and Biomedical Ontologies (OBO) Foundry) is a collaborative experiment involving developers of science-based ontologies. The foundry is concerned with establishing a set of principles for ontology development with the goal of creating a suite of orthogonal interoperable reference ontologies in the biomedical domain. The value of any kind of data is greatly enhanced when it exists in a form that allows it to be integrated with other data. One approach to integration is through the annotation of multiple bodies of data using common controlled vocabularies. Ideally, such controlled vocabularies take the form of 'ontologies', which means that they are constructed in such a way as to support logical reasoning over the data annotated in their terms. The success of this general approach in helping to tame the explosive proliferation of data in the biomedical domain has led to the development of principles of good practice in ontology development, which are now being put into practice within the framework of the Open Biomedical Ontologies consortium through its OBO Foundry initiative. Existing OBO ontologies, including
    0.00
    0 votes
    39
    Pfam

    Pfam is a database of protein families that includes their annotations and multiple sequence alignments generated using hidden Markov models. For each family in Pfam one can: The descriptions of Pfam families are managed by the general public using Wikipedia. 74% of protein sequences have at least one match to Pfam. This number is called the sequence coverage. The Pfam database contains information about protein domains and families. Pfam-A is the manually curated portion of the database that contains over 10,000 entries. For each entry a protein sequence alignment and a hidden Markov model is stored. These hidden Markov models can be used to search sequence databases with the HMMER package written by Sean Eddy. Because the entries in Pfam-A do not cover all known proteins, an automatically generated supplement is provided called Pfam-B. Pfam-B contains a large number of small families derived from clusters produced by an algorithm called ADDA. Although of lower quality, Pfam-B families can be useful when no Pfam-A families are found. The database iPfam builds on the domain description of Pfam. It investigates if different proteins described together in the protein structure
    0.00
    0 votes
    41
    Reactome : A knowledgebase of biological pathways and processes

    Reactome is a database of cellular level processes from simple events, such as biochemical reactions, to complex events, such as the cell cycle. It provides process-level annotation of the structure and function of the Human Genome, with dynamic links to other databases relevant to the human system, and to the relevant literature. Our ontology ensures that the various events are linked in an appropriate spatial and temporal context. The database is produced by faculty-level authors, recruited from the biological research community, who write review-style articles on a set of related pathways using a template tool provided by us. The reviews are edited by the staff at CSHL and the EBI, and entered into a relational database. They are then reviewed by other biological researchers for consistency and accuracy. This database was formerly called The Genome Knowledgebase.

    Reactome is a curated database of biological processes in humans. It covers biological pathways ranging from the basic processes of metabolism to high-level processes such as hormonal signalling. While Reactome is targeted at human pathways, it also includes many individual biochemical reactions from non-human systems such as rat, mouse, fugu fish and zebra fish. This makes the database relevant to the large number of researchers who work on model organisms. All the information in Reactome is backed up by its provenance: either a literature citation or an electronic inference based on sequence similarity. Our ontology ensures that the various events are linked in an appropriate spatial and temporal context.

    The basic information in Reactome is provided by bench biologists who are experts in that domain of biology. The information is then managed and edited by the Reactome staff at CSHL and the EBI, and entered into a relational database. They are then reviewed by other biological researchers for consistency and accuracy. Following peer-review, the information is published to the web.

    Reactome supersedes an earlier project called The Genome Knowledgebase and incorporates all the information previously available in its predecessor. Reactome sports a radically redesigned user interface in which the entire set of human pathways known to the database are represented as a series of constellations in a "starry sky." The starry sky can be used to navigate through the universe of human reactions and is invaluable to visualize connections between pathways, some of which will be surprising to biologists who are not familiar with pathways outside their domain of research.
    0.00
    0 votes