Imagine if we could compute across phenotype data as easily as genomic data; this article calls for efforts to realize this vision and discusses the potential benefits.
%B PLoS Biol
%I Public Library of Science
%V 13
%P e1002033
%8 01
%G eng
%U http://dx.doi.org/10.1371%2Fjournal.pbio.1002033
%R 10.1371/journal.pbio.1002033
%0 Generic
%D 2014
%T Common Reference Ontologies for Plant Biology (cROP): A Platform for Integrative Plant Genomics
%A Cooper, Laurel
%E Justin L. Elser
%E Preece, Justin
%E Arnaud, Elizabeth
%E Sinisa Todorovic
%E Eugene Zhang
%E Christopher Mungall
%E Smith, Barry
%E Dennis Wm. Stevenson
%E Jaiswal, Pankaj
%X Around the world, a small number of plant species serve as the primary source of food for the human population, yet these crops are vulnerable to multiple stressors, such as diseases, nutrient deficiencies and unfavorable environmental conditions. Traditional breeding methods for plant improvement may be combined with next-generation methods such as automated scoring of traits and phenotypes to develop improved varieties. Linking these analyses to the growing corpus of genomics data generated by high-throughput sequencing, transcriptomics, proteomics, phenomics and genome annotation projects requires common, interoperable, reference vocabularies (ontologies) for the description of the data. The ‘Common Reference Ontologies for Plant Biology’ (cROP) initiative is building the needed suite of reference ontologies, together with enhanced data storage and visualization technologies. The cROP will assume the further development of the existing Plant Ontology (PO), Plant Trait Ontology (TO), and Plant Environment Ontology (EO) and will develop the Plant Stress Ontology (PSO) for abiotic and biotic stresses. It will also include relevant aspects of ontologies such as Gene Ontology (GO), Cell Type (CL), Chemical Entities of Biological Interest (ChEBI), Protein Ontology (PRO) and the Phenotypic Qualities Ontology (PATO). It will include a centralized platform where reference ontologies for plants will be used to access cutting-edge data resources for plant traits, phenotypes, diseases, genomes and semantically-queried gene expression and genetic diversity data across a wide range of plant species. cROP will unify and streamline a fragmented semantic framework and will support allele discovery, advance the understanding of crop evolution, and facilitate crop development.
%B Plant and Animal Genome XXII Meeting
%C San Diego, CA
%8 Jan. 11-15, 2014
%G eng
%U https://pag.confex.com/pag/xxii/webprogram/Paper9799.html
%0 Generic
%D 2014
%T Plant Environmental Condition Ontology (EO)
%A Jaiswal, Pankaj
%E Cooper, Laurel
%E Laura Moore
%B Fourth Annual Summit of the Phenotype Ontology Research Coordination Network
%I Phenotype Research Coordination Network
%C Biosphere2, Tucson, AZ
%8 Feb. 21-23, 2014
%G eng
%0 Generic
%D 2013
%T Development of a Unified Phenotype Dataset for Plants
%A Huala, Eva
%E Steven B. Cannon
%E Cooper, Laurel
%E George Gkoutos
%E Lisa C Harper
%E Jaiswal, Pankaj
%E Carolyn J. Lawrence
%E Johnny Lloyd
%E David Meinke
%E Menda, Naama
%E Laura Moore
%E Mueller, Lukas
%E Nelson, Rex T
%E Walls, Ramona L
%X Plant phenotype datasets can be found in a range of formats including free text and species-specific or knowledge domain-specific controlled vocabularies. While this enables some limited comparison of phenotype data across a single species or within a knowledge domain such as crop breeding, queries or analyses that span a broader set of species are not possible in the absence of a common vocabulary for describing phenotypes. To enable cross-species and cross-domain phenotype comparisons and analyses in plants, we have launched an effort to convert existing phenotype datasets for 8 plant species, encompassing both model species and crops, into a common format using taxonomically broad ontologies representing plant anatomical parts and developmental stages (Plant Ontology), biological processes (Gene Ontology), chemicals (ChEBI), and phenotypic qualities (PATO). Our effort focuses on mutant and overexpression phenotypes associated with genes of known sequence in Arabidopsis, tomato, potato, pepper, maize, rice, soybean and Medicago. Shared use of ontologies, annotation standards, formats and best practices across these eight plant species ensures that the resulting dataset will produce valid results for cross-species querying and semantic similarity analyses. Additionally, the dataset will enable us to explore the relationship between sequence similarity and phenotypic similarity across a range of plant species.
%B Plant and Animal Genome 2013
%C San Diego, CA
%8 Jan. 11-16, 2013
%G eng
%U https://pag.confex.com/pag/xxi/webprogram/Paper5616.html
%9 Poster presentation
%0 Generic
%D 2013
%T Development of the Reference Plant Trait Ontology: A Unified Resource for Plant Phenomics
%A Cooper, Laurel
%E Laura Moore
%E Arnaud, Elizabeth
%E Nelson, Rex T
%E Menda, Naama
%E Shrestha, Rosemary
%E Grant, David
%E L. Matteis
%E Mungall, Christopher J
%E Bastow, Ruth
%E McLaren, Graham
%E Jaiswal, Pankaj
%X One of the central principles of biology is the concept that an organism’s genotype interacts with the environment to produce the observable characteristics, or phenotype. Understanding this interaction is a core goal of modern biology, and enables development of organisms with commercially useful characteristics through modern breeding programs. A number of crop- or clade-specific plant trait ontologies have been developed to describe plant traits important for agriculture in order to address major scientific challenges such as food security. Traditionally, phenotype information has been captured in a free text manner, which cannot be easily indexed and presents an obstacle to data sharing. Recent advances in next generation sequencing and phenotyping technologies have allowed researchers to access a growing mountain of data, resulting in an emerging gap between the genomics information and the quantitative information describing phenotypes and traits. One approach to overcome this obstacle is through the annotation of data using a common controlled vocabulary or “ontology”. We present our vision of a species-neutral Reference Plant Trait Ontology (Ref-TO) which would be the basis for linking the disparate knowledge domains and that will support data integration and data mining across species. The Ref-TO is one of the modules for the Common Reference Ontology for Plant Science (cROP) which is being developed.
%B Plant and Animal Genome XXI Conference
%C San Diego, CA
%8 Jan. 11-16, 2013
%G eng
%U https://pag.confex.com/pag/xxi/webprogram/Paper7640.html
%9 Poster
%0 Journal Article
%J Database
%D 2013
%T An overview of the BioCreative 2012 Workshop Track III: interactive text mining task
%A Arighi, Cecilia N.
%A Carterette, Ben
%A Cohen, K. Bretonnel
%A Krallinger, Martin
%A Wilbur, W. John
%A Fey, Petra
%A Dodson, Robert
%A Cooper, Laurel
%A Van Slyke, Ceri E.
%A Dahdul, Wasila
%A Mabee, Paula
%A Li, Donghui
%A Harris, Bethany
%A Gillespie, Marc
%A Jimenez, Silvia
%A Roberts, Phoebe
%A Matthews, Lisa
%A Becker, Kevin
%A Drabkin, Harold
%A Bello, Susan
%A Licata, Luana
%A Chatr-aryamontri, Andrew
%A Schaeffer, Mary L
%A Park, Julie
%A Haendel, Melissa
%A Van Auken, Kimberly
%A Li, Yuling
%A Chan, Juancarlos
%A Muller, Hans-Michael
%A Cui, Hong
%A Balhoff, James P.
%A Chi-Yang Wu, Johnny
%A Lu, Zhiyong
%A Wei, Chih-Hsuan
%A Tudor, Catalina O.
%A Raja, Kalpana
%A Subramani, Suresh
%A Natarajan, Jeyakumar
%A Cejuela, Juan Miguel
%A Dubey, Pratibha
%A Wu, Cathy
%X In many databases, biocuration primarily involves literature curation, which usually involves retrieving relevant articles, extracting information that will translate into annotations and identifying new incoming literature. As the volume of biological literature increases, the use of text mining to assist in biocuration becomes increasingly relevant. A number of groups have developed tools for text mining from a computer science/linguistics perspective, and there are many initiatives to curate some aspect of biology from the literature. Some biocuration efforts already make use of a text mining tool, but there have not been many broad-based systematic efforts to study which aspects of a text mining tool contribute to its usefulness for a curation task. Here, we report on an effort to bring together text mining tool developers and database biocurators to test the utility and usability of tools. Six text mining systems presenting diverse biocuration tasks participated in a formal evaluation, and appropriate biocurators were recruited for testing. The performance results from this evaluation indicate that some of the systems were able to improve efficiency of curation by speeding up the curation task significantly (~1.7- to 2.5-fold) over manual curation. In addition, some of the systems were able to improve annotation accuracy when compared with the performance on the manually curated set. In terms of inter-annotator agreement, the factors that contributed to significant differences for some of the systems included the expertise of the biocurator on the given curation task, the inherent difficulty of the curation and attention to annotation guidelines. After the task, annotators were asked to complete a survey to help identify strengths and weaknesses of the various systems.
The analysis of this survey highlights how important task completion is to the biocurators’ overall experience of a system, regardless of the system’s high score on design, learnability and usability. In addition, strategies to refine the annotation guidelines and systems documentation, to adapt the tools to the needs and query types the end user might have and to evaluate performance in terms of efficiency, user interface, result export and traditional evaluation metrics have been analyzed during this task. This analysis will help to plan for a more intense study in BioCreative IV.
%B Database
%V 2013
%8 2013
%G eng
%U http://database.oxfordjournals.org/content/2013/bas056.abstract
%0 Generic
%D 2013
%T Plant Ontology, a controlled and structured plant vocabulary for all botanical disciplines
%A Brian Atkinson
%E Cooper, Laurel
%E Laura Moore
%E Preece, Justin
%E Nikhil TV Lingutla
%E Sinisa Todorovic
%E Walls, Ramona L
%E Ruth Stockey
%E Gar Rothwell
%E Smith, Barry
%E Gandolfo, Maria A
%E Dennis Wm. Stevenson
%E Jaiswal, Pankaj
%X Recently, plant genome sequencing has expanded to different species of plants. This has dramatically expanded our knowledge of gene expression in plant structures and development, as well as plant evolution. However, due to the vast phylogenetic diversity within the plant kingdom some inconsistencies with terminology have occurred. These conflicting plant vocabularies challenge advancement in the plant sciences; therefore, it is important to have a consistent plant structure vocabulary that encompasses all green plants. The Plant Ontology (PO) has been constructed as a well-structured vocabulary whether the terms are anatomical or developmental. The PO also annotates gene expression data to a wide diversity of plant parts and stages of development, for example, terms can be linked with relevant genes that are expressed during the development of a certain structure. Terms are arranged in a hierarchical structure in which taxon-specific annotations occur; this provides the opportunity for users to compare gene expression in homologous structures across clades. This serves as a critical aid for plant scientists who incorporate large data sets to engage questions on genomics, development, and comparative genetics across different plant groups. The Plant Ontology also provides other resources for plant biologists to use such as the Annotation of Image Segments with Ontologies program (AISO), allowing users to annotate plant structures with relevant terminology and genes from images from digital photography or scanned copies. For example digital images of fossil flowers can be segmented and annotated with Plant Ontology terms, to create an image database where structures can be easily identified and compared with other structures from different specimens in longitudinal and cross sections. The goal of the Plant Ontology is to cultivate a consistent vocabulary for plant biologists across all disciplines of botany.
%B Botany 2013
%C New Orleans, LA
%8 July 27-31, 2013
%G eng
%U http://www.2013.botanyconference.org/engine/search/index.php?func=detail&aid=1337
%9 Poster presentation
%0 Journal Article
%J Plant & Cell Physiology
%D 2013
%T The Plant Ontology as a Tool for Comparative Plant Anatomy and Genomic Analyses
%A Cooper, Laurel
%A Walls, Ramona L
%A Elser, Justin
%A Gandolfo, Maria A
%A Stevenson, Dennis W
%A Smith, Barry
%A Preece, Justin
%A Athreya, Balaji
%A Mungall, Christopher J
%A Rensing, Stefan
%A Hiss, Manuel
%A Lang, Daniel
%A Reski, Ralf
%A Berardini, Tanya Z
%A Li, Donghui
%A Huala, Eva
%A Schaeffer, Mary
%A Menda, Naama
%A Arnaud, Elizabeth
%A Shrestha, Rosemary
%A Yamazaki, Yukiko
%A Jaiswal, Pankaj
%K Alkyl and Aryl Transferases
%K bioinformatics
%K comparative genomics
%K genome annotation
%K Molecular Sequence Annotation
%K Multigene Family
%K ontology
%K Phenotype
%K plant anatomy
%K Plant Proteins
%K Software
%K terpene synthase
%X The Plant Ontology (PO; http://www.plantontology.org/) is a publicly available, collaborative effort to develop and maintain a controlled, structured vocabulary ('ontology') of terms to describe plant anatomy, morphology and the stages of plant development. The goals of the PO are to link (annotate) gene expression and phenotype data to plant structures and stages of plant development, using the data model adopted by the Gene Ontology. From its original design covering only rice, maize and Arabidopsis, the scope of the PO has been expanded to include all green plants. The PO was the first multispecies anatomy ontology developed for the annotation of genes and phenotypes. Also, to our knowledge, it was one of the first biological ontologies that provides translations (via synonyms) in non-English languages such as Japanese and Spanish. As of Release #18 (July 2012), there are about 2.2 million annotations linking PO terms to >110,000 unique data objects representing genes or gene models, proteins, RNAs, germplasm and quantitative trait loci (QTLs) from 22 plant species. In this paper, we focus on the plant anatomical entity branch of the PO, describing the organizing principles, resources available to users and examples of how the PO is integrated into other plant genomics databases and web portals. We also provide two examples of comparative analyses, demonstrating how the ontology structure and PO-annotated data can be used to discover the patterns of expression of the LEAFY (LFY) and terpene synthase (TPS) gene homologs.
%B Plant & Cell Physiology
%V 54
%P 1-23
%8 2013 Feb
%G eng
%U http://pcp.oxfordjournals.org/content/54/2/e1
%N 2
%1 http://www.ncbi.nlm.nih.gov/pubmed/23220694?dopt=Abstract
%& 1
%R 10.1093/pcp/pcs163
%0 Generic
%D 2013
%T The Species-Specific Crop Ontology (Generation Challenge Programme): Application and Integration into the Reference Plant Trait Ontology to Enable Data Mining on Phenotypes
%A Arnaud, Elizabeth
%E Shrestha, Rosemary
%E Kulakow, Peter
%E Bakare, Moshood
%E Antonio Lopez-Montes
%E Ofodile, Sam
%E T., Praveen Reddy
%E Prasad, Peteti
%E Shah, Trushar
%E Hash, Charles Thomas
%E Weltzien-Rattunde, Eva
%E Sissoko, Ibrahima
%E Guerrero, Alberto Fabio
%E Simon, Reinhard
%E Borja-Borja, Nikki Frances
%E Ramil, Mauleon
%E L. Matteis
%E Skofic, Milko
%E Hazekamp, Tom
%E McLaren, Graham
%E Cooper, Laurel
%E Jaiswal, Pankaj
%E Menda, Naama
%E Nelson, Rex
%E Grant, David
%E Bastow, Ruth
%E Rami, Jean-Francois
%X The Crop Ontology (CO) of the Generation Challenge Program (GCP) (http://cropontology.org/) currently contains eleven crop-specific ontologies and has been developed for the Integrated Breeding Platform (IBP) (https://www.integratedbreeding.net/) by several CGIAR centers. The CO provides validated trait names used by crop communities of practice (CoP) for harmonizing the annotation of phenotypic and genotypic data and thus supporting data accessibility and discovery through web queries. The trait information is completed by the description of the measurement methods and scales and images. The trait dictionaries used to produce the Integrated Breeding (IB) fieldbooks are synchronized with the CO terms for automatic annotation of the phenotypic data measured in the field. The CO acts as a trait name server for various sites and databases: the Genotyping Data Management System (GDMS); the cassava database at Cornell University (http://cassavadb.org); Agtrials, the Global Repository for Evaluation Trials of Climate Change, Agriculture and Food Security (CCAFS), a CGIAR Research Program (http://agtrials.org); and the EU-Sol BreedDB website (https://www.eu-sol.wur.nl/). The vision will be presented of a species-neutral and overarching Reference Plant Trait Ontology to support data annotation, integration and data mining across species, which has resulted from the successful collaboration between the CO project, the Plant Ontology (PO; http://www.plantontology.org/), the Trait Ontology (TO; http://www.gramene.org/plant_ontology/), the USDA-ARS SoyBase Database (http://www.soybase.org/), the Solanaceae Genomic Network (http://solgenomics.net/), and GarNet (http://www.garnetcommunity.org.uk/).
%B Plant and Animal Genome XXI Conference
%C San Diego, CA
%8 Jan. 11-16, 2013
%G eng
%U https://pag.confex.com/pag/xxi/webprogram/Paper5002.html
%9 Ontology Workshop Talk
%0 Generic
%D 2012
%T Annotating Gene Expression in