Development of a Unified Phenotype Dataset for Plants

Title: Development of a Unified Phenotype Dataset for Plants
Publication Type: Conference Poster
Presenting Author: Huala E
Secondary Authors: Cannon SB, Cooper L, Gkoutos G, Harper LC, Jaiswal P, Lawrence CJ, Lloyd J, Meinke D, Menda N, Moore L, Mueller L, Nelson RT, Walls RL
Conference Name: Plant and Animal Genome 2013
Conference Location: San Diego, CA
Conference Dates: Jan. 11-16, 2013

Plant phenotype datasets exist in a range of formats, including free text and controlled vocabularies specific to a species or knowledge domain. These formats allow limited comparison of phenotype data within a single species or within a knowledge domain such as crop breeding, but queries and analyses that span a broader set of species are not possible without a common vocabulary for describing phenotypes. To enable cross-species and cross-domain phenotype comparisons and analyses in plants, we have launched an effort to convert existing phenotype datasets for eight plant species, encompassing both model species and crops, into a common format. The format uses taxonomically broad ontologies representing plant anatomical parts and developmental stages (Plant Ontology), biological processes (Gene Ontology), chemicals (ChEBI), and phenotypic qualities (PATO). Our effort focuses on mutant and overexpression phenotypes associated with genes of known sequence in Arabidopsis, tomato, potato, pepper, maize, rice, soybean and Medicago. Shared use of ontologies, annotation standards, formats and best practices across these eight species ensures that the resulting dataset will produce valid results for cross-species querying and semantic similarity analyses. The dataset will also enable us to explore the relationship between sequence similarity and phenotypic similarity across a range of plant species.
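As a rough illustration of what a common, ontology-based phenotype format could make possible, the sketch below stores each phenotype as an entity plus a quality and compares two genes from different species. Everything in it is an assumption made for illustration: the `Phenotype` record, the gene names, and the term labels are hypothetical placeholders rather than real ontology identifiers or the project's actual format, and the plain Jaccard overlap used here is a much simpler stand-in for an ontology-aware semantic similarity measure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phenotype:
    """One hypothetical phenotype statement: an entity plus a quality."""
    species: str
    gene: str
    entity: str   # placeholder label for a Plant Ontology, GO, or ChEBI term
    quality: str  # placeholder label for a PATO term


def statements(annotations):
    """Reduce annotations to species-independent (entity, quality) pairs."""
    return {(p.entity, p.quality) for p in annotations}


def jaccard(a, b):
    """Exact-match Jaccard overlap between two sets of phenotype statements."""
    sa, sb = statements(a), statements(b)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


# Hypothetical genes and placeholder term labels, not real data.
gene_a = [
    Phenotype("Arabidopsis", "geneA", "leaf", "decreased size"),
    Phenotype("Arabidopsis", "geneA", "root", "decreased length"),
]
gene_b = [
    Phenotype("maize", "geneB", "leaf", "decreased size"),
    Phenotype("maize", "geneB", "seed", "altered color"),
]

# One shared statement out of three distinct statements in total.
score = jaccard(gene_a, gene_b)
```

Because both genes are described with the same vocabulary, the comparison works regardless of species. A measure that also exploits the ontology hierarchy would additionally give partial credit to related but non-identical terms, which this sketch does not attempt.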

Author Address

Carnegie Institution for Science, Stanford, CA