SIIM
 
CFA 2010
 
Scientific Abstracts
An Ontology for Population-based Imaging Studies
 
Authors:
Dean Zhang, Wake Forest University School of Medicine; John J. Carr, MD; Yaorong Ge, PhD
 
Background:

One effective approach to human phenotyping employs carefully designed long-term studies of well-sampled population groups. These population-based studies collect a myriad of clinical assessments, as well as many kinds of imaging studies, in order to understand and discover complex relationships among the large number of factors that influence human aging and disease processes. Both the data and the analyses obtained from these studies are extremely valuable for generating and testing new hypotheses as we continuously improve our understanding of human phenotypes and their association with genotypes.

The Multi-Ethnic Study of Atherosclerosis (MESA), the Coronary Artery Risk Development in Young Adults Study (CARDIA), and the Jackson Heart Study (JHS) are three examples of nationwide studies that investigate cardiovascular diseases (CVD). MESA aims to measure subclinical CVD before it becomes symptomatic so that CVD can be detected earlier. Its data are gathered from six field sites around the country that cover multiple ethnic populations.[1] CARDIA is also a multi-site study, with a focus on young adults.[2] JHS has similar goals but focuses on African-American populations in Jackson, Mississippi.[4] All of these studies involve thousands of subjects and perform several types of imaging procedures for each subject, including CT, MR, and ultrasound, at multiple visits that are several years apart.

 

As data and knowledge grow in size and complexity with increasing numbers of studies, the need for an ontology to represent and integrate both data and knowledge in a standardized and formal framework is becoming ever more apparent and urgent. An ontology is a hierarchical model made up of groups of concepts within a defined domain, as well as the relationships among those concepts. The ontology provides a sound foundation for linking data and analyses from multiple studies that frequently use different naming conventions in their data dictionaries and different formats for data storage. It is the beginning of a knowledge-based approach to understanding and discovering deeper and wider associations among phenotypes and genotypes. It is also an ideal framework for capturing published research results so that analyses from multiple studies can be properly compared, validated, and intelligently retrieved through automated reasoning and semantic processing.[5]

 

Prominent existing ontologies include the Foundational Model of Anatomy (FMA) and the Gene Ontology (GO).[3,8] The former models the anatomy of the human body and the relationships between anatomical parts using ontological properties.[8] The latter provides a structured set of vocabularies that can be used to define the gene products of any organism.[3] In recent years, the Radiological Society of North America (RSNA) has initiated the development of RadLex to model the concepts used in medical imaging and radiology practices.[7] Additionally, the Open Biomedical Ontologies (OBO) Foundry provides a collection of biomedical ontologies, as well as general principles, tools, and support for ontology development.[6] However, none of the existing ontologies is sufficient for capturing the concepts and relationships related to population-based clinical studies.

The focus of this project is to develop an ontology for population-based studies that initially covers the imaging data and clinical concepts related to studies of cardiovascular diseases. Our current development incorporates publicly available data and concepts from MESA, CARDIA, and JHS.

 
Evaluation:

Ontology development is an iterative process. The level of detail in an ontology depends closely on the goals and applications it serves. In this first iteration of development, we set two initial goals for this project. The first is to provide a standard framework for representing and exploring data from multiple CVD-related studies that use different data variables. The second is to provide a formal model for representing significant CVD-related findings that are discovered in multiple different studies.

 

We began by reviewing the published literature from MESA and other major studies, as well as the MESA data dictionary, to generate a general list of concepts. During weekly meetings, we edited this list down to a more focused set of concepts related to coronary artery calcium (CAC), participant demographics, and imaging procedures. Initially, three top-level concepts (Anatomy, Disease, and Imaging Procedure) were defined for the ontology (see Figure 1). The concept of Anatomy was modeled using FMA: anatomical concepts related to CAC were selected from FMA and placed into our ontology while retaining the original FMA hierarchy. The concept of Disease was modeled using a combination of concepts taken from the Unified Medical Language System (UMLS) and SNOMED CT, as well as terms we created. The concept of Imaging Procedure was modeled almost entirely using terms we created. The concepts in our ontology were organized based on the Is_A relationship. Additionally, we added other concept relationships, such as Part_Of, Measured_By, and Belongs_To.[9] The anatomical relationships were based on FMA. Other relationships were added based on basic science knowledge along with UMLS, SNOMED CT, and published literature. In order to maintain the Is_A organizational scheme, additional top-level concepts were added (see Figure 2). Currently, we have 8 top-level concepts and 12 different relationship types.

 

Figure 1

 

Figure 2
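As a rough illustration of how the Is_A hierarchy and the other relationship types described above fit together, the ontology can be pictured as a set of subject-relationship-object triples. The sketch below is not the authors' implementation. Only "Coronary artery Part_Of Heart" comes from the abstract itself; the other triples, the "Cardiac CT" concept, and the function names are assumptions made for illustration.

```python
from collections import defaultdict

# Illustrative (subject, relationship, object) triples. Only
# "Coronary artery Part_Of Heart" is stated in the abstract; the
# remaining triples are assumptions used to make the example work.
TRIPLES = [
    ("Heart", "Is_A", "Anatomy"),
    ("Coronary artery", "Is_A", "Anatomy"),
    ("Coronary artery", "Part_Of", "Heart"),
    ("CT", "Is_A", "Imaging Procedure"),
    ("Cardiac CT", "Is_A", "CT"),  # hypothetical sub-concept
    ("Coronary artery calcium", "Measured_By", "CT"),
]


def build_index(triples):
    """Group triples as index[subject][relationship] -> set of objects."""
    index = defaultdict(lambda: defaultdict(set))
    for subject, relation, obj in triples:
        index[subject][relation].add(obj)
    return index


def related(index, subject, relation):
    """Return the concepts directly linked to `subject` by `relation`."""
    return set(index.get(subject, {}).get(relation, set()))


def ancestors(index, concept):
    """Follow Is_A links upward and collect every ancestor of `concept`."""
    seen = set()
    stack = [concept]
    while stack:
        current = stack.pop()
        for parent in related(index, current, "Is_A"):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


INDEX = build_index(TRIPLES)
```

With this index, `ancestors(INDEX, "Cardiac CT")` walks up through "CT" to the top-level concept "Imaging Procedure", while `related(INDEX, "Coronary artery", "Part_Of")` returns the non-hierarchical link to "Heart".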

 

The Protégé Ontology Editor was chosen as the development tool. General principles from published articles on ontology development were carefully followed to ensure the correctness and integrity of our ontology. Top-level concepts were created as Classes. Following the Is_A hierarchy, additional subclasses were added to the top-level classes to represent more detailed concepts. Slots describing relationships were added to the classes to create additional relationships among the classes. For example, the concept Anatomy was given the slot Part_Of. This allowed a concept such as Coronary artery to be described as Part_Of the concept Heart. Additional slots were added to classes to allow them to have descriptive properties. For example, the concept CT has slots to describe how it is measured and what part of the anatomy it affects.
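The class-and-slot arrangement described above can be sketched in a few lines of Python. This is a simplified, hypothetical model rather than Protégé's actual API: the `OntologyClass` type and its methods are invented for illustration, and it assumes that a slot attached to a class is also available on its subclasses, which is how frame-based class hierarchies are commonly modeled.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class OntologyClass:
    """Hypothetical frame-style class: a name, an Is_A parent, and slots."""
    name: str
    parent: Optional["OntologyClass"] = None             # Is_A link
    own_slots: List[str] = field(default_factory=list)   # slots declared here
    values: Dict[str, str] = field(default_factory=dict)  # filled-in slots

    def slots(self) -> List[str]:
        """Slots declared on this class plus those inherited via Is_A."""
        inherited = self.parent.slots() if self.parent else []
        return inherited + [s for s in self.own_slots if s not in inherited]

    def set_slot(self, slot: str, value: str) -> None:
        """Fill a slot, rejecting slots the class does not have."""
        if slot not in self.slots():
            raise KeyError(f"{self.name} has no slot {slot!r}")
        self.values[slot] = value


# Anatomy is given the slot Part_Of, as in the text.
anatomy = OntologyClass("Anatomy", own_slots=["Part_Of"])
heart = OntologyClass("Heart", parent=anatomy)
coronary_artery = OntologyClass("Coronary artery", parent=anatomy)

# Coronary artery inherits Part_Of from Anatomy and points it at Heart.
coronary_artery.set_slot("Part_Of", heart.name)
```

Because the slot is declared once on Anatomy, every anatomical subclass can use it, and an attempt to fill an undeclared slot fails instead of silently adding a new relationship.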

 
Discussion:

To test our ontology, MESA study data were obtained from the study website. Basic participant demographic data and CAC imaging data were entered into Protégé as instances. MESA dataset variable names were matched with the corresponding data dictionary entries, and each value was then manually entered into the appropriate slot in Protégé (see Figure 4). The same process of matching a data variable to a data dictionary term and then entering the value into Protégé was performed for JHS and CARDIA data. These limited data samples have allowed us to successfully test queries that retrieve data instances across multiple studies. Further efforts are under way to expand the ontology to include other clinical and genetic concepts related to CVD, as well as analysis results. Additional system features are also under development to programmatically map ontological concepts to data instance sources, so that queries to the ontology system retrieve data samples directly from the actual data sources without manual data import.

 

Figure 4
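The matching step described above, in which each study's variable names are tied to a shared concept so that one query can span studies, might look roughly like the following. This is a minimal sketch under stated assumptions: the variable names, participant identifiers, and values are all invented placeholders, not real MESA, JHS, or CARDIA data, and `query_by_concept` is a hypothetical helper rather than part of the authors' system.

```python
from typing import Dict, List, Tuple

# Hypothetical data-dictionary mapping: (study, variable name) -> concept.
# The variable names are invented for illustration only.
VARIABLE_TO_CONCEPT: Dict[Tuple[str, str], str] = {
    ("MESA", "mesa_cac"): "Coronary artery calcium score",
    ("JHS", "jhs_calcium"): "Coronary artery calcium score",
    ("CARDIA", "cardia_cac_total"): "Coronary artery calcium score",
    ("MESA", "mesa_age"): "Age",
}

# Placeholder instances with made-up values (no real participant data).
RECORDS: List[dict] = [
    {"study": "MESA", "participant": "M-001", "variable": "mesa_cac", "value": 12.5},
    {"study": "MESA", "participant": "M-001", "variable": "mesa_age", "value": 61},
    {"study": "JHS", "participant": "J-001", "variable": "jhs_calcium", "value": 0.0},
    {"study": "CARDIA", "participant": "C-001", "variable": "cardia_cac_total", "value": 3.2},
]


def query_by_concept(concept, records, mapping):
    """Return (study, participant, value) for records mapped to `concept`."""
    return [
        (r["study"], r["participant"], r["value"])
        for r in records
        if mapping.get((r["study"], r["variable"])) == concept
    ]


cac_rows = query_by_concept("Coronary artery calcium score", RECORDS, VARIABLE_TO_CONCEPT)
```

Here a single query on the shared concept returns one row from each study even though every study uses a different variable name, which is the kind of cross-study retrieval the tests in the abstract were designed to exercise.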

 
Conclusion:
We have developed an ontology for population-based studies that currently focuses on imaging and cardiovascular concepts. Preliminary tests show that the ontology can successfully model imaging data from MESA and other CVD studies such as JHS and CARDIA. We expect the complete ontology to greatly enhance CVD research with better integration of data and analyses from multiple studies.
 
References:

1. Bild, et al. Multi-Ethnic Study of Atherosclerosis: Objectives and Design. American Journal of Epidemiology. 2002;156:871-881.

2. CARDIA. Brief Description. http://www.cardia.dopm.uab.edu/o_brde.htm.

3. The Gene Ontology Consortium. Creating the Gene Ontology Resource: Design and Implementation. Genome Res. 2001;11:1425-1433.

4. Jackson Heart Study. Design, Rationale, and Objectives. http://www.nhlbi.nih.gov/about/jackson/2ndpg.htm.

5. Noy, NF, McGuinness, DL. Ontology Development 101: A Guide to Creating Your First Ontology. Available at: http://protege.stanford.edu/publications/ontology_development/ontology101.pdf.

6. The Open Biomedical Ontologies. http://www.obofoundry.org/.

7. RadLex. http://www.radlex.org/viewer.

8. Rosse C, Mejino JLV Jr. A reference ontology for biomedical informatics: the Foundational Model of Anatomy. Journal of Biomedical Informatics. 2003;36:478-500.

9. Schulz S, Kumar A, Bittner T. Biomedical ontologies: What part-of is and isn’t. Journal of Biomedical Informatics. 2006;39:350-361.

10. Spackman KA, Campbell KE, Cote RA. SNOMED RT: A Reference Terminology for Health Care. Proc AMIA Symp. 1997:640-644.

11. US Department of Health and Human Services, National Institutes of Health, National Library of Medicine. Unified Medical Language System (UMLS), 2008.