Scientific Abstracts
Using Knowledge Discovery Techniques to Identify a Novel Predictor of Breast Cancer: Breast Mass Density
 
Authors:
Ryan Woods, MD, MPh, University of Wisconsin School of Medicine and Public Health; Lonie Salkowski, MD; Gale Sisney, MD; Kazuhiko Shinki, MS; Louis Oliphant, MS; David Page, PhD; Jude Shavlik, PhD; Elizabeth Burnside, MD, MPh, MS; Charles Kahn, MD, MS
 
Hypothesis:

Knowledge discovery techniques applied to a large dataset of structured mammography reports can identify valuable predictors of disease. We identify that high mass density is predictive of breast cancer and validate this discovery in an independent dataset.

 
Introduction:

Knowledge discovery techniques can generate novel hypotheses from the large amounts of biomedical data now available. In the radiology community, one such repository is the National Mammography Database, which contains structured mammography reports built on the Breast Imaging Reporting and Data System (BI-RADS) lexicon developed for reporting mammographic abnormalities. We used two techniques, inductive logic programming (ILP) and a probabilistic reasoning method, to generate a set of novel hypotheses describing the relationship between BI-RADS mammogram descriptors and breast cancer. One unanticipated and interesting hypothesis generated by these techniques was that higher breast mass density on a mammogram is predictive of malignancy. Although high breast mass density has been asserted as a predictor of malignancy in the past, research has questioned this assertion.[1,2,3] To test the validity of this hypothesis, we assessed the relationship between breast mass density and breast cancer outcomes in an independent dataset of structured mammography reports linked to pathologic outcomes.

 
Methods:

The two knowledge discovery techniques were applied to a dataset of consecutive screening and diagnostic mammograms from a large, urban medical center between 1999 and 2004, consisting of over 47,000 mammograms on 18,270 patients. The ILP technique uses a system, Aleph (“A Learning Engine for Proposing Hypotheses”), to generate rules based on benign and malignant examples in the dataset.[4] The most predictive rules were then reviewed by a subspecialty-trained breast radiologist for clinical significance.[5] The probabilistic reasoning technique analyzed the same dataset and ranked individual descriptors and pairs of descriptors by their ability to correctly predict malignancy. Each technique revealed a predictive association between high breast mass density and malignancy.
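
The descriptor-ranking step can be illustrated with a minimal sketch. The abstract does not say which score the probabilistic reasoning method used, so the sketch below assumes a simple stand-in: the fraction of findings carrying a descriptor (or descriptor pair) that are malignant. The toy findings, the descriptor strings, and the `rank_descriptors` helper are all hypothetical and are not taken from the study.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy findings: (set of BI-RADS-style descriptors, is_malignant).
FINDINGS = [
    ({"mass_density=high", "margin=spiculated"}, True),
    ({"mass_density=high", "margin=circumscribed"}, True),
    ({"mass_density=high", "margin=circumscribed"}, False),
    ({"mass_density=low/iso", "margin=circumscribed"}, False),
    ({"mass_density=low/iso", "margin=circumscribed"}, False),
    ({"mass_density=low/iso", "margin=spiculated"}, True),
]


def rank_descriptors(findings, min_support=2):
    """Rank single descriptors and descriptor pairs by the fraction of
    findings carrying them that are malignant (an assumed stand-in score)."""
    counts = defaultdict(lambda: [0, 0])  # key -> [malignant count, total count]
    for descriptors, malignant in findings:
        items = sorted(descriptors)
        keys = [(d,) for d in items] + list(combinations(items, 2))
        for key in keys:
            counts[key][0] += int(malignant)
            counts[key][1] += 1
    scored = [
        (key, pos / total, total)
        for key, (pos, total) in counts.items()
        if total >= min_support  # ignore descriptors seen too rarely to score
    ]
    # Highest malignant fraction first; ties go to larger support, then key order.
    return sorted(scored, key=lambda s: (-s[1], -s[2], s[0]))


ranking = rank_descriptors(FINDINGS)
for key, score, support in ranking:
    print(" & ".join(key), f"{score:.2f}", f"n={support}")
```

A real analysis would likely use a score that accounts for sample size and chance agreement; the minimum-support filter here is only a crude guard against ranking descriptors seen once.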

 

To validate this hypothesis, we used an independent dataset of 359 consecutively collected solid masses biopsied between October 2005 and December 2007. BI-RADS descriptive data, including overall breast density, mass density (low, isodense, or high density as compared to normal parenchymal tissue), and size (greatest transverse width), were automatically collected using National Mammography Database format queries from an electronic structured mammography report system (PenRad®).[6,7] Because mass density was not consistently recorded at the time of mammography, all masses were retrospectively assessed for mass density by 3 subspecialty-trained breast radiologists. Because relatively few masses are characterized as low density, low density and isodense masses were grouped as “low/iso” density for all analyses. Pathological outcome (benign vs. malignant) was determined at the time of biopsy by a pathologist.

 

A binary logistic regression model was created to determine the relative contribution of mass density to cancer outcome while controlling for age, size of the mass, overall breast density, and the effect of the reading radiologist. Forward stepwise selection based on the Akaike Information Criterion (AIC) was used to compare models as predictors were added.
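
As a rough illustration of forward stepwise AIC selection for a binary logistic model, the sketch below fits models by Newton-Raphson in plain Python and adds one predictor at a time while AIC (2k minus twice the log-likelihood) keeps decreasing. Everything in it is an assumption for illustration: the data are simulated rather than the study data, the predictor names and effect sizes are invented, the reading-radiologist effect is omitted, and the abstract does not state which software the authors used.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def predict(beta, rows):
    """Predicted probabilities from a logistic model."""
    return [1 / (1 + math.exp(-sum(b * v for b, v in zip(beta, r)))) for r in rows]


def fit_logistic(rows, y, iters=12):
    """Fit by Newton-Raphson; return (coefficients, log-likelihood)."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iters):
        p = predict(beta, rows)
        grad = [sum((yi - pi) * r[j] for r, yi, pi in zip(rows, y, p)) for j in range(k)]
        hess = [[sum(pi * (1 - pi) * r[i] * r[j] for r, pi in zip(rows, p))
                 for j in range(k)] for i in range(k)]
        beta = [b + s for b, s in zip(beta, solve(hess, grad))]
    p = [min(max(pi, 1e-12), 1 - 1e-12) for pi in predict(beta, rows)]
    loglik = sum(yi * math.log(pi) + (1 - yi) * math.log(1 - pi) for yi, pi in zip(y, p))
    return beta, loglik


def aic(loglik, n_params):
    return 2 * n_params - 2 * loglik


def forward_stepwise(data, y, candidates):
    """Add the predictor that lowers AIC most; stop when none lowers it."""
    selected = []
    best = aic(fit_logistic([[1.0]] * len(y), y)[1], 1)  # intercept-only model
    while len(selected) < len(candidates):
        trials = []
        for name in (c for c in candidates if c not in selected):
            cols = selected + [name]
            rows = [[1.0] + [d[c] for c in cols] for d in data]
            trials.append((aic(fit_logistic(rows, y)[1], len(cols) + 1), name))
        trial_aic, name = min(trials)
        if trial_aic >= best:
            break
        selected.append(name)
        best = trial_aic
    return selected, best


# Simulated data (hypothetical): only age and high mass density drive the outcome.
random.seed(0)
data, y = [], []
for _ in range(400):
    d = {
        "age": random.gauss(0, 1),   # standardized age
        "size": random.gauss(0, 1),  # standardized mass size
        "high_mass_density": float(random.random() < 0.25),
        "dense_breast": float(random.random() < 0.5),
    }
    z = -1.0 + 1.0 * d["age"] + 1.5 * d["high_mass_density"]
    data.append(d)
    y.append(1 if random.random() < 1 / (1 + math.exp(-z)) else 0)

null_aic = aic(fit_logistic([[1.0]] * len(y), y)[1], 1)
selected, final_aic = forward_stepwise(data, y, list(data[0]))
print("selected:", selected, "AIC:", round(final_aic, 1), "null AIC:", round(null_aic, 1))
```

Because AIC penalizes each added parameter by only 2, a predictor with no true effect can occasionally enter the model by chance, so the selected set on simulated data may include more than the two predictors that generated the outcome.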

 
Results:

The mean age of patients in the study was 54.9 years (SD 13.4, range 31-95). The masses ranged in size from 1 mm to 72 mm (mean 13.7 mm, SD 8.7 mm); 269 masses were considered low/iso density and 90 were considered high density. There were 236 benign masses and 123 malignant masses. In the logistic regression model, only breast mass density (p<.0001) and age (p<.0001) were predictive of malignancy; size and overall breast density were removed from the final model during model selection.
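
The reported counts can be checked against the 359 masses in the validation dataset with a few lines of arithmetic. The snippet below uses only the numbers stated above; the variable names are illustrative.

```python
# Counts reported in the Results section.
low_iso, high = 269, 90
benign, malignant = 236, 123

total = low_iso + high
# Both breakdowns should account for all 359 biopsied masses.
assert total == benign + malignant == 359

malignant_share = malignant / total
high_density_share = high / total
print(f"malignant: {malignant_share:.1%}, high density: {high_density_share:.1%}")
# prints "malignant: 34.3%, high density: 25.1%"
```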

 
Discussion:

In this study, we validated a hypothesis generated by two knowledge discovery techniques by showing that high breast mass density is predictive of breast cancer in an independent dataset. These results suggest that (1) knowledge discovery techniques are a way of uncovering new hypotheses from the large amounts of available biomedical data, and (2) breast mass density is predictive of breast malignancy, even after controlling for age, size of mass, and overall breast density. These techniques are clinically important, particularly in the radiology community, where increasingly large amounts of structured data are becoming available and where identifying novel hypotheses is challenging.

 
Conclusion:

A novel hypothesis generated by two knowledge discovery techniques, that breast mass density is predictive of breast cancer, was validated in an independent dataset of structured mammography reports. These knowledge discovery techniques can be used in biomedical research to identify hypotheses not previously considered.

 
References:
1. Egan RL. Breast imaging: Diagnosis and morphology of breast diseases. Philadelphia: Saunders. 1988.
2. Homer MJ. Mammographic interpretation: A practical approach. 2nd ed. New York: McGraw-Hill, Health Professions Division. 1997.
3. Jackson VP, Dines KA, Bassett LW, Gold RH, Reynolds HE. Diagnostic importance of the radiographic density of noncalcified breast masses: analysis of 91 lesions. AJR Am J Roentgenol. July 1991;157(1):25-28.
4. Aleph [computer program]. Version 4; 2001.
5. Burnside ES, Davis J, Costa VS, et al. Knowledge discovery from structured mammography reports using inductive logic programming. AMIA Annu Symp Proc. 2005:96-100.
6. Osuch JR, Anthony M, Bassett LW, et al. A proposal for a national mammography database: Content, purpose, and value. AJR Am J Roentgenol. June 1995;164(6):1329-1334; discussion 1335-1336.
7. PenRad Technologies, Inc.: Minnetonka, MN