SIIM
 
CFA 2010
 
Scientific Abstracts

Evaluation of Negation Detection and its Impact on Precision in Search

 
Authors:

Andrew S. Wu, MD, University of Iowa Hospitals and Clinics; Jinsuh Kim, MD; Bao H. Do, MD; Daniel L. Rubin, MD, MS

 
Hypothesis:

Recognizing negated phrases in radiology reports can improve the specificity of search beyond what is currently achievable from keyword-based search.

 
Introduction:

Radiology reports are the primary work product of radiologists, and they contain a wealth of information that could be mined for teaching, research, administrative, and quality assurance purposes. Radiologists need to search radiology reports to locate specific findings and diagnoses, and there is a growing body of work on methods to search radiology reports and to extract information from them.[1-5] A good report search engine could do for radiology what Google has done for the World Wide Web: rapid and relevant retrieval of the documents that users seek.

 

Several search engines for radiology reports have been described that retrieve reports based on word content.[1,3,6] These search engines look for keywords that match terms in the user’s search. For example, search engines can look through collections of radiology reports and return a list of cases containing the word “appendicitis.”

 

A challenge limiting the success of current radiology search engines is their inability to differentiate between positive findings and those mentioned in the context of uncertainty or negation. For example, three reports with impressions “No evidence of appendicitis,” “Cannot completely exclude appendicitis,” and “Acute appendicitis” would all be retrieved in a user search for “appendicitis,” even though only one of them is actually a case of appendicitis. These false positives clutter the search results and reduce search efficiency.

Our goal is to develop a method to improve the precision of radiology report searches. We describe RadReportMiner, a negation-aware search engine, and compare its performance with a currently available keyword-based method of search.

 
Methods:

Following IRB approval, 238,952 radiology reports for exams performed between January 1, 2006, and December 31, 2006, were acquired from the radiology archives and imported into a MySQL database.

To create RadReportMiner, we built PHP scripts that applied a modified version of NegEx, an algorithm developed to detect negation in medical reports.[7] We expanded NegEx’s functionality: (1) by adding negation phrases specific to the radiology domain, such as “no evidence of,” “no longer visualized,” and “has healed”; (2) by implementing uncertainty detection through a separate category of “Hedge” terms, such as “cannot exclude” and “not ruled out”; (3) by sectioning the report into history, findings, and impression segments; and (4) by computing a relevance score using rules listed in Table 1, which classifies each report (positive, negative, uncertain, hedges) and sorts the reports by decreasing relevance to the radiologist. The RadReportMiner algorithm is otherwise similar to NegEx, using regular expressions to search sentences for negations preceding and following the keyword, up to 6 words apart (inclusive).[7]
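The sentence-level matching described above can be sketched as follows. This is a minimal illustration in Python, not the authors' PHP implementation: the trigger lists contain only the phrases quoted in the text, the function names are hypothetical, the window is read as at most 6 intervening words, and checking hedges before negations is an assumption. Report sectioning and the Table 1 relevance rules are not modeled.

```python
import re

# Only the trigger phrases quoted in the text; the full lexicon is not given here.
NEGATION_TRIGGERS = ["no evidence of", "no longer visualized", "has healed"]
HEDGE_TRIGGERS = ["cannot exclude", "not ruled out"]
MAX_GAP = 6  # assumed reading of "up to 6 words apart (inclusive)"


def _phrase(text):
    """Regex for a phrase that tolerates any whitespace between its words."""
    return r"\s+".join(re.escape(word) for word in text.split())


def _near(sentence, keyword, triggers):
    """True if any trigger occurs within MAX_GAP words before or after keyword."""
    kw = _phrase(keyword)
    gap = r"(?:\W+\w+){0,%d}?\W+" % MAX_GAP  # up to MAX_GAP intervening words
    for trigger in triggers:
        trig = _phrase(trigger)
        before = r"\b" + trig + gap + kw + r"\b"
        after = r"\b" + kw + gap + trig + r"\b"
        if re.search(before, sentence, re.I) or re.search(after, sentence, re.I):
            return True
    return False


def classify_sentence(sentence, keyword):
    """Label one sentence as 'positive', 'negative', or 'hedge'; None if no keyword."""
    if not re.search(r"\b" + _phrase(keyword) + r"\b", sentence, re.I):
        return None
    if _near(sentence, keyword, HEDGE_TRIGGERS):  # assumption: hedges take priority
        return "hedge"
    if _near(sentence, keyword, NEGATION_TRIGGERS):
        return "negative"
    return "positive"


def classify_mentions(text, keyword):
    """Split text into sentences and label every sentence that mentions keyword."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    labels = [classify_sentence(s, keyword) for s in sentences]
    return [label for label in labels if label is not None]


impressions = "No evidence of appendicitis. Cannot exclude appendicitis. Acute appendicitis."
print(classify_mentions(impressions, "appendicitis"))
# ['negative', 'hedge', 'positive']
```

With a fixed window like this, a trigger more than 6 words away from the keyword goes undetected and the mention is labeled positive.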

 

We evaluated our method by asking a radiologist to perform five term searches using the radiology reports database. The radiologist chose “appendicitis,” “optic neuritis,” “pneumonia,” “hydronephrosis” and “fracture” as the search terms. After each search term was entered, the first 100 results were displayed (or fewer if fewer were found), and a radiologist manually categorized each report as “positive,” “negative” or “uncertain” for the finding or diagnosis in question, thus establishing the ground truth. The same set of reports was then classified automatically by RadReportMiner (as positive, negative, or hedge) and the results compared against the radiologist’s results. The reports were also written to individual files for processing by Google Desktop.

 

We chose Google Desktop, a common word-based indexing engine, to compare with RadReportMiner. Google Desktop version 5.7.0806 was configured to search the radiology reports. After Google Desktop completed indexing the reports, the same five searches used to evaluate the RadReportMiner system were performed with Google Desktop, and the results were sorted “by relevance.”

Precision and recall were calculated for RadReportMiner and Google Desktop. Fisher’s Exact Test was used to determine statistical significance of the differences between the two search methods for each search term. Wilcoxon Signed Ranks Test was used to detect differences in overall precision and recall between the two search methods.
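As a rough illustration of these calculations, the sketch below computes precision and recall from sets of report IDs and a two-sided Fisher's exact p-value for a 2x2 table. The function names and the toy inputs are hypothetical, and the exact contingency tables the authors tested are not stated in this abstract; the Wilcoxon test is not shown.

```python
from math import comb


def precision_recall(retrieved_positive, truth_positive):
    """Precision and recall, given sets of report IDs.

    retrieved_positive: reports the search method returned as positive.
    truth_positive: reports the radiologist marked as positive.
    """
    true_positives = len(retrieved_positive & truth_positive)
    precision = true_positives / len(retrieved_positive) if retrieved_positive else 0.0
    recall = true_positives / len(truth_positive) if truth_positive else 0.0
    return precision, recall


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2, col1, total = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= observed * (1 + 1e-7)  # tolerance for floating-point ties
    )


# Toy inputs for illustration only (not data from this study).
precision, recall = precision_recall({1, 2, 3, 4}, {2, 3, 5})
print(precision, recall)
print(fisher_exact_two_sided(3, 1, 1, 3))
```

One plausible layout for the table, assumed here, is one row per search method with columns for relevant and irrelevant retrieved reports.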

 
Results:

Four hundred sixty-four (464) reports were returned by the 5 searches. Of these, 119 (26%) were marked as positive by the radiologist. RadReportMiner classified 106 reports as positive, and achieved an overall per-document precision of 81% and a recall of 72%. Google Desktop returned 385 documents and achieved a precision of 28% and a recall of 87%. RadReportMiner’s overall precision is significantly higher than Google Desktop’s, whether weighting each report equally (per-document; p < 0.0001) or weighting each search term equally (per-term; p = 0.042; median difference of 48% with 95% confidence interval 34%-75%). Google Desktop generally achieved higher recall than RadReportMiner. The difference is statistically significant per-document (p = 0.0057), but not statistically significant per-term (p = 0.273). Table 2 lists the results by search term.

 

The majority of false positives from RadReportMiner were due to unrecognized hedges (11) and word distance greater than 6 words (5). Table 3 lists the causes and examples of false positives.

The majority of false negatives from RadReportMiner were due to absence of the keyword in the findings/impression sections (39) and negation of some but not all instances of the keyword (10). Table 4 lists the causes and examples of false negatives.

 

Table 1

Table 2

Table 3

Table 4

 
Discussion:

The ideal radiology search engine has both high recall (retrieving all the reports relevant to a user's interest from the entire database of reports) and high precision (retrieving mostly relevant reports among all reports retrieved). For searching radiology reports, precision is more important than recall. When a radiologist uses a search engine to find reports containing a specific finding or diagnosis, they want high-precision searches that minimize the number of false-positive retrieved reports. The radiologist must read and manually exclude each irrelevant report, which wastes precious time and lowers efficiency.

 

There is generally a trade-off between precision and recall. Current word-based search engines perform superbly in terms of recall, but often lack precision. Our goal in this study was to improve precision by recognizing negations and uncertainties. Our results demonstrate improved precision of our system over the word-based Google Desktop search engine. We consider our results beneficial to radiologists. Given a choice between wading through 100 reports to find 16 positive appendicitis cases and looking through 20 reports for 15 positives, most radiologists would likely choose the latter. In fact, when analyzed in this fashion, one must go through an average of 36.8 more Google Desktop results to find the same number of positives returned by RadReportMiner, amounting to reading 137% additional reports.
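The appendicitis illustration above can be worked through directly. The short sketch below uses only the counts quoted in that example (100 reports for 16 positives versus 20 reports for 15 positives); the helper name is hypothetical. It does not reproduce the 36.8-report and 137% averages, which depend on per-term data not included in this abstract.

```python
def reports_read_per_positive(reports_read, positives_found):
    """Average number of reports a reader goes through per positive case found."""
    return reports_read / positives_found


# Counts taken from the appendicitis example in the text.
keyword_precision = 16 / 100          # 0.16
negation_aware_precision = 15 / 20    # 0.75

keyword_cost = reports_read_per_positive(100, 16)         # 6.25 reports per positive
negation_aware_cost = reports_read_per_positive(20, 15)   # about 1.33 reports per positive

print(keyword_precision, negation_aware_precision)
print(keyword_cost, negation_aware_cost)
```

Reading cost per positive is simply the reciprocal of precision, which is why a precision gain translates directly into fewer reports read.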

 

Limitations of this study include the limited number of terms searched and the small number of reports analyzed. Further improvements to our algorithm can be made in the areas of synonym identification, scoring for multiple keyword +/- negation instances, hedge identification, and more robust history/findings sectioning.

 
Conclusion:

Adding negation identification to a word-based search engine improves the precision of search results over a search engine that does not take this information into account. Our approach could be adopted into current report retrieval systems to help radiologists search radiology reports more accurately.

 
References:

1. Desjardins B, Hamilton RC. “A practical approach for inexpensive searches of radiology report databases.” Acad Radiol. June 2007;749-756.
2. Dreyer KJ, Kalra MK, Maher MM, et al. “Application of recently developed computer algorithm for automatic classification of unstructured radiology reports: validation study.” Radiology. February 2005;323-329.
3. Erinjeri JP, Picus D, Prior FW, Rubin DA, Koppel P. “Development of a Google-Based Search Engine for Data Mining Radiology Reports.” J Digit Imaging. April 5, 2008.
4. Rubin DL, Desser TS. “A data warehouse for integrating radiologic and pathologic data.” J Am Coll Radiol. March 2008;210-217.
5. Wong ST, Hoo KS Jr, Cao X, Tjandra D, Fu JC, Dillon WP. “A neuroinformatics database system for disease-oriented neuroimaging research.” Acad Radiol. March 2004;345-358.
6. Ramaswamy MR, Patterson DS, Yin L, Goodacre BW. “MoSearch: a radiologist-friendly tool for finding-based diagnostic report and image retrieval.” Radiographics. July 1996;923-933.
7. Chapman WW, Bridewell W, Hanbury P, Cooper GF, Buchanan BG. “A simple algorithm for identifying negated findings and diseases in discharge summaries.” J Biomed Inform. October 2001;301-310.