Scientific Abstracts
Evaluation of Probabilistic Algorithm for Matching Patients’ Radiology Data at Two Medical Centers
 
Authors:
Joseph N. Gitlin, DPH, FSIIM, The Johns Hopkins School of Medicine; Daisy Nie; Tyler R. McClintock
 
Hypothesis:

There is no significant difference in the accuracy of matching patient records from identification parameters using a probabilistic algorithm compared to human review of medical records.

 
Introduction:

With the adoption of electronic patient records and the development of Regional Health Information Organizations (RHIOs), a critical component of system reliability is the accuracy of patient identification when multiple encounters are involved. This initial automated activity must run in real time, after the physician’s authorization to access the patient’s information has been verified. Though accuracy has long been viewed as the cornerstone of any successful patient identification system, deciding which method ensures precise automated data matching can be difficult.

 

After reviewing the available alternatives, we conducted this study to determine whether the “probabilistic” approach satisfies the stringent requirements for accurate patient identification. “Deterministic” matching systems use a combination of algorithms and governing rules to determine when two or more records match. For example, a rule might instruct the system to match two records with different names if the Social Security number and address fields coincide. Records either satisfy the rules or they do not. Probabilistic matching uses likelihood ratio theory to assign comparison outcomes to the correct, or more likely, decision. This method leverages statistical theory and data analysis and, thus, can establish more accurate links than deterministic systems between records that have complex typographical errors and error patterns.
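The contrast between the two approaches can be illustrated with a minimal sketch. It is not the Initiate Systems algorithm, whose details are not given here. It assumes a likelihood-ratio scoring in the style of Fellegi and Sunter, and the field names, the m/u probabilities, and the two sample records are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical per-field probabilities (illustrative only, not from the study):
#   m = P(field agrees | records belong to the same patient)
#   u = P(field agrees | records belong to different patients)
FIELD_PROBS = {
    "last_name": (0.95, 0.01),
    "ssn": (0.90, 0.0001),
    "address": (0.80, 0.005),
    "dob": (0.97, 0.003),
}

def deterministic_match(a, b):
    """Rule from the text: match if SSN and address coincide, even if names differ."""
    return a["ssn"] == b["ssn"] and a["address"] == b["address"]

def probabilistic_weight(a, b):
    """Sum of log2 likelihood ratios over all fields (higher = more likely a match)."""
    total = 0.0
    for field, (m, u) in FIELD_PROBS.items():
        if a[field] == b[field]:
            total += math.log2(m / u)              # agreement adds evidence
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts evidence
    return total

# Two made-up records with a name typo and an address formatting difference.
rec_a = {"last_name": "Smith", "ssn": "123-45-6789",
         "address": "1 Main St", "dob": "1950-01-02"}
rec_b = {"last_name": "Smyth", "ssn": "123-45-6789",
         "address": "1 Main Street", "dob": "1950-01-02"}

rule_result = deterministic_match(rec_a, rec_b)
weight = probabilistic_weight(rec_a, rec_b)
print(rule_result, round(weight, 2))
```

In this made-up pair the rule fails because the address strings differ, while the summed weight stays positive because the agreeing SSN and date of birth outweigh the two disagreements. A real system would also need a threshold on the weight, which is another assumption not specified here.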

 

Many healthcare facilities have implemented electronic patient record systems, and there is substantial interest at many levels of government, industry, and academe in developing viable health information organizations. This is especially pertinent to the Baltimore/Washington “patient catchment” area that includes healthcare facilities representing university medical centers, private hospitals, military and VA treatment organizations, and a wide variety of clinics and offices. The diversity of healthcare delivery programs in this area poses a substantial challenge to the developers of a RHIO in terms of quality of care, patient and provider acceptance, and economic viability.

 

Among the many technical problems to be resolved is the implementation of a patient identification algorithm that will facilitate the use of specified healthcare information when requested by authorized providers. In addition to accuracy of patient identification, the system must satisfy stringent requirements to protect patient and provider privacy, confidentiality of medical records, and security of transmission media. This preliminary study is a basic effort to evaluate a probabilistic matching approach, developed by Initiate Systems, that provides flexibility in linking large datasets and supports statistical analysis.

 
Methods:

The design of a pilot study of RHIO functions, in which several healthcare facilities participate, must include a patient identification algorithm that will facilitate the use of specified healthcare information when requested by authorized providers. We chose records related to the delivery of radiology services at two medical centers to test the hypothesis regarding the accuracy of matching patient records. During the three-year period July 1, 2003 - June 30, 2006, approximately 1.2 million radiology examinations were performed at Johns Hopkins Hospital and 250,000 at Bayview Medical Center.

 

The primary objectives of the study were to:

1) Use a probabilistic algorithm to determine the number and distribution of “matching” radiology patients seen at both Johns Hopkins Hospital and Bayview Medical Center during a three-year period.
2) Compare the distributions of patient characteristics, such as age, gender, and location, in the matching and non-matching groups to determine if there are statistically significant differences.
3) Validate the probabilistic algorithm by comparing a sample of the processed radiology examination records with the corresponding Electronic Patient Records (EPR).

 

The basic data for the study were obtained from the Radiology Information System (RIS) records for each imaging examination performed at Johns Hopkins and Bayview during the selected time period. Each record contained the following patient parameters:

- Institution - either Johns Hopkins or Bayview
- History Number - eight numeric characters uniquely identifying a patient
- Last Name, First Name, Middle Initial
- Social Security Number
- Street Address, City, State, ZIP Code
- Telephone Number
- Date of Birth
- Gender
- Date of Radiology Examination
- Current Procedural Terminology (CPT) code

 

The probabilistic algorithm was used to assign a unique master patient index (MPI) number to the records of each individual patient that was identified. In addition to the basic data and the MPI, a “matching indicator” was added to the study record as follows:

 

- “Singleton” - no matching record
- “Same Source” - multiple matching records at the same institution
- “Crossover” - multiple examinations with at least one at each of the two institutions

 

When the patient identification parameters were processed by the probabilistic algorithm, the MPI assigned to each record made it possible to distinguish patients with multiple examinations from those with a single procedure. It also supported the identification of “crossover” patients (i.e., those who were seen at both hospitals during the study period).
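The three matching indicators follow directly from the MPI and institution on each record. The sketch below shows one way to derive them; the function name, the tuple layout, and the sample MPI values are assumptions for illustration, not the study's actual file format.

```python
from collections import defaultdict

def matching_indicators(records):
    """Map each MPI to 'Singleton', 'Same Source', or 'Crossover'.

    records: iterable of (mpi, institution) pairs, one per examination record.
    """
    by_mpi = defaultdict(list)
    for mpi, institution in records:
        by_mpi[mpi].append(institution)

    labels = {}
    for mpi, institutions in by_mpi.items():
        if len(institutions) == 1:
            labels[mpi] = "Singleton"      # no matching record
        elif len(set(institutions)) == 1:
            labels[mpi] = "Same Source"    # several records, one institution
        else:
            labels[mpi] = "Crossover"      # at least one record at each institution
    return labels

# Hypothetical examination records.
sample = [
    ("MPI-1", "Johns Hopkins"),
    ("MPI-2", "Bayview"),
    ("MPI-2", "Bayview"),
    ("MPI-3", "Johns Hopkins"),
    ("MPI-3", "Bayview"),
]
labels = matching_indicators(sample)
print(labels)
```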

 

The results were provided to the study team at Johns Hopkins for further processing and statistical analysis. The accuracy of matching was expressed in terms of “true positives,” “true negatives,” “false positives,” and “false negatives.” False positives occur when the system mistakenly links records that should not be matched, and false negatives result when the system fails to link two records that should have been matched.
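These four outcomes can be counted over pairs of records once a reference identity is known for each record. The following is a minimal sketch that assumes pairwise counting, which is one common convention and may not be how the study team tallied errors. The record IDs, MPI values, and reference identities are hypothetical.

```python
from itertools import combinations

def pairwise_confusion(assigned, truth):
    """Count link outcomes over all record pairs.

    assigned: record_id -> MPI given by the matching algorithm
    truth:    record_id -> patient identity established by manual review
    """
    counts = {"TP": 0, "FP": 0, "FN": 0, "TN": 0}
    for r1, r2 in combinations(sorted(assigned), 2):
        linked = assigned[r1] == assigned[r2]
        same_patient = truth[r1] == truth[r2]
        if linked and same_patient:
            counts["TP"] += 1
        elif linked and not same_patient:
            counts["FP"] += 1   # records linked that should not be matched
        elif not linked and same_patient:
            counts["FN"] += 1   # records that should have been linked
        else:
            counts["TN"] += 1
    return counts

# Hypothetical data: the algorithm splits patient p2 into two MPIs.
assigned = {"r1": "A", "r2": "A", "r3": "B", "r4": "C"}
truth = {"r1": "p1", "r2": "p1", "r3": "p2", "r4": "p2"}
confusion = pairwise_confusion(assigned, truth)
print(confusion)
```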

 

After obtaining approval from the Johns Hopkins Institutional Review Board (IRB) to conduct the study, particularly because of the emphasis on Personal Health Information (PHI) in the protocol, the selected RIS records were transferred to Initiate for processing by their probabilistic algorithm. The records were transferred using an established Initiate website with a secure FTP connection that was tested by Hopkins and Bayview representatives.

 

After processing and assigning an MPI to each record, the files were returned by Initiate to the study team for tabulation and analysis. The files of the radiology examinations performed at each of the two selected medical centers were analyzed to compare the matching status of patients during the three-year period.

 

To test the accuracy of the matching algorithm, the study team drew a representative sample of 1 in 1,000 patient examination records from the files of the two medical centers. The sample contained 240 patients with a total of 1,207 examinations. Each sample record was compared with its corresponding clinical record in the Hopkins Electronic Patient Record (EPR) system to review such parameters as physician notes, radiological image reports, and diagnoses related to each examination.

 
Results:

The comparison of the representative sample records with the EPR clinical information indicated no mistakes in patient identification in terms of “false positives” or “false negatives” by the probabilistic algorithm used by Initiate to assign MPI numbers to each record in the study. Based on this review, we can be 95 percent confident that the “true error rate” is less than 0.8 percent, using Bayesian inference.
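The abstract does not state which prior or sample size underlies this bound. As an illustration only, the sketch below assumes zero errors among the 240 sampled patients and a Jeffreys Beta(0.5, 0.5) prior, which gives a Beta(0.5, 240.5) posterior for the error rate. Under those assumptions the 95 percent upper credible bound comes out near 0.8 percent. The function name and the numerical method (Simpson's rule plus bisection, using only the standard library) are choices made for this example.

```python
import math

def jeffreys_upper_bound(n, level=0.95, steps=2000):
    """Upper credible bound for an error rate after 0 errors in n trials.

    Assumes a Jeffreys Beta(0.5, 0.5) prior, so the posterior is
    Beta(0.5, n + 0.5). Substituting x = t**2 removes the singularity
    of the posterior density at x = 0.
    """
    power = n - 0.5

    def integrand(t):
        return max(0.0, 1.0 - t * t) ** power

    def integral(upper):
        # Composite Simpson's rule on [0, upper]; steps must be even.
        h = upper / steps
        total = integrand(0.0) + integrand(upper)
        for i in range(1, steps):
            total += (4 if i % 2 else 2) * integrand(i * h)
        return total * h / 3

    norm = integral(1.0)
    lo, hi = 0.0, 1.0
    for _ in range(50):  # bisection on the posterior CDF
        mid = (lo + hi) / 2
        if integral(math.sqrt(mid)) / norm < level:
            lo = mid
        else:
            hi = mid
    return hi

bound = jeffreys_upper_bound(240)
print(f"95% upper bound: {100 * bound:.2f}%")
```

A different prior or a different choice of n (for example, the 1,207 examinations rather than the 240 patients) would give a different bound, so this should be read as one plausible reconstruction rather than the authors' calculation.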

 

The results of this study will be shown in tables of patients and radiology examinations by matching status for selected parameters such as age, gender, date of examination, and ZIP code. The selected table shows large differences in the number of examinations among the matching status categories. The largest category, “Same Source,” accounted for 1,181,275 examinations; the “Crossover” category included 163,799; and the “Singleton” category had 72,969 examinations. The numeric table can be converted to percentages for each of the gender groups, and the differences in matching status tested for significance.
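The conversion to percentages can be checked with the three examination totals quoted above. This short sketch uses only those figures; the per-gender breakdown is not reproduced in the text, so it is not included.

```python
# Examination counts by matching status, as reported in the text.
counts = {
    "Same Source": 1_181_275,
    "Crossover": 163_799,
    "Singleton": 72_969,
}

total = sum(counts.values())
shares = {status: round(100 * n / total, 1) for status, n in counts.items()}

for status, pct in shares.items():
    print(f"{status:12s} {counts[status]:>10,d}  {pct:5.1f}%")
print(f"{'Total':12s} {total:>10,d}")
```

On these figures, “Same Source” accounts for about 83.3 percent of the 1,418,043 examinations, “Crossover” for about 11.6 percent, and “Singleton” for about 5.1 percent.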

 

Table 1

 
Discussion:

The primary outcome variables in the study are the number and percentage of “crossover” patients who have had radiology examinations at each of the two participating medical centers. Other outcome variables include the number and percentages of matching and non-matching patients by age, gender, and type and month of examination.

 

The results of this study will contribute to the design and conduct of a pilot study to demonstrate the feasibility of exchanging electronic health records across networks of providers. It is expected that an integrated health enterprise will improve the delivery of healthcare in terms of quality and efficiency by enabling authorized providers to have timely access to their patients’ records that are at other participating facilities. This will facilitate medical decision making based upon comprehensive information and reduce or eliminate the duplication of procedures that were performed elsewhere.

 
Conclusion:

The results of this study indicate that a reliable, accurate patient identification algorithm is available to facilitate the planning and implementation of a Regional Health Information Organization.

 