SIIM
 
Scientific Abstracts

Highly Available Radiology Information System: An Example Implementation

 
Authors:
David W. Piraino, MD, FSIIM, Cleveland Clinic; Roy Kittelberger; Bradford Richmond, MD
 
Background:

Our imaging computer systems have become essential to patient care and to the operation of our departments. If these systems become unavailable for any reason, our patients suffer and our revenue stream is compromised. It is imperative that our systems be designed to be continuously available. High availability is “a system design protocol and associated implementation that ensures a certain absolute degree of operational continuity during a given measurement period.”[1] A system is available when users are able to perform productive work on the system.

 

Downtime refers to the time a system is unavailable. Downtime can be planned or unplanned. Both planned and unplanned downtime decrease the availability of the system. Unplanned downtime is typically more disruptive than planned downtime.

 

Availability can be calculated as the percentage of time the system is available. Table 1 shows availability percentages and the corresponding amount of downtime per week.
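The relationship between an availability percentage and weekly downtime can be sketched in a few lines of Python. This is a minimal illustration and not part of the original abstract: the function names are hypothetical, and the sample percentages are illustrative rather than values taken from Table 1.

```python
MINUTES_PER_WEEK = 7 * 24 * 60  # 10,080 minutes in one week


def availability_percent(available_hours: float, total_hours: float) -> float:
    """Availability as the percentage of the period the system was usable."""
    if total_hours <= 0 or not 0 <= available_hours <= total_hours:
        raise ValueError("need 0 <= available_hours <= total_hours, total_hours > 0")
    return 100.0 * available_hours / total_hours


def downtime_minutes_per_week(percent_available: float) -> float:
    """Minutes of downtime per week implied by an availability percentage."""
    if not 0.0 <= percent_available <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return (100.0 - percent_available) / 100.0 * MINUTES_PER_WEEK


if __name__ == "__main__":
    # Illustrative percentages only (assumed, not from the abstract's table).
    for pct in (99.0, 99.9, 99.99):
        print(f"{pct}% available: {downtime_minutes_per_week(pct):.2f} min/week down")
```

For example, 99% availability leaves 1% of 10,080 minutes, or roughly 100.8 minutes of downtime per week, whether planned or unplanned.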

 

As the complexity of a system increases, the system has more potential points of failure. Unfortunately for us, our imaging computer systems are becoming increasingly complex, which requires that we pay closer attention to system design and standard maintenance.

 

The Wikipedia article on high availability has a simple statement of the basic principle of highly available systems: “The most highly available systems hew to a simple design pattern: a single, high quality, multi-purpose physical system with comprehensive internal redundancy running all interdependent functions paired with a second, like system at a separate physical location.”[1] We implemented our RIS using this basic principle of highly available systems.

 

Our design is shown in Figure 1. First, there are two completely independent computer systems with their own storage, located in separate buildings two city blocks apart. These systems are internally redundant, with redundant power supplies, mirrored disks, etc. Each system has its own software and database. The servers operate in an active-passive configuration: Server 1 is the active server unless we fail over to Server 2.

 

The context switches virtualize the IP address of both servers. If Server 1 is the active server, it becomes the server with the designated IP address. If Server 2 is the active server, the context switch redirects the same IP address to Server 2. This allows IP-based communication, such as DICOM, to be redirected. Server 1 and Server 2 can communicate through the context switch using a different IP address. The context switches are redundant; only one is needed for the systems to function properly.
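The redirection behavior described above can be modeled as a small lookup. The following Python is a conceptual sketch only, not the configuration of any real switch: the class names, server names, and IP addresses (taken from the 192.0.2.0/24 documentation range) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    real_ip: str  # the server's own address, used for server-to-server traffic


class ContextSwitch:
    """Toy model: one designated (virtual) IP that maps to the active server."""

    def __init__(self, virtual_ip: str, server1: Server, server2: Server):
        self.virtual_ip = virtual_ip
        self.servers = {server1.name: server1, server2.name: server2}
        self.active = server1.name  # Server 1 is active until a failover

    def set_active(self, name: str) -> None:
        if name not in self.servers:
            raise KeyError(f"unknown server: {name}")
        self.active = name

    def resolve(self, dest_ip: str) -> Server:
        # Traffic to the designated IP (e.g. DICOM) follows the active server.
        if dest_ip == self.virtual_ip:
            return self.servers[self.active]
        # Traffic to a server's own address always reaches that server.
        for server in self.servers.values():
            if server.real_ip == dest_ip:
                return server
        raise LookupError(f"no route for {dest_ip}")


if __name__ == "__main__":
    switch = ContextSwitch(
        "192.0.2.10",
        Server("server1", "192.0.2.11"),
        Server("server2", "192.0.2.12"),
    )
    print(switch.resolve("192.0.2.10").name)  # active server before failover
    switch.set_active("server2")
    print(switch.resolve("192.0.2.10").name)  # same IP, other server after failover
```

Clients keep using one address throughout, so nothing on the client side needs to be reconfigured when the active server changes.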

 

The independent databases are kept synchronized using database replication. The replication takes place from the active server to the passive server. The database replication first places events/transactions in a replication queue and then transmits them to the second database over the network. These events/transactions are then placed in a queue on the replicated system, then committed to the database.
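The two-queue flow described above can be sketched as follows. This is a simplified in-memory model under stated assumptions, not the replication mechanism of any particular database product: the class and function names are hypothetical, and committing locally before queuing is an assumption the abstract does not spell out.

```python
from collections import deque


class ReplicatedDatabase:
    """Toy database with an outbound and an inbound replication queue."""

    def __init__(self):
        self.rows = {}
        self.outbound = deque()  # replication queue on the active server
        self.inbound = deque()   # queue on the replicated (passive) system

    def commit_local(self, key, value):
        # Assumption: the active server commits locally, then queues the event.
        self.rows[key] = value
        self.outbound.append((key, value))


def transmit(active, passive, link_up=True):
    """Move queued events across the network; nothing moves if the link is down."""
    sent = 0
    while link_up and active.outbound:
        passive.inbound.append(active.outbound.popleft())
        sent += 1
    return sent


def apply_inbound(passive):
    """Commit received events to the passive database in arrival order."""
    applied = 0
    while passive.inbound:
        key, value = passive.inbound.popleft()
        passive.rows[key] = value
        applied += 1
    return applied


if __name__ == "__main__":
    active, passive = ReplicatedDatabase(), ReplicatedDatabase()
    active.commit_local("exam-1", "ordered")
    active.commit_local("exam-1", "completed")
    transmit(active, passive)
    apply_inbound(passive)
    print(passive.rows == active.rows)  # databases match once both queues drain
```

Events that cannot be transmitted simply stay in the outbound queue, which is why the queue contents matter during an uncontrolled failover.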

 

In a controlled failover from one system to another, the application is first shut down and then the database replication is allowed to complete so that the second system’s database is identical to the first. The application can then be started on the second system, and the context switches redirect client and other traffic to the second server. In an uncontrolled failover, the replication queue may hold transactions that were not transmitted to the second server. These transactions should remain in the replication queue and can be retransmitted to the second system after recovery.
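The ordering of the controlled failover steps, and how an uncontrolled failover differs, can be sketched as below. This is an illustrative state model only, not the authors' actual procedure or scripts: the `Node` class, its fields, and both function names are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    app_running: bool = False
    db: list = field(default_factory=list)                   # committed transactions
    replication_queue: deque = field(default_factory=deque)  # not yet transmitted


def controlled_failover(primary: Node, secondary: Node) -> str:
    """Return the name of the server the context switches should point to."""
    # 1. Shut down the application so no new transactions are created.
    primary.app_running = False
    # 2. Let replication complete: drain the queue into the second database.
    while primary.replication_queue:
        secondary.db.append(primary.replication_queue.popleft())
    if secondary.db != primary.db:
        raise RuntimeError("databases differ after replication; do not fail over")
    # 3. Start the application on the second system.
    secondary.app_running = True
    # 4. The caller redirects the designated IP address to this server.
    return secondary.name


def uncontrolled_failover(primary: Node, secondary: Node) -> int:
    """Primary is lost abruptly; return how many transactions are stranded."""
    primary.app_running = False
    secondary.app_running = True
    # Stranded transactions stay queued for retransmission after recovery.
    return len(primary.replication_queue)


if __name__ == "__main__":
    primary = Node("server1", True, ["t1", "t2", "t3"], deque(["t3"]))
    secondary = Node("server2", db=["t1", "t2"])
    print(controlled_failover(primary, secondary))
```

The check in step 2 reflects the requirement that the second database be identical to the first before the application is started on it.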

 
Evaluation:

The high availability design described above is presently being used at our institution for our RIS. We presently perform between 1.7 and 2 million exams per year and have over 10 years of exam history in our system. Presently, the failover process is a manual task. We have set guidelines about when to fail over if there are problems with the primary server or network issues affecting the primary server. Testing of the failover process has shown that it takes from 10 to 20 minutes to complete a controlled failover and begin running on the second server. We can then fail back to the primary server if the downtime is less than one day. If the downtime is longer than one day, it is faster to copy the database, bring replication back up, and then fail back to the primary system. We are using this failover and failback process during the move of the primary server to our new data center.

 
Conclusion:

Designing highly available systems has become more important in imaging informatics. Physically separated, independent systems that operate in an active-passive manner can increase the time your system is available. This is certainly not the only method to increase the availability of your system. These high availability designs should only supplement good systems administration, appropriate hardware, well-designed software, and data center facilities that meet requirements.

 
References:
1. “High Availability.” Wikipedia. 2007. Accessed 12 Sept. 2008. Available at: http://en.wikipedia.org/wiki/High_availability.