Securing Electronic Medical Records Using Modified Blowfish Algorithm

Received May 17, 2018; Revised Aug 23, 2018; Accepted Sep 6, 2018
EMR has helped improve services to patients by bringing organization and accuracy to patient information, but security breaches and medical identity theft are growing concerns. This paper enhances the current EMR system by integrating a modified encryption algorithm. The simulation used the modified Blowfish algorithm in an EMR system and focused on four goals: 1) define the requirements, 2) design and identify features, 3) develop the EMR incorporating an added security mechanism using the modified Blowfish algorithm, and 4) test the application with sample data. Testing and checking on the input terminal and the database server showed that the encryption was incorporated successfully. Data entered in the EMR system was encrypted before transmission and decrypted only on the terminal for viewing. Performance results show that saving took an average of 87.8ms without encryption and 88.8ms with encryption, a difference of 1ms. The difference is minimal because the data is small. Viewing all records took an average of 1342ms with the modified algorithm and 1322ms with plaintext. The viewing time is higher by 20ms because of the decryption step.


INTRODUCTION
An Electronic Medical Record (EMR) is a controlled document that contains essential and sensitive patient information [1]. Protecting patient privacy is deemed valuable, as evidenced by the restrictions imposed on sharing information and on the security of physical repositories [2] [3]. Because privacy is a fundamental right, laws in each country control and regulate access to medical information to protect patient confidentiality.
There is an increasing desire to use the public infrastructure like the Internet to store, send, or receive private information for availability and sharing of public and private digital data. Because of this, data security breaches and medical identity theft are growing concerns, with thousands of cases reported every year [4]. Issues like this have driven industry professionals and researchers to dedicate attention to information security for protection against unauthorized access and attacks [5].
One way of protecting patient information is through the application of cryptography. Cryptography is the study of information hiding and of achieving security by encoding messages to make them non-readable [6] [7]. The use of cryptography addresses data privacy preservation and the security of electronic health records against modification and unauthorized access during transmission [1], [8]-[10].
Tarlac State University (TSU) is envisioned to be a premier university not only in Tarlac but in the Asia Pacific Region. Its mission is not only to provide high-quality instruction, enhance research undertakings, and strengthen collaboration with institutions, but also to ensure safe and healthy working conditions for its employees and students [11]. The TSU medical clinic was established to provide free medical support to students, faculty, and staff of the University. Currently, the medical clinic uses a paper-based system that stores medical records in files. To take advantage of information technology, the clinic asked the Management Information System Office (MISO) of the University to digitize its medical records, which were previously handwritten or typed. Studies on the superiority of EMR over paper-based approaches point to improved quality of documentation of recorded patient information through better organization and standardization of clinical data [12] [13]. Most enhancements respond to the increasing difficulty of handling medical data accurately and efficiently. The EMR is envisioned to help the medical clinic manage its records.
OpenEMR, GNUmed, OpenMRS, OSCAR, GNU Health, and others are among the many open source applications in healthcare. A study that evaluated the data security of these applications found that protection mostly consists of passwords and some backup mechanisms, while secure storage and secure communication are lacking [14]. EMR, though considered the central element of health IT infrastructure, has drawbacks in implementation such as exposure to cyber-attacks [15]. Individuals may attempt to steal a patient's identity, resulting in financial implications for the patient [4]. Secure communication is a priority requirement for EMR, so encryption approaches and traffic shaping algorithms are put in place to ensure secure access to data [16]. Data in transit, or data in motion, is data moving between locations, for instance across the Internet or through a private network, and is considered less secure [17]. Data in transit receives less emphasis and has been found to be less protected. Because data is susceptible to eavesdropping while in motion, encryption is needed to protect it [18], [19].
In 1994, Bruce Schneier designed the Blowfish algorithm as an alternative to the aging DES. It is a symmetric block cipher that operates on 64-bit input blocks and accepts a variable key size of 32 to 448 bits [20]. Blowfish is fast (except when changing keys), compact, easy to understand, easy to implement, and a free alternative to existing encryption algorithms, and its variable key length gives a variable security level [21]. Blowfish has been used to strengthen the encryption of file records and electronic documents that contain medical information [22]-[24].
Although Blowfish is remarkable, it still uses a 64-bit input block, which raises the chance of duplicate blocks when encrypting files and can leak information. The subkeys produced by the key expansion need around 4 kB of storage, which makes the algorithm unsuitable for devices with small memory. The modified Blowfish algorithm was developed to address these weaknesses, namely the input block size and the memory needed to store the subkeys. The change in block size allows files to be encrypted with a reduced chance of identical blocks. The number of S-boxes is reduced from four to two to lower memory consumption while keeping the original structure for ease of migration. Because the number of S-boxes is reduced, a derivation technique was added to remove symmetry.
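As a rough illustration of why the block size matters, the standard birthday approximation can be applied. This is a sketch under an assumption that is not stated in the source: blocks are treated as uniformly random b-bit values, and n is the number of blocks encrypted under one key. The 128-bit figure corresponds to the block size of the modified algorithm described later in the paper.

```latex
% Birthday approximation (assumption: uniformly random b-bit blocks, n << 2^{b/2}):
% probability that at least two of n blocks are identical.
\[
  P_{\text{dup}}(n, b) \approx \frac{n(n-1)}{2^{\,b+1}}
\]
% Repeats become likely once n approaches 2^{b/2} blocks:
\[
  b = 64:\quad 2^{32}\ \text{blocks} \times 8\ \text{bytes} = 2^{35}\ \text{bytes} = 32\ \text{GiB}
\]
\[
  b = 128:\quad 2^{64}\ \text{blocks} \times 16\ \text{bytes} = 2^{68}\ \text{bytes}
\]
```

Under this approximation, doubling the block size squares the number of blocks that can be processed before duplicates are expected.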
This paper intends to enhance the current system by integrating an encryption scheme into an EMR system. A simulation presents the use of the modified Blowfish algorithm in the EMR. The work focuses on four primary objectives: 1) to define the requirements needed in the development of the EMR, 2) to design and identify the features to be included, 3) to develop an EMR incorporating an added security mechanism using the modified Blowfish algorithm, and 4) to test the application with sample data. The use of cryptography in EMR will benefit health care patients because it addresses data privacy preservation and the encryption of patient health records for transmission over the network infrastructure.

RESEARCH METHOD
Rapid Application Development (RAD) is based on prototyping and iterative development. It focuses on gathering customer requirements through workshops or focus groups, early and iterative testing of prototypes by the customer, reuse of existing prototypes (components), continuous integration, and rapid delivery. Since RAD fits the project time frame, this model was adopted for the system development.
Interviews were conducted at both the medical clinic and the Management Information System Office for data gathering. After a series of questions and answers, the requirements were formulated. Sample data were taken from the medical records of employees of Tarlac State University (TSU) and from a medical record sample. Since health records are considered private, the actual names of the persons involved will be replaced with dummy names during the testing phase. Data privacy will be strictly imposed. The structure of the database is taken from the existing database of the TSU medical records system.
Simulation work on the algorithm using medical records will be carried out with VB.NET on the .NET Framework. VB.NET (Visual Basic) is an object-oriented programming language developed by Microsoft that runs on the .NET Framework on a Microsoft Windows operating system. An HP computer with an Intel® Core™ i5-7200U processor running at 2.50 GHz, a Windows platform, and 8GB of installed memory will be used to carry out the proposed work.

RESULTS AND ANALYSIS
Requirement Definition
Interviews were used for data gathering, and after a series of questions and answers, the following requirements were formulated. Registered users must be able to log in to the EMR by providing a username and password. The administrator is the superuser who can access all modules. The administrator can add, edit, and view the information of any EMR system user. The nurse account shall be able to add, edit, and view the medical profile, which includes the patient's personal information and medical history. The nurse can enter initial details during the consultation, consisting of the general appearance and vital signs of the patient. The doctor account can view the patient's information and add the result of the diagnosis. The nurse account can record the physical examination of employees for their Annual Physical Exam (APE). The nurse account can log the physical examination of students for their pre-employment medical exam. The nurse account can record laboratory examinations such as hematology, blood chemistry, urinalysis, ultrasound, 2D echo with Doppler, stress, and other test results. The system can provide a medical certificate for the patient. The nurse assesses the patient and can add the patient to the queueing system if the patient needs a doctor's assistance for on-site consultation. The system shall be able to provide medical services internally or on-site. The system shall be able to provide transmission encryption using the modified Blowfish algorithm.

Design and Modules
Based on the gathered requirements and after deliberation, the final design of the customized EMR is composed of the following features:
a. Patient queueing: The patient queueing module allows organized numbering of patients for on-site consultation. Patients are numbered in order of appearance as assessed by the nurse on duty.
g. User management and privileges: The user management module allows the administrator to handle account privileges. This module enables the administrator to add, edit, and view the users that can access the EMR.
h. Audit trail: This module records all transactions done in the EMR as audit trail logs. This feature is included for accountability purposes since EMR records are considered private and confidential.
i. Database back-up and restoration: The back-up and restoration module enables the user to back up the database from time to time to a location set by the user, to minimize risk in case of unforeseen events. This module can also restore the saved database from any location.
j. Login module: This module allows authorized users to log in to the EMR system.

Security Mechanism
The modified Blowfish algorithm is included as an added security mechanism that provides encryption. Figure 1 illustrates the process of encryption and decryption. The representative or any authorized user encodes the medical information in the EMR. The process of encryption starts the moment the record is saved. Before the data is transmitted to the server, it is encrypted using the modified Blowfish algorithm and the key. The resulting ciphertext, not the plaintext, is sent to the server. Only selected private and sensitive information is encrypted. During viewing, the encrypted medical information is fetched from the server and then decrypted using the modified Blowfish algorithm and the key. Once decrypted, the information is shown to the authorized user or representative as readable text.
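The save and view flow described above can be sketched as follows. This is a minimal illustration only: the field names, the helper names `protect_record` and `reveal_record`, and the dummy record are hypothetical, and `placeholder_cipher` is a trivial stand-in so the example runs on its own. It is not the modified Blowfish algorithm and offers no security.

```python
from typing import Callable, Dict, FrozenSet

# Hypothetical field names; the source does not list which fields are encrypted.
SENSITIVE_FIELDS: FrozenSet[str] = frozenset({"name", "address", "diagnosis"})

Cipher = Callable[[bytes], bytes]


def protect_record(record: Dict[str, str], encrypt: Cipher) -> Dict[str, str]:
    """Encrypt the sensitive fields on the terminal before the record is sent."""
    return {
        field: encrypt(value.encode("utf-8")).hex() if field in SENSITIVE_FIELDS else value
        for field, value in record.items()
    }


def reveal_record(stored: Dict[str, str], decrypt: Cipher) -> Dict[str, str]:
    """Decrypt the sensitive fields on the terminal after fetching from the server."""
    return {
        field: decrypt(bytes.fromhex(value)).decode("utf-8") if field in SENSITIVE_FIELDS else value
        for field, value in stored.items()
    }


def placeholder_cipher(data: bytes) -> bytes:
    """Stand-in only: XOR with a fixed byte. NOT secure, NOT the modified Blowfish."""
    return bytes(b ^ 0x5A for b in data)


# Dummy record, in line with the use of dummy names during testing.
record = {"name": "Juan Dela Cruz", "address": "Tarlac City", "visit_type": "APE"}
stored = protect_record(record, placeholder_cipher)   # what the server receives
viewed = reveal_record(stored, placeholder_cipher)    # what the authorized user sees
```

Passing the cipher in as a function keeps the flow independent of the algorithm, so the stand-in could be swapped for a real block cipher implementation without changing the save and view logic.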
The modified Blowfish reduces the size of the subkeys from the previous 4168 bytes to 2128 bytes. The key expansion still converts the 128-bit key into several subkey arrays. These subkeys are generated before any data encryption or decryption occurs. The P-array consists of 20 32-bit subkeys. The two S-boxes each consist of 256 entries of 32 bits. In the modified key expansion scheme, the total number of iterations is reduced to 266 to generate all required subkeys.
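The figures above can be checked with a short calculation. The sketch below assumes a 20-entry P-array and two 256-entry S-boxes of 32-bit words for the modified design (the combination that yields 2128 bytes), and the standard 18-entry P-array with four S-boxes for the original Blowfish. The function names are illustrative only.

```python
WORD_BYTES = 4       # each subkey is a 32-bit word
SBOX_ENTRIES = 256   # entries per S-box


def subkey_bytes(p_entries: int, sbox_count: int) -> int:
    """Total storage for the P-array plus all S-boxes, in bytes."""
    return (p_entries + sbox_count * SBOX_ENTRIES) * WORD_BYTES


def expansion_iterations(p_entries: int, sbox_count: int) -> int:
    """Each key-expansion iteration replaces two 32-bit subkeys."""
    return (p_entries + sbox_count * SBOX_ENTRIES) // 2


original_bytes = subkey_bytes(18, 4)          # original Blowfish
modified_bytes = subkey_bytes(20, 2)          # modified Blowfish
modified_iters = expansion_iterations(20, 2)  # modified key expansion

print(original_bytes, modified_bytes, modified_iters)  # prints: 4168 2128 266
```

The same formula gives 521 iterations for the original Blowfish key expansion, so the modified scheme roughly halves both the memory and the setup work.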
The subkeys are calculated using the same Blowfish procedure, but with the number of S-boxes reduced to two. 1) The P-array, followed by the two S-boxes, is initialized with constant strings that consist of predetermined hexadecimal digits of pi. 2) P1 is XORed with the first 32 bits of the key, P2 with the second 32 bits, and so on, cycling through the key bits as needed until the whole P-array (up to P20) has been XORed with key bits. 3) The Blowfish algorithm encrypts an all-zero string using the subkeys from the previous steps. 4) The output of step 3 replaces P1 and P2. 5) The output of step 3 is encrypted with the Blowfish algorithm using the revised subkeys. 6) The output of step 5 replaces P3 and P4. 7) This process continues, replacing all entries of the P-array and then the two S-boxes with the output of the continually changing Blowfish algorithm. Figure 2 shows the new encryption process of the modified Blowfish algorithm. The structure of the original Blowfish algorithm is retained, but the modified Blowfish reduces the number of rounds to 8.
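The key expansion steps can be sketched as below. Several parts are assumptions made only so the example runs: the initial tables are arbitrary placeholder constants rather than the hexadecimal digits of pi, the block encryption is passed in as a function that maps two 32-bit words to two 32-bit words (so that each iteration replaces two subkeys, as the 266-iteration count implies), and `toy_encrypt` is a stand-in mixing function, not the Blowfish cipher.

```python
from typing import Callable, List, Sequence, Tuple

MASK32 = 0xFFFFFFFF
P_ENTRIES = 20       # 32-bit subkeys in the modified P-array
SBOX_COUNT = 2       # S-boxes in the modified design
SBOX_ENTRIES = 256   # 32-bit entries per S-box

Words = Tuple[int, int]
# Hypothetical signature: encrypt two 32-bit words under the current subkeys.
BlockEncrypt = Callable[[Words, List[int], List[List[int]]], Words]


def expand_key(key: bytes, init_p: Sequence[int], init_s: Sequence[Sequence[int]],
               encrypt_block: BlockEncrypt):
    if len(key) == 0 or len(key) % 4:
        raise ValueError("key length must be a non-zero multiple of 4 bytes")
    # Step 1: start from copies of the constant tables.
    p = list(init_p)
    s = [list(box) for box in init_s]
    # Step 2: XOR every P entry with successive 32-bit key words, cycling the key.
    key_words = [int.from_bytes(key[i:i + 4], "big") for i in range(0, len(key), 4)]
    for i in range(P_ENTRIES):
        p[i] ^= key_words[i % len(key_words)]
    # Steps 3-7: encrypt an all-zero block, store the output in the next two
    # entries, then re-encrypt that output under the updated subkeys.
    block: Words = (0, 0)
    iterations = 0
    for table in [p] + s:
        for i in range(0, len(table), 2):
            block = encrypt_block(block, p, s)
            table[i], table[i + 1] = block
            iterations += 1
    return p, s, iterations


# Placeholder constants (assumption): NOT the pi-derived tables.
PLACEHOLDER_P = [(0x9E3779B9 * (i + 1)) & MASK32 for i in range(P_ENTRIES)]
PLACEHOLDER_S = [[(0x85EBCA6B * (SBOX_ENTRIES * k + i + 1)) & MASK32
                  for i in range(SBOX_ENTRIES)] for k in range(SBOX_COUNT)]


def toy_encrypt(block: Words, p: List[int], s: List[List[int]]) -> Words:
    """Stand-in mixing function so the sketch runs; NOT the Blowfish cipher."""
    left, right = block
    left = (left + p[0] + s[0][right & 0xFF]) & MASK32
    right = right ^ left ^ s[1][left & 0xFF]
    return right, left


subkeys_p, subkeys_s, iterations = expand_key(
    bytes(range(16)), PLACEHOLDER_P, PLACEHOLDER_S, toy_encrypt)  # 128-bit key
print(iterations)  # prints: 266
```

The loop performs 10 iterations for the P-array and 128 for each S-box, which matches the 266 iterations stated for the modified key expansion.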

Figure 2. Modified blowfish algorithm architecture
The difference lies in the input block. First, the input block grows to 128 bits and is split into two equal 64-bit segments, LE0 and RE0. Second, the first 64-bit segment (LE0) is XORed with the first round subkey, which is formed from two 32-bit P-array entries (P1, P11). Third, the resulting two 32-bit words are fed to the F-function. The output of the F-function is then XORed with the second segment (RE0) of the plaintext. Then LE0 and RE0 are swapped. This cycle continues up to the eighth round. After the eighth round, LE8 and RE8 are exchanged to reverse the last swap. Then RE8 is XORed with the P-array entries (P9, P19) and LE8 is XORed with the P-array entries (P10, P20). Finally, LE9 and RE9 are recombined to obtain the ciphertext. The decryption process is the reverse of the encryption process.
Figure 3 shows the construction of the new F-function in the modified Blowfish. The F-function now accepts a 64-bit input and divides it into eight 8-bit values a, b, c, d, e, f, g, h, where a is the first 8 bits, b is the second 8 bits, and so on up to the last 8 bits, h. Each 8-bit value is transformed into a 32-bit value by an S-box lookup. The first four values use the first S-box while the next four use the second S-box. The outputs of each S-box are combined by addition and XOR to obtain one 32-bit value per S-box, and the two values are concatenated to obtain the 64-bit output, as shown in Equation 1, where $\oplus$ is XOR, $\ll$ and $\gg$ are rotations by one position to the left and right, and $\parallel$ is concatenation:
$$F(LE_0) = \Big[\Big(\big(S_1(a) + (S_1(b) \ll 1)\big) \bmod 2^{32}\Big) \oplus \big(S_1(c) \gg 1\big)\Big] + S_1(d \ll 1) \bmod 2^{32} \;\parallel\; \Big[\Big(\big(S_2(e) + (S_2(f) \ll 1)\big) \bmod 2^{32}\Big) \oplus \big(S_2(g) \gg 1\big)\Big] + S_2(h \ll 1) \bmod 2^{32} \quad (1)$$
The S-boxes are derived at runtime from S-box 1 by a simple rotation by one position, applied to either the input or the output, to the left or to the right. The derivation is defined as follows:
$$S_2(x) = S_1(x) \ll 1 \quad (2)$$
$$S_3(x) = S_1(x) \gg 1 \quad (3)$$
The researcher changed the structure of the F-function, as can be seen from the equations above.

Figure 3. Modified F-function
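The round structure and the F-function can be sketched as below. This is one reading of the description, not the authors' implementation. The assumptions are: the P-array and S-box contents are arbitrary placeholders rather than key-dependent subkeys, the one-position operations are treated as rotations (32-bit for S-box outputs, 8-bit for the input byte), and the eighth byte h is used as the last input to the second S-box, since the last four bytes are said to use the second S-box.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF
ROUNDS = 8

# Placeholder tables (assumption): real subkeys come from the key expansion.
P = [(0x9E3779B9 * (i + 1)) & MASK32 for i in range(20)]
S1 = [(0x85EBCA6B * (i + 1)) & MASK32 for i in range(256)]
S2 = [(0xC2B2AE35 * (i + 1)) & MASK32 for i in range(256)]


def rotl(value: int, bits: int, width: int) -> int:
    """Rotate a width-bit value left by the given number of bits."""
    mask = (1 << width) - 1
    return ((value << bits) | (value >> (width - bits))) & mask


def rotr(value: int, bits: int, width: int) -> int:
    """Rotate a width-bit value right by the given number of bits."""
    return rotl(value, width - bits, width)


def half_f(sbox, w: int, x: int, y: int, z: int) -> int:
    """One 32-bit half of Equation 1 for a single S-box."""
    total = (sbox[w] + rotl(sbox[x], 1, 32)) & MASK32   # add, mod 2^32
    total ^= rotr(sbox[y], 1, 32)                       # XOR rotated output
    return (total + sbox[rotl(z, 1, 8)]) & MASK32       # rotated input, add


def f_function(block64: int) -> int:
    a, b, c, d, e, f, g, h = block64.to_bytes(8, "big")
    return (half_f(S1, a, b, c, d) << 32) | half_f(S2, e, f, g, h)


def round_key(i: int) -> int:
    """64-bit round subkey built from P[i] and P[i + 10], e.g. (P1, P11)."""
    return (P[i] << 32) | P[i + 10]


def encrypt_block(block128: int) -> int:
    left, right = block128 >> 64, block128 & MASK64
    for i in range(ROUNDS):
        left ^= round_key(i)
        right ^= f_function(left)
        left, right = right, left
    left, right = right, left          # reverse the last swap
    right ^= round_key(8)              # (P9, P19)
    left ^= round_key(9)               # (P10, P20)
    return (left << 64) | right


def decrypt_block(block128: int) -> int:
    left, right = block128 >> 64, block128 & MASK64
    left ^= round_key(9)
    right ^= round_key(8)
    left, right = right, left
    for i in reversed(range(ROUNDS)):
        left, right = right, left
        right ^= f_function(left)
        left ^= round_key(i)
    return (left << 64) | right


plaintext = 0x0123456789ABCDEFFEDCBA9876543210
ciphertext = encrypt_block(plaintext)
recovered = decrypt_block(ciphertext)  # equals plaintext
```

Because the F-function output is only ever XORed into the other half, decryption does not need to invert F. It applies the same steps in reverse order, which is why the round trip recovers the plaintext for any table contents.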
Sample data was encoded to test the application. Figure 4 shows a sample encoded medical profile with details such as the name, gender, birthdate, address, and others. As can be seen in the viewing module, the text is readable.
Figure 4. Sample medical profile of patient using the EMR system
The sample medical record was then viewed on the database server to check whether the data was encrypted as intended. Figure 5 shows a screenshot of the data saved on the server. As seen, specific fields are encrypted. Hence, even if a breach occurs, the data in these fields remain encrypted and are still considered safe.
The performance of the modified algorithm is measured as time in milliseconds. The average saving time was recorded for five sample medical profiles without encryption (plaintext only) and with the modified Blowfish encryption. Table 1 shows the comparison. Saving took an average of 87.8ms without encryption and 88.8ms with encryption, a slight difference of 1ms for the five records. The change in time is minimal because the data to be encoded is small.
For viewing the medical data, the time in milliseconds was also recorded without decryption (plaintext only) and with the modified Blowfish decryption algorithm. Table 2 shows the comparison. Viewing of all five medical profiles was repeated five times. The average time with plaintext only is 1322ms, while the average time with decryption using the modified algorithm is 1342ms, which is higher by 20ms. The viewing time of all records is expected to increase as the number of records to display increases, because all records are decrypted at the same time.
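To put the reported averages in relative terms, the overhead can be computed directly from the four figures above. The helper name `overhead` is illustrative, and the calculation uses only the averages stated in the text, not the per-record measurements in Tables 1 and 2.

```python
def overhead(plain_ms: float, encrypted_ms: float):
    """Return the absolute (ms) and relative (%) overhead of encryption."""
    extra = encrypted_ms - plain_ms
    return extra, 100.0 * extra / plain_ms


save_extra, save_pct = overhead(87.8, 88.8)        # saving one record
view_extra, view_pct = overhead(1322.0, 1342.0)    # viewing all records

print(f"save: +{save_extra:.1f} ms ({save_pct:.2f}%)")  # save: +1.0 ms (1.14%)
print(f"view: +{view_extra:.1f} ms ({view_pct:.2f}%)")  # view: +20.0 ms (1.51%)
```

On these averages, the modified algorithm adds roughly 1.1% to saving time and 1.5% to viewing time.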

CONCLUSION AND FUTURE WORKS
EMR has helped improve the services provided to patients by offering organization and accuracy in dealing with patient information. The gathered requirements gave enough information to build the EMR software and served as a guide. The included features answer how the system can help improve the acquisition of information from the EMR. Since most EMR systems lack encryption, the study addressed issues in data security by applying the modified Blowfish algorithm as the encryption algorithm. Finally, the EMR system was tested by encoding sample data and checking the application of the encryption mechanism by inspecting the data saved on the server. The performance of saving and viewing medical data in plaintext and with the modified Blowfish encryption was also measured as time in milliseconds. Performance results show that saving took an average of 87.8ms without encryption and 88.8ms with encryption, a difference of 1ms. The difference is minimal because the data encoded is small. Viewing all records took an average of 1342ms with the modified algorithm and 1322ms with plaintext. The viewing time is higher by 20ms because of the application of the decryption algorithm. For this study, all text fields are encrypted. For future work, the processing time of encrypting text files as supplementary attachments may be considered, as well as the encryption of non-text data.