Automated Detection of Retinal Hemorrhage based on Supervised Classifiers (KA Sreeja et al)

Received May 28, 2019; Revised Feb 28, 2020; Accepted Mar 6, 2020

A supervised machine learning approach to retinal hemorrhage detection and classification is presented. Efficient detection of retinal hemorrhage is important for developing an automated diabetic retinopathy screening system. A splat, which is a high-level entity in image segmentation, is used to mark out hemorrhage in the pre-processed fundus image. Color images of the retina are partitioned into different segments (splats) covering the whole image. Splat-level and GLCM features are extracted from the splats, and three classifiers are trained and tested using the relevant features. The ground truth was established with the help of a retinal expert, and validation was carried out on both a public dataset and clinical images. All three classifiers achieved more than 96% sensitivity and accuracy.


INTRODUCTION
The World Health Organisation estimated that by 2030 there will be nearly 366 million people with diabetes mellitus (DM) [1]. Diabetic retinopathy (DR) is a microvascular complication of DM that is responsible for a major share of cases of blindness in the world. Severe complications such as microaneurysms, exudates, occlusion and hemorrhages are together known as DR. DR becomes more severe when it goes undetected for a long period of time. Retinal hemorrhages and other symptoms are usually diagnosed with an ophthalmoscope or a fundus camera, both of which can examine the inside of the eye. To reduce diagnosis time and human error and to increase accuracy, several algorithms have been developed for the early detection of DR, and these rely on machine learning techniques. An expert, usually an ophthalmologist or a retinal specialist, provides the ground truth to train the system. Features are extracted from the pre-processed fundus image and applied to a supervised classifier, which is trained with the relevant features obtained by feature subset selection. This paper presents the classification of hemorrhage and non-hemorrhage fundus images using three different classifiers. The techniques used to develop the algorithm are based on recent research.
Compared with large hemorrhages, small hemorrhages are seen to be irregular in shape. Many systems have been developed to find these abnormalities. Zeng et al. [2] suggested DR detection based on a Convolutional Neural Network (CNN) that uses binocular fundus images and their correlation. The model has a Siamese-like architecture that predicts the possibility of DR for both eyes by correlating the pathological as well as the physiological condition of the eyes. In our work, one of the classifiers is based on a neural network (NN). D. Kumar et al. [3] presented a radiomics-driven Computer Aided Diagnosis (CAD) method. To overcome limitations of current CAD approaches in areas such as decision making, a CLass-Enhanced Attentive Response Discovery Radiomics method (CLEAR-DR) is proposed to aid clinical diagnosis of DR. Another important symptom of diabetic retinopathy is exudates, which are similar to hemorrhage pixels. Wisaeng and Sa-Ngiamvibool [4] presented early detection of exudates using the morphology mean shift algorithm (MMSA). After pre-processing, the image undergoes a coarse segmentation using mean shift, and classification is done using the mathematical morphology algorithm (MMA). Detection of bright and dark lesions, which can be hemorrhages or exudates, using a combination of matched filter response (MFR) and Laplacian of Gaussian (LoG) response [5] produced an accuracy of 96.10% to 96.99% in hemorrhage detection on various publicly available databases. Multi-resolution analysis (MRA) is given importance in the work of Lahmiri [6]. The statistical features obtained after MRA are fed to a support vector machine to grade retinal hemorrhage. Separating hemorrhage pixels from the bright optic disc is a persistent difficulty, and many methods already exist to remove the optic disc from the fundus image. Silva et al. [7] presented five optic disc detection methods combined by a committee of algorithms with weighted voting, evaluated on six public benchmark databases with a total of 1566 images. Although the optic disc is not removed in our work, this method is useful when a pixel-based approach is considered. One such method of optic disc removal, which involves mathematical morphology, is used in exudate detection [8]. After the morphological operation, the hard exudates are extracted using adaptive fuzzy logic.
The purpose of this research is to develop a supervised classification model using three different classifiers and to compare their outputs in terms of sensitivity, specificity and accuracy. Retinal hemorrhages are demarcated with the help of an ophthalmologist, who uses a high-level representation entity known as a splat [9]. Splats are collections of pixels that have similar fundamental features. A two-step feature selection process is carried out to remove redundant features from the splats, and the remaining features are applied to a supervised classifier to predict the possibility of hemorrhage splats in the whole image. The hemorrhage is finally detected and shown as bright spots on the dark opponency image. All three classifiers are tested and their responses are tabulated. Section 2 describes the research method, including feature extraction and classification. Section 3 gives the results and discussion, and Section 4 concludes the work.

RESEARCH METHOD
Pixels that have a similar spatial location and share the same structural features, such as color and intensity, are partitioned into non-overlapping splats that are spread over the entire image [10]. The splat-based method uses several re-sampling strategies. Because multiple sampling is performed, the background region consists of a small number of larger splats, whereas the foreground region consists of a large number of smaller splats. In a fundus image with hemorrhage, the total number of hemorrhage pixels is comparatively small when the entire image is considered [11]. Therefore, a splat-based method is more likely to give better diversity in the training samples.
Splats are generated using the watershed segmentation algorithm [10] in such a way that the boundaries between hemorrhage pixels and the retinal background are preserved. To create meaningful splats, a scale-specific over-segmentation is performed in two steps. First, the gradient magnitude of the contrast-enhanced dark-bright opponent image is computed at different scales, because hemorrhages vary in appearance. These values are then aggregated, and the maximum of the gradient magnitude over the scale of interest (SOI) is used to perform watershed segmentation [12].
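The two steps above can be written compactly as follows. This is only a sketch of the usual multi-scale formulation, not necessarily the exact expression used in this work: the symbols I_x and I_y (derivatives of the dark-bright opponent image I after Gaussian smoothing at scale s) and G (the surface passed to the watershed algorithm) are notation assumed here for illustration.

```latex
\lvert \nabla I(x, y; s) \rvert = \sqrt{ I_x(x, y; s)^2 + I_y(x, y; s)^2 },
\qquad
G(x, y) = \max_{s \in \mathrm{SOI}} \lvert \nabla I(x, y; s) \rvert
```

Taking the maximum over the scales in the SOI keeps the strongest edge response at each pixel, so that boundaries of hemorrhages of different sizes all appear as ridges on the watershed surface.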
The gradient magnitude is computed at each scale. Because the topographic surface is important in the watershed algorithm [11] for obtaining genuine splats, the maximum of the gradient magnitude is taken over a certain scale of interest (SOI). Splats created at different scales with the same watershed algorithm are shown in Fig. 1.

Figure 1. Splats created using different scales.
The scale band used for Fig. 1(a) is the most accurate when compared with the scales used in Fig. 1(b) and Fig. 1(c), as it falls within the desired SOI for hemorrhage detection. In Fig. 1(a), the retinal background is represented by larger splats and blood regions are represented by smaller splats. The scale in Fig. 1(c) is suitable for removing larger areas such as the optic disc. The scale used in Fig. 1(b) is a fine scale that is also outside the desired SOI, although this scale range can be used for the detection of microaneurysms. The number of splats is kept under a certain threshold to achieve speed without compromising accuracy much.

Splat Based Labelling by ophthalmologist
Because supervised algorithms require labelled samples, the splats obtained from the watershed algorithm are labelled by experts using the MATLAB software. Small hemorrhages, shown as purple dots, are each indicated by a single point in Fig. 2.

Feature Extraction from Splats
After reference labels are assigned to the splats, a classifier can be trained to detect the target objects. Altogether, 352 potentially relevant features are taken to train the classifiers. They are:
1) Color: The color of each splat is extracted in RGB color space and in the dark-bright (db), red-green (rg) and blue-yellow (by) opponency images [13], which gives six colour components.
2) Difference of Gaussian (DoG) filter: DoG kernels are applied at five different smoothing scales with one baseline scale in order to take advantage of Gaussian scale space [14], [15].
3) Responses from the Gaussian filter bank [13]: A Gaussian filter bank, which includes a first-order derivative at two orientations and a second-order derivative at three orientations, is applied to the green channel.
4) Responses from the Schmid filter bank: This bank has 13 rotationally invariant kernels and is applied to the dark-bright opponent image.
5) Responses from local texture filter banks: This bank contains a local entropy filter, a local range filter and a local standard deviation filter, which compute the entropy, intensity range and standard deviation of a pixel in a given region [16].
The above features are aggregated to obtain a meaningful response image that has low inter-splat similarity and high intra-splat similarity [13]-[19]. The features mentioned so far are pixel-based responses. In addition to these, we take splat-wise features according to Gray-Level Co-occurrence Matrix (GLCM) statistics [16]-[22]. These are splat area, extent, texture, solidity and orientation. The complete feature set is shown in Table 1.
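The color features and their aggregation over splats can be illustrated with a short sketch. The opponency formulas used here (db as the channel mean, rg as R minus G, by as R plus G minus 2B) are a common convention and an assumption, since the text does not state them; averaging within each splat is likewise an assumed aggregation rule, and `opponency` and `splat_means` are hypothetical helper names.

```python
from collections import defaultdict


def opponency(r, g, b):
    """Return (dark-bright, red-green, blue-yellow) for one RGB pixel.

    Assumed definitions (not given in the text):
    db = (R + G + B) / 3, rg = R - G, by = R + G - 2B.
    """
    db = (r + g + b) / 3.0
    rg = r - g
    by = r + g - 2 * b
    return db, rg, by


def splat_means(response, labels):
    """Average a pixel-wise response image over each splat.

    response and labels are equally sized 2-D lists; labels holds the
    integer splat id of every pixel.
    """
    total = defaultdict(float)
    count = defaultdict(int)
    for response_row, label_row in zip(response, labels):
        for value, splat_id in zip(response_row, label_row):
            total[splat_id] += value
            count[splat_id] += 1
    return {splat_id: total[splat_id] / count[splat_id] for splat_id in total}


# Toy 2x3 image with two splats (ids 0 and 1); values are illustrative only.
rgb = [[(90, 60, 30), (90, 60, 30), (30, 30, 30)],
       [(90, 60, 30), (30, 30, 30), (30, 30, 30)]]
labels = [[0, 0, 1],
          [0, 1, 1]]
db_image = [[opponency(*pixel)[0] for pixel in row] for row in rgb]
db_per_splat = splat_means(db_image, labels)
```

The same `splat_means` call could be applied to any other pixel-wise response image (for example a filter-bank output) to obtain one value per splat.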

Preliminary Feature Selection and classification
A two-step feature selection method is used here so as to keep only the relevant features and discard the irrelevant and redundant ones [23]. The preliminary feature selection is done using a filter approach in order to eliminate the features that are immaterial in discriminating hemorrhage and non-hemorrhage splats. A quadratic discriminant analysis (QDA) [24] is performed, and the variation of the misclassification error (MCE) [25] with the features is inspected. The preliminary features are chosen when the smallest MCE is reached.

Table 1. Complete feature set
Pixel-based responses:
Color: RGB and dark-bright (db), red-green (rg), blue-yellow (by) opponency [13]
DoG filter bank: RGB and db, rg, by [15]
Gaussian filter bank: first-order and second-order derivatives in the horizontal and vertical planes
Schmid filter bank: 13 kernels [17]
Local texture filter bank: local range, standard deviation and entropy of a pixel in a given neighborhood [16]
Splat-wise (GLCM) features:
Area: number of pixels in the splat
Extent: proportion of pixels in the bounding box that are also in the splat
Orientation: angle between the horizontal and the major axis of the ellipse having the same second moments as the splat
Solidity: proportion of pixels in the convex hull that are also in the splat
Texture: statistics of the GLCM (contrast, correlation, energy and homogeneity) [16]
Coarseness, directionality and contrast of the db opponency image associated with each splat [20]
Ratio of maximum and minimum edge strength between neighboring splats
Closest distance of splat centroids to the boundaries of the FOV
Vessel probability map averaged within splats using a vessel segmentation algorithm [21]
Distance of the automatically detected optic disc and fovea to splat centroids [22]

After preliminary selection, a wrapper approach is performed in order to obtain an optimal combination of relevant features with minimum redundancy. The distinguishing property of the wrapper approach is that it assesses different combinations of feature subsets customized for a particular classification algorithm, at the cost of higher computation time. The combinations are evaluated using a kNN classifier. All the features retained by the filter step are then passed to sequential forward feature selection (SFS).
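The wrapper step can be sketched as below. This is a minimal illustration under stated assumptions rather than the procedure used in this work: it covers only sequential forward selection, scores each candidate subset by the leave-one-out error of a majority-vote kNN classifier, and stops when no added feature lowers that error. The function names, the stopping rule, the value k=3 and the toy data are all hypothetical; the QDA filter step is not shown.

```python
import math


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest training samples (Euclidean distance)."""
    nearest = sorted(zip(train_x, train_y),
                     key=lambda pair: math.dist(pair[0], query))[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > len(nearest) else 0


def loo_error(samples, labels, features, k):
    """Leave-one-out misclassification error using only the given feature indices."""
    projected = [[row[f] for f in features] for row in samples]
    errors = 0
    for i, (x, y) in enumerate(zip(projected, labels)):
        rest_x = projected[:i] + projected[i + 1:]
        rest_y = labels[:i] + labels[i + 1:]
        errors += knn_predict(rest_x, rest_y, x, k) != y
    return errors / len(samples)


def sequential_forward_selection(samples, labels, k=3):
    """Greedily add the feature that lowers the error most; stop when none does."""
    remaining = list(range(len(samples[0])))
    selected, best_error = [], 1.0
    while remaining:
        error, feature = min((loo_error(samples, labels, selected + [f], k), f)
                             for f in remaining)
        if error >= best_error:
            break
        selected.append(feature)
        remaining.remove(feature)
        best_error = error
    return selected, best_error


# Toy data: feature 1 separates the classes, feature 0 is noise, feature 2 is constant.
samples = [[0.3, 0.10, 5.0], [0.9, 0.20, 5.0], [0.1, 0.15, 5.0], [0.7, 0.05, 5.0],
           [0.2, 0.90, 5.0], [0.8, 0.95, 5.0], [0.4, 0.85, 5.0], [0.6, 1.00, 5.0]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = hemorrhage splat, 0 = non-hemorrhage splat
selected, error = sequential_forward_selection(samples, labels, k=3)
```

On this toy set only the discriminative feature is kept, which shows how the wrapper discards features that add no information for the chosen classifier.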

Classification using different classifiers
After feature selection, three distinct classifiers are trained with the selected set of features and the reference label instances.

A. kNN Classification
The kNN algorithm assigns soft class labels. The two output classes are hemorrhage splat and non-hemorrhage splat. The classifier decides the class of a particular splat based on the Euclidean distance between feature vectors in an optimized feature space. As the value of k is increased, the computation time increases and the splats are identified more accurately. However, because not all of the k nearest neighbors are necessarily close to the splat being classified, an optimum value of k is chosen instead of an arbitrary value.
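A soft kNN label can be illustrated as follows. The sketch assumes that the soft label is the fraction of the k nearest neighbors that are hemorrhage splats, which is one common definition and is not stated explicitly in the text; `knn_soft_label` and the toy feature vectors are hypothetical.

```python
import math


def knn_soft_label(train_x, train_y, query, k):
    """Return the fraction of the k nearest neighbours labelled hemorrhage (1)."""
    by_distance = sorted(zip(train_x, train_y),
                         key=lambda pair: math.dist(pair[0], query))
    nearest = by_distance[:k]
    return sum(label for _, label in nearest) / len(nearest)


# Toy two-dimensional feature space: three non-hemorrhage and three hemorrhage splats.
train_x = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.2], [1.0, 1.0], [0.9, 1.0], [1.0, 0.8]]
train_y = [0, 0, 0, 1, 1, 1]
soft_k3 = knn_soft_label(train_x, train_y, [0.8, 0.9], k=3)
soft_k5 = knn_soft_label(train_x, train_y, [0.8, 0.9], k=5)
```

With k=3 all neighbors are hemorrhage splats, whereas with k=5 two distant non-hemorrhage splats are included and dilute the score, which illustrates why k has to be tuned rather than chosen arbitrarily.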

B. SVM Classification
Support Vector Machines use the concept of hyperplanes to distinguish between classes. The features are transformed to the required form using a kernel function. The features were optimized, and only the relevant features were used to train the classifier. A least squares SVM classifier with a Radial Basis Function (RBF) kernel is used here [26]. The RBF kernel function is defined as

$$K(x, x') = \exp\left(-\frac{\lVert x - x' \rVert^{2}}{2\sigma^{2}}\right)$$

where x and x' are two feature vectors, σ is a free parameter and ‖x − x'‖² is the squared Euclidean distance between them.
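As a small numerical illustration, the RBF kernel in its standard form, K(x, x') = exp(-‖x − x'‖² / (2σ²)), can be evaluated as below. The function name `rbf_kernel` and the default sigma=1.0 are assumptions made for the example; the value of the free parameter used in this work is not given.

```python
import math


def rbf_kernel(x, x_prime, sigma=1.0):
    """K(x, x') = exp(-||x - x'||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, x_prime))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


same = rbf_kernel([1.0, 2.0], [1.0, 2.0])           # identical vectors
apart = rbf_kernel([0.0, 0.0], [1.0, 1.0])          # squared distance 2, sigma 1
wider = rbf_kernel([0.0, 0.0], [1.0, 1.0], sigma=2.0)
```

Identical vectors give a kernel value of 1, and the value decays towards 0 as the vectors move apart; a larger sigma makes the decay slower.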

C. ANN Classification
ANNs rely on the concept of artificial neurons, which is biologically inspired by the function of the brain. Each neuron, or node, is held within a layer. The neural network consists of input layers, hidden layers and a transfer (threshold) function. All the nodes are interconnected to form a network. The nodes are trained using the backpropagation algorithm, which reduces their error over several iterations. The input to the ANN classifier is the relevant feature set, which is transformed using the weights of the hidden layer. Finally, the output class is determined using a sigmoid transfer function.
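The forward pass described above can be sketched as follows. The network shape (one hidden layer, a single output node), the use of a sigmoid in the hidden layer, the weights and the function names are all assumptions for illustration; backpropagation training is omitted.

```python
import math


def sigmoid(z):
    """Logistic transfer function mapping any real value into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(features, hidden_weights, hidden_bias, output_weights, output_bias):
    """One hidden layer with sigmoid activations and a single sigmoid output."""
    hidden = [sigmoid(sum(w * x for w, x in zip(row, features)) + b)
              for row, b in zip(hidden_weights, hidden_bias)]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


# Hypothetical weights for a network with 2 inputs and 2 hidden nodes.
hidden_weights = [[2.0, -1.0], [-1.5, 0.5]]
hidden_bias = [0.0, 0.0]
output_weights = [3.0, -3.0]
score_a = forward([1.0, 0.0], hidden_weights, hidden_bias, output_weights, 0.0)
score_b = forward([0.0, 1.0], hidden_weights, hidden_bias, output_weights, 0.0)
neutral = forward([0.0, 0.0], hidden_weights, hidden_bias, output_weights, 0.0)
```

The output can be read as a score between 0 and 1, and a threshold on that score (for example 0.5) would give the hemorrhage or non-hemorrhage class.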

Data Collection and Pre-processing
Images were acquired from two sources: the publicly available database DIARETDB1 (http://www.it.lut.fi/project/imageret/diaretdb1/index.html) and a second set of clinical images from Dr. Bhejan Singh's eye hospital, used solely for educational and research purposes. The clinical images were captured using a "Remidio Non-Mydriatic Fundus On Phone (FOP-NM10)" camera with an FOV of 40°, a working distance of 33 mm and an ISO range from ISO 100 to 400. A total of 1500 images were taken: 1050 for training, 225 for testing and 225 for validation. The reference standard observations were made by an expert ophthalmologist using the splat-based interpretation. Overall, 1200 (950 from the training set, 150 from testing) images were marked by the expert from the total of 1500. Pre-processing is done to compensate for the colour variation throughout the dataset and to equalize the intensity of the image. Histogram equalization is done using Contrast Limited Adaptive Histogram Equalization (CLAHE) [27]. Each image is also normalized according to its prevailing pixel value in the three colour channels: the pixel values that occur most frequently are shifted to the origin of the RGB colour space.
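One possible reading of the normalization step is sketched below: the most frequent value (the mode) of each colour channel is subtracted from that channel, so that the prevailing colour of the image moves to the origin. This interpretation, the helper name `shift_mode_to_origin` and the absence of clipping are assumptions; CLAHE itself is not shown.

```python
from collections import Counter


def shift_mode_to_origin(pixels):
    """Subtract the most frequent value of each channel from that channel.

    pixels is a flat list of (R, G, B) tuples; the result may contain
    negative values because no clipping is applied.
    """
    modes = [Counter(channel).most_common(1)[0][0] for channel in zip(*pixels)]
    shifted = [tuple(value - mode for value, mode in zip(pixel, modes))
               for pixel in pixels]
    return shifted, modes


# Toy image of four pixels; the prevailing colour is (120, 80, 40).
pixels = [(120, 80, 40), (120, 80, 40), (130, 80, 45), (120, 90, 40)]
shifted, modes = shift_mode_to_origin(pixels)
```

After the shift, two images with different overall tints have their dominant colour at the same point, which is the kind of colour variation the pre-processing is meant to compensate for.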

Feature Subset selection
From the 1050 training images, 10500 splats were created, among which 300 are hemorrhage splats. This amounts to a very low hemorrhage splat density (about 2.9%). Therefore, images with at least 6 splats are taken for training, where 6 is arbitrarily chosen. After sequential forward feature selection (SFS), only the relevant features were retained, whereas the insignificant and redundant ones were removed from the feature set.
The filter approach reduced the 352 features to 50, and from this set the wrapper approach yielded the final 19 features. The details of the final selected features are given in Table 2.

Classification of splats using different classifiers
Each splat is represented as a 19-dimensional feature vector, and each classifier is trained on these features.

a. kNN Classifier
Different values of k between 15 and 160 were tested, for both feature selection and classification. After repeated iterations, the value of k was fixed at 105, which did not compromise computation time or prediction accuracy. The algorithm was validated using images from the DIARETDB1 dataset, and the ROC curve was plotted with an AUC of 0.94.

b. SVM Classifier
The features were optimized, and only the 19 relevant features were used to train the classifier. A least squares SVM classifier with a Radial Basis Function (RBF) kernel is used here. The confusion matrix for hemorrhage splat detection is shown in Table 3.

c. ANN Classifier
The input to the ANN [28] classifier is the set of 19 features, which is transformed using the weights of the hidden layer. Finally, the output class is determined using a sigmoid transfer function. The network settings used for detection of the hemorrhage class from the splats are given in Table 4. The hemorrhage splats were successfully detected using the three classifiers, and the result for a fundus image along with its receiver operating characteristic is shown in Fig. 3. Splat-based ROC curves for the fundus image using the three classifiers are shown in Fig. 4. The AUC was 0.96 for the ANN classifier, 0.94 for the SVM classifier and 0.93 for the kNN classifier. The best operating points on the curves for the different classifiers are shown in Table 5.
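For reference, the metrics used in this comparison can be computed as sketched below. The definitions of sensitivity, specificity and accuracy are standard; the AUC is computed here with the pairwise ranking formulation, which is one of several equivalent ways to obtain it. The counts and scores are made-up illustrative values, not results from this work.

```python
def confusion_metrics(tp, fn, fp, tn):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)                 # hemorrhage splats correctly found
    specificity = tn / (tn + fp)                 # non-hemorrhage splats correctly rejected
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return sensitivity, specificity, accuracy


def auc(scores, labels):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Illustrative values only.
sensitivity, specificity, accuracy = confusion_metrics(tp=9, fn=1, fp=2, tn=8)
area = auc(scores=[0.9, 0.8, 0.4, 0.3], labels=[1, 0, 1, 0])
```

Each operating point on an ROC curve corresponds to one threshold on the classifier score, and therefore to one (sensitivity, specificity) pair computed as above.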
In this manuscript, a set of features is extracted from each splat to define its characteristic properties. In a selected feature space, these splats are taken as supervised classification samples. Splat-based image annotation makes it easier for ophthalmologists to model unevenly shaped abnormalities in images. Further, it is resistant to intensity bias and noise. Analyzing how the performance of a DR detection system is related to the detection of rare large hemorrhages is interesting and challenging. It is found that unweighted performance metrics such as AUC, sensitivity and specificity will not be affected if the present hemorrhage detection system is integrated at a suitable threshold level. The reason is that the binary reference labels usually indicate only the presence or absence of DR. In this approach, color images of the retina are partitioned into different non-overlapping segments (splats). A set of features is extracted from each splat, which in turn helps to detect retinal hemorrhage. The corresponding ROC curve for the input image is given in Fig. 4. Superior results were obtained using the ANN classifier when compared with the SVM and kNN classifiers. All the classifiers detected retinal hemorrhages with very high accuracy. Because the different splats in an image span a large feature space, neural network classifiers can outperform other classifiers, which was one of the reasons for considering the ANN classifier, although SVM and kNN classifiers are also known to perform well. Compared with the work of [29], which used kNN classification and reported a sensitivity of 0.82, the present method achieved a sensitivity of 0.96 using the ANN classifier. Compared with the work presented in [30], where fuzzy logic is used for classification, the present method also gave superior results, with a sensitivity of 0.96 against 0.86.

CONCLUSION
A splat-based feature classification method is presented for the detection of retinal hemorrhage. The proposed classification strategy can model different lesions with different texture, size and appearance. The algorithm is validated on the publicly available database DIARETDB1 and on clinical images captured using a "Remidio Non-Mydriatic Fundus on Phone (FOP-NM10)" camera. Comparison with other methods in the literature suggests that the proposed method provides superior results when the neural network classifier is used. The detection strategy can therefore be incorporated into a comprehensive DR assistance system for ophthalmologists.