COVID19 detection from chest X-ray images using transfer learning | Scientific Reports – Nature.com
May 22, 2024
In this section, the proposed framework is explained. First, the chest X-ray dataset used in this work is described. Then, the developed framework, which consists of a pre-processing phase and a classification phase using CNN models based on transfer learning, is illustrated. Two different approaches have been used to train the pre-trained CNN models: the first uses whole chest X-ray images, while the second uses lung-segmented images.
In this research, the data obtained from the COVID-19 Radiography Database has been used to apply the proposed framework. The database contains thousands of publicly available benchmark X-ray images and corresponding lung masks. The X-ray images are provided in Portable Network Graphics (PNG) format with a resolution of 299 × 299 pixels. The database includes 10,192 Normal cases, 3616 positive COVID-19 cases, 1345 Viral Pneumonia cases, and 6012 Lung Opacity images, as shown in Table 1. This database was developed by a team from Qatar University and the University of Dhaka, Bangladesh, together with collaborators from Malaysia and Pakistan as well as collaborating medical doctors [26]. Figure 1 illustrates samples from the different classes in the COVID-19 Radiography Database.
Samples from COVID-19 radiography chest database representing different classes.
The purpose of the pre-processing phase is to prepare the X-ray images for classification using CNN pre-trained models. In this phase, different pre-processing steps are applied to improve the performance of the classification. The pre-processing steps can be summarized as follows:
Enhancing the images is a significant step toward correct classification: it increases image contrast in order to improve classification performance. Different techniques can be applied to enhance the images. In this research, the following techniques have been applied to the original X-ray images before introducing them to the classification models:
Histogram Equalization (HE): The purpose of histogram equalization is to spread the gray levels across the image. It modifies the brightness and contrast of the images to improve image quality [27]. The intensity of the original X-ray images has been enhanced using HE.
Contrast Limited Adaptive Histogram Equalization (CLAHE): CLAHE originated from Global Histogram Equalization (GHE). It divides the image into non-overlapping blocks and then equalizes the histogram of each block, with contrast amplification limited by a pre-specified clip value [28]. In this research, CLAHE has been used to enhance the contrast of the original X-ray images.
Image Complement: The complement, or inverse, of an X-ray image transforms dark regions to light and light regions to dark. As this is a standard process, similar to one used by radiologists, it may help a deep learning model improve classification performance. The complement of a binary image is obtained by changing zeros to ones and ones to zeros, whereas for an 8-bit grayscale image each pixel value is subtracted from 255.
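Two of the enhancement steps above, HE and image complement, can be sketched in a few lines of NumPy. This is an illustrative sketch for 8-bit grayscale images; CLAHE is usually applied through a library implementation such as OpenCV's cv2.createCLAHE rather than written by hand, so it is omitted here:

```python
import numpy as np

def equalize_hist(img):
    """Global histogram equalization for an 8-bit grayscale image."""
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # first non-zero CDF value
    # Lookup table that spreads the occupied gray levels over the full 0-255 range.
    lut = np.clip(np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255),
                  0, 255).astype(np.uint8)
    return lut[img]

def complement(img):
    """Invert an 8-bit grayscale image: dark becomes light and vice versa."""
    return 255 - img
```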
Figure 2 shows an original X-ray image and its enhanced versions after applying HE, CLAHE and image complement to the original image, with the corresponding histogram plots for each version.
An X-ray image and its enhanced versions after applying HE, CLAHE and complement to the original image and the corresponding histogram plots.
In the segmentation step, the region of interest (ROI), which is the lung region in our case, is cropped from the associated image. In this research, the ground-truth lung masks provided by the database have been used. The authors of the database applied a modified U-Net model to the X-ray images to obtain the lung masks associated with the full X-ray images. In this research, each original image has been multiplied by its associated lung mask to obtain the segmented lungs. The same multiplication between the different enhanced image versions and the associated masks has been applied to obtain segmented dataset versions with the different enhancements. All these versions are introduced to the CNN models as segmented versions of the data. Figure 3 shows the segmented images of the original image and of the different enhanced images for one of the COVID-19 samples.
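The multiplication between image and mask described above is an element-wise product. A minimal sketch, assuming 8-bit grayscale images and binary masks stored as 0/255 PNGs as in the database:

```python
import numpy as np

def segment_lungs(image, mask):
    """Zero out everything outside the lung mask via element-wise multiplication."""
    binary = (mask > 0).astype(image.dtype)  # 0/255 mask -> 0/1
    return image * binary
```

The same call is repeated with each enhanced version of the image to produce the corresponding enhanced segmented datasets.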
Original X-ray image, its enhanced versions, and the segmented lung region of each version.
Resizing the images is an essential process to satisfy the CNN requirement of equally sized input images. In this research, the X-ray images have been resized to fit the input size of the used pre-trained CNN models, VGG19 and EfficientNetB0. Therefore, all image versions, whether full or segmented, were resized to the CNN input image size of 224 × 224 pixels. It was also found that resizing to 112 × 112 pixels expedited the training process without affecting the performance metrics.
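In practice, resizing is done with a library resampler (e.g. OpenCV or Pillow, typically with bilinear interpolation). As an illustration of the idea, a nearest-neighbour resize can be sketched as:

```python
import numpy as np

def resize_nearest(img, out_h, out_w):
    """Nearest-neighbour resize of a 2-D grayscale image."""
    in_h, in_w = img.shape
    rows = np.arange(out_h) * in_h // out_h  # source row for each output row
    cols = np.arange(out_w) * in_w // out_w  # source column for each output column
    return img[rows[:, None], cols]
```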
In this research, different versions of either full or segmented chest X-ray images have been introduced to CNN models to train the classifiers. Different experiments have been carried out on the original and segmented lung X-ray images, both with their different enhanced versions. The classification has been done using the VGG19 [14] and EfficientNetB0 [16] pre-trained CNN models. After calculating the different performance metrics, the best model has been selected as the adopted model. The next subsections give a brief description of the used pre-trained models.
VGG19 is a variant of the VGG CNN model, which was created by the Visual Geometry Group (VGG) at Oxford University. VGG19 was one of the winners of the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2014. The input image size for VGG19 is 224 × 224. VGG19 contains 16 convolution layers, 5 max-pooling layers and 3 fully connected layers. The convolution layers use 3 × 3 filters with a stride of 1 pixel and padding of 1 pixel. The max-pooling layers have a size of 2 × 2 and a stride of 2. The rectified linear unit (ReLU) activation function is utilized in all hidden layers. The first 2 fully connected layers have 4096 channels each, followed by a last layer of 1000 channels representing the 1000 ImageNet classes with a soft-max activation function [15].
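A transfer-learning setup of the kind described above can be sketched with Keras. The frozen backbone, global-average-pooling head and 4-class softmax below are illustrative assumptions, not necessarily the authors' exact configuration:

```python
import tensorflow as tf

def build_classifier(num_classes=4, weights="imagenet"):
    # VGG19 backbone pre-trained on ImageNet, without its 1000-way head.
    base = tf.keras.applications.VGG19(
        weights=weights, include_top=False, input_shape=(224, 224, 3))
    base.trainable = False  # freeze convolutional features; train only the new head
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```

Swapping tf.keras.applications.VGG19 for tf.keras.applications.EfficientNetB0 gives the second model family used in the experiments.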
The Google Research group designed a family of models, called EfficientNets, using a scaling method, and achieved better efficiency and accuracy than previous ConvNets. EfficientNet is based on scaling CNNs to reach better performance by balancing network width, depth, and resolution. The focus is therefore a scaling method that uniformly scales the 3 dimensions with a simple, highly effective compound coefficient. This can be considered an optimization problem: finding the best coefficients for depth, width, and resolution that maximize the accuracy of the network given the constraints of the available resources. The primary building block of the EfficientNet models is MBConv. The network's dimension equation was used to obtain the family of neural networks EfficientNet-B0 to B7 [16]. In this research, EfficientNetB0 was used for the classification of the chest X-ray images. Figure 4 sums up the framework of the methodology adopted in this research.
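The compound scaling referred to above can be made concrete. In the original EfficientNet paper, depth, width and resolution are scaled as α^φ, β^φ and γ^φ for a user-chosen compound coefficient φ, with the constants grid-searched subject to α·β²·γ² ≈ 2 (so FLOPs roughly double per unit of φ); α = 1.2, β = 1.1, γ = 1.15 are the values reported by Tan and Le (2019):

```python
# EfficientNet compound scaling: scale depth, width and resolution jointly.
ALPHA, BETA, GAMMA = 1.2, 1.1, 1.15  # constants reported in the EfficientNet paper

def compound_scale(phi):
    """Return (depth, width, resolution) multipliers for compound coefficient phi."""
    return ALPHA ** phi, BETA ** phi, GAMMA ** phi
```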
The framework of the used methodology for Chest X-ray images classification.
This article does not contain any studies with human participants or animals performed by the author.