medical dataset
Recently Published Documents


TOTAL DOCUMENTS

78
(FIVE YEARS 39)

H-INDEX

7
(FIVE YEARS 1)

2021 ◽  
Vol 15 (1) ◽  
pp. 235-248
Author(s):  
Mayank R. Kapadia ◽  
Chirag N. Paunwala

Introduction: The Content-Based Image Retrieval (CBIR) system is an innovative technology for retrieving images from various media types. One of its applications is Content-Based Medical Image Retrieval (CBMIR). The image retrieval system retrieves the most similar images from historical cases; such systems can only support the physician's decision to diagnose a disease. Extracting useful features from the query image to link similar types of images is the major challenge in the CBIR domain. The Convolutional Neural Network (CNN) can overcome the drawbacks of traditional algorithms, which depend on low-level feature extraction techniques. Objective: The objective of the study is to develop a CNN model with a minimum number of convolution layers while achieving the maximum possible accuracy for the CBMIR system. A minimum number of convolution layers reduces the number of mathematical operations and the training time of the model. It also reduces the number of trainable parameters, such as weights and biases, and thus the memory required to store the model. This work mainly focuses on developing an optimized CNN model for the CBMIR system. Such systems can only support the physician's decision to diagnose a disease from the images and retrieve the relevant cases to help the doctor decide the precise treatment. Methods: A deep learning-based model is proposed in this paper. Experiments were carried out with different numbers of convolution layers and various optimizers to obtain the maximum accuracy with a minimum number of layers. Accordingly, a ten-layer CNN model was developed from scratch and used to derive the features of the training and testing images and to classify the test image. Once the image class is identified, the most relevant images are determined by the Euclidean distance between the query features and the database features of the identified class, and the closest images from that class are displayed.
The general dataset CIFAR10, which has 60,000 images in 10 classes, and the medical dataset IRMA, which has 2508 images in 9 classes, were used to analyze the proposed method. The proposed model was also applied to a medical x-ray image dataset of chest diseases and compared with other pre-trained models. Results: Accuracy and average precision rate are the measurement parameters used to compare the proposed model with different machine learning techniques. The accuracy of the proposed model on the CIFAR10 dataset is 93.9%, better than state-of-the-art methods. After this success on the general dataset, the model was also tested on the medical datasets. For the x-ray images of the IRMA dataset, the accuracy is 86.53%, better than the results of different pre-trained models. The model was also tested on another x-ray dataset used to identify chest-related diseases; the average precision rate on this dataset is 97.25%. The proposed model also addresses the major challenge of the semantic gap, which is 2.75% for the chest disease dataset and 13.47% for the IRMA dataset. Moreover, only ten convolution layers are used in the proposed model, far fewer than in other pre-trained models. Conclusion: The proposed technique shows remarkable improvement in performance metrics over CNN-based state-of-the-art methods, and a significant improvement over different pre-trained models on the two medical x-ray image datasets.
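The class-restricted retrieval step described in the abstract can be sketched as follows. The function name, toy feature vectors, and pure-Python distance computation are illustrative assumptions; in the paper, the features come from the ten-layer CNN rather than hand-made lists.

```python
import math

def retrieve_similar(query_feat, db_feats, db_labels, predicted_class, top_k=5):
    """Rank database images of the CNN-predicted class by Euclidean
    distance to the query features and return the top_k indices."""
    # Consider only feature vectors belonging to the predicted class.
    candidates = [i for i, lab in enumerate(db_labels) if lab == predicted_class]

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    # Smallest distance first: the most relevant images of that class.
    ranked = sorted(candidates, key=lambda i: dist(query_feat, db_feats[i]))
    return ranked[:top_k]

# Toy example: six "images" with 4-dimensional features in two classes.
db_feats = [[0, 0, 0, 0], [1, 1, 1, 1], [2, 2, 2, 2],
            [0, 1, 0, 1], [5, 5, 5, 5], [1, 0, 1, 0]]
db_labels = [0, 0, 1, 0, 1, 0]
print(retrieve_similar([1, 1, 1, 1], db_feats, db_labels, predicted_class=0, top_k=2))
# → [1, 3]
```

Restricting the search to the predicted class is what keeps the retrieval cheap: the distance is computed only against one class's feature vectors instead of the whole database.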


2021 ◽  
Vol 3 (Special Issue 9S) ◽  
pp. 28-33
Author(s):  
Mahalakshmi G. ◽  
Shimaali Riyasudeen ◽  
Sairam R. ◽  
Hari Sanjeevi R. ◽  
Raghupathy B.

Author(s):  
Shivlal Mewada

Valuable information is extracted through data mining techniques. Recently, privacy-preserving data mining techniques have been widely adopted for securing and protecting information and data. These techniques convert the original dataset into a protected dataset through swapping, modification, and deletion functions. The technique works in two steps. In the first step, cloud computing is considered as a service platform to determine the optimum horizontal partitioning of the given data. In this work, the K-Means++ algorithm is implemented to determine the horizontal partitioning on the cloud platform without disclosing the cluster-center information. The second step contains data protection and recovery phases: noise is incorporated into the database to maintain the privacy and semantics of the data, and a seed function is used to protect the original databases. The effectiveness of the proposed technique is evaluated on several benchmark medical datasets, with results reported in terms of encryption time, execution time, accuracy, and F-measure.
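The noise-and-seed protection idea in the second step can be illustrated with a minimal sketch. The function names, the uniform noise model, and the use of the seed itself as the recovery key are assumptions made here for illustration, not the paper's exact scheme.

```python
import random

def protect(records, noise_scale=0.1, seed=1234):
    """Perturb each numeric value with seeded uniform noise; the seed acts
    as the key an authorised party needs to recover the original data."""
    rng = random.Random(seed)
    return [[v + rng.uniform(-noise_scale, noise_scale) for v in row]
            for row in records]

def recover(protected, noise_scale=0.1, seed=1234):
    """Regenerate the identical noise stream from the seed and subtract it."""
    rng = random.Random(seed)
    return [[v - rng.uniform(-noise_scale, noise_scale) for v in row]
            for row in protected]

# Toy records, e.g. body temperature and systolic blood pressure.
original = [[36.6, 120.0], [38.2, 135.0]]
released = protect(original)     # published, noisy version
restored = recover(released)     # exact originals, given the seed
```

Because the pseudo-random stream is fully determined by the seed, only parties holding the seed can invert the perturbation; everyone else sees semantically plausible but inexact values.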


2021 ◽  
Vol 21 (S1) ◽  
Author(s):  
Jie Su ◽  
Yi Cao ◽  
Yuehui Chen ◽  
Yahui Liu ◽  
Jinming Song

Abstract Background Protection of the privacy of data published in the health care field is an important research area. The Health Insurance Portability and Accountability Act (HIPAA) in the USA is the current legislation for privacy protection. However, the Institute of Medicine Committee on Health Research and the Privacy of Health Information recently concluded that HIPAA cannot adequately safeguard privacy, while at the same time researchers cannot use the medical data for effective research. Therefore, more effective privacy protection methods are urgently needed to ensure the security of released medical data. Methods Privacy protection methods based on clustering are methods and algorithms that ensure the published data remains both useful and protected. In this paper, we first analyzed the importance of the key attributes of medical data in the social network. According to the attribute function and the main objective of privacy protection, the attribute information was divided into three categories. We then proposed an algorithm based on greedy clustering to group the data points according to the attributes and the connectivity information of the nodes in the published social network. Finally, we analyzed the loss of information during the clustering procedure and evaluated the proposed approach with respect to classification accuracy and information loss rates on a medical dataset. Results The associated social network of a medical dataset was analyzed for privacy preservation. We evaluated the generalization loss and structure loss for different values of k and a, i.e. k = {3, 6, 9, 12, 15, 18, 21, 24, 27, 30} and a = {0, 0.2, 0.4, 0.6, 0.8, 1}. The experimental results showed that the generalization loss approached its optimum at a = 1 and k = 21, and the structure loss at a = 0.4 and k = 3.
Conclusion We showed the importance of the attributes and the structure of the released health data in privacy preservation. Our method achieved better privacy preservation in social networks by optimizing generalization loss and structure loss. The proposed loss-evaluation method achieves a balance between data availability and the risk of privacy leakage.
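A minimal sketch of the greedy-clustering idea, assuming a simple attribute distance and a minimum cluster size of k; the paper's actual algorithm additionally weighs structural (connectivity) information with the parameter a, which is omitted here for brevity.

```python
def greedy_k_clusters(records, k):
    """Greedily build clusters of at least k records, repeatedly absorbing
    the unassigned record closest (by attribute distance) to the seed."""
    def dist(i, j):
        return sum(abs(a - b) for a, b in zip(records[i], records[j]))

    unassigned = set(range(len(records)))
    clusters = []
    while len(unassigned) >= k:
        seed = min(unassigned)                  # arbitrary deterministic seed
        unassigned.remove(seed)
        cluster = [seed]
        while len(cluster) < k and unassigned:
            nearest = min(unassigned, key=lambda j: dist(seed, j))
            unassigned.remove(nearest)
            cluster.append(nearest)
        clusters.append(cluster)
    if unassigned and clusters:                 # leftovers join the last cluster
        clusters[-1].extend(sorted(unassigned))
    return clusters

# Two natural age groups: {20, 21, 22} and {60, 61, 62}.
ages = [[20], [60], [21], [61], [22], [62]]
print(greedy_k_clusters(ages, k=3))
# → [[0, 2, 4], [1, 3, 5]]
```

Each cluster of size at least k can then be generalized to a common value (e.g. an age range), which is exactly where the generalization loss measured in the paper arises.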


2021 ◽  
Vol 50 (1) ◽  
pp. 102-122
Author(s):  
Veera Anusuya ◽  
V Gomathi

In the 20th century, a massive evolution of chronic diseases became evident. Data mining approaches are beneficial in making medical decisions for curing diseases, but medical data may comprise a very large number of records, which makes the prediction process difficult. Also, in the medical field a dataset may involve both small and extensive databases, which complicates the design of a disease prediction mechanism. Hence, in this paper, we intend to use a practical machine learning approach for disease prediction on both large and small datasets. Among the various machine learning procedures, classification and clustering methods play a significant role. Therefore, we introduce an enhanced classification and clustering approach in this work for obtaining better accuracy in disease prediction. The proposed method involves a preprocessing step, followed by eigenvector extraction, feature selection, and classification. Further, the most suitable features are selected from the extracted features using Multi-Objective based Ant Colony Optimization (MO-ACO) to improve classification and clustering. We have shown novelty at every stage of the implementation, such as feature selection, feature extraction, and the final prediction stage. The proposed method is compared with existing techniques on the measures of precision, NMI, execution time, recall, and accuracy. We conclude with a solution having higher accuracy for both small and large datasets.
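A toy sketch of pheromone-guided feature selection in the spirit of ant colony optimization; the single-objective score, the parameter values, and the function name are illustrative assumptions, since the paper's MO-ACO optimizes multiple objectives simultaneously.

```python
import random

def aco_feature_select(score, n_features, n_select, ants=10, iters=20,
                       rho=0.1, seed=0):
    """Pheromone-guided subset search: each ant samples n_select distinct
    features with pheromone-proportional probability; after each iteration
    the best subset found so far reinforces its features."""
    rng = random.Random(seed)
    tau = [1.0] * n_features                    # pheromone per feature
    best, best_score = None, float("-inf")
    for _ in range(iters):
        for _ in range(ants):
            subset = set()
            while len(subset) < n_select:       # roulette-wheel sampling
                r, acc = rng.uniform(0, sum(tau)), 0.0
                for f, t in enumerate(tau):
                    acc += t
                    if r <= acc:
                        subset.add(f)
                        break
            s = score(frozenset(subset))
            if s > best_score:
                best, best_score = frozenset(subset), s
        tau = [(1 - rho) * t for t in tau]      # evaporation
        for f in best:
            tau[f] += best_score                # reinforcement
    return sorted(best)

# Toy score: how many of the two "informative" features 0 and 2 are chosen.
# In practice the score would be, e.g., cross-validated classifier accuracy.
selected = aco_feature_select(lambda s: len(s & {0, 2}),
                              n_features=5, n_select=2)
```

The evaporation/reinforcement loop is what distinguishes ACO from plain random search: features that appear in good subsets accumulate pheromone and are sampled more often in later iterations.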


2021 ◽  
pp. 196-203
Author(s):  
Jaafar Hammoud ◽  
Aleksandra Vatian ◽  
Natalia Dobrenko ◽  
Nikolai Vedernikov ◽  
Anatoly Shalyto ◽  
...  
Keyword(s):  
