Reduced and stable feature sets selection with random forest for neurons segmentation in histological images of macaque brain

2021 ◽  
Vol 11 (1) ◽  
Author(s):  
C. Bouvier ◽  
N. Souedet ◽  
J. Levy ◽  
C. Jan ◽  
Z. You ◽  
...  

Abstract: In preclinical research, histology images are produced with powerful optical microscopes that digitize entire sections at cell scale. Quantification of stained tissue relies on machine-learning-driven segmentation. However, such methods require many additional pieces of information, or features, which increase the quantity of data to process. As a result, the number of features becomes a drawback when large series of massive histological images must be processed rapidly and robustly. Existing feature selection methods can reduce the amount of required information, but the selected subsets lack reproducibility. We propose a novel methodology, running on high-performance computing (HPC) infrastructures, that aims to find small and stable sets of features for fast and robust segmentation of high-resolution histological images. The selection has two steps: (1) selection at the scale of feature families (an intermediate pool of features, between feature spaces and individual features) and (2) feature selection performed on the pre-selected feature families. We show that the selected feature sets are stable for two different neuron stainings. To test different configurations, one dataset is mono-subject and the other is multi-subject. Furthermore, the feature selection yields a significant reduction of computation time and memory cost. This methodology will enable exhaustive histological studies at high resolution on HPC infrastructures for both preclinical and clinical research.
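The two-step selection described above can be sketched in miniature: first rank feature families by their aggregate random-forest importance, then select individual features within the retained families. This is a minimal illustration on synthetic data; the family names and sizes are hypothetical stand-ins, not the paper's actual feature spaces.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for pixel-level features: 12 features in 4 hypothetical
# "families" (the paper's real families come from colour/texture spaces).
X, y = make_classification(n_samples=400, n_features=12, n_informative=4,
                           random_state=0)
families = {"colour": [0, 1, 2], "texture": [3, 4, 5],
            "gradient": [6, 7, 8], "morphology": [9, 10, 11]}

rf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
imp = rf.feature_importances_

# Step 1: rank families by summed importance and keep the top 2.
family_score = {name: imp[idx].sum() for name, idx in families.items()}
kept = sorted(family_score, key=family_score.get, reverse=True)[:2]

# Step 2: within the kept families, keep the individually most important
# features, yielding a small and (over reruns) stable subset.
candidate = [i for name in kept for i in families[name]]
selected = sorted(candidate, key=lambda i: imp[i], reverse=True)[:4]
print(kept, selected)
```

Stability in the paper's sense would be assessed by repeating this procedure across subjects and stainings and comparing the selected subsets.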

GeoJournal ◽  
2007 ◽  
Vol 69 (1-2) ◽  
pp. 119-129 ◽  
Author(s):  
Anil Cheriyadat ◽  
Eddie Bright ◽  
David Potere ◽  
Budhendra Bhaduri

2019 ◽  
Author(s):  
Vladimir Averbukh ◽  
Alexander Bersenev ◽  
Majid Forghani ◽  
...  

In this paper we present a situation that required the visualization of a large number of non-trivial objects, namely supercomputer tasks. A direct method for visualizing these objects was hard to find, so we exploited additional knowledge about extra structure on the objects. This knowledge led us to the idea of grouping the objects into new, generalized ones. These artificial objects were easy to visualize because of their small number, and they proved sufficient for understanding the original problem: a successful change of point of view. As a whole, our work belongs to the area of high-performance computing performance visualization, which receives considerable attention from scientists worldwide, for example [1-2].
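The grouping idea above can be sketched as a simple aggregation: many individual task records collapse into a few generalized per-user summaries that are small enough in number to visualize directly. The record fields here are hypothetical, not the paper's actual task model.

```python
from collections import defaultdict

# Hypothetical task records standing in for supercomputer jobs.
tasks = [
    {"user": "alice", "state": "running", "cores": 64},
    {"user": "alice", "state": "queued",  "cores": 128},
    {"user": "bob",   "state": "running", "cores": 32},
    {"user": "bob",   "state": "running", "cores": 32},
]

# Group many individual tasks into a few generalized objects (one per user);
# the aggregates, not the raw tasks, are what gets visualized.
groups = defaultdict(lambda: {"n_tasks": 0, "total_cores": 0})
for t in tasks:
    g = groups[t["user"]]
    g["n_tasks"] += 1
    g["total_cores"] += t["cores"]

print(dict(groups))
```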


Author(s):  
Fhira Nhita

Data mining is a set of techniques for extracting useful information from a dataset, using methods such as classification and clustering. Clustering is one of the most widely used data mining techniques today. K-Means and K-Medoids are two clustering algorithms that are frequently used because they are easy to implement, efficient, and produce good results. Besides mining important information, the time spent mining data is also a concern in the current era, since real-world applications produce huge volumes of data. This research analyzes the clustering results of the K-Means and K-Medoids algorithms and their time performance, using a High Performance Computing (HPC) cluster and the Message Passing Interface (MPI) library to parallelize both algorithms. The results show that the K-Means algorithm yields a smaller SSE than K-Medoids, and that the parallel MPI implementation gives faster computation times than the sequential algorithm.
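The SSE comparison reported above can be reproduced in a small serial sketch (the paper's MPI parallelization is omitted here). K-Means uses scikit-learn; the K-Medoids part is a naive alternating scheme restricted to data points, an assumption standing in for the full PAM algorithm.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# K-Means: centres are means; SSE is the inertia reported by scikit-learn.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
sse_kmeans = km.inertia_

# Naive K-Medoids: each centre must be an actual data point.
rng = np.random.default_rng(0)
medoids = X[rng.choice(len(X), 3, replace=False)]
for _ in range(20):
    d = ((X[:, None, :] - medoids[None, :, :]) ** 2).sum(-1)
    labels = d.argmin(1)
    for k in range(3):
        pts = X[labels == k]
        if len(pts):
            # New medoid: the member minimising summed distance to its cluster.
            cost = ((pts[:, None] - pts[None, :]) ** 2).sum(-1).sum(1)
            medoids[k] = pts[cost.argmin()]
d = ((X[:, None, :] - medoids[None, :, :]) ** 2).sum(-1)
sse_kmedoids = d.min(1).sum()

# K-Means optimises SSE directly, so its SSE is at most that of K-Medoids.
print(sse_kmeans, sse_kmedoids)
```

Parallelizing either algorithm with MPI amounts to sharding the distance-and-assignment step across ranks and reducing the per-cluster statistics, which is why the speedup reported in the abstract is possible.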


Author(s):  
Amin Ul Haq ◽  
Jianping Li ◽  
Jalaluddin Khan ◽  
Muhammad Hammad Memon ◽  
Shah Nazir ◽  
...  

Accurate detection of diabetes has received significant attention, and developing a diagnosis system that detects diabetes reliably in an IoT e-healthcare environment remains a major challenge for the research community. The Internet of Things (IoT) plays an emerging role in healthcare services, delivering systems that analyze medical data for disease diagnosis using data mining methods. Existing diagnosis systems have drawbacks such as high computation time and low prediction accuracy. To address these issues, we propose an IoT-based diagnosis system that uses machine learning methods, namely data preprocessing, feature selection, and classification, for the detection of diabetes in an e-healthcare environment. Model validation and performance evaluation metrics were used to check the validity of the proposed system. We propose a filter method based on the Decision Tree (Iterative Dichotomiser 3) algorithm to select highly important features. Two ensemble Decision Tree algorithms, AdaBoost and Random Forest, are also used for feature selection, and the classifier performance is compared with wrapper-based feature selection algorithms. A Decision Tree classifier is used to distinguish healthy and diabetic subjects. The experimental results show that the Decision Tree algorithm trained on the selected features improves the classification performance of the predictive model and achieves optimal accuracy. Additionally, the proposed system outperforms previous state-of-the-art methods. Its high performance is due to the different combinations of selected feature sets; GL, DPF, and BMI are the most significant features in the dataset for the prediction of diabetes.
Furthermore, statistical analysis of the experimental results demonstrates that the proposed method effectively detects diabetes and can easily be deployed in an IoT wireless-sensor-based e-healthcare environment.
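The filter-then-classify pipeline above can be sketched as follows. Synthetic data stands in for the diabetes table, and an entropy-criterion decision tree plays the role of the ID3-style filter; feature counts and names are assumptions, not the paper's exact configuration.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a diabetes table (8 features, e.g. GL, BMI, DPF, ...).
X, y = make_classification(n_samples=500, n_features=8, n_informative=3,
                           random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

# Preprocessing: standardise the features.
scaler = StandardScaler().fit(X_tr)
X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

# Filter step: rank features by information gain via an entropy-criterion
# decision tree (an ID3-style importance filter).
ranker = DecisionTreeClassifier(criterion="entropy", random_state=1)
ranker.fit(X_tr, y_tr)
top = np.argsort(ranker.feature_importances_)[::-1][:4]

# Classification step: train a decision tree on the selected features only.
clf = DecisionTreeClassifier(criterion="entropy", random_state=1)
clf.fit(X_tr[:, top], y_tr)
acc = accuracy_score(y_te, clf.predict(X_te[:, top]))
print(f"selected features: {sorted(top.tolist())}, accuracy: {acc:.2f}")
```

Swapping the ranker for AdaBoost or Random Forest importances, as the paper does, changes only the `ranker` line.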


Author(s):  
Tahsin Kurc ◽  
Shannon Hastings ◽  
Vijay Kumar ◽  
Stephen Langella ◽  
Ashish Sharma ◽  
...  

Integrative biomedical research projects query, analyze, and integrate many different data types and make use of datasets obtained from measurements or simulations of structure and function at multiple biological scales. With the increasing availability of high-throughput and high-resolution instruments, integrative biomedical research imposes many challenging requirements on software middleware systems. In this paper, we examine some of these requirements using example research pattern templates. We then discuss how middleware systems that incorporate Grid and high-performance computing could be employed to address these requirements.

