An Assessment on the FCM Classification of Thermodynamic-Property Data

Author(s):  
Gabriela Avila ◽  
Arturo Pacheco-Vega

In the present study we consider the algorithmic classification of thermodynamic properties of fluids using the fuzzy C-means (FCM) clustering methodology. FCM is a technique that can find patterns directly from the data. It is based on the minimization of an objective function that provides a measure of the dissimilarity of the data being classified in a particular group. The dissimilarity in the data is commonly formulated in terms of the Euclidean distance between the data points and the cluster centroids. This mathematical formulation and its efficient implementation are among the method's advantages. However, some drawbacks that lead to misclassification include the convergence to local optima, the particular form of the data, and the choice of the parameters embedded in the scheme. To assess the classification performance of the FCM algorithm, published data of pressure, volume, and temperature are used, with emphasis on the way the algorithm is affected by the natural scale of the data and by the following strategies for the classification: (1) data normalization, (2) transformation, (3) sample size used, and (4) data supply to the algorithm. The results of this assessment show that the natural scaling and the normalization and transformation strategies are important, whereas the way the data are presented to the algorithm is not a critical factor in the classification. Also, a decrease in the number of data points considered degrades the quality of the clustering. A complete consideration of the issues studied here may be helpful when an FCM classification is tried on new data.
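The alternating update that FCM uses to minimize its objective can be sketched in a few lines. The following is a minimal illustration in plain Python, not the authors' implementation: the function names, the fuzzifier default m = 2, the stopping tolerance, the random initialization, and the toy points are all assumptions made for the example. A min-max normalization helper is included because the abstract stresses the effect of data scale.

```python
import math
import random


def min_max_normalize(points):
    """Rescale every coordinate to [0, 1] (one possible normalization)."""
    dim = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dim)]
    highs = [max(p[d] for p in points) for d in range(dim)]
    return [
        tuple(
            (p[d] - lows[d]) / (highs[d] - lows[d]) if highs[d] > lows[d] else 0.0
            for d in range(dim)
        )
        for p in points
    ]


def fuzzy_c_means(points, n_clusters, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Return (centroids, memberships); m > 1 is the fuzzifier (assumed 2)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centroids = []
    for _ in range(max_iter):
        # Centroid update: mean of the points weighted by u_ij ** m.
        centroids = []
        for j in range(n_clusters):
            weights = [u[i][j] ** m for i in range(n)]
            wsum = sum(weights)
            centroids.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / wsum
                for d in range(dim)
            ))
        # Membership update from Euclidean distances to the centroids.
        shift = 0.0
        for i, p in enumerate(points):
            dists = [max(math.dist(p, c), 1e-12) for c in centroids]
            new_row = [
                1.0 / sum((dj / dk) ** (2.0 / (m - 1.0)) for dk in dists)
                for dj in dists
            ]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:  # memberships have stopped changing
            break
    return centroids, u


if __name__ == "__main__":
    # Synthetic toy points, not data from the study.
    toy = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
    centers, memberships = fuzzy_c_means(min_max_normalize(toy), n_clusters=2)
    print(centers)
```

Because the result depends on the initial memberships, running the sketch with several seeds is one simple way to probe the local-optima issue mentioned above.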

The problem of medical data classification is analyzed and the methods of classification are reviewed in various aspects. However, the efficiency of classification algorithms is still in question. With the motivation to improve classification performance, a Class Level disease Convergence and Divergence (CLDC) measure based algorithm is presented in this paper. For any dimension of medical data, its convergence or divergence indicates the support for the disease class. Initially, the data set is preprocessed to remove the noisy data points. Next, the method estimates the disease convergence/divergence measure on different dimensions. The convergence measure is computed based on the frequency of dimensional match, whereas the divergence is estimated based on the dimensional match of other classes. Based on these measures, a disease support factor is estimated. The value of disease support is used to classify the data point, which improves the classification performance.


Author(s):  
Yuejun Liu ◽  
Yifei Xu ◽  
Xiangzheng Meng ◽  
Xuguang Wang ◽  
Tianxu Bai

Background: Medical imaging plays an important role in the diagnosis of thyroid diseases. In the field of machine learning, multiple dimensional deep learning algorithms are widely used in image classification and recognition, and have achieved great success. Objective: A method based on multiple dimensional deep learning is employed for the auxiliary diagnosis of thyroid diseases based on SPECT images. The performances of different deep learning models are evaluated and compared. Methods: Thyroid SPECT images of three types are collected: hyperthyroidism, normal, and hypothyroidism. In the pre-processing, the region of interest of the thyroid is segmented and the number of data samples is expanded. Four deep learning models, namely CNN, Inception, VGG16, and RNN, are used to evaluate deep learning methods. Results: Deep learning based methods have good classification performance, with an accuracy of 92.9%-96.2% and an AUC of 97.8%-99.6%. The VGG16 model has the best performance, with an accuracy of 96.2% and an AUC of 99.6%. In particular, the VGG16 model with a changing learning rate works best. Conclusion: The four deep learning models (standard CNN, Inception, VGG16, and RNN) are efficient for the classification of thyroid diseases with SPECT images. The accuracy of the assisted diagnostic method based on deep learning is higher than that of other methods reported in the literature.


Author(s):  
S. P. Oakley

After a brief introduction to stemmatic method, this book contains genealogical investigations of the textual traditions of Quintus Curtius Rufus and then Dictys Cretensis. The sections on each author begin with a list of MSS and incunables that will be discussed (they number just over 150 for Curtius, about 80 for Dictys) and then a survey of existing scholarship. There then follows the classification of the MSS and incunables; most of the MSS of both authors were produced in Italy in the fifteenth century. In the section on Curtius, MSS B = Bern, Burgerbibliothek 451, Br = Brussels 10161, and A = Paris, lat. 5720, owned by Petrarch, are shown to have been very productive. For Dictys it is argued that a stemma codicum can be established. First, witnesses related to G = Sankt Gallen, Stiftsbibliothek 197 (these include an important lost MS of Poggio) are discussed, then those related to MS E, the codex Aesinas, owned by Stefano Guarnieri. There follows discussion of the archetype, of the way in which proper attention to the stemma codicum can improve the text, and of the excerpts from Dictys found in MSS of Dares.


2021 ◽  
Vol 11 (1) ◽  
pp. 428
Author(s):  
Donghoon Oh ◽  
Jeong-Sik Park ◽  
Ji-Hwan Kim ◽  
Gil-Jin Jang

Speech recognition consists of converting input sound into a sequence of phonemes, then finding text for the input using language models. Therefore, phoneme classification performance is a critical factor for the successful implementation of a speech recognition system. However, correctly distinguishing phonemes with similar characteristics is still a challenging problem even for state-of-the-art classification methods, and the classification errors are hard to recover from in the subsequent language processing steps. This paper proposes a hierarchical phoneme clustering method to exploit more suitable recognition models for different phonemes. The phonemes of the TIMIT database are carefully analyzed using a confusion matrix from a baseline speech recognition model. Using automatic phoneme clustering results, a set of phoneme classification models optimized for the generated phoneme groups is constructed and integrated into a hierarchical phoneme classification method. According to the results of a number of phoneme classification experiments, the proposed hierarchical phoneme group models improved performance over the baseline by 3%, 2.1%, 6.0%, and 2.2% for fricative, affricate, stop, and nasal sounds, respectively. The average accuracy was 69.5% and 71.7% for the baseline and proposed hierarchical models, respectively, showing a 2.2% overall improvement.


2021 ◽  
Vol 21 (S2) ◽  
Author(s):  
Kun Zeng ◽  
Yibin Xu ◽  
Ge Lin ◽  
Likeng Liang ◽  
Tianyong Hao

Background: Eligibility criteria are the primary strategy for screening the target participants of a clinical trial. Automated classification of clinical trial eligibility criteria text by using machine learning methods improves recruitment efficiency and reduces the cost of clinical research. However, existing methods suffer from poor classification performance due to the complexity and imbalance of eligibility criteria text data. Methods: An ensemble learning-based model with metric learning is proposed for eligibility criteria classification. The model integrates a set of pre-trained models including Bidirectional Encoder Representations from Transformers (BERT), A Robustly Optimized BERT Pretraining Approach (RoBERTa), XLNet, Pre-training Text Encoders as Discriminators Rather Than Generators (ELECTRA), and Enhanced Representation through Knowledge Integration (ERNIE). Focal Loss is used as the loss function to address the data imbalance problem. Metric learning is employed to train the embedding of each base model so that features are easier to distinguish. Soft Voting is applied to obtain the final classification of the ensemble model. The dataset is from the standard evaluation task 3 of the 5th China Health Information Processing Conference and contains 38,341 eligibility criteria texts in 44 categories. Results: Our ensemble method had an accuracy of 0.8497, a precision of 0.8229, and a recall of 0.8216 on the dataset. The macro F1-score was 0.8169, outperforming state-of-the-art baseline methods by 0.84% on average. In addition, the performance improvement had a p-value of 2.152e-07 with a standard t-test, indicating that our model achieved a significant improvement. Conclusions: A model for classifying eligibility criteria text of clinical trials based on multi-model ensemble learning and metric learning was proposed. The experiments demonstrated that the classification performance was improved significantly by our ensemble model. In addition, metric learning was able to improve the word embedding representation, and the focal loss reduced the impact of data imbalance on model performance.
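Two of the building blocks named above, focal loss and soft voting, are simple enough to illustrate directly. The sketch below is a hedged, stdlib-only illustration and not the authors' code: it uses the commonly cited focal loss form FL = -alpha * (1 - p_t)^gamma * log(p_t), and the function names, the defaults gamma = 2 and alpha = 1, and the equal model weights are assumptions for the example.

```python
import math


def focal_loss(probs, target, gamma=2.0, alpha=1.0):
    """Focal loss for one sample given predicted class probabilities.

    Confident correct predictions (p_t near 1) are down-weighted by
    (1 - p_t) ** gamma, so hard or rare examples dominate the loss.
    With gamma = 0 and alpha = 1 this reduces to cross-entropy.
    """
    p_t = min(max(probs[target], 1e-12), 1.0)  # clamp to avoid log(0)
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


def soft_vote(model_probs, weights=None):
    """Average per-class probabilities across models and pick the argmax.

    model_probs holds one probability vector per base model; weights are
    optional per-model weights (equal weighting is assumed by default).
    """
    n_classes = len(model_probs[0])
    if weights is None:
        weights = [1.0] * len(model_probs)
    total = sum(weights)
    averaged = [
        sum(w * probs[c] for w, probs in zip(weights, model_probs)) / total
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=averaged.__getitem__)
    return label, averaged


if __name__ == "__main__":
    # Hypothetical outputs of three base models for one text, three classes.
    outputs = [[0.6, 0.4, 0.0], [0.2, 0.7, 0.1], [0.3, 0.5, 0.2]]
    print(soft_vote(outputs))
    print(focal_loss([0.9, 0.1], target=0), focal_loss([0.6, 0.4], target=0))
```

Soft voting averages probabilities rather than counting hard votes, so a model that is confidently right can outweigh two models that are marginally wrong.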


2019 ◽  
Vol 2019 ◽  
pp. 1-9
Author(s):  
Yizhe Wang ◽  
Cunqian Feng ◽  
Yongshun Zhang ◽  
Sisan He

Precession is a common micromotion form of space targets, introducing additional micro-Doppler (m-D) modulation into the radar echo. Effective classification of space targets is of great significance for further micromotion parameter extraction and identification. Feature extraction is a key step during the classification process, largely influencing the final classification performance. This paper presents two methods for classifying different types of space precession targets from their high-resolution range profiles (HRRPs). We first establish the precession model of space targets and analyze the scattering characteristics, and then compute electromagnetic data of the cone target, cone-cylinder target, and cone-cylinder-flare target. Experimental results demonstrate that the support vector machine (SVM) using histogram of oriented gradients (HOG) features achieves a good result, whereas the deep convolutional neural network (DCNN) obtains a higher classification accuracy. The DCNN combines the feature extractor and the classifier to automatically mine the high-level signatures of HRRPs through a training process. In addition, the efficiency of the two classification processes is compared using the same dataset.


Author(s):  
Bo Wang ◽  
Chen Sun ◽  
Keming Zhang ◽  
Jubing Chen

As a representative type of outlier, abnormal data in displacement measurement often occur unavoidably in full-field optical metrology and significantly affect further evaluation, especially when calculating the strain field by differencing the displacement. In this study, an outlier removal method is proposed which can recognize and remove the abnormal data in an optically measured displacement field. An iterative critical factor least squares (CFLS) algorithm is developed, which uses the distance between the data points and the least-squares plane to identify the outliers. A successive boundary point algorithm is proposed to divide the measurement domain to improve the applicability and effectiveness of the CFLS algorithm. The feasibility and precision of the proposed method are discussed in detail through simulations and experiments. Results show that the outliers are reliably recognized and the precision of the strain estimation is greatly improved by using these methods.
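The general idea of rejecting points by their distance from a fitted least-squares plane can be sketched as follows. This is only an illustration of the generic technique, not the CFLS algorithm itself: the abstract does not give the rejection rule, so the threshold used here (a "critical factor" times the RMS residual), the function names, and all defaults are assumptions.

```python
import math


def fit_plane(points):
    """Least-squares fit of w = a*x + b*y + c to (x, y, w) triples."""
    n = float(len(points))
    sx = sum(x for x, _, _ in points)
    sy = sum(y for _, y, _ in points)
    sw = sum(w for _, _, w in points)
    sxx = sum(x * x for x, _, _ in points)
    syy = sum(y * y for _, y, _ in points)
    sxy = sum(x * y for x, y, _ in points)
    sxw = sum(x * w for x, _, w in points)
    syw = sum(y * w for _, y, w in points)
    # Augmented normal equations; assumes the points are not collinear.
    m = [[sxx, sxy, sx, sxw], [sxy, syy, sy, syw], [sx, sy, n, sw]]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))


def remove_outliers(points, critical_factor=3.0, max_iter=20, min_rms=1e-9):
    """Iteratively drop points far from the fitted plane (assumed rule)."""
    kept = list(points)
    for _ in range(max_iter):
        a, b, c = fit_plane(kept)
        residuals = [w - (a * x + b * y + c) for x, y, w in kept]
        rms = math.sqrt(sum(r * r for r in residuals) / len(residuals))
        if rms <= min_rms:  # fit is already essentially exact
            break
        inliers = [p for p, r in zip(kept, residuals)
                   if abs(r) <= critical_factor * rms]
        if len(inliers) == len(kept) or len(inliers) < 3:
            break
        kept = inliers
    return kept, fit_plane(kept)


if __name__ == "__main__":
    # Synthetic displacement values on a 5 x 5 grid with one corrupted point.
    grid = [(float(x), float(y), 0.01 * x + 0.02 * y + 1.0)
            for x in range(5) for y in range(5)]
    grid[12] = (2.0, 2.0, grid[12][2] + 5.0)
    print(remove_outliers(grid))
```

A single global plane is only reasonable over a small, smoothly varying region, which is consistent with the abstract's point about dividing the measurement domain before applying the fit.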


2021 ◽  
Author(s):  
Faruk Alpak ◽  
Yixuan Wang ◽  
Guohua Gao ◽  
Vivek Jain

Recently, a novel distributed quasi-Newton (DQN) derivative-free optimization (DFO) method was developed for generic reservoir performance optimization problems including well-location optimization (WLO) and well-control optimization (WCO). DQN is designed to effectively locate multiple local optima of highly nonlinear optimization problems. However, its performance has neither been validated by realistic applications nor compared to other DFO methods. We have integrated DQN into a versatile field-development optimization platform designed specifically for iterative workflows enabled through distributed-parallel flow simulations. DQN is benchmarked against alternative DFO techniques, namely, the Broyden–Fletcher–Goldfarb–Shanno (BFGS) method hybridized with Direct Pattern Search (BFGS-DPS), Mesh Adaptive Direct Search (MADS), Particle Swarm Optimization (PSO), and Genetic Algorithm (GA). DQN is a multi-thread optimization method that distributes an ensemble of optimization tasks among multiple high-performance-computing nodes. Thus, it can locate multiple optima of the objective function in parallel within a single run. Simulation results computed from one DQN optimization thread are shared with others by updating a unified set of training data points composed of responses (implicit variables) of all successful simulation jobs. The sensitivity matrix at the current best solution of each optimization thread is approximated by a linear-interpolation technique using all or a subset of training-data points. The gradient of the objective function is analytically computed using the estimated sensitivities of implicit variables with respect to explicit variables. The Hessian matrix is then updated using the quasi-Newton method. A new search point for each thread is obtained by solving a trust-region subproblem for the next iteration. In contrast, other DFO methods rely on a single-thread optimization paradigm that can only locate a single optimum. To locate multiple optima with such methods, one must repeat the same optimization process multiple times starting from different initial guesses. Moreover, simulation results generated from a single-thread optimization task cannot be shared with other tasks. Benchmarking results are presented for synthetic yet challenging WLO and WCO problems. Finally, the DQN method is field-tested on two realistic applications. DQN identifies the global optimum with the least number of simulations and the shortest run time on a synthetic problem with a known solution. On other benchmarking problems without a known solution, DQN identified compatible local optima with reasonably smaller numbers of simulations compared to alternative techniques. Field-testing results reinforce the auspicious computational attributes of DQN. Overall, the results indicate that DQN is a novel and effective parallel algorithm for field-scale development optimization problems.


2018 ◽  
Vol 2018 ◽  
pp. 1-13 ◽  
Author(s):  
Chao-qun Zhao ◽  
Long Chen ◽  
Hong Cai ◽  
Wei-li Yao ◽  
Qun Zhou ◽  
...  

Objective. This study aimed to analyze the differential metabolites and their metabolic pathways from the serum of patients with hepatitis B cirrhosis, with two typical patterns of Gan Dan Shi Re (GDSR) and Gan Shen Yin Xu (GSYX) based on the theory of traditional Chinese medicine (TCM). It also investigated the variation in the internal material basis for the two types of patterns and provided an objective basis for classifying TCM patterns using metabolomic techniques. Methods. The serum samples taken from 111 qualified patients (40 GDSR cases, 41 GSYX cases, and 30 Latent Pattern (LP) cases with no obvious pattern characters) and 60 healthy volunteers were tested to identify the differential substances relevant to hepatitis B cirrhosis and the two typical TCM patterns under the gas chromatography–time-of-flight mass spectrometry platform. The relevant metabolic pathways of differential substances were analyzed using multidimensional statistical analysis. Results. After excluding the influence of LP groups, six common substances were found in GDSR and GSYX patterns, which were mainly involved in the metabolic pathways of glycine, serine, threonine, and phenylalanine. Eight specific metabolites involved in the metabolic pathways of linoleic, glycine, threonine, and serine existed in the two patterns. Conclusions. The data points on the metabolic spectrum were found to be well distributed among the differential substances between the two typical TCM patterns of patients with hepatitis B cirrhosis using metabolomic techniques. The differential expression of these substances between GDSR and GSYX patterns provided an important objective basis for the scientific nature of TCM pattern classification at the metabolic level.

