A simple protein evolutionary classification method based on the mutual relations between protein sequences

2020 ◽  
Vol 15 ◽  
Author(s):  
Xiaogeng Wan ◽  
Xinying Tan

Aims: This paper presents a simple and efficient method for protein evolutionary classification. Background: Proteins are diverse in their sequences, structures and functions, and it is important to understand the relations among these. Many methods have been developed for protein evolutionary classification, including machine learning methods such as LibSVM and feature methods such as the natural vector method and the protein map. Machine learning methods use pre-labeled training sets to classify protein sequences into disjoint classes. Feature methods such as the natural vector and the protein map convert protein sequences into feature vectors and use phylogenetic trees to classify the sequences according to the distances between the feature vectors. Objective: We propose a simple method that classifies the evolutionary relations of protein sequences using distance maps on the mutual relations between protein sequences. The new method is unsupervised and model-free, and it is efficient in the evolutionary classification of proteins. Method: To quantify the mutual relations and the homology of protein sequences, we use the normalized mutual information rates on protein sequences. We define two distance maps that convert the normalized mutual information rates into 'distances', and we use UPGMA trees to present the evolutionary classifications of proteins. Result: We use four classical protein evolutionary classification examples to demonstrate the new method, and the results are compared with traditional methods such as the natural vector and the protein map. We use AUPRC curves to evaluate the classification quality of the new method and the traditional methods. We found that the new method with the two distance maps is efficient in the evolutionary classification of the classical examples, and that it outperforms the natural vector and the protein map. Conclusion: The normalized mutual information rates with the two distance maps are efficient in protein evolutionary classification and outperform some classical methods. Other: The results are compared with traditional protein evolutionary classification methods such as the natural vector and the protein map, and AUPRC curves are applied to the new method and the traditional methods to inspect the classification accuracies.
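
The pipeline described in the Method section (pairwise similarity, distance map, UPGMA tree) can be illustrated with a minimal Python sketch. The abstract does not give the formulas for the two distance maps or for the normalized mutual information rate, so everything specific here is an assumption: the maps `one_minus` and `neg_log`, the sequence labels, and the toy similarity values are all hypothetical, and the UPGMA routine is a plain average-linkage implementation rather than the authors' code.

```python
import math
from itertools import combinations


def one_minus(nmi):
    # Assumed distance map (not taken from the paper): d = 1 - NMI.
    return 1.0 - nmi


def neg_log(nmi):
    # Second assumed distance map (not taken from the paper): d = -ln(NMI).
    return -math.log(nmi)


def upgma(names, dist):
    """Average-linkage (UPGMA) clustering.

    names: list of leaf labels.
    dist:  dict mapping frozenset({a, b}) -> distance between leaves a and b.
    Returns the tree as nested tuples.
    """
    clusters = {name: 1 for name in names}  # cluster -> number of leaves
    d = dict(dist)
    while len(clusters) > 1:
        # Merge the closest pair of current clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: d[frozenset(pair)])
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        merged = (a, b)
        # Distance to the merged cluster is the size-weighted average.
        for c in clusters:
            d[frozenset((merged, c))] = (
                size_a * d[frozenset((a, c))] + size_b * d[frozenset((b, c))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))


# Hypothetical normalized mutual information values in (0, 1] for four sequences.
names = ["A", "B", "C", "D"]
nmi = {
    ("A", "B"): 0.90, ("C", "D"): 0.80,
    ("A", "C"): 0.30, ("A", "D"): 0.20,
    ("B", "C"): 0.25, ("B", "D"): 0.20,
}


def build_tree(distance_map):
    dist = {frozenset(pair): distance_map(value) for pair, value in nmi.items()}
    return upgma(names, dist)


tree_linear = build_tree(one_minus)
tree_log = build_tree(neg_log)
print(tree_linear)
print(tree_log)
```

On this toy input both assumed maps group A with B and C with D, since any decreasing map of the similarity preserves which pair is closest at the first merge; later merges can differ between maps on real data.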

2021 ◽  
Author(s):  
Yan-Mei Li ◽  
Chu-Qiao Liang ◽  
Lin Wang ◽  
Yun-Yi Luo ◽  
Qian-Qian Li

We developed a new method for protein droplet visualization by means of a droplet probe (DroProbe) based on an aggregation-induced emission (AIE) fluorogen. A simple method for viscosity comparison of...


2021 ◽  
Vol 69 (2) ◽  
pp. 173-179
Author(s):  
Nilolina Samardzic ◽  
Brian C.J. Moore

Traditional methods for predicting the intelligibility of speech in the presence of noise inside a vehicle, such as the Articulation Index (AI), the Speech Intelligibility Index (SII), and the Speech Transmission Index (STI), are not accurate, probably because they do not take binaural listening into account; the signals reaching the two ears can differ markedly depending on the positions of the talker and listener. We propose a new method for predicting the intelligibility of speech in a vehicle, based on the ratio of the binaural loudness of the speech to the binaural loudness of the noise, each calculated using the method specified in ISO 532-2 (2017). The method was found to give accurate predictions of the speech reception threshold (SRT) measured under a variety of conditions and for different positions of the talker and listener in a car. The typical error in the predicted SRT was 1.3 dB, which is markedly smaller than the errors obtained using the SII and STI (2.0 dB and 2.1 dB, respectively).


2018 ◽  
Vol 14 (1) ◽  
pp. 43-50 ◽  
Author(s):  
Anna Fitzpatrick ◽  
Joseph A Stone ◽  
Simon Choppin ◽  
John Kelley

Performance analysis and the identification of performance characteristics associated with success are of great importance to players and coaches in any sport. However, while large amounts of data are available within elite tennis, very few players employ an analyst or attempt to exploit the data to enhance their performance; this is partly attributable to the considerable time and complex techniques required to interpret these large datasets. Using data from the 2016 and 2017 French Open tournaments, we tested the agreement between the results of a simple new method for identifying important performance characteristics (the Percentage of matches in which the Winner Outscored the Loser, PWOL) and the results of two standard statistical methods to establish the validity of the simple method. Spearman’s rank-order correlations between the results of the three methods demonstrated excellent agreement, with all methods identifying the same three performance characteristics (points won of 0–4 rally length, baseline points won and first serve points won) as strongly associated with success. Consequently, we propose that the PWOL method is valid for identifying performance characteristics associated with success in tennis, and is therefore a suitable alternative to more complex statistical methods, as it is simpler to calculate, interpret and contextualise.
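
As the name suggests, PWOL is a simple proportion, and a short Python sketch shows how it might be computed. The data layout, the field names (`winner`, `loser`, `first_serve_points_won`, `aces`) and the match values below are hypothetical, and treating a tie as "not outscored" is an assumption; the abstract does not say how ties are handled.

```python
def pwol(matches, characteristic):
    """Percentage of matches in which the winner outscored the loser.

    Assumption: a tie on the characteristic does not count as outscoring.
    """
    outscored = sum(
        1
        for match in matches
        if match["winner"][characteristic] > match["loser"][characteristic]
    )
    return 100.0 * outscored / len(matches)


# Hypothetical match records, not taken from the French Open data.
matches = [
    {"winner": {"first_serve_points_won": 42, "aces": 7},
     "loser": {"first_serve_points_won": 35, "aces": 9}},
    {"winner": {"first_serve_points_won": 38, "aces": 5},
     "loser": {"first_serve_points_won": 40, "aces": 3}},
    {"winner": {"first_serve_points_won": 51, "aces": 4},
     "loser": {"first_serve_points_won": 44, "aces": 4}},
    {"winner": {"first_serve_points_won": 47, "aces": 10},
     "loser": {"first_serve_points_won": 39, "aces": 6}},
]

for name in ("first_serve_points_won", "aces"):
    print(name, pwol(matches, name))
```

A characteristic with a PWOL near 100% is one on which winners almost always outscore losers, while a value near 50% suggests little association with match outcome.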


Radiocarbon ◽  
1983 ◽  
Vol 25 (2) ◽  
pp. 639-644 ◽  
Author(s):  
H T Waterbolk

In the past 30 years many hundreds of archaeologic samples have been dated by radiocarbon laboratories. Yet, one cannot say that 14C dating is fully integrated into archaeology. For many archaeologists, a 14C date is an outside expertise, for which they are grateful, when it provides the answer to an otherwise insoluble chronologic problem and when it falls within the expected time range. But if a 14C date contradicts other chronologic evidence, they often find the ‘solution’ inexplicable. Some archaeologists are so impressed by the new method, that they neglect the other evidence; others simply reject problematic 14C dates as archaeologically unacceptable. Frequently, excavation reports are provided with an appendix listing the relevant 14C dates with little or no discussion of their implication. It is rare, indeed, to see in archaeologic reports a careful weighing of the various types of chronologic evidence. Yet, this is precisely what the archaeologist is accustomed to do with the evidence from his traditional methods for building up a chronology: typology and stratigraphy. Why should he not be able to include radiocarbon dates in the same way in his considerations?


2015 ◽  
Author(s):  
Patrizio Tressoldi ◽  
William Giroldini ◽  
Luciano Pederzoli ◽  
Marco Bilucaglia ◽  
Simone Melloni

Event Related Potentials (ERPs) are widely used in Brain-Computer Interface applications and in neuroscience. Normal EEG activity is rich in background noise, so detecting ERPs usually requires averaging over multiple trials to reduce the effects of this noise. The noise produced by EEG activity itself is not correlated with the ERP waveform, so averaging reduces the noise by a factor equal to the square root of N, where N is the number of averaged epochs. This is the simplest strategy currently used to detect ERPs; it relies on averaging the ERP waveforms, which are time- and phase-locked. In this paper a new method called GW6 is proposed, which calculates the ERP using a mathematical method based only on Pearson's correlation. This results in a graph with the same time resolution as the classical ERP, containing only positive peaks that represent the increase, in consonance with the stimuli, in EEG signal correlation over all channels. The new method is also useful for selectively identifying and highlighting components of the ERP response that are not phase-locked and that are usually hidden by the standard, simple method based on averaging all the epochs. These hidden components seem to be caused by variations (between successive stimuli) of the ERP's inherent phase latency period (jitter), although the same stimulus produces a reasonably constant phase across all EEG channels. For this reason, the new method could be very helpful for investigating these hidden components of the ERP response and for developing applications for scientific and medical purposes. Moreover, the new method is more resistant to EEG artifacts than the standard calculation of the average. The proposed method can be used directly in the form of a procedure written in the well-known Matlab programming language, and it can be easily and quickly rewritten in any other programming language.
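
The square-root-of-N noise reduction from classical averaging can be checked with a small simulation. The Python sketch below illustrates only the standard averaging approach, not GW6, whose details the abstract does not give. The waveform shape, the unit-variance Gaussian noise, and the sample and epoch counts are all assumptions chosen for illustration.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

n_samples = 2000  # samples per epoch (assumed)
n_epochs = 100    # number of averaged epochs, N


def erp_waveform(t):
    # Hypothetical phase-locked ERP: a Gaussian bump centred at t = 0.3.
    return math.exp(-((t - 0.3) ** 2) / 0.002)


signal = [erp_waveform(i / n_samples) for i in range(n_samples)]

# Each epoch is the same waveform plus independent unit-variance noise.
epochs = [[s + random.gauss(0.0, 1.0) for s in signal] for _ in range(n_epochs)]

# Classical ERP estimate: sample-by-sample mean over all epochs.
average = [sum(column) / n_epochs for column in zip(*epochs)]

# Residual noise level in one epoch and in the average.
noise_single = statistics.pstdev([e - s for e, s in zip(epochs[0], signal)])
noise_average = statistics.pstdev([a - s for a, s in zip(average, signal)])

ratio = noise_average / noise_single
print("measured ratio:", ratio)
print("expected 1/sqrt(N):", 1 / math.sqrt(n_epochs))
```

The measured ratio should come out close to 1/√N (0.1 for N = 100). This holds only because the simulated waveform is perfectly phase-locked; components with latency jitter are smeared by the same averaging, which is the limitation the abstract says GW6 is meant to address.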

