pLoc_bal-mEuk: Predict Subcellular Localization of Eukaryotic Proteins by General PseAAC and Quasi-balancing Training Dataset

2019 ◽  
Vol 15 (5) ◽  
pp. 472-485 ◽  
Author(s):  
Kuo-Chen Chou ◽  
Xiang Cheng ◽  
Xuan Xiao

Background/Objective: Information on protein subcellular localization is crucially important for both basic research and drug development. With the explosive growth of protein sequences discovered in the post-genomic age, there is strong demand for powerful bioinformatics tools that identify subcellular localization in a timely and effective manner from sequence information alone. Recently, a predictor called “pLoc-mEuk” was developed for identifying the subcellular localization of eukaryotic proteins. Its performance is overwhelmingly better than that of the other predictors for the same purpose, particularly in dealing with multi-label systems where many proteins, called “multiplex proteins”, may simultaneously occur in two or more subcellular locations. Although it is indeed a very powerful predictor, more effort is needed to further improve it, because pLoc-mEuk was trained on an extremely skewed dataset in which some subsets were about 200 times the size of others. Accordingly, it cannot avoid the bias caused by such an uneven training dataset. Methods: To alleviate this bias, we have developed a new predictor called pLoc_bal-mEuk by quasi-balancing the training dataset. Cross-validation tests on exactly the same experiment-confirmed dataset have indicated that the proposed new predictor is remarkably superior to pLoc-mEuk, the existing state-of-the-art predictor in identifying the subcellular localization of eukaryotic proteins. It has not escaped our notice that the quasi-balancing treatment can also be used to deal with many other biological systems. Results: To maximize the convenience for most experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_bal-mEuk/. Conclusion: It is anticipated that the pLoc_bal-mEuk predictor holds very high potential to become a useful high-throughput tool for identifying the subcellular localization of eukaryotic proteins, particularly for finding multi-target drugs, which is currently a very active trend in drug development.
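The abstract does not say how the quasi-balancing is performed, so the following is only a minimal sketch of the general idea of balancing a skewed training set. It uses plain random oversampling with replacement as a stand-in technique, and it treats each protein as having a single location label, which ignores the multi-label case the predictor targets. The function names `imbalance_ratio` and `oversample` and the example labels are hypothetical.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """Size of the largest class divided by the size of the smallest."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def oversample(samples, labels, seed=0):
    """Randomly duplicate samples until every class matches the largest one.

    Assumption: simple oversampling with replacement, used here only to
    illustrate balancing; it is not the treatment described in the paper.
    """
    rng = random.Random(seed)
    by_label = {}
    for sample, label in zip(samples, labels):
        by_label.setdefault(label, []).append(sample)

    target = max(len(group) for group in by_label.values())
    out_samples, out_labels = [], []
    for label, group in by_label.items():
        # Draw the missing samples from the class's own members.
        extra = [rng.choice(group) for _ in range(target - len(group))]
        for sample in group + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels
```

With 200 samples in one class and 1 in another, `imbalance_ratio` reports 200.0 before balancing and 1.0 afterwards. Duplicating samples adds no new information, which is one reason a method might prefer to generate hypothetical samples instead.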

2019 ◽  
Vol 24 (34) ◽  
pp. 4013-4022 ◽  
Author(s):  
Xiang Cheng ◽  
Xuan Xiao ◽  
Kuo-Chen Chou

Knowledge of protein subcellular localization is vitally important for both basic research and drug development. With the avalanche of protein sequences emerging in the post-genomic age, it is highly desirable to develop computational tools that identify subcellular localization in a timely and effective manner from sequence information alone. Recently, a predictor called “pLoc-mPlant” was developed for identifying the subcellular localization of plant proteins. Its performance is overwhelmingly better than that of the other predictors for the same purpose, particularly in dealing with multi-label systems in which some proteins, called “multiplex proteins”, may simultaneously occur in two or more subcellular locations. Although it is indeed a very powerful predictor, more effort is needed to further improve it, because pLoc-mPlant was trained on an extremely skewed dataset in which some subsets (i.e., the protein numbers for some subcellular locations) were more than 10 times larger than the others. Accordingly, it cannot avoid the bias caused by such an uneven training dataset. To overcome this bias, we have developed a new and bias-free predictor called pLoc_bal-mPlant by balancing the training dataset. Cross-validation tests on exactly the same experiment-confirmed dataset have indicated that the proposed new predictor is remarkably superior to pLoc-mPlant, the existing state-of-the-art predictor in identifying the subcellular localization of plant proteins. To maximize the convenience for the majority of experimental scientists, a user-friendly web-server for the new predictor has been established at http://www.jci-bioinfo.cn/pLoc_bal-mPlant/, by which users can easily get their desired results without the need to go through the detailed mathematics.


2019 ◽  
Vol 15 (5) ◽  
pp. 496-509 ◽  
Author(s):  
Xuan Xiao ◽  
Xiang Cheng ◽  
Genqiang Chen ◽  
Qi Mao ◽  
Kuo-Chen Chou

Background/Objective: Knowledge of protein subcellular localization is vitally important for both basic research and drug development. Facing the avalanche of protein sequences emerging in the post-genomic age, it is urgent to develop computational tools that identify subcellular localization in a timely and effective manner from sequence information alone. Recently, a predictor called “pLoc-mVirus” was developed for identifying the subcellular localization of virus proteins. Its performance is overwhelmingly better than that of the other predictors for the same purpose, particularly in dealing with multi-label systems in which some proteins, known as “multiplex proteins”, may simultaneously occur in, or move between, two or more subcellular location sites. Although it is indeed a very powerful predictor, more effort is needed to further improve it, because pLoc-mVirus was trained on an extremely skewed dataset in which some subsets were over 10 times the size of others. Accordingly, it cannot avoid the bias caused by such an uneven training dataset. Methods: Using Chou's general PseAAC (Pseudo Amino Acid Composition) approach and the IHTS (Inserting Hypothetical Training Samples) treatment to balance out the training dataset, we have developed a new predictor called “pLoc_bal-mVirus” for predicting the subcellular localization of multi-label virus proteins. Results: Cross-validation tests on exactly the same experiment-confirmed dataset have indicated that the proposed new predictor is remarkably superior to pLoc-mVirus, the existing state-of-the-art predictor for the same purpose. Conclusion: Its user-friendly web-server is available at http://www.jci-bioinfo.cn/pLoc_balmVirus/, by which the majority of experimental scientists can easily get their desired results without the need to go through the detailed, complicated mathematics. Accordingly, pLoc_bal-mVirus will become a very useful tool for designing multi-target drugs and for in-depth understanding of the biological processes in a cell.
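The abstract names the IHTS treatment but gives no details of how the hypothetical samples are built. As a rough illustration of what inserting hypothetical training samples can look like, the sketch below creates new feature vectors by linear interpolation between two randomly chosen members of an under-represented class, in the style of SMOTE. This is an assumed stand-in and not the published IHTS procedure; the function name `insert_hypothetical_samples` and its parameters are hypothetical.

```python
import random


def insert_hypothetical_samples(minority, n_new, seed=0):
    """Create n_new synthetic feature vectors for an under-represented class.

    Assumption: each new vector lies on the line segment between two
    randomly chosen real vectors (SMOTE-style interpolation). The actual
    IHTS treatment may differ.
    """
    if len(minority) < 2:
        raise ValueError("need at least two real samples to interpolate")
    rng = random.Random(seed)
    new_samples = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)  # two distinct real samples
        t = rng.random()                # position along the segment a -> b
        new_samples.append([x + t * (y - x) for x, y in zip(a, b)])
    return new_samples
```

Because every synthetic vector lies between two real ones, each coordinate stays within the range spanned by the real samples of that class.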


2021 ◽  
Vol 6 (1) ◽  
pp. 203
Author(s):  
Jonathan Jibson

When Pillai scores are used to study vowel mergers, formants are typically sampled from the midpoint. This study compares alternative methods for calculating Pillai scores: methods that incorporate dynamic spectral information. Eighteen speakers produced 20 tokens of Hodd and hawed. Formants were sampled at 20–35–50–65–80% duration. Seven Pillai scores were calculated, each based on a different subset of those samples with temporal pooling: (i) onsets, (ii) heads, (iii) midpoints, (iv) onsets + offsets, (v) heads + tails, (vi) onsets + midpoints + offsets, and (vii) all five. Subjects also completed a vowel identification task, and the rate of identifying one low-back vowel as the other was calculated. The results of the identification task were regressed on each Pillai score separately to identify the one with the highest correlation, through model selection. Dynamic formant contours performed better than static formant values, with midpoint sampling performing worst of all. Directions are discussed for basic research on Pillai scores in phonetics.
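For readers unfamiliar with the statistic, a Pillai score is Pillai's trace from a MANOVA: with H the between-group and E the within-group sums-of-squares-and-cross-products matrices, it equals trace(H (H + E)^-1). The sketch below computes it for two vowel categories measured on two dimensions (for example F1 and F2 at a single time point). It is a minimal illustration under those assumptions and does not reproduce the study's pooling of several time points; the function name `pillai_score` is hypothetical.

```python
def pillai_score(group_a, group_b):
    """Pillai's trace for two groups of (F1, F2) measurements.

    Returns a value between 0 (complete overlap) and 1 (complete separation).
    """
    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    def sscp(points, center):
        # Entries (xx, xy, yy) of the symmetric 2x2 SSCP matrix about center.
        sxx = sxy = syy = 0.0
        for x, y in points:
            dx, dy = x - center[0], y - center[1]
            sxx += dx * dx
            sxy += dx * dy
            syy += dy * dy
        return sxx, sxy, syy

    all_points = list(group_a) + list(group_b)
    # Within-group matrix E: deviations from each group's own mean.
    e = [a + b for a, b in zip(sscp(group_a, mean(group_a)),
                               sscp(group_b, mean(group_b)))]
    # Total matrix T = H + E: deviations from the grand mean.
    txx, txy, tyy = sscp(all_points, mean(all_points))
    hxx, hxy, hyy = txx - e[0], txy - e[1], tyy - e[2]

    det = txx * tyy - txy * txy
    if det == 0:
        raise ValueError("total SSCP matrix is singular")
    # trace(H T^-1), written out for symmetric 2x2 matrices.
    return (hxx * tyy - 2 * hxy * txy + hyy * txx) / det
```

Two identical clouds of points give a score of 0, while clouds that are far apart relative to their spread give a score close to 1, which is why lower scores are read as evidence of merger.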


Author(s):  
A. V. Crewe

We have become accustomed to differentiating between the scanning microscope and the conventional transmission microscope according to the resolving power which the two instruments offer. The conventional microscope is capable of a point resolution of a few angstroms and line resolutions of periodic objects of about 1Å. On the other hand, the scanning microscope, in its normal form, is not ordinarily capable of a point resolution better than 100Å. Upon examining reasons for the 100Å limitation, it becomes clear that this is based more on tradition than reason, and in particular, it is a condition imposed upon the microscope by adherence to thermal sources of electrons.


Author(s):  
Maxim B. Demchenko

The sphere of the unknown, supernatural and miraculous is one of the most popular subjects for everyday discussions in Ayodhya – the last of the provinces of the Mughal Empire, which entered the British Raj in 1859, and in the distant past – the space of many legendary and mythological events. Mostly they concern encounters with inhabitants of the “other world” – spirits, ghosts, jinns – as well as miraculous healings following magic rituals or meetings with the so-called saints of different religions (Hindu sadhus, Sufi dervishes), and incomprehensible and frightening natural phenomena. According to the author’s observations, ideas of the unknown are codified and structured in Avadh better than in other parts of India. Local people can clearly tell whether they are witnessing a bhut or a jinn, and whether a disease is caused by witchcraft or by other reasons. Perhaps that is due to the presence in the holy town of a persistent tradition of katha, the public presentation of plots from the Ramayana epic in narrative, poetic and performative forms. But are the events and phenomena in question a miracle for the Avadhvasis, residents of Ayodhya and its environs, or are they so commonplace that they do not surprise or fascinate? That is the subject of the essay, written on the basis of materials collected by the author in Ayodhya during the period 2010 – 2019. The author would like to express his appreciation to Mr. Alok Sharma (Faizabad) for his advice and cooperation.


1996 ◽  
Vol 2 (1) ◽  
pp. 111-122
Author(s):  
Joseph R. Stimpfl

The literature annotated here is from a subset of literature in cultural anthropology that deals with ethnographic fieldwork: the basic research exercise of cultural immersion. This bibliography is meant to offer a representative sample of literature in anthropology that deals with the fieldwork experiences of researchers. Cultural anthropology is devoted to the concept of “discovering the other.” Its method of inquiry is often referred to as participant/observation: the researcher lives the culture while observing it. Since so much of the fieldwork experience deals with personal adjustments to living in different cultures, the literature is charged with the problems of adjustment and understanding so common to study abroad experiences. This literature is particularly relevant to those interested in cross-cultural learning and issues in cultural adjustment. 


HortScience ◽  
1998 ◽  
Vol 33 (3) ◽  
pp. 452c-452 ◽  
Author(s):  
Schuyler D. Seeley ◽  
Raymundo Rojas-Martinez ◽  
James Frisby

Mature peach trees in pots were treated with nighttime temperatures of –3, 6, 12, and 18 °C for 16 h and a daytime temperature of 20 °C for 8 h until the leaves abscised in the colder treatments. The trees were then chilled at 6 °C for 40 to 70 days. Trees were removed from chilling at 40, 50, 60, and 70 days and placed in a 20 °C greenhouse under increasing daylength, spring conditions. Anthesis was faster and shoot length increased with longer chilling treatments. Trees exposed to –3 °C pretreatment flowered and grew best with 40 days of chilling. However, they did not flower faster or grow better than the other treatments with longer chilling times. There was no difference in flowering or growth between the 6 and 12 °C pretreatments. The 18 °C pretreatment resulted in slower flowering and very little growth after 40 and 50 days of chilling, but growth was comparable to other treatments after 70 days of chilling.


2020 ◽  
Vol 27 (3) ◽  
pp. 178-186 ◽  
Author(s):  
Ganesan Pugalenthi ◽  
Varadharaju Nithya ◽  
Kuo-Chen Chou ◽  
Govindaraju Archunan

Background: N-glycosylation is one of the most important post-translational mechanisms in eukaryotes. N-glycosylation predominantly occurs in the N-X-[S/T] sequon, where X is any amino acid other than proline. However, not all N-X-[S/T] sequons in proteins are glycosylated. Therefore, accurate prediction of N-glycosylation sites is essential to understand the N-glycosylation mechanism. Objective: In this article, our motivation is to develop a computational method to predict N-glycosylation sites in eukaryotic protein sequences. Methods: We report a random forest method, Nglyc, to predict N-glycosylation sites from protein sequence, using 315 sequence features. The method was trained using a dataset of 600 N-glycosylation sites and 600 non-glycosylation sites and tested on a dataset containing 295 N-glycosylation sites and 253 non-glycosylation sites. Nglyc prediction was compared with the NetNGlyc, EnsembleGly and GPP methods. Further, the performance of Nglyc was evaluated using human and mouse N-glycosylation sites. Results: The Nglyc method achieved an overall training accuracy of 0.8033 with all 315 features. Performance comparison with the NetNGlyc, EnsembleGly and GPP methods shows that Nglyc performs better than the other methods, with high sensitivity and specificity. Conclusion: Our method achieved an overall accuracy of 0.8248 with 0.8305 sensitivity and 0.8182 specificity. The comparison study shows that our method performs better than the other methods. The applicability and success of our method were further evaluated using human and mouse N-glycosylation sites. The Nglyc method is freely available at https://github.com/bioinformaticsML/Ngly.
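The N-X-[S/T] sequon rule stated in the abstract is simple enough to express as a pattern. The sketch below lists every candidate sequon in a protein sequence; it only finds candidates and makes no prediction about whether a site is actually glycosylated, which is the job of a predictor such as Nglyc. The function name `find_sequons` and the example sequence are hypothetical and are not taken from the Nglyc code.

```python
import re

# N, then any residue except proline, then S or T.
# The lookahead lets overlapping sequons (e.g. "NNTS") all be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence):
    """Return (1-based position of N, three-residue motif) for each candidate."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(1)) for m in SEQUON.finditer(seq)]
```

For the made-up sequence `MNGSANPTNNTS`, this returns `[(2, 'NGS'), (9, 'NNT'), (10, 'NTS')]`; the `NPT` at position 6 is skipped because X is proline.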

