TRANSFER LEARNING FOR CYTOCHROME P450 ISOZYME SELECTIVITY PREDICTION

2011 ◽  
Vol 09 (04) ◽  
pp. 521-540 ◽  
Author(s):  
REIJI TERAMOTO ◽  
TSUYOSHI KATO

In the drug discovery process, the metabolic fate of drugs is crucially important for preventing drug–drug interactions. P450 isozyme selectivity prediction is therefore an important task when screening for drugs with appropriate metabolism profiles. Recently, large-scale activity data for five P450 isozymes (CYP1A2, CYP2C9, CYP3A4, CYP2D6, and CYP2C19) have been obtained using quantitative high-throughput screening with a bioluminescence assay. Although some isozymes share similar selectivities, conventional supervised learning algorithms learn a prediction model for each P450 isozyme independently. They are unable to exploit activity data from the other P450 isozymes to improve the predictive performance for each isozyme's selectivity. To address this issue, we apply transfer learning, which uses activity data of the other isozymes to learn a prediction model from multiple P450 isozymes. Using the large-scale P450 isozyme selectivity dataset for five P450 isozymes, we evaluate the model's predictive performance. Experimental results show that, overall, our algorithm outperforms conventional supervised learning algorithms such as support vector machine (SVM), weighted k-nearest neighbor classifier, bagging, AdaBoost, and latent semantic indexing (LSI). Moreover, our results show that the predictive performance of our algorithm is improved by exploiting the multiple P450 isozyme activity data in the learning process. Our algorithm can be an effective tool for P450 selectivity prediction for new chemical entities using multiple P450 isozyme activity data.

2020 ◽  
Author(s):  
Xingyi Yang ◽  
Xuehai He ◽  
Yuxiao Liang ◽  
Yue Yang ◽  
Shanghang Zhang ◽  
...  

Pretraining has become a standard technique in computer vision and natural language processing, and it usually improves performance substantially. Previously, the dominant pretraining method was transfer learning (TL), which uses labeled data to learn a good representation network. Recently, a new pretraining approach, self-supervised learning (SSL), has demonstrated promising results on a wide range of applications. SSL does not require annotated labels; it is conducted purely on input data by solving auxiliary tasks defined on the input examples. Currently reported results show that SSL outperforms TL in certain applications, while the reverse holds in others. There is no clear understanding of which properties of data and tasks make one approach outperform the other. Without an informed guideline, ML researchers have to try both methods to find out empirically which one is better, which is usually time-consuming. In this work, we aim to address this problem. We perform a comprehensive comparative study of SSL and TL, examining which one works better under different properties of data and tasks, including the domain difference between source and target tasks, the amount of pretraining data, class imbalance in the source data, and the use of target data for additional pretraining. The insights distilled from our comparative studies can help ML researchers decide which method to use based on the properties of their applications.


2021 ◽  
Vol 12 (5) ◽  
pp. 1-19
Author(s):  
Xingjian Li ◽  
Haoyi Xiong ◽  
Zeyu Chen ◽  
Jun Huan ◽  
Cheng-Zhong Xu ◽  
...  

Ensemble learning is a widely used technique to train deep convolutional neural networks (CNNs) for improved robustness and accuracy. While existing algorithms usually first train multiple diversified networks and then assemble these networks into an aggregated classifier, we propose a novel learning paradigm, namely "In-Network Ensemble" (INE), that incorporates the diversity of multiple models through training a single deep neural network. Specifically, INE segments the outputs of the CNN into multiple independent classifiers, where each classifier is further fine-tuned for better accuracy through a so-called diversified knowledge distillation process. We then aggregate the fine-tuned independent classifiers using an Averaging-and-Softmax operator to obtain the final ensemble classifier. Note that, in the supervised learning setting, INE starts the CNN training from random initialization, while in the transfer learning setting it can also start from a pre-trained model to incorporate knowledge learned from additional datasets. Extensive experiments have been done using eight large-scale real-world datasets, including CIFAR, ImageNet, and Stanford Cars, among others, as well as common deep network architectures such as VGG, ResNet, and Wide ResNet. We have evaluated the method under two tasks: supervised learning and transfer learning. The results show that INE outperforms the state-of-the-art algorithms for deep ensemble learning with improved accuracy.
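
The abstract names an Averaging-and-Softmax operator without defining it. The following is a minimal sketch under the assumption that it averages the per-class logits of the independent classifiers and then applies a softmax; the function names and the example logits are hypothetical and are not taken from the paper.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def average_and_softmax(head_logits):
    """Average per-class logits across heads, then normalize with softmax.

    head_logits: list of per-head logit lists, all of the same length.
    Assumption: this is one plausible reading of "Averaging-and-Softmax".
    """
    n_heads = len(head_logits)
    n_classes = len(head_logits[0])
    mean_logits = [
        sum(head[c] for head in head_logits) / n_heads
        for c in range(n_classes)
    ]
    return softmax(mean_logits)


# Hypothetical logits from three classifier heads over four classes.
heads = [
    [2.0, 0.5, 0.1, -1.0],
    [1.5, 1.0, 0.0, -0.5],
    [2.5, 0.0, 0.2, -1.5],
]
probs = average_and_softmax(heads)
predicted_class = max(range(len(probs)), key=probs.__getitem__)
```

Averaging before the softmax keeps the result a valid probability distribution regardless of how many heads are combined.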


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Surya Krishnamurthy ◽  
Kathiravan Srinivasan ◽  
Saeed Mian Qaisar ◽  
P. M. Durai Raj Vincent ◽  
Chuan-Yu Chang

Pneumonitis is an infectious disease that causes inflammation of the air sacs. It can be life-threatening to the very young and the elderly. Detection of pneumonitis from X-ray images is a significant challenge, and early detection and assistance with diagnosis can be crucial. Recent developments in deep learning have significantly improved performance in medical image analysis. The superior predictive performance of deep learning methods makes them well suited for pneumonitis classification from chest X-ray images. However, training deep learning models can be cumbersome and resource-intensive. Reusing the knowledge representations of public models trained on large-scale datasets through transfer learning can help alleviate these challenges. In this paper, we compare various image classification models based on transfer learning with well-known deep learning architectures. The Kaggle chest X-ray dataset was used to evaluate and compare our models. We apply basic data augmentation and fine-tune our feed-forward classification head on models pretrained on the ImageNet dataset. We observed that the DenseNet201 model outperforms the other models, with an AUROC score of 0.966 and a recall score of 0.99. We also visualize the class activation maps from the DenseNet201 model to interpret the patterns the model recognizes for prediction.
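
The two reported metrics, AUROC and recall, can be computed from labels and classifier scores as in the sketch below. The labels, scores, and the 0.5 decision threshold are made up for illustration and are unrelated to the paper's data; AUROC is computed here by its pairwise-ranking definition rather than by any particular library routine.

```python
def recall(y_true, y_pred):
    """Fraction of actual positives that were predicted positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp / (tp + fn)


def auroc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = pneumonitis) and model scores.
labels = [1, 1, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.2, 0.7, 0.1, 0.3]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

example_auroc = auroc(labels, scores)
example_recall = recall(labels, preds)
```

AUROC depends only on how the scores rank positives against negatives, whereas recall depends on the chosen threshold, which is why the two are usually reported together.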


2017 ◽  
Vol 4 (1) ◽  
pp. 41-61
Author(s):  
Pelin Sönmez ◽  
Abulfaz Süleymanov

With the Syrian War, ongoing since 2011, Turkey has been experiencing the most intense wave of forced migration in the history of the Republic. The integration of Syrian citizens under temporary protection status into Turkish society in every respect should be considered among the policy priorities of today and of the future. At the same time, the adaptation of asylum seekers admitted to the country to the host society in which they live, without losing their own cultural identity, is important for the development of a culture of living together. This article sets out state policies aimed at the acceptance by Turkish society of Syrian citizens received as "guests" and at reducing their risk of exclusion, examines the legal framework and public service initiatives offered by the European Union (EU) to member and candidate countries to prevent the exclusion of migrants, and addresses levels of social acceptance toward Syrian citizens within the framework of a culture of living together. The study consists of two main parts: the effect of the laws and practices adopted to prevent the social exclusion of migrants and asylum seekers, and the results of a social perception study on Syrian asylum seekers in the Istanbul-Sultanbeyli area. The level of social acceptance toward Syrians residing in the area is found to be high, and the public's sense of cultural and religious closeness to Syrians affects the level of social acceptance positively.

ABSTRACT IN ENGLISH: An evaluation of the European Union and Turkish policies regarding the culture of living together

This article aims to determine the level of social acceptance toward Syrians within the context of a culture of cohabitation. It does so by evaluating the EU's legal structure and public service initiatives intended to prevent Syrian refugees from being excluded in member and candidate countries, and by setting out government policies on the acceptance of Syrians as "guests" by Turkish society and on minimizing their risk of exclusion.
The article consists of two main parts: one on the effects of the laws and practices that protect refugees and asylum seekers from social exclusion, and the other on the results of a social perception study on Syrians in the Sultanbeyli district of Istanbul. Five years into the Syrian War, it is clear that most of the more than 3 million Syrians in Turkey, including unregistered ones, are "here to stay". From this point of view, the primary scope of policies should be specified in order to mitigate the side effects of the refugee phenomenon, which is seen as a weighty matter, by minimizing the exclusion of these people. To avoid possible large-scale conflicts or civil wars in the future, the struggle against exclusion plays a crucial role in Turkey's sociological situation and policy development. With a view to forming a model for Turkey, one subsection of this article covers public services under the Europe-wide legal acquis and the practices carried out since the 1970s to prevent exclusion from society. Other subsections cover legal infrastructure and practices such as the Common European Asylum and Immigration Policies presented in 2005 and the Law on Foreigners and International Protection introduced in 2013. In the last part of the article, the results of a field survey carried out in a district of Istanbul are used to analyze the exclusion of refugees in Turkey. A face-to-face survey was randomly conducted with 200 settled refugees in the Sultanbeyli district of Istanbul, and their perceptions of Syrian people under temporary protection were evaluated. According to the results, the level of acceptance of Syrians living in this district seems relatively high.
The fact that Turkish people living in the same district feel culturally and religiously close to Syrian refugees affects their perception positively; however, it is striking that local residents take a negative attitude toward the refugees becoming Turkish citizens.


2020 ◽  
Vol 6 (5) ◽  
pp. 1183-1189
Author(s):  
Dr. Tridibesh Tripathy ◽  
Dr. Umakant Prusty ◽  
Dr. Chintamani Nayak ◽  
Dr. Rakesh Dwivedi ◽  
Dr. Mohini Gautam

This article concerns the ASHAs of Uttar Pradesh (UP), who are daughters-in-law of families residing in the same communities that they have served as grassroots health workers since 2005, when the NRHM was introduced in the Empowered Action Group (EAG) states. UP is one such EAG state. The current study explores the actual responses of Recently Delivered Women (RDWs) on the visits they received during the first month after their recent delivery. From the catchment area of each of the 250 ASHAs, two RDWs were selected who had a child aged 3 to 6 months at the time of the survey. The response profiles of the RDWs on the post-delivery first-month visits are examined to build a picture representing the entire state of UP. The study is relevant because detailed data on the modalities of postnatal visits are available, but not exclusively for the first month after delivery. Details of visits in the first month after delivery are not available even in large-scale surveys such as the National Family Health Survey 4, conducted in 2015-16. The current study gives insight into these visits with a five-point approach, i.e., the type of personnel making the visit, the frequency of the visits, and, separately for each of the three visits, the week among the four weeks in which the visit was made. The current study is essentially a summary of this five-point ("Penta") approach for the one-month period after delivery. The first month after each delivery covers 70% of the postnatal period and the entire neonatal period. Therefore, it affects the Maternal Mortality Rate and Ratio (MMR) and the Neonatal Mortality Rate (NMR) in India, and especially in UP, through unsafe maternal and neonatal practices in the first month after delivery. The current MM Rate of UP is 20.1 and the MM Ratio is 216, whereas the MM Ratio is 122 in India (SRS, 2019).
The Sample Registration System (SRS) report also mentions that the Life Time Risk (LTR) of a woman in pregnancy is 0.7%, which is the highest in the nation (SRS, 2019). This means it is very risky to give birth in UP compared with other regions of the country (SRS, 2019). This risk peaks in the first month after each delivery. Similarly, the current NMR in India is 23 per 1000 live births (UNIGME, 2018). As NMR data are not available separately for states, the national-level data are taken to hold for the states, and thus for UP as well. These mortality rates are impact indicators, and such indicators can be reduced through long-drawn processes that include effective and timely visits to RDWs, especially in the first month after delivery. This would help make their postnatal and neonatal stages safe. The current article details this profile of post-delivery first-month visits in relation to the respondents' recent delivery. Four districts of Uttar Pradesh were selected purposively for the study, and data collection was conducted in the villages of the respective districts with the help of a pre-tested structured interview schedule containing both closed-ended and open-ended questions. The current article deals with five closed-ended questions with options: two on the type of personnel and the frequency of visits, and one for each of the three visits in the first month after the respondents' recent delivery. In addition, in-depth interviews were conducted among the RDWs, and a total of 500 respondents participated in the study. Across the districts covered by this article, the results showed that ASHAs were the personnel who made the majority of visits in all four districts. On the other hand, 25-40% of RDWs in all four districts replied that they did not receive any visit within the first month after their recent delivery.
Regarding frequency, most of the RDWs in all four districts received 1-2 visits from ASHAs. Regarding the first visit, the ASHAs of Barabanki and Gonda visited a smaller percentage of RDWs in the first week after delivery. Regarding the second visit, about 1.2% of RDWs in Banda district could not recall the visit. Also on the second visit, most RDWs in three districts, the exception being Gonda, received the second postnatal visit 7-15 days after their recent delivery. Fewer than half of RDWs in Barabanki district and just over half of RDWs in Gonda district received the third visit in the 15-21 day period after delivery. For the same period, the majority of RDWs in the remaining two districts responded that they had received a home visit.


2014 ◽  
Vol 6 (2) ◽  
pp. 46-51
Author(s):  
Galang Amanda Dwi P. ◽  
Gregorius Edwadr ◽  
Agus Zainal Arifin

Nowadays, a large amount of information cannot be reached by readers because text-based documents are misclassified. Misclassified data can also lead readers to the wrong information. The method proposed in this paper aims to classify documents into the correct group. Each document has a membership value in several different classes. Semantic similarity is used to find the degree of similarity between two documents. In fact, no document is entirely unrelated to the others, although the relationship may be close to 0. The method calculates the similarity between two documents by taking into account the level of similarity of their words and their synonyms. After all inter-document similarity values have been obtained, a matrix is created and then used as a semi-supervised factor. The output of the method is the membership value of each document in each class; the greatest membership value for a document indicates the group to which it is assigned. The classification result computed by the method shows a good value of 90%. Index Terms - Fuzzy co-clustering, Heuristic, Semantic Similarity, Semi-supervised learning.


2020 ◽  
Vol 26 (33) ◽  
pp. 4195-4205
Author(s):  
Xiaoyu Ding ◽  
Chen Cui ◽  
Dingyan Wang ◽  
Jihui Zhao ◽  
Mingyue Zheng ◽  
...  

Background: Enhancing a compound's biological activity is the central task of lead optimization in small-molecule drug discovery. However, performing many iterative rounds of compound synthesis and bioactivity testing is laborious. To address this issue, high-quality in silico bioactivity prediction approaches are in high demand, to prioritize more active compound derivatives and reduce the trial-and-error process. Methods: Two kinds of bioactivity prediction models based on a large-scale structure-activity relationship (SAR) database were constructed. The first is based on the similarity of substituents and realized by matched molecular pair analysis, including SA, SA_BR, SR, and SR_BR. The second is based on SAR transferability and realized by matched molecular series analysis, including Single MMS pair, Full MMS series, and Multi single MMS pairs. Moreover, we defined the application domain of the models by using a distance-based threshold. Results: Among the seven individual models, the Multi single MMS pairs bioactivity prediction model showed the best performance (R² = 0.828, MAE = 0.406, RMSE = 0.591), and the baseline model (SA) produced the lowest prediction accuracy (R² = 0.798, MAE = 0.446, RMSE = 0.637). The predictive accuracy could be further improved by consensus modeling (R² = 0.842, MAE = 0.397, RMSE = 0.563). Conclusion: An accurate prediction model for bioactivity was built with a consensus method, which was superior to all individual models. Our model should be a valuable tool for lead optimization.
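
For readers unfamiliar with the three reported metrics, the sketch below computes R², MAE, and RMSE for two hypothetical models and for a consensus prediction. The abstract does not say how the consensus is formed; an unweighted mean of the individual predictions is assumed here, and all activity values are invented for illustration.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def consensus(*model_predictions):
    """Unweighted mean of the models' predictions for each compound
    (an assumed combination rule, not necessarily the paper's)."""
    return [sum(vals) / len(vals) for vals in zip(*model_predictions)]


# Hypothetical measured activities and two models' predictions.
measured = [6.0, 7.0, 8.0, 5.0]
model_a = [6.5, 7.0, 7.5, 5.0]
model_b = [5.5, 7.5, 8.0, 5.5]
combined = consensus(model_a, model_b)

for name, pred in [("A", model_a), ("B", model_b), ("consensus", combined)]:
    print(name, round(r2(measured, pred), 4), mae(measured, pred),
          round(rmse(measured, pred), 4))
```

In this toy data the two models err in different directions on most compounds, so averaging cancels part of the error; a consensus is not guaranteed to beat the best individual model in general.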

