Perplexity-based molecule ranking and bias estimation of chemical language models

Author(s):  
Michael Moret ◽  
Francesca Grisoni ◽  
Paul Katzberger ◽  
Gisbert Schneider

Chemical language models (CLMs) can be employed to design molecules with desired properties. CLMs generate new chemical structures in the form of textual representations, such as simplified molecular input line entry system (SMILES) strings, in a rule-free manner. However, the quality of these de novo generated molecules is difficult to assess a priori. In this study, we apply the perplexity metric to determine the degree to which the molecules generated by a CLM match the desired design objectives. This model-intrinsic score allows identifying and ranking the most promising molecular designs based on the probabilities learned by the CLM. Using perplexity to compare “greedy” (beam search) with “explorative” (multinomial sampling) methods for SMILES generation, certain advantages of multinomial sampling become apparent. Additionally, perplexity scoring is performed to identify undesired model biases introduced during model training and allows the development of a new ranking system to remove those undesired biases.
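To make the ranking idea concrete, the following is a minimal sketch of perplexity-based scoring, assuming the standard definition of perplexity as the exponential of the mean negative log-likelihood per token. The function names and the per-token probabilities are hypothetical and are not taken from the paper; in practice the probabilities would come from the CLM for each SMILES character.

```python
import math

def perplexity(token_probs):
    """Perplexity of one sequence, given the probability the model
    assigned to each of its tokens (every probability must be > 0)."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Mean negative log-likelihood per token, then exponentiate.
    mean_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    return math.exp(mean_nll)

def rank_by_perplexity(scored):
    """Sort (smiles, token_probs) pairs from lowest to highest perplexity.
    Lower perplexity means the model found the string more likely."""
    pairs = [(smiles, perplexity(probs)) for smiles, probs in scored]
    return sorted(pairs, key=lambda pair: pair[1])

# Illustrative values only: these probabilities are made up.
designs = [
    ("CCO", [0.5, 0.5, 0.5]),
    ("CCN", [0.9, 0.9, 0.9]),
]
ranking = rank_by_perplexity(designs)
```

Under this sketch, a design whose tokens all receive high probability ends up near the top of the ranking, and normalizing by sequence length keeps long and short strings comparable.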

2021 ◽  
Author(s):  
Michael Moret ◽  
Moritz Helmstädter ◽  
Francesca Grisoni ◽  
Gisbert Schneider ◽  
Daniel Merk

Chemical language models enable de novo drug design without the requirement for explicit molecular construction rules. While such models have been applied to generate novel compounds with desired bioactivity, the actual prioritization and selection of the most promising computational designs remains challenging. In this work, we leveraged the probabilities learnt by chemical language models with the beam search algorithm as a model-intrinsic technique for automated molecule design and scoring. Prospective application of this method yielded three novel inverse agonists of retinoic acid receptor-related orphan receptors (RORs). Each design was synthesizable in three reaction steps and presented low-micromolar to nanomolar potency towards RORγ. This model-intrinsic sampling technique eliminates the strict need for external compound scoring functions, thereby further extending the applicability of generative artificial intelligence to data-driven drug discovery.
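As an illustration of how beam search can both generate and score sequences from model probabilities, here is a minimal generic sketch. It is not the authors' implementation: `toy_model`, the start token `^`, the end token `$`, and all probabilities are hypothetical stand-ins for a trained chemical language model.

```python
import math

def beam_search(next_probs, start, end, width=2, max_len=10):
    """Keep the `width` most probable partial sequences at each step.
    `next_probs(seq)` returns a dict mapping next token -> probability.
    Returns finished (sequence, log_probability) pairs, best first."""
    beams = [([start], 0.0)]
    finished = []
    for _ in range(max_len):
        candidates = []
        for seq, score in beams:
            for token, p in next_probs(seq).items():
                if p > 0:
                    candidates.append((seq + [token], score + math.log(p)))
        # Rank all extensions by cumulative log-probability.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for seq, score in candidates[:width]:
            if seq[-1] == end:
                finished.append((seq, score))
            else:
                beams.append((seq, score))
        if not beams:
            break
    return sorted(finished, key=lambda c: c[1], reverse=True)

def toy_model(seq):
    """Hypothetical next-token distribution; a real CLM would go here."""
    if len(seq) >= 3:
        return {"$": 1.0}
    return {"C": 0.6, "O": 0.3, "$": 0.1}

results = beam_search(toy_model, start="^", end="$", width=2)
best_sequence, best_log_prob = results[0]
```

The cumulative log-probability returned with each sequence is the kind of model-intrinsic score the abstract refers to: no external scoring function is needed to order the finished designs.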


2021 ◽  
Vol 11 (2) ◽  
Author(s):  
Patrícia Aline Gröhs Ferrareze ◽  
Corinne Maufrais ◽  
Rodrigo Silva Araujo Streit ◽  
Shelby J Priest ◽  
Christina A Cuomo ◽  
...  

Abstract Evaluating the quality of a de novo annotation of a complex fungal genome based on RNA-seq data remains a challenge. In this study, we sequentially optimized a Cufflinks-CodingQuary-based bioinformatics pipeline fed with RNA-seq data using the manually annotated model pathogenic yeasts Cryptococcus neoformans and Cryptococcus deneoformans as test cases. Our results show that the quality of the annotation is sensitive to the quantity of RNA-seq data used and that the best quality is obtained with 5–10 million reads per RNA-seq replicate. We also showed that the number of introns predicted is an excellent a priori indicator of the quality of the final de novo annotation. We then used this pipeline to annotate the genome of the RNAi-deficient species Cryptococcus deuterogattii strain R265 using RNA-seq data. Dynamic transcriptome analysis revealed that intron retention is more prominent in C. deuterogattii than in the other RNAi-proficient species C. neoformans and C. deneoformans. In contrast, we observed that antisense transcription was not higher in C. deuterogattii than in the two other Cryptococcus species. Comparative gene content analysis identified 21 clusters enriched in transcription factors and transporters that have been lost. Interestingly, analysis of the subtelomeric regions in these three annotated species identified a similar gene enrichment, reminiscent of the structure of primary metabolic clusters. Our data suggest that there is active exchange between subtelomeric regions, and that other chromosomal regions might participate in adaptive diversification of Cryptococcus metabolite assimilation potential.


Author(s):  
Patricia A.G. Ferrareze ◽  
Corinne Maufrais ◽  
Rodrigo Silva Araujo Streit ◽  
Shelby J. Priest ◽  
Christina A. Cuomo ◽  
...  

Evaluating the quality of a de novo annotation of a complex fungal genome based on RNA-seq data remains a challenge. In this study, we sequentially optimized a Cufflinks-CodingQuary based bioinformatics pipeline fed with RNA-seq data using the manually annotated model pathogenic yeasts Cryptococcus neoformans and Cryptococcus deneoformans as test cases. Our results demonstrate that the quality of the annotation is sensitive to the quantity of RNA-seq data used and that the best quality is obtained with 5 to 10 million reads per RNA-seq replicate. We also demonstrated that the number of introns predicted is an excellent a priori indicator of the quality of the final de novo annotation. We then used this pipeline to annotate the genome of the RNAi-deficient species Cryptococcus deuterogattii strain R265 using RNA-seq data. Dynamic transcriptome analysis revealed that intron retention is more prominent in C. deuterogattii than in the other RNAi-proficient species C. neoformans and C. deneoformans. In contrast, we observed that antisense transcription was not higher in C. deuterogattii than in the two other Cryptococcus species. Comparative gene content analysis identified 21 clusters enriched in transcription factors and transporters that have been lost. Interestingly, analysis of the subtelomeric regions in these three annotated species identified a similar gene enrichment, reminiscent of the structure of primary metabolic clusters. Our data suggest that there is active exchange between subtelomeric regions, and that other chromosomal regions might participate in adaptive diversification of Cryptococcus metabolite assimilation potential.


2021 ◽  
Author(s):  
Michael Moret ◽  
Francesca Grisoni ◽  
Cyrill Brunner ◽  
Gisbert Schneider

Generative chemical language models (CLMs) can be used for de novo molecular structure generation. These CLMs learn from the structural information of known molecules to generate new ones. In this paper, we show that “hybrid” CLMs can additionally leverage the bioactivity information available for the training compounds. To computationally design ligands of phosphoinositide 3-kinase gamma (PI3Kγ), we created a large collection of virtual molecules with a generative CLM. This primary virtual compound library was further refined using a CLM-based classifier for bioactivity prediction. This second hybrid CLM was pretrained with patented molecular structures and fine-tuned with known PI3Kγ binders and non-binders by transfer learning. Several of the computer-generated molecular designs were commercially available, which allowed for fast prescreening and preliminary experimental validation. A new PI3Kγ ligand with sub-micromolar activity was identified. The results positively advocate hybrid CLMs for virtual compound screening and activity-focused molecular design in low-data situations.


GIS Business ◽  
2019 ◽  
Vol 14 (4) ◽  
pp. 85-98
Author(s):  
Idoko Peter

This research examines the impact of a competitive quasi-market on service delivery in Benue State University, Makurdi, Nigeria. Both primary and secondary sources of data and information were used for the study, and a questionnaire was used to extract information from the purposively selected respondents. The population for this study is one hundred and seventy-three (173) administrative staff of Benue State University selected at random. The statistical tool employed was classical ordinary least squares (OLS), and the probability values of the estimates were used to test the hypotheses of the study. The results of the study indicate that a positive relationship exists between competitive quasi marketing in Benue State University, Makurdi, Nigeria (CQM) and transparency in service delivery (TRSP), and the relationship is statistically significant (p<0.05). Competitive quasi marketing (CQM) has a negative effect on Observe Competence in Benue State University, Makurdi, Nigeria (OBCP), and the relationship is not statistically significant (p>0.05). Competitive quasi marketing (CQM) has a positive effect on innovation in Benue State University, Makurdi, Nigeria (INVO), and the relationship is statistically significant (p<0.05) and in line with a priori expectation. This means that a unit increase in competitive quasi marketing (CQM) will result in a corresponding increase in innovation in Benue State University, Makurdi, Nigeria (INVO) by a margin of 22.5%. It was concluded that government monopoly in the provision of certain types of services has greatly affected the quality of service experienced in the institution. It was recommended, among other things, that the stakeholders in the market have to be transparent so that the system will be productive and serve society effectively.


Author(s):  
Oksana Bitlian ◽  
Oksana Kravchenko ◽  
Tetiana Kodak ◽  
Andrii Onyshchenko ◽  
Tetiana Konks

An analysis of the literature shows that the type of packaging and the material from which it is made hold an important place among the factors that influence the storage of feed products and help prevent a decline in the quality of raw materials and finished products. The purpose of our research was therefore to provide a technological justification for the changes in the quality indexes of premix samples containing trace element salts of different chemical nature during storage. Common zootechnical and statistical research methods were used. Premixes for feeding pigs should be used with regard to the biogeochemical properties of the region for which they are formulated. Depending on regional properties, feeds have a specific biochemical composition, and an excess or lack of individual substances should be offset by the composition of the premix. Ignoring this provision inevitably leads to inappropriate use of BAR, an imbalance of the diet relative to physiological needs, and inefficiency of the industry. This in turn requires that products be purchased and stored for the period of use. BAR of different chemical structures react differently during storage and change their quality indexes, which reduces the productive activity of the active substances. The humidity of the premixes varied within 12.0-13.0%, which exceeded the norm but was not critical; the premix with sulfuric acid salts had the highest acidity (6.9 units) and the premix with lysates the lowest (5.7 units). Positive qualitative reactions were found for the presence of vitamins A, D and B2 and of macro- and micronutrients: potassium, magnesium, copper, zinc, cobalt and iodine.
These changes in the properties of premixes during storage must be taken into account when establishing the technological basis for feeding pigs in order to obtain high gains and high-quality products. Key words: premix, micro- and macroelements, combined fodders, fodder mixes, chelating compounds, feeding, using, pigs' livestock.


2019 ◽  
Vol 14 (2) ◽  
pp. 93-116 ◽  
Author(s):  
Shabnam Mohebbi ◽  
Mojtaba Nasiri Nezhad ◽  
Payam Zarrintaj ◽  
Seyed Hassan Jafari ◽  
Saman Seyed Gholizadeh ◽  
...  

Biomedical engineering seeks to enhance the quality of life by developing advanced materials and technologies. Chitosan-based biomaterials have attracted significant attention because of their unique chemical structures, which play different roles in membranes, sponges and scaffolds, along with promising biological properties such as biocompatibility, biodegradability and non-toxicity. Therefore, chitosan derivatives have been used in a wide variety of applications, chiefly in pharmaceuticals and biomedical engineering. Here we attempt to give a comprehensive overview of emerging applications of chitosan in medicine, tissue engineering, drug delivery, gene therapy, cancer therapy, ophthalmology, dentistry, bio-imaging, bio-sensing and diagnosis. The use of stem cells (SCs) has added an interesting dimension to the use of chitosan, so that regenerative medicine and therapeutic methods have benefited from chitosan-based platforms. Many of the most recent discussions with stimulating ideas in this field are covered, which could hopefully serve as hints for more advanced work in biomedical engineering.


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
T. Dolev ◽  
S. Zubedat ◽  
Z. Brand ◽  
B. Bloch ◽  
E. Mader ◽  
...  

Abstract Lack of established knowledge and treatment strategies, and change in work environment, may altogether critically affect the mental health and functioning of physicians treating COVID-19 patients. Thus, we examined whether treating COVID-19 patients affects physicians’ mental health differently compared with physicians treating non-COVID-19 patients. In this cohort study, an association was blindly computed between physiologically measured anxiety and attention vigilance (collected from 1 May 2014 to 31 May 2016) and self-reports of anxiety, mental health aspects, and sleep quality (collected from 20 April to 30 June 2020, and analyzed from 1 July to 1 September 2020), of 91 physicians treating COVID-19 or non-COVID-19 patients. As hypothesized a priori, physicians treating COVID-19 patients showed a relative elevation in both physiological measures of anxiety (95% CI: 2317.69–2453.44 versus 1982.32–2068.46; P < 0.001) and attention vigilance (95% CI: 29.85–34.97 versus 22.84–26.61; P < 0.001), compared with their colleagues treating non-COVID-19 patients. At least 3 months into the pandemic, physicians treating COVID-19 patients reported high anxiety and low quality of sleep. Machine learning showed clustering to the COVID-19 and non-COVID-19 subgroups with a high correlation mainly between physiological and self-reported anxiety, and between physiologically measured anxiety and sleep duration. To conclude, the pattern of attention vigilance, heightened anxiety, and reduced sleep quality findings points to the need for mental health intervention aimed at those physicians susceptible to developing post-traumatic stress symptoms, owing to the consequences of fighting at the forefront of the COVID-19 pandemic.


2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Soter Ameh ◽  
Bolarinwa Oladimeji Akeem ◽  
Caleb Ochimana ◽  
Abayomi Olabayo Oluwasanu ◽  
Shukri F. Mohamed ◽  
...  

Abstract Background Universal health coverage is one of the Sustainable Development Goal targets known to improve population health and reduce financial burden. There is little qualitative data on access to and quality of primary healthcare in East and West Africa. The aim of this study was to describe the viewpoints of healthcare users, healthcare providers and other stakeholders on health-seeking behaviour, access to and quality of healthcare in seven communities in East and West Africa. Methods A qualitative study was conducted in four communities in Nigeria and one community each in Kenya, Uganda and Tanzania in 2018. Purposive sampling was used to recruit: 155 respondents (mostly healthcare users) for 24 focus group discussions, 25 healthcare users, healthcare providers and stakeholders for in-depth interviews and 11 healthcare providers and stakeholders for key informant interviews. The conceptual framework in this study combined elements of the Health Belief Model, Health Care Utilisation Model, four ‘As’ of access to care, and pathway model to better understand the a priori themes on access to and quality of primary healthcare as well as health-seeking behaviours of the study respondents. A content analysis of the data was done using MAXQDA 2018 qualitative software to identify these a priori themes and emerging themes. Results Access to primary healthcare in the seven communities was limited, especially use of health insurance. Quality of care was perceived to be unacceptable in public facilities whereas cost of care was unaffordable in private facilities. Health providers and users as well as stakeholders highlighted shortage of equipment, frequent drug stock-outs and long waiting times as major issues, but had varying opinions on satisfaction with care. Use of herbal medicines and other traditional treatments delayed or deterred seeking modern healthcare in the Nigerian sites. 
Conclusions There was a substantial gap in primary healthcare coverage and quality in the selected communities in rural and urban East and West Africa. Alternative models of healthcare delivery that address social and health inequities, through affordable health insurance, can be used to fill this gap and facilitate achieving universal health coverage.

