diversity constraint
Recently Published Documents


TOTAL DOCUMENTS: 13 (FIVE YEARS 5)

H-INDEX: 4 (FIVE YEARS 0)

2021 ◽ Vol 13 (1) ◽ Author(s): Jules Leguy ◽ Marta Glavatskikh ◽ Thomas Cauchy ◽ Benoit Da Mota

Abstract: Chemical diversity is one of the key terms when dealing with machine learning and molecular generation. This is particularly true for quantum chemical datasets, whose composition should be planned meticulously because the calculations are highly time-demanding. We have previously seen that QM9, the best-known quantum chemical dataset, lacks chemical diversity. As a consequence, ML models trained on QM9 showed generalizability shortcomings. In this paper we present (i) a fast and generic method to evaluate chemical diversity, (ii) a new quantum chemical dataset of 435k molecules, OD9, that includes QM9 and new molecules generated with a diversity objective, and (iii) an analysis of the impact of diversity on unconstrained and goal-directed molecular generation, using QED optimization as an example. Our approach makes it possible to estimate individually the impact of a solution on the diversity of a set, allowing for effective incremental evaluation. In the first application, we show how the diversity constraint allows us to generate more than a million molecules that would efficiently complete the reference datasets. The compounds were calculated with DFT thanks to a collaborative effort through the QuChemPedIA@home BOINC project. With regard to goal-directed molecular generation, getting a high QED score is not complicated, but adding a little diversity can cut the number of calls to the evaluation function by a factor of ten.
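
The idea of estimating how much a single solution adds to the diversity of a set can be illustrated with a minimal sketch. It assumes, purely for illustration, that diversity is measured as the number of distinct structural features present in the set; the abstract does not specify the measure, and the class name `FeatureDiversity` and the string features below are hypothetical.

```python
from typing import Iterable, Set


class FeatureDiversity:
    """Tracks the diversity of a growing set as the count of distinct features.

    Assumption: each item is described by a collection of hashable features
    (for molecules these could be substructure identifiers).
    """

    def __init__(self) -> None:
        self._seen: Set[str] = set()

    @property
    def score(self) -> int:
        """Current diversity of the set: number of distinct features seen."""
        return len(self._seen)

    def gain(self, features: Iterable[str]) -> int:
        """Diversity the candidate would add, without modifying the set."""
        return len(set(features) - self._seen)

    def add(self, features: Iterable[str]) -> int:
        """Add the candidate to the set and return its diversity gain."""
        new = set(features) - self._seen
        self._seen |= new
        return len(new)


tracker = FeatureDiversity()
tracker.add(["C-C", "C-O"])            # first item brings two new features
print(tracker.gain(["C-O", "C-N"]))    # only "C-N" is new
```

Because the gain of a candidate depends only on the features already seen, each evaluation costs time proportional to the size of the candidate rather than the size of the whole set, which is what makes incremental evaluation cheap in this toy setting.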


In this study we propose an automatic single-document text summarization technique that combines Latent Semantic Analysis (LSA) with a diversity constraint. The proposed technique uses query-based sentence ranking. Because we do not consider an Information Retrieval (IR) setting, we generate the query using TF-IDF (Term Frequency-Inverse Document Frequency): to produce the query vector, we identify the terms with high IDF. LSA uses vectorial semantics to analyze the relationships between documents in a corpus, or between sentences within a document, and the key terms they carry, by producing a set of concepts related to the documents and terms. LSA thus helps to represent the latent structure of documents. To select sentences from the document, Latent Semantic Indexing (LSI) is used; it ranks the sentences by score. Traditionally the highest-scoring sentences are chosen for the summary, but here we also calculate the diversity between the chosen sentences before producing the final summary, since a good summary should be as diverse as possible. The proposed technique is evaluated on OpinosisDataset1.0.
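
The pipeline described above (TF-IDF query, LSA projection, diversity-aware selection) can be sketched as follows. This is an illustrative reading, not the authors' implementation: whitespace tokenization, the `rank`, `top_terms` and `max_sim` parameters, and the greedy similarity filter are all assumptions.

```python
import math
from collections import Counter

import numpy as np


def tfidf_matrix(sentences):
    """Return (idf vector, sentence-by-term TF-IDF matrix)."""
    tokenized = [s.lower().split() for s in sentences]
    terms = sorted({t for toks in tokenized for t in toks})
    index = {t: i for i, t in enumerate(terms)}
    n = len(sentences)
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = np.array([math.log(n / df[t]) + 1.0 for t in terms])
    tf = np.zeros((n, len(terms)))
    for row, toks in enumerate(tokenized):
        for t, count in Counter(toks).items():
            tf[row, index[t]] = count
    return idf, tf * idf


def cosine(a, b):
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def summarize(sentences, k=2, rank=2, top_terms=3, max_sim=0.8):
    idf, X = tfidf_matrix(sentences)
    # Query vector built from the highest-IDF terms (assumed construction).
    query = np.zeros(len(idf))
    top = np.argsort(-idf)[:top_terms]
    query[top] = idf[top]
    # LSA: keep the leading right singular vectors as the latent basis.
    _, S, Vt = np.linalg.svd(X, full_matrices=False)
    basis = Vt[: min(rank, len(S))].T
    latent = X @ basis
    q = query @ basis
    scores = [cosine(row, q) for row in latent]
    # Greedy selection: skip sentences too similar to ones already chosen.
    chosen = []
    for i in sorted(range(len(sentences)), key=lambda j: -scores[j]):
        if all(cosine(latent[i], latent[j]) <= max_sim for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    return [sentences[i] for i in sorted(chosen)]


reviews = [
    "the battery life is great",
    "the battery life is great",
    "the screen is too dim",
    "shipping was slow",
]
print(summarize(reviews, k=2))
```

The similarity threshold is one simple way to enforce diversity; a penalty on the score, as in maximal marginal relevance, would be another reasonable choice.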


2018 ◽ Vol 21 (2) ◽ pp. 217-235 ◽ Author(s): A Khatun ◽ N Parvin ◽ MMR Dewan ◽ A Saha

A consistent and comprehensive database on cropping pattern, cropping intensity and crop diversity of a particular area is of prime importance for guiding policy makers, researchers, extensionists and development agencies in future research and development planning. The study was carried out in all the upazilas of Mymensingh region during 2015-16 using a pre-designed and pre-tested semi-structured questionnaire with a view to documenting the existing cropping pattern, crop diversity and cropping intensity. The most dominant cropping pattern, Boro−Fallow−T. Aman, occupied about one-half of the net cropped area (NCA) of the region and was distributed over 46 out of 47 upazilas. The single Boro cropping pattern ranked second, covering 23% of NCA and distributed over 45 upazilas. A total of 129 cropping patterns were identified in the whole area of Mymensingh region under this investigation. The highest number of cropping patterns (30) was identified in Pakundia upazila of Kishoreganj and the lowest (10) in Sreebardi of Sherpur. The lowest crop diversity index (CDI), 0.111, was reported in Mithamoin of Kishoreganj, followed by 0.114 at Khaliajuri in Netrokona. The highest CDI, 0.933, was observed at Dewanganj in Jamalpur, followed by 0.920 at Bhairab in Kishoreganj. Cropping intensity ranged from 101% to 249%; the maximum was recorded for Hossainpur and the minimum for Itna and Mithamoin in Kishoreganj. Overall, the calculated CDI of Mymensingh region was 0.840 and the average cropping intensity was 187%.
Bangladesh Rice J. 2017, 21(2): 217-235
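
The abstract does not state how the two indicators are computed. As a hedged illustration, the sketch below assumes the commonly used Simpson-type definition of the crop diversity index, CDI = 1 - Σ (a_j / A)^2, where a_j is the area under crop j and A the total cropped area, and the usual definition of cropping intensity as gross cropped area divided by net cropped area, times 100. The function names and the example areas are hypothetical.

```python
def crop_diversity_index(crop_areas):
    """Simpson-type index (assumed definition): 1 - sum of squared area shares.

    0 means a single crop occupies all the area; values approach 1 as the
    area is spread evenly over many crops.
    """
    total = sum(crop_areas)
    if total <= 0:
        raise ValueError("total cropped area must be positive")
    return 1.0 - sum((area / total) ** 2 for area in crop_areas)


def cropping_intensity(gross_cropped_area, net_cropped_area):
    """Cropping intensity in percent (assumed definition): gross / net * 100."""
    if net_cropped_area <= 0:
        raise ValueError("net cropped area must be positive")
    return 100.0 * gross_cropped_area / net_cropped_area


# Hypothetical upazila: two crops on equal areas, land cropped 1.87 times a year.
print(crop_diversity_index([50.0, 50.0]))   # 0.5
print(cropping_intensity(187.0, 100.0))     # 187.0
```

Under these assumed definitions, a cropping intensity of 101% corresponds to land that is almost entirely single-cropped, and a CDI near 0.1 to an area dominated by one crop.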


2014 ◽ Vol 2014 ◽ pp. 1-16 ◽ Author(s): Nebojsa Bacanin ◽ Milan Tuba

The portfolio optimization (selection) problem is an important and hard optimization problem that, with the addition of necessary realistic constraints, becomes computationally intractable. Nature-inspired metaheuristics are appropriate for solving such problems; however, a literature review shows that there are very few applications of nature-inspired metaheuristics to the portfolio optimization problem. This is especially true for swarm intelligence algorithms, which represent the newer branch of nature-inspired algorithms. No application of any swarm intelligence metaheuristic to the cardinality constrained mean-variance (CCMV) portfolio problem with an entropy constraint was found in the literature. This paper introduces a modified firefly algorithm (FA) for the CCMV portfolio model with an entropy constraint. The firefly algorithm is one of the latest and most successful swarm intelligence algorithms; however, it exhibits some deficiencies when applied to constrained problems. To overcome its lack of exploration power during early iterations, we modified the algorithm and tested it on standard portfolio benchmark data sets used in the literature. Our proposed modified firefly algorithm proved to be better than other state-of-the-art algorithms, while the introduction of the entropy diversity constraint further improved the results.
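
To make the constraint set concrete, here is a minimal sketch of a feasibility check for a cardinality-constrained portfolio with an entropy lower bound. It does not implement the firefly algorithm. The Shannon entropy form -Σ w_i ln w_i, the "at most `max_assets`" reading of the cardinality constraint, and all parameter names are assumptions made for illustration.

```python
import numpy as np


def entropy(weights):
    """Shannon entropy of the portfolio weights; zero weights contribute 0."""
    w = np.asarray(weights, dtype=float)
    w = w[w > 0]
    return float(-(w * np.log(w)).sum())


def is_feasible(weights, max_assets, w_min, w_max, min_entropy, tol=1e-9):
    """Check budget, cardinality, bound and entropy constraints (assumed forms)."""
    w = np.asarray(weights, dtype=float)
    held = w > tol  # assets actually included in the portfolio
    return bool(
        abs(w.sum() - 1.0) <= 1e-6            # budget: weights sum to 1
        and np.all(w >= -tol)                 # no short selling
        and held.sum() <= max_assets          # cardinality limit
        and np.all(w[held] >= w_min - tol)    # lower bound on held assets
        and np.all(w[held] <= w_max + tol)    # upper bound on held assets
        and entropy(w) >= min_entropy - tol   # diversity (entropy) constraint
    )


even = [0.25, 0.25, 0.25, 0.25]
concentrated = [0.97, 0.01, 0.01, 0.01]
print(is_feasible(even, 4, 0.01, 1.0, 1.0))          # True
print(is_feasible(concentrated, 4, 0.01, 1.0, 1.0))  # False
```

The equally weighted portfolio reaches the maximum entropy ln 4 ≈ 1.386, while the concentrated one falls well below the bound of 1.0, so the entropy constraint rejects it even though it satisfies the cardinality and bound constraints.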

