Evaluating Crowdsourced Design Concepts With Machine Learning

Author(s):  
Bradley Camburn ◽  
Yuejun He ◽  
Sujithra Raviselvam ◽  
Jianxi Luo ◽  
Kristin Wood

Abstract Automation has enabled the design of increasingly complex products, services, and systems. Advanced technology enables designers to automate repetitive tasks in earlier design phases, even high-level conceptual ideation. One particularly repetitive task in ideation is processing the large concept sets that can be developed through crowdsourcing. This paper introduces a method for filtering, categorizing, and rating large sets of design concepts. It leverages unsupervised machine learning (ML) trained on open-source databases. Input design concepts are written in natural language. The concepts are not pre-tagged, structured, or processed in any way that requires human intervention, nor does the approach require dedicated training on a sample set of designs. Concepts are assessed at the sentence level via a mixture of named-entity tagging (keywords) through contextual sense recognition and topic tagging (sentence topic) through probabilistic mapping to a knowledge graph. The method also includes a filtering strategy, two new metrics, and a selection strategy for assessing design concepts. The metrics are analogous to the design creativity metrics of novelty and level of detail. To test the method, four ideation cases were studied; over 4,000 concepts were generated and evaluated. Analyses include an asymptotic convergence analysis, a predictive industry case study, and a dominance test between several approaches to selecting high-ranking concepts. Notably, in a series of binary comparisons between concepts selected from the entire set by a time-limited human and those with the highest ML metric scores, the ML-selected concepts were dominant.
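As a hedged illustration of this kind of pipeline, the sketch below scores raw natural-language concepts with simple stand-ins for the two metrics: term rarity (TF-IDF) as a novelty proxy and description length as a level-of-detail proxy. It is not the authors' knowledge-graph method; all names and data are illustrative.

```python
# Minimal sketch: rank free-text design concepts by novelty- and
# detail-like scores. Illustrative stand-in, not the paper's pipeline.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

concepts = [
    "A solar-powered drone that reseeds burned forests autonomously.",
    "A water bottle.",
    "A modular backpack with swappable insulated compartments.",
]

# TF-IDF: vocabulary that is rare across the concept set stands in
# for novelty; word count stands in for level of detail.
X = TfidfVectorizer(stop_words="english").fit_transform(concepts)
novelty = np.asarray(X.mean(axis=1)).ravel()
detail = np.array([len(c.split()) for c in concepts], dtype=float)

# Combine the normalized metrics and print concepts best-first.
score = novelty / novelty.max() + detail / detail.max()
for i in np.argsort(-score):
    print(f"{score[i]:.2f}  {concepts[i]}")
```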

Author(s):  
Christine A. Toh ◽  
Elizabeth M. Starkey ◽  
Conrad S. Tucker ◽  
Scarlett R. Miller

The emergence of ideation methods that generate large volumes of early-phase ideas has created a need for reliable and efficient metrics for measuring the creativity of these ideas. However, existing human judgment-based creativity assessments, as well as numeric model-based assessment approaches, suffer from low reliability and place a prohibitive burden on human raters because of the high level of human input needed to calculate creativity scores. In addition, there is a need for an efficient method of computing the creativity of the large sets of design ideas typically generated during the design process. This paper focuses on developing and empirically testing a machine learning approach for computing the creativity of large sets of design ideas, to increase the efficiency and reliability of creativity evaluation methods in design research. The results of this study show that machine learning techniques can predict the creativity of ideas with relatively high accuracy and sensitivity. These findings show that machine learning has the potential to be used for rating the creativity of ideas based on their descriptions.
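A minimal sketch of the general idea, assuming a bag-of-words text model and binary human creativity ratings as training labels (both are assumptions for illustration, not the authors' exact features or model):

```python
# Toy example: learn to predict human creativity ratings from idea text.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

ideas = [
    "a chair",
    "a chair that folds flat into a briefcase for commuters",
    "a cup",
    "a cup that changes color to warn when a drink is too hot",
]
ratings = [0, 1, 0, 1]  # hypothetical human judgments: 1 = creative

model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)),
                      LogisticRegression(max_iter=1000))
model.fit(ideas, ratings)
print(model.predict(["a lamp powered by falling sand"]))
```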


2021 ◽  
Vol 6 (2) ◽  
pp. 71-74
Author(s):  
Lance Valcour

The path to improved police transparency in Canada includes the use of advanced technology with capabilities such as artificial intelligence, machine learning, cloud-enabled services, and an ever-increasing number of data collection and management tools. However, these innovations need to be closely linked with a national (not federal) stakeholder review of current legal, legislative, and privacy frameworks. This article provides readers with a high-level overview of the issue of police transparency in Canada. It then outlines a number of key challenges and opportunities for improving this transparency. It concludes with a call to action for key Canadian stakeholders to work collaboratively to improve police transparency in Canada.


2019 ◽  
Author(s):  
Siddhartha Laghuvarapu ◽  
Yashaswi Pathak ◽  
U. Deva Priyakumar

Recent advances in artificial intelligence, along with the development of large datasets of energies calculated using quantum mechanical (QM)/density functional theory (DFT) methods, have enabled prediction of accurate molecular energies at reasonably low computational cost. However, the machine learning models reported so far require as input the atomic positions obtained from geometry optimizations using high-level QM/DFT methods in order to predict energies, and they do not allow for geometry optimization. In this paper, a transferable and molecule-size-independent machine learning model (BAND NN), based on a chemically intuitive representation inspired by molecular mechanics force fields, is presented. The model predicts the atomization energies of equilibrium and non-equilibrium structures as a sum of energy contributions from bonds (B), angles (A), nonbonds (N), and dihedrals (D) with remarkable accuracy. The robustness of the proposed model is further validated by calculations that span conformational, configurational, and reaction space. The transferability of this model to systems larger than those in the dataset is demonstrated by performing calculations on select large molecules. Importantly, the BAND NN model makes it possible to perform geometry optimizations starting from non-equilibrium structures along with predicting their energies.
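The core decomposition can be sketched as follows: one small network per interaction type, applied to every instance of that interaction in a molecule and summed, which is what makes the model molecule-size independent. Layer sizes and input features here are assumptions for illustration, not the paper's architecture.

```python
# Conceptual sketch of a BAND-style energy model in PyTorch:
# E_total = sum over bonds + angles + nonbonds + dihedrals.
import torch
import torch.nn as nn

def term_net(in_dim: int) -> nn.Module:
    # One shared network per interaction type; summing its outputs over
    # all instances keeps the model independent of molecule size.
    return nn.Sequential(nn.Linear(in_dim, 64), nn.Tanh(), nn.Linear(64, 1))

class BANDSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.bond = term_net(2)       # e.g. bond length + type feature
        self.angle = term_net(3)
        self.nonbond = term_net(2)
        self.dihedral = term_net(4)

    def forward(self, bonds, angles, nonbonds, dihedrals):
        # Each argument: (n_instances, feature_dim) for one molecule.
        return (self.bond(bonds).sum() + self.angle(angles).sum()
                + self.nonbond(nonbonds).sum() + self.dihedral(dihedrals).sum())

model = BANDSketch()
energy = model(torch.rand(9, 2), torch.rand(12, 3),
               torch.rand(20, 2), torch.rand(6, 4))
```

Because the predicted energy is a differentiable function of the geometric features, its gradients can drive geometry optimization from non-equilibrium starting structures, as the abstract notes.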


2020 ◽  
Author(s):  
Sina Faizollahzadeh Ardabili ◽  
Amir Mosavi ◽  
Pedram Ghamisi ◽  
Filip Ferdinand ◽  
Annamaria R. Varkonyi-Koczy ◽  
...  

Several outbreak prediction models for COVID-19 are being used by officials around the world to make informed decisions and enforce relevant control measures. Among the standard models for COVID-19 global pandemic prediction, simple epidemiological and statistical models have received more attention from authorities and are popular in the media. Due to a high level of uncertainty and a lack of essential data, standard models have shown low accuracy for long-term prediction. Although the literature includes several attempts to address this issue, the generalization and robustness of existing models need to be improved. This paper presents a comparative analysis of machine learning and soft computing models for predicting the COVID-19 outbreak as an alternative to SIR and SEIR models. Among the wide range of machine learning models investigated, two showed promising results: the multi-layered perceptron (MLP) and the adaptive network-based fuzzy inference system (ANFIS). Based on the results reported here, and given the highly complex nature of the COVID-19 outbreak and the variation in its behavior from nation to nation, this study suggests machine learning as an effective tool for modeling the outbreak. This paper provides an initial benchmark to demonstrate the potential of machine learning for future research, and further suggests that real novelty in outbreak prediction can be realized by integrating machine learning and SEIR models.
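For illustration, a minimal MLP forecast on a synthetic case-count series (the data and hyperparameters are assumptions; the paper's models were tuned per setting):

```python
# Toy MLP forecast in the spirit of the compared models; data is synthetic.
import numpy as np
from sklearn.neural_network import MLPRegressor

days = np.arange(60).reshape(-1, 1)
cases = np.exp(0.1 * days.ravel()) + np.random.default_rng(0).normal(0, 5, 60)

mlp = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000, random_state=0)
mlp.fit(days, np.log1p(np.clip(cases, 0, None)))  # log scale stabilizes the fit
forecast = np.expm1(mlp.predict(np.arange(60, 75).reshape(-1, 1)))
print(forecast.round())
```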


2021 ◽  
Vol 11 (15) ◽  
pp. 6975
Author(s):  
Tao Zhang ◽  
Lun He ◽  
Xudong Li ◽  
Guoqing Feng

Lipreading aims to recognize the sentences being spoken by a talking face. In recent years, lipreading methods have achieved high accuracy on large datasets and made breakthrough progress. However, lipreading is still far from solved: existing methods tend to have high error rates on in-the-wild data and suffer from vanishing training gradients and slow convergence. To overcome these problems, we propose an efficient end-to-end sentence-level lipreading model that uses an encoder based on a 3D convolutional network, ResNet50, and a Temporal Convolutional Network (TCN), with a CTC objective function as the decoder. More importantly, the proposed architecture incorporates the TCN as a feature learner to decode features. This partly eliminates the vanishing-gradient and underperformance problems of RNNs (LSTM, GRU), yielding notable performance improvement as well as faster convergence. Experiments show that training and convergence are 50% faster than the state-of-the-art method, and that accuracy improves by 2.4% on the GRID dataset.
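A rough sketch of the TCN-plus-CTC decoding idea follows (dimensions and layer counts are illustrative assumptions; the full model's 3D-CNN and ResNet50 front end is omitted):

```python
# Sketch: temporal convolutions over per-frame visual features, trained
# with CTC so no frame-level alignment to the transcript is needed.
import torch
import torch.nn as nn

class TCNDecoder(nn.Module):
    def __init__(self, feat_dim=512, vocab=28):   # letters + space + blank
        super().__init__()
        self.tcn = nn.Sequential(
            nn.Conv1d(feat_dim, 256, kernel_size=3, padding=2, dilation=2),
            nn.ReLU(),
            nn.Conv1d(256, 256, kernel_size=3, padding=4, dilation=4),
            nn.ReLU(),
        )
        self.head = nn.Linear(256, vocab)

    def forward(self, feats):                     # (batch, time, feat_dim)
        h = self.tcn(feats.transpose(1, 2))       # convolve over time
        return self.head(h.transpose(1, 2)).log_softmax(-1)

decoder = TCNDecoder()
log_probs = decoder(torch.rand(4, 75, 512))       # 75 video frames
ctc = nn.CTCLoss(blank=0)                         # aligns frames to text
```

Unlike an LSTM or GRU, the convolutional stack has short, fixed-depth gradient paths, which is what mitigates the vanishing-gradient and convergence issues noted above.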


Data ◽  
2021 ◽  
Vol 6 (7) ◽  
pp. 71
Author(s):  
Gonçalo Carnaz ◽  
Mário Antunes ◽  
Vitor Beires Nogueira

Criminal investigations collect and analyze the facts related to a crime, from which investigators can deduce evidence to be used in court. Criminal investigation is a multidisciplinary and applied science that includes interviews, interrogations, evidence collection, preservation of the chain of custody, and other methods and techniques of investigation. These techniques produce both digital and paper documents that have to be carefully analyzed to identify correlations and interactions among suspects, places, license plates, and other entities mentioned in the investigation. Computerized processing of these documents is a helping hand to the criminal investigation, as it allows the automatic identification of entities and their relations, some of which are difficult to identify manually. A wide set of dedicated tools exists, but they have a major limitation: they are unable to process criminal reports in the Portuguese language, as no annotated corpus exists for that purpose. This paper presents an annotated corpus composed of a collection of anonymized crime-related documents extracted from official and open sources. The dataset was produced as the result of an exploratory initiative to collect crime-related data from websites and conditioned-access police reports. The dataset was evaluated, and a mean precision of 0.808, recall of 0.722, and F1-score of 0.733 were obtained for the classification of the annotated named entities present in the crime-related documents. This corpus can be employed to benchmark Machine Learning (ML) and Natural Language Processing (NLP) methods and tools for detecting and correlating entities in documents, for example sentence detection, named-entity recognition, and identification of terms related to the criminal domain.
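As a toy example of the reported evaluation, precision, recall, and F1 can be computed by comparing predicted entity spans against the corpus annotations (the entities and tags below are invented stand-ins, not the corpus tag set):

```python
# Toy span-level NER evaluation against gold annotations.
gold = {("Lisboa", "LOCAL"), ("AA-12-34", "MATRICULA"), ("João", "PESSOA")}
pred = {("Lisboa", "LOCAL"), ("AA-12-34", "MATRICULA"), ("Porto", "LOCAL")}

tp = len(gold & pred)                 # spans with matching text and tag
precision = tp / len(pred)
recall = tp / len(gold)
f1 = 2 * precision * recall / (precision + recall)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```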


2021 ◽  
Vol 31 (2) ◽  
pp. 1-28
Author(s):  
Gopinath Chennupati ◽  
Nandakishore Santhi ◽  
Phill Romero ◽  
Stephan Eidenbenz

Hardware architectures become increasingly complex as compute capabilities grow toward exascale. We present the Analytical Memory Model with Pipelines (AMMP) of the Performance Prediction Toolkit (PPT). PPT-AMMP takes high-level source code and hardware architecture parameters as input and predicts the runtime of that code on the target hardware platform defined by the input parameters. PPT-AMMP transforms the code into an (architecture-independent) intermediate representation, then (i) analyzes the basic-block structure of the code, (ii) processes architecture-independent virtual memory access patterns, which it uses to build memory reuse distance distribution models for each basic block, and (iii) runs detailed basic-block-level simulations to determine hardware pipeline usage. PPT-AMMP uses machine learning and regression techniques to build the prediction models from small instances of the input code, then integrates them into a higher-order discrete-event simulation model of PPT running on the Simian PDES engine. We validate PPT-AMMP on four standard computational physics benchmarks and present a use case of hardware parameter sensitivity analysis to identify bottleneck hardware resources for different code inputs. We further extend PPT-AMMP to predict the performance of a scientific application code, namely the radiation transport mini-app SNAP. To this end, we analyze multivariate regression models that accurately predict the reuse profiles and the basic-block counts. We validate the predicted SNAP runtimes against actual measured times.
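The regression step can be pictured as fitting runtime (or reuse-profile) observations from small code instances and extrapolating to larger ones; the features and numbers below are invented for illustration:

```python
# Sketch: multivariate regression from small-instance features to runtime.
import numpy as np
from sklearn.linear_model import LinearRegression

# Rows: small training runs; columns: hypothetical features such as
# instruction count, basic-block count, and a reuse-distance summary.
X = np.array([[1e4, 12, 0.3], [5e4, 12, 0.3], [1e5, 24, 0.5], [2e5, 24, 0.5]])
y = np.array([0.8, 3.9, 9.1, 18.0])   # measured runtimes in seconds (toy)

reg = LinearRegression().fit(X, y)
print(reg.predict(np.array([[4e5, 24, 0.5]])))  # predict a larger instance
```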


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Yersultan Mirasbekov ◽  
Adina Zhumakhanova ◽  
Almira Zhantuyakova ◽  
Kuanysh Sarkytbayev ◽  
Dmitry V. Malashenkov ◽  
...  

Abstract A machine learning approach was employed to detect and quantify Microcystis colonial morphospecies using FlowCAM-based imaging flow cytometry. The system was trained and tested using samples from a long-term mesocosm experiment (LMWE, Central Jutland, Denmark). Statistical validation of the classification approaches was performed using Hellinger distances, Bray–Curtis dissimilarity, and Kullback–Leibler divergence. The semi-automatic classification, based on well-balanced training sets from a Microcystis seasonal bloom, provided high intergeneric accuracy (96–100%) but relatively low intrageneric accuracy (67–78%). Our results provide a proof of concept of how machine learning approaches can be applied to analyze colonial microalgae. The approach allowed us to evaluate the Microcystis seasonal bloom in individual mesocosms with a high level of temporal and spatial resolution. The observation that some Microcystis morphotypes completely disappeared and re-appeared along the mesocosm experiment timeline supports the hypothesis of the main transition pathways of colonial Microcystis morphoforms. We demonstrate that accurate classification of Microcystis spp. required significant changes in the training sets of colonial images even between time points only two weeks apart, owing to the high phenotypic heterogeneity of Microcystis during the bloom. We conclude that automatic methods not only match the performance of a human taxonomist, and can thus be a valuable time-saving tool in the routine identification of colonial phytoplankton taxa, but can also be applied to increase the temporal and spatial resolution of a study.
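A hedged sketch of the classification step, assuming FlowCAM-style per-particle feature exports and a generic classifier (the study's actual features and classifier may differ):

```python
# Toy morphospecies classifier on synthetic particle features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(1)
X = rng.normal(size=(300, 8))               # e.g. area, circularity, intensity
y = (X[:, 0] + X[:, 1] > 0).astype(int)     # two toy morphotype classes

clf = RandomForestClassifier(n_estimators=200, random_state=1)
print(cross_val_score(clf, X, y, cv=5).mean())  # cross-validated accuracy
```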


2021 ◽  
Vol 19 (1) ◽  
Author(s):  
Qingsong Xi ◽  
Qiyu Yang ◽  
Meng Wang ◽  
Bo Huang ◽  
Bo Zhang ◽  
...  

Abstract Background: Significant efforts have been made to minimize the rate of in vitro fertilization (IVF)-associated multiple-embryo gestation. Previous studies of machine learning in IVF mainly focused on selecting top-quality embryos to improve outcomes; however, for patients with a sub-optimal prognosis or with medium- or inferior-quality embryos, the choice between single embryo transfer (SET) and double embryo transfer (DET) can be perplexing. Methods: This was an application study including 9211 patients with 10,076 embryos treated from 2016 to 2018 at Tongji Hospital, Wuhan, China. A hierarchical model was established using the machine learning system XGBoost to learn embryo implantation potential and the impact of DET simultaneously. The performance of the model was evaluated with the AUC of the ROC curve. Multiple regression analyses were also conducted on the 19 selected features to demonstrate the differences between feature importance for prediction and statistical relationship with outcomes. Results: For SET pregnancy, the following variables remained significant: age, attempts at IVF, estradiol level on hCG day, and endometrial thickness. For DET pregnancy, age, attempts at IVF, endometrial thickness, and the newly added P1 + P2 remained significant. For DET twin risk, age, attempts at IVF, 2PN/MII, and P1 × P2 remained significant. The algorithm was repeated 30 times, and averaged AUCs of 0.7945, 0.8385, and 0.7229 were achieved for SET pregnancy, DET pregnancy, and DET twin risk, respectively. The trends of predicted and observed rates for both pregnancy and twin risk were essentially identical. XGBoost outperformed the other two algorithms, logistic regression and classification and regression trees. Conclusion: Artificial intelligence based on determinant-weighting analysis can offer an individualized embryo selection strategy for any given patient and predict clinical pregnancy rate and twin risk, thereby optimizing clinical outcomes.
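A minimal sketch of the modelling step, with synthetic data standing in for the 19 clinical features and a gradient-boosted classifier scored by ROC AUC as in the paper:

```python
# Illustrative XGBoost classifier evaluated with ROC AUC; data is synthetic.
import numpy as np
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))    # stand-ins: age, attempts, E2, EMT, P1+P2
y = (X[:, 0] + 0.5 * X[:, 3] + rng.normal(size=1000) > 0).astype(int)

Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
model = XGBClassifier(n_estimators=200, max_depth=3, eval_metric="logloss")
model.fit(Xtr, ytr)
print(roc_auc_score(yte, model.predict_proba(Xte)[:, 1]))
```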

