Automatically Generating 60,000 CAD Variants for Big Data Applications

Author(s):  
Satchit Ramnath ◽  
Payam Haghighi ◽  
Ji Hoon Kim ◽  
Duane Detwiler ◽  
Michael Berry ◽  
...  

Abstract Machine learning is opening up new ways of optimizing designs, but it requires large data sets for training and verification. While such data sets already exist for financial, sales and business applications, this is not the case for engineering product design data. This paper discusses our efforts in curating a large Computer Aided Design (CAD) data set with the desired variety and validity for automotive body structural compositions. Manual creation of 60,000 CAD variants is obviously not viable, so we examine several approaches that can be automated with commercial CAD systems, such as Parametric Design, Feature-Based Design, Design Tables/Catalogs of Variants and Macros. We discuss the pros and cons of each method and how we devised a combination of these approaches. This hybrid approach was used in association with Design of Experiments (DOE) tables. Since the geometric configurations and characteristics need to be correlated to performance (structural integrity), the paper also demonstrates automated workflows to perform FEA on the CAD models generated. Key simulation results can then be associated with CAD geometry and, for example, processed using machine learning algorithms for both supervised and unsupervised learning. The information obtained from the application of such methods to historical CAD models may help to understand the reasoning behind experiential design decisions. With the increase in computing power and network speed, such data sets, together with novel machine learning methods, could assist in generating better designs, which could potentially be obtained by a combination of existing ones, or might provide insights into completely new design concepts meeting or exceeding the performance requirements.
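The combinatorial core of a DOE-driven variant generator can be sketched in a few lines. The parameter names and level values below are invented for illustration and are not taken from the paper; a real study would draw them from the CAD model's driving dimensions:

```python
from itertools import product

# Hypothetical design parameters for a body-structure part; the names
# and ranges are illustrative assumptions, not values from the paper.
parameters = {
    "rib_count":      [2, 3, 4],
    "rib_height_mm":  [10.0, 15.0, 20.0],
    "sheet_gauge_mm": [0.8, 1.0, 1.2],
}

def full_factorial(params):
    """Enumerate every combination of parameter levels as a DOE table."""
    names = list(params)
    return [dict(zip(names, combo)) for combo in product(*params.values())]

doe_table = full_factorial(parameters)
print(len(doe_table))  # 3 levels ** 3 factors = 27 variants
```

Each row of `doe_table` would then be pushed into the CAD system's automation API (macro or design-table import) to regenerate one variant; scaling to 60,000 variants is a matter of adding factors and levels, or switching to a space-filling design instead of the full factorial.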

Author(s):  
Satchit Ramnath ◽  
Payam Haghighi ◽  
Jiachen Ma ◽  
Jami J. Shah ◽  
Duane Detwiler

Abstract Machine learning is opening up new ways of optimizing designs, but it requires large data sets for training and verification. The primary focus of this paper is to explain the trade-offs between generating a large data set and the level of idealization required to automate the process of generating such a data set. This paper discusses the efforts in curating a large CAD data set with the desired variety and validity for automotive body structures. A method to incorporate constraint networks that filter out invalid designs prior to the start of model generation is explained. Since the geometric configurations and characteristics need to be correlated to performance (structural integrity), the paper also demonstrates automated workflows to perform finite element analysis on the 3D CAD models generated. Key simulation results can then be associated with CAD geometry and fed to the machine learning algorithms. With the increase in computing power and network speed, such data sets could assist in generating better designs, which could potentially be obtained by a combination of existing ones, or might provide insights into completely new design concepts meeting or exceeding the performance requirements. The approach is explained using the hood frame as an example, but the same can be adapted to other design components.
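At its simplest, a constraint network of the kind described can be approximated as a set of predicates used to prune the candidate grid before any CAD model is generated (a full network would also propagate intervals between parameters). The hood-frame parameter names and limits below are assumptions for illustration only:

```python
from itertools import product

# Illustrative hood-frame parameters; names and limits are assumptions.
levels = {
    "beam_width_mm": [20, 30, 40],
    "beam_depth_mm": [10, 20, 30],
    "hole_diam_mm":  [5, 10, 15],
}

# A constraint network reduced to predicates over one candidate design.
constraints = [
    lambda d: d["beam_depth_mm"] < d["beam_width_mm"],        # keep section wide
    lambda d: d["hole_diam_mm"] <= 0.5 * d["beam_width_mm"],  # hole fits the beam
]

def valid_designs(levels, constraints):
    """Yield only the parameter combinations that satisfy every constraint."""
    names = list(levels)
    for combo in product(*levels.values()):
        design = dict(zip(names, combo))
        if all(check(design) for check in constraints):
            yield design

survivors = list(valid_designs(levels, constraints))
```

Filtering before generation, as the abstract describes, avoids wasting CAD regeneration and FEA time on geometrically invalid combinations: here only 17 of the 27 raw combinations survive.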


Author(s):  
Aska E. Mehyadin ◽  
Adnan Mohsin Abdulazeez ◽  
Dathar Abas Hasan ◽  
Jwan N. Saeed

The bird classifier is a system equipped with machine learning technology that uses a machine learning method to store and classify bird calls. A bird species can be identified by recording only the sound of the bird, which makes the system easier to manage. The system also provides species classification resources to allow automated species detection from observations, which can teach a machine to recognize and classify the species. Undesirable noises are filtered out and the recordings are sorted into data sets: each sound is run through a noise suppression filter and a separate classification procedure so that the most useful data set can be processed easily. Mel-frequency cepstral coefficients (MFCC) are used as features and tested with different algorithms, namely Naïve Bayes, J4.8 and multilayer perceptron (MLP), to classify bird species. J4.8 performs best, with the highest accuracy (78.40%) and an elapsed time of 39.4 seconds.
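The J4.8 and MLP classifiers are not reproduced here; as a minimal stand-in, the sketch below implements one of the listed algorithms, Gaussian naïve Bayes, from scratch on synthetic 13-dimensional vectors playing the role of MFCC features (real features would come from an audio front end):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for MFCC feature vectors of two bird species.
species_a = rng.normal(loc=0.0, scale=1.0, size=(50, 13))
species_b = rng.normal(loc=3.0, scale=1.0, size=(50, 13))
X = np.vstack([species_a, species_b])
y = np.array([0] * 50 + [1] * 50)

class GaussianNB:
    """Minimal Gaussian naive Bayes: one mean/variance per class per feature."""

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        self.mu_ = np.array([X[y == c].mean(axis=0) for c in self.classes_])
        self.var_ = np.array([X[y == c].var(axis=0) + 1e-9 for c in self.classes_])
        return self

    def predict(self, X):
        # log N(x | mu, var), summed over features assumed independent
        ll = -0.5 * (np.log(2 * np.pi * self.var_[:, None, :])
                     + (X[None] - self.mu_[:, None, :]) ** 2 / self.var_[:, None, :])
        return self.classes_[ll.sum(axis=2).argmax(axis=0)]

model = GaussianNB().fit(X, y)
acc = (model.predict(X) == y).mean()
```

On well-separated synthetic classes the toy model is near-perfect; the paper's 78.40% on real bird calls reflects how much harder overlapping acoustic classes are.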


Author(s):  
Samer Hamed ◽  
Abdelwadood Mesleh ◽  
Abdullah Arabiyyat

This paper presents a computer-aided design (CAD) system that detects breast cancers (BCs). BC detection uses random forest, AdaBoost, logistic regression, decision tree, naïve Bayes and convolutional neural network (CNN) classifiers. These machine learning (ML) algorithms are trained to predict BCs (malignant or benign) on the BC Wisconsin data set from the UCI repository, in which the clump thickness attribute is used as the evaluation class. The effectiveness of these ML algorithms is evaluated in terms of accuracy and F-measure; random forest outperformed the other classifiers, achieving 99% accuracy and 99% F-measure.
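A minimal version of the best-performing configuration can be sketched with scikit-learn, which bundles the Wisconsin (diagnostic) variant of the data set with benign/malignant labels; note this is an approximation, since the paper draws the data from the UCI repository and evaluates several more classifiers:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split

# Wisconsin breast cancer (diagnostic) data as shipped with scikit-learn
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y)

# Random forest, the paper's best-performing classifier
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
pred = clf.predict(X_te)
acc = accuracy_score(y_te, pred)
f1 = f1_score(y_te, pred)
```

Exact scores depend on the split and hyperparameters; the point of the sketch is the workflow (load, split, fit, score on accuracy and F-measure), not reproducing the reported 99%.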


Author(s):  
Mahziyar Darvishi ◽  
Omid Ziaee ◽  
Arash Rahmati ◽  
Mohammad Silani

Numerous structure geometries are available for cellular structures, and selecting the suitable structure that reflects the intended characteristics is cumbersome. While testing many specimens to determine the mechanical properties of these materials can be time-consuming and expensive, finite element analysis (FEA) is considered an efficient alternative. In this study, we present a method to find the suitable geometry for the intended mechanical characteristics by applying machine learning (ML) algorithms to FEA results of cellular structures. Different cellular structures of a given material are analyzed by FEA, and the results are validated against their corresponding analytical equations. The validated results are used to create a data set for the ML algorithms. Finally, by comparing the results with the correct answers, the most accurate algorithm is identified for the intended application. In our case study, the cellular structures are three structures widely used as bone implants: Cube, Kelvin, and Rhombic dodecahedron, made of Ti–6Al–4V. The ML algorithms are simple Bayesian classification, K-nearest neighbor, XGBoost, random forest, and artificial neural network. By comparing the results of these algorithms, the best-performing algorithm is identified.
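The core idea, mapping desired mechanical characteristics back to a lattice topology via ML on FEA results, can be sketched with one of the listed algorithms, K-nearest neighbor, implemented from scratch. The two features and all numeric values below are synthetic stand-ins for FEA-derived properties, not data from the study:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins for FEA-derived properties (relative modulus,
# relative strength) of the three lattice topologies; values are illustrative.
topologies = ["Cube", "Kelvin", "Rhombic dodecahedron"]
centers = np.array([[0.30, 0.25], [0.15, 0.12], [0.08, 0.10]])
X = np.vstack([c + 0.01 * rng.normal(size=(40, 2)) for c in centers])
y = np.repeat(np.arange(3), 40)

def knn_predict(X_train, y_train, x, k=5):
    """Classify a property target by majority vote of its k nearest FEA samples."""
    d = np.linalg.norm(X_train - x, axis=1)
    votes = y_train[np.argsort(d)[:k]]
    return np.bincount(votes).argmax()

# Which topology best matches a desired (modulus, strength) pair?
target = np.array([0.14, 0.13])
best = topologies[knn_predict(X, y, target)]
```

In the study this inverse lookup is what ML buys over raw FEA: once trained, the classifier answers "which geometry gives me these properties?" without running new simulations.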


2021 ◽  
Vol 2 (2) ◽  
pp. 40-47
Author(s):  
Sunil Kumar ◽  
Vaibhav Bhatnagar

Machine learning is one of the most active fields and technologies for realizing artificial intelligence (AI). The complexity of machine learning algorithms makes it difficult to predict which algorithm is best. Because machine learning (ML) offers many complex algorithms for finding regression trends, determining the appropriate method for establishing the correlations among variables is very difficult; we therefore review the different types of regression used in machine learning. There are six main types of regression model: linear, logistic, polynomial, ridge, Bayesian linear and lasso. This paper gives an overview of these regression models and compares their suitability for machine learning. Data analysis is a prerequisite for establishing associations among the innumerable variables in a data set, and such associations are essential for forecasting and data exploration. Regression analysis is one such procedure for establishing associations among data sets. The work in this paper predominantly emphasizes the different regression analysis models and how they are best used in the context of different data sets in machine learning. Selecting the right model for exploration is the most challenging task; hence, these models are considered thoroughly in this study. Applying these models in the right way, with an accurate data set, allows data exploration and forecasting to give the most exact outcomes.
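The contrast between two of the six models, ordinary linear regression (OLS) and ridge, reduces to their closed-form solutions, which a short numpy sketch can show on synthetic data with a near-collinear pair of predictors (a situation where ridge's shrinkage is the point):

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic data: 5 predictors, two of them nearly collinear
n, p = 100, 5
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=n)   # near-duplicate column
beta_true = np.array([1.0, 1.0, 0.5, 0.0, 0.0])
y = X @ beta_true + 0.1 * rng.normal(size=n)

def ols(X, y):
    """Ordinary least squares via the normal equations (lstsq)."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def ridge(X, y, lam):
    """Ridge closed form: (X'X + lam * I)^-1 X'y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

b_ols = ols(X, y)
b_ridge = ridge(X, y, lam=10.0)
```

With collinear predictors the OLS coefficients on the duplicated pair become unstable, while the ridge penalty shrinks the coefficient vector toward zero; the other four models in the review (logistic, polynomial, Bayesian linear, lasso) differ in the loss, the basis, or the prior, not in this basic fit-a-linear-map skeleton.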


2020 ◽  
Vol 1 (1) ◽  
pp. 94-116
Author(s):  
Dominik P. Heinisch ◽  
Johannes Koenig ◽  
Anne Otto

Only scarce information is available on doctorate recipients’ career outcomes (BuWiN, 2013). With the current information base, graduate students cannot make an informed decision on whether to start a doctorate or not (Benderly, 2018; Blank et al., 2017). However, administrative labor market data, which could provide the necessary information, are incomplete in this respect. In this paper, we describe the record linkage of two data sets to close this information gap: data on doctorate recipients collected in the catalog of the German National Library (DNB), and the German labor market biographies (IEB) from the German Institute of Employment Research. We use a machine learning-based methodology, which (a) improves the record linkage of data sets without unique identifiers, and (b) evaluates the quality of the record linkage. The machine learning algorithms are trained on a synthetic training and evaluation data set. In an exemplary analysis, we compare the evolution of the employment status of female and male doctorate recipients in Germany.
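The paper's supervised linkage model is not reproduced here; a minimal unsupervised sketch can still show the core problem of linking records without unique identifiers, scoring candidate pairs by stdlib string similarity plus agreement on an auxiliary field. All names, fields and thresholds below are invented for illustration:

```python
from difflib import SequenceMatcher

# Toy records from two sources without a shared identifier (invented data).
dnb = [{"id": "d1", "name": "Anna Schmidt",  "year": 2010},
       {"id": "d2", "name": "Peter Mueller", "year": 2012}]
ieb = [{"id": "i1", "name": "A. Schmidt",    "year": 2010},
       {"id": "i2", "name": "Petra Mueller", "year": 2011}]

def similarity(a, b):
    """Blend fuzzy name similarity with agreement on the doctorate year."""
    name_sim = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    year_sim = 1.0 if a["year"] == b["year"] else 0.0
    return 0.8 * name_sim + 0.2 * year_sim

def link(left, right, threshold=0.7):
    """Greedily match each left record to its best right record above threshold."""
    pairs = []
    for a in left:
        best = max(right, key=lambda b: similarity(a, b))
        if similarity(a, best) >= threshold:
            pairs.append((a["id"], best["id"]))
    return pairs

matches = link(dnb, ieb)
```

A hand-tuned score like this produces both true links ("Anna Schmidt" / "A. Schmidt") and plausible-looking false ones ("Peter" / "Petra"); that is precisely why the paper trains ML classifiers on labeled pairs and evaluates linkage quality rather than relying on a fixed threshold.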


Author(s):  
John Yearwood ◽  
Adil Bagirov ◽  
Andrei V. Kelarev

The applications of machine learning algorithms to the analysis of data sets of DNA sequences are very important. The present chapter is devoted to the experimental investigation of applications of several machine learning algorithms for the analysis of a JLA data set consisting of DNA sequences derived from non-coding segments in the junction of the large single copy region and inverted repeat A of the chloroplast genome in Eucalyptus collected by Australian biologists. Data sets of this sort represent a new situation, where sophisticated alignment scores have to be used as a measure of similarity. The alignment scores do not satisfy the properties of the Minkowski metric, and new machine learning approaches have to be investigated. The authors’ experiments show that machine learning algorithms based on local alignment scores achieve very good agreement with known biological classes for this data set. A new machine learning algorithm based on graph partitioning performed best for clustering of the JLA data set. The authors’ novel k-committees algorithm produced the most accurate results for classification. Two new examples of synthetic data sets demonstrate that the authors’ k-committees algorithm can outperform both the Nearest Neighbour and k-medoids algorithms simultaneously.
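The authors' k-committees algorithm is novel and is not sketched here. The k-medoids baseline it is compared against can, however, be shown: unlike k-means, it runs directly on a precomputed dissimilarity matrix, which is what makes this family of methods usable with alignment scores that violate the Minkowski-metric axioms. The sketch below uses Euclidean distances on synthetic points purely to drive the demo; in the chapter's setting the matrix would hold pairwise alignment-based dissimilarities:

```python
import numpy as np

def k_medoids(D, k, iters=100, seed=0):
    """Standard k-medoids on a precomputed dissimilarity matrix D (n x n)."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(iters):
        # Assign each point to its nearest medoid
        labels = D[:, medoids].argmin(axis=1)
        # Move each medoid to the member minimizing total within-cluster cost
        new_medoids = medoids.copy()
        for j in range(k):
            members = np.flatnonzero(labels == j)
            if members.size:
                costs = D[np.ix_(members, members)].sum(axis=1)
                new_medoids[j] = members[costs.argmin()]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    return labels, medoids

# Two well-separated synthetic groups; D stands in for an alignment-score matrix.
rng = np.random.default_rng(3)
pts = np.vstack([rng.normal(0, 0.1, (10, 2)), rng.normal(5, 0.1, (10, 2))])
D = np.linalg.norm(pts[:, None] - pts[None], axis=2)
labels, medoids = k_medoids(D, k=2)
```

Because the algorithm only ever reads entries of `D`, swapping Euclidean distances for (dis)similarities derived from local alignment scores requires no code change, which is the property the chapter exploits.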


2021 ◽  
Vol 11 (21) ◽  
pp. 10457
Author(s):  
Luana I. C. C. Pinheiro ◽  
Maria Lúcia D. Pereira ◽  
Evandro C. de Andrade ◽  
Luciano C. Nunes ◽  
Wilson C. de Abreu ◽  
...  

Hybrid models to detect dementia based on Machine Learning can provide accurate diagnoses in individuals with neurological disorders and cognitive complications caused by Human Immunodeficiency Virus (HIV) infection. This study proposes a hybrid approach, using Machine Learning algorithms associated with the multicriteria method of Verbal Decision Analysis (VDA). Dementia, which affects many HIV-infected individuals, refers to neurodevelopmental and mental disorders. Some manuals standardize the information used in the correct detection of neurological disorders with cognitive complications. Among the most commonly used are the DSM-5 (Diagnostic and Statistical Manual of Mental Disorders, 5th edition), published by the American Psychiatric Association, and the International Classification of Diseases, 10th edition (ICD-10), published by the World Health Organization (WHO). The model is designed to explore the predictive power of specific data. Furthermore, a well-defined data set improves and optimizes the diagnostic models sought in the research.


2021 ◽  
Vol 12 ◽  
Author(s):  
Jujuan Zhuang ◽  
Danyang Liu ◽  
Meng Lin ◽  
Wenjing Qiu ◽  
Jinyang Liu ◽  
...  

Background: Pseudouridine (Ψ) is a common ribonucleotide modification that plays a significant role in many biological processes. The identification of Ψ modification sites is of great significance for research into disease mechanisms and biological processes, and machine learning algorithms are desirable here because the exploratory lab techniques are expensive and time-consuming.

Results: In this work, we propose a deep learning framework, called PseUdeep, to identify Ψ sites of three species: H. sapiens, S. cerevisiae, and M. musculus. In this method, three encoding methods are used to extract the features of RNA sequences: one-hot encoding, K-tuple nucleotide frequency pattern, and position-specific nucleotide composition. The three feature matrices are convoluted twice and fed into a capsule neural network and a bidirectional gated recurrent unit network with a self-attention mechanism for classification.

Conclusion: Compared with other state-of-the-art methods, our model achieves the highest prediction accuracy: on the independent testing data set S-200 the accuracy improves by 12.38%, and on the independent testing data set H-200 the accuracy improves by 0.68%. Moreover, the dimensions of the features we derive from the RNA sequences are only 109, 109, and 119 in H. sapiens, M. musculus, and S. cerevisiae, which is much smaller than those used in the traditional algorithms. On evaluation via tenfold cross-validation and two independent testing data sets, PseUdeep outperforms the best traditional machine learning model available. PseUdeep source code and data sets are available at https://github.com/dan111262/PseUdeep.
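Of the three sequence encodings named above, one-hot is the simplest to sketch; the example RNA sequence below is hypothetical:

```python
import numpy as np

# One-hot encoding of an RNA sequence over the four ribonucleotides,
# one of the three feature encodings named in the abstract.
ALPHABET = "ACGU"

def one_hot(seq):
    """Return a (len(seq), 4) binary matrix, one row per nucleotide."""
    index = {c: i for i, c in enumerate(ALPHABET)}
    m = np.zeros((len(seq), len(ALPHABET)), dtype=np.int8)
    for row, c in enumerate(seq):
        m[row, index[c]] = 1
    return m

x = one_hot("ACGUU")  # hypothetical fragment centered on a candidate Ψ site
```

In PseUdeep this matrix, stacked with the K-tuple frequency and position-specific composition features, is what gets convolved before the capsule and BiGRU layers.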


2018 ◽  
Vol 2018 ◽  
pp. 1-18 ◽  
Author(s):  
Nhat-Duc Hoang

Detection of defects, including cracks and spalls, on the wall surfaces of high-rise buildings is a crucial maintenance task. If left undetected and untreated, these defects can significantly affect the structural integrity and the aesthetic aspect of buildings. Timely and cost-effective methods of building condition survey are of practical need for building owners and maintenance agencies, to replace the time- and labor-consuming approach of manual survey. This study constructs an image processing approach for periodically evaluating the condition of wall structures. Image processing algorithms of steerable filters and projection integrals are employed to extract useful features from digital images. The newly developed model relies on the support vector machine and least squares support vector machine to generalize the classification boundaries that categorize wall conditions into five labels: longitudinal crack, transverse crack, diagonal crack, spall damage, and intact wall. A data set consisting of 500 image samples has been collected to train and test the machine learning-based classifiers. Experimental results point out that the proposed model, which combines image processing and machine learning algorithms, can achieve a good classification performance with a classification accuracy rate of 85.33%. Therefore, the newly developed method can be a promising alternative to assist maintenance agencies in periodic building surveys.
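The projection-integral feature mentioned above is simple enough to sketch: summing pixel intensities along the rows and columns of a (here synthetic, idealized) binary defect map turns an oriented crack into a sharp peak in one projection. The image, crack orientation and naming below are illustrative assumptions, not the paper's data:

```python
import numpy as np

# Synthetic 32x32 binary defect map with an idealized vertical crack.
img = np.zeros((32, 32), dtype=float)
img[:, 15] = 1.0

# Projection integrals: integrate intensity down each column / across each row.
horizontal_projection = img.sum(axis=0)
vertical_projection = img.sum(axis=1)

# A vertical crack concentrates mass in one column of the horizontal projection,
# while spreading evenly across the vertical projection.
crack_column = int(horizontal_projection.argmax())
```

The shapes of these two 1-D profiles (one peaked, one flat; both peaked for a spall; diagonal cracks spreading across both) are the kind of low-dimensional features the SVM classifiers can separate into the five wall-condition labels.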

