An Unsupervised Machine Learning Approach to Assessing Designer Performance During Physical Prototyping

Author(s):  
Matthew L. Dering ◽  
Conrad S. Tucker ◽  
Soundar Kumara

An important part of the engineering design process is prototyping, where designers build and test their designs. This process is typically iterative, time-consuming, and manual in nature. For a given task, there are multiple objects that can be used, each with a different amount of time associated with accomplishing the task. Current methods for reducing time spent during the prototyping process have focused primarily on optimizing designer-to-designer interactions, as opposed to designer-to-tool interactions. Advancements in commercially available sensing systems (e.g., the Kinect) and machine learning algorithms have opened the pathway toward real-time observation of designers' behavior in engineering workspaces during prototype construction. Toward this end, this work hypothesizes that an object O being used for task i is distinguishable from object O being used for task j, where i is the correct task and j is the incorrect task. The contributions of this work are: (i) the ability to recognize these objects in a free-roaming engineering workshop environment and (ii) the ability to distinguish between the correct and incorrect use of objects during a prototyping task. By distinguishing between correct and incorrect uses, incorrect behavior (which often results in wasted time and materials) can be detected and quickly corrected. The method presented in this work learns as designers use objects and infers the proper way to use them during prototyping. To demonstrate the effectiveness of the proposed method, a case study is presented in which participants in an engineering design workshop are asked to perform correct and incorrect tasks with a tool. The participants' movements are analyzed by an unsupervised clustering algorithm to determine whether there is a statistical difference between tasks being performed correctly and incorrectly. Clusters in which incorrect uses form a plurality are found to be significantly distinct for each node considered by the method, each with p ≪ 0.001.
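The abstract does not specify the clustering algorithm or test statistic; the following is only a minimal sketch, assuming k-means clustering of per-node motion features (a hypothetical featurization of Kinect joint data) and a chi-square test of the correct/incorrect label distribution inside plurality-incorrect clusters.

```python
import numpy as np
from sklearn.cluster import KMeans
from scipy.stats import chi2_contingency

# Hypothetical inputs: features has shape (n_observations, n_features), e.g.
# windowed positions/velocities of one skeletal node; labels[i] is 1 if the
# observed task was performed correctly, 0 otherwise.
def test_node(features, labels, n_clusters=5, seed=0):
    """Cluster one node's motion features and test whether observations in
    plurality-incorrect clusters differ from the rest."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    assignments = km.fit_predict(features)

    # A cluster is "plurality incorrect" if incorrect uses outnumber correct ones.
    plurality_incorrect = np.array([
        labels[assignments == c].mean() < 0.5 for c in range(n_clusters)
    ])
    in_bad_cluster = plurality_incorrect[assignments]

    # 2x2 contingency table: cluster type vs. correctness of the observation.
    table = np.array([
        [np.sum(in_bad_cluster & (labels == 0)), np.sum(in_bad_cluster & (labels == 1))],
        [np.sum(~in_bad_cluster & (labels == 0)), np.sum(~in_bad_cluster & (labels == 1))],
    ])
    _, p_value, _, _ = chi2_contingency(table)
    return p_value

# Usage (hypothetical arrays): p = test_node(right_hand_features, task_labels)
```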

2021 ◽  
Vol 13 (11) ◽  
pp. 5823
Author(s):  
Ahmadhon Akbarkhonovich Kamolov ◽  
Suhyun Park

Implementing AI across many fields offers a way to address problems that are difficult for humans to solve, and it is likely to be a key driver of progress in those fields. In the marine domain, specialists also face problems that can be tackled with AI and machine learning algorithms. One of these challenges is determining the depth of the seabed with high precision. Seabed depth is critically important for ships at sea to follow a safe route, so it is crucial that ships do not run aground in shallow water. In this article, we apply the fuzzy c-means (FCM) clustering algorithm, a robust unsupervised learning method in machine learning, to this problem. In the case study, the model is trained on crowdsourced data gathered from vessels equipped with sound navigation and ranging (SONAR) sensors. The data were collected from ships sailing off the southern coast of South Korea. In the training stage, we divided the training zone into small areas (blocks), and the data assembled in each block were clustered with FCM. As a result, the data are separated into clusters that help differentiate the measurements. The results show that FCM can be applied to crowdsourced bathymetry and obtain accurate results.
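The abstract does not give implementation details; a minimal sketch of per-block fuzzy c-means clustering of SONAR depth soundings, assuming the scikit-fuzzy library and hypothetical file names and block size, might look as follows.

```python
import numpy as np
import skfuzzy as fuzz  # scikit-fuzzy; assumed here, not named in the paper

# Hypothetical crowdsourced soundings: columns are longitude, latitude, depth (m).
soundings = np.loadtxt("soundings.csv", delimiter=",")  # shape (n_points, 3)

def cluster_block(block_points, n_clusters=3, fuzzifier=2.0):
    """Run fuzzy c-means on the soundings of one block; returns the cluster
    centres, the fuzzy membership matrix, and the fuzzy partition coefficient."""
    data = block_points.T  # skfuzzy expects features in rows, samples in columns
    cntr, u, _, _, _, _, fpc = fuzz.cluster.cmeans(
        data, c=n_clusters, m=fuzzifier, error=1e-5, maxiter=1000, seed=0)
    return cntr, u, fpc

# Divide the training zone into small blocks on a regular lon/lat grid,
# then cluster each block independently.
block_size = 0.01  # degrees; assumed block dimension
block_ids = np.floor(soundings[:, :2] / block_size).astype(int)
for bid in np.unique(block_ids, axis=0):
    mask = np.all(block_ids == bid, axis=1)
    if mask.sum() >= 10:  # skip sparsely sampled blocks
        centres, memberships, fpc = cluster_block(soundings[mask])
```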


2021 ◽  
Vol 9 (5) ◽  
pp. 1034
Author(s):  
Carlos Sabater ◽  
Lorena Ruiz ◽  
Abelardo Margolles

This study aimed to recover metagenome-assembled genomes (MAGs) from human fecal samples to characterize the glycosidase profiles of Bifidobacterium species exposed to different prebiotic oligosaccharides (galacto-oligosaccharides, fructo-oligosaccharides and human milk oligosaccharides, HMOs) as well as high-fiber diets. A total of 1806 MAGs were recovered from 487 infant and adult metagenomes. Unsupervised and supervised classification of glycosidases encoded in MAGs using machine-learning algorithms made it possible to establish characteristic hydrolytic profiles for B. adolescentis, B. bifidum, B. breve, B. longum and B. pseudocatenulatum, yielding classification rates above 90%. Glycosidase families GH5_44, GH32, and GH110 were characteristic of B. bifidum. The presence or absence of GH1, GH2, GH5 and GH20 was characteristic of B. adolescentis, B. breve and B. pseudocatenulatum, while families GH1 and GH30 were relevant in MAGs from B. longum. These characteristic profiles allowed bifidobacteria to be discriminated regardless of prebiotic exposure. Correlation analysis of glycosidase activities suggests strong associations between glycosidase families comprising HMO-degrading enzymes, which are often found in MAGs from the same species. The mathematical models proposed here may contribute to a better understanding of the carbohydrate metabolism of some common bifidobacteria species and could be extrapolated to other microorganisms of interest in future studies.
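The paper reports classification rates above 90% from supervised learning on glycosidase family profiles; the exact model is not stated in the abstract, so the following sketch simply fits a random forest to a hypothetical table of GH-family gene counts per MAG.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Hypothetical input table: one row per MAG, one column per glycosidase family
# (e.g. GH1, GH2, GH20, GH30, GH32, GH110, GH5_44) holding gene counts,
# plus a "species" column with the Bifidobacterium species label.
profiles = pd.read_csv("mag_glycosidase_profiles.csv")

X = profiles.drop(columns=["species"])
y = profiles["species"]

clf = RandomForestClassifier(n_estimators=500, random_state=0)
scores = cross_val_score(clf, X, y, cv=5)
print(f"mean classification rate: {scores.mean():.2%}")

# Feature importances indicate which GH families discriminate the species,
# analogous to the characteristic profiles described in the abstract.
clf.fit(X, y)
importances = pd.Series(clf.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False).head(10))
```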


2021 ◽  
Vol 11 (9) ◽  
pp. 4251
Author(s):  
Jinsong Zhang ◽  
Shuai Zhang ◽  
Jianhua Zhang ◽  
Zhiliang Wang

In digital microfluidic experiments, droplet characteristics and flow patterns are generally identified and predicted by empirical methods, which make it difficult to mine large amounts of data. In addition, because human intervention is inevitable, inconsistent judgment standards make comparisons between different experiments cumbersome and almost impossible. In this paper, we used machine learning to build algorithms that automatically identify, judge, and predict flow patterns and droplet characteristics, turning the empirical judgment into an intelligent process. Unlike the usual machine learning setups, a generalized variable system was introduced to describe the different geometric configurations of the digital microfluidics. Specifically, Buckingham's π theorem was adopted to obtain multiple groups of dimensionless numbers as the input variables of the machine learning algorithms. In verification, the SVM and BPNN algorithms successfully classified and predicted the different flow patterns and droplet characteristics (length and frequency). Compared with the primitive parameter system, the dimensionless number system was superior in predictive capability. The dimensionless numbers selected for the machine learning algorithms should have strong physical meanings rather than purely mathematical ones. Applying the dimensionless numbers reduced the dimensionality of the system and the amount of computation without losing the information contained in the primitive parameters.
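The abstract describes feeding dimensionless groups obtained via Buckingham's π theorem to SVM and BPNN classifiers. The sketch below illustrates only the SVM part, with hypothetical dimensionless groups (capillary number, flow-rate ratio, aspect ratio) that are assumptions for illustration, not the paper's exact set.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

# Hypothetical primitive parameters per experiment (SI units):
# continuous-phase viscosity mu_c, interfacial tension sigma,
# continuous/dispersed flow rates q_c, q_d, channel width w, height h.
raw = np.load("experiments.npy")         # shape (n_runs, 6), assumed ordering
patterns = np.load("flow_patterns.npy")  # categorical flow-pattern labels

mu_c, sigma, q_c, q_d, w, h = raw.T
u_c = q_c / (w * h)          # mean continuous-phase velocity

# Dimensionless groups assumed for illustration:
Ca = mu_c * u_c / sigma      # capillary number
flow_ratio = q_d / q_c       # flow-rate ratio
aspect = w / h               # geometric ratio
X = np.column_stack([Ca, flow_ratio, aspect])

X_train, X_test, y_train, y_test = train_test_split(
    X, patterns, test_size=0.2, random_state=0)
svm = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=10.0, gamma="scale"))
svm.fit(X_train, y_train)
print("flow-pattern accuracy:", svm.score(X_test, y_test))
```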


Energies ◽  
2021 ◽  
Vol 14 (5) ◽  
pp. 1377
Author(s):  
Musaab I. Magzoub ◽  
Raj Kiran ◽  
Saeed Salehi ◽  
Ibnelwaleed A. Hussein ◽  
Mustafa S. Nasser

The traditional way to mitigate loss circulation in drilling operations is to use preventative and curative materials. However, it is difficult to quantify the amount of materials from every possible combination to produce customized rheological properties. In this study, machine learning (ML) is used to develop a framework to identify material composition for loss circulation applications based on the desired rheological characteristics. The relation between the rheological properties and the mud components for polyacrylamide/polyethyleneimine (PAM/PEI)-based mud is assessed experimentally. Four different ML algorithms were implemented to model the rheological data for various mud components at different concentrations and testing conditions. These four algorithms are (a) k-Nearest Neighbor, (b) Random Forest, (c) Gradient Boosting, and (d) AdaBoost. The Gradient Boosting model showed the highest accuracy (91% and 74% for plastic and apparent viscosity, respectively), which can be further used for hydraulic calculations. Overall, the experimental study presented in this paper, together with the proposed ML-based framework, adds valuable information to the design of PAM/PEI-based mud. The ML models allowed a wide range of rheology assessments for various drilling fluid formulations with a mean accuracy of up to 91%. The case study has shown that with the appropriate combination of materials, reasonable rheological properties could be achieved to prevent loss circulation by managing the equivalent circulating density (ECD).
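As a rough illustration of the ML side of the framework, the sketch below fits a gradient boosting regressor mapping hypothetical mud-composition variables (PAM and PEI concentrations, additive concentration, temperature) to plastic viscosity; the paper's actual features, hyperparameters, and data are not reproduced here.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

# Hypothetical design matrix: PAM conc., PEI conc., additive conc.,
# temperature (degC); target: measured plastic viscosity (cP).
X = np.load("mud_compositions.npy")    # shape (n_samples, 4), assumed
y = np.load("plastic_viscosity.npy")   # shape (n_samples,)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

gbr = GradientBoostingRegressor(n_estimators=300, learning_rate=0.05,
                                max_depth=3, random_state=42)
gbr.fit(X_train, y_train)

pred = gbr.predict(X_test)
print("R^2 on held-out formulations:", r2_score(y_test, pred))
# The same pattern can be repeated for apparent viscosity, and the fitted
# model can be searched over candidate compositions to find a formulation
# whose predicted rheology matches a target, in the spirit of the framework.
```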


2021 ◽  
Vol 27 ◽  
pp. 107602962199118
Author(s):  
Logan Ryan ◽  
Samson Mataraso ◽  
Anna Siefkas ◽  
Emily Pellegrini ◽  
Gina Barnes ◽  
...  

Deep venous thrombosis (DVT) is associated with significant morbidity, mortality, and increased healthcare costs. Standard scoring systems for DVT risk stratification often provide insufficient stratification of hospitalized patients and are unable to accurately predict which inpatients are most likely to present with DVT. There is a continued need for tools which can predict DVT in hospitalized patients. We performed a retrospective study on a database collected from a large academic hospital, comprising 99,237 general ward or ICU patients, 2,378 of whom experienced a DVT during their hospital stay. Gradient boosted machine learning algorithms were developed to predict a patient's risk of developing DVT at 12- and 24-hour windows prior to onset. The primary outcome of interest was diagnosis of in-hospital DVT. The machine learning predictors obtained AUROCs of 0.83 and 0.85 for DVT risk prediction on hospitalized patients at 12- and 24-hour windows, respectively. At both 12 and 24 hours before DVT onset, the most important features for prediction of DVT were cancer history, VTE history, and international normalized ratio (INR). Improved risk stratification may prevent unnecessary invasive testing in patients for whom DVT cannot be ruled out using existing methods. Improved risk stratification may also allow for more targeted use of prophylactic anticoagulants, as well as earlier diagnosis and treatment, preventing the development of pulmonary emboli and other sequelae of DVT.
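The abstract reports gradient boosted predictors evaluated by AUROC at 12- and 24-hour windows; a minimal sketch with a generic gradient boosting classifier and a hypothetical tabular feature file (cancer history, VTE history, INR, vitals, labs) is given below.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Hypothetical feature table: one row per patient-window, with features
# extracted 12 h (or 24 h) before potential DVT onset and a binary label.
data = pd.read_csv("dvt_features_12h.csv")
X = data.drop(columns=["dvt_within_window"])
y = data["dvt_within_window"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

clf = GradientBoostingClassifier(n_estimators=400, learning_rate=0.05, max_depth=3)
clf.fit(X_train, y_train)

probs = clf.predict_proba(X_test)[:, 1]
print("AUROC (12 h window):", roc_auc_score(y_test, probs))

# Feature importances offer a first look at which inputs drive risk,
# analogous to the cancer history, VTE history, and INR findings reported.
print(pd.Series(clf.feature_importances_, index=X.columns).nlargest(5))
```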


2021 ◽  
Vol 14 (1) ◽  
Author(s):  
Jung Eun Huh ◽  
Seunghee Han ◽  
Taeseon Yoon

Abstract Objective In this study, we compare the amino acid and codon sequences of SARS-CoV-2, SARS-CoV, and MERS-CoV using different statistical programs to understand their characteristics. Specifically, we are interested in how differences in the amino acid and codon sequences can lead to different incubation periods and outbreak periods. Our initial aim was to compare SARS-CoV-2 to other viruses in the coronavirus family using the BLAST program of NCBI and machine learning algorithms. Results The experiments using BLAST, Apriori, and Decision Tree showed that SARS-CoV-2 has high similarity with SARS-CoV while having comparatively low similarity with MERS-CoV. We then compared the codons of SARS-CoV-2 and MERS-CoV to examine the difference. Although the viruses are very alike according to the BLAST and Apriori experiments, SVM showed that they can be effectively classified using non-linear kernels. The Decision Tree experiment revealed several notable properties of the SARS-CoV-2 amino acid sequence that cannot be found in the MERS-CoV amino acid sequence. The ultimate purpose of this paper is to minimize the damage to humanity from SARS-CoV-2. Hence, further studies can focus on comparing the SARS-CoV-2 virus with other viruses that can also be transmitted during latent periods.
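The abstract states that an SVM with non-linear kernels separated SARS-CoV-2 from MERS-CoV sequences; a minimal sketch, assuming a hypothetical 2-mer count featurization of amino acid sequences followed by an RBF-kernel SVM (the paper's actual featurization is not given here), is shown below.

```python
from itertools import product
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
KMER_INDEX = {"".join(p): i for i, p in enumerate(product(AMINO_ACIDS, repeat=2))}

def kmer_counts(seq):
    """Normalized overlapping 2-mer counts of an amino acid sequence."""
    vec = np.zeros(len(KMER_INDEX))
    for i in range(len(seq) - 1):
        kmer = seq[i:i + 2]
        if kmer in KMER_INDEX:
            vec[KMER_INDEX[kmer]] += 1
    return vec / max(len(seq) - 1, 1)

def read_fasta(path):
    """Minimal FASTA reader (the file names below are hypothetical)."""
    seqs, cur = [], []
    for line in open(path):
        if line.startswith(">"):
            if cur:
                seqs.append("".join(cur))
            cur = []
        else:
            cur.append(line.strip())
    if cur:
        seqs.append("".join(cur))
    return seqs

sars2 = read_fasta("sars_cov_2_proteins.fasta")  # label 1
mers = read_fasta("mers_cov_proteins.fasta")     # label 0

X = np.array([kmer_counts(s) for s in sars2 + mers])
y = np.array([1] * len(sars2) + [0] * len(mers))

svm = SVC(kernel="rbf", C=1.0, gamma="scale")
print("cross-validated accuracy:", cross_val_score(svm, X, y, cv=5).mean())
```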


2020 ◽  
Author(s):  
Xiao Lai ◽  
Pu Tian

Abstract Supervised machine learning, especially deep learning based on a wide variety of neural network architectures, has contributed tremendously to fields such as marketing, computer vision, and natural language processing. However, the development of unsupervised machine learning algorithms has been a bottleneck of artificial intelligence. Clustering is a fundamental unsupervised task in many different subjects. Unfortunately, no present algorithm is satisfactory for clustering high-dimensional data with strong nonlinear correlations. In this work, we propose a simple and highly efficient hierarchical clustering algorithm based on encoding records as composition rank vectors organized in a tree structure, and demonstrate its utility by clustering protein structural domains. No record comparison, an expensive step common to all present clustering algorithms, is involved. Consequently, it achieves hierarchical clustering with linear time and space complexity and is thus applicable to arbitrarily large datasets. The key factor in this algorithm is the definition of composition, which depends on the physical nature of the target data and therefore needs to be constructed case by case. Nonetheless, the algorithm is general and applicable to any high-dimensional data with strong nonlinear correlations. We hope this algorithm will inspire a rich field of research on encoding-based clustering, well beyond composition rank vector trees.
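The abstract only outlines the idea (encode each record as a composition rank vector, build a tree over those codes, avoid pairwise record comparison), and the definition of composition is data-dependent and not given. The sketch below is therefore only an interpretive illustration under those assumptions: each record's feature composition is reduced to a rank vector, and records sharing a rank-vector prefix fall into the same branch of a trie, yielding a hierarchy without any record-to-record distance computation.

```python
from collections import defaultdict
import numpy as np

def composition_rank_vector(record, n_bins=4):
    """Encode a 1-D feature vector by the ranks of its 'composition': here,
    simply the ordering of binned feature magnitudes by frequency. The real
    definition of composition is data-dependent, as the abstract notes."""
    edges = np.quantile(record, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.digitize(record, edges)
    counts = np.bincount(bins, minlength=n_bins)
    return tuple(int(r) for r in np.argsort(counts)[::-1])

def build_rank_tree(records, n_bins=4):
    """Group records by successive prefixes of their rank vectors. Each prefix
    length is one level of the hierarchy; no pairwise record comparison is
    performed, so the cost is linear in the number of records."""
    tree = defaultdict(list)
    for idx, rec in enumerate(records):
        rv = composition_rank_vector(np.asarray(rec, dtype=float), n_bins)
        for depth in range(1, len(rv) + 1):
            tree[rv[:depth]].append(idx)
    return tree

# Usage: clusters at depth d are the index lists stored under keys of length d.
records = np.random.rand(1000, 32)  # placeholder data
tree = build_rank_tree(records)
top_level_clusters = {k: v for k, v in tree.items() if len(k) == 1}
```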

