A Layered KNN-SVM Approach to Predict Missing Values of Functional Requirements in Product Customization

The conversion from functional requirements (FRs) to design parameters is the foundation of product customization. However, original customer needs usually result in incomplete FRs, because customers do not fully understand the design requirements of these products. As incomplete FRs may undermine subsequent design activities, managers need an effective approach to predict the missing FR values. This study proposes an integrative approach to obtain complete FRs. The k nearest neighbor (KNN) algorithm is employed to predict the missing continuous variables in FRs, using an improved distance formula for two incomplete FRs. Support vector machine (SVM) classifiers are adopted to classify the missing categorical variables in FRs, combined with a directed acyclic graph for multi-class classification. KNN and SVM are then integrated into a multi-layer framework to predict the missing FR values where categorical and continuous variables both exist. A case study on elevator customization is conducted to verify that KNN-SVM can accurately predict elevator FR values. Furthermore, KNN-SVM outperforms five other single methods and five composite methods, with average reductions in root mean squared error (RMSE) of 39% and 21% against KNN and KNN-Tree, respectively.
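
The KNN step for continuous variables can be illustrated with a minimal sketch. The abstract does not give the improved distance formula, so the code below assumes a common stand-in: a Euclidean distance over the features observed in both records, rescaled by the share of features compared. The names `partial_distance` and `knn_impute` and the sample records are hypothetical.

```python
import math

def partial_distance(a, b):
    """Distance between two incomplete records (None marks a missing value).

    Assumption: squared differences are summed over features observed in both
    records and rescaled by len(a) / shared count. This is a generic
    partial-distance heuristic, not the formula from the paper.
    """
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        return math.inf  # nothing to compare on
    squared = sum((x - y) ** 2 for x, y in shared)
    return math.sqrt(squared * len(a) / len(shared))

def knn_impute(target, records, k=3):
    """Fill each missing entry of target with the mean of that feature over
    the k nearest records in which the feature is observed."""
    filled = list(target)
    for j, value in enumerate(target):
        if value is not None:
            continue
        scored = [(partial_distance(target, r), r[j])
                  for r in records if r[j] is not None]
        scored = [s for s in scored if s[0] < math.inf]
        scored.sort(key=lambda s: s[0])
        nearest = [v for _, v in scored[:k]]
        if nearest:
            filled[j] = sum(nearest) / len(nearest)
    return filled

# Hypothetical records with three continuous features each.
records = [
    [1.0, 2.0, 10.0],
    [1.1, 2.1, 12.0],
    [9.0, 9.0, 50.0],
    [1.2, None, 11.0],
]
completed = knn_impute([1.0, 2.0, None], records, k=2)
print(completed)
```

Here the two nearest records are the first two, so the missing third feature is filled with their mean. Categorical variables would need a separate classifier, as the abstract describes.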

An Overview of AI-Assisted Design-on-Simulation Technology for Reliability Life Prediction of Advanced Packaging

Materials ◽

10.3390/ma14185342 ◽

2021 ◽

Vol 14 (18) ◽

pp. 5342

Author(s):

Sunil Kumar Panigrahy ◽

Yi-Chieh Tseng ◽

Bo-Ruei Lai ◽

Kuo-Ning Chiang

Keyword(s):

Neural Network ◽

Artificial Intelligence ◽

Electronic Packaging ◽

Prediction Accuracy ◽

Chip Thickness ◽

Design Parameters ◽

Support Vector ◽

K Nearest Neighbor ◽

Wafer Level ◽

Advanced Packaging

Several design parameters affect the reliability of wafer-level advanced packaging, including upper and lower pad sizes, solder volume, buffer layer thickness, and chip thickness. Conventionally, the accelerated thermal cycling test (ATCT) is used to evaluate the reliability life of electronic packaging; however, optimizing the design parameters through ATCT is time-consuming and expensive, so reducing the number of experiments becomes a critical issue. In recent years, many researchers have adopted finite-element-based design-on-simulation (DoS) technology for the reliability assessment of electronic packaging. DoS technology can shorten the design cycle, reduce costs, and effectively optimize the packaging structure. However, the simulation results depend heavily on the individual researcher and are often inconsistent from one researcher to another. Artificial intelligence (AI) can help researchers avoid the shortcomings of the human factor. This study demonstrates AI-assisted DoS technology by combining artificial intelligence and simulation technologies to predict wafer level package (WLP) reliability. To ensure reliability prediction accuracy, the simulation procedure was validated by several experiments prior to creating a large AI training database. This research studies several machine learning models, including artificial neural network (ANN), recurrent neural network (RNN), support vector regression (SVR), kernel ridge regression (KRR), K-nearest neighbor (KNN), and random forest (RF). These models are evaluated based on prediction accuracy and CPU time consumption.

Decoding continuous variables from EEG data using linear support vector regression (SVR) analysis with the Decision Decoding Toolbox (DDTBOX)

10.1101/2021.05.31.446502 ◽

2021 ◽

Author(s):

Stefan Bode ◽

Daniel Feuerriegel ◽

Elektra Schubert ◽

Hinze Hogendoorn

Keyword(s):

Support Vector Regression ◽

Simulation Study ◽

Continuous Variable ◽

Categorical Variables ◽

Subjective Rating ◽

Support Vector ◽

Response Force ◽

Continuous Variables ◽

Single Trial ◽

Eeg Data

Multivariate classification analysis for non-invasively acquired neuroimaging data is a powerful tool in cognitive neuroscience research. However, an important constraint of such pattern classifiers is that they are restricted to predicting categorical variables (i.e. assigning trials to classes). Here, we present an alternative approach, Support Vector Regression (SVR), which uses single-trial neuroimaging (e.g., EEG or MEG) data to predict a continuous variable of interest such as response time, response force, or any kind of subjective rating (e.g., emotional state, confidence, etc.). We describe how SVR can be used, how it is implemented in the Decision Decoding Toolbox (DDTBOX), and how it has been used in previous research. We then report results from two simulation studies, designed to closely resemble real EEG data, in which we predicted a continuous variable of interest across a range of analysis parameters. In Simulation Study 1, we observed that SVR was effective for analysis windows ranging from 2 ms - 100 ms, and that it was relatively unaffected by temporal averaging. In Simulation Study 2, we showed that prediction was still successful when only a small number of channels encoded information about the output variable, and that it was robust to temporal jitter regarding when that information was present in the EEG. Finally, we reanalysed a previously published dataset of similar size and observed highly comparable results in real EEG data. We conclude that linear SVR is a powerful tool for the investigation of single-trial EEG data in relation to continuous and more nuanced variables, which are not well-captured using classification approaches requiring distinct classes.

CHOOSING APPROPRIATE IMPUTATION METHODS FOR MISSING DATA: A DECISION ALGORITHM ON METHODS FOR MISSING DATA

Journal of Al-Qadisiyah for Computer Science and Mathematics ◽

10.29304/jqcm.2019.11.2.588 ◽

2019 ◽

Vol 11 (2) ◽

pp. 65-73

Author(s):

Wisam A. Mahmood ◽

Mohammed S. Rashid ◽

Teaba Wala Aldeen

Keyword(s):

Missing Data ◽

Simulation Study ◽

Missing Values ◽

Nearest Neighbor ◽

Support Vector ◽

K Nearest Neighbor ◽

Decision Algorithm ◽

Imputation Methods ◽

Regression Imputation ◽

Mean Imputation

Missing values are common in medical research and can introduce substantial bias if they are ignored or handled poorly. Although some standard statistical methods have already been developed for this problem, no method has so far been shown to yield credible estimates. When a dataset contains missing values, the usable data size shrinks and efficiency decreases. Earlier scholarly work has proposed a number of imputation methods for handling missing values. The regular methods include the complete case method, the mean imputation method, the Last Observation Carried Forward (LOCF) method, the Expectation-Maximization (EM) algorithm, Markov Chain Monte Carlo (MCMC), Mean Imputation (Mean), Hot Deck (HOT), Regression Imputation (Regress), K-nearest neighbor (KNN), K-Mean Clustering, Fuzzy K-Mean Clustering, Support Vector Machine, and the Multiple Imputation (MI) method. In the present paper, a simulation study investigates the efficacy of the above-mentioned imputation methods in a longitudinal data setting under missing completely at random (MCAR). We generated missingness in three cases, with a low level of 5% and higher levels of 30% and 50%. From this simulation study, we concluded that the LOCF method had more bias than the other methods in most situations.
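
Two of the simpler methods named above, mean imputation and LOCF, can be sketched as follows. This is an illustrative sketch only: the function names and the sample series are hypothetical, and the study's own simulation design is not reproduced here.

```python
def mean_impute(series):
    """Replace each missing value (None) with the mean of the observed values."""
    observed = [v for v in series if v is not None]
    if not observed:
        return list(series)  # nothing observed, nothing to impute from
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in series]

def locf_impute(series):
    """Last Observation Carried Forward: replace each missing value with the
    most recent observed value. Leading gaps stay None."""
    filled, last = [], None
    for v in series:
        if v is not None:
            last = v
        filled.append(last)
    return filled

# Hypothetical longitudinal measurements for one subject (None = missed visit).
visits = [4.0, None, 6.0, None, None, 8.0]
print(mean_impute(visits))
print(locf_impute(visits))
```

On a series with a trend, LOCF holds values flat across each gap while the true values keep moving, which gives one intuition for the bias the abstract reports for LOCF.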

Filling missing meteorological data with Computational Intelligence methods

ITM Web of Conferences ◽

10.1051/itmconf/20182300015 ◽

2018 ◽

Vol 23 ◽

pp. 00015

Author(s):

Joanna Kajewska-Szkudlarek ◽

Justyna Stańczyk

Keyword(s):

Computational Intelligence ◽

Missing Values ◽

Mean Squared Error ◽

Meteorological Data ◽

Time Of Day ◽

Daily Temperature ◽

Support Vector ◽

Temperature And Humidity ◽

Computational Intelligence Methods ◽

The City

Estimates of temperature and humidity values at a specific time of day, from hourly to monthly profiles, are needed for a number of environmental, ecological, agricultural and technical applications, ranging from natural hazard assessment and crop growth forecasting to the design of solar energy systems. In climatology, they constitute the basis for drawing conclusions about climate variability. Data used in such analyses should be complete and reliable. Therefore, effective methods for filling missing values are sought. The initial scope of this research is to investigate the efficiency of computational intelligence methods in filling missing daily temperature and humidity values. For this reason, a number of experiments have been conducted with Artificial Neural Networks and Support Vector Regression using meteorological data from the city of Wroclaw in Poland. The performance of these methods has been evaluated using standard statistical indicators, such as Correlation Coefficient and Root Mean Squared Error. Finally, certain computational intelligence techniques are proposed that can be used to predict daily temperature and humidity values more accurately in order to fill the missing data.
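
The two indicators named above can be computed as in the sketch below. It assumes the correlation coefficient is Pearson's r, which the abstract does not state explicitly. The temperature values are made up for illustration.

```python
import math

def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def pearson_r(x, y):
    """Pearson correlation coefficient (assumed to be the indicator meant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)

# Hypothetical daily temperatures in degrees Celsius: held-out truth vs. filled values.
observed = [10.0, 12.0, 14.0, 16.0]
predicted = [11.0, 11.0, 15.0, 15.0]
error = rmse(observed, predicted)
correlation = pearson_r(observed, predicted)
print(error, correlation)
```

RMSE is expressed in the units of the variable and penalizes large misses, whereas the correlation coefficient only measures how well the filled values track the pattern of the observed ones, so the two are usually reported together.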

Optimizing Error Rate in Intrusion Detection System Using Artificial Neural Network Algorithm

International Journal of Emerging Research in Management and Technology ◽

10.23956/ijermt.v6i9.102 ◽

2018 ◽

Vol 6 (9) ◽

pp. 152

Author(s):

S. Vijaya Rani ◽

G. N. K. Suresh Babu

Keyword(s):

Neural Network ◽

Artificial Neural Network ◽

Intrusion Detection ◽

Error Rate ◽

Learning Process ◽

Nearest Neighbor ◽

Detection System ◽

Support Vector ◽

K Nearest Neighbor ◽

Artificial Neural

Illegal hackers penetrate the servers and networks of corporate and financial institutions to gain money and extract vital information. The hacking ranges from a single computing system to many systems. Hackers gain access by sending malicious packets into the network through viruses, worms, Trojan horses, etc. They scan a network with various tools and collect information about the network and its hosts. Hence, it is essential to detect attacks as they enter a network. The methods available for intrusion detection are Naive Bayes, Decision Tree, Support Vector Machine, K-Nearest Neighbor, and Artificial Neural Networks. A neural network consists of processing units connected in a complex manner and is able to store information and make it available for use. It acts like the human brain and acquires knowledge from the environment through a training and learning process. Many algorithms are available for the learning process. This work analyzes malicious packets and predicts the error rate in the detection of injured packets using artificial neural network algorithms.

Preoperative MRI findings and prediction of diagnostic utility of foramen ovale electrodes

Journal of Neurosurgery ◽

10.3171/2018.12.jns182093 ◽

2020 ◽

Vol 132 (3) ◽

pp. 692-699 ◽

Cited By ~ 1

Author(s):

Sarah K. Bick ◽

Marjan S. Dolatshahi ◽

Benjamin L. Grannan ◽

Andrew J. Cole ◽

Daniel B. Hoch ◽

...

Keyword(s):

Minimally Invasive ◽

Treatment Decision ◽

Temporal Lobectomy ◽

Categorical Variables ◽

Continuous Variables ◽

Foramen Ovale ◽

Mri Findings ◽

Noninvasive Methods ◽

Diagnostic Modality ◽

Diagnostic Investigations

OBJECTIVE: Foramen ovale electrodes (FOEs) are a minimally invasive method to localize mesial temporal seizures in cases in which noninvasive methods are inconclusive. The objective of this study was to identify factors predicting the ability of FOEs to yield a diagnosis in order to determine optimal candidates for this procedure. METHODS: All cases of diagnostic investigations performed with FOEs at the authors’ institution between 2005 and 2017 were reviewed. FOE investigation was defined as diagnostic if it led to a treatment decision. Demographic and clinical variables for diagnostic and nondiagnostic investigations were compared using a Wilcoxon rank-sum test for continuous variables and Fisher’s exact test for categorical variables. RESULTS: Ninety-three patients underwent investigations performed with FOEs during the study period and were included in the study. FOE investigation was diagnostic in 75.3% of cases. Of patients who underwent anterior temporal lobectomy following diagnostic FOE evaluation, 75.9% were Engel class I at last follow-up (average 40.1 months). When the diagnostic and nondiagnostic FOE groups were compared, patients who had diagnostic investigations were more likely to be male (57.1% male vs 26.1% in the nondiagnostic group, p = 0.015). They were also more likely to have temporal lesions on preoperative MRI (p = 0.018). CONCLUSIONS: FOEs are a useful, minimally invasive diagnostic modality resulting in a treatment decision in 75% of cases. Male patients and patients with temporal lesions on MRI may be most likely to benefit from FOE investigation.
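
For readers unfamiliar with the categorical comparison used here, the sketch below shows how a two-sided Fisher's exact test on a 2x2 table can be computed. The function name and the counts are hypothetical and are not the study's data. The two-sided rule used (summing all tables no more probable than the observed one) is one common convention and is assumed here.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    With all margins fixed, the top-left cell follows a hypergeometric
    distribution. The p-value sums the probabilities of every table that is
    no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_ways = comb(row1 + row2, col1)

    def prob(k):
        return comb(row1, k) * comb(row2, col1 - k) / total_ways

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # A small relative tolerance guards against floating-point ties.
    return sum(prob(k) for k in range(lowest, highest + 1)
               if prob(k) <= p_observed * (1 + 1e-9))

# Hypothetical counts: rows = group A / group B, columns = trait present / absent.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(p_value)
```

The exact test is preferred over a chi-squared approximation when some cell counts are small, which is typical for subgroup comparisons in a cohort of this size.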

Big Data to Knowledge: Application of Machine Learning to Predictive Modeling of Therapeutic Response in Cancer.

Current Genomics ◽

10.2174/1389202921999201224110101 ◽

2020 ◽

Vol 21 ◽

Author(s):

Sukanya Panja ◽

Sarra Rahem ◽

Cassandra J. Chu ◽

Antonina Mitrofanova

Keyword(s):

Machine Learning ◽

Missing Values ◽

Therapeutic Response ◽

Patient Data ◽

Machine Learning Techniques ◽

Support Vector ◽

Complex Data ◽

Human Machine Interaction ◽

Data Repositories ◽

Response Modeling

Background: In recent years, the availability of high throughput technologies, establishment of large molecular patient data repositories, and advancement in computing power and storage have allowed elucidation of complex mechanisms implicated in therapeutic response in cancer patients. The breadth and depth of such data, alongside experimental noise and missing values, require a sophisticated human-machine interaction that would allow effective learning from complex data and accurate forecasting of future outcomes, ideally embedded in the core of machine learning design. Objective: In this review, we will discuss machine learning techniques utilized for modeling of treatment response in cancer, including Random Forests, support vector machines, neural networks, and linear and logistic regression. We will overview their mathematical foundations and discuss their limitations and alternative approaches, all in light of their application to therapeutic response modeling in cancer. Conclusion: We hypothesize that the increase in the number of patient profiles and potential temporal monitoring of patient data will define even more complex techniques, such as deep learning and causal analysis, as central players in therapeutic response modeling.

Application of Machine Learning Approaches for the Design and Study of Anticancer Drugs

Current Drug Targets ◽

10.2174/1389450119666180809122244 ◽

2019 ◽

Vol 20 (5) ◽

pp. 488-500 ◽

Cited By ~ 6

Author(s):

Yan Hu ◽

Yi Lu ◽

Shuo Wang ◽

Mengying Zhang ◽

Xiaosheng Qu ◽

...

Keyword(s):

Machine Learning ◽

Drug Design ◽

Anticancer Drugs ◽

Nearest Neighbor ◽

Cost Effective ◽

Support Vector ◽

Learning Approaches ◽

K Nearest Neighbor ◽

Activity Prediction ◽

Linear Discriminant

Background: Globally, the number of cancer patients and deaths continues to increase yearly, and cancer has therefore become one of the world's leading causes of morbidity and mortality. In recent years, the study of anticancer drugs has become one of the most popular medical topics. Objective: In this review, in order to study the application of machine learning in predicting anticancer drug activity, some machine learning approaches such as Linear Discriminant Analysis (LDA), Principal Components Analysis (PCA), Support Vector Machine (SVM), Random Forest (RF), k-Nearest Neighbor (kNN), and Naïve Bayes (NB) were selected, and examples of their applications in anticancer drug design are listed. Results: Machine learning contributes substantially to anticancer drug design and helps researchers by saving time and cost. However, it can only be an assisting tool for drug design. Conclusion: This paper introduces the application of machine learning approaches in anticancer drug design. Many successful examples of identification and prediction of anticancer drug activity are discussed, and anticancer drug research is still in active progress. Moreover, the merits of some web servers related to anticancer drugs are mentioned.

Rapid Determining Contents of the Rhubarb Anthraquinones Compounds by Support Vector Machines Modeling based on Near Infrared Spectra

Current Analytical Chemistry ◽

10.2174/1573411016666200317111412 ◽

2020 ◽

Vol 16 ◽

Author(s):

Linqi Liu ◽

JInhua Luo ◽

Chenxi Zhao ◽

Bingxue Zhang ◽

Wei Fan ◽

...

Keyword(s):

Infrared Spectra ◽

Near Infrared ◽

Mean Squared Error ◽

Rapid Determination ◽

Partial Least Square ◽

Least Square ◽

Support Vector ◽

Near Infrared Spectra ◽

Aloe Emodin ◽

Relative Differences

BACKGROUND: Measuring medicinal compounds to evaluate their quality and efficacy has been recognized as a useful approach in treatment. Rhubarb anthraquinone compounds (mainly aloe-emodin, rhein, emodin, chrysophanol and physcion) are the main effective components of rhubarb as a purgative drug. In the current Chinese Pharmacopoeia, the total anthraquinone content is designated as its quantitative quality control index, while the content of each compound has not been specified. METHODS: On the basis of forty rhubarb samples, correlation models between the near infrared spectra and UPLC analysis data were constructed using support vector machine (SVM) and partial least square (PLS) methods, with the Kennard and Stone algorithm used to divide the calibration and prediction datasets. Good models have high correlation coefficients (R²) and low root mean squared error of prediction (RMSEP) values. RESULTS: The models constructed by SVM perform much better than those constructed by PLS. The SVM models have high R² values of 0.8951, 0.9738, 0.9849, 0.9779, 0.9411 and 0.9862 for aloe-emodin, rhein, emodin, chrysophanol, physcion and total anthraquinone contents, respectively. The corresponding RMSEPs are 0.3592, 0.4182, 0.4508, 0.7121, 0.8365 and 1.7910, respectively. 75% of the predicted results have relative differences lower than 10%. For rhein and total anthraquinones, all of the predicted results have relative differences lower than 10%. CONCLUSION: The nonlinear models constructed by SVM performed well, with predicted values close to the experimental values. This approach enables the rapid determination of the main medicinal ingredients in rhubarb medicinal materials.

Predicting Future Occurrence of Acute Hypotensive Episodes Using Noninvasive and Invasive Features

Military Medicine ◽

10.1093/milmed/usaa418 ◽

2021 ◽

Vol 186 (Supplement_1) ◽

pp. 445-451

Author(s):

Yifei Sun ◽

Navid Rashedi ◽

Vikrant Vaze ◽

Parikshit Shah ◽

Ryan Halter ◽

...

Keyword(s):

Machine Learning ◽

Support Vector Machine ◽

Real World ◽

Short Term Memory ◽

Model Performance ◽

Learning Technologies ◽

Machine Learning Algorithms ◽

Support Vector ◽

K Nearest Neighbor ◽

Continuous Map

Introduction: Early prediction of the acute hypotensive episode (AHE) in critically ill patients has the potential to improve outcomes. In this study, we apply different machine learning algorithms to the MIMIC III Physionet dataset, containing more than 60,000 real-world intensive care unit records, to test commonly used machine learning technologies and compare their performances. Materials and Methods: Five classification methods including K-nearest neighbor, logistic regression, support vector machine, random forest, and a deep learning method called long short-term memory are applied to predict an AHE 30 minutes in advance. An analysis comparing model performance when including versus excluding invasive features was conducted. To further study the pattern of the underlying mean arterial pressure (MAP), we apply a regression method to predict the continuous MAP values using linear regression over the next 60 minutes. Results: Support vector machine yields the best performance in terms of recall (84%). Including the invasive features in the classification improves the performance significantly with both recall and precision increasing by more than 20 percentage points. We were able to predict the MAP with a root mean square error (a frequently used measure of the differences between the predicted values and the observed values) of 10 mmHg 60 minutes in the future. After converting continuous MAP predictions into AHE binary predictions, we achieve a 91% recall and 68% precision. In addition to predicting AHE, the MAP predictions provide clinically useful information regarding the timing and severity of the AHE occurrence. Conclusion: We were able to predict AHE with precision and recall above 80% 30 minutes in advance with the large real-world dataset. The prediction of regression model can provide a more fine-grained, interpretable signal to practitioners. Model performance is improved by the inclusion of invasive features in predicting AHE, when compared to predicting the AHE based on only the available, restricted set of noninvasive technologies. This demonstrates the importance of exploring more noninvasive technologies for AHE prediction.
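
The step of converting continuous MAP predictions into binary AHE predictions and scoring them by recall and precision can be sketched as below. The abstract does not state the AHE definition, so the 60 mmHg cutoff, the function names, and the sample values are all assumptions made for illustration.

```python
AHE_THRESHOLD_MMHG = 60.0  # hypothetical cutoff, not taken from the paper

def to_ahe_flags(map_values, threshold=AHE_THRESHOLD_MMHG):
    """Flag an episode wherever the predicted MAP falls below the threshold."""
    return [value < threshold for value in map_values]

def precision_recall(actual, predicted):
    """Precision and recall for binary labels (True = episode)."""
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical predicted MAP values (mmHg) and the observed episode labels.
predicted_map = [55.0, 72.0, 58.0, 64.0, 59.0]
observed_ahe = [True, False, False, False, True]

flags = to_ahe_flags(predicted_map)
precision, recall = precision_recall(observed_ahe, flags)
print(precision, recall)
```

In this toy example every true episode is caught, but one false alarm lowers precision. That is the same direction of trade-off as the 91% recall and 68% precision reported for the thresholded regression output.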
