THE IMPACT OF ARTIFICIAL NEURAL NETWORK’S STRUCTURE ON ITS EFFICIENCY FOR FINANCIAL INDICATORS FORECASTING

The article investigates the influence of artificial neural network’s structure on the results, with example of multlayer perceptron for forecasting some of the financial indicators. Multiple tests were made with various networks structures: different numbers of hidden layers and different numbers of neurons in these layers. Based on tests results, the increase of network’s size is effective to a certain extent, but at some point the further size increase is unreasonable. Also, the test results demonstrate that overfitting problem for multilayer perceptron is not as crucial as for the other machine learning models, such as regression. Key words: artificial neural networks, forecasting, multlayer perceptron, overfitting, artificial neural netwok’s size.

Download Full-text

Prediction of Head Movement in 360-Degree Videos Using Attention Model

Sensors ◽

10.3390/s21113678 ◽

2021 ◽

Vol 21 (11) ◽

pp. 3678

Author(s):

Dongwon Lee ◽

Minji Choi ◽

Joohyun Lee

Keyword(s):

Machine Learning ◽

Short Term Memory ◽

Moving Average ◽

The Other ◽

Learning Models ◽

Short Term ◽

Term Memory ◽

Attention Model ◽

Long Short Term Memory ◽

Machine Learning Models

In this paper, we propose a prediction algorithm, the combination of Long Short-Term Memory (LSTM) and attention model, based on machine learning models to predict the vision coordinates when watching 360-degree videos in a Virtual Reality (VR) or Augmented Reality (AR) system. Predicting the vision coordinates while video streaming is important when the network condition is degraded. However, the traditional prediction models such as Moving Average (MA) and Autoregression Moving Average (ARMA) are linear so they cannot consider the nonlinear relationship. Therefore, machine learning models based on deep learning are recently used for nonlinear predictions. We use the Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU) neural network methods, originated in Recurrent Neural Networks (RNN), and predict the head position in the 360-degree videos. Therefore, we adopt the attention model to LSTM to make more accurate results. We also compare the performance of the proposed model with the other machine learning models such as Multi-Layer Perceptron (MLP) and RNN using the root mean squared error (RMSE) of predicted and real coordinates. We demonstrate that our model can predict the vision coordinates more accurately than the other models in various videos.

Download Full-text

O-203 Application of machine learning to predict aneuploidy and mosaicism in embryos from in vitro fertilization (IVF) cycles

Human Reproduction ◽

10.1093/humrep/deab128.014 ◽

2021 ◽

Vol 36 (Supplement_1) ◽

Author(s):

J A Ortiz ◽

R Morales ◽

B Lledo ◽

E Garcia-Hernandez ◽

A Cascales ◽

...

Keyword(s):

Machine Learning ◽

Predictive Model ◽

Predictive Models ◽

Maternal Age ◽

The Other ◽

Predictor Variables ◽

Learning Models ◽

Male Factor ◽

Factors Associated ◽

Machine Learning Models

Abstract Study question Is it possible to predict the likelihood of an IVF embryo being aneuploid and/or mosaic using a machine learning algorithm? Summary answer There are paternal, maternal, embryonic and IVF-cycle factors that are associated with embryonic chromosomal status that can be used as predictors in machine learning models. What is known already The factors associated with embryonic aneuploidy have been extensively studied. Mostly maternal age and to a lesser extent male factor and ovarian stimulation have been related to the occurrence of chromosomal alterations in the embryo. On the other hand, the main factors that may increase the incidence of embryo mosaicism have not yet been established. The models obtained using classical statistical methods to predict embryonic aneuploidy and mosaicism are not of high reliability. As an alternative to traditional methods, different machine and deep learning algorithms are being used to generate predictive models in different areas of medicine, including human reproduction. Study design, size, duration The study design is observational and retrospective. A total of 4654 embryos from 1558 PGT-A cycles were included (January-2017 to December-2020). The trophoectoderm biopsies on D5, D6 or D7 blastocysts were analysed by NGS. Embryos with ≤25% aneuploid cells were considered euploid, between 25-50% were classified as mosaic and aneuploid with >50%. The variables of the PGT-A were recorded in a database from which predictive models of embryonic aneuploidy and mosaicism were developed. Participants/materials, setting, methods The main indications for PGT-A were advanced maternal age, abnormal sperm FISH and recurrent miscarriage or implantation failure. Embryo analysis were performed using Veriseq-NGS (Illumina). The software used to carry out all the analysis was R (RStudio). The library used to implement the different algorithms was caret. In the machine learning models, 22 predictor variables were introduced, which can be classified into 4 categories: maternal, paternal, embryonic and those specific to the IVF cycle. Main results and the role of chance The different couple, embryo and stimulation cycle variables were recorded in a database (22 predictor variables). Two different predictive models were performed, one for aneuploidy and the other for mosaicism. The predictor variable was of multi-class type since it included the segmental and whole chromosome alteration categories. The dataframe were first preprocessed and the different classes to be predicted were balanced. A 80% of the data were used for training the model and 20% were reserved for further testing. The classification algorithms applied include multinomial regression, neural networks, support vector machines, neighborhood-based methods, classification trees, gradient boosting, ensemble methods, Bayesian and discriminant analysis-based methods. The algorithms were optimized by minimizing the Log_Loss that measures accuracy but penalizing misclassifications. The best predictive models were achieved with the XG-Boost and random forest algorithms. The AUC of the predictive model for aneuploidy was 80.8% (Log_Loss 1.028) and for mosaicism 84.1% (Log_Loss: 0.929). The best predictor variables of the models were maternal age, embryo quality, day of biopsy and whether or not the couple had a history of pregnancies with chromosomopathies. The male factor only played a relevant role in the mosaicism model but not in the aneuploidy model. Limitations, reasons for caution Although the predictive models obtained can be very useful to know the probabilities of achieving euploid embryos in an IVF cycle, increasing the sample size and including additional variables could improve the models and thus increase their predictive capacity. Wider implications of the findings Machine learning can be a very useful tool in reproductive medicine since it can allow the determination of factors associated with embryonic aneuploidies and mosaicism in order to establish a predictive model for both. To identify couples at risk of embryo aneuploidy/mosaicism could benefit them of the use of PGT-A. Trial registration number Not Applicable

Download Full-text

Implementing clinical decision support for oncology advanced care planning: A systems engineering framework to optimize the usability and utility of a machine learning predictive model in clinical practice.

Journal of Clinical Oncology ◽

10.1200/jco.2020.39.28_suppl.330 ◽

2021 ◽

Vol 39 (28_suppl) ◽

pp. 330-330

Author(s):

Teja Ganta ◽

Stephanie Lehrman ◽

Rachel Pappalardo ◽

Madalene Crow ◽

Meagan Will ◽

...

Keyword(s):

Machine Learning ◽

High Risk ◽

Predictive Model ◽

Systems Engineering ◽

Care Planning ◽

Learning Models ◽

Predictive Tool ◽

Risk Of Death ◽

The Impact ◽

Machine Learning Models

330 Background: Machine learning models are well-positioned to transform cancer care delivery by providing oncologists with more accurate or accessible information to augment clinical decisions. Many machine learning projects, however, focus on model accuracy without considering the impact of using the model in real-world settings and rarely carry forward to clinical implementation. We present a human-centered systems engineering approach to address clinical problems with workflow interventions utilizing machine learning algorithms. Methods: We aimed to develop a mortality predictive tool, using a Random Forest algorithm, to identify oncology patients at high risk of death within 30 days to move advance care planning (ACP) discussions earlier in the illness trajectory. First, a project sponsor defined the clinical need and requirements of an intervention. The data scientists developed the predictive algorithm using data available in the electronic health record (EHR). A multidisciplinary workgroup was assembled including oncology physicians, advanced practice providers, nurses, social workers, chaplain, clinical informaticists, and data scientists. Meeting bi-monthly, the group utilized human-centered design (HCD) methods to understand clinical workflows and identify points of intervention. The workgroup completed a workflow redesign workshop, a 90-minute facilitated group discussion, to integrate the model in a future state workflow. An EHR (Epic) analyst built the user interface to support the intervention per the group’s requirements. The workflow was piloted in thoracic oncology and bone marrow transplant with plans to scale to other cancer clinics. Results: Our predictive model performance on test data was acceptable (sensitivity 75%, specificity 75%, F-1 score 0.71, AUC 0.82). The workgroup identified a “quality of life coordinator” who: reviews an EHR report of patients scheduled in the upcoming 7 days who have a high risk of 30-day mortality; works with the oncology team to determine ACP clinical appropriateness; documents the need for ACP; identifies potential referrals to supportive oncology, social work, or chaplain; and coordinates the oncology appointment. The oncologist receives a reminder on the day of the patient’s scheduled visit. Conclusions: This workgroup is a viable approach that can be replicated at institutions to address clinical needs and realize the full potential of machine learning models in healthcare. The next steps for this project are to address end-user feedback from the pilot, expand the intervention to other cancer disease groups, and track clinical metrics.

Download Full-text

Grid search in hyperparameter optimization of machine learning models for prediction of HIV/AIDS test results

International Journal of Computers and Applications ◽

10.1080/1206212x.2021.1974663 ◽

2021 ◽

pp. 1-12

Author(s):

Daniel Mesafint Belete ◽

Manjaiah D. Huchaiah

Keyword(s):

Machine Learning ◽

Grid Search ◽

Test Results ◽

Learning Models ◽

Hyperparameter Optimization ◽

Machine Learning Models ◽

Hiv Aids

Download Full-text

Influence of social determinants of health and county vaccination rates on machine learning models to predict COVID-19 case growth in Tennessee

BMJ Health & Care Informatics ◽

10.1136/bmjhci-2021-100439 ◽

2021 ◽

Vol 28 (1) ◽

pp. e100439

Author(s):

Lukasz S Wylezinski ◽

Coleman R Harris ◽

Cody N Heiser ◽

Jamieson D Gray ◽

Charles F Spurlock

Keyword(s):

Machine Learning ◽

Risk Factors ◽

Social Determinants Of Health ◽

Social Determinants ◽

Determinants Of Health ◽

Learning Models ◽

Vaccination Rates ◽

Data Framework ◽

The Impact ◽

Machine Learning Models

IntroductionThe SARS-CoV-2 (COVID-19) pandemic has exposed health disparities throughout the USA, particularly among racial and ethnic minorities. As a result, there is a need for data-driven approaches to pinpoint the unique constellation of clinical and social determinants of health (SDOH) risk factors that give rise to poor patient outcomes following infection in US communities.MethodsWe combined county-level COVID-19 testing data, COVID-19 vaccination rates and SDOH information in Tennessee. Between February and May 2021, we trained machine learning models on a semimonthly basis using these datasets to predict COVID-19 incidence in Tennessee counties. We then analyzed SDOH data features at each time point to rank the impact of each feature on model performance.ResultsOur results indicate that COVID-19 vaccination rates play a crucial role in determining future COVID-19 disease risk. Beginning in mid-March 2021, higher vaccination rates significantly correlated with lower COVID-19 case growth predictions. Further, as the relative importance of COVID-19 vaccination data features grew, demographic SDOH features such as age, race and ethnicity decreased while the impact of socioeconomic and environmental factors, including access to healthcare and transportation, increased.ConclusionIncorporating a data framework to track the evolving patterns of community-level SDOH risk factors could provide policy-makers with additional data resources to improve health equity and resilience to future public health emergencies.

Download Full-text

A machine learning approach to inform developmental milestone achievement for children with autism (Preprint)

10.2196/preprints.29242 ◽

2021 ◽

Author(s):

Munirul M. Haque ◽

Masud Rabbani ◽

Dipranjan Das Dipal ◽

Md Ishrak Islam Zarif ◽

Anik Iqbal ◽

...

Keyword(s):

Machine Learning ◽

Autism Spectrum Disorder ◽

Children With Autism ◽

Autism Spectrum ◽

The Other ◽

Supervised Machine Learning ◽

Learning Models ◽

Children With Asd ◽

Socio Demographic Factors ◽

Machine Learning Models

BACKGROUND Care for children with autism spectrum disorder (ASD) can be challenging for families and medical care systems. This is especially true in Low-and-Middle-Income-countries (LMIC) like Bangladesh. To improve family-practitioner communication and developmental monitoring of children with ASD, [spell out] (mCARE) was developed. Within this study, mCARE was used to track child milestone achievement and family socio-demographic assets to inform mCARE feasibility/scalability and family-asset informed practitioner recommendations. OBJECTIVE The objectives of this paper are three-fold. First, document how mCARE can be used to monitor child milestone achievement. Second, demonstrate how advanced machine learning models can inform our understanding of milestone achievement in children with ASD. Third, describe family/child socio-demographic factors that are associated with earlier milestone achievement in children with ASD (across five machine learning models). METHODS Using mCARE collected data, this study assessed milestone achievement in 300 children with ASD from Bangladesh. In this study, we used four supervised machine learning (ML) algorithms (Decision Tree, Logistic Regression, k-Nearest Neighbors, Artificial Neural Network) and one unsupervised machine learning (K-means Clustering) to build models of milestone achievement based on family/child socio-demographic details. For analyses, the sample was randomly divided in half to train the ML models and then their accuracy was estimated based on the other half of the sample. Each model was specified for the following milestones: Brushes teeth, Asks to use the toilet, Urinates in the toilet or potty, and Buttons large buttons. RESULTS This study aimed to find a suitable machine learning algorithm for milestone prediction/achievement for children with ASD using family/child socio-demographic characteristics. For, Brushes teeth, the three supervised machine learning models met or exceeded an accuracy of 95% with Logistic Regression, KNN, and ANN as the most robust socio-demographic predictors. For Asks to use toilet, 84.00% accuracy was achieved with the KNN and ANN models. For these models, the family socio-demographic predictors of “family expenditure” and “parents’ age” accounted for most of the model variability. The last two parameters, Urinates in toilet or potty and Buttons large buttons had an accuracy of 91.00% and 76.00%, respectively, in ANN. Overall, the ANN had a higher accuracy (Above ~80% on average) among the other algorithms for all the parameters. Across the models and milestones, “family expenditure”, “family size/ type”, “living places” and “parent’s age and occupation” were the most influential family/child socio-demographic factors. CONCLUSIONS mCARE was successfully deployed in an LMIC (i.e., Bangladesh), allowing parents and care-practitioners a mechanism to share detailed information on child milestones achievement. Using advanced modeling techniques this study demonstrates how family/child socio-demographic elements can inform child milestone achievement. Specifically, families with fewer socio-demographic resources reported later milestone attainment. Developmental science theories highlight how family/systems can directly influence child development and this study provides a clear link between family resources and child developmental progress. Clinical implications for this work could include supporting the larger family system to improve child milestone achievement. CLINICALTRIAL We took the IRB from Marquette University Institutional Review Board on July 9, 2020, with the protocol number HR-1803022959, and titled “MOBILE-BASED CARE FOR CHILDREN WITH AUTISM SPECTRUM DISORDER USING REMOTE EXPERIENCE SAMPLING METHOD (MCARE)” for recruiting a total of 316 subjects, of which we recruited 300. (Details description of participants in Methods section)

Download Full-text

Intra-domain and cross-domain transfer learning for time series

10.5194/egusphere-egu21-12142 ◽

2021 ◽

Author(s):

Erik Otović ◽

Marko Njirjak ◽

Dario Jozinović ◽

Goran Mauša ◽

Alberto Michelini ◽

...

Keyword(s):

Machine Learning ◽

Time Series ◽

Transfer Learning ◽

Time Series Data ◽

The Other ◽

Series Data ◽

Sound Recognition ◽

Transfer Of Knowledge ◽

Learning Models ◽

Machine Learning Models

In this study, we compared the performance of machine learning models trained using transfer learning and those that were trained from scratch - on time series data. Four machine learning models were used for the experiment. Two models were taken from the field of seismology, and the other two are general-purpose models for working with time series data. The accuracy of selected models was systematically observed and analyzed when switching within the same domain of application (seismology), as well as between mutually different domains of application (seismology, speech, medicine, finance). In seismology, we used two databases of local earthquakes (one in counts, and the other with the instrument response removed) and a database of global earthquakes for predicting earthquake magnitude; other datasets targeted classifying spoken words (speech), predicting stock prices (finance) and classifying muscle movement from EMG signals (medicine). In practice, it is very demanding and sometimes impossible to collect datasets of tagged data large enough to successfully train a machine learning model. Therefore, in our experiment, we use reduced data sets of 1,500 and 9,000 data instances to mimic such conditions. Using the same scaled-down datasets, we trained two sets of machine learning models: those that used transfer learning for training and those that were trained from scratch. We compared the performances between pairs of models in order to draw conclusions about the utility of transfer learning. In order to confirm the validity of the obtained results, we repeated the experiments several times and applied statistical tests to confirm the significance of the results. The study shows when, within the set experimental framework, the transfer of knowledge brought improvements in terms of model accuracy and in terms of model convergence rate. Our results show that it is possible to achieve better performance and faster convergence by transferring knowledge from the domain of global earthquakes to the domain of local earthquakes; sometimes also vice versa. However, improvements in seismology can sometimes also be achieved by transferring knowledge from medical and audio domains. The results show that the transfer of knowledge between other domains brought even more significant improvements, compared to those within the field of seismology. For example, it has been shown that models in the field of sound recognition have achieved much better performance compared to classical models and that the domain of sound recognition is very compatible with knowledge from other domains. We came to similar conclusions for the domains of medicine and finance. Ultimately, the paper offers suggestions when transfer learning is useful, and the explanations offered can provide a good starting point for knowledge transfer using time series data.

Download Full-text

Practical Considerations for Accuracy Evaluation in Sensor-Based Machine Learning and Deep Learning

Sensors ◽

10.3390/s19163491 ◽

2019 ◽

Vol 19 (16) ◽

pp. 3491 ◽

Cited By ~ 1

Author(s):

Issam Hammad ◽

Kamal El-Sankary

Keyword(s):

Machine Learning ◽

Thermal Noise ◽

Error Resilience ◽

Sensor Data ◽

Accuracy Evaluation ◽

Sensor Failure ◽

Learning Models ◽

Analog To Digital ◽

The Impact ◽

Machine Learning Models

Accuracy evaluation in machine learning is based on the split of data into a training set and a test set. This critical step is applied to develop machine learning models including models based on sensor data. For sensor-based problems, comparing the accuracy of machine learning models using the train/test split provides only a baseline comparison in ideal situations. Such comparisons won’t consider practical production problems that can impact the inference accuracy such as the sensors’ thermal noise, performance with lower inference quantization, and tolerance to sensor failure. Therefore, this paper proposes a set of practical tests that can be applied when comparing the accuracy of machine learning models for sensor-based problems. First, the impact of the sensors’ thermal noise on the models’ inference accuracy was simulated. Machine learning algorithms have different levels of error resilience to thermal noise, as will be presented. Second, the models’ accuracy using lower inference quantization was compared. Lowering inference quantization leads to lowering the analog-to-digital converter (ADC) resolution which is cost-effective in embedded designs. Moreover, in custom designs, analog-to-digital converters’ (ADCs) effective number of bits (ENOB) is usually lower than the ideal number of bits due to various design factors. Therefore, it is practical to compare models’ accuracy using lower inference quantization. Third, the models’ accuracy tolerance to sensor failure was evaluated and compared. For this study, University of California Irvine (UCI) ‘Daily and Sports Activities’ dataset was used to present these practical tests and their impact on model selection.

Download Full-text

Identification of Primary Antimicrobial Resistance Drivers in Agricultural Nontyphoidal Salmonella enterica Serovars by Using Machine Learning

mSystems ◽

10.1128/msystems.00211-19 ◽

2019 ◽

Vol 4 (4) ◽

Cited By ~ 1

Author(s):

Finlay Maguire ◽

Muhammad Attiq Rehman ◽

Catherine Carrillo ◽

Moussa S. Diarra ◽

Robert G. Beiko

Keyword(s):

Machine Learning ◽

Logistic Regression ◽

Antimicrobial Resistance ◽

Broiler Chicken ◽

Genomic Data ◽

Set Covering ◽

Learning Models ◽

Commercial Chicken ◽

The Impact ◽

Machine Learning Models

ABSTRACT Nontyphoidal Salmonella (NTS) is a leading global cause of bacterial foodborne morbidity and mortality. Our ability to treat severe NTS infections has been impaired by increasing antimicrobial resistance (AMR). To understand and mitigate the global health crisis AMR represents, we need to link the observed resistance phenotypes with their underlying genomic mechanisms. Broiler chickens represent a key reservoir and vector for NTS infections, but isolates from this setting have been characterized in only very low numbers relative to clinical isolates. In this study, we sequenced and assembled 97 genomes encompassing 7 serotypes isolated from broiler chicken in farms in British Columbia between 2005 and 2008. Through application of machine learning (ML) models to predict the observed AMR phenotype from this genomic data, we were able to generate highly (0.92 to 0.99) precise logistic regression models using known AMR gene annotations as features for 7 antibiotics (amoxicillin-clavulanic acid, ampicillin, cefoxitin, ceftiofur, ceftriaxone, streptomycin, and tetracycline). Similarly, we also trained “reference-free” k-mer-based set-covering machine phenotypic prediction models (0.91 to 1.0 precision) for these antibiotics. By combining the inferred k-mers and logistic regression weights, we identified the primary drivers of AMR for the 7 studied antibiotics in these isolates. With our research representing one of the largest studies of a diverse set of NTS isolates from broiler chicken, we can thus confirm that the AmpC-like CMY-2 β-lactamase is a primary driver of β-lactam resistance and that the phosphotransferases APH(6)-Id and APH(3″-Ib) are the principal drivers of streptomycin resistance in this important ecosystem. IMPORTANCE Antimicrobial resistance (AMR) represents an existential threat to the function of modern medicine. Genomics and machine learning methods are being increasingly used to analyze and predict AMR. This type of surveillance is very important to try to reduce the impact of AMR. Machine learning models are typically trained using genomic data, but the aspects of the genomes that they use to make predictions are rarely analyzed. In this work, we showed how, by using different types of machine learning models and performing this analysis, it is possible to identify the key genes underlying AMR in nontyphoidal Salmonella (NTS). NTS is among the leading cause of foodborne illness globally; however, AMR in NTS has not been heavily studied within the food chain itself. Therefore, in this work we performed a broad-scale analysis of the AMR in NTS isolates from commercial chicken farms and identified some priority AMR genes for surveillance.

Download Full-text

Rainfall Prediction for Udaipur, Rajasthan using Machine Learning Models Based on Temperature, Vapour Pressure and Relative Humidity

International Journal of Recent Technology and Engineering - 2 ◽

10.35940/ijrte.f1024.0386s20 ◽

2020 ◽

Vol 8 (6S) ◽

pp. 133-137

Keyword(s):

Machine Learning ◽

Relative Humidity ◽

Vapour Pressure ◽

Predictor Variables ◽

Ensemble Model ◽

Learning Models ◽

Rainfall Prediction ◽

Predictor Importance ◽

The Impact ◽

Machine Learning Models

The study aims at Rainfall prediction using Machine Learning models using the minimum of features. The prediction here is based on temperature, vapour pressure and relative humidity. Numerous studies carried out earlier used more features than this study. A training-test split of 75-25 was used. The best results were obtained by combining the best of the candidate models into an ensemble model to identify that predictor importance of vapour pressure was 0.89 while that of relative humidity was 0.11 with temperature not seen as a significant predictor for rainfall though the high correlation of temperature (°C) with vapour pressure (Torr) and relative humidity (Percentage) suggests that the two predictor variables subsume the impact of temperature.

Download Full-text