Analysis and Prediction of Soccer Games: An Application to the Kaggle European Soccer Database

<p>The study of soccer game data has many applications for both fans and teams. Effective analysis can help teams improve their offensive and defensive skills and strategies, and can also help fans place bets. In this work, the authors analyze the European League Dataset with statistical methods. Moreover, machine learning techniques are designed to predict game results based on in-game performance and pre-game odds provided by bookmakers. With rational feature engineering and model selection, their model achieves an overall accuracy of 95%.</p>
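To make the role of pre-game odds concrete, the following is a minimal sketch of an odds-only baseline, not the authors' model. It assumes decimal odds for the three outcomes; the function names and the sample odds are hypothetical. Each price is converted to an implied probability (1/odds), the three values are rescaled so they sum to one (removing the bookmaker margin), and the most probable outcome is taken as the prediction.

```python
from typing import Dict


def implied_probabilities(decimal_odds: Dict[str, float]) -> Dict[str, float]:
    """Convert decimal odds to implied probabilities that sum to 1."""
    # Raw implied probability of each outcome is the reciprocal of its odds.
    raw = {outcome: 1.0 / odds for outcome, odds in decimal_odds.items()}
    # The raw values usually sum to more than 1 (the bookmaker margin),
    # so rescale them proportionally.
    total = sum(raw.values())
    return {outcome: p / total for outcome, p in raw.items()}


def predict_from_odds(decimal_odds: Dict[str, float]) -> str:
    """Baseline prediction: the outcome with the highest implied probability."""
    probs = implied_probabilities(decimal_odds)
    return max(probs, key=probs.get)


# Hypothetical odds for a single match (illustrative values only).
sample_odds = {"home": 1.8, "draw": 3.6, "away": 4.5}
probs = implied_probabilities(sample_odds)
prediction = predict_from_odds(sample_odds)
```

A baseline like this gives a reference point against which a model that also uses in-game features can be compared.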

Wind Speed Forecasting by Conventional Statistical Methods and Machine Learning Techniques

10.1109/epec52095.2021.9621686 ◽

2021 ◽

Author(s):

Shah Mohammad Rezwanul Haque Shawon ◽

Md Abu Saaklayen ◽

Xiaodong Liang

Keyword(s):

Machine Learning ◽

Wind Speed ◽

Statistical Methods ◽

Machine Learning Techniques ◽

Wind Speed Forecasting ◽

Learning Techniques

Machine learning and materials modelling interpretation of in vivo toxicological response to TiO2 nanoparticles library (UV and non-UV exposure)

Nanoscale ◽

10.1039/d1nr03231c ◽

2021 ◽

Author(s):

Susana I. L. Gomes ◽

Mónica J. B. Amorim ◽

Suman Pokhrel ◽

Lutz Mädler ◽

Matteo Fasano ◽

...

Keyword(s):

Machine Learning ◽

Tio2 Nanoparticles ◽

Statistical Methods ◽

Multiscale Modelling ◽

Machine Learning Techniques ◽

Uv Exposure ◽

Biological Functions ◽

Learning Techniques ◽

Materials Modelling

Based on a highly detailed materials characterisation database (including atomistic and multiscale modelling), single and univariate statistical methods, combined with machine learning techniques, revealed key descriptors of biological functions.

Leukemia Drug Prediction Using Machine Learning Techniques with Feature Engineering

Journal of Advanced Research in Dynamical and Control Systems ◽

10.5373/jardcs/v12sp4/20201475 ◽

2020 ◽

Vol 12 (SP4) ◽

pp. 141-146

Author(s):

Dr. Priya N.

Keyword(s):

Machine Learning ◽

Machine Learning Techniques ◽

Feature Engineering ◽

Learning Techniques

Deep Learning Approach for Extracting Catch Phrases from Legal Documents

Advances in Computer and Electrical Engineering - Neural Networks for Natural Language Processing ◽

10.4018/978-1-7998-1159-6.ch009 ◽

2020 ◽

pp. 143-158

Author(s):

Kayalvizhi S. ◽

Thenmozhi D.

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Learning ◽

Statistical Methods ◽

Deep Neural Network ◽

Mean Average Precision ◽

Machine Learning Techniques ◽

Learning Approach ◽

Legal Documents ◽

Learning Techniques

Catch phrases are the important phrases that precisely explain a document. They represent the context of the whole document. Judges and lawyers can also use them to retrieve relevant prior cases for assuring justice in the domain of law. Currently, catch phrases are extracted using statistical methods, machine learning techniques, and deep learning techniques. The authors propose a sequence-to-sequence (Seq2Seq) deep neural network to extract catch phrases from legal documents. They employ several layers, namely an embedding layer, an encoder-decoder layer, a projection layer, and a loss layer, to build the deep neural network. The methodology is evaluated on the IRLeD@FIRE-2017 dataset, where the method obtained 0.787 and 0.607 as mean average precision and recall scores, respectively. Results show that the proposed method outperforms the existing systems.
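For readers unfamiliar with the metric reported above, mean average precision can be sketched as follows. This is an illustration under stated assumptions, not the evaluation code used for IRLeD@FIRE-2017: the function names and toy data are hypothetical, each ranked list is assumed to contain no duplicates, and average precision is normalized by the number of gold phrases.

```python
from typing import Iterable, Sequence, Set, Tuple


def average_precision(ranked: Sequence[str], relevant: Set[str]) -> float:
    """Mean of precision@k taken at each rank k where a relevant item appears."""
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for k, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / k  # precision at this rank
    # Relevant items that were never retrieved contribute zero.
    return precision_sum / len(relevant)


def mean_average_precision(runs: Iterable[Tuple[Sequence[str], Set[str]]]) -> float:
    """Average of average_precision over (ranked predictions, gold set) pairs."""
    runs = list(runs)
    if not runs:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in runs) / len(runs)


# Hypothetical toy data: ranked predicted phrases and gold phrases per document.
toy_runs = [
    (["a", "b", "c", "d"], {"a", "c"}),  # AP = (1/1 + 2/3) / 2
    (["x", "y", "z"], {"y"}),            # AP = (1/2) / 1
]
map_score = mean_average_precision(toy_runs)
```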

Machine Learning Techniques and Statistical Methods for Business Applications: Implications on Big Data Gold Rush

Advanced Science Letters ◽

10.1166/asl.2018.11760 ◽

2018 ◽

Vol 24 (7) ◽

pp. 5474-5477 ◽

Cited By ~ 1

Author(s):

Se-Hak Chun

Keyword(s):

Machine Learning ◽

Big Data ◽

Statistical Methods ◽

Gold Rush ◽

Machine Learning Techniques ◽

Business Applications ◽

Learning Techniques

Statistical methods versus machine learning techniques for donor-recipient matching in liver transplantation

PLoS ONE ◽

10.1371/journal.pone.0252068 ◽

2021 ◽

Vol 16 (5) ◽

pp. e0252068

Author(s):

David Guijo-Rubio ◽

Javier Briceño ◽

Pedro Antonio Gutiérrez ◽

Maria Dolores Ayllón ◽

Rubén Ciria ◽

...

Keyword(s):

Machine Learning ◽

Liver Transplantation ◽

Statistical Methods ◽

Machine Learning Techniques ◽

Gradient Boosting ◽

Support Vector ◽

End Points ◽

Liver Allocation ◽

Learning Techniques ◽

Modelling Techniques

Donor-Recipient (D-R) matching is one of the main open challenges today. Due to the increasing number of recipients and the small number of donors in liver transplantation, the allocation method is crucial. In this paper, to establish a fair comparison, the United Network for Organ Sharing database was used with 4 different end-points (3 months, and 1, 2 and 5 years), with a total of 39,189 D-R pairs and 28 donor and recipient variables. Modelling techniques were divided into two groups: 1) classical statistical methods, including Logistic Regression (LR) and Naïve Bayes (NB), and 2) standard machine learning techniques, including Multilayer Perceptron (MLP), Random Forest (RF), Gradient Boosting (GB) and Support Vector Machines (SVM), among others. The methods were compared with the standard scores MELD, SOFT and BAR. For the 5-year end-point, LR (AUC = 0.654) outperformed several machine learning techniques, such as MLP (AUC = 0.599), GB (AUC = 0.600), SVM (AUC = 0.624) and RF (AUC = 0.644), among others. Moreover, LR also outperformed the standard scores. The same pattern was reproduced for the other 3 end-points. Complex machine learning methods were not able to improve the performance of liver allocation, probably due to the implicit limitations associated with the collection process of the database.
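The AUC values quoted above can be read as the probability that a model scores a randomly chosen positive case higher than a randomly chosen negative one. The sketch below computes AUC directly from that definition, counting ties as one half. It is illustrative only: the function name, labels and scores are hypothetical and unrelated to the UNOS data, and the quadratic pairwise loop is meant for clarity rather than for datasets of the size used in the study.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical labels and scores from two models (illustrative values only).
labels = [1, 0, 1, 0, 1]
scores_model_a = [0.9, 0.2, 0.6, 0.7, 0.8]
scores_model_b = [0.6, 0.5, 0.4, 0.7, 0.8]

auc_a = roc_auc(labels, scores_model_a)
auc_b = roc_auc(labels, scores_model_b)
```

An AUC of 0.5 corresponds to random ranking, which gives a sense of scale for the 0.599 to 0.654 range reported for the 5-year end-point.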

Machine Learning for Plant Breeding and Biotechnology

Agriculture ◽

10.3390/agriculture10100436 ◽

2020 ◽

Vol 10 (10) ◽

pp. 436 ◽

Cited By ~ 2

Author(s):

Mohsen Niazian ◽

Gniewko Niedbała

Keyword(s):

Machine Learning ◽

Plant Breeding ◽

Statistical Methods ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Machine Learning Techniques ◽

Learning Techniques ◽

Plant Characteristics

Classical univariate and multivariate statistics are the most common methods used for data analysis in plant breeding and biotechnology studies. Evaluation of genetic diversity, classification of plant genotypes, analysis of yield components, yield stability analysis, assessment of biotic and abiotic stresses, prediction of parental combinations in hybrid breeding programs, and analysis of in vitro-based biotechnological experiments are mainly performed by classical statistical methods. Despite successful applications, these classical statistical methods have low efficiency in analyzing data obtained from plant studies, as the genotype, environment, and their interaction (G × E) result in the nondeterministic and nonlinear nature of plant characteristics. Large-scale data flow, including phenomics, metabolomics, genomics, and big data, must be analyzed for efficient interpretation of results affected by G × E. Nonlinear nonparametric machine learning techniques are more efficient than classical statistical models in handling large amounts of complex and nondeterministic information with a “multiple-independent variables versus multiple-dependent variables” nature. Neural networks, partial least squares regression, random forest, and support vector machines are some of the most fascinating machine learning models that have been widely applied to analyze nonlinear and complex data in both classical plant breeding and in vitro-based biotechnological studies. The high interpretive power of machine learning algorithms has made them popular in the analysis of complex multifactorial plant characteristics.
The classification of different plant genotypes with morphological and molecular markers, the modeling and prediction of important quantitative characteristics of plants, the interpretation of complex and nonlinear relationships of plant characteristics, and the prediction and optimization of in vitro breeding methods are examples of applications of machine learning in conventional plant breeding and in vitro-based biotechnological studies. Precision agriculture is possible through accurate measurement of plant characteristics using imaging techniques, followed by efficient analysis of the reliable extracted data using machine learning algorithms. Perfect interpretation of high-throughput phenotyping data is possible through coupled machine learning and image processing. Some applied and potentially applicable capabilities of machine learning techniques in conventional and in vitro-based plant breeding studies are discussed in this overview. These discussions are of great value for future studies and could inspire researchers to apply machine learning in new layers of plant breeding.

Overview of Statistical and Machine Learning Techniques for Determining Causes of Death from Verbal Autopsies: A Systematic Literature Review

10.21203/rs.3.rs-95087/v1 ◽

2020 ◽

Author(s):

Michael Tonderai Mapundu ◽

Chodziwadziwa Kabudula ◽

Eustasius Musenge ◽

Turgay Celik

Keyword(s):

Machine Learning ◽

Statistical Methods ◽

Cause Of Death ◽

Verbal Autopsy ◽

Causes Of Death ◽

Machine Learning Techniques ◽

Health Priorities ◽

Learning Approaches ◽

Learning Techniques ◽

Physician Diagnosis

Background: Determining causes of death using verbal autopsies in areas with limited clinical services has become a key issue in terms of the accuracy of the cause of death (which is subjective and prone to errors) and the quality of data, among many drawbacks. This is mainly because there is no proper standard available for performing verbal autopsy, even though it is important for civil registration systems and the strengthening of health priorities. Physician diagnosis is the only gold standard for reviewing verbal autopsy narratives. In practice, conventional statistical methods are used to perform verbal autopsies due to their simplicity and transparency. However, the literature contains complex machine learning models that can replace the traditional statistical methods. There has not been much application of machine learning techniques in verbal autopsy to determine cause of death, despite the advances in technology. As such, there is a need for a thorough survey of recent literature on statistical and machine learning approaches applied in verbal autopsy to determine cause of death. Methods: A systematic review was conducted and included a search of six databases. Our study only included scientific articles published in the last decade that reported on verbal autopsy and: (1) algorithms; (2) statistical techniques; (3) machine learning; and (4) deep learning. The search yielded 110 articles; after meta analysis, we identified 85 articles as relevant and discarded the other 25. We investigated and compared the most commonly used statistical and machine learning techniques in VAs, identified the limitations of each of these techniques, proposed a guiding machine learning framework and pointed to future directions. Results: Eighty-five studies met the inclusion criteria. Apart from physician diagnosis, statistical methods are currently the most commonly applied tools to determine cause of death from verbal autopsies.
However, there has been little application of traditional machine learning and emerging techniques, even though they have shown promising results in other domains. Conclusions: Technological application of machine learning to determine cause of death should focus on effective strategies for pre-processing, transparency, robust feature engineering techniques and data balancing in order to attain optimal model performance.

p1-FP: Extraction, Classification, and Prediction of Website Fingerprints with Deep Learning

Proceedings on Privacy Enhancing Technologies ◽

10.2478/popets-2019-0043 ◽

2019 ◽

Vol 2019 (3) ◽

pp. 191-209 ◽

Cited By ~ 1

Author(s):

Se Eun Oh ◽

Saikrishna Sunkam ◽

Nicholas Hopper

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Learning ◽

Deep Neural Network ◽

State Of The Art ◽

Machine Learning Techniques ◽

Feature Engineering ◽

Engineering Process ◽

Learning Techniques ◽

Wide Range

Recent advances in Deep Neural Network (DNN) architectures have received a great deal of attention due to their ability to outperform state-of-the-art machine learning techniques across a wide range of applications, as well as to automate the feature engineering process. In this paper, we broadly study the applicability of deep learning to website fingerprinting. First, we show that unsupervised DNNs can generate low-dimensional informative features that improve the performance of state-of-the-art website fingerprinting attacks. Second, we show that, when used as classifiers, they can exceed the performance of existing attacks across a range of application scenarios, including fingerprinting Tor website traces, fingerprinting search engine queries over Tor, defeating fingerprinting defenses, and fingerprinting TLS-encrypted websites. Finally, we investigate which site-level features of a website influence its fingerprintability by DNNs.
