The Prediction of Application for Loan using Machine Learning Technique

Machine learning techniques are used to address many kinds of loan prediction problems. This study pursues two major goals. First, it aims to better understand the role of variables in loan prediction modeling. Second, it evaluates the predictive performance of decision trees. The variable information is drawn from a third-party source, a challenge on the popular internet platform Kaggle (www.kaggle.com), which provides data under the title ‘Loan Prediction’, uploaded by Amit Parajapet. We used a decision tree, a powerful and popular machine learning algorithm for prediction and classification on big data. The results suggest, first, that women are more likely to obtain a loan than men. Credit history, self-employment, property area, and applicant income are also significant for loan prediction. This study contributes to the literature on loan prediction by providing a global model that summarizes the customer-level determinants of loan prediction.

2020 ◽  
Vol 28 (2) ◽  
pp. 253-265 ◽  
Author(s):  
Gabriela Bitencourt-Ferreira ◽  
Amauri Duarte da Silva ◽  
Walter Filgueira de Azevedo

Background: The elucidation of the structure of cyclin-dependent kinase 2 (CDK2) made it possible to develop targeted scoring functions for virtual screening aimed at identifying new inhibitors for this enzyme. CDK2 is a protein target for the development of drugs intended to modulate cell-cycle progression and control. Such drugs have potential anticancer activities. Objective: Our goal here is to review recent applications of machine learning methods to predict ligand-binding affinity for protein targets. To assess the predictive performance of classical scoring functions and targeted scoring functions, we focused our analysis on CDK2 structures. Methods: We have experimental structural data for hundreds of binary complexes of CDK2 with different ligands, many of them with inhibition constant information. We investigate here computational methods to calculate the binding affinity of CDK2 through classical scoring functions and machine-learning models. Results: Analysis of the predictive performance of classical scoring functions available in docking programs such as Molegro Virtual Docker, AutoDock4, and AutoDock Vina indicated that these methods failed to predict binding affinity with significant correlation with experimental data. Targeted scoring functions developed through supervised machine learning techniques showed a significant correlation with experimental data. Conclusion: Here, we described the application of supervised machine learning techniques to generate a scoring function to predict binding affinity. Machine learning models showed superior predictive performance when compared with classical scoring functions. Analysis of the computational models obtained through machine learning could capture essential structural features responsible for binding affinity against CDK2.


2021 ◽  
Author(s):  
Praveeen Anandhanathan ◽  
Priyanka Gopalan

Coronavirus disease (COVID-19) is spreading across the world. Since it first appeared in Wuhan, China, in December 2019, it has become a serious issue across the globe. There are no accurate resources to predict and detect the disease. Records of past patients could therefore guide clinicians in fighting the pandemic, and machine learning techniques can be applied to predict health status from symptoms. We analyse only the symptoms that occur in every patient. These predictions can help clinicians treat patients more easily. Techniques such as SVM (Support Vector Machine), Fuzzy k-Means Clustering, the Decision Tree algorithm, the Random Forest method, ANN (Artificial Neural Network), KNN (k-Nearest Neighbour), Naïve Bayes, and the Linear Regression model are already used to predict many diseases. Because this disease has not been encountered before, we cannot say which technique will give the maximum accuracy. We therefore compare all of these algorithms in RStudio to provide an efficient result.


2020 ◽  
Vol 7 (10) ◽  
pp. 380-389
Author(s):  
Asogwa D.C ◽  
Anigbogu S.O ◽  
Anigbogu G.N ◽  
Efozia F.N

Author's age prediction is the task of determining an author's age by studying the texts they have written. Predicting an author's age can shed light on the trends, opinions, and social and political views of an age group. Marketers use this information to promote a product or service to an age group according to its expressed interests and opinions. Methodologies in natural language processing have made it possible to predict an author's age from text by examining the variation of linguistic characteristics, and many machine learning algorithms have been applied to the task. In social networks, however, computational linguists face numerous issues, and performance-driven machine learning techniques bring their own challenges in realistic scenarios. This work developed a model that predicts an author's age from text with a machine learning algorithm (Naïve Bayes) using three types of features: content based, style based, and topic based. The trained model gave a prediction accuracy of 80%.
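
As a rough illustration of the classifier family named in this abstract, the following is a minimal sketch of multinomial Naïve Bayes with add-one (Laplace) smoothing over word tokens. It is not the authors' model: the toy documents, the age-group labels "younger" and "older", and the function names are hypothetical, and only simple content-based word counts are used (no style-based or topic-based features).

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(docs, labels):
    """Count documents per class and words per class (multinomial model)."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict(model, tokens):
    """Return the class with the highest log-posterior for the tokens."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        # Log prior for the class.
        score = math.log(n_docs / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # ignore words never seen in training
            # Add-one smoothing avoids zero probabilities.
            count = word_counts[label][tok]
            score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy corpus, for illustration only.
toy_docs = [
    ["lol", "homework", "exam"],
    ["exam", "lol", "party"],
    ["mortgage", "pension", "meeting"],
    ["meeting", "pension", "kids"],
]
toy_labels = ["younger", "younger", "older", "older"]
toy_model = train_naive_bayes(toy_docs, toy_labels)
```

Calling predict(toy_model, ["pension", "meeting"]) on this toy corpus selects the class whose training texts contained those words more often; a realistic system would need far more data and a held-out test set to estimate accuracy.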


2020 ◽  
pp. 1314-1330 ◽  
Author(s):  
Mohamed Elhadi Rahmani ◽  
Abdelmalek Amine ◽  
Reda Mohamed Hamou

Botanists generally study the characteristics of leaves, such as shape and margin, to give each plant a scientific name. This paper proposes a comparison of supervised plant identification using different approaches. The identification is done according to three different features extracted from images of leaves: a fine-scale margin feature histogram, a Centroid Contour Distance Curve shape signature, and an interior texture feature histogram. Each leaf is first represented by one feature at a time, then by two features, and finally by all three features. The authors then classified the resulting vectors using different supervised machine learning techniques: decision tree, Naïve Bayes, K-nearest neighbour, and neural network. Finally, they evaluated the classification using cross validation. The main goal of this work is to study the influence of the representation of leaf images on the identification of plants, and to study the use of supervised machine learning algorithms for plant leaf classification.
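
To make the shape signature mentioned above more concrete, here is a minimal sketch of a centroid contour distance curve: the distance from a shape's centroid to each point on its outline, taken in order along the contour. The abstract does not give the authors' exact definition, so several choices here are assumptions: the centroid is approximated by the mean of the contour points, distances are scaled by their maximum, and the function name is hypothetical.

```python
import math

def centroid_contour_distance(contour, normalize=True):
    """Distance from the contour's centroid to each contour point, in order.

    contour: non-empty list of (x, y) points along the outline.
    The centroid is approximated as the mean of the contour points
    (an assumption; an area-based centroid is another common choice).
    """
    n = len(contour)
    cx = sum(x for x, _ in contour) / n
    cy = sum(y for _, y in contour) / n
    distances = [math.hypot(x - cx, y - cy) for x, y in contour]
    if normalize:
        peak = max(distances)
        if peak > 0:
            # Scaling by the largest distance removes dependence on size.
            distances = [d / peak for d in distances]
    return distances

# Hypothetical outline: a square sampled at corners and edge midpoints.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
signature = centroid_contour_distance(square)
```

For the square, the curve alternates between its maximum at the corners and a smaller value at the edge midpoints; a leaf outline would give a longer curve that can be resampled to a fixed length before classification.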


Author(s):  
Deepti Rani ◽  
Anju Sangwan ◽  
Anupma Sangwan ◽  
Tajinder Singh

With the enormous growth of sensor networks, information gathered from such networks has become an invaluable source of knowledge that helps organizations better understand people's interests. Wireless sensor networks (WSNs) and their various classes remain active topics of research. In this chapter, the primary focus is on the concept of sensor networks in underwater scenarios. Various mechanisms are used to recognize underwater activities using sensors that examine real-time events. Sensor networks also bring a number of challenges, which are addressed here. Machine learning (ML) techniques are well suited to resolving such issues because of their feasibility and adaptability in complex problem environments. Various ML techniques are therefore explained as ways to enhance the operational performance of WSNs, especially underwater WSNs (UWSNs). The main objective of this chapter is to explain the concepts of UWSNs and the role of ML in addressing their performance issues.


2020 ◽  
pp. 101806
Author(s):  
Omid Khalaj ◽  
Moslem Ghobadi ◽  
Alireza Zarezadeh ◽  
Ehsan Saebnoori ◽  
Hana Jirková ◽  
...  

2017 ◽  
Author(s):  
Guillaume Paré ◽  
Shihong Mao ◽  
Wei Q. Deng

Machine-learning techniques have helped solve a broad range of prediction problems, yet are not widely used to build polygenic risk scores for the prediction of complex traits. We propose a novel heuristic based on machine-learning techniques (GraBLD) to boost the predictive performance of polygenic risk scores. Gradient boosted regression trees were first used to optimize the weights of SNPs included in the score, followed by a novel regional adjustment for linkage disequilibrium. A calibration set with sample size of ~200 individuals was sufficient for optimal performance. GraBLD yielded prediction R² of 0.239 and 0.082 using GIANT summary association statistics for height and BMI in the UK Biobank study (N=130K; 1.98M SNPs), explaining 46.9% and 32.7% of the overall polygenic variance, respectively. For diabetes status, the area under the receiver operating characteristic curve was 0.602 in the UK Biobank study using summary-level association statistics from the DIAGRAM consortium. GraBLD outperformed other polygenic score heuristics for the prediction of height (p<2.2×10⁻¹⁶) and BMI (p<1.57×10⁻⁴), and was equivalent to LDpred for diabetes. Results were independently validated in the Health and Retirement Study (N=8,292; 688,398 SNPs). Our report demonstrates the use of machine-learning techniques, coupled with summary-level data from large genome-wide meta-analyses to improve the prediction of polygenic traits.
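
For readers unfamiliar with the area under the receiver operating characteristic curve reported above, the sketch below computes it in its pairwise form: the fraction of (case, control) pairs in which the case receives the higher score, with ties counted as one half. This only illustrates the metric and is not the GraBLD method or the authors' evaluation code; the function name and the example scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random case outscores a random control.

    labels: iterable of 0 (control) or 1 (case).
    scores: iterable of risk scores, higher meaning more likely a case.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5  # ties count as half a win
    return wins / (len(cases) * len(controls))

# Hypothetical scores for two controls followed by two cases.
example_auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

A value of 0.5 corresponds to a score that ranks cases no better than chance and 1.0 to perfect ranking. This pairwise loop is quadratic in sample size, so a rank-based formulation would be preferable for biobank-scale data.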


2017 ◽  
Author(s):  
Ari S. Benjamin ◽  
Hugo L. Fernandes ◽  
Tucker Tomlinson ◽  
Pavan Ramkumar ◽  
Chris VerSteeg ◽  
...  

Neuroscience has long focused on finding encoding models that effectively ask “what predicts neural spiking?” and generalized linear models (GLMs) are a typical approach. It is often unknown how much of explainable neural activity is captured, or missed, when fitting a GLM. Here we compared the predictive performance of GLMs to three leading machine learning methods: feedforward neural networks, gradient boosted trees (using XGBoost), and stacked ensembles that combine the predictions of several methods. We predicted spike counts in macaque motor (M1) and somatosensory (S1) cortices from standard representations of reaching kinematics, and in rat hippocampal cells from open field location and orientation. In general, the modern methods (particularly XGBoost and the ensemble) produced more accurate spike predictions and were less sensitive to the preprocessing of features. This discrepancy in performance suggests that standard feature sets may often relate to neural activity in a nonlinear manner not captured by GLMs. Encoding models built with machine learning techniques, which can be largely automated, more accurately predict spikes and can offer meaningful benchmarks for simpler models.

