A hybrid predictive framework for evaluating P2P credit risks

2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Liang He ◽  
Haiyan Xu ◽  
Ginger Y. Ke

Purpose – Despite better accessibility and flexibility, peer-to-peer (P2P) lending has suffered from excessive credit risks, which may cause significant losses to the lenders and even lead to the collapse of P2P platforms. The purpose of this research is to construct a hybrid predictive framework that integrates classification, feature selection, and data balance algorithms to cope with the high-dimensional and imbalanced nature of P2P credit data. Design/methodology/approach – An improved synthetic minority over-sampling technique (IMSMOTE) is developed to incorporate randomness and probability into the traditional synthetic minority over-sampling technique (SMOTE) to enhance the quality of synthetic samples and the controllability of the synthetic process. IMSMOTE is then implemented along with grey relational clustering (GRC) and the support vector machine (SVM) to facilitate a comprehensive assessment of P2P credit risks. To enhance the associativity and functionality of the algorithm, a dynamic selection approach is integrated with GRC and then fed into the SVM's adaptive parameter adjustment process to select the optimal critical value. A quantitative model is constructed to recognize key criteria via multidimensional representativeness. Findings – A series of experiments based on real-world P2P data from Prosper Funding LLC demonstrates that our proposed model outperforms other existing approaches. It is also confirmed that the grey-based GRC approach with dynamic selection succeeds in reducing data dimensions, selecting a critical value, and identifying key criteria, and that IMSMOTE can efficiently handle the imbalanced data. Originality/value – The grey-based machine-learning framework proposed in this work can be practically implemented by P2P platforms in predicting borrowers' credit risks. The dynamic selection approach makes the first attempt in the literature to select a critical value and indicate key criteria in a dynamic, visual and quantitative manner.
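For readers unfamiliar with the traditional SMOTE that the abstract above builds on, the following is a minimal sketch of its core step: interpolating between a minority-class sample and one of its nearest minority-class neighbours. It is not the authors' IMSMOTE. The function name `smote`, its parameters, and the toy data are assumptions made for illustration only.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Generate n_new synthetic points from minority-class rows (lists of floats)."""
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot use more neighbours than other points
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical two-feature minority class (for example, defaulted loans).
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
new_points = smote(minority, n_new=6, k=2)
```

Because each synthetic point lies on a segment between two real minority samples, the new points stay inside the region the minority class already occupies. This sketch requires at least two minority samples.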

Kybernetes ◽  
2014 ◽  
Vol 43 (8) ◽  
pp. 1150-1164 ◽  
Author(s):  
Bilal M’hamed Abidine ◽  
Belkacem Fergani ◽  
Mourad Oussalah ◽  
Lamya Fergani

Purpose – The task of identifying activity classes from sensor information in a smart home is very challenging because of the imbalanced nature of such data sets, where some activities occur more frequently than others. Probabilistic models such as the Hidden Markov Model (HMM) and Conditional Random Fields (CRF) are commonly employed for this purpose. The paper aims to discuss these issues. Design/methodology/approach – In this work, the authors propose a robust strategy combining the Synthetic Minority Over-sampling Technique (SMOTE) with Cost-Sensitive Support Vector Machines (CS-SVM), with adaptive tuning of the cost parameter, in order to handle the imbalanced data problem. Findings – The results demonstrate the usefulness of the approach through comparison with state-of-the-art approaches including HMM, CRF, the traditional C-Support Vector Machine (C-SVM) and the Cost-Sensitive SVM (CS-SVM) for classifying activities using binary and ubiquitous sensors. Originality/value – Performance metrics in the experiment/simulation include accuracy, precision/recall and F-measure.
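The abstract above does not say how the cost parameter is tuned. As a rough illustration of the cost-sensitive idea only, the sketch below assigns each class a misclassification cost inversely proportional to its frequency, a common heuristic and not necessarily the authors' adaptive scheme. The function `class_costs` and the activity labels are hypothetical.

```python
from collections import Counter


def class_costs(labels, base_cost=1.0):
    """Per-class misclassification cost, inversely proportional to class frequency."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: base_cost * n / (k * cnt) for cls, cnt in counts.items()}


# Hypothetical activity labels: frequent activities dominate the data set.
labels = ["sleeping"] * 80 + ["cooking"] * 15 + ["showering"] * 5
costs = class_costs(labels)
```

With this weighting, an error on the rare "showering" class is penalised far more heavily than an error on "sleeping", which pushes a cost-sensitive classifier away from simply predicting the majority activity.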


Author(s):  
Jie Sun ◽  
Xin Liu ◽  
Wenguo Ai ◽  
Qianyuan Tian

This study proposes two approaches for dynamic financial distress prediction (FDP) based on class-imbalanced data batches by considering both concept drift and class imbalance. One is based on a sliding time window and the synthetic minority over-sampling technique (SMOTE), and the other is based on a sliding time window and majority class partition. Support vector machine, multiple discriminant analysis (MDA) and logistic regression are used as base classifiers in the experiments on a real-world dataset. The results indicate that the two approaches perform better than the pure dynamic FDP (DFDP) models without class imbalance processing and the static FDP models either with or without class imbalance processing.
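The sliding time window mentioned above can be sketched as follows: the model is retrained on only the most recent batches, so older data that may no longer reflect current conditions drops out. The function name, the window size and the toy batches are assumptions for illustration. In the study's approaches, each training window would additionally be rebalanced (by SMOTE or by majority class partition) before fitting a classifier.

```python
from collections import deque


def sliding_window_batches(batches, window):
    """Yield (training_rows, test_batch): train on the latest `window` batches, test on the next."""
    recent = deque(maxlen=window)
    for batch in batches:
        if len(recent) == window:
            training = [row for past in recent for row in past]
            yield training, batch
        recent.append(batch)  # the oldest batch is dropped automatically


# Hypothetical yearly batches of (feature, label) rows; label 1 = distressed.
batches = [
    [(0.2, 0), (0.9, 1)],
    [(0.3, 0), (0.4, 0)],
    [(0.8, 1), (0.1, 0)],
    [(0.5, 0), (0.7, 1)],
]
splits = list(sliding_window_batches(batches, window=2))
```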


2015 ◽  
Vol 49 (1) ◽  
pp. 2-22
Author(s):  
Jiunn-Liang Guo ◽  
Hei-Chia Wang ◽  
Ming-Way Lai

Purpose – The purpose of this paper is to develop a novel feature selection approach for automatic text classification of large digital documents – e-books of an online library system. The main idea is to automatically identify discourse features in order to improve the feature selection process, rather than focusing on the size of the corpus. Design/methodology/approach – The proposed framework automatically identifies the discourse segments within e-books, captures the discourse subtopics that are cohesively expressed in those segments, and treats these subtopics as informative and prominent features. The selected set of features is then used to train and perform the e-book classification task based on the support vector machine technique. Findings – The evaluation of the proposed framework shows that identifying discourse segments and capturing subtopic features leads to better performance in comparison with two conventional feature selection techniques: TFIDF and mutual information. It also demonstrates that discourse features play important roles among textual features, especially for large documents such as e-books. Research limitations/implications – Automatically extracted subtopic features cannot be entered directly into the feature selection (FS) process; the threshold must be controlled. Practical implications – The proposed technique demonstrates the promise of using discourse analysis to enhance the classification of large digital documents – e-books – as against conventional techniques. Originality/value – A new FS technique is proposed that can inspect the narrative structure of large documents; it is new to the text classification domain. The other contribution is that it inspires the consideration of discourse information in future text analysis by providing more evidence through evaluation of the results. The proposed system can be integrated into other library management systems.


2021 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Author(s):  
Mithun B. Patil ◽  
Rekha Patil

Purpose The vertical handoff (VHO) mechanism has become very popular because of improvements in mobility models. These developments are limited to certain circumstances and thus do not support generic mobility, yet vertical handover management in heterogeneous wireless networks (HWNs) is crucial and challenging. Hence, this paper introduces a vertical handoff management approach based on an effective network selection scheme. Design/methodology/approach This paper aims to improve the working principle of previous methods and make VHO more efficient and reliable for HWNs. Initially, the handover triggering technique is modelled to identify an appropriate place to initiate handover based on the computed coverage area of the cellular base station or wireless local area network (WLAN) access point. Then, inappropriate networks are eliminated to determine the better network for handover. Accordingly, a network selection approach is introduced on the basis of the Fractional-dolphin echolocation-based support vector neural network (Fractional-DE-based SVNN). The Fractional-DE is designed by integrating fractional calculus (FC) into dolphin echolocation (DE), thereby modifying the update rule of the DE algorithm based on the location of the solutions in past iterations. The proposed Fractional-DE algorithm is used to train the support vector neural network (SVNN) for selecting the best weights. Several parameters, such as bit error rate (BER), end-to-end delay (EED), jitter, packet loss, and energy consumption, are considered for choosing the best network. Findings The performance of the proposed VHO mechanism based on Fractional-DE is evaluated based on delay, energy consumption, staytime, and throughput. The proposed Fractional-DE method achieves a minimal delay of 0.0100 sec, a minimal energy consumption of 0.348, a maximal staytime of 4.373 sec, and a maximal throughput of 109.20 kbps.
Originality/value In this paper, a network selection approach is introduced on the basis of the Fractional-dolphin echolocation-based support vector neural network (Fractional-DE-based SVNN). The Fractional-DE is designed by integrating fractional calculus (FC) into dolphin echolocation (DE), thereby modifying the update rule of the DE algorithm based on the location of the solutions in past iterations. The proposed Fractional-DE algorithm is used to train the SVNN for selecting the best weights. Several parameters, such as bit error rate (BER), end-to-end delay (EED), jitter, packet loss, and energy consumption, are considered for choosing the best network. The performance of the proposed VHO mechanism based on Fractional-DE is evaluated based on delay, energy consumption, staytime, and throughput, in which the proposed method offers the best performance.


Author(s):  
Yafei Wu ◽  
Ya Fang

Timely stroke diagnosis and intervention are necessary considering its high prevalence. Previous studies have mainly focused on stroke prediction with balanced data. Thus, this study aimed to develop machine learning models for predicting stroke with imbalanced data in an elderly population in China. Data were obtained from a prospective cohort that included 1131 participants (56 stroke patients and 1075 non-stroke participants) in 2012 and 2014, respectively. Data balancing techniques including random over-sampling (ROS), random under-sampling (RUS), and synthetic minority over-sampling technique (SMOTE) were used to process the imbalanced data in this study. Machine learning methods such as regularized logistic regression (RLR), support vector machine (SVM), and random forest (RF) were used to predict stroke with demographic, lifestyle, and clinical variables. Accuracy, sensitivity, specificity, and areas under the receiver operating characteristic curves (AUCs) were used for performance comparison. The top five variables for stroke prediction were selected for each machine learning method based on the SMOTE-balanced data set. The total prevalence of stroke was high in 2014 (4.95%), with men experiencing much higher prevalence than women (6.76% vs. 3.25%). The three machine learning methods performed poorly in the imbalanced data set with extremely low sensitivity (approximately 0.00) and AUC (approximately 0.50). After using data balancing techniques, the sensitivity and AUC considerably improved with moderate accuracy and specificity, and the maximum values for sensitivity and AUC reached 0.78 (95% CI, 0.73–0.83) for RF and 0.72 (95% CI, 0.71–0.73) for RLR. Using AUCs for RLR, SVM, and RF in the imbalanced data set as references, a significant improvement was observed in the AUCs of all three machine learning methods (p < 0.05) in the balanced data sets. 
Considering RLR in each data set as a reference, only RF in the imbalanced data set and SVM in the ROS-balanced data set were superior to RLR in terms of AUC. Sex, hypertension, and uric acid were common predictors in all three machine learning methods. Blood glucose level was included in both RLR and RF. Drinking, age and high-sensitivity C-reactive protein level, and low-density lipoprotein cholesterol level were also included in RLR, SVM, and RF, respectively. Our study suggests that machine learning methods with data balancing techniques are effective tools for stroke prediction with imbalanced data.
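The near-zero sensitivity reported above for the imbalanced data set can be illustrated with simple arithmetic on the study's class counts (56 stroke, 1075 non-stroke). The sketch below is not one of the study's models: it evaluates a degenerate classifier that labels every participant as non-stroke, and the function name is an assumption.

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# A degenerate classifier that labels all 1131 participants as non-stroke:
# 56 stroke cases are missed (fn), 1075 non-stroke cases are correct (tn).
always_negative = binary_metrics(tp=0, fn=56, tn=1075, fp=0)
```

Such a classifier reaches about 95% accuracy while detecting no stroke cases at all, which is why accuracy alone is misleading here and why the study also reports sensitivity, specificity and AUC.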


2019 ◽  
Vol 24 (3) ◽  
pp. 294-308
Author(s):  
Michael Adesi ◽  
De-Graft Owusu-Manu ◽  
Frank Boateng

Purpose Notwithstanding that numerous studies have focused on strategy in quantity surveying (QS) professional service firms, there is a paucity of investigation on the segmentation of QS professional services. The purpose of this study is to investigate the segmentation of QS services for diversification and a focus strategy formation. Design/methodology/approach This study adopts the positivist stance and quantitative approach in which a simple random sampling technique was used to select participants. In total, 110 survey questionnaires were administered to registered professional QS, out of which 79 completed questionnaires were returned for analysis. Findings The paper identifies three main QS service segments characterised by low, moderate and high competition. In addition, this study found that the concentration of traditional QS services in the building construction sector is due to the unwillingness of QS professional service firms to diversify into the non-construction sectors such as oil and gas. The diversification of QS services in the low competitive segment requires the adoption of agile approaches. Research limitations/implications The study was limited to numeric analyses and so would be complemented by qualitative research in the future. Practical implications This paper is useful to QS professional service firms interested in diversifying their services into the non-construction sectors to enhance the pricing of their services. Originality/value Segmentation of QS services is fundamental to the formulation of focus strategy for non-construction sectors such as oil and gas and mining to enhance the pricing of QS professional services.


2018 ◽  
Vol 10 (1) ◽  
pp. 85-110 ◽  
Author(s):  
Syed Zulfiqar Ali Shah ◽  
Maqsood Ahmad ◽  
Faisal Mahmood

Purpose This paper aims to clarify the mechanism by which heuristics influence the investment decisions of individual investors actively trading on the Pakistan Stock Exchange (PSX), and the perceived efficiency of the market. Most studies focus on well-developed financial markets, and very little is known about investors’ behaviour in less developed financial markets or emerging markets. The present study contributes to filling this gap in the literature. Design/methodology/approach Investors’ heuristic biases have been measured using a questionnaire containing numerous items, including indicators of speculators, investment decisions and perceived market efficiency variables. The sample consists of 143 investors trading on the PSX. A convenience, purposive sampling technique was used for data collection. To examine the relationship between heuristic biases, investment decisions and perceived market efficiency, hypotheses were tested by using correlation and regression analysis. Findings The paper provides empirical insights into the relationship of heuristic biases, investment decisions and perceived market efficiency. The results suggest that heuristic biases (overconfidence, representativeness, availability and anchoring) have a markedly negative impact on investment decisions made by individual investors actively trading on the PSX and on perceived market efficiency. Research limitations/implications The primary limitation of the empirical review is the small size of the sample. A larger sample would have given more trustworthy results and could have enabled a more extensive scope of investigation. Practical implications The paper encourages investors to avoid relying on heuristics or their feelings when making investments. 
It provides awareness and understanding of heuristic biases in investment management, which could be very useful for decision makers and professionals in financial institutions, such as portfolio managers and traders in commercial banks, investment banks and mutual funds. This paper helps investors to select better investment tools and avoid repeating expensive errors, which occur due to heuristic biases. They can improve their performance by recognizing their biases and errors of judgment, to which we are all prone, resulting in a more efficient market. So, it is necessary to focus on a specific investment strategy to control “mental mistakes” by investors, due to heuristic biases. Originality/value The current study is the first of its kind, focusing on the link between heuristics, individual investment decisions and perceived market efficiency within the specific context of Pakistan.


Mathematics ◽  
2021 ◽  
Vol 9 (9) ◽  
pp. 936
Author(s):  
Jianli Shao ◽  
Xin Liu ◽  
Wenqing He

Imbalanced data exist in many classification problems. The classification of imbalanced data poses remarkable challenges in machine learning. The support vector machine (SVM) and its variants are popularly used in machine learning among different classifiers thanks to their flexibility and interpretability. However, the performance of SVMs is impacted when the data are imbalanced, which is a typical data structure in the multi-category classification problem. In this paper, we employ the data-adaptive SVM with scaled kernel functions to classify instances for a multi-class population. We propose a multi-class data-dependent kernel function for the SVM by considering class imbalance and the spatial association among instances so that the classification accuracy is enhanced. Simulation studies demonstrate the superb performance of the proposed method, and a real multi-class prostate cancer image dataset is employed as an illustration. Not only does the proposed method outperform the competitor methods in terms of commonly used accuracy measures such as the F-score and G-means, but it also successfully detects more than 60% of instances from the rare class in the real data, while the competitors detect less than 20% of the rare class instances. The proposed method will benefit other scientific research fields, such as multiple region boundary detection.
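The abstract above names G-means as an accuracy measure without defining it. A common multi-class definition, assumed here and possibly different from the paper's, is the geometric mean of the per-class recalls. The sketch below uses that definition with hypothetical labels to show why the measure is sensitive to a poorly detected rare class.

```python
import math
from collections import Counter


def g_mean(y_true, y_pred):
    """Geometric mean of per-class recall; zero if any class is never detected."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    recalls = [hits[c] / totals[c] for c in totals]
    return math.prod(recalls) ** (1 / len(recalls))


# Hypothetical three-class problem in which the rare class is mostly missed.
y_true = ["a"] * 8 + ["b"] * 8 + ["rare"] * 4
y_pred = ["a"] * 8 + ["b"] * 6 + ["a"] * 2 + ["rare"] * 1 + ["b"] * 3
score = g_mean(y_true, y_pred)
```

Here plain accuracy is 15/20 = 0.75, but the per-class recalls are 1.0, 0.75 and 0.25, so the G-mean is noticeably lower. A classifier cannot score well on this measure by ignoring the rare class.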


2018 ◽  
Vol 8 (3) ◽  
pp. 293-304 ◽  
Author(s):  
Chukwuka Christian Ohueri ◽  
Wallace Imoudu Enegbuma ◽  
Ngie Hing Wong ◽  
Kuok King Kuok ◽  
Russell Kenley

Purpose The purpose of this paper is to develop a motivation framework that will enhance labour productivity for Iskandar Malaysia (IM) construction projects. The vision of the IM development corridor is to become Southern Peninsular Malaysia’s most developed region by the year 2025. IM cannot realise this vision without effective labour productivity. Previous studies have reported that the labour productivity of IM construction projects was six times lower than the labour productivity of Singapore construction projects, due to lack of motivation among IM labourers and a shortage of local skilled labour. Therefore, there is a need to study how to motivate IM construction labourers, so as to increase their productivity. Design/methodology/approach A quantitative research method was used to collect data from IM construction skilled labourers and construction professionals, using two sets of questionnaires. The respondents were selected using a purposive sampling technique. In total, 40 skilled labourers and 50 construction professionals responded to the questionnaire survey, and the data were analysed using Statistical Package for Social Science software (version 22). Findings The analysis revealed the major factors that motivate labourers participating in IM construction projects. The factors were ranked hierarchically using the Relative Importance Index (RII), and the outcome of the ranking indicated that effective management, viable construction practices, financial incentives, continuous training and development, and a safe working environment were the most significant motivation strategies that positively influence IM construction labourers. Originality/value The study developed and validated a framework that can be used to boost the morale of IM construction labourers, so that their productivity can be increased. 
Implementation of the established motivation framework will also lead to career progression of IM construction labourers, based on the training elements in the framework. This career prospect will attract local skilled labourers to participate in IM construction projects.
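The abstract above ranks factors by Relative Importance Index (RII) without stating the formula. A widely used definition, assumed here, is the sum of respondents' ratings divided by the product of the highest possible rating and the number of respondents. The 5-point scale, the function name and the ratings below are hypothetical and are not the study's data.

```python
def relative_importance_index(ratings, max_rating=5):
    """RII = sum of ratings / (highest possible rating * number of respondents)."""
    return sum(ratings) / (max_rating * len(ratings))


# Hypothetical 5-point Likert responses for two motivation factors.
responses = {
    "financial incentives": [5, 4, 5, 4, 5],
    "safe working environment": [4, 3, 4, 4, 3],
}
rii = {factor: relative_importance_index(r) for factor, r in responses.items()}
ranking = sorted(rii, key=rii.get, reverse=True)
```

Under this definition the RII lies between 0 and 1, and factors are ranked from the highest value downwards.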


2019 ◽  
Vol 31 (5) ◽  
pp. 740-757 ◽  
Author(s):  
Syed Ali Raza Shah ◽  
Khairur Rijal Jamaludin ◽  
Hayati Habibah Abdul Talib ◽  
Sha’ri Mohd Yusof

Purpose The purpose of this paper is to identify the critical success factors (CSFs) of integrated quality environmental management (IQEM) and analyze their impact on operational performance (OP) and environmental performance (EP) in food processing small and medium-sized enterprises (SMEs) in Pakistan. Design/methodology/approach The study is based on data collected with a survey questionnaire through a snowball sampling technique. A total of 302 food processing SMEs operating in Punjab, Pakistan, responded to the survey. SPSS version 23 and SmartPLS 3 were used for data analysis. Findings The literature review identified leadership (LS), employee management (EM), strategic planning (SP), information management (IM), process management (PM), supplier management (SM) and customer focus (CF) as CSFs of IQEM. The results of this study show a significant relationship between all identified CSFs and OP in food processing SMEs, whereas EM, IM, PM and SM had no significant relationship with EP. Research limitations/implications Although this study collected data from only one province, Punjab, it is still relevant in identifying the CSFs for IQEM implementation within food processing SMEs to improve performance. Originality/value Despite the wide spread of integrated systems practices in developed countries, little attention has been paid to implementing and assessing IQEM initiatives by organizations in developing countries. Thus, this study identified CSFs of IQEM based on empirical studies and analyzed their impact on the OP and EP of food processing SMEs.

