An Analysis of Large Data Classification using Ensemble Neural Network

2018 ◽  
Vol 7 (2.14) ◽  
pp. 53
Author(s):  
Mumtazimah Mohamad ◽  
Wan Nor Shuhadah Wan Nik ◽  
Zahrahtul Amani Zakaria ◽  
Arifah Che Alhadi

In this paper, operational and complexity analyses are investigated for a proposed model of ensemble Artificial Neural Network (ANN) multiple classifiers. The main idea is to employ more classifiers to obtain a more accurate prediction and to enhance classification capability for larger data. The classification results are analyzed between a single classifier and multiple classifiers, followed by estimates of the upper bounds of the converged functional error under partitioning of the benchmark dataset. The estimates derived using the Apriori method show that the proposed ensemble ANN algorithm with a different approach is feasible, in that problems with a high number of inputs and classes can be solved with a time complexity of O(n^k) for some k, i.e., in polynomial time. This result is in line with the significant performance achieved by the diversity rule applied through the reordering technique. In conclusion, an ensemble heterogeneous ANN classifier is practical and relevant to the theoretical and experimental study of combiners for ensemble ANN classifier systems on large datasets.
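The abstract does not include implementation details; as a rough illustration of the ensemble idea it describes, the following sketch trains several ANN classifiers on partitions of a generic dataset and combines them by majority voting. scikit-learn and a synthetic dataset are assumed; the paper's reordering and diversity rules are not reproduced.

```python
# Minimal sketch of an ensemble of ANN classifiers combined by majority voting.
# Illustrates the general ensemble-over-partitions idea only, not the paper's
# exact partitioning or reordering technique.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=5000, n_features=20, n_classes=4,
                           n_informative=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)

# Train several heterogeneous ANNs, each on a different partition of the data.
partitions = np.array_split(np.random.permutation(len(X_tr)), 5)
members = []
for i, idx in enumerate(partitions):
    net = MLPClassifier(hidden_layer_sizes=(64,) if i % 2 else (32, 32),
                        max_iter=300, random_state=i)
    net.fit(X_tr[idx], y_tr[idx])
    members.append(net)

# Combine member predictions by majority vote.
votes = np.stack([m.predict(X_te) for m in members])
ensemble_pred = np.apply_along_axis(lambda v: np.bincount(v).argmax(), 0, votes)
print("single net :", accuracy_score(y_te, members[0].predict(X_te)))
print("ensemble   :", accuracy_score(y_te, ensemble_pred))
```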

2021 ◽  
Author(s):  
Zhenling Jiang

This paper studies price bargaining when both parties have left-digit bias when processing numbers. The empirical analysis focuses on the auto finance market in the United States, using a large data set of 35 million auto loans. Incorporating left-digit bias in bargaining is motivated by several intriguing observations. The scheduled monthly payments of auto loans bunch at both $9- and $0-ending digits, especially over $100 marks. In addition, $9-ending loans carry a higher interest rate, and $0-ending loans have a lower interest rate. We develop a Nash bargaining model that allows for left-digit bias from both consumers and finance managers of auto dealers. Results suggest that both parties are subject to this basic human bias: the perceived difference between $9- and the next $0-ending payments is larger than $1, especially between $99- and $00-ending payments. The proposed model can explain the phenomena of payments bunching and differential interest rates for loans with different ending digits. We use counterfactuals to show a nuanced impact of left-digit bias, which can both increase and decrease the payments. Overall, bias from both sides leads to a $33 increase in average payment per loan compared with a benchmark case with no bias. This paper was accepted by Matthew Shum, marketing.
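For intuition only, the sketch below uses a generic left-digit-bias formulation from the literature, in which digits to the right of the leading digit are discounted by a factor theta. It is not the paper's estimated Nash bargaining model, and theta = 0.2 is a made-up value.

```python
# Illustrative left-digit bias: digits after the leading digit are discounted by theta.
# Generic formulation, not the paper's estimated specification; theta is arbitrary.
def perceived(payment, theta=0.2):
    leading_place = 10 ** (len(str(int(payment))) - 1)
    leading = (int(payment) // leading_place) * leading_place
    return leading + (1 - theta) * (payment - leading)

# The perceived gap between a $199 and a $200 monthly payment exceeds the true $1 gap,
# which is consistent with payments bunching just below round-hundred marks.
gap = perceived(200) - perceived(199)
print(f"perceived gap: ${gap:.2f}")   # 1 + 99*theta = $20.80 for theta = 0.2
```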


Author(s):  
Aya Taleb ◽  
Rizik M. H. Al-Sayyed ◽  
Hamed S. Al-Bdour

In this research, a new technique to improve the accuracy of link prediction for most networks is proposed; it is based on a prediction ensemble that uses a voting merging technique. The proposed ensemble, called the Jaccard, Katz, and Random models Wrapper (JKRW), scales up prediction accuracy and provides better predictions for populations of different sizes, including small, medium, and large data. The proposed model has been tested and evaluated using the area under the curve (AUC) and accuracy (ACC) measures. These measures were also applied to the other models used in this study, which were built on the Jaccard coefficient, Katz, Adamic/Adar, and preferential attachment. Results from applying the evaluation metrics verify the improvement in JKRW's effectiveness and stability in comparison to the other tested models. The results from applying the Wilcoxon signed-rank test (a non-parametric paired test) indicate that JKRW differs significantly from the other models in the different populations at a 0.95 confidence level.
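As a rough illustration of the voting-merge idea (not the exact JKRW rule), the sketch below scores candidate links with the Jaccard coefficient, a Katz index, and a random baseline, then keeps links proposed by a majority of the three scorers. networkx and numpy are assumed, and the graph is a stand-in.

```python
# Minimal sketch of an unweighted-voting ensemble of link-prediction scores.
import numpy as np
import networkx as nx

G = nx.karate_club_graph()
candidates = list(nx.non_edges(G))

def top_k(scored_pairs, k):
    return {pair for pair, s in sorted(scored_pairs, key=lambda t: -t[1])[:k]}

# Jaccard coefficient scores for all non-edges.
jaccard = [((u, v), p) for u, v, p in nx.jaccard_coefficient(G, candidates)]

# Katz index via (I - beta*A)^-1 - I, with beta below 1/lambda_max.
A = nx.to_numpy_array(G)
beta = 0.9 / max(abs(np.linalg.eigvals(A)))
S = np.linalg.inv(np.eye(len(A)) - beta * A) - np.eye(len(A))
katz = [((u, v), S[u, v]) for u, v in candidates]

# Random scorer, standing in for the "Random models" component.
rng = np.random.default_rng(0)
random_scores = [((u, v), rng.random()) for u, v in candidates]

# Majority vote: keep pairs proposed by at least two of the three scorers.
k = 20
ballots = [top_k(s, k) for s in (jaccard, katz, random_scores)]
predicted = {p for p in set.union(*ballots) if sum(p in b for b in ballots) >= 2}
print(len(predicted), "links predicted by the ensemble")
```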


2021 ◽  
Vol 10 (10) ◽  
pp. 698
Author(s):  
Ruren Li ◽  
Shoujia Li ◽  
Zhiwei Xie

Integrated development of urban agglomerations is important for regional economic research and management. In this paper, a method is proposed to study the integrated development of an urban agglomeration using a trajectory gravity model. It can quantitatively analyze the gravitational strength of the core city toward other cities and characterize the spatial trajectory of its gravitational direction, expansion, and so on. The main idea is to perform a fitting analysis between the urban axes and the gravitational lines. The correlation coefficients retrieved from the fitting analysis reflect the correlation between the two indices. For different cities in the same year, a higher value means a stronger relationship, and there is a clear gravitational force between the cities when the value is above 0.75. For most cities, the gravitational force with the core city increases over the years. At the same time, the urban axes tend to grow in the direction of the gravitational force between cities, and there is a clear tendency for the trajectories of the cities to move closer together. The proposed model was applied to the integrated development of the central Liaoning urban agglomeration in China from 2008 to 2016. The results show that the cities are constantly attracted to each other through urban gravity.
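For intuition, a gravity-model attraction between a core city and other cities could be computed as sketched below; the masses, distances, and the gravitational constant are hypothetical, and the paper's trajectory fitting against urban axes is not reproduced.

```python
# Minimal sketch of gravity-model strength between a core city and other cities,
# using hypothetical "mass" (e.g., GDP or population) and distance values.
def gravity(mass_core, mass_city, distance_km, G=1.0):
    """Classic gravity-model attraction: G * M1 * M2 / d^2."""
    return G * mass_core * mass_city / distance_km ** 2

# Hypothetical masses (arbitrary units) and distances to the core city.
cities = {"A": (850, 60.0), "B": (420, 150.0), "C": (300, 95.0)}
core_mass = 2300
for name, (mass, dist) in cities.items():
    print(name, round(gravity(core_mass, mass, dist), 1))
```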


2020 ◽  
Vol 8 (6) ◽  
pp. 5820-5825

Human-computer interaction is a fast-growing area of research in which physiological signals are used to identify human emotion states. Identifying emotion states can be done using various approaches; one approach that has gained research interest uses physiological signals in the form of EEG. In the present work, a novel approach is proposed to elicit emotion states using 3-D video-audio stimuli. Around 66 subjects were involved in data acquisition using a 32-channel Enobio device. An FIR filter is used to preprocess the acquired raw EEG signals, and the desired frequency bands (alpha, delta, beta, and theta) are extracted using an 8-level DWT. Statistical features, the Hurst exponent, entropy, power, energy, and the differential entropy of each band are computed. An artificial neural network is implemented as a sequential Keras model and applied to the extracted features to classify them into four classes (HVLA, HVHA, LVHA, and LVLA) and eight discrete emotion states: calm, relaxed, happy, joyful, sad, fearful, tense, and bored. The ANN classifier performs better for four classes than for eight, with classification rates of 90.835% and 74.0446%, respectively. The proposed model achieves a better performance rate in detecting discrete emotion states. This model can be used to build health applications such as stress/depression detection and entertainment applications such as an emotional DJ.
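A minimal sketch of the classification stage, assuming a sequential Keras model applied to precomputed band-wise features; the layer sizes and feature dimension are placeholders, not the values used in the study.

```python
# Sequential Keras ANN applied to precomputed band-wise features (sketch only).
import numpy as np
from tensorflow import keras

n_features, n_classes = 160, 4            # 4-class case: HVLA, HVHA, LVHA, LVLA
X = np.random.rand(1000, n_features)      # stand-in for DWT/statistical features
y = keras.utils.to_categorical(np.random.randint(n_classes, size=1000), n_classes)

model = keras.Sequential([
    keras.Input(shape=(n_features,)),
    keras.layers.Dense(128, activation="relu"),
    keras.layers.Dense(64, activation="relu"),
    keras.layers.Dense(n_classes, activation="softmax"),
])
model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
model.fit(X, y, epochs=10, batch_size=32, validation_split=0.2, verbose=0)
```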


2021 ◽  
Vol 14 (11) ◽  
pp. 2369-2382
Author(s):  
Monica Chiosa ◽  
Thomas B. Preußer ◽  
Gustavo Alonso

Data analysts often need to characterize a data stream as a first step to its further processing. Some of the initial insights to be gained include, e.g., the cardinality of the data set and its frequency distribution. Such information is typically extracted by using sketch algorithms, now widely employed to process very large data sets in manageable space and in a single pass over the data. Often, analysts need more than one parameter to characterize the stream. However, computing multiple sketches becomes expensive even when using high-end CPUs. Exploiting the increasing adoption of hardware accelerators, this paper proposes SKT, an FPGA-based accelerator that can compute several sketches along with basic statistics (average, max, min, etc.) in a single pass over the data. SKT has been designed to characterize a data set by calculating its cardinality, its second frequency moment, and its frequency distribution. The design processes data streams coming either from PCIe or TCP/IP, and it is built to fit emerging cloud service architectures, such as Microsoft's Catapult or Amazon's AQUA. The paper explores the trade-offs of designing sketch algorithms on a spatial architecture and how to combine several sketch algorithms into a single design. The empirical evaluation shows how SKT on an FPGA offers a significant performance gain over high-end, server-class CPUs.
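The FPGA design itself cannot be reproduced in a few lines, but the single-pass idea, updating several summaries per stream element, can be sketched in software: below, a Count-Min sketch for frequencies is maintained alongside running basic statistics. This CPU code only illustrates the dataflow, not SKT.

```python
# Single-pass update of multiple stream summaries (software sketch).
import numpy as np

class CountMin:
    def __init__(self, width=2048, depth=4, seed=0):
        rng = np.random.default_rng(seed)
        self.seeds = rng.integers(1, 2**31 - 1, size=depth)
        self.table = np.zeros((depth, width), dtype=np.int64)

    def _cols(self, x):
        return [hash((int(s), x)) % self.table.shape[1] for s in self.seeds]

    def add(self, x):
        for row, col in enumerate(self._cols(x)):
            self.table[row, col] += 1

    def estimate(self, x):
        return min(self.table[row, col] for row, col in enumerate(self._cols(x)))

stream = np.random.zipf(1.5, size=100_000)
cm, running_min, running_max, total = CountMin(), np.inf, -np.inf, 0
for x in stream:                      # one pass, all summaries updated together
    cm.add(int(x))
    running_min, running_max = min(running_min, x), max(running_max, x)
    total += x
print("avg:", total / len(stream), "min:", running_min, "max:", running_max)
print("estimated frequency of value 1:", cm.estimate(1))
```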


Author(s):  
Akhil Bansal ◽  
Piyush Kumar Shukla ◽  
Manish Kumar Ahirwar

Nowadays, IoT is an emerging technology that has evolved in many areas such as healthcare, smart homes, agriculture, smart cities, education, industry, and automation. Many sensor- and actuator-based devices deployed in these areas collect data or sense the environment. These data are further used to classify complicated problems related to the particular environment around us, which also increases the efficiency, productivity, accuracy, and economic benefit of the devices. The main aim of this survey article is to examine how the data collected by these sensors in Internet of Things-based applications are handled and classified by classification algorithms. The article also reviews various classification algorithms, such as KNN, Random Forest, Logistic Regression, and SVM, with different evaluation parameters such as accuracy and cross-validation, applied to the large datasets generated by sensor-based devices in various IoT-based applications. In addition, it gives a brief review of an advanced form of IoT called CIoT.
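As a minimal illustration of the kind of comparison the survey covers, the sketch below evaluates the named classifiers with 5-fold cross-validated accuracy on a placeholder dataset, assuming scikit-learn.

```python
# Cross-validated comparison of the classifiers named in the survey (sketch only).
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = make_classification(n_samples=2000, n_features=15, random_state=0)
models = {
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(kernel="rbf"),
}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    print(f"{name:20s} accuracy = {scores.mean():.3f} +/- {scores.std():.3f}")
```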


2013 ◽  
Vol 2013 ◽  
pp. 1-9 ◽  
Author(s):  
Reza Kiani Mavi ◽  
Sajad Kazemi ◽  
Jay M. Jahangiri

Data envelopment analysis (DEA) is used to evaluate the performance of decision making units (DMUs) with multiple inputs and outputs in a homogeneous group. The relative efficiency score acquired for each decision making unit lies between zero and one, and a number of units may share an efficiency score of one. DEA successfully divides the units into two categories, efficient DMUs and inefficient DMUs: a ranking for inefficient DMUs is given, but DEA does not provide further information about the efficient DMUs. One popular method for evaluating and ranking DMUs is the common set of weights (CSW) method. We generate a CSW model that considers nondiscretionary inputs, which are beyond the control of DMUs, using the ideal point method. The main idea of this approach is to minimize the distance between the evaluated decision making unit and the ideal decision making unit (ideal point). In an empirical example, we put the proposed model to the test by applying it to data from 20 bank branches and ranking their efficient units.
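For readers unfamiliar with DEA, the sketch below solves the classical input-oriented CCR multiplier model with scipy.optimize.linprog on hypothetical data; the paper's common-set-of-weights model with nondiscretionary inputs and the ideal point method is more involved and is not reproduced here.

```python
# Classical input-oriented CCR multiplier model as a linear program (sketch only).
import numpy as np
from scipy.optimize import linprog

X = np.array([[20.0, 300], [30, 200], [40, 100], [20, 200], [10, 400]])  # inputs
Y = np.array([[100.0, 90], [80, 100], [120, 80], [60, 60], [90, 110]])   # outputs
n, m, s = X.shape[0], X.shape[1], Y.shape[1]

def ccr_efficiency(o):
    # Variables: [u_1..u_s, v_1..v_m]. Maximize u.y_o subject to v.x_o = 1,
    # u.y_j - v.x_j <= 0 for every DMU j, and u, v >= 0.
    c = np.concatenate([-Y[o], np.zeros(m)])
    A_ub = np.hstack([Y, -X])
    b_ub = np.zeros(n)
    A_eq = np.concatenate([np.zeros(s), X[o]]).reshape(1, -1)
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (s + m))
    return -res.fun

for o in range(n):
    print(f"DMU {o}: efficiency = {ccr_efficiency(o):.3f}")
```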


Author(s):  
Mark Kirk ◽  
Marjorie Erickson ◽  
Richard Link

In 2006, EricksonKirk and EricksonKirk proposed a model describing a temperature dependence for upper-shelf fracture toughness (JIc), based on the Zerilli-Armstrong (ZA) temperature dependence of the flow stress, that was common to the large number of ferritic steel datasets studied. The equation describing the temperature dependence of JIc was found to be a simple scalar multiple of the temperature dependence predicted by ZA for flow stress. Since that time, a large dataset containing many experimental measurements of JIc has been developed for the purpose of assessing and refining the previously proposed model. The new data, reported herein, validate the previously proposed model of JIc temperature dependence but suggest that revisions to the previously proposed model of JIc uncertainty are needed to ensure the applicability of the model to both low- and high-fracture-toughness steels.
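Purely as an illustration of the stated relationship, the sketch below writes JIc(T) as a scalar multiple of the Zerilli-Armstrong thermal term for BCC flow stress; every coefficient (B, c3, c4, strain_rate) is a placeholder, not a fitted value from the 2006 model or the new dataset.

```python
# Illustration of JIc(T) proportional to the ZA thermal term; all values are placeholders.
import numpy as np

def za_temperature_term(T_K, c3=2e-3, c4=1e-4, strain_rate=1e-4):
    """Thermal part of the ZA flow-stress model for BCC steels: exp(-c3*T + c4*T*ln(rate))."""
    return np.exp(-c3 * T_K + c4 * T_K * np.log(strain_rate))

def jic(T_K, B=300.0):
    """JIc(T) modeled as a scalar multiple of the ZA thermal term (placeholder units, kJ/m^2)."""
    return B * za_temperature_term(T_K)

for T in (250, 300, 400, 500):
    print(f"T = {T} K: JIc ~ {jic(T):.1f} kJ/m^2")
```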


Author(s):  
Tomasz J. Zlamaniec ◽  
Kuo-Ming Chao ◽  
Nick Godwin

It is a growing trend for public organizations to digitize their large datasets and publish them as open linked data for public users to query and for other applications to utilize further. Different users' queries, arriving with varying frequencies over time, create different workload patterns on the servers, which cannot guarantee QoS during peak usage. Materialization is a well-known, effective method for reducing peaks, but it is not used by semantic web systems because of their frequently evolving schemas. This research estimates workloads based on previous queries, analyzes and normalizes their structures to materialize views, and maps the queries to the views with populated data. By analyzing how the access patterns of individual views contribute to the overall system workload, the proposed model aims to select the candidates offering the highest reduction of the peak workload. Consequently, rather than optimizing all queries equally, a system using the new selection method can offer higher query throughput when it is most needed, allowing for a higher number of concurrent users without compromising QoS during peak usage. Finally, two case studies were used to evaluate the proposed method.
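As a rough sketch of the selection idea (not the paper's cost model), the following greedy procedure picks the candidate views that most reduce the peak of the remaining workload, given hypothetical per-view savings over a day.

```python
# Greedy view selection aimed at reducing the peak of the remaining workload (sketch only).
import numpy as np

def select_views(total_load, view_savings, budget):
    """total_load: array of length T; view_savings: dict name -> array of length T."""
    remaining, chosen = total_load.astype(float).copy(), []
    for _ in range(budget):
        best, best_peak = None, remaining.max()
        for name, saving in view_savings.items():
            if name in chosen:
                continue
            peak = np.maximum(remaining - saving, 0).max()
            if peak < best_peak:
                best, best_peak = name, peak
        if best is None:          # no further peak reduction possible
            break
        chosen.append(best)
        remaining = np.maximum(remaining - view_savings[best], 0)
    return chosen, remaining.max()

hours = 24
load = np.array([50]*8 + [400]*4 + [120]*12)          # hypothetical peak around midday
savings = {"v1": np.array([0]*8 + [250]*4 + [0]*12),  # helps only during the peak
           "v2": np.array([30]*hours),                # helps uniformly
           "v3": np.array([0]*8 + [100]*4 + [60]*12)}
print(select_views(load, savings, budget=2))
```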


Author(s):  
Yusuke Iwasawa ◽  
Kotaro Nakayama ◽  
Ikuko Yairi ◽  
Yutaka Matsuo

Deep neural networks have been successfully applied to activity recognition with wearables in terms of recognition performance. However, the black-box nature of neural networks could lead to privacy concerns: it is generally hard to anticipate what neural networks learn from data, so they may unintentionally learn features that are highly discriminative of user information, which increases the risk of information disclosure. In this study, we analyzed the features learned by conventional deep neural networks when applied to wearable data to confirm this phenomenon. Based on the results of our analysis, we propose the use of an adversarial training framework to suppress the risk of disclosing sensitive or unintended information. Our proposed model considers both an adversarial user classifier and a regular activity classifier during training, which allows the model to learn representations that help the classifier distinguish the activities but, at the same time, prevent it from accessing user-discriminative information. This paper provides an empirical validation of the privacy issue and of the efficacy of the proposed method using three activity recognition tasks based on wearable data. The empirical validation shows that our proposed method suppresses these concerns without any significant performance degradation compared to conventional deep nets on all three tasks.
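A common way to realize such an adversarial setup is a DANN-style gradient-reversal layer; the sketch below follows that pattern and is not necessarily the authors' exact architecture (layer sizes, shapes, and class counts are placeholders).

```python
# DANN-style sketch: a shared encoder feeds an activity head normally and a user head
# through gradient reversal, so the encoder is pushed to discard user-discriminative cues.
import tensorflow as tf

@tf.custom_gradient
def grad_reverse(x):
    def grad(dy):
        return -dy          # flip gradients flowing back into the encoder
    return tf.identity(x), grad

class GradReverse(tf.keras.layers.Layer):
    def call(self, x):
        return grad_reverse(x)

inputs = tf.keras.Input(shape=(128, 6))          # e.g. windows of accel/gyro data
h = tf.keras.layers.Conv1D(32, 5, activation="relu")(inputs)
h = tf.keras.layers.GlobalAveragePooling1D()(h)
activity = tf.keras.layers.Dense(6, activation="softmax", name="activity")(h)
user = tf.keras.layers.Dense(20, activation="softmax", name="user")(GradReverse()(h))

model = tf.keras.Model(inputs, [activity, user])
model.compile(optimizer="adam",
              loss={"activity": "sparse_categorical_crossentropy",
                    "user": "sparse_categorical_crossentropy"},
              loss_weights={"activity": 1.0, "user": 0.1})
# model.fit(x, {"activity": y_act, "user": y_user}, ...)
```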

