Data Improvement Model Based on ECG Biometric for User Authentication and Identification

Sensors ◽  
2020 ◽  
Vol 20 (10) ◽  
pp. 2920 ◽  
Author(s):  
Alex Barros ◽  
Paulo Resque ◽  
João Almeida ◽  
Renato Mota ◽  
Helder Oliveira ◽  
...  

The rapid spread of wearable technologies has motivated the collection of a variety of signals, such as pulse rate, electrocardiogram (ECG), electroencephalogram (EEG), and others. Because these devices perform so many tasks and store a significant amount of personal data, concern about how that data can be exposed is growing, since wearable devices can become an attack vector or a security breach. In this context, biometrics has also expanded to meet the new authentication requirements of online applications and is now used in identification systems by a large number of people. Existing works on ECG-based user authentication do not consider a population size close to that of a real application, and finding real data with ECG recordings from a large number of people is a challenge. This work investigates a set of steps that improve results when working with a larger number of target classes in a biometric identification scenario. These steps, such as increasing the number of examples, removing outliers, and including a few additional features, are shown to increase performance on a large data set. We propose a data improvement model for ECG biometric identification (user identification based on electrocardiogram—DETECT), which improves the performance of the biometric system for a greater number of subjects, closer to a real-world security system. The DETECT model increases precision from 78% to 92% with 1500 subjects, and from 90% to 95% with 100 subjects. Moreover, a good False Rejection Rate (0.064003) and False Acceptance Rate (0.000033) were demonstrated. We developed and evaluated our proposed method on the PhysioNet Computing in Cardiology 2018 database.
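As a rough illustration of the data-improvement steps listed above (outlier removal, more examples, a few extra features), here is a minimal Python sketch on hypothetical beat-level ECG feature vectors. The feature layout, IsolationForest outlier filter, and RandomForest identifier are illustrative assumptions, not the components of the DETECT model.

```python
# Minimal sketch of the improvement steps on placeholder beat-level ECG features.
import numpy as np
from sklearn.ensemble import IsolationForest, RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 12))          # 12 per-beat features (placeholder)
y = rng.integers(0, 100, size=5000)      # 100 subjects (placeholder labels)

# Step 1: drop outlying beats in the pooled feature space.
mask = IsolationForest(random_state=0).fit_predict(X) == 1
X, y = X[mask], y[mask]

# Step 2: append a few extra statistical features (e.g., per-beat mean/std).
X = np.hstack([X, X.mean(axis=1, keepdims=True), X.std(axis=1, keepdims=True)])

# Step 3: train and evaluate a multi-class identifier.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print(precision_score(y_te, clf.predict(X_te), average="macro", zero_division=0))
```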

2019 ◽  
Vol 23 (Suppl. 1) ◽  
pp. 113-120
Author(s):  
Burhaneddin Izgi

We introduce a new notion of transitive relative return rate and present its applications based on stochastic differential equations. First, we define the notion of a relative return rate and show how to construct the transitive relative return rate (TRRR) from it. Then, we state and prove some propositions and theorems about the relative return rate and TRRR. Moreover, we present the theoretical framework generalizing TRRR to the n ≥ 3 case and prove it as well. Furthermore, we illustrate our approach with real-data applications of daily relative return rates for the Borsa Istanbul-30 (BIST-30) index and Intel Corporation with respect to the daily interest rate of the Central Bank of the Republic of Turkey between June 18, 2003 and June 17, 2013. For this purpose, we perform simulations via the Milstein method. We demonstrate the usefulness of the relative return rate on this large real data set using numerical solutions of the stochastic differential equations. The simulation results show that the proposed approach closely approximates the real data.
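For readers unfamiliar with the Milstein scheme used in these simulations, the following is a minimal sketch for a geometric Brownian motion price path. The drift and volatility values are placeholders, not the BIST-30 or Intel calibrations from the paper.

```python
# Milstein-scheme sketch for geometric Brownian motion with placeholder parameters.
import numpy as np

def milstein_gbm(s0, mu, sigma, t_final, n_steps, seed=0):
    rng = np.random.default_rng(seed)
    dt = t_final / n_steps
    s = np.empty(n_steps + 1)
    s[0] = s0
    for i in range(n_steps):
        dw = rng.normal(0.0, np.sqrt(dt))
        # Milstein update: Euler term plus the 0.5*sigma^2*S*(dW^2 - dt) correction.
        s[i + 1] = (s[i] + mu * s[i] * dt + sigma * s[i] * dw
                    + 0.5 * sigma**2 * s[i] * (dw**2 - dt))
    return s

path = milstein_gbm(s0=100.0, mu=0.05, sigma=0.2, t_final=1.0, n_steps=252)
daily_returns = np.diff(path) / path[:-1]   # simple daily return rates
```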


2019 ◽  
Author(s):  
Martin Papenberg ◽  
Gunnar W. Klau

Numerous applications in psychological research require that a pool of elements is partitioned into multiple parts. While many applications seek groups that are well-separated, i.e., dissimilar from each other, others require the different groups to be as similar as possible. Examples include the assignment of students to parallel courses, assembling stimulus sets in experimental psychology, splitting achievement tests into parts of equal difficulty, and dividing a data set for cross validation. We present anticlust, an easy-to-use and free software package for solving these problems quickly and in an automated manner. The package anticlust is an open source extension to the R programming language and implements the methodology of anticlustering. Anticlustering divides elements into similar parts, ensuring similarity between groups by enforcing heterogeneity within groups. Thus, anticlustering is the direct reversal of cluster analysis, which aims to maximize homogeneity within groups and dissimilarity between groups. Our package anticlust implements two anticlustering criteria, reversing the clustering methods k-means and cluster editing, respectively. In a simulation study, we show that anticlustering returns excellent results and outperforms alternative approaches such as random assignment and matching. In three example applications, we illustrate how to apply anticlust to real data sets. We demonstrate how to assign experimental stimuli to equivalent sets based on norming data, how to divide a large data set for cross validation, and how to split a test into parts of equal item difficulty and discrimination.
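The following is a toy Python sketch of the k-means anticlustering criterion that anticlust reverses: starting from a balanced random partition, pairwise swaps are accepted whenever they increase total within-group variance, which makes the groups more similar to each other. The exchange scheme and all parameters are illustrative assumptions; anticlust itself is an R package with its own optimized algorithms.

```python
# Toy exchange heuristic for k-means anticlustering (maximize within-group variance).
import numpy as np

def within_group_ss(X, labels, k):
    total = 0.0
    for g in range(k):
        Xg = X[labels == g]
        total += ((Xg - Xg.mean(axis=0)) ** 2).sum()
    return total

def anticluster(X, k=2, n_iter=2000, seed=0):
    rng = np.random.default_rng(seed)
    n = len(X)
    labels = np.repeat(np.arange(k), int(np.ceil(n / k)))[:n]  # balanced start
    rng.shuffle(labels)
    best = within_group_ss(X, labels, k)
    for _ in range(n_iter):
        i, j = rng.choice(n, size=2, replace=False)
        if labels[i] == labels[j]:
            continue
        labels[i], labels[j] = labels[j], labels[i]
        cand = within_group_ss(X, labels, k)
        if cand > best:                      # keep swaps that raise heterogeneity
            best = cand
        else:                                # otherwise undo the swap
            labels[i], labels[j] = labels[j], labels[i]
    return labels

X = np.random.default_rng(1).normal(size=(40, 3))
groups = anticluster(X, k=2)
```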


2012 ◽  
Vol 8 (4) ◽  
pp. 82-107 ◽  
Author(s):  
Renxia Wan ◽  
Yuelin Gao ◽  
Caixia Li

Several algorithms for clustering large data sets have been presented to date. Most clustering approaches are crisp ones, which are not well suited to the fuzzy case. In this paper, the authors explore a single-pass approach to fuzzy possibilistic clustering over large data sets. The basic idea of the proposed approach (weighted fuzzy-possibilistic c-means, WFPCM) is to use a modified possibilistic c-means (PCM) algorithm to cluster the weighted data points and centroids, with one data segment as a unit. Experimental results on both synthetic and real data sets show that WFPCM saves significant memory compared with the fuzzy c-means (FCM) algorithm and the possibilistic c-means (PCM) algorithm. Furthermore, the proposed algorithm has excellent immunity to noise, avoids splitting or merging the exact clusters into inaccurate ones, and ensures the integrity and purity of the natural classes.
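To show how per-point weights enter such a scheme, here is a compact weighted fuzzy c-means sketch in Python, where each weight stands for the number of raw points a summary point represents. This illustrates only the weighted membership and centroid updates that single-pass methods like WFPCM build on; the possibilistic typicality terms of PCM are omitted, so this is not the authors' algorithm.

```python
# Weighted fuzzy c-means sketch (possibilistic terms omitted for brevity).
import numpy as np

def weighted_fcm(X, w, c=3, m=2.0, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=c, replace=False)]
    for _ in range(n_iter):
        # Squared distances of every point to every centroid, shape (n, c).
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2) + 1e-12
        # Fuzzy memberships: proportional to d2^(-1/(m-1)), normalized per point.
        inv = d2 ** (-1.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
        # Weighted centroid update: each point counts w_k times.
        um = (u ** m) * w[:, None]
        centroids = (um.T @ X) / um.sum(axis=0)[:, None]
    return u, centroids

X = np.random.default_rng(2).normal(size=(300, 2))
w = np.ones(len(X))                      # unit weights reduce to ordinary FCM
memberships, centers = weighted_fcm(X, w)
```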


1994 ◽  
Vol 21 (1) ◽  
pp. 41-43 ◽  
Author(s):  
W. Burt Thompson

I describe the use of a student-designed Student Information Questionnaire that generates a large data set useful for teaching a variety of statistical procedures and concepts. This questionnaire helps statistics instructors minimize the use of uninteresting artificial data in their classes. Also, students learn firsthand that data analysis is an integral part of the research process, rather than an isolated set of procedures. Evaluations of the technique suggest that students find real data more interesting than artificial data and more helpful for learning statistics.


2020 ◽  
Vol 63 (10) ◽  
pp. 3488-3500
Author(s):  
Jun Ho Chai ◽  
Chang Huan Lo ◽  
Julien Mayor

Purpose This study introduces a framework to produce very short versions of the MacArthur–Bates Communicative Development Inventories (CDIs) by combining the Bayesian-inspired approach introduced by Mayor and Mani (2019) with an item response theory–based computerized adaptive testing that adapts to the ability of each child, in line with Makransky et al. (2016). Method We evaluated the performance of our approach—dynamically selecting maximally informative words from the CDI and combining parental response with prior vocabulary data—by conducting real-data simulations using four CDI versions having varying sample sizes on Wordbank—the online repository of digitalized CDIs: American English (a very large data set), Danish (a large data set), Beijing Mandarin (a medium-sized data set), and Italian (a small data set). Results Real-data simulations revealed that correlations exceeding .95 with full CDI administrations were reached with as few as 15 test items, with high levels of reliability, even when languages (e.g., Italian) possessed few digitalized administrations on Wordbank. Conclusions The current approach establishes a generic framework that produces very short (less than 20 items) adaptive early vocabulary assessments—hence considerably reducing their administration time. This approach appears to be robust even when CDIs have smaller samples in online repositories, for example, with around 50 samples per month-age.
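A minimal computerized-adaptive-testing sketch in the spirit described above is shown below: a Bayesian prior over the child's ability is updated after each parental response, and the next word is the one with maximum Fisher information under a 2PL item response model. The item parameters, the prior, and the simulated responses are placeholders, not the Wordbank-derived values used in the study.

```python
# Sketch of adaptive item selection with a Bayesian ability update (placeholder data).
import numpy as np

rng = np.random.default_rng(0)
n_items = 200
a = rng.uniform(0.8, 2.0, n_items)          # discrimination (placeholder)
b = rng.normal(0.0, 1.0, n_items)           # difficulty (placeholder)

grid = np.linspace(-4, 4, 201)              # ability grid
posterior = np.exp(-0.5 * grid**2)          # standard normal prior (unnormalized)
asked, true_theta = [], 0.7

def p_know(theta, i):
    return 1.0 / (1.0 + np.exp(-a[i] * (theta - b[i])))

for _ in range(15):                          # 15 items, as in the short forms
    theta_hat = grid[np.argmax(posterior)]
    p = p_know(theta_hat, np.arange(n_items))
    info = a**2 * p * (1 - p)                # 2PL Fisher information
    if asked:
        info[asked] = -np.inf                # never re-ask a word
    i = int(np.argmax(info))
    asked.append(i)
    response = rng.random() < p_know(true_theta, i)   # simulated parent answer
    like = p_know(grid, i) if response else 1 - p_know(grid, i)
    posterior *= like                        # Bayesian update of the ability
theta_hat = (grid * posterior).sum() / posterior.sum()  # posterior mean estimate
```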


2019 ◽  
Vol 8 (2) ◽  
pp. 159
Author(s):  
Morteza Marzjarani

Heteroscedasticity plays an important role in data analysis. In this article, this issue is presented along with a few different approaches for handling it. First, an iterative weighted least squares (IRLS) procedure and an iterative feasible generalized least squares (IFGLS) procedure are deployed, and proper weights for reducing heteroscedasticity are determined. Next, a new approach for handling heteroscedasticity is introduced. In this approach, after fitting a multiple linear regression (MLR) model or a general linear model (GLM) to a sufficiently large data set, the data is divided into two parts through inspection of the residuals, based on the results of testing for heteroscedasticity or via simulations. The first part contains the records where the absolute values of the residuals can be assumed small enough that heteroscedasticity is ignorable. Under this assumption, the error variances are small and close to those of their neighboring points; such error variances can be assumed known (but not necessarily equal). The second, remaining portion of the data is categorized as heteroscedastic. Using real data sets, it is concluded that this approach reduces the number of unusual (e.g., influential) data points suggested for further inspection and, more importantly, lowers the root MSE (RMSE), resulting in a more robust set of parameter estimates.
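Below is a hedged sketch of the residual-inspection idea: fit an OLS model, test for heteroscedasticity, treat observations with small absolute residuals as approximately homoscedastic, and refit the remainder with weighted least squares. The split threshold and the variance proxy are illustrative assumptions, not the article's rule.

```python
# Residual-based split and WLS refit on simulated heteroscedastic data.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 500)
y = 2.0 + 0.5 * x + rng.normal(0, 0.2 + 0.3 * x)    # variance grows with x
X = sm.add_constant(x)

ols = sm.OLS(y, X).fit()
lm_stat, lm_pval, f_stat, f_pval = het_breuschpagan(ols.resid, X)

thresh = np.quantile(np.abs(ols.resid), 0.5)         # placeholder split point
small = np.abs(ols.resid) <= thresh                  # "ignorable" heteroscedasticity
hetero = ~small                                      # remaining heteroscedastic part

# Refit the heteroscedastic part with weights inversely proportional to a
# simple variance proxy (here, the absolute OLS residuals).
w = 1.0 / np.maximum(np.abs(ols.resid[hetero]), 1e-6) ** 2
wls = sm.WLS(y[hetero], X[hetero], weights=w).fit()
print(ols.mse_resid ** 0.5, wls.mse_resid ** 0.5)    # compare RMSEs
```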


2016 ◽  
pp. 403-437 ◽  
Author(s):  
Khaled Mohammed Fouad ◽  
Basma Mohammed Hassan ◽  
Mahmoud F. Hassan

Biometric identification is a very good candidate technology that can facilitate trusted user authentication with minimal constraints on the security of the access point. However, most biometric identification techniques require special hardware, which complicates the access point and makes it costly. Keystroke recognition is a biometric identification technique that relies on the user's behavior while typing on the keyboard. It is more secure and does not need any additional hardware at the access point. This paper presents a behavioral biometric authentication method that identifies the user based on Keystroke Static Authentication (KSA) and describes an authentication system that demonstrates the ability of the keystroke technique to authenticate the user against a template profile saved in the database. In addition, an algorithm based on dynamic keystroke analysis has been presented, synthesized, simulated, and implemented on a Field Programmable Gate Array (FPGA). The proposed algorithm is tested on 25 individuals, achieving a False Rejection Rate (FRR) of about 4% and a False Acceptance Rate (FAR) of about 0%. This performance is reached using the same sampling text for all individuals. Two methods are used to implement the proposed approach: method one (H/W-based sorter) and method two (S/W-based sorter), which achieve execution times of about 50.653 ns and 9.650 ns, respectively. Method two achieved a lower execution time, that is, the time in which the proposed algorithm executes on the FPGA board, compared to some published results. Because the second method achieves a smaller execution time and area utilization, it is the preferred method for implementation on FPGA.
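To make the kind of features static keystroke authentication relies on concrete, here is a minimal Python sketch: a timing template (dwell and flight times) is built from enrollment samples of a fixed text, and a new attempt is accepted or rejected by its distance to that template. This is not the paper's FPGA sorter-based algorithm, and the threshold is an arbitrary assumption.

```python
# Keystroke-dynamics sketch: dwell/flight timing template and distance check.
import numpy as np

def timing_features(press, release):
    """press/release: arrays of key-down and key-up timestamps (seconds)."""
    dwell = release - press                       # how long each key is held
    flight = press[1:] - release[:-1]             # gap between consecutive keys
    return np.concatenate([dwell, flight])

def enroll(samples):
    feats = np.stack([timing_features(p, r) for p, r in samples])
    return feats.mean(axis=0), feats.std(axis=0) + 1e-6

def verify(template, attempt, threshold=3.0):
    mean, std = template
    z = np.abs(timing_features(*attempt) - mean) / std
    return z.mean() < threshold                   # accept if close to template

# Hypothetical enrollment: three typings of the same sampling text.
rng = np.random.default_rng(0)
base_press = np.cumsum(rng.uniform(0.15, 0.3, 10))
samples = [(base_press + rng.normal(0, 0.01, 10),
            base_press + 0.1 + rng.normal(0, 0.01, 10)) for _ in range(3)]
template = enroll(samples)
print(verify(template, samples[0]))               # genuine attempt -> True
```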


2020 ◽  
Vol 39 (5) ◽  
pp. 6419-6430
Author(s):  
Dusan Marcek

To forecast time series data, two methodological frameworks, statistical and computational intelligence modelling, are considered. The statistical approach is based on the theory of invertible ARIMA (Auto-Regressive Integrated Moving Average) models with Maximum Likelihood (ML) estimation. As a competitive tool to statistical forecasting models, we use the popular classic neural network (NN) of perceptron type. To train the NN, the Back-Propagation (BP) algorithm and heuristics such as the genetic and micro-genetic algorithms (GA and MGA) are implemented on the large data set. A comparative analysis of the selected learning methods is performed and evaluated. From the experiments we find that the optimal population size is likely 20, which yields the lowest training time of all NNs trained by the evolutionary algorithms, while the prediction accuracy is somewhat lower but still acceptable to managers.
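As a toy illustration of the role the population size plays in such evolutionary training, the following Python sketch evolves a population of weight vectors for a simple one-step forecaster. The model (a linear predictor on three lags), the GA operators, and all hyperparameters are illustrative assumptions, not the paper's perceptron or GA/MGA settings.

```python
# Toy genetic algorithm training a one-step forecaster on a synthetic series.
import numpy as np

rng = np.random.default_rng(0)
series = np.sin(np.arange(300) * 0.1) + rng.normal(0, 0.1, 300)
lags = 3
X = np.stack([series[i:i + lags] for i in range(len(series) - lags)])
y = series[lags:]

def fitness(w):
    return -np.mean((X @ w[:-1] + w[-1] - y) ** 2)            # negative MSE

pop_size, n_gen, sigma = 20, 200, 0.1                         # population of 20
pop = rng.normal(0, 1, size=(pop_size, lags + 1))
for _ in range(n_gen):
    scores = np.array([fitness(w) for w in pop])
    parents = pop[np.argsort(scores)[-pop_size // 2:]]        # keep the best half
    children = parents[rng.integers(0, len(parents), pop_size - len(parents))]
    children = children + rng.normal(0, sigma, children.shape)  # mutation
    pop = np.vstack([parents, children])
best = pop[np.argmax([fitness(w) for w in pop])]
print("training MSE:", -fitness(best))
```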


2019 ◽  
Vol XVI (2) ◽  
pp. 1-11
Author(s):  
Farrukh Jamal ◽  
Hesham Mohammed Reyad ◽  
Soha Othman Ahmed ◽  
Muhammad Akbar Ali Shah ◽  
Emrah Altun

A new three-parameter continuous model called the exponentiated half-logistic Lomax distribution is introduced in this paper. Basic mathematical properties of the proposed model are investigated, including raw and incomplete moments, skewness, kurtosis, generating functions, Rényi entropy, Lorenz, Bonferroni, and Zenga curves, probability weighted moments, the stress-strength model, order statistics, and record statistics. The model parameters are estimated by the maximum likelihood method, and the behaviour of these estimates is examined through a simulation study. The applicability of the new model is illustrated by applying it to a real data set.
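The exponentiated half-logistic Lomax density itself is not reproduced here, so the sketch below uses scipy's plain Lomax distribution as a stand-in to illustrate the general fitting workflow: numerical maximum likelihood estimation followed by a small parametric simulation to examine the behaviour of the estimate, loosely echoing the paper's simulation study.

```python
# MLE workflow sketch with scipy's Lomax distribution as a stand-in model.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
true_c = 2.5                                         # Lomax shape (placeholder)
data = stats.lomax.rvs(true_c, size=500, random_state=rng)

# Numerical MLE; location fixed at 0, as is usual for lifetime data.
c_hat, loc_hat, scale_hat = stats.lomax.fit(data, floc=0)

# Small simulation to examine the behaviour of the shape estimate.
reps = [stats.lomax.fit(stats.lomax.rvs(c_hat, scale=scale_hat, size=500,
                                        random_state=rng), floc=0)[0]
        for _ in range(200)]
print(c_hat, np.mean(reps), np.std(reps))
```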

