Reliability of the Data-Driven Identification algorithm with respect to incomplete input data

2019 ◽  
pp. 311-316
Author(s):  
Marie Dalémat ◽  
Michel Coret ◽  
Adrien Leygue ◽  
Erwan Verron
Sensors ◽  
2019 ◽  
Vol 19 (16) ◽  
pp. 3492 ◽  
Author(s):  
Jongkwon Choi ◽  
Youngmin Choo ◽  
Keunhwa Lee

Four data-driven methods—random forest (RF), support vector machine (SVM), feed-forward neural network (FNN), and convolutional neural network (CNN)—are applied to discriminate surface and underwater vessels in the ocean using low-frequency acoustic pressure data. Acoustic data are modeled considering a vertical line array by a Monte Carlo simulation using the underwater acoustic propagation model, KRAKEN, in the ocean environment of East Sea in Korea. The raw data are preprocessed and reorganized into the phone-space cross-spectral density matrix (pCSDM) and mode-space cross-spectral density matrix (mCSDM). Two additional matrices are generated using the absolute values of matrix elements in each CSDM. Each of these four matrices is used as input data for supervised machine learning. Binary classification is performed by using RF, SVM, FNN, and CNN, and the obtained results are compared. All machine-learning algorithms show an accuracy of >95% for three types of input data—the pCSDM, mCSDM, and mCSDM with the absolute matrix elements. The CNN is the best in terms of low percent error. In particular, the result using the complex pCSDM is encouraging because these data-driven methods inherently do not require environmental information. This work demonstrates the potential of machine learning to discriminate between surface and underwater vessels in the ocean.
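
The cross-spectral density matrices described above can be illustrated with a minimal sketch. It assumes the common definition of a CSDM as the average of the outer products x x^H over frequency-domain array snapshots; the abstract does not state the exact estimator, and the function names and toy snapshots here are hypothetical. The mode-space variant would additionally need the mode shapes from the propagation model, so it is not shown.

```python
import cmath


def cross_spectral_density_matrix(snapshots):
    """Average the outer products x x^H over all snapshots.

    snapshots: list of K snapshots, each a list of N complex sensor values
    at one frequency (hypothetical input layout, not taken from the paper).
    """
    num_snapshots = len(snapshots)
    num_sensors = len(snapshots[0])
    csdm = [[0j] * num_sensors for _ in range(num_sensors)]
    for x in snapshots:
        for i in range(num_sensors):
            for j in range(num_sensors):
                csdm[i][j] += x[i] * x[j].conjugate() / num_snapshots
    return csdm


def elementwise_abs(matrix):
    """Magnitude-only version of a matrix (phase information is discarded)."""
    return [[abs(value) for value in row] for row in matrix]


# Toy data: two snapshots on a 3-sensor array with a fixed inter-sensor phase step.
snapshots = [
    [cmath.exp(1j * 0.3 * n) for n in range(3)],
    [2 * cmath.exp(1j * (0.3 * n + 1.0)) for n in range(3)],
]
csdm = cross_spectral_density_matrix(snapshots)
abs_csdm = elementwise_abs(csdm)
```

The complex matrix keeps the inter-sensor phase, whereas the absolute-value matrix keeps only magnitudes, which is the distinction between the paired inputs compared in the abstract.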


Author(s):  
Fabiola Fernández-Gutiérrez ◽  
Jonathan Kennedy ◽  
Roxanne Cooksey ◽  
Mark Atkinson ◽  
Ernest Choy ◽  
...  

Objectives: (1) To develop a fully data-driven framework for automatically identifying patients with a condition from routine electronic primary care records; (2) to identify informative codes (risk factors) for arthropathy conditions in primary care records that can accurately predict a diagnosis of those conditions in secondary care records. Approach: This study linked routine primary and secondary care records in Wales, UK, held in the SAIL (Secure Anonymised Information Linkage) databank, with the secondary care records used as the gold standard. We used machine learning techniques to extract patient information and identify cohorts with a condition from the large, high-dimensional linked dataset in the following phases: data preparation suited to machine learning; pre-selection of initial features, followed by ranking and selecting features into a meaningful subset using feature selection methods; and development of the identification algorithm, which incorporates mechanisms for handling the imbalanced nature of the data. The framework was then validated on an independent dataset and compared with an existing algorithm that had been developed using expert clinician knowledge of arthropathy conditions. Results: Rheumatoid arthritis (RA) and ankylosing spondylitis (AS) were used to demonstrate the feasibility of the framework. Linking primary care records with the secondary care rheumatology clinical system yielded 9,657 patients, including 1,484 RA patients and 204 AS patients. The framework identified various compact subsets of informative features (risk factors) from 43,100 potential Read codes. Applied to an independent test dataset, it achieved a classification accuracy and positive predictive value (PPV) of 86.19% and 88.46%, respectively, for RA and 99.23% and 97.75%, respectively, for AS. These are comparable with the clinical knowledge-based method, which had an accuracy of 85.85% and a PPV of 85.28% for RA and an accuracy of 97.86% and a PPV of 95.65% for AS. Conclusion: The proposed data-driven framework provides a rapid and cost-effective way of reliably identifying patients with a medical condition from primary care data. It performed as well as the clinically derived algorithm. The framework is not intended to replace clinical expertise; instead, it provides a decision support tool for clinicians, in particular for selecting patients for clinical trials.
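
For readers unfamiliar with the two metrics reported above, the following sketch shows how accuracy and PPV follow from confusion-matrix counts. The counts passed to the function are hypothetical and are not the study's data; only the cohort sizes in the baseline calculation (204 AS patients among 9,657) come from the abstract.

```python
def accuracy_and_ppv(tp, fp, tn, fn):
    """Return (accuracy, positive predictive value) from confusion counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total   # fraction of all patients classified correctly
    ppv = tp / (tp + fp)           # fraction of predicted positives that are true
    return accuracy, ppv


# Hypothetical counts, chosen only to illustrate the arithmetic.
accuracy, ppv = accuracy_and_ppv(tp=90, fp=10, tn=880, fn=20)

# With a rare condition, accuracy alone can mislead: labelling everyone as
# "not AS" in a cohort with 204 AS patients out of 9,657 is already ~97.9% accurate.
all_negative_accuracy = 1 - 204 / 9657
```

This is why a PPV is reported alongside accuracy when the classes are imbalanced.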


Author(s):  
Anil Kumar ◽  
Amina Khatun ◽  
Sanjib Kumar Agarwalla ◽  
Amol Dighe

Atmospheric neutrino experiments can show the “oscillation dip” feature in data, due to their sensitivity over a large L/E range. In experiments that can distinguish between neutrinos and antineutrinos, like INO, oscillation dips can be observed in both these channels separately. We present the dip-identification algorithm employing a data-driven approach (one that uses the asymmetry in the upward-going and downward-going events, binned in the reconstructed L/E of muons) to demonstrate the dip, which would confirm the oscillation hypothesis. We further propose, for the first time, the identification of an “oscillation valley” in the reconstructed ($E_\mu$, $\cos\theta_\mu$) plane, feasible for detectors like ICAL having excellent muon energy and direction resolutions. We illustrate how this two-dimensional valley would offer a clear visual representation and test of the L/E dependence, the alignment of the valley quantifying the atmospheric mass-squared difference. Owing to the charge identification capability of the ICAL detector at INO, we always present our results using $\mu^{-}$ and $\mu^{+}$ events separately. Taking into account the statistical fluctuations and systematic errors, and varying oscillation parameters over their currently allowed ranges, we estimate the precision to which atmospheric neutrino oscillation parameters would be determined with the 10-year simulated data at ICAL using our procedure.
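
The binned up/down comparison mentioned above might look like the following sketch. It assumes the asymmetry is expressed as the ratio of upward-going to downward-going event counts in bins of reconstructed log10(L/E); the abstract does not give the exact definition, and the event layout, bin edges, and function name are hypothetical. How L is assigned to downward-going events is not addressed here.

```python
from bisect import bisect_right


def up_down_ratio(events, bin_edges):
    """Ratio of upward to downward event counts in each bin.

    events: iterable of (log10_l_over_e, is_upward) pairs (assumed layout).
    bin_edges: increasing bin edges; bins are half-open [low, high).
    Returns None for bins that contain no downward-going events.
    """
    num_bins = len(bin_edges) - 1
    up = [0] * num_bins
    down = [0] * num_bins
    for value, is_upward in events:
        if not (bin_edges[0] <= value < bin_edges[-1]):
            continue  # outside the binned range
        index = bisect_right(bin_edges, value) - 1
        if is_upward:
            up[index] += 1
        else:
            down[index] += 1
    return [u / d if d > 0 else None for u, d in zip(up, down)]


# Toy events only; not simulated or measured data.
toy_events = [
    (2.1, True), (2.2, False),
    (2.7, True), (2.8, False), (2.9, False),
    (3.2, True), (3.1, False), (3.2, False), (3.3, False), (3.4, False),
    (4.0, True),  # falls outside the bins and is ignored
]
ratios = up_down_ratio(toy_events, bin_edges=[2.0, 2.5, 3.0, 3.5])
```

A dip would appear as a local minimum of such a ratio when plotted against the bin centres.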


Energies ◽  
2020 ◽  
Vol 13 (15) ◽  
pp. 3791
Author(s):  
Yong Li ◽  
Jue Yang ◽  
Wei Long Liu ◽  
Cheng Lin Liao

The lithium-ion battery is a complicated non-linear system with multiple electrochemical processes, including mass and charge conservation as well as electrochemical kinetics. Computing the electrochemical model depends on an in-depth understanding of the physicochemical characteristics and parameters, which can be costly and time-consuming to obtain. We investigated electrochemical modeling, reduction, and identification methods for the lithium-ion battery from the electrode level to the system level. A reduced 9th-order linear model was proposed using electrode-level physicochemical modeling and a cell-level mathematical reduction method. A data-driven predictor-based subspace identification algorithm was presented for estimating the lithium-ion battery model at the system level. The effectiveness of the proposed modeling and identification methods was validated in an experimental study based on LiFePO4 cells. The accuracy and dynamic characteristics of the identified model were found to be closely related to the operating state of charge (SOC) range. Experimental results showed that the proposed methods perform well, with high precision and good robustness, in the SOC range of 90% to 10%, and that the tracking error increases significantly in the higher (100–90%) and lower (10–0%) SOC ranges. Moreover, statistical analysis revealed that, to balance high precision against low complexity, the 6th-, 3rd-, and 5th-order battery models are the optimal choices in the SOC ranges of 100% to 90%, 90% to 10%, and 10% to 0%, respectively.


Complexity ◽  
2018 ◽  
Vol 2018 ◽  
pp. 1-15 ◽  
Author(s):  
Kailong Liu ◽  
Kang Li ◽  
Qiao Peng ◽  
Yuanjun Guo ◽  
Li Zhang

Temperature is a crucial state for guaranteeing the reliability and safety of a battery during operation. The ability to estimate battery temperature, especially the internal temperature, is of paramount importance to the battery management system for monitoring and thermal control purposes. In this paper, a data-driven approach combining the RBF neural network (NN) and the extended Kalman filter (EKF) is proposed to estimate the internal temperature for lithium-ion battery thermal management. Specifically, suitable input terms and the number of hidden nodes for the RBF NN are first optimized by a two-stage stepwise identification algorithm (TSIA). Then, the teaching-learning-based optimization (TLBO) algorithm is used to optimize the centres and widths of the basis function in every neuron. After the RBF NN model is optimized, a battery lumped thermal model is adopted as the state function with the EKF to filter out the outliers of the RBF model and reduce the estimation error. This data-driven approach is validated under four different conditions in comparison with linear NN models. The experimental results demonstrate that the proposed RBF data-driven approach outperforms the other approaches and can be extended to other types of batteries for thermal monitoring and management.
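
To make the roles of the centres and widths concrete, here is a minimal sketch of an RBF network's forward pass. It assumes Gaussian basis functions and a linear output layer, which is a common choice but is not confirmed by the abstract; the parameter values and the function name are hypothetical, and neither the TSIA/TLBO optimization nor the EKF stage is shown.

```python
import math


def rbf_predict(x, centres, widths, weights, bias=0.0):
    """Output of a Gaussian RBF network for one input vector.

    Each hidden neuron j contributes
    weights[j] * exp(-||x - centres[j]||^2 / (2 * widths[j]^2)).
    """
    output = bias
    for centre, width, weight in zip(centres, widths, weights):
        squared_distance = sum((xi - ci) ** 2 for xi, ci in zip(x, centre))
        output += weight * math.exp(-squared_distance / (2.0 * width ** 2))
    return output


# Hypothetical two-neuron network on a 2-D input.
centres = [[0.0, 0.0], [10.0, 10.0]]
widths = [1.0, 1.0]
weights = [2.0, 3.0]
prediction = rbf_predict([0.0, 0.0], centres, widths, weights, bias=0.5)
```

An input sitting on a centre activates that neuron fully, while distant neurons contribute almost nothing, which is why the placement of centres and the choice of widths matter for accuracy.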


2014 ◽  
Vol 2014 ◽  
pp. 1-11 ◽  
Author(s):  
Xiaosuo Luo ◽  
Yongduan Song

This paper presents a data-driven adaptive predictive control method using closed-loop subspace identification. As the predictor is the key element of the predictive controller, we propose to derive such a predictor from the subspace matrices obtained through a closed-loop subspace identification algorithm driven by input-output data. Taking advantage of a transformational system model, the closed-loop data are effectively processed in this subspace algorithm. By combining the merits of receding-window and recursive identification methods, an adaptive mechanism for updating the subspace matrices online is given. Further, a data inspection strategy is introduced to eliminate the negative impact of harmful (or useless) data on the system performance. The proposed method addresses the problems of inaccurate online excitation data and closed-loop identification in adaptive control. Simulation results show the efficiency of this method.


2021 ◽  
Vol 9 ◽  
Author(s):  
Huang Yu ◽  
Yufeng Wu ◽  
Weiling Guan ◽  
Daolu Zhang ◽  
Tao Yu ◽  
...  

For low-voltage distribution networks (LVDNs), accurate models depicting network and phase connectivity are crucial to the analysis, planning, and operation of these networks. However, phase connectivity data in the LVDN are usually incorrect or missing. Wrong or incomplete phase information can lead to unbalanced operation of three-phase distribution systems and increased power loss. Building on the advanced metering infrastructure (AMI) deployed in the development of smart grids, this study proposes a novel data-driven phase identification algorithm. First, the method extracts features from voltage–time matrices using a non-linear dimension reduction algorithm. Second, the density-based spatial clustering of applications with noise (DBSCAN) algorithm is used to divide customers into clusters of arbitrary shape. Finally, the algorithm was tested with the IEEE European Low Voltage Test Feeder of the IEEE PES AMPS DSAS Test Feeder working group. The results showed an accuracy of over 90% for the method.
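
The clustering step can be sketched with a compact, unoptimized DBSCAN written from the standard algorithm description. This is not the authors' implementation: the 2-D feature points stand in for the output of the dimension reduction step, and the `eps` and `min_samples` values are hypothetical.

```python
import math


def dbscan(points, eps, min_samples):
    """Return one label per point: 0, 1, ... for clusters and -1 for noise.

    A point is a core point if at least min_samples points (itself included)
    lie within distance eps of it.
    """
    noise = -1
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbours(i):
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = noise  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == noise:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = neighbours(j)
            if len(reachable) >= min_samples:
                queue.extend(reachable)  # j is a core point, so keep growing
    return labels


# Hypothetical reduced features for seven customers.
features = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
            (5.0, 5.0), (5.1, 5.0), (5.0, 5.1),
            (10.0, 0.0)]
labels = dbscan(features, eps=0.3, min_samples=3)
```

Because DBSCAN does not need the number of clusters in advance and marks isolated points as noise, customers with unusual voltage profiles are left unassigned instead of being forced into a phase group.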

