Detecting Distributional Shift Responsible for Predictive Model’s Failure*

Author(s):  
Dipanwita Sinha Mukherjee ◽  
Divyanshy Bhandari ◽  
Naveen Yeri

<div>Predictive software is usually deployed under the assumption that the test data distribution will not differ from the training data distribution. Real-time scenarios often violate this assumption, which leads to inconsistent and non-transferable observations in various cases. This makes dataset shift a growing concern. In this paper, we explore the recent concept of label shift detection and classifier correction with the help of Black Box Shift Detection (BBSD), Black Box Shift Estimation (BBSE) and Black Box Shift Correction (BBSC). The Digits dataset from “sklearn” and the “LogisticRegression” classifier were used for this investigation. Knock-out shift was clearly detected by applying the Kolmogorov–Smirnov test for BBSD. After applying BBSE and BBSC, the overall accuracy of the classifier improved from 91% to 97%.</div>
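The pipeline described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the split sizes, the choice of class 0 as the knocked-out class, the 10% keep fraction, the Bonferroni correction, and the helper names `knock_out`, `bbsd_pvalue` and `bbse_weights` are all assumptions made for the example. Whether the test fires at a given significance level depends on sample size and shift severity, and the accuracy figures it prints are not expected to match the 91% and 97% reported above.

```python
import numpy as np
from scipy.stats import ks_2samp
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = load_digits(return_X_y=True)
X = X / 16.0  # pixel intensities in this dataset range from 0 to 16
n_classes = 10

# Source data is split into a fitting part and a held-out validation part;
# the remaining samples play the role of deployment ("target") data.
X_src, X_tgt, y_src, y_tgt = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=0)
X_fit, X_val, y_fit, y_val = train_test_split(
    X_src, y_src, test_size=0.5, stratify=y_src, random_state=0)

def knock_out(X, y, cls, keep_frac):
    """Simulate knock-out shift by dropping most samples of one class."""
    idx = np.flatnonzero(y == cls)
    n_drop = int(round(len(idx) * (1.0 - keep_frac)))
    drop = rng.choice(idx, size=n_drop, replace=False)
    keep = np.setdiff1d(np.arange(len(y)), drop)
    return X[keep], y[keep]

X_shift, y_shift = knock_out(X_tgt, y_tgt, cls=0, keep_frac=0.1)
clf = LogisticRegression(max_iter=2000).fit(X_fit, y_fit)

def bbsd_pvalue(clf, X_a, X_b):
    """BBSD: KS test per softmax dimension, Bonferroni-adjusted minimum."""
    P_a, P_b = clf.predict_proba(X_a), clf.predict_proba(X_b)
    pvals = [ks_2samp(P_a[:, k], P_b[:, k]).pvalue for k in range(P_a.shape[1])]
    return min(1.0, min(pvals) * len(pvals))

def bbse_weights(clf, X_val, y_val, X_new):
    """BBSE: solve C w = mu for label weights w[k] = q(y=k) / p(y=k)."""
    # C[i, j] estimates P_source(prediction = i, true label = j).
    C = confusion_matrix(
        y_val, clf.predict(X_val), labels=np.arange(n_classes)).T / len(y_val)
    # mu[i] is the fraction of unlabeled target samples predicted as class i.
    mu = np.bincount(clf.predict(X_new), minlength=n_classes) / len(X_new)
    return np.clip(np.linalg.solve(C, mu), 0.0, None)

p_shift = bbsd_pvalue(clf, X_val, X_shift)
weights = bbse_weights(clf, X_val, y_val, X_shift)

# BBSC: refit the classifier with importance weights on the source labels.
clf_corrected = LogisticRegression(max_iter=2000).fit(
    X_fit, y_fit, sample_weight=weights[y_fit])

print(f"Bonferroni-adjusted BBSD p-value: {p_shift:.4f}")
print("Estimated label weights:", np.round(weights, 2))
print("Accuracy before / after correction:",
      clf.score(X_shift, y_shift), clf_corrected.score(X_shift, y_shift))
```

Only the classifier's outputs on unlabeled target data enter the detection and estimation steps, which is what makes the approach "black box"; the target labels are used here solely to score the result.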

2021


2015 ◽  
Vol 32 (2) ◽  
pp. 132-143
Author(s):  
Mohammad Saleh Owlia ◽  
Mohammad Saber Fallah Nezhad ◽  
Mohesn Sheikh Sajadieh

Purpose – The purpose of this paper is to propose a new method based on goodness-of-fit tests for shift detection problems. Design/methodology/approach – In this method, although the distribution of the data gathered from the process is the subject of control, any out-of-control signal can also be generalized to the overall state of the process, including the parameters of the distribution. Findings – The results of a simulation study indicate that, among goodness-of-fit tests, the χ2 test performs better than the Kolmogorov-Smirnov test in detecting process shifts. A comparison of the proposed method with traditional methods also indicates that the proposed method generally has smaller probabilities of type I and type II errors. Originality/value – To the best of the authors’ knowledge, no attention has previously been paid to the application of goodness-of-fit tests in process control.
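To make the idea of using goodness-of-fit tests as control signals concrete, here is a small simulation sketch. It is not the method proposed in the paper: the standard normal in-control model, the eight equiprobable bins, the sample size, the mean shift of one standard deviation, and the function names are all assumptions chosen for illustration, and the sketch makes no attempt to reproduce the paper's comparison between the two tests.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
in_control = stats.norm(loc=0.0, scale=1.0)  # assumed in-control model

def ks_signal(sample, alpha=0.05):
    """Signal when the KS test rejects the in-control distribution."""
    return stats.kstest(sample, in_control.cdf).pvalue < alpha

def chi2_signal(sample, n_bins=8, alpha=0.05):
    """Signal when a chi-square test on equiprobable bins rejects."""
    # Interior bin edges chosen so each bin has probability 1/n_bins in control.
    inner_edges = in_control.ppf(np.arange(1, n_bins) / n_bins)
    observed = np.bincount(np.searchsorted(inner_edges, sample), minlength=n_bins)
    expected = np.full(n_bins, len(sample) / n_bins)
    return stats.chisquare(observed, expected).pvalue < alpha

def signal_rate(test, mean_shift, n=100, reps=200):
    """Fraction of simulated samples on which the test signals."""
    hits = [test(rng.normal(mean_shift, 1.0, size=n)) for _ in range(reps)]
    return float(np.mean(hits))

rates = {
    (name, shift): signal_rate(test, shift)
    for name, test in [("ks", ks_signal), ("chi2", chi2_signal)]
    for shift in (0.0, 1.0)
}
for key, rate in rates.items():
    print(key, rate)
```

The rate at a shift of 0.0 estimates the type I error and one minus the rate at a shift of 1.0 estimates the type II error, which are the two quantities the abstract compares across methods.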


2013 ◽  
Vol 64 (3) ◽  
pp. 371-377 ◽  
Author(s):  
Mahmoud Mohammadyan ◽  
Bijan Shabankhani

This study was carried out to determine the distribution of particles in classrooms of primary schools located in the centre of the city of Sari, Iran, and to identify the relationship between indoor classroom particle levels and outdoor PM2.5 concentrations. Outdoor PM2.5 and indoor PM1, PM2.5, and PM10 were monitored using a real-time Micro Dust Pro monitor and a GRIMM monitor, respectively. Both monitors were calibrated by the gravimetric method using filters. The Kolmogorov-Smirnov test showed that all indoor and outdoor data fitted a normal distribution. Mean indoor PM1, PM2.5, PM10 and outdoor PM2.5 concentrations for all of the classrooms were 17.6 μg m⁻³, 46.6 μg m⁻³, 400.9 μg m⁻³, and 36.9 μg m⁻³, respectively. The highest levels of indoor and outdoor PM2.5 concentrations were measured at the Shahed Boys School (69.1 μg m⁻³ and 115.8 μg m⁻³, respectively). The Kazemi school had the lowest levels of indoor and outdoor PM2.5 (29.1 μg m⁻³ and 15.5 μg m⁻³, respectively). In schools located near both main and small roads, the association between indoor fine particle (PM2.5 and PM1) and outdoor PM2.5 levels was stronger than that between indoor PM10 and outdoor PM2.5 levels. Mean indoor PM2.5 and PM10 and outdoor PM2.5 were higher than the standards for PM2.5 and PM10, and there was a good correlation between indoor and outdoor fine particle concentrations.


2019 ◽  
Vol 21 (2) ◽  
pp. 53-62
Author(s):  
Ayşe Atay DDS, PhD ◽  
Zülal Palazli DDS ◽  
Işıl Gürdal DDS ◽  
Aslıhan Üşümez DDS, PhD

The purpose of this study was to evaluate the effect of thermocycling on the color change of amine-free dual-cure resin cements. IPS e.max CAD blocks were cut into specimens of 1 mm thickness (N=28) and cemented with one of 4 different amine-free dual-cure resin cements (NX3 Nexus [NX], Kerr Dental; Variolink Esthetic DC [VE], Ivoclar Vivadent; Panavia V5 [PV], Kuraray Dental; G-CEM Linkforce [GC], GC Corporation) (n=7). A spectrophotometer was used for color measurements. Specimens were subjected to thermocycling (5°C and 55°C; 5000 and 10000 cycles). Normality of the data distribution was tested using the Kolmogorov-Smirnov test. Statistical analysis was performed using a two-way analysis of variance (ANOVA) and Tukey’s multiple comparison tests at a significance level of p<0.05. ∆E values were significantly influenced by the resin cements and the cycle periods (p<0.05). There were no significant differences between the NX and VE groups after 5000 thermocycles; however, after 10000 thermocycles the VE group showed higher ∆E1 values than the NX group (p>0.05). There were no statistically significant differences between the ∆E0 and ∆E1 values of the GC group; however, the other groups were affected after 10000 thermocycles (p>0.05). The amine-free resin cements used for cementation showed color change after thermocycling, except for the GC group. All resin cements showed a clinically acceptable color change after thermocycling (∆E < 3.5).


2021 ◽  
Vol 11 (15) ◽  
pp. 7148
Author(s):  
Bedada Endale ◽  
Abera Tullu ◽  
Hayoung Shi ◽  
Beom-Soo Kang

Unmanned aerial vehicles (UAVs) are widely used for various missions in both civilian and military sectors. Many of these missions require UAVs to acquire artificial intelligence about the environments in which they navigate. This perception can be realized by training a computing machine to classify objects in the environment. One well-known training approach is supervised deep learning, which enables a machine to classify objects. However, supervised deep learning comes at a high cost in time and computational resources. Collecting large amounts of input data, pre-training processes such as labeling training data, and the need for a high-performance computer for training are some of the challenges that supervised deep learning poses. To address these challenges, this study proposes mission-specific input data augmentation techniques and the design of a lightweight deep neural network architecture capable of real-time object classification. Semi-direct visual odometry (SVO) data of augmented images are used to train the network for object classification. Ten classes with 10,000 different images each were used as input data, of which 80% were used for training the network and the remaining 20% for network validation. For the optimization of the designed deep neural network, a sequential gradient descent algorithm was implemented. This algorithm has the advantage of handling redundancy in the data more efficiently than other algorithms.


2020 ◽  
Vol 0 (0) ◽  
Author(s):  
Mohammadreza Azimi ◽  
Seyed Ahmad Rasoulinejad ◽  
Andrzej Pacut

In this paper, we attempt to answer two questions: whether the iris recognition task is more difficult under the influence of diabetes, and whether the effects of diabetes and individuals’ age are uncorrelated. We hypothesized that the health condition of volunteers plays an important role in the performance of the iris recognition system. To confirm the obtained results, we report the distribution of usable area in each subgroup for a more comprehensive analysis of the effects of diabetes. No study has yet investigated for which age group (young or old) the effect of diabetes on biometric results is more acute. For this purpose, we created a new database containing 1,906 samples from 509 eyes. We applied the weighted adaptive Hough ellipsopolar transform technique and the contrast-adjusted Hough transform for segmentation of iris texture, along with three different encoding algorithms. To test the hypothesis related to the physiological aging effect, Welch’s t-test and the Kolmogorov–Smirnov test were used to study the age dependency of the influence of diabetes mellitus on the reliability of our chosen iris recognition system. Our results give some general hints about the effect of age on the performance of biometric systems for people with diabetes.
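The two tests named in this abstract can be applied to a pair of subgroups as in the sketch below. The data are synthetic and purely hypothetical: the arrays `younger` and `older` stand in for per-sample recognition scores of two age groups and do not come from the paper's database, and `compare_groups` is an illustrative helper name.

```python
import numpy as np
from scipy import stats

def compare_groups(a, b):
    """Return p-values of Welch's t-test and the two-sample KS test."""
    # equal_var=False selects Welch's variant, which does not assume
    # equal variances or equal sample sizes in the two groups.
    welch = stats.ttest_ind(a, b, equal_var=False)
    # The KS test compares the full empirical distributions, not just means.
    ks = stats.ks_2samp(a, b)
    return {"welch_p": float(welch.pvalue), "ks_p": float(ks.pvalue)}

rng = np.random.default_rng(2)
# Hypothetical scores for two age groups (synthetic, for illustration only).
younger = rng.normal(0.30, 0.05, size=120)
older = rng.normal(0.34, 0.09, size=80)

result = compare_groups(younger, older)
same = compare_groups(younger, younger)  # sanity check: identical samples
print(result)
print(same)
```

Using both tests is a reasonable pairing because Welch's t-test is sensitive to a difference in means, while the Kolmogorov–Smirnov test can also respond to differences in spread or shape.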


2021 ◽  
pp. 1-11
Author(s):  
Tingting Zhao ◽  
Xiaoli Yi ◽  
Zhiyong Zeng ◽  
Tao Feng

YTNR (Yunnan Tongbiguan Nature Reserve) is located in the westernmost part of China’s tropical regions and is the only area in China with the tropical biota of the Irrawaddy River system. The reserve has abundant tropical flora and fauna resources. To achieve real-time detection of wild animals in this area, this paper proposes an improved YOLO (You Only Look Once) network. The original YOLO model can achieve higher detection accuracy, but because of its complex model structure, it cannot achieve a fast detection speed on a CPU detection platform. Therefore, the lightweight network MobileNet is introduced to replace the backbone feature extraction network in YOLO, which enables real-time detection on the CPU platform. In response to the difficulty of collecting wild animal image data, the research team deployed 50 high-definition cameras in the study area and conducted continuous observations for more than 1,000 hours. In the end, this research uses 1410 images of wildlife collected in the field and 1577 wildlife images from the internet to construct a research data set, combined with manual annotation by domain experts. At the same time, transfer learning is introduced to address the problems of insufficient training data and the resulting difficulty in fitting the network. The experimental results show that our model, trained on a training set containing 2419 animal images, has a mean average precision of 93.6% and an FPS (Frames Per Second) of 3.8 on the CPU. Compared with YOLO, the mean average precision is increased by 7.7%, and the FPS value is increased by 3.

