Enhanced Credit Prediction Using Artificial Data

Author(s):  
Peter Mitic ◽  
James Cooper


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Naomi A. Arnold ◽  
Raul J. Mondragón ◽  
Richard G. Clegg

Abstract. Discriminating between competing explanatory models as to which is more likely responsible for the growth of a network is a problem of fundamental importance for network science. The rules governing this growth are attributed to mechanisms such as preferential attachment and triangle closure, with a wealth of explanatory models based on these. These models are deliberately simple, commonly with the network growing according to a constant mechanism for its lifetime, to allow for analytical results. We use a likelihood-based framework on artificial data where the network model changes at a known point in time and demonstrate that we can recover the change point from analysis of the network. We then use real datasets and demonstrate how our framework can show the changing importance of network growth mechanisms over time.
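
A minimal sketch of the change-point idea on artificial data, assuming a deliberately simplified two-mechanism setup: new nodes attach preferentially by degree before the change point and uniformly at random after it, and the change point is recovered by maximizing the per-edge log-likelihood over candidate splits. This illustrates the approach described above rather than the authors' full framework; all function names and parameter values are assumptions.

```python
# Illustrative sketch only: detect a change from preferential attachment to
# uniform attachment in a synthetic growing network by maximizing the
# per-edge log-likelihood over candidate change points.
import numpy as np

rng = np.random.default_rng(0)

def grow_network(n_nodes, change_point):
    """Grow a network node by node; before `change_point` new nodes attach
    preferentially by degree, afterwards uniformly at random."""
    degrees = [1, 1]                    # seed graph: one edge between nodes 0 and 1
    snapshots, choices = [], []
    for t in range(2, n_nodes):
        deg = np.array(degrees, dtype=float)
        if t < change_point:
            p = deg / deg.sum()                       # preferential attachment
        else:
            p = np.full(len(deg), 1.0 / len(deg))     # uniform attachment
        target = rng.choice(len(deg), p=p)
        snapshots.append(deg)           # degree sequence seen by the new node
        choices.append(target)          # node it attached to
        degrees[target] += 1
        degrees.append(1)
    return snapshots, choices

snapshots, choices = grow_network(n_nodes=2000, change_point=1200)

# Per-step log-probability of the observed attachment under each mechanism.
ll_pa = np.array([np.log(d[c] / d.sum()) for d, c in zip(snapshots, choices)])
ll_uni = np.array([-np.log(len(d)) for d in snapshots])

# Score every candidate split s: preferential attachment for steps < s,
# uniform attachment for steps >= s; the maximum marks the estimated change.
cum_pa = np.concatenate(([0.0], np.cumsum(ll_pa)))
cum_uni = np.concatenate(([0.0], np.cumsum(ll_uni)))
scores = cum_pa + (cum_uni[-1] - cum_uni)
best_split = int(np.argmax(scores))

print("true change point (node index):     1200")
print("estimated change point (node index):", best_split + 2)
```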


2014 ◽  
Vol 70 (3) ◽  
pp. 248-256 ◽  
Author(s):  
Julian Henn ◽  
Kathrin Meindl

The previously introduced theoretical R values [Henn & Schönleber (2013). Acta Cryst. A69, 549–558] are used to develop a relative indicator of systematic errors in model refinements, R_meta, which is applied to published charge-density data. The numerator of R_meta gives an absolute measure of systematic errors in percentage points. The residuals (I_o − I_c)/σ(I_o) of published data are examined. It is found that most published models correspond to residual distributions that are not consistent with the assumption of a Gaussian distribution. This consistency, however, is important, as the model parameter estimates and their standard uncertainties from a least-squares procedure are valid only under this assumption. The effect of correlations introduced by the structure model is briefly discussed with the help of artificial data and discarded as a source of serious correlations in the examined example. Intensity and significance cutoffs applied in the refinement procedure are found to be mechanisms preventing residual distributions from becoming Gaussian. Model refinements against artificial data yield zero or close-to-zero values for R_meta when the data are not truncated, and small negative values when a moderate cutoff I_o > 0 is applied. It is well known from the literature that the application of cutoff values leads to model bias [Hirshfeld & Rabinovich (1973). Acta Cryst. A29, 510–513].
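
As a rough illustration of the residual diagnostics discussed above (not the authors' R_meta definition, which is given in the paper), the following sketch generates artificial intensities, computes the normalized residuals (I_o − I_c)/σ(I_o), and shows how a moderate cutoff I_o > 0 distorts an otherwise Gaussian residual distribution. The intensity range and uncertainty model are assumptions made purely for illustration.

```python
# Illustrative sketch: normalized residuals (I_o - I_c)/sigma(I_o) and the
# distorting effect of an intensity cutoff I_o > 0. This is not R_meta.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Artificial data: many weak reflections, Gaussian noise, and an error-free
# model, so the untruncated residuals are Gaussian by construction.
I_true = rng.uniform(0.2, 10.0, size=20000)
sigma = np.sqrt(I_true + 1.0)            # toy uncertainty model (assumed)
I_obs = I_true + rng.normal(0.0, sigma)
I_calc = I_true                          # "perfect" model, no systematic error

z = (I_obs - I_calc) / sigma             # normalized residuals

print("full data:     mean %+0.3f, std %.3f, skew %+0.3f"
      % (z.mean(), z.std(ddof=1), stats.skew(z)))

# Moderate cutoff I_o > 0, as discussed above: the truncation shifts and skews
# the residual distribution even though the underlying model is exact.
mask = I_obs > 0
print("with I_o > 0:  mean %+0.3f, std %.3f, skew %+0.3f"
      % (z[mask].mean(), z[mask].std(ddof=1), stats.skew(z[mask])))
```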


2014 ◽  
Vol 11 (5) ◽  
pp. 2391-2422
Author(s):  
F. Miesner ◽  
A. Lechleiter ◽  
C. Müller

Abstract. Temperature fields in marine sediments are studied for various purposes. Often, the target of research is the steady-state heat flow as a (possible) source of energy, but there are also studies attempting to reconstruct bottom water temperature variations in order to learn more about climate history. The bottom water temperature propagates into the sediment to different depths, depending on the amplitude and period of the deviation. The steady-state heat flow can only be determined when the bottom water temperature is constant, while the bottom water temperature history can only be reconstructed when the deviation has a sufficiently large amplitude or the measurements reach great depths. In this work, the aim is to reconstruct the recent bottom water temperature history, such as that of the last two years. To this end, measurements to depths of up to 6 m should be adequate, and amplitudes smaller than 1 K should be reconstructable. First, a commonly used forward model is introduced and analyzed: knowing the bottom water temperature deviation over the last years and the thermal properties of the sediments, the forward model gives the sediment temperature field. Next, an inversion operator and two common inversion schemes are introduced. The analysis of the inversion operator and both algorithms is kept short, but sources for further reading are given. The algorithms are then tested on artificial data with different noise levels and on two example data sets, one from the German North Sea and one from the Davis Strait. Both algorithms show good and stable results for artificial data. The results obtained for measured data have low variances and match the observed oceanographic settings. Lastly, the desired and obtained accuracy are discussed. For artificial data, the presented method yields satisfying results. For measured data, however, the interpretation of the results is more difficult, as the exact form of the bottom water deviation is not known. Nevertheless, the presented inversion method seems promising owing to its accuracy and stability for artificial data. By continuing to develop more sophisticated models for the bottom water temperature, we hope to cover a wider range of oceanographic settings in the future.
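
As a minimal worked example of the forward direction discussed above (not the paper's actual model or parameter values), the sketch below propagates a single sinusoidal bottom water temperature deviation into a purely conductive half-space, using the standard damping-depth solution. The diffusivity, period, and amplitude are assumed values chosen only to show how quickly such a deviation decays over the first few metres of sediment.

```python
# Minimal sketch: periodic bottom water temperature deviation propagating into
# conductive sediment. Amplitude decays as exp(-z/d) with damping depth
# d = sqrt(2 * kappa / omega). All parameter values are assumptions.
import numpy as np

kappa = 4.0e-7                    # thermal diffusivity of the sediment, m^2/s (assumed)
P = 365.25 * 24 * 3600.0          # period of the deviation: one year, in seconds
A = 1.0                           # amplitude of the bottom water deviation, K
omega = 2.0 * np.pi / P
d = np.sqrt(2.0 * kappa / omega)  # damping depth, m

def temperature_anomaly(z, t):
    """Temperature deviation at depth z (m) below the seafloor at time t (s)."""
    return A * np.exp(-z / d) * np.sin(omega * t - z / d)

print(f"damping depth d = {d:.2f} m")
for z in (0.0, 1.0, 3.0, 6.0):
    print(f"z = {z:3.0f} m: amplitude {A * np.exp(-z / d):.3f} K, "
          f"value at t = P/4: {temperature_anomaly(z, P / 4):+.3f} K")
```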


2017 ◽  
Vol 33 (1) ◽  
pp. 155-186
Author(s):  
Marcela Cohen Martelotte ◽  
Reinaldo Castro Souza ◽  
Eduardo Antônio Barros da Silva

Abstract Considering that many macroeconomic time series exhibit changing seasonal behaviour, there is a need for filters that are robust to such changes. This article proposes a method for designing seasonal filters that address this problem. The design was carried out in the frequency domain to estimate seasonal fluctuations that are spread around specific bands of frequencies. We assessed the generated filters by applying them to artificial data with known seasonal behaviour, modelled on that of real macroeconomic series, and we compared their performance with that of X-13A-S. The results show that the designed filters perform better for series with pronounced moving seasonality, making them a good alternative in such cases.
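
As a hedged illustration of the frequency-domain idea (not the filters actually designed in the article), the sketch below builds a simple linear-phase FIR filter with narrow pass bands around the seasonal harmonics of a monthly series and applies it to artificial data with moving seasonality. The band width, filter length, and test series are assumptions chosen for demonstration only.

```python
# Illustrative sketch: band-pass FIR filter around seasonal harmonics of a
# monthly series, applied to artificial data with slowly drifting seasonality.
import numpy as np
from scipy import signal

fs = 1.0                                  # one observation per month
harmonics = np.arange(1, 7) / 12.0        # seasonal frequencies, cycles per month
hw = 0.02                                 # half-width of each pass band (assumed)

# Piecewise-linear desired gain: flat-topped pass bands centred on the harmonics.
freq, gain = [0.0], [0.0]
for f0 in harmonics[:-1]:
    freq += [f0 - hw, f0 - hw / 2, f0 + hw / 2, f0 + hw]
    gain += [0.0, 1.0, 1.0, 0.0]
freq += [0.5 - hw, 0.5 - hw / 2, 0.5]     # last harmonic sits at the Nyquist frequency
gain += [0.0, 1.0, 1.0]
taps = signal.firwin2(numtaps=121, freq=freq, gain=gain, fs=fs)

# Artificial monthly series with moving seasonality: the seasonal amplitude
# drifts upward over time, the situation such filters are meant to handle.
n = 480
t = np.arange(n)
rng = np.random.default_rng(2)
seasonal = (1.0 + 0.5 * t / n) * np.sin(2 * np.pi * t / 12.0)
series = 0.02 * t + seasonal + rng.normal(0.0, 0.3, n)

# A symmetric FIR kernel with "same" convolution gives an approximately
# zero-phase estimate of the seasonal component.
seasonal_hat = np.convolve(series, taps, mode="same")
err = np.sqrt(np.mean((seasonal_hat[60:-60] - seasonal[60:-60]) ** 2))
print(f"RMSE of the extracted seasonal component: {err:.3f}")
```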


2018 ◽  
Vol 61 (2) ◽  
pp. 210-222 ◽  
Author(s):  
Joseph M Matthes ◽  
A Dwayne Ball

Establishing discriminant validity has been a keystone of measurement validity in empirical marketing research for many decades. Without statistically showing that constructs possess discriminant validity, contributions to the marketing literature are likely to foster the proliferation of constructs that are operationally the same as constructs already present in the literature, leading to confusion in the development of theory. This article addresses this concern by evaluating well-established methods for testing discriminant validity through the simulation of artificial datasets (containing varying levels of correlation between constructs, sample size, measurement error, and distribution skewness). The artificial data are applied to six commonly used approaches for testing the existence of discriminant validity. Results strongly suggest that several methods are much more likely than others to yield accurate assessments of whether discriminant validity exists, especially under specific conditions. Recommendations for practice in the assessment of discriminant validity are provided.
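
A hedged sketch of the simulation-plus-test idea described above: artificial indicator data are generated for two correlated constructs with measurement error, and one commonly used criterion, the HTMT ratio, is computed. Whether HTMT is among the six approaches evaluated in the article is not stated here, and the loading, inter-construct correlation, and sample size are assumed values.

```python
# Illustrative sketch: simulate indicators for two correlated constructs and
# compute the HTMT (heterotrait-monotrait) ratio as one discriminant validity check.
import numpy as np

rng = np.random.default_rng(3)

def simulate(n, phi, loading, k=4):
    """k indicators per construct; phi is the latent correlation between constructs."""
    cov = np.array([[1.0, phi], [phi, 1.0]])
    latents = rng.multivariate_normal([0.0, 0.0], cov, size=n)
    noise_sd = np.sqrt(1.0 - loading ** 2)       # keep indicator variance at 1
    x1 = loading * latents[:, [0]] + rng.normal(0.0, noise_sd, (n, k))
    x2 = loading * latents[:, [1]] + rng.normal(0.0, noise_sd, (n, k))
    return x1, x2

def htmt(x1, x2):
    """Mean heterotrait correlation over the geometric mean of monotrait correlations."""
    k1, k2 = x1.shape[1], x2.shape[1]
    r = np.corrcoef(np.hstack([x1, x2]), rowvar=False)
    hetero = r[:k1, k1:].mean()
    mono1 = r[:k1, :k1][np.triu_indices(k1, 1)].mean()
    mono2 = r[k1:, k1:][np.triu_indices(k2, 1)].mean()
    return hetero / np.sqrt(mono1 * mono2)

x1, x2 = simulate(n=300, phi=0.7, loading=0.8)
print(f"HTMT = {htmt(x1, x2):.3f} (values below roughly 0.85-0.90 are commonly "
      "read as evidence of discriminant validity)")
```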


Galaxies ◽  
2019 ◽  
Vol 8 (1) ◽  
pp. 3
Author(s):  
Vesna Lukic ◽  
Francesco de Gasperin ◽  
Marcus Brüggen

Finding and classifying astronomical sources is key in the scientific exploitation of radio surveys. Source-finding usually involves identifying the parts of an image belonging to an astronomical source, against some estimated background. This can be problematic in the radio regime, owing to the presence of correlated noise, which can interfere with the signal from the source. In the current work, we present ConvoSource, a novel method based on a deep learning technique, to identify the positions of radio sources, and compare the results to a Gaussian-fitting method. Since the deep learning approach allows the generation of more training images, it should perform well in the source-finding task. We test the source-finding methods on artificial data created for the data challenge of the Square Kilometre Array (SKA). We investigate sources that are divided into three classes: star-forming galaxies (SFGs) and two classes of active galactic nuclei (AGN). The artificial data are given at two different frequencies (560 MHz and 1400 MHz), three total integration times (8 h, 100 h, 1000 h), and three signal-to-noise ratios (SNRs) of 1, 2, and 5. At lower SNRs, ConvoSource tends to outperform a Gaussian-fitting approach in the recovery of SFGs and all sources, although at the lowest SNR of 1, the better performance is likely due to chance matches. The Gaussian-fitting method performs better in the recovery of the AGN-type sources at lower SNRs. At a higher SNR, ConvoSource performs better on average in the recovery of AGN sources, whereas the Gaussian-fitting method performs better in the recovery of SFGs and all sources. ConvoSource usually performs better at shorter total integration times and detects more true positives and misses fewer sources compared to the Gaussian-fitting method; however, it detects more false positives.
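
As an illustrative sketch of the kind of scoring used to compare the two source-finders (not the SKA data challenge pipeline or ConvoSource itself), the code below matches recovered source positions against a ground-truth catalogue within a tolerance radius and reports true positives, false positives, and missed sources. The positions, positional error scale, and matching radius are assumptions.

```python
# Illustrative sketch: match detected source positions to a truth catalogue
# within a tolerance radius and count TP / FP / missed sources.
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(4)

# Fake ground truth (pixel coordinates) and a fake detection list that recovers
# most sources with small positional errors plus a handful of spurious hits.
truth = rng.uniform(0, 512, size=(200, 2))
recovered = truth[:170] + rng.normal(0.0, 0.8, size=(170, 2))
spurious = rng.uniform(0, 512, size=(25, 2))
detections = np.vstack([recovered, spurious])

def score(detections, truth, radius=3.0):
    """Greedy one-to-one matching of detections to truth within `radius` pixels."""
    tree = cKDTree(truth)
    dist, idx = tree.query(detections, distance_upper_bound=radius)
    matched, tp = set(), 0
    for d, i in zip(dist, idx):
        if np.isfinite(d) and i not in matched:   # unmatched truth source nearby
            matched.add(i)
            tp += 1
    fp = len(detections) - tp                     # spurious or duplicate detections
    fn = len(truth) - tp                          # missed sources
    return tp, fp, fn

tp, fp, fn = score(detections, truth)
print(f"TP = {tp}, FP = {fp}, missed = {fn}")
print(f"precision = {tp / (tp + fp):.2f}, recall = {tp / (tp + fn):.2f}")
```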

