The Methodology on Statistical Analysis of Data Transformation for Model Development

2013 ◽  
Vol 2 (6) ◽  
pp. 94-100 ◽  
Author(s):  
Md Diah J. ◽  
Ahmad J. ◽  
Mukri M.


2020 ◽  
Vol 44 (4) ◽  
Author(s):  
George Alter ◽  
Darrell Donakowski ◽  
Jack Gager ◽  
Pascal Heus ◽  
Carson Hunter ◽  
...  

Structured Data Transformation Language (SDTL) provides structured, machine-actionable representations of data transformation commands found in statistical analysis software. The Continuous Capture of Metadata for Statistical Data Project (C2Metadata) created SDTL as part of an automated system that captures provenance metadata from data transformation scripts and adds variable derivations to standard metadata files. SDTL also has potential for auditing scripts and for translating scripts between languages. SDTL is expressed in a set of JSON schemas, which are machine-actionable and easily serialized to other formats. Statistical software languages have a number of special features that have been carried into SDTL. We explain how SDTL handles differences among statistical languages and complex operations, such as merging files and reshaping data tables from “wide” to “long”.
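As a rough illustration of what a machine-actionable transformation record can look like, the Python sketch below builds a JSON-serializable description of a single variable derivation. The command and field names are hypothetical and do not reproduce the actual SDTL schemas.

```python
# Illustrative sketch only: the command and field names below are hypothetical
# and do not reproduce the actual SDTL JSON schemas.
import json

def describe_compute(script_line: str, target: str, expression: str) -> dict:
    """Build a machine-actionable record of a variable derivation,
    in the spirit of SDTL's structured transformation commands."""
    return {
        "command": "Compute",        # hypothetical command name
        "sourceLine": script_line,   # original statistical-software syntax
        "targetVariable": target,    # variable created by the derivation
        "expression": expression,    # parsed right-hand side
    }

# Example: capture provenance for an SPSS-style COMPUTE statement.
record = describe_compute("COMPUTE bmi = weight / (height ** 2).",
                          target="bmi",
                          expression="weight / (height ** 2)")
print(json.dumps(record, indent=2))
```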


2018 ◽  
Vol 46 (1) ◽  
pp. 5-20 ◽  
Author(s):  
Poojari Yugendar ◽  
K.V.R. Ravishankar

Abstract Religious occasions and gatherings at fairs and terminals are events that draw large crowds. Such gatherings pose a serious threat because high densities in confined spaces can lead to adverse outcomes, including crowd stampedes. The movement of an individual in a crowd is influenced by physical factors. In the present study, characteristics such as age, gender, group size, child holding, child carrying, and the presence or absence of luggage were considered for crowd behaviour analysis. The average speed of crowd movement was observed to be 0.86 m/s. The statistical analysis concluded that age, gender, density and luggage have a significant effect on crowd walking speed. A multi-linear regression (MLR) model was developed between crowd speed and the significant factors identified by the statistical analysis. Location 1 data were used for model development, and the developed model was validated using Location 2 data. Gender has the most significant effect on speed, followed by luggage and age. This study can help in the proper, planned dispersal of crowds given the diversified directional flows that exist during crowd gathering events.
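A minimal sketch of the kind of multi-linear regression described above, fit with statsmodels on synthetic data; the variable coding, coefficients, and sample are invented for illustration and are not the study's data or model.

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 200
age = rng.uniform(10, 70, n)            # years
gender = rng.integers(0, 2, n)          # 0 = female, 1 = male (dummy coded)
luggage = rng.integers(0, 2, n)         # 1 = carrying luggage
density = rng.uniform(1, 5, n)          # persons per square metre

# Synthetic walking speed (m/s) centred near the reported mean of 0.86 m/s.
speed = (0.9 + 0.08 * gender - 0.10 * luggage
         - 0.002 * age - 0.05 * density + rng.normal(0, 0.05, n))

X = sm.add_constant(np.column_stack([age, gender, luggage, density]))
model = sm.OLS(speed, X).fit()          # fit on "Location 1"-style data
print(model.summary())                  # coefficients and significance tests
```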


2021 ◽  
Vol 13 (24) ◽  
pp. 13512
Author(s):  
Branislav Bošković ◽  
Mirjana Bugarinović ◽  
Gordana Savić ◽  
Ratko Djuričić

It has been exactly 20 years since the common grounds for the design of track access charges (TAC) were laid for the European railways by the publication of Directive 2001/14/EC. However, these grounds were defined broadly, thus resulting in significant divergence both in the models applied by countries and during model redesign within one country over the course of time. The participants in the process of charge system redesign include all stakeholders in a country’s railway sector (infrastructure manager, train operating companies, the ministries responsible for transport, finance and economy, government, and regulatory bodies). Their opinions and requirements are often opposed, and they all need to be acknowledged simultaneously. This paper aims to solve the issue of ensuring continuity in charge model redesign while achieving a balance between the requirements of all stakeholders. Moreover, it tackles the issue of producing a sustainable long-term TAC model by using survey methods and statistical analysis. The proposed approach was tested in practice during the access charge model redesign for the railways of Montenegro. The results show the importance of continual enhancement in TAC model development as one of the challenges and key precursors for the harmonization of all stakeholders’ requirements.


2020 ◽  
Vol 73 (6) ◽  
pp. 503-508
Author(s):  
Dong Kyu Lee

Several assumptions, such as normality, linear relationship, and homoscedasticity, are frequently required in parametric statistical analysis methods. Data collected from clinical situations or experiments often violate these assumptions. Variable transformation provides an opportunity to make such data suitable for parametric statistical analysis without statistical errors. The purpose of variable transformation is to enable parametric statistical analysis, and its ultimate goal is the correct interpretation of results obtained from transformed variables. Variable transformation usually changes the original characteristics of the variables and the nature of their units. Back-transformation is therefore crucial for the interpretation of the estimated results. This article introduces general concepts about variable transformation, focusing mainly on logarithmic transformation. Back-transformation and other important considerations are also described herein.
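The log-transform-then-back-transform workflow can be illustrated in a few lines of Python on made-up right-skewed data; note that back-transforming the mean of the logs yields the geometric mean, not the arithmetic mean, of the original values.

```python
# Logarithmic transformation and back-transformation on invented,
# right-skewed data (e.g. lengths of hospital stay in days).
import numpy as np

rng = np.random.default_rng(1)
stay = rng.lognormal(mean=1.2, sigma=0.6, size=500)  # skewed, positive data

log_stay = np.log(stay)          # transform toward normality
mean_log = log_stay.mean()       # analysis performed on the log scale

# Back-transformation: exp(mean of logs) is the geometric mean of the
# original data, not the arithmetic mean. The unit is days again, but the
# interpretation changes.
geometric_mean = np.exp(mean_log)
arithmetic_mean = stay.mean()
print(f"geometric mean: {geometric_mean:.2f} days, "
      f"arithmetic mean: {arithmetic_mean:.2f} days")
```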


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Emad M. Grais ◽  
Xiaoya Wang ◽  
Jie Wang ◽  
Fei Zhao ◽  
Wen Jiang ◽  
...  

Abstract Wideband Absorbance Immittance (WAI) has been available for more than a decade; however, its clinical use still faces the challenges of limited understanding and poor interpretation of WAI results. This study aimed to develop Machine Learning (ML) tools to identify the WAI absorbance characteristics across different frequency-pressure regions in the normal middle ear and in ears with otitis media with effusion (OME), to enable automatic diagnosis of middle ear conditions. Data analysis included pre-processing of the WAI data, statistical analysis and classification model development, and key-region extraction from the 2D frequency-pressure WAI images. The experimental results show that ML tools appear to hold great potential for the automated diagnosis of middle ear diseases from WAI data. The identified key regions in the WAI provide guidance to practitioners to better understand and interpret WAI data and offer the prospect of quick and accurate diagnostic decisions.
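A minimal sketch of the classification step, assuming WAI absorbance is available as 2D frequency-by-pressure arrays; the data shapes, labels, and choice of classifier here are illustrative only and not the study's pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(2)
n_ears, n_freq, n_press = 120, 16, 11          # hypothetical grid size
wai = rng.random((n_ears, n_freq, n_press))    # absorbance values in [0, 1]
label = rng.integers(0, 2, n_ears)             # 0 = normal, 1 = OME

X = wai.reshape(n_ears, -1)                    # flatten frequency-pressure image
X_train, X_test, y_train, y_test = train_test_split(
    X, label, test_size=0.25, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("held-out accuracy:", accuracy_score(y_test, clf.predict(X_test)))

# The magnitude of the fitted coefficients, reshaped back to the
# frequency-pressure grid, gives one crude view of candidate "key regions".
key_regions = np.abs(clf.coef_).reshape(n_freq, n_press)
```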


2013 ◽  
Vol 10 (1) ◽  
Author(s):  
Lenka Hudrlíková ◽  
Jana Kramulová

The general aim of a multitude of research projects is to assess a social, economic or environmental process or phenomenon through various indicators that are often measured in different units. In such situations, data transformation and/or normalisation is inevitable. The present paper focuses on the benefits and drawbacks of different normalisation methods. Further, it compares the results produced by several methods from the perspective of consistency and measurement quality. The case of sustainability indicators for the Czech NUTS 3 regions is introduced. The authors employ 40 indicators divided into three sustainability pillars, attempting to conclude which method is the most suitable for further statistical analysis when dimensionless numbers are preferred.
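Two of the most common normalisation methods, min-max rescaling and z-score standardisation, can be compared on indicators measured in different units, as in the sketch below; the indicator names and values are invented for illustration.

```python
import numpy as np

gdp_per_capita = np.array([18.2, 25.4, 21.1, 30.0])   # thousand EUR
unemployment = np.array([7.5, 3.2, 5.8, 4.1])         # percent

def min_max(x):
    """Rescale to the dimensionless range [0, 1]."""
    return (x - x.min()) / (x.max() - x.min())

def z_score(x):
    """Standardise to zero mean and unit variance (dimensionless)."""
    return (x - x.mean()) / x.std(ddof=1)

for name, x in [("GDP per capita", gdp_per_capita),
                ("unemployment", unemployment)]:
    print(name, "min-max:", np.round(min_max(x), 2),
          "z-score:", np.round(z_score(x), 2))
```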


1966 ◽  
Vol 24 ◽  
pp. 188-189
Author(s):  
T. J. Deeming

If we make a set of measurements, such as narrow-band or multicolour photo-electric measurements, which are designed to improve a scheme of classification, and in particular if they are designed to extend the number of dimensions of classification, i.e. the number of classification parameters, then some important problems of analytical procedure arise. First, it is important not to reproduce the errors of the classification scheme which we are trying to improve. Second, when trying to extend the number of dimensions of classification we have little or nothing with which to test the validity of the new parameters. Problems similar to these have occurred in other areas of scientific research (notably psychology and education) and the branch of Statistics called Multivariate Analysis has been developed to deal with them. The techniques of this subject are largely unknown to astronomers, but, if carefully applied, they should at the very least ensure that the astronomer gets the maximum amount of information out of his data and does not waste his time looking for information which is not there. More optimistically, these techniques are potentially capable of indicating the number of classification parameters necessary and giving specific formulas for computing them, as well as pinpointing those particular measurements which are most crucial for determining the classification parameters.
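The abstract does not name a specific multivariate technique; principal component analysis (PCA) is one standard method of the kind described, indicating how many classification parameters the measurements actually support and giving explicit formulas (linear combinations of the measurements) for computing them. The sketch below uses made-up multicolour measurements.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(3)
n_stars, n_bands = 300, 6
# Two latent parameters generate six correlated measurements, plus noise.
colours = (rng.normal(size=(n_stars, 2)) @ rng.normal(size=(2, n_bands))
           + 0.05 * rng.normal(size=(n_stars, n_bands)))

pca = PCA().fit(colours)
# The explained-variance ratios suggest how many classification parameters
# are needed; the component vectors are the formulas for computing them.
print(np.round(pca.explained_variance_ratio_, 3))
```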


Author(s):  
Gianluigi Botton ◽  
Gilles L'espérance

As interest in parallel EELS spectrum imaging grows in laboratories equipped with commercial spectrometers, different approaches have been used in recent years by a few research groups in the development of the spectrum imaging technique, as reported in the literature. Spectrum images can now be obtained either by controlling both the microscope and the spectrometer with a personal computer, or by using more powerful workstations interfaced to conventional multichannel analysers with commercially available programs to control the microscope and the spectrometer. Work on the limits of the technique in terms of quantitative performance was reported by the present author, in which a systematic study was carried out of artifacts, detection limits, and statistical errors as a function of the desired spatial resolution and the range of chemical elements to be studied in a map. The aim of the present paper is to show an application of quantitative parallel EELS spectrum imaging in which statistical analysis is performed at each pixel, interpretation is carried out using criteria established from that statistical analysis, and variations in composition are analysed with the help of information retrieved from t/λ maps so that artifacts are avoided.
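A loose sketch of a per-pixel screening criterion of the kind described, in which a pixel's elemental signal is accepted only when it exceeds a multiple of its estimated statistical error; the arrays, threshold, and numbers are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)
signal = rng.normal(50.0, 20.0, size=(64, 64))   # net edge counts per pixel
error = np.full((64, 64), 12.0)                  # estimated counting error

detected = signal > 3.0 * error                  # 3-sigma style detection limit
masked = np.where(detected, signal, np.nan)      # exclude pixels below the limit
print("fraction of pixels above the detection limit:",
      detected.mean().round(2))
```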

