A Priori Membership for Data Representation: Case Study of SPECT Heart Data Set

Author(s):  
Hamido Fujita ◽  
Yu-Chien Ko


Geophysics ◽  
2007 ◽  
Vol 72 (1) ◽  
pp. F25-F34 ◽  
Author(s):  
Benoit Tournerie ◽  
Michel Chouteau ◽  
Denis Marcotte

We present and test a new method to correct for the static shift affecting magnetotelluric (MT) apparent resistivity sounding curves. We use geostatistical analysis of apparent resistivity and phase data for selected periods. For each period, we first estimate and model the experimental variograms and the cross variogram between phase and apparent resistivity. We then use the geostatistical model to estimate, by cokriging, the corrected apparent resistivities from the measured phases and apparent resistivities. The static shift factor is obtained as the difference between the logarithms of the corrected and measured apparent resistivities. As final static shift estimates, we retain those for the period whose estimates correlate best with the estimates at all periods. We present a 3D synthetic case study showing that the static shift is retrieved quite precisely when the static shift factors are uniformly distributed around zero. If the static shift distribution has a nonzero mean, we obtain the best results when an apparent resistivity data subset can be identified a priori as unaffected by static shift and cokriging is done using only this subset. The method has been successfully tested on the synthetic COPROD-2S2 2D MT data set and on a 3D-survey data set from Las Cañadas Caldera (Tenerife, Canary Islands) severely affected by static shift.
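The correction step described above reduces to a simple relation in log space: once cokriging has produced a corrected apparent resistivity, the static shift factor is the difference of logarithms. A minimal sketch (log base 10 is an assumption; the paper only states that logarithms of apparent resistivity are used):

```python
import math

def static_shift_factor(rho_corrected, rho_measured):
    """Static shift factor: difference between the logarithms of the
    cokriging-corrected and the measured apparent resistivities."""
    return math.log10(rho_corrected) - math.log10(rho_measured)

def apply_correction(rho_measured, shift):
    """Undo the static shift on a measured apparent resistivity."""
    return rho_measured * 10 ** shift

# A sounding curve shifted down by one decade relative to the
# corrected value gives a shift factor of 1.0:
shift = static_shift_factor(100.0, 10.0)
corrected = apply_correction(10.0, shift)
```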


2004 ◽  
Vol 14 (02) ◽  
pp. 163-176 ◽  
Author(s):  
MATTEO COMIN ◽  
CARLO FERRARI ◽  
CONCETTINA GUERRA

In this paper we present a scenario for the grid immersion of the procedures that solve the protein structural similarity determination problem. The emphasis is on the way various computational components and data resources are tied together into a workflow to be executed on a grid. The grid deployment has been organized according to the bag-of-service model: a set of different modules (with their data sets) is made available to the application designers. Each module deals with a specific subproblem using an appropriate protein data representation. At the design level, the process of task selection produces a first general workflow that establishes which subproblems need to be solved and their temporal relations. A further refinement then selects, for each previously identified task, a procedure that solves it: the choice is made among different available methods and representations. The final outcome is an instance of the workflow ready for execution on a grid. Our approach to protein structure comparison is based on a combination of indexing and dynamic programming techniques to achieve fast and reliable matching. All the components have been implemented on a grid infrastructure using Globus, and the overall tool has been tested by choosing proteins from different fold classes. The obtained results are compared against SCOP, the standard classification of known protein structures.
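The dynamic programming half of the matching strategy can be illustrated with a minimal global-alignment scorer. This is a toy stand-in: the paper's method indexes and matches richer protein representations, and the secondary-structure labels and scoring parameters below are invented for illustration:

```python
def align_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch-style global alignment score between two
    sequences of secondary-structure labels (e.g. 'H', 'E', 'C').
    Scoring parameters are arbitrary illustration values."""
    n, m = len(a), len(b)
    # dp[i][j] = best score aligning a[:i] with b[:j]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + gap
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            dp[i][j] = max(dp[i - 1][j - 1] + s,   # align a[i-1] with b[j-1]
                           dp[i - 1][j] + gap,     # gap in b
                           dp[i][j - 1] + gap)     # gap in a
    return dp[n][m]
```

In the paper's pipeline, an indexing step would first narrow the candidate set so that quadratic-time alignment is only run on promising pairs.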


2013 ◽  
Vol 2 (4) ◽  
pp. 12-24 ◽  
Author(s):  
Alan Redmond ◽  
Roger West ◽  
Alan Hore

This paper reviews the rationale for using a partial data set in Building Information Modeling (BIM) exchanges, motivated by the recognized difficulty of exchanging data at the element or object level: such exchanges depend on compatible hardware and software for the data to be read and transferred freely between applications. The solution was not to introduce a new schema in competition with the industry's existing open exchange model, 'Industry Foundation Classes' (IFC), which has existed since the mid-1990s, but for the authors to re-engineer an existing simplified markup language, 'BIM XML', into subsets via Extensible Stylesheet Language Transformations (XSLT). XML was chosen because Web services, which build on the XML data representation format and the Hypertext Transfer Protocol (HTTP), are platform neutral, widely accepted and utilized, and come with a wide range of useful technologies. Furthermore, they support Service-Oriented Architecture (SOA), the internet architecture that enables interoperability between different software programs. The methodology involved developing a full hybrid research model based on mixed methods, quantitative and qualitative, interlaced into two main phases. The first phase comprised a main survey questionnaire, focus groups, two Delphi questionnaires, semi-structured interviews and a case study. The final phase, product design and testing, used semantic methods and tools such as Business Process Model and Notation (BPMN). The final case study (a prototype test) successfully demonstrated the potential of combining three applications asynchronously in real time. The interoperable capabilities of Web service APIs for exchanging partial sets of BIM data enabled assumptions with a higher level of detail to be reviewed at the feasibility design stage. Future services will be built upon existing Semantic Web standards, such as the Web Ontology Language (OWL) and SPARQL, used in conjunction with several web services connected together on a Cloud platform to produce a knowledge 'Semantic Web'.
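The partial-exchange idea, carving a subset out of a larger building model so that only the data a given service needs is transmitted, can be shown with a small sketch. This is a toy illustration using Python's standard XML library in place of XSLT, and the element and attribute names are invented, not taken from the BIM XML schema:

```python
import xml.etree.ElementTree as ET

# Toy building model; element names are invented for illustration
# and do not follow the actual BIM XML schema.
doc = ET.fromstring(
    "<building>"
    "<element type='wall'><cost>1200</cost></element>"
    "<element type='door'><cost>300</cost></element>"
    "<element type='wall'><cost>1500</cost></element>"
    "</building>"
)

# Partial exchange: keep only the subset a downstream service needs,
# here the wall elements, and serialize that subset for transmission.
subset = ET.Element("building")
for el in doc.findall("element"):
    if el.get("type") == "wall":
        subset.append(el)

partial = ET.tostring(subset, encoding="unicode")
```

In an XSLT-based pipeline the same filtering would be expressed declaratively as a template matching the wanted elements, with the stylesheet applied by the exchanging Web service.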


2021 ◽  
Vol 12 (5) ◽  
Author(s):  
Angelo Augusto Frozza ◽  
Eduardo Dias Defreyn ◽  
Ronaldo Dos Santos Mello

Although NoSQL databases do not require a schema a priori, being aware of the database schema is essential for activities like data integration, data validation, or data interoperability. This paper presents a process for the extraction of columnar NoSQL database schemas. We adopt JSON as a canonical format for data representation, and we validate the proposed process through a prototype tool that is able to extract schemas from the HBase columnar NoSQL database system. HBase was chosen as a case study because it is one of the most popular columnar NoSQL solutions. When compared to related work, we innovate by proposing a simple solution for the inference of column data types for columnar NoSQL databases that store only byte arrays as column values, and a resulting schema that follows the JSON Schema format.


Author(s):  
Michael W. Pratt ◽  
M. Kyle Matsuba

Chapter 7 begins with an overview of Erikson’s ideas about intimacy and its place in the life cycle, followed by a summary of Bowlby and Ainsworth’s attachment theory framework and its relation to family development. The authors review existing longitudinal research on the development of family relationships in adolescence and emerging adulthood, focusing on evidence with regard to links to McAdams and Pals’ personality model. They discuss the evidence, both questionnaire and narrative, from the Futures Study data set on family relationships, including emerging adults’ relations with parents and, separately, with grandparents, as well as their anticipations of their own parenthood. As a way of illustrating the key personality concepts from this family chapter, the authors end with a case study of Jane Fonda in youth and her father, Henry Fonda, to illustrate these issues through the lives of a 20th-century Hollywood dynasty of actors.


Author(s):  
Michael W. Pratt ◽  
M. Kyle Matsuba

Chapter 6 reviews research on the topic of vocational/occupational development in relation to the McAdams and Pals tripartite personality framework of traits, goals, and life stories. Distinctions between types of motivations for the work role (as a job, career, or calling) are particularly highlighted. The authors then turn to research from the Futures Study on work motivations and their links to personality traits, identity, generativity, and the life story, drawing on analyses and quotes from the data set. To illustrate the key concepts from this vocation chapter, the authors end with a case study on Charles Darwin’s pivotal turning point, his round-the-world voyage as naturalist for the HMS Beagle. Darwin was an emerging adult in his 20s at the time, and the authors highlight the role of this journey as a turning point in his adult vocational development.


2003 ◽  
Vol 42 (05) ◽  
pp. 564-571 ◽  
Author(s):  
M. Schumacher ◽  
E. Graf ◽  
T. Gerds

Summary Objectives: There is a lack of generally applicable tools for assessing predictions for survival data. Prediction error curves based on the Brier score, which have been suggested as a sensible approach, are illustrated by means of a case study. Methods: The concept of predictions made in terms of conditional survival probabilities given the patient’s covariates is introduced. Such predictions are derived from various statistical models for survival data, including artificial neural networks. How the prediction error of a prognostic classification scheme can be followed over time is illustrated with data from two studies on the prognosis of node-positive breast cancer patients, one of them serving as an independent test data set. Results and Conclusions: The Brier score as a function of time is shown to be a valuable tool for assessing the predictive performance of prognostic classification schemes for survival data incorporating censored observations. Comparison with the prediction based on the pooled Kaplan–Meier estimator yields a benchmark value for any classification scheme incorporating patients’ covariate measurements. The problem of an overoptimistic assessment of prediction error caused by data-driven modelling, as is done for example with artificial neural networks, can be circumvented by assessment in an independent test data set.
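The Brier score at a fixed time point can be written down in a few lines. The sketch below ignores censoring before t (the approach summarized above handles censored observations via inverse-probability-of-censoring weights, which are omitted here for brevity), and the pooled Kaplan-Meier benchmark corresponds to giving every patient the same predicted survival probability:

```python
def brier_score(t, times, pred_surv):
    """Empirical Brier score at time t. times[i] is subject i's
    observed survival time and pred_surv[i] the predicted probability
    of surviving beyond t. Simplification: assumes no censoring
    before t, so the IPCW weights of the full method are omitted."""
    total = 0.0
    for time_i, p in zip(times, pred_surv):
        observed = 1.0 if time_i > t else 0.0  # still event-free at t
        total += (observed - p) ** 2
    return total / len(times)

# Benchmark: a pooled estimate assigns every subject the same
# survival probability (here simply the empirical fraction, as a
# stand-in for the Kaplan-Meier estimate in this toy no-censoring case).
times = [10.0, 2.0, 8.0]
km = sum(1.0 for x in times if x > 5.0) / len(times)
model_score = brier_score(5.0, times, [0.9, 0.1, 0.8])
benchmark = brier_score(5.0, times, [km] * len(times))
```

A model only adds value if its prediction error curve stays below this benchmark over the time range of interest.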


2021 ◽  
Vol 4 (1) ◽  
pp. 251524592095492
Author(s):  
Marco Del Giudice ◽  
Steven W. Gangestad

Decisions made by researchers while analyzing data (e.g., how to measure variables, how to handle outliers) are sometimes arbitrary, without an objective justification for choosing one alternative over another. Multiverse-style methods (e.g., specification curve, vibration of effects) estimate an effect across an entire set of possible specifications to expose the impact of hidden degrees of freedom and/or obtain robust, less biased estimates of the effect of interest. However, if specifications are not truly arbitrary, multiverse-style analyses can produce misleading results, potentially hiding meaningful effects within a mass of poorly justified alternatives. So far, a key question has received scant attention: How does one decide whether alternatives are arbitrary? We offer a framework and conceptual tools for doing so. We discuss three kinds of a priori nonequivalence among alternatives—measurement nonequivalence, effect nonequivalence, and power/precision nonequivalence. The criteria we review lead to three decision scenarios: Type E decisions (principled equivalence), Type N decisions (principled nonequivalence), and Type U decisions (uncertainty). In uncertain scenarios, multiverse-style analysis should be conducted in a deliberately exploratory fashion. The framework is discussed with reference to published examples and illustrated with the help of a simulated data set. Our framework will help researchers reap the benefits of multiverse-style methods while avoiding their pitfalls.
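A specification-curve analysis in its simplest form just enumerates every combination of analytic choices and records the effect estimate from each. A toy sketch follows; the two specification options and the mean-difference "analysis" are invented for illustration, and in the framework above each option would first have to pass a Type E (principled equivalence) judgment before being treated as arbitrary:

```python
import itertools
import statistics

def specification_curve(data, spec_options):
    """Run the same analysis under every combination of specification
    choices and collect the estimated effect for each. The 'analysis'
    here is a mean difference in y between group 1 and group 0."""
    results = []
    names = list(spec_options)
    for combo in itertools.product(*spec_options.values()):
        spec = dict(zip(names, combo))
        # Specification choice 1: drop outliers or not.
        sample = [r for r in data
                  if not (spec["drop_outliers"] and abs(r["y"]) > 3)]
        # Specification choice 2: transform the outcome or not.
        f = (lambda v: v * v) if spec["square_y"] else (lambda v: v)
        g1 = [f(r["y"]) for r in sample if r["group"] == 1]
        g0 = [f(r["y"]) for r in sample if r["group"] == 0]
        results.append((spec, statistics.mean(g1) - statistics.mean(g0)))
    return results

data = [{"group": 1, "y": 1.0}, {"group": 0, "y": 0.0},
        {"group": 1, "y": 2.0}, {"group": 0, "y": 1.0}]
curve = specification_curve(
    data, {"drop_outliers": [False, True], "square_y": [False, True]})
```

The spread of effects across `curve` is then inspected; the framework's point is that this spread is only meaningful if the enumerated options really are equivalent a priori.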


2015 ◽  
Vol 8 (2) ◽  
pp. 941-963 ◽  
Author(s):  
T. Vlemmix ◽  
F. Hendrick ◽  
G. Pinardi ◽  
I. De Smedt ◽  
C. Fayt ◽  
...  

Abstract. A 4-year data set of MAX-DOAS observations in the Beijing area (2008–2012) is analysed with a focus on NO2, HCHO and aerosols. Two very different retrieval methods are applied. Method A describes the tropospheric profile with 13 layers and makes use of the optimal estimation method. Method B uses 2–4 parameters to describe the tropospheric profile and an inversion based on a least-squares fit. For each constituent (NO2, HCHO and aerosols) the retrieval outcomes are compared in terms of tropospheric column densities, surface concentrations and "characteristic profile heights" (i.e. the height below which 75% of the vertically integrated tropospheric column density resides). We find the best agreement between the two methods for tropospheric NO2 column densities, with a standard deviation of relative differences below 10%, a correlation of 0.99 and a linear regression slope of 1.03. For tropospheric HCHO column densities we find a similar slope, but also a systematic bias of almost 10%, which is likely related to differences in profile height. Aerosol optical depths (AODs) retrieved with method B are 20% higher than those from method A. They agree better with AERONET measurements, which are on average only 5% lower, although with considerable scatter in the relative differences (standard deviation ~ 25%). With respect to near-surface volume mixing ratios and aerosol extinction we find considerably larger relative differences: 10 ± 30%, −23 ± 28% and −8 ± 33% for aerosols, HCHO and NO2, respectively. The frequency distributions of these near-surface concentrations nevertheless agree quite well, indicating that near-surface concentrations derived from MAX-DOAS are certainly useful in a climatological sense. A major difference between the two methods is the dynamic range of retrieved characteristic profile heights, which is larger for method B than for method A. 
This effect is most pronounced for HCHO, where profile shapes retrieved with method A stay very close to the a priori, and moderate for NO2 and aerosol extinction, which on average show quite good agreement for characteristic profile heights below 1.5 km. One of the main advantages of method A is its stability, even under suboptimal conditions (e.g. in the presence of clouds). Method B is generally less stable, and this probably explains a substantial part of the quite large relative differences between the two methods. However, despite the relatively low precision of individual profile retrievals, seasonally averaged profile heights retrieved with method B appear less biased towards a priori assumptions than those retrieved with method A. This gives confidence in a result obtained with method B: aerosol extinction profiles tend on average to extend higher than NO2 profiles in spring and summer, whereas they appear to be of about the same height in winter, a result which is especially relevant for the validation of satellite retrievals.
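The "characteristic profile height" used above can be computed from a retrieved profile in a few lines. A sketch under an assumed discretization (per-layer partial columns, with linear interpolation inside the layer that crosses the threshold):

```python
def characteristic_height(layer_tops, partial_columns, fraction=0.75):
    """Height below which `fraction` (75% in the study) of the
    vertically integrated tropospheric column resides.
    layer_tops[i] is the top altitude of layer i (km, ascending);
    partial_columns[i] is the partial column in that layer."""
    target = fraction * sum(partial_columns)
    cum = 0.0
    prev_top = 0.0
    for top, pc in zip(layer_tops, partial_columns):
        if cum + pc >= target:
            # Interpolate linearly within the layer that crosses 75%.
            return prev_top + (top - prev_top) * (target - cum) / pc
        cum += pc
        prev_top = top
    return layer_tops[-1]

# Uniform profile over four 1 km layers: 75% of the column
# sits below 3 km.
h = characteristic_height([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0, 1.0])
```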

