Ad hoc procedure for optimising agreement between observational records

Observational studies in the field of sport are complicated by the added difficulty of having to analyse multiple, complex events or behaviours that may last just a fraction of a second. In this study, we analyse three aspects related to the reliability of data collected in such a study. The first aim was to analyse and compare the reliability of data sets assessed quantitatively (calculation of kappa statistic) and qualitatively (consensus agreement method). The second aim was to describe how, by ensuring the alignment of events, we calculated the kappa statistic for the order parameter using SDIS-GSEQ software (version 5.1) for data sets containing different numbers of sequences. The third objective was to describe a new consultative procedure designed to remove the confusion generated by discordant data sets and improve the reliability of the data. The procedure is called “consultative” because it involves the participation of a new observer who is responsible for consulting the existing observations and deciding on the definitive result.
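As a rough illustration of the quantitative side of this comparison, Cohen's kappa for two observers can be sketched as below. This is a minimal sketch, not the SDIS-GSEQ computation: it assumes the two records are already aligned event by event, and the function name and the sample codes are hypothetical.

```python
from collections import Counter

def cohen_kappa(obs_a, obs_b):
    """Cohen's kappa for two aligned sequences of categorical codes."""
    if len(obs_a) != len(obs_b) or not obs_a:
        raise ValueError("records must be non-empty and aligned event by event")
    n = len(obs_a)
    # Observed agreement: share of events that both observers coded identically.
    p_o = sum(a == b for a, b in zip(obs_a, obs_b)) / n
    # Chance agreement from each observer's marginal code frequencies.
    freq_a, freq_b = Counter(obs_a), Counter(obs_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both observers used one identical code throughout
    return (p_o - p_e) / (1 - p_e)

# Hypothetical records from two observers (made-up codes).
observer_1 = ["pass", "shot", "pass", "foul", "pass", "shot"]
observer_2 = ["pass", "shot", "foul", "foul", "pass", "pass"]
kappa = cohen_kappa(observer_1, observer_2)
```

If the two records contain different numbers of events, they would first have to be aligned, which is the step the abstract describes; the sketch simply rejects unaligned input.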

Experiments of Image Classification Using Dissimilarity Spaces Built with Siamese Networks

Sensors ◽

10.3390/s21051573 ◽

2021 ◽

Vol 21 (5) ◽

pp. 1573

Author(s):

Loris Nanni ◽

Giovanni Minchio ◽

Sheryl Brahnam ◽

Gianluca Maguolo ◽

Alessandra Lumini

Keyword(s):

Vector Space ◽

Image Classification ◽

Ad Hoc ◽

Feature Space ◽

Medical Data ◽

Training Data ◽

Data Sets ◽

Large Set ◽

Clustering Methods ◽

Siamese Networks

Traditionally, classifiers are trained to predict patterns within a feature space. The image classification system presented here trains classifiers to predict patterns within a vector space by combining the dissimilarity spaces generated by a large set of Siamese Neural Networks (SNNs). A set of centroids from the patterns in the training data sets is calculated with supervised k-means clustering. The centroids are used to generate the dissimilarity space via the Siamese networks. The vector space descriptors are extracted by projecting patterns onto the similarity spaces, and SVMs classify an image by its dissimilarity vector. The versatility of the proposed approach in image classification is demonstrated by evaluating the system on different types of images across two domains: two medical data sets and two animal audio data sets with vocalizations represented as images (spectrograms). Results show that the proposed system is competitive with the best-performing methods in the literature, obtaining state-of-the-art performance on one of the medical data sets, and does so without ad hoc optimization of the clustering methods on the tested data sets.
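The idea of describing a pattern by its dissimilarities to a set of centroids can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' system: Euclidean distance stands in for the dissimilarity learned by the Siamese networks, one mean vector per class stands in for supervised k-means, and all function names and data are hypothetical.

```python
import math

def euclidean(x, y):
    # Stand-in dissimilarity (assumption); the paper uses trained Siamese networks here.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def class_centroids(patterns, labels):
    """One mean vector per class, returned in sorted label order."""
    sums, counts = {}, {}
    for x, y in zip(patterns, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return [[v / counts[y] for v in sums[y]] for y in sorted(sums)]

def dissimilarity_vector(pattern, centroids, dissim=euclidean):
    """Descriptor of a pattern: its dissimilarity to every centroid."""
    return [dissim(pattern, c) for c in centroids]

# Made-up training patterns with two classes.
train = [[0.0, 0.0], [0.0, 2.0], [4.0, 4.0], [6.0, 4.0]]
labels = ["a", "a", "b", "b"]
centroids = class_centroids(train, labels)
descriptor = dissimilarity_vector([0.0, 1.0], centroids)
```

A classifier such as an SVM would then be trained on these descriptors instead of on the raw patterns.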

Prediction of Progestin Affinity for the Human Progesterone Receptor Based on Corrected RBA Data

Biomedical Chemistry Research and Methods ◽

10.18097/bmcrm00080 ◽

2018 ◽

Vol 1 (4) ◽

pp. e00080

Author(s):

A.V. Mikurova ◽

V.S. Skvortsov

Keyword(s):

Progesterone Receptor ◽

Prediction Equation ◽

Binding Activity ◽

Data Sets ◽

Data Set ◽

The Third ◽

Relative Binding ◽

Human Progesterone Receptor ◽

Nuclear Progesterone Receptor

Complexes of three sets of steroid and nonsteroidal progestins with the ligand-binding domain of the nuclear progesterone receptor were modeled. Molecular docking, long-term molecular dynamics simulation, and subsequent analysis by MM-PBSA (MM-GBSA) were used to model the complexes. Using the characteristics obtained by the MM-PBSA method for two data sets of steroid compounds obtained by different research groups, a prediction equation for the value of relative binding activity (RBA) was constructed. The RBA value was adjusted so that in all samples the actual activity was compared with the progesterone activity. The third data set, of nonsteroidal compounds, was used as a test set. The resulting equation showed that the prediction could be applied to both steroid molecules and nonsteroidal progestins.

Eco-Cultural Tourism for Biodiversity Conservation and Sustainable Development of Remote Ecosystems in the Third World

International Tourism and Hospitality in the Digital Age - Advances in Hospitality, Tourism, and the Services Industry ◽

10.4018/978-1-4666-8268-9.ch003 ◽

2015 ◽

pp. 34-55 ◽

Cited By ~ 2

Author(s):

G. Poyyamoli

Keyword(s):

Sustainable Development ◽

Biodiversity Conservation ◽

Third World ◽

Ad Hoc ◽

Management Strategies ◽

Active Management ◽

Paradigm Shifts ◽

Resource Base ◽

The Third ◽

The Third World

Most remote areas, such as mountains and islands, are characterized by remoteness, fragility, endemism, and upland/lowland or island/mainland linkages, as well as rich biodiversity and indigenous knowledge, and thus attract a large number of quality-conscious tourists. However, conventional “top-down”, reactive, and ad hoc approaches and ill-conceived “development” activities, such as infrastructure for mass tourism, will destroy the very natural and cultural resource base on which tourism thrives in these areas. These trends have led to paradigm shifts towards community-based, participatory, and proactive management strategies. This chapter discusses appropriate strategies for integrating biodiversity conservation and sustainable livelihoods by regenerating nature and culture, in order to facilitate sustainable development of remote ecosystems in the third world.

Localization of Data Sets in Distributed Database Systems Using Slope-Based Vertical Fragmentation

Handling Priority Inversion in Time-Constrained Distributed Databases - Advances in Data Mining and Database Management ◽

10.4018/978-1-7998-2491-6.ch003 ◽

2020 ◽

pp. 36-60

Author(s):

Ashish Ranjan Mishra ◽

Neelendra Badal

Keyword(s):

Database Systems ◽

Distributed Database ◽

Communication Cost ◽

Data Sets ◽

Distributed Database Systems ◽

The Third ◽

Vertical Partitioning ◽

Vertical Fragmentation ◽

Partitioning Algorithm ◽

Better Than

This chapter explains an algorithm that can perform vertical partitioning of database tables dynamically on distributed database systems. After vertical partitioning, a new algorithm is developed to allocate the fragments to the proper sites. To accomplish this, three major tasks are performed in this chapter. The first task is to develop a partitioning algorithm that partitions the relation in such a way that it performs better than most of the existing algorithms. The second task is to allocate each fragment to the site where it incurs a low communication cost with respect to the other sites. The third task is to monitor the change in frequency of queries, both across different sites and within the same site. If this change in query frequency exceeds the threshold, re-partitioning and re-allocation are performed.

Graphs and Maps

Hurricane Climatology ◽

10.1093/oso/9780199827633.003.0008 ◽

2013 ◽

Author(s):

James B. Elsner ◽

Thomas H. Jagger

Keyword(s):

Web Site ◽

Data Sets ◽

Good Strategy ◽

Sample Mean ◽

The Third ◽

Standard Base ◽

Open Session ◽

Box Plot

Graphs and maps help you reason with data. They also help you communicate results. A good graph gives you the most information in the shortest time, with the least ink in the smallest space (Tufte, 1997). In this chapter, we show you how to make graphs and maps using R. A good strategy is to follow along with an open session, typing (or copying) the code as you read. Before you begin, make sure you have the following data sets available in your workspace. Do this by typing . . .
> SOI = read.table("SOI.txt", header=TRUE)
> NAO = read.table("NAO.txt", header=TRUE)
> SST = read.table("SST.txt", header=TRUE)
> A = read.table("ATL.txt", header=TRUE)
> US = read.table("H.txt", header=TRUE)
. . . Not all the code is shown but all is available on our Web site. It is easy to make a graph. Here we provide guidance to help you make informative graphs. It is a tutorial on how to create publishable figures from your data. In R you have several choices. With the standard (base) graphics environment, you can produce a variety of plots with fine details. Most of the figures in this book use the standard graphics environment. The grid graphics environment is even more flexible. It allows you to design complex layouts with nested graphs where scaling is maintained upon resizing. The lattice and ggplot2 packages use grid graphics to create more specialized graphing functions and methods. The spplot function, for example, is a plot method built with grid graphics that you will use to create maps. The ggplot2 package is an implementation of the grammar of graphics combining advantages from the standard and lattice graphic environments. It is worth the effort to learn. We begin with the standard graphics environment. A box plot is a graph of the five-number summary. The summary function applied to data produces the sample mean along with five other statistics: the minimum, the first quartile value, the median, the third quartile value, and the maximum. The box plot graphs these numbers. This is done using the boxplot function.
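The five numbers that a box plot draws can be computed by hand, as in the sketch below. It is written in Python rather than the chapter's R, purely as an illustration; the function name and the sample values are hypothetical, and the quartile convention (linear interpolation between order statistics) is an assumption, since plotting tools differ in how they define quartiles.

```python
import statistics

def five_number_summary(values):
    """Minimum, first quartile, median, third quartile, maximum."""
    data = sorted(values)
    # "inclusive" interpolates linearly between order statistics (assumed convention).
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return data[0], q1, median, q3, data[-1]

sample = [1, 2, 3, 4, 5, 6, 7, 8]  # made-up values
summary = five_number_summary(sample)
```

For the made-up sample this gives a minimum of 1, quartiles of 2.75, 4.5 and 6.25, and a maximum of 8.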

Major Revisions in Arthropod Phylogeny Through Improved Supermatrix, With Support for Two Possible Waves of Land Invasion by Chelicerates

Evolutionary Bioinformatics ◽

10.1177/1176934320903735 ◽

2020 ◽

Vol 16 ◽

pp. 117693432090373 ◽

Cited By ~ 4

Author(s):

Katherine E Noah ◽

Jiasheng Hao ◽

Luyan Li ◽

Xiaoyan Sun ◽

Brian Foley ◽

...

Keyword(s):

Codon Position ◽

Amino Acid Level ◽

Phylogenetic Reconstruction ◽

Original Data ◽

Data Sets ◽

Multiple Sequence ◽

Protein Coding ◽

Individual Gene ◽

The Third ◽

New Hypothesis

Deep phylogeny involving arthropod lineages is difficult to recover because the erosion of phylogenetic signals over time leads to unreliable multiple sequence alignment (MSA) and subsequent phylogenetic reconstruction. One way to alleviate the problem is to assemble a large number of gene sequences to compensate for the weakness in each individual gene. Such an approach has led to many robustly supported but contradictory phylogenies. A close examination shows that the supermatrix approach often suffers from two shortcomings. The first is that MSA is rarely checked for reliability and, as will be illustrated, can be poor. The second is that, to alleviate the problem of homoplasy at the third codon position of protein-coding genes due to convergent evolution of nucleotide frequencies, phylogeneticists may remove or degenerate the third codon position but may do it improperly and introduce new biases. We performed an extensive reanalysis of one such “big data” set to highlight these two problems, and demonstrated the power and benefits of correcting or alleviating them. Our results support a new group with Xiphosura and Arachnopulmonata (Tetrapulmonata + Scorpiones) as sister taxa. This favors a new hypothesis in which the ancestor of Xiphosura and the extinct Eurypterida (sea scorpions, of which many later forms lived in brackish or fresh water) returned to the sea after the initial chelicerate invasion of land. Our phylogeny is supported even with the original data when processed with a new “principled” codon degeneration. We also show that removing the 1673 codon sites with both AGN and UCN codons (encoding serine) in our alignment can partially reconcile discrepancies between nucleotide-based and AA-based trees, partly because two sequences, one with AGN and the other with UCN, would be identical at the amino acid level but quite different at the nucleotide level.

Sistema de observación para analizar la interacción en el juego de Boccia por equipos

Cuadernos de Psicología del Deporte ◽

10.6018/cpd.393821 ◽

2019 ◽

Vol 20 (1) ◽

pp. 37-47

Author(s):

Daniel Lapresa Ajamil ◽

Javier Pascual Laguna ◽

Javier Arana ◽

M. Teresa Anguera

Keyword(s):

Ad Hoc ◽

High Reliability ◽

Observation System ◽

Data Sets ◽

Antisocial Behaviors ◽

Observation Instrument ◽

The Social ◽

El Sistema ◽

Disability Group ◽

Generalizability Study

An ad hoc observation instrument, combining a field format with category systems, was designed to analyze the social interaction (prosocial and antisocial behaviors) that takes place in team boccia competition. The data were recorded and coded with the Lince software. The content validity of the observation instrument was endorsed by the coaching staff of the Spanish national boccia team. Agreement between the records generated by three different observers, calculated with Cohen's kappa coefficient, indicates high reliability of the data obtained with the observation system. Within the framework of generalizability theory, and using the SAGT software, the measurement plan [Player] [Category] / [End] was developed; it confirmed that the number of ends analyzed yields a high precision of generalization. The measurement plan [End] [Category] / [Player] was also optimized, showing how many players would constitute an optimal sample in future studies. The practical usefulness of the observation system was demonstrated by the T-patterns detected with the Theme software, version 6 Edu. The results indicate that boccia is a favorable environment of high educational value for people with disabilities.

Improved retrieval of nitrogen dioxide (NO2) column densities by means of MKIV Brewer spectrophotometers

Atmospheric Measurement Techniques ◽

10.5194/amt-7-4009-2014 ◽

2014 ◽

Vol 7 (11) ◽

pp. 4009-4022 ◽

Cited By ~ 8

Author(s):

H. Diémoz ◽

A. M. Siani ◽

A. Redondas ◽

V. Savastiouk ◽

C. T. McElroy ◽

...

Keyword(s):

Nitrogen Dioxide ◽

Ad Hoc ◽

Absorption Spectrometry ◽

Atmospheric Composition ◽

Data Sets ◽

Data Set ◽

Noise Interference ◽

Atmospheric Species ◽

Mass Factor

Abstract. A new algorithm to retrieve nitrogen dioxide (NO2) column densities using MKIV ("Mark IV") Brewer spectrophotometers is described. The method includes several improvements, such as a more recent spectroscopic data set, the reduction of measurement noise, interference by other atmospheric species and instrumental settings, and a better determination of the zenith sky air mass factor. The technique was tested during an ad hoc calibration campaign at the high-altitude site of Izaña (Tenerife, Spain) and the results of the direct sun and zenith sky geometries were compared to those obtained by two reference instruments from the Network for the Detection of Atmospheric Composition Change (NDACC): a Fourier Transform Infrared Radiometer (FTIR) and an advanced visible spectrograph (RASAS-II) based on the differential optical absorption spectrometry (DOAS) technique. To determine the extraterrestrial constant, an easily implementable extension of the standard Langley technique for very clean sites without tropospheric NO2 was developed which takes into account the daytime linear drift of stratospheric nitrogen dioxide due to photochemistry. The measurement uncertainty was thoroughly determined by using a Monte Carlo technique. Poisson noise and wavelength misalignments were found to be the most influential contributors to the overall uncertainty, and possible solutions are proposed for future improvements. The new algorithm is backward-compatible, thus allowing for the reprocessing of historical data sets.

Development of Multiregime Speed–Density Relationships by Cluster Analysis

Transportation Research Record Journal of the Transportation Research Board ◽

10.1177/0361198105193400107 ◽

2005 ◽

Vol 1934 (1) ◽

pp. 64-71 ◽

Cited By ~ 7

Author(s):

Lu Sun ◽

Jie Zhou

Keyword(s):

Cluster Analysis ◽

Traffic Flow ◽

Ad Hoc ◽

Data Sets ◽

Traffic Data ◽

Microscopic Traffic Simulation ◽

Car Following ◽

Data Segmentation ◽

Essential Components ◽

Real Traffic

Empirical speed–density relationships are important not only because of the central role that they play in macroscopic traffic flow theory but also because of their connection to car-following models, which are essential components of microscopic traffic simulation. Multiregime traffic speed–density relationships are more plausible than single-regime models for representing traffic flow over the entire range of density. However, a major difficulty associated with multiregime models is that the breakpoints of regimes are determined in an ad hoc and subjective manner. This paper proposes the use of cluster analysis as a natural tool for the segmentation of speed–density data. After data segmentation, regression analysis can be used to fit each data subset individually. Numerical examples with three real traffic data sets are presented to illustrate such an approach. Using cluster analysis, modelers have the flexibility to specify the number of regimes. It is shown that the K-means algorithm (where K represents the number of clusters) with original (nonstandardized) data works well for this purpose and can be conveniently used in practice.
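The two-step approach described here (cluster the raw speed–density points, then fit each cluster separately) can be sketched as below. This is an illustrative sketch under assumptions, not the paper's implementation: the deterministic seeding, the straight-line model for each regime, the function names and the data are all hypothetical.

```python
def kmeans(points, k, iterations=50):
    """Plain Lloyd's algorithm on raw (nonstandardized) 2-D points."""
    ordered = sorted(points)
    # Deterministic seeding (an assumption): spread seeds across the density range.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    assignment = []
    for _ in range(iterations):
        assignment = [
            min(range(k), key=lambda j: (p[0] - centers[j][0]) ** 2
                                        + (p[1] - centers[j][1]) ** 2)
            for p in points
        ]
        new_centers = []
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:
                new_centers.append((sum(m[0] for m in members) / len(members),
                                    sum(m[1] for m in members) / len(members)))
            else:
                new_centers.append(centers[j])  # keep the center of an empty cluster
        if new_centers == centers:
            break
        centers = new_centers
    return assignment

def fit_line(members):
    """Least-squares fit speed = intercept + slope * density.

    Needs at least two distinct density values in the cluster.
    """
    n = len(members)
    mean_x = sum(d for d, _ in members) / n
    mean_y = sum(s for _, s in members) / n
    sxx = sum((d - mean_x) ** 2 for d, _ in members)
    sxy = sum((d - mean_x) * (s - mean_y) for d, s in members)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Made-up (density, speed) observations: a free-flow and a congested regime.
data = [(5, 100), (10, 98), (15, 96), (20, 94),
        (60, 40), (70, 30), (80, 20), (90, 10)]
regimes = kmeans(data, k=2)
models = {j: fit_line([p for p, a in zip(data, regimes) if a == j])
          for j in set(regimes)}
```

The value of `k` plays the role of the number of regimes, which the abstract notes is left to the modeler.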

Novel Approaches to Smoothing and Comparing SELDI TOF Spectra

Cancer Informatics ◽

10.1177/117693510500100109 ◽

2005 ◽

Vol 1 ◽

pp. 117693510500100 ◽

Cited By ~ 4

Author(s):

Sreelatha Meleth ◽

Isam-Eldin Eltoum ◽

Liu Zhu ◽

Denise Oelschlager ◽

Chandrika Piyathilake ◽

...

Keyword(s):

Spectral Analysis ◽

Fourier Transforms ◽

Area Under The Curve ◽

Maximum Intensity ◽

Data Sets ◽

Intensity Level ◽

Prominent Feature ◽

Data Set ◽

The Third ◽

Novel Approaches

Background: Most published literature using SELDI-TOF has used traditional techniques in spectral analysis, such as Fourier transforms and wavelets, for denoising. Most of these publications also compare spectra using their most prominent feature, i.e., peaks or local maxima. Methods: The maximum intensity value within each window of differentiable m/z values was used to represent the intensity level in that window. We also calculated the ‘Area under the Curve’ (AUC) spanned by each window. Results: Keeping everything else constant, such as pre-processing of the data and the classifier used, the AUC performed much better as a metric of comparison than the peaks in two out of three data sets. In the third data set both metrics performed equivalently. Conclusions: This study shows that the feature used to compare spectra can have an impact on the results of a study attempting to identify biomarkers using SELDI-TOF data.
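The two window-level features compared in this abstract, maximum intensity and area under the curve, can be sketched as below. This is a minimal sketch under assumptions: windows are taken as fixed-size runs of consecutive points (the abstract does not specify how windows are delimited), the AUC uses the trapezoidal rule, and the function name and spectrum values are hypothetical.

```python
def window_features(mz, intensity, width):
    """Per-window (maximum intensity, trapezoidal AUC) for a spectrum."""
    features = []
    for start in range(0, len(mz), width):
        xs = mz[start:start + width]
        ys = intensity[start:start + width]
        # Trapezoidal rule over the points that fall inside this window.
        auc = sum((xs[i + 1] - xs[i]) * (ys[i] + ys[i + 1]) / 2
                  for i in range(len(xs) - 1))
        features.append((max(ys), auc))
    return features

# Made-up spectrum: m/z values and their intensities.
mz = [1000.0, 1001.0, 1002.0, 1003.0, 1004.0, 1005.0]
intensity = [2.0, 6.0, 4.0, 1.0, 1.0, 3.0]
features = window_features(mz, intensity, width=3)
```

The maximum reflects only the tallest point in a window, whereas the AUC also reflects how broad the signal is, which is one plausible reason the two metrics can rank spectra differently.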
