Uneven Data Quality and the Earliest Occupation of Europe – The Case of Untermassfeld (Germany)

2017 ◽  
Author(s):  
Wil Roebroeks ◽  
Sabine Gaudzinski-Windheuser ◽  
Michael Baales ◽  
Ralf-Dietrich Kahlke

Abstract: The database regarding the earliest occupation of Europe has increased significantly in quantity and quality of data points over the last two decades, mainly through the addition of new sites as a result of long-term systematic excavations and large-scale prospections of Early and early Middle Pleistocene exposures. The site distribution pattern suggests an ephemeral presence of hominins in the south of Europe from around one million years ago, with occasional short northward expansions along the western coastal areas when temperate conditions permitted. From around 600,000-700,000 years ago, Acheulean artefacts appear in Europe and somewhat later hominin presence seems to pick up, with more sites and now some also present in colder climatic settings. It is again only later, around 350,000 years ago, that the first sites show up in the more continental, central parts of Europe, east of the Rhine. A series of recent papers on the Early Pleistocene palaeontological site of Untermassfeld (Germany) makes claims that are of great interest for studies of earliest Europe and at odds with the described pattern: the papers suggest that Untermassfeld has yielded stone tools and humanly modified faunal remains, evidence for a one-million-year-old hominin presence in European continental mid-latitudes, and additional evidence that hominins were well established in Europe already around that time period. Here we evaluate these claims and demonstrate that these studies are severely flawed in terms of the data on the provenance of the materials studied and in the interpretation of faunal remains and lithics as testifying to a hominin presence at the site. In fact, any reference to the Untermassfeld site as an archaeological one is unwarranted. Furthermore, it is not the only European Early Pleistocene site where inferred evidence for hominin presence is problematic. The strength of the spatiotemporal patterns of hominin presence and absence depends on the quality of the data points we work with, and database maintenance, including critical evaluation of new sites, is crucial to advance our knowledge of the expansions and contractions of hominin ranges during the Pleistocene.

2018 ◽  
Vol 15 (15) ◽  
pp. 4777-4779
Author(s):  
Katy J. Sparrow ◽  
John D. Kessler

Abstract. In this comment, we outline two major concerns regarding some of the key data presented in this paper. Both concerns relate to the natural-abundance radiocarbon-methane (14C-CH4) data. First, no systematic methodology is presented, nor any previous peer-reviewed publication referenced, for how these samples were collected, prepared, and ultimately analyzed for 14C-CH4. Not only are these procedural details missing, but their critical evaluation using gaseous and aqueous blanks and standards was omitted, although these details are essential for any reader to evaluate the quality of the data and the subsequent interpretations. Second, owing to the lack of methodological details, the source of the sporadic anthropogenic contamination cannot be determined, and it is thus premature for the authors to suggest that it was present in the natural environment prior to sample collection. As the natural 14C-CH4 data are necessary for the authors' stated scientific objective of understanding the origin of methane in the East Siberian Arctic Shelf, our comment serves to highlight that the study's objectives have not been met.
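The blanks at issue here are not a formality: without a characterized procedural blank, a measured 14C value cannot be corrected back to the sample. A minimal sketch of the standard isotope mass-balance blank correction follows; all variable names and numbers are hypothetical illustrations, not values from the commented paper.

```python
# Illustrative only: standard mass-balance blank correction of the kind the
# commenters argue must be documented. All names and numbers are hypothetical.

def blank_correct_fm(fm_measured: float, mass_measured_ugC: float,
                     fm_blank: float, mass_blank_ugC: float) -> float:
    """Recover the sample's fraction modern (Fm) by subtracting the
    procedural blank via isotope mass balance:
        Fm_meas * m_meas = Fm_sample * m_sample + Fm_blank * m_blank
    """
    mass_sample = mass_measured_ugC - mass_blank_ugC
    if mass_sample <= 0:
        raise ValueError("blank mass must be smaller than measured mass")
    return (fm_measured * mass_measured_ugC
            - fm_blank * mass_blank_ugC) / mass_sample

# Hypothetical case: a small CH4-derived carbon sample with a modern blank.
print(blank_correct_fm(fm_measured=0.25, mass_measured_ugC=50.0,
                       fm_blank=1.0, mass_blank_ugC=2.0))  # ~0.219
```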


2018 ◽  
Vol 39 (2) ◽  
pp. 231-240
Author(s):  
Danping Ren ◽  
Kun Guo ◽  
Qiuyan Yao ◽  
Jijun Zhao

Abstract: The impairment-aware routing and wavelength assignment algorithm with probe flow (P-IA-RWA) can accurately estimate the transmission quality of a link when a connection request arrives, but it also introduces a problem: the probe flow competes with data traffic for wavelength resources. To reduce this competition and the blocking probability of the network, a new P-IA-RWA algorithm with a segmentation monitoring-control mechanism (SMC-P-IA-RWA) is proposed. The algorithm reduces the time that network resources are held by the probe flow: it segments the candidate path appropriately for the data transmission, and the transmission quality of the probe flow sent by the source node is monitored at the endpoint of each segment. The transmission quality of the data can be monitored in the same way, allowing unnecessary probe flows to be avoided. Simulation results show that the proposed SMC-P-IA-RWA algorithm effectively reduces the blocking probability, better resolves the competition for resources between the probe flow and the main data to be transferred, and is more suitable for scheduling control in large-scale networks.
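The fail-fast structure of the segmentation mechanism can be sketched compactly. The segmentation rule, quality metric, and threshold below are illustrative assumptions, not the paper's specification.

```python
# A minimal sketch of the segmentation idea: the candidate path is split into
# segments, the probe is checked at the end of each segment, and the attempt
# is abandoned early (releasing wavelength resources) as soon as one segment
# fails the quality threshold. Names, the segmentation rule, and the OSNR
# quality model are illustrative assumptions.

from typing import List

def segment_path(path: List[str], seg_len: int) -> List[List[str]]:
    """Split a node path into contiguous segments of at most seg_len hops."""
    return [path[i:i + seg_len + 1] for i in range(0, len(path) - 1, seg_len)]

def probe_segment_ok(segment: List[str], osnr_db: float,
                     min_osnr_db: float) -> bool:
    """Stand-in for the per-segment monitor: pass if estimated OSNR suffices."""
    return osnr_db >= min_osnr_db

def try_connection(path, seg_len, osnr_estimates, min_osnr_db=18.0):
    for segment, osnr in zip(segment_path(path, seg_len), osnr_estimates):
        if not probe_segment_ok(segment, osnr, min_osnr_db):
            # Fail fast: give up after this segment instead of holding
            # resources for the full end-to-end probe.
            return False
    return True

print(try_connection(["A", "B", "C", "D", "E"], seg_len=2,
                     osnr_estimates=[21.5, 17.2]))  # second segment fails -> False
```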


2021 ◽  
Author(s):  
Sven Hilbert ◽  
Stefan Coors ◽  
Elisabeth Barbara Kraus ◽  
Bernd Bischl ◽  
Mario Frei ◽  
...  

Classical statistical methods are limited in the analysis of high-dimensional datasets. Machine learning (ML) provides a powerful framework for prediction by using complex relationships, often encountered in modern data with a large number of variables, cases and potentially non-linear effects. ML has turned into one of the most influential analytical approaches of this millennium and has recently become popular in the behavioral and social sciences. The impact of ML methods on research and practical applications in the educational sciences is still limited, but it grows continuously as larger and more complex datasets become available through massive open online courses (MOOCs) and large-scale investigations. The educational sciences are at a crucial pivot point because of the anticipated impact ML methods hold for the field. Here, we review the opportunities and challenges of ML for the educational sciences, show how a look at related disciplines can help the field learn from their experiences, and argue for a philosophical shift in model evaluation. We demonstrate how the overall quality of data analysis in educational research can benefit from these methods and show how ML can play a decisive role in the validation of empirical models. In this review, we (1) provide an overview of the types of data suitable for ML, (2) give practical advice for the application of ML methods, and (3) show how ML-based tools and applications can be used to enhance the quality of education. Additionally, we provide practical R code with exemplary analyses, available at https://osf.io/ntre9/?view_only=d29ae7cf59d34e8293f4c6bbde3e4ab2.
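The linked OSF repository contains the authors' R code; as a minimal Python analogue of the model-evaluation philosophy argued for above, the sketch below scores a non-linear learner on held-out data rather than by in-sample fit. Data, feature counts, and the choice of learner are synthetic assumptions.

```python
# Illustrative sketch (not the authors' code): out-of-sample evaluation of a
# prediction model via cross-validation, on synthetic data standing in for
# hypothetical MOOC log features.

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_students, n_features = 500, 40
X = rng.normal(size=(n_students, n_features))
# Outcome depends non-linearly on a few features, plus noise.
y = X[:, 0] * X[:, 1] + np.sin(X[:, 2]) + rng.normal(scale=0.5, size=n_students)

model = RandomForestRegressor(n_estimators=200, random_state=0)
# 5-fold cross-validated R^2: performance on unseen cases, not the
# in-sample fit that classical reporting often emphasizes.
scores = cross_val_score(model, X, y, cv=5, scoring="r2")
print(f"CV R^2: {scores.mean():.2f} +/- {scores.std():.2f}")
```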


F1000Research ◽  
2015 ◽  
Vol 4 ◽  
pp. 1075 ◽  
Author(s):  
Camilla L.C. Ip ◽  
Matthew Loose ◽  
John R. Tyson ◽  
Mariateresa de Cesare ◽  
Bonnie L. Brown ◽  
...  

The advent of a miniaturized DNA sequencing device with a high-throughput contextual sequencing capability embodies the next generation of large-scale sequencing tools. The MinION™ Access Programme (MAP) was initiated by Oxford Nanopore Technologies™ in April 2014, giving public access to their USB-attached miniature sequencing device. The MinION Analysis and Reference Consortium (MARC) was formed by a subset of MAP participants, with the aim of evaluating and providing standard protocols and reference data to the community. Envisaged as a multi-phased project, this study provides the global community with the Phase 1 data from MARC, in which the reproducibility of the performance of the MinION was evaluated at multiple sites. Five laboratories on two continents generated data using a control strain of Escherichia coli K-12, preparing and sequencing samples according to a revised ONT protocol. Here, we provide the details of the protocol used, along with a preliminary analysis of the characteristics of typical runs, including the consistency, rate, volume and quality of data produced. Further analysis of the Phase 1 data presented here, along with additional MARC Phase 2 experiments on E. coli, is already underway to identify ways to improve and enhance MinION performance.
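The run characteristics mentioned (volume, length, quality of data) reduce to simple per-run summaries. Below is a minimal sketch of such a summary computed from a FASTQ file; the file name is hypothetical, the quality average is a simplification, and real nanopore analyses would typically rely on dedicated tools.

```python
# Sketch of a per-run yield/quality summary from plain FASTQ (4-line records).

def fastq_run_stats(path: str) -> dict:
    n_reads, total_bases, qual_sum = 0, 0, 0.0
    with open(path) as fh:
        while True:
            header = fh.readline()
            if not header:
                break
            seq = fh.readline().strip()
            fh.readline()                       # '+' separator line
            qual = fh.readline().strip()
            n_reads += 1
            total_bases += len(seq)
            # Phred+33 encoding: per-base quality = ord(char) - 33.
            # Arithmetic averaging of Phred scores is a simplification;
            # a proper mean Q averages error probabilities instead.
            qual_sum += sum(ord(c) - 33 for c in qual)
    return {"reads": n_reads,
            "bases": total_bases,
            "mean_length": total_bases / n_reads,
            "mean_qscore": qual_sum / total_bases}

print(fastq_run_stats("minion_run.fastq"))      # hypothetical file name
```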


2011 ◽ 
Vol 6 ◽  
pp. 39-43 ◽  
Author(s):  
Sabina Sultana ◽  
Selina Parween ◽  
M Altaf Hossain

Seven different species, viz. Chanda baculis, Chanda ranga, Amblypharyngodon mola, Oxygaster bacaila, Clupisoma atherinoides, Corica soborna and Mystus vittatus, and a group of mixed SIS fishes, viz. Mastacembelus pancalus, Xenentodon cancila, Chanda baculis and Glossogobius giuris, were used for the preparation of fish powder (dust) that can be preserved for an extended period. The fishes were sun-dried or oven-dried, both of which are themselves methods of preservation. The quality of the oven-dried fish was better than that of the sun-dried fish, but the sun-drying process is easier and can be applied on a large scale. The fish powder remained in good condition for 7-9 months at normal room temperature, but at -18°C the powder stayed in good condition throughout the year. The highest powder yield from 1 kg of fish was obtained for the mixed species (24.61%) and the lowest for O. bacaila (20.52%). Biochemical analysis showed that the maximum calcium content was 1.34% in M. vittatus and the minimum 0.80% in the mixed SIS fishes. The maximum phosphorus content was 2.90% in C. ranga and the minimum 1.72% in C. soborna. The maximum iron content was 45.20 mg/100g in the mixed SIS fishes and the minimum 16.85 mg/100g in O. bacaila. The maximum moisture content was found in C. ranga (13.50%) and the minimum in the mixed SIS fishes (11.65%). The maximum protein content was recorded in the mixed SIS fishes (72.45%) and the minimum in C. ranga (52.65%). The experiment was replicated three times and conducted from July 2005 to July 2008. DOI: http://dx.doi.org/10.3329/jles.v6i0.9719 JLES 2011 6: 39-43
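The reported yields are simple fresh-weight percentages. A short worked example, with a hypothetical powder mass chosen to match the reported mixed-SIS figure:

```python
# Worked example of the yield arithmetic; the powder mass is hypothetical,
# chosen to reproduce the 24.61 % mixed-SIS yield reported in the abstract.

fresh_weight_g = 1000.0           # 1 kg of fresh fish
powder_weight_g = 246.1           # recovered powder (hypothetical mass)
yield_pct = 100.0 * powder_weight_g / fresh_weight_g
print(f"Powder yield: {yield_pct:.2f} %")   # 24.61 %
```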


2017 ◽  
Vol 56 (1) ◽  
pp. 5-26 ◽  
Author(s):  
Mathieu Vrac ◽  
Pradeebane Vaittinada Ayar

Abstract: Statistical downscaling models (SDMs) and bias correction (BC) methods are commonly used to provide regional or debiased climate projections. However, most SDMs are utilized in a “perfect prognosis” context, meaning that they are calibrated on reanalysis predictors before being applied to GCM simulations. If the latter are biased, SDMs might suffer from discrepancies with observations and therefore provide unrealistic projections. It is then necessary to study the influence of applying bias correction to the large-scale predictors of SDMs, since it can have impacts on the local-scale simulations: such an investigation for daily temperature and precipitation is the goal of this study. Hence, four temperature and three precipitation SDMs are calibrated over a historical period. First, the SDMs are forced by historical predictors from two GCMs, corrected or not corrected. The two types of simulations are compared with reanalysis-driven SDM outputs to characterize the quality of the simulations. Second, changes in the basic statistical properties of the raw GCM projections and of the SDM simulations—driven by bias-corrected or raw predictors from GCM future projections—are compared. Third, the stationarity of the SDM changes brought by the BC of the predictors is investigated. Changes are computed over a historical (1976–2005) and a future (2071–2100) time period and compared to assess nonstationarity. Overall, BC can have impacts on the SDM simulations, although its influence varies from one SDM to another and from one GCM to another, with different spatial structures, and depends on the statistical properties considered. Nevertheless, corrected predictors generally improve the historical projections and can impact future evolutions, with potentially strong nonstationary behaviors.
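Bias correction of large-scale predictors can take many forms. As one common example, the sketch below applies empirical quantile mapping to a synthetic temperature predictor before it would be fed to an SDM; the method choice and all data are illustrative assumptions, not the paper's setup.

```python
# Empirical quantile mapping: map GCM values onto the observed distribution
# via their quantiles in the historical GCM distribution. Synthetic data.

import numpy as np

def quantile_map(gcm_hist, obs_hist, gcm_future):
    """Correct GCM values: F_obs^{-1}(F_gcm_hist(x)) for each future x."""
    # Rank each future value within the historical GCM distribution...
    q = np.searchsorted(np.sort(gcm_hist), gcm_future) / len(gcm_hist)
    q = np.clip(q, 0.0, 1.0)
    # ...and read off the corresponding quantile of the observations.
    return np.quantile(obs_hist, q)

rng = np.random.default_rng(1)
obs = rng.normal(12.0, 3.0, 10_000)          # pseudo-observed temperature (°C)
gcm = obs + 2.5 + rng.normal(0, 1, 10_000)   # GCM runs ~2.5 °C too warm
future = gcm + 3.0                           # projected warming
corrected = quantile_map(gcm, obs, future)
# ~ +2.9 °C: the warming signal is retained, the +2.5 °C mean bias removed.
print(corrected.mean() - obs.mean())
```

Note that this assumes the historical GCM-observation relationship is stationary, which is exactly the assumption the paper probes by comparing the 1976–2005 and 2071–2100 periods.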


2020 ◽  
Vol 24 ◽  
pp. 63-86
Author(s):  
Francisco Mena ◽  
Ricardo Ñanculef ◽  
Carlos Valle

The lack of annotated data is one of the major barriers facing machine learning applications today. Learning from crowds, i.e. collecting ground-truth data from multiple inexpensive annotators, has become a common method to cope with this issue. It has recently been shown that modeling the varying quality of the annotations obtained in this way is fundamental to obtaining satisfactory performance in tasks where inexpert annotators may represent the majority but not the most trusted group. Unfortunately, existing techniques represent annotation patterns for each annotator individually, making the models difficult to estimate in large-scale scenarios. In this paper, we present two models to address these problems. Both methods are based on the hypothesis that it is possible to learn collective annotation patterns by introducing confusion matrices that involve groups of data point annotations or annotators. The first approach clusters data points with a common annotation pattern, regardless of the annotators from which the labels have been obtained. Implicitly, this method attributes annotation mistakes to the complexity of the data itself and not to the variable behavior of the annotators. The second approach explicitly maps annotators to latent groups that are collectively parametrized to learn a common annotation pattern. Our experimental results show that, compared with other methods for learning from crowds, both methods have advantages in scenarios with a large number of annotators and a small number of annotations per annotator.
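The economy of the second approach can be sketched directly: annotators share one confusion matrix per group, so the parameter count scales with the number of groups rather than the number of annotators. The EM sketch below fixes the group assignments for brevity, whereas the paper's models also learn them; all names are illustrative.

```python
# Simplified EM for learning from crowds with *grouped* confusion matrices.
# Group assignments are given (the paper's models also infer them).

import numpy as np

def em_grouped_confusion(labels, groups, n_classes, n_groups, n_iter=50):
    """labels: (n_items, n_annotators) int array, -1 = missing annotation.
    groups: (n_annotators,) group index of each annotator."""
    n_items, n_annot = labels.shape
    # Initialize label posteriors from smoothed per-item vote counts.
    post = np.ones((n_items, n_classes))
    for i in range(n_items):
        for a in range(n_annot):
            if labels[i, a] >= 0:
                post[i, labels[i, a]] += 1
    post /= post.sum(axis=1, keepdims=True)

    for _ in range(n_iter):
        # M-step: one confusion matrix per group, pooled over its annotators.
        conf = np.full((n_groups, n_classes, n_classes), 1e-2)  # smoothing
        for i in range(n_items):
            for a in range(n_annot):
                if labels[i, a] >= 0:
                    conf[groups[a], :, labels[i, a]] += post[i]
        conf /= conf.sum(axis=2, keepdims=True)
        # E-step: recompute label posteriors under the group matrices.
        log_post = np.zeros((n_items, n_classes))
        for i in range(n_items):
            for a in range(n_annot):
                if labels[i, a] >= 0:
                    log_post[i] += np.log(conf[groups[a], :, labels[i, a]])
        post = np.exp(log_post - log_post.max(axis=1, keepdims=True))
        post /= post.sum(axis=1, keepdims=True)
    return post, conf

# Toy usage: 3 annotators (two in group 0, one in group 1), 4 items, 2 classes.
labels = np.array([[0, 0, 1], [1, 1, 0], [0, 0, 0], [1, 1, 1]])
post, conf = em_grouped_confusion(labels, np.array([0, 0, 1]), 2, 2)
print(post.argmax(axis=1))   # inferred true labels
```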


2018 ◽  
Vol 4 (Supplement 2) ◽  
pp. 156s-156s
Author(s):  
S. Rayne ◽  
J. Meyerowitz ◽  
G. Even-Tov ◽  
H. Rae ◽  
N. Tapela ◽  
...  

Background and context: Breast cancer is one of the most common cancers in resource-constrained environments worldwide. Although breast awareness has improved, a lack of understanding of the diagnosis and management can cause patient anxiety and noncompliance, and may ultimately affect survival through compromised or delayed care. South African women attending government hospitals are diverse, with differing levels of income, education and support available, and they often lack access to appropriate information for their cancer care. Aim: A novel bioinformatics data management system was conceived through an innovative close collaboration between Wits Biomedical Informatics and Translational Science (Wits-BITS) and academic breast cancer surgeons. The aim was to develop a platform to allow acquisition of epidemiologic data while synchronously converting it into a personalised cancer plan and “take-home” information sheet for the patient. Strategy/Tactics: The concept of a clinician “customer” was used, in which the “currency” with which clinicians rewarded the database service was accurate data. For this payment they received the “product” of an immediate personalised information sheet for their patient. Program/Policy process: A custom software module was developed to generate individualised patient letters containing a mixture of template text and information from the patient's medical record. The letter is populated with the patient's name, where they were seen, and a personalised explanation of the patient's specific cancer stage according to the TNM system. Outcomes: Through a process of continuous use with patient and clinician feedback, the quality of data in the system was improved. Patients appreciated the personalised information sheet, which allowed patient and family to comprehend and be reassured by the management plan. Clinicians found that the quality of the information sheet gave instant feedback on the comprehensiveness of their data input, which in turn assured the compliance and quality of data points. What was learned: Using a consumer model and cross-discipline collaboration, in a setting where access to appropriate patient information is normally poor and data entry by overburdened clinicians unreliable, a low-cost model of high-quality, real-time data collection was achieved by the clinicians best qualified to input correct data points. Patients also benefitted immediately from participation in the database, through personalised information sheets that improved their understanding of their cancer care.
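The letter-generation step reduces to merging template text with fields from the patient's record. A minimal sketch follows; the field names, template wording, and staging explanation are hypothetical, not the Wits-BITS implementation.

```python
# Illustrative sketch of template-plus-record letter generation.
# All field names and wording are hypothetical examples.

from string import Template

LETTER = Template(
    "Dear $name,\n\n"
    "Thank you for attending the breast clinic at $site.\n"
    "Your cancer has been staged as $t$n$m (TNM system): $explanation\n\n"
    "Your management plan: $plan\n"
)

record = {
    "name": "Jane Doe",
    "site": "the breast clinic",            # hypothetical site field
    "t": "T2", "n": "N0", "m": "M0",
    "explanation": ("the tumour is 2-5 cm across, with no spread to "
                    "lymph nodes or other organs."),
    "plan": "surgery followed by radiotherapy; next visit in 6 weeks.",
}
print(LETTER.substitute(record))
```

The same record dictionary that populates the letter is what feeds the epidemiologic database, which is the mechanism by which the information sheet doubles as instant feedback on data completeness.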


2018 ◽  
Vol 66 (1-1) ◽  
pp. 309
Author(s):  
Celeste Sánchez-Noguera ◽  
Carlos Jiménez ◽  
Jorge Cortés

Abstract: Coastal development and marine ecosystems in Culebra Bay, Guanacaste, Costa Rica. Culebra Bay (North Pacific, Costa Rica) has been undergoing an accelerated process of coastal development since the implementation of the large-scale tourism development Gulf of Papagayo Project. This study aims to identify changes in the health status of marine ecosystems within the bay over an 18-year period (1993-2011). The high sanitary and aesthetic quality of the bay has remained constant through time. However, the coral reefs are degraded and dead coral accounts for more than 65 % of benthic coverage, likely due to a combination of natural and anthropogenic factors. During the study period, only one of the developer consortia demonstrated its commitment to accomplishing the goals of sustainable development established in the bay's Master Plan, through the practice of social and environmental programs. Management of natural resources in Culebra Bay requires the implementation of specific actions to promote the ecosystems' recovery, with the inclusion of all stakeholders. It must consider the current use of natural resources and include wastewater management and environmental education programs. Rev. Biol. Trop. 66(Suppl. 1): S309-S327. Epub 2018 April 01.


2017 ◽  
Author(s):  
Fred Hasselman ◽  
S.V. Crielaard ◽  
A.M.T. Bosman

Replication attempts of empirical psychological phenomena lack guidance by a larger effort to evaluate the psychological theory that predicted the phenomena. In this paper we present a template for theory evaluation suited for the class of theories produced by psychological science: theories of construction. As a case study, we perform a rigorous post-publication peer review of the theoretical core of Unconscious Thought Theory (UTT). We present several uncomplicated indices that quantify the quality of the evaluation of the results and conclusions by the experts who reviewed the article, and the amount of interpretation bias on behalf of the authors. The results reveal a failure of expert peer review to detect empirical reports of sub-standard quality. The analyses reveal that there is in fact hardly any empirical evidence for the predictions of UTT in important papers that claim its support. Our advice is to evaluate before you replicate. NOTE: This manuscript is presented here for reference only; see e.g. Nieuwenstein, M. R., Wierenga, T., Morey, R. D., Wicherts, J. M., Blom, T. N., Wagenmakers, E. J., & van Rijn, H. (2015). On making the right choice: a meta-analysis and large-scale replication attempt of the unconscious thought advantage. Judgment and Decision Making, 10(1), 1. http://journal.sjdm.org/14/14321/jdm14321.html. It will not be submitted for publication in its present form.

