data sources Latest Research Papers

Large-scale adoption of Artificial Intelligence and Machine Learning (AI-ML) models fed by heterogeneous, possibly untrustworthy data sources has spurred interest in estimating the degradation of such models due to spurious, adversarial, or low-quality data assets. We propose a quantitative estimate of the severity of classifiers’ training set degradation: an index expressing the deformation of the convex hulls of the classes computed on a held-out dataset generated via an unsupervised technique. We show that our index is computationally light, can be calculated incrementally, and complements existing quality measures for ML data assets well. As an experiment, we present the computation of our index on a benchmark convolutional image classifier.
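To make the idea of "convex hull deformation" concrete, here is a minimal sketch in Python. It is not the index proposed in the paper, whose definition is not given in the abstract. It assumes 2-D points and uses a hypothetical measure, the relative change in hull area between a reference class sample and a degraded one. The function names `convex_hull`, `hull_area`, and `hull_deformation` are illustrative.

```python
def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        # z-component of (a - o) x (b - o); positive means a left turn
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # Last point of each chain is the first point of the other
    return lower[:-1] + upper[:-1]


def hull_area(points):
    """Area of the convex hull via the shoelace formula."""
    hull = convex_hull(points)
    n = len(hull)
    if n < 3:
        return 0.0
    twice_area = 0.0
    for i in range(n):
        x1, y1 = hull[i]
        x2, y2 = hull[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def hull_deformation(reference, degraded):
    """Hypothetical deformation score: relative change in hull area."""
    ref_area = hull_area(reference)
    if ref_area == 0:
        raise ValueError("reference hull has zero area")
    return abs(hull_area(degraded) - ref_area) / ref_area


# Illustrative data only: one class before and after a single outlier is added.
clean = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5)]
noisy = clean + [(2, 2)]
score = hull_deformation(clean, noisy)
```

A single outlier is enough to change the hull, which is why a hull-based score reacts to spurious points. An area ratio is only one of several possible choices; it ignores shape changes that preserve area.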

Modeling the EU plastic footprint: Exploring data sources and littering potential

Resources Conservation and Recycling ◽

10.1016/j.resconrec.2021.106086 ◽

2022 ◽

Vol 178 ◽

pp. 106086

Author(s):

Andrea Martino Amadei ◽

Esther Sanyé-Mengual ◽

Serenella Sala

Keyword(s):

Data Sources ◽

The Eu

Comparative analysis of Dimensions and Scopus bibliographic data sources: an approach to university research productivity

International Journal of Electrical and Computer Engineering (IJECE) ◽

10.11591/ijece.v12i1.pp706-720 ◽

2022 ◽

Vol 12 (1) ◽

pp. 706

Author(s):

Pachisa Kulkanjanapiban ◽

Tipawan Silwattananusarn

Keyword(s):

Bibliometric Analysis ◽

Research Productivity ◽

Citation Count ◽

Data Sources ◽

Scientific Databases ◽

Bibliographic Data ◽

Significant Comparison ◽

Citation Counts ◽

Highly Correlated ◽

Number Of Citations

This paper presents a document-level comparison of two primary bibliographic data sources, Scopus and Dimensions. The emphasis is on the differences in their document coverage at the institutional level of aggregation. The main objective is to assess whether Dimensions offers the same promising new possibilities for bibliometric analysis at the institutional level as it does at the global level. The study compares the citation count profiles of articles published by faculty members of Prince of Songkla University (PSU) in Dimensions and Scopus, from the year each database first included PSU-authored papers (1970 and 1978, respectively) through the end of June 2020. Descriptive statistics and correlation analysis were applied to 19,846 articles indexed in Dimensions and 13,577 indexed in Scopus. The main finding was that the number of citations recorded in Dimensions was highly correlated with citation counts in Scopus. Spearman’s correlation between citation counts in Dimensions and Scopus indicated a strong relationship. The findings mainly bear on Dimensions’ potential as an instrument for carrying out bibliometric analysis of university members’ research productivity. University researchers can use Dimensions to retrieve information, and the design policies can be used to evaluate research using scientific databases.
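The abstract reports a Spearman correlation without stating its value. As a reminder of what that statistic measures, the following is a minimal pure-Python sketch: it ranks both series (averaging ranks over ties) and computes the Pearson correlation of the ranks. The citation counts are invented for illustration and are not taken from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the rank vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equally long series with at least 2 items")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-article citation counts in two databases.
dimensions_counts = [12, 0, 45, 7, 3]
scopus_counts = [10, 1, 40, 9, 2]
rho = spearman(dimensions_counts, scopus_counts)
```

Because only ranks matter, the coefficient stays high when one database records systematically more citations than the other, as long as the ordering of articles is similar.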

Why are we doing this? Teacher and student perspectives on literary reading

English Teaching Practice & Critique ◽

10.1108/etpc-06-2021-0076 ◽

2022 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Allison H. Hall ◽

Susan R. Goldman

Keyword(s):

Design Methodology ◽

Data Sources ◽

Literary Texts ◽

Literary Reading ◽

Content Type ◽

Instructional Goals ◽

Making Sense ◽

Audio Recordings ◽

The World ◽

Multiple Meanings

Purpose: This paper aims to examine the extent to which students’ experiences and perceptions of their literature classroom align with their teacher’s instructional goals for literary inquiry and what teachers can learn from gaining access to students’ perspectives on their classroom experiences. Design/methodology/approach: Thematic analyses were used to examine the data sources: mid-year and end-of-year interviews with six students, audio recordings of the teacher’s rationale for her instructional designs and a reflective discussion with the teacher upon reading the student interviews three years later. Findings: Much of what the teacher intended students to get out of her instruction was what they expressed learning and experiencing in the class, yet some understood the purpose of the class to be far from her intentions. All the interviewed students had deeply personal and varied ways of relating what they learned in class to the world and their own lives. The teacher’s reflection on the interviews highlighted the importance of making space for multiple meanings and perspectives on literary works. Originality/value: This paper speaks to the importance of surfacing students’ individual and varied ways of making sense of literary texts as part of instruction that values students’ thinking as well as the epistemic commitments of literary reading.

Examining sex differences in the completeness of Peruvian CRVS data and adult mortality estimates

Genus ◽

10.1186/s41118-021-00151-5 ◽

2022 ◽

Vol 78 (1) ◽

Author(s):

Helena Cruz Castanheira ◽

José Henrique Costa Monteiro da Silva

Keyword(s):

Sex Differences ◽

Sex Difference ◽

Health Surveys ◽

Adult Mortality ◽

Data Sources ◽

Demographic And Health Surveys ◽

Death Registration ◽

Male And Female ◽

Health Related ◽

Mortality Estimates

Abstract: The production, compilation, and publication of death registration records are complex and usually involve many institutions. Assessing available data and the evolution of the completeness of the data compiled based on demographic techniques and other available data sources is of great importance for countries and for having timely and disaggregated mortality estimates. In this paper, we assess whether it is reasonable, based on the available data, to assume that there is a sex difference in the completeness of male and female death records in Peru in the last 30 years. In addition, we assess how the gap may have evolved with time by applying two-census death distribution methods on health-related registries and analyzing the information from the Demographic and Health Surveys and civil registries. Our findings suggest that there is no significant sex difference in the completeness of male and female health-related registries and, consequently, the sex gap currently observed in adult mortality estimates might be overestimated.

A comparison of approaches to measuring maternal mortality in Bangladesh, Mozambique, and Bolivia

Population Health Metrics ◽

10.1186/s12963-022-00281-8 ◽

2022 ◽

Vol 20 (1) ◽

Author(s):

Kavita Singh ◽

Qingfeng Li ◽

Karar Zunaid Ahsan ◽

Sian Curtis ◽

William Weiss

Keyword(s):

Maternal Mortality ◽

Household Survey ◽

Data Sources ◽

Sample Survey ◽

Survey Method ◽

Population Census ◽

Registration System ◽

Advantages And Disadvantages ◽

Mortality Ratio ◽

Mortality Survey

Abstract. Background: Many low- and middle-income countries cannot measure maternal mortality to monitor progress against global and country-specific targets. While the ultimate goal for these countries is to have complete civil registration systems, other interim strategies are needed to provide timely estimates of maternal mortality. Objective: The objective is to inform on potential options for measuring maternal mortality. Methods: This paper uses a case study approach to compare methodologies and estimates of pregnancy-related mortality ratio (PRMR)/maternal mortality ratio (MMR) obtained from four different data sources from similar time periods in Bangladesh, Mozambique, and Bolivia: national population census; post-census mortality survey; household sample survey; and sample vital registration system (SVRS). Results: For Bangladesh, PRMR from the 2011 census falls closely in line with the 2010 household survey and SVRS estimates, while SVRS’ MMR estimates are closer to the PRMR estimates obtained from the household survey. Mozambique's PRMR from the household survey method is comparable and shows an upward trend between 1994 and 2011, whereas the post-census mortality survey estimated a higher MMR for 2007. Bolivia's DHS and post-census mortality survey also estimated comparable MMR during 1998–2003. Conclusions: Overall, the data sources presented in this paper have provided valuable information on maternal mortality in Bangladesh, Mozambique, and Bolivia. The paper also outlines recommendations to estimate maternal mortality based on the advantages and disadvantages of several approaches. Contribution: Recommendations in this paper can help health administrators and policy planners in prioritizing investment for collecting reliable and contemporaneous estimates of maternal mortality while progressing toward a complete civil registration system.
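For readers unfamiliar with the two ratios compared above, the sketch below shows the standard calculation: deaths per 100,000 live births over the same period. MMR uses maternal deaths in the numerator and PRMR uses pregnancy-related deaths. The function name and the figures are hypothetical and are not taken from the paper.

```python
def mortality_ratio(deaths, live_births, per=100_000):
    """Deaths per `per` live births (the usual scale for MMR and PRMR)."""
    if live_births <= 0:
        raise ValueError("live_births must be positive")
    if deaths < 0:
        raise ValueError("deaths cannot be negative")
    return deaths * per / live_births


# Hypothetical counts for one reference period.
maternal_deaths = 20
pregnancy_related_deaths = 24
live_births = 10_000

mmr = mortality_ratio(maternal_deaths, live_births)
prmr = mortality_ratio(pregnancy_related_deaths, live_births)
```

Since both ratios share a denominator, differences between them come entirely from how deaths are classified, which is one reason estimates from different data sources can diverge.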

FILER: a framework for harmonizing and querying large-scale functional genomics knowledge

NAR Genomics and Bioinformatics ◽

10.1093/nargab/lqab123 ◽

2022 ◽

Vol 4 (1) ◽

Author(s):

Pavel P Kuksa ◽

Yuk Yee Leung ◽

Prabhakaran Gangadharan ◽

Zivadin Katanic ◽

Lauren Kleidermacher ◽

...

Keyword(s):

Functional Genomics ◽

Large Scale ◽

Fold Increase ◽

Cell Types ◽

Data Sources ◽

Reproducible Research ◽

Functional Genomic ◽

Data Types ◽

Annotation Data ◽

Data Collections

ABSTRACT Querying massive functional genomic and annotation data collections, linking and summarizing the query results across data sources/data types are important steps in high-throughput genomic and genetic analytical workflows. However, these steps are made difficult by the heterogeneity and breadth of data sources, experimental assays, biological conditions/tissues/cell types and file formats. FILER (FunctIonaL gEnomics Repository) is a framework for querying large-scale genomics knowledge with a large, curated integrated catalog of harmonized functional genomic and annotation data coupled with a scalable genomic search and querying interface. FILER uniquely provides: (i) streamlined access to >50 000 harmonized, annotated genomic datasets across >20 integrated data sources, >1100 tissues/cell types and >20 experimental assays; (ii) a scalable genomic querying interface; and (iii) ability to analyze and annotate user’s experimental data. This rich resource spans >17 billion GRCh37/hg19 and GRCh38/hg38 genomic records. Our benchmark querying 7 × 10^9 hg19 FILER records shows FILER is highly scalable, with a sub-linear 32-fold increase in querying time when increasing the number of queries 1000-fold from 1000 to 1 000 000 intervals. Together, these features facilitate reproducible research and streamline integrating/querying large-scale genomic data within analyses/workflows. FILER can be deployed on cloud or local servers (https://bitbucket.org/wanglab-upenn/FILER) for integration with custom pipelines and is freely available (https://lisanwanglab.org/FILER).
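The "sub-linear" claim can be quantified with a short calculation. The sketch below assumes, purely for illustration, that query time follows a power law in the number of query intervals, time ∝ n^k; the abstract does not state this model. Under that assumption, the two reported ratios (32-fold time, 1000-fold queries) determine the exponent k.

```python
import math


def scaling_exponent(time_ratio, size_ratio):
    """Exponent k such that time_ratio == size_ratio ** k (power-law assumption)."""
    if time_ratio <= 0 or size_ratio <= 1:
        raise ValueError("need time_ratio > 0 and size_ratio > 1")
    return math.log(time_ratio) / math.log(size_ratio)


# Ratios reported in the abstract: 1000x more query intervals, 32x more time.
k = scaling_exponent(time_ratio=32, size_ratio=1000)

# Under the same assumption, the time per query shrinks as the batch grows.
per_query_time_ratio = 32 / 1000
```

An exponent of roughly 0.5 is well below the value of 1 that linear scaling would give, and the per-query cost in the larger batch is about 3% of the cost in the smaller one.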

Data Sources for Drug Utilization Research in Brazil—DUR-BRA Study

Frontiers in Pharmacology ◽

10.3389/fphar.2021.789872 ◽

2022 ◽

Vol 12 ◽

Author(s):

Lisiane Freitas Leal ◽

Claudia Garcia Serpa Osorio-de-Castro ◽

Luiz Júpiter Carneiro de Souza ◽

Felipe Ferre ◽

Daniel Marques Mota ◽

...

Keyword(s):

Adverse Event ◽

Drug Utilization ◽

Adverse Event Reporting ◽

Data Sources ◽

Event Reporting ◽

Utilization Research ◽

National Surveys ◽

Reporting Systems ◽

Data Source ◽

Drug Utilization Research

Background: In Brazil, studies that map electronic healthcare databases in order to assess their suitability for use in pharmacoepidemiologic research are lacking. We aimed to identify, catalogue, and characterize Brazilian data sources for Drug Utilization Research (DUR). Methods: The present study is part of the project entitled, “Publicly Available Data Sources for Drug Utilization Research in Latin American (LatAm) Countries.” A network of Brazilian health experts was assembled to map secondary administrative data from healthcare organizations that might provide information related to medication use. A multi-phase approach including internet search of institutional government websites, traditional bibliographic databases, and experts’ input was used for mapping the data sources. The reviewers searched, screened and selected the data sources independently; disagreements were resolved by consensus. Data sources were grouped into the following categories: 1) automated databases; 2) Electronic Medical Records (EMR); 3) national surveys or datasets; 4) adverse event reporting systems; and 5) others. Each data source was characterized by accessibility, geographic granularity, setting, type of data (aggregate or individual-level), and years of coverage. We also searched for publications related to each data source. Results: A total of 62 data sources were identified and screened; 38 met the eligibility criteria for inclusion and were fully characterized. We grouped 23 (60%) as automated databases, four (11%) as adverse event reporting systems, four (11%) as EMRs, three (8%) as national surveys or datasets, and four (11%) as other types. Eighteen (47%) were classified as publicly and conveniently accessible online, providing information at the national level. Most of them offered more than 5 years of comprehensive data coverage, and presented data at both the individual and aggregated levels. No information about population coverage was found. Drug coding is not uniform; each data source has its own coding system, depending on the purpose of the data. At least one scientific publication was found for each publicly available data source. Conclusions: There are several types of data sources for DUR in Brazil, but a uniform system for drug classification and data quality evaluation does not exist. The extent of population covered by year is unknown. Our comprehensive and structured inventory reveals a need for full characterization of these data sources.

Deep reinforcement learning based control for Autonomous Vehicles in CARLA

Multimedia Tools and Applications ◽

10.1007/s11042-021-11437-3 ◽

2022 ◽

Author(s):

Óscar Pérez-Gil ◽

Rafael Barea ◽

Elena López-Guillén ◽

Luis M. Bergasa ◽

Carlos Gómez-Huélamo ◽

...

Keyword(s):

Reinforcement Learning ◽

Autonomous Vehicles ◽

Autonomous Vehicle ◽

Vehicle Control ◽

Data Sources ◽

Simulation Environment ◽

Urban Simulation ◽

Policy Gradient ◽

Almost All ◽

Control Layer

Abstract: Artificial Intelligence (AI) is advancing rapidly in almost all fields of technology, and Autonomous Vehicle (AV) research is one of them. This paper proposes the use of algorithms based on Deep Learning (DL) in the control layer of an autonomous vehicle. More specifically, Deep Reinforcement Learning (DRL) algorithms such as Deep Q-Network (DQN) and Deep Deterministic Policy Gradient (DDPG) are implemented in order to compare their results. The aim of this work is to obtain a model, trained with a DRL algorithm, capable of sending control commands to the vehicle so that it navigates properly and efficiently along a predetermined route. In addition, for each algorithm, several agents are presented as a solution, each using different data sources to produce the vehicle control commands. For this purpose, the open-source simulator CARLA is used, giving the system the ability to perform a multitude of tests without any risk in a hyper-realistic urban simulation environment, something that is unthinkable in the real world. The results show that both DQN and DDPG reach the goal, but DDPG performs better. DDPG produces trajectories very similar to those of a classic controller such as LQR. In both cases, the RMSE is lower than 0.1 m when following trajectories ranging from 180 to 700 m. Finally, conclusions and future work are discussed.
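The RMSE figure quoted above summarizes how far the driven trajectory deviates from the reference route. The abstract does not say how the deviation is sampled, so the sketch below assumes a list of signed lateral errors in metres, one per sampled point along the route. The error values are invented for illustration.

```python
import math


def rmse(errors):
    """Root-mean-square of a sequence of error values."""
    if not errors:
        raise ValueError("errors must be non-empty")
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Hypothetical lateral deviations (metres) between driven and reference path.
lateral_errors_m = [0.05, -0.08, 0.02, 0.00, -0.04]
trajectory_rmse = rmse(lateral_errors_m)

# The threshold mentioned in the abstract, used here as a simple check.
within_reported_bound = trajectory_rmse < 0.1
```

Squaring before averaging means a few large deviations raise the RMSE more than many small ones, so a value below 0.1 m over a route of several hundred metres indicates consistently tight tracking.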

data sources
Recently Published Documents

Challenges and Opportunities of Emerging Data Sources to Estimate Network-Wide Bike Counts

Estimating Degradation of Machine Learning Data Assets

Modeling the EU plastic footprint: Exploring data sources and littering potential

Comparative analysis of Dimensions and Scopus bibliographic data sources: an approach to university research productivity

Why are we doing this? Teacher and student perspectives on literary reading

Examining sex differences in the completeness of Peruvian CRVS data and adult mortality estimates

A comparison of approaches to measuring maternal mortality in Bangladesh, Mozambique, and Bolivia

FILER: a framework for harmonizing and querying large-scale functional genomics knowledge

Data Sources for Drug Utilization Research in Brazil—DUR-BRA Study

Deep reinforcement learning based control for Autonomous Vehicles in CARLA
