data sources
Recently Published Documents





Md. Mintu Miah ◽  
Kate Kyung Hyun ◽  
Stephen P. Mattingly ◽  
Joseph Broach ◽  
Nathan McNeil ◽  

2022 ◽  
Vol 14 (2) ◽  
pp. 1-15
Lara Mauri ◽  
Ernesto Damiani

Large-scale adoption of Artificial Intelligence and Machine Learning (AI-ML) models fed by heterogeneous, possibly untrustworthy data sources has spurred interest in estimating degradation of such models due to spurious, adversarial, or low-quality data assets. We propose a quantitative estimate of the severity of classifiers’ training set degradation: an index expressing the deformation of the convex hulls of the classes computed on a held-out dataset generated via an unsupervised technique. We show that our index is computationally light, can be calculated incrementally, and complements existing quality measures for ML data assets well. As an experiment, we present the computation of our index on a benchmark convolutional image classifier.
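The abstract does not give the formula for the index, so the following is only a minimal sketch of the general idea of measuring convex hull deformation. It assumes 2-D points per class and uses the relative change in hull area as a stand-in measure; the function names (`convex_hull`, `polygon_area`, `hull_deformation`) and the area-based definition are illustrative assumptions, not the authors' method.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def convex_hull(points: List[Point]) -> List[Point]:
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o: Point, a: Point, b: Point) -> float:
        # Positive when o -> a -> b is a counter-clockwise turn.
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower: List[Point] = []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)

    upper: List[Point] = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)

    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(vertices: List[Point]) -> float:
    """Shoelace formula; returns 0.0 for degenerate polygons."""
    n = len(vertices)
    if n < 3:
        return 0.0
    total = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def hull_deformation(reference: List[Point], degraded: List[Point]) -> float:
    """Hypothetical index: relative change in hull area for one class."""
    ref_area = polygon_area(convex_hull(reference))
    if ref_area == 0.0:
        raise ValueError("reference hull has zero area")
    return abs(polygon_area(convex_hull(degraded)) - ref_area) / ref_area


# Illustrative data only: a unit square, then the same class with one outlier.
clean = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
noisy = clean + [(2.0, 0.5)]
print(hull_deformation(clean, noisy))
```

In this toy case a single outlier enlarges the hull, so the index is non-zero even though every other point is unchanged. A real implementation would work in the classifier's feature space, which is typically higher-dimensional than this sketch assumes.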

2022 ◽  
Vol 178 ◽  
pp. 106086
Andrea Martino Amadei ◽  
Esther Sanyé-Mengual ◽  
Serenella Sala

Pachisa Kulkanjanapiban ◽  
Tipawan Silwattananusarn

This paper compares two primary bibliographic data sources, Scopus and Dimensions, at the document level. The emphasis is on differences in their document coverage at the institutional level of aggregation. The main objective is to assess whether Dimensions offers the same good new possibilities for bibliometric analysis at the institutional level as it does at the global level. The study compares the citation count profiles of articles published by faculty members of Prince of Songkla University (PSU) in Dimensions and Scopus from the year the databases first included PSU-authored papers (1970 and 1978, respectively) through the end of June 2020. Descriptive statistics and correlation analysis were applied to 19,846 articles indexed in Dimensions and 13,577 indexed in Scopus. The main finding was that citation counts recorded in Dimensions were highly correlated with citation counts in Scopus; Spearman’s correlation between the two indicated a strong relationship. The findings bear mainly on the potential of Dimensions as an instrument for bibliometric analysis of university members’ research productivity. University researchers can use Dimensions to retrieve information, and the design policies can be used to evaluate research using scientific databases.
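To make the correlation step concrete, here is a minimal sketch of Spearman's rank correlation computed as the Pearson correlation of average ranks. The citation counts are invented for illustration and are not data from the study; the function names are likewise hypothetical.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rho: Pearson correlation applied to the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    return pearson(average_ranks(x), average_ranks(y))


# Invented citation counts for the same five articles in two databases.
dimensions_counts = [12, 3, 40, 7, 0]
scopus_counts = [10, 2, 35, 8, 0]
print(spearman(dimensions_counts, scopus_counts))
```

Because Spearman's coefficient depends only on rank order, the two databases can report different absolute counts for an article and still correlate perfectly, as long as they order the articles the same way.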

2022 ◽  
Vol ahead-of-print (ahead-of-print) ◽  
Allison H. Hall ◽  
Susan R. Goldman

Purpose This paper aims to examine the extent to which students’ experiences and perceptions of their literature classroom align with their teacher’s instructional goals for literary inquiry and what teachers can learn from gaining access to students’ perspectives on their classroom experiences. Design/methodology/approach Thematic analyses were used to examine the data sources: mid-year and end-of-year interviews with six students, audio recordings of the teacher’s rationale for her instructional designs and a reflective discussion with the teacher upon reading the student interviews three years later. Findings Much of what the teacher intended students to get out of her instruction was what they expressed learning and experiencing in the class, yet some understood the purpose of the class to be far from her intentions. All the interviewed students had deeply personal and varied ways of relating what they learned in class to the world and their own lives. The teacher’s reflection on the interviews highlighted the importance of making space for multiple meanings and perspectives on literary works. Originality/value This paper speaks to the importance of surfacing students’ individual and varied ways of making sense of literary texts as part of instruction that values students’ thinking as well as the epistemic commitments of literary reading.

Genus ◽  
2022 ◽  
Vol 78 (1) ◽  
Helena Cruz Castanheira ◽  
José Henrique Costa Monteiro da Silva

Abstract The production, compilation, and publication of death registration records are complex and usually involve many institutions. Assessing available data and the evolution of the completeness of the data compiled based on demographic techniques and other available data sources is of great importance for countries and for having timely and disaggregated mortality estimates. In this paper, we assess whether it is reasonable, based on the available data, to assume that there is a sex difference in the completeness of male and female death records in Peru in the last 30 years. In addition, we assess how the gap may have evolved with time by applying two-census death distribution methods on health-related registries and analyzing the information from the Demographic and Health Surveys and civil registries. Our findings suggest that there is no significant sex difference in the completeness of male and female health-related registries and, consequently, the sex gap currently observed in adult mortality estimates might be overestimated.

2022 ◽  
Vol 20 (1) ◽  
Kavita Singh ◽  
Qingfeng Li ◽  
Karar Zunaid Ahsan ◽  
Sian Curtis ◽  
William Weiss

Abstract Background Many low- and middle-income countries cannot measure maternal mortality to monitor progress against global and country-specific targets. While the ultimate goal for these countries is to have complete civil registration systems, other interim strategies are needed to provide timely estimates of maternal mortality. Objective The objective is to inform on potential options for measuring maternal mortality. Methods This paper uses a case study approach to compare methodologies and estimates of pregnancy-related mortality ratio (PRMR)/maternal mortality ratio (MMR) obtained from four different data sources from similar time periods in Bangladesh, Mozambique, and Bolivia: national population census; post-census mortality survey; household sample survey; and sample vital registration system (SVRS). Results For Bangladesh, PRMR from the 2011 census falls closely in line with the 2010 household survey and SVRS estimates, while SVRS’ MMR estimates are closer to the PRMR estimates obtained from the household survey. Mozambique's PRMR from the household survey method is comparable and shows an upward trend between 1994 and 2011, whereas the post-census mortality survey estimated a higher MMR for 2007. Bolivia's DHS and post-census mortality survey also estimated comparable MMR during 1998–2003. Conclusions Overall, all the data sources presented in this paper have provided valuable information on maternal mortality in Bangladesh, Mozambique, and Bolivia. The paper also outlines recommendations to estimate maternal mortality based on the advantages and disadvantages of several approaches. Contribution Recommendations in this paper can help health administrators and policy planners in prioritizing investment for collecting reliable and contemporaneous estimates of maternal mortality while progressing toward a complete civil registration system.

2022 ◽  
Vol 4 (1) ◽  
Pavel P Kuksa ◽  
Yuk Yee Leung ◽  
Prabhakaran Gangadharan ◽  
Zivadin Katanic ◽  
Lauren Kleidermacher ◽  

ABSTRACT Querying massive functional genomic and annotation data collections, linking and summarizing the query results across data sources/data types are important steps in high-throughput genomic and genetic analytical workflows. However, these steps are made difficult by the heterogeneity and breadth of data sources, experimental assays, biological conditions/tissues/cell types and file formats. FILER (FunctIonaL gEnomics Repository) is a framework for querying large-scale genomics knowledge with a large, curated integrated catalog of harmonized functional genomic and annotation data coupled with a scalable genomic search and querying interface. FILER uniquely provides: (i) streamlined access to >50 000 harmonized, annotated genomic datasets across >20 integrated data sources, >1100 tissues/cell types and >20 experimental assays; (ii) a scalable genomic querying interface; and (iii) ability to analyze and annotate user’s experimental data. This rich resource spans >17 billion GRCh37/hg19 and GRCh38/hg38 genomic records. Our benchmark querying 7 × 10^9 hg19 FILER records shows FILER is highly scalable, with a sub-linear 32-fold increase in querying time when increasing the number of queries 1000-fold from 1000 to 1 000 000 intervals. Together, these features facilitate reproducible research and streamline integrating/querying large-scale genomic data within analyses/workflows. FILER can be deployed on cloud or local servers ( for integration with custom pipelines and is freely available (

2022 ◽  
Vol 12 ◽  
Lisiane Freitas Leal ◽  
Claudia Garcia Serpa Osorio-de-Castro ◽  
Luiz Júpiter Carneiro de Souza ◽  
Felipe Ferre ◽  
Daniel Marques Mota ◽  

Background: In Brazil, studies that map electronic healthcare databases in order to assess their suitability for use in pharmacoepidemiologic research are lacking. We aimed to identify, catalogue, and characterize Brazilian data sources for Drug Utilization Research (DUR). Methods: The present study is part of the project entitled, “Publicly Available Data Sources for Drug Utilization Research in Latin American (LatAm) Countries.” A network of Brazilian health experts was assembled to map secondary administrative data from healthcare organizations that might provide information related to medication use. A multi-phase approach including internet search of institutional government websites, traditional bibliographic databases, and experts’ input was used for mapping the data sources. The reviewers searched, screened and selected the data sources independently; disagreements were resolved by consensus. Data sources were grouped into the following categories: 1) automated databases; 2) Electronic Medical Records (EMR); 3) national surveys or datasets; 4) adverse event reporting systems; and 5) others. Each data source was characterized by accessibility, geographic granularity, setting, type of data (aggregate or individual-level), and years of coverage. We also searched for publications related to each data source. Results: A total of 62 data sources were identified and screened; 38 met the eligibility criteria for inclusion and were fully characterized. We grouped 23 (60%) as automated databases, four (11%) as adverse event reporting systems, four (11%) as EMRs, three (8%) as national surveys or datasets, and four (11%) as other types. Eighteen (47%) were classified as publicly and conveniently accessible online, providing information at the national level. Most of them offered more than 5 years of comprehensive data coverage, and presented data at both the individual and aggregated levels. No information about population coverage was found. Drug coding is not uniform; each data source has its own coding system, depending on the purpose of the data. At least one scientific publication was found for each publicly available data source. Conclusions: There are several types of data sources for DUR in Brazil, but a uniform system for drug classification and data quality evaluation does not exist. The extent of population covered by year is unknown. Our comprehensive and structured inventory reveals a need for full characterization of these data sources.

Óscar Pérez-Gil ◽  
Rafael Barea ◽  
Elena López-Guillén ◽  
Luis M. Bergasa ◽  
Carlos Gómez-Huélamo ◽  

Abstract Nowadays, Artificial Intelligence (AI) is growing by leaps and bounds in almost all fields of technology, and Autonomous Vehicles (AV) research is one of them. This paper proposes the use of algorithms based on Deep Learning (DL) in the control layer of an autonomous vehicle. More specifically, Deep Reinforcement Learning (DRL) algorithms such as Deep Q-Network (DQN) and Deep Deterministic Policy Gradient (DDPG) are implemented in order to compare their results. The aim of this work is to obtain a trained model, applying a DRL algorithm, capable of sending control commands to the vehicle so that it navigates properly and efficiently along a given route. In addition, for each of the algorithms, several agents are presented as a solution, each using different data sources to derive the vehicle control commands. For this purpose, the open-source simulator CARLA is used, giving the system the ability to perform a multitude of tests without any risk in a hyper-realistic urban simulation environment, something that is unthinkable in the real world. The results show that both DQN and DDPG reach the goal, but DDPG obtains better performance. DDPG performs trajectories very similar to those of a classic controller such as LQR. In both cases the RMSE is lower than 0.1 m when following trajectories ranging from 180 to 700 m. Finally, conclusions and future work are discussed.
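As a rough illustration of the RMSE figure quoted above, the sketch below computes the root-mean-square error between a driven trajectory and a reference route. It assumes the two are sampled at matching waypoints and that the error at each waypoint is the Euclidean distance in metres; the abstract does not say how the authors paired points, so both the pairing and the name `trajectory_rmse` are assumptions.

```python
import math
from typing import Sequence, Tuple

Waypoint = Tuple[float, float]  # (x, y) position in metres


def trajectory_rmse(driven: Sequence[Waypoint], reference: Sequence[Waypoint]) -> float:
    """RMSE of point-wise Euclidean distances between two equally sampled paths."""
    if len(driven) != len(reference) or not driven:
        raise ValueError("trajectories must be non-empty and equally sampled")
    squared_errors = [
        (dx - rx) ** 2 + (dy - ry) ** 2
        for (dx, dy), (rx, ry) in zip(driven, reference)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Invented waypoints: a straight reference route and a path that wobbles by 0.1 m.
reference_route = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
driven_path = [(0.0, 0.1), (1.0, -0.1), (2.0, 0.1)]
print(trajectory_rmse(driven_path, reference_route))
```

Squaring the errors before averaging means a few large deviations weigh more than many small ones, which is why RMSE is a common summary for path-tracking accuracy.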
