BloodPAC Data Commons for Liquid Biopsy Data

2021 ◽  
pp. 479-486
Author(s):  
Robert L. Grossman ◽  
Jonathan R. Dry ◽  
Sean E. Hanlon ◽  
Donald J. Johann ◽  
Anand Kolatkar ◽  
...  

PURPOSE The Blood Profiling Atlas in Cancer (BloodPAC) Data Commons (BPDC) is being developed and is operated by the public-private BloodPAC Consortium to support the liquid biopsy community. It is an interoperable data commons with the ultimate aim of serving as a recognized source of valid scientific evidence for liquid biopsy assays for industry, academia, and standards and regulatory stakeholders. METHODS The BPDC is implemented using the open source Gen3 data commons platform (https://gen3.org). In particular, the BPDC Data Exploration Portal, the BPDC Data Submission Portal, the BPDC Workspace Hub, and the BloodPAC application programming interface (API) were all automatically generated from the BloodPAC Data Model using the Gen3 data commons platform. BPDC uses Gen3's implementation of the data commons framework services so that it can interoperate through secure, compliant APIs with other data commons that use the data commons framework services, such as the National Cancer Institute's Cancer Research Data Commons. RESULTS The BPDC contains 57 studies and projects spanning more than 4,100 cases. This amounts to 5,700 aliquots (blood plasma, serum, or a contrived sample) that have been subjected to a liquid biopsy assay, quantified, and then contributed by members of the BloodPAC Consortium. In all, there are more than 31,000 files in the commons as of December 2020. We describe the BPDC, the data it manages, the process that the BloodPAC Consortium used to develop it, and some of the applications that have been developed using its API. CONCLUSION The BPDC has been the data platform used by BloodPAC during the past 4 years to manage the data for the consortium and to provide workspaces for its working groups.

2020 ◽  
Vol 12 (10) ◽  
pp. 4200 ◽  
Author(s):  
Thanh-Long Giang ◽  
Dinh-Tri Vo ◽  
Quan-Hoang Vuong

Using data from the WHO’s Situation Report on the COVID-19 pandemic from 21 January 2020 to 30 March 2020 along with other health, demographic, and macroeconomic indicators from the WHO’s Application Programming Interface and the World Bank’s Development Indicators, this paper explores the death rates of infected persons and their possible associated factors. Through the panel analysis, we found consistent results that healthcare system conditions, particularly the number of hospital beds and medical staff, have played extremely important roles in reducing death rates of COVID-19 infected persons. In addition, both the mortality rates due to different non-communicable diseases (NCDs) and the share of people aged 65 and over were significantly related to the death rates. We also found that controlling international and domestic air travel, along with increasingly popular anti-COVID-19 actions (i.e., quarantine and social distancing), would help reduce the death rates in all countries. We conducted tests for robustness and found that the Driscoll and Kraay (1998) method was the most suitable estimator with a finite sample, which helped confirm the robustness of our estimations. Based on the findings, we suggest that the preparedness of healthcare systems for aged populations needs more attention from the public and politicians, regardless of income level, when facing COVID-19-like pandemics.


2020 ◽  
Vol 49 (D1) ◽  
pp. D1515-D1522 ◽  
Author(s):  
Daniel C Berrios ◽  
Jonathan Galazka ◽  
Kirill Grigorev ◽  
Samrawit Gebre ◽  
Sylvain V Costes

Abstract The mission of NASA’s GeneLab database (https://genelab.nasa.gov/) is to collect, curate, and provide access to the genomic, transcriptomic, proteomic and metabolomic (so-called ‘omics’) data from biospecimens flown in space or exposed to simulated space stressors, maximizing their utilization. This large collection of data enables the exploration of molecular network responses to space environments using a systems biology approach. We review here the various components of the GeneLab platform, including the new data repository web interface, and the GeneLab Online Data Entry (GEODE) web portal, which will support the expansion of the database in the future to include companion non-omics assay data. We discuss our design for GEODE, particularly how it promotes investigators providing more accurate metadata, reducing the curation effort required of GeneLab staff. We also introduce here a new GeneLab Application Programming Interface (API) specifically designed to support tools for the visualization of processed omics data. We review the outreach efforts by GeneLab to utilize the spaceflight data in the repository to generate novel discoveries and develop new hypotheses, including spearheading data analysis working groups, and a high school student training program. All these efforts are aimed ultimately at supporting precision risk management for human space exploration.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Casper W. Andersen ◽  
Rickard Armiento ◽  
Evgeny Blokhin ◽  
Gareth J. Conduit ◽  
Shyam Dwaraknath ◽  
...  

Abstract The Open Databases Integration for Materials Design (OPTIMADE) consortium has designed a universal application programming interface (API) to make materials databases accessible and interoperable. We outline the first stable release of the specification, v1.0, which is already supported by many leading databases and several software packages. We illustrate the advantages of the OPTIMADE API through worked examples on each of the public materials databases that support the full API specification.
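As a rough illustration of what a universal query interface looks like in practice, the sketch below builds a query URL for the `structures` endpoint of an OPTIMADE v1 server. The base URL `https://example.org/optimade` is a placeholder rather than a real provider, the helper name `build_structures_query` is hypothetical, and the filter string is only an example of the specification's filter language; no request is sent.

```python
from urllib.parse import urlencode


def build_structures_query(base_url: str, filter_expr: str, page_limit: int = 10) -> str:
    """Build a query URL for the /v1/structures endpoint (no network access)."""
    # urlencode percent-encodes the filter expression so it is safe in a URL.
    params = urlencode({"filter": filter_expr, "page_limit": page_limit})
    return f"{base_url.rstrip('/')}/v1/structures?{params}"


# Placeholder base URL; substitute the address of an actual OPTIMADE provider.
url = build_structures_query(
    "https://example.org/optimade",
    'elements HAS ALL "Si","O" AND nelements=2',
    page_limit=5,
)
print(url)
```

Because every compliant database exposes the same endpoint and filter syntax, the same URL-building code should work against a different provider by changing only the base URL.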


2021 ◽  
Vol 11 (1) ◽  
pp. 20
Author(s):  
Mete Ercan Pakdil ◽  
Rahmi Nurhan Çelik

Geospatial data and related technologies have become an increasingly important aspect of data analysis processes and play a prominent role in most of them. The serverless paradigm has become the most popular and frequently used technology within cloud computing. This paper reviews the serverless paradigm and examines how it could be leveraged for geospatial data processes by using open standards in the geospatial community. We propose a system design and architecture to handle complex geospatial data processing jobs with minimum human intervention and resource consumption using serverless technologies. In order to define and execute workflows in the system, we also propose new models for both workflow and task definitions. Moreover, the proposed system has new web services based on the Open Geospatial Consortium (OGC) Application Programming Interface (API) Processes specification to provide interoperability with other geospatial applications, in anticipation that the specification will be more commonly used in the future. We implemented the proposed system on one of the public cloud providers as a proof of concept and evaluated it with sample geospatial workflows and cloud architecture best practices.


2020 ◽  
Author(s):  
Jon-Patrick Allem ◽  
Allison Dormanesh ◽  
Anuja Majmundar ◽  
Vanessa Rivera ◽  
Maya Chu ◽  
...  

BACKGROUND In response to the recent government restrictions, flavored JUUL products, which are rechargeable closed-system electronic cigarettes (e-cigarettes), are no longer available for sale. However, disposable closed-system products such as the flavored Puff Bar e-cigarette continue to be available. If e-cigarette consumers simply switch between products because the current government restrictions cover 1 type of product but not another, then such restrictions would be less effective. A step forward in this line of research is to understand how the public discusses these products by examining discourse referencing both Puff Bar and JUUL in the same conversation. Twitter data provide ample opportunity to capture such early trends that could be used to help public health researchers stay abreast of the rapidly changing e-cigarette marketplace. OBJECTIVE The goal of this study was to examine public discourse referencing both Puff Bar and JUUL products in the same conversation on Twitter. METHODS We collected data from Twitter’s streaming application programming interface between July 16, 2019, and August 29, 2020, which included both “Puff Bar” and “JUUL” (n=2632). We then used an inductive approach to become familiar with the data and generate a codebook to identify common themes. Saturation was determined to be reached with 10 themes. RESULTS Posts often mentioned flavors, dual use, design features, youth use, health risks, switching 1 product for the other, price, confusion over the differences between products, longevity of the products, and nicotine concentration. CONCLUSIONS By examining the public’s conversations about Puff Bar and JUUL products on Twitter and describing themes in posts, this study aimed to help the tobacco control community stay informed about 2 popular e-cigarette products with different device features, which can be potentially substituted for one another.
Future health communication campaigns may consider targeting the health consequences of using multiple e-cigarette products or dual use to reduce exposure to high levels of nicotine among younger populations.
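The inclusion criterion described in the methods (posts that mention both “Puff Bar” and “JUUL”) can be sketched as a simple keyword co-occurrence filter. This is only an illustration under assumptions: the sample posts are invented, the function name `mentions_both` is hypothetical, and the abstract does not state the exact matching rules the authors used.

```python
import re

# Case-insensitive patterns; "puff bar" may be written with or without a space.
PUFF_BAR = re.compile(r"puff\s*bar", re.IGNORECASE)
JUUL = re.compile(r"juul", re.IGNORECASE)


def mentions_both(text: str) -> bool:
    """Return True when a post references both products."""
    return bool(PUFF_BAR.search(text)) and bool(JUUL.search(text))


# Invented example posts, not data from the study.
posts = [
    "Switched from JUUL to Puff Bar last week",
    "My juul pods ran out again",
    "puffbar flavors are better than juul tbh",
    "Puff Bar is everywhere now",
]

matched = [post for post in posts if mentions_both(post)]
print(len(matched), "of", len(posts), "posts mention both products")
```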


Analysis of structured and consistent data has seen remarkable success in past decades, whereas the analysis of unstructured data in multimedia formats remains a challenging task. YouTube is one of the most popular and widely used social media tools. It reveals community feedback through comments on published videos, the number of likes and dislikes, and the number of subscribers for a particular channel. The main objective of this work is to demonstrate, using Hadoop concepts, how data generated from YouTube can be mined and utilized to make targeted, real-time, and informed decisions. In our paper, we analyze the data to identify the top categories in which the largest number of videos are uploaded. This YouTube data is publicly available, and the YouTube data set is described below under the heading Data Set Description. The dataset is fetched from Google using the YouTube API (Application Programming Interface) and stored in the Hadoop Distributed File System (HDFS). Using MapReduce, we analyze the dataset to identify the video categories in which the largest number of videos are uploaded. The objective of this paper is to demonstrate Apache Hadoop framework concepts and how to make targeted, real-time, and informed decisions using data gathered from YouTube.
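The category count described above follows the usual map, shuffle, and reduce pattern. The sketch below imitates that pattern in plain Python on a handful of invented records; it is not Hadoop code, and the field names (`video_id`, `category`) and function names are assumptions made for illustration.

```python
from collections import defaultdict
from itertools import chain


def map_phase(record):
    """Map step: emit a (category, 1) pair for each video record."""
    yield (record["category"], 1)


def shuffle(pairs):
    """Shuffle step: group emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: sum the counts for one category."""
    return key, sum(values)


def top_categories(records, n=3):
    """Run the three steps and return the n categories with the most videos."""
    mapped = chain.from_iterable(map_phase(r) for r in records)
    grouped = shuffle(mapped)
    counts = [reduce_phase(k, v) for k, v in grouped.items()]
    # Sort by descending count, then by name so ties are deterministic.
    return sorted(counts, key=lambda kv: (-kv[1], kv[0]))[:n]


# Invented sample records standing in for rows fetched from the YouTube API.
SAMPLE = [
    {"video_id": "a1", "category": "Music"},
    {"video_id": "a2", "category": "Gaming"},
    {"video_id": "a3", "category": "Music"},
    {"video_id": "a4", "category": "News"},
    {"video_id": "a5", "category": "Gaming"},
    {"video_id": "a6", "category": "Music"},
]

print(top_categories(SAMPLE, n=2))
```

In an actual Hadoop job the map and reduce functions would run on separate nodes over HDFS blocks, and the framework would perform the shuffle.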


10.2196/26510 ◽  
2021 ◽  
Vol 23 (7) ◽  
pp. e26510
Author(s):  
Jon-Patrick Allem ◽  
Allison Dormanesh ◽  
Anuja Majmundar ◽  
Vanessa Rivera ◽  
Maya Chu ◽  
...  

Background In response to the recent government restrictions, flavored JUUL products, which are rechargeable closed-system electronic cigarettes (e-cigarettes), are no longer available for sale. However, disposable closed-system products such as the flavored Puff Bar e-cigarette continue to be available. If e-cigarette consumers simply switch between products because the current government restrictions cover 1 type of product but not another, then such restrictions would be less effective. A step forward in this line of research is to understand how the public discusses these products by examining discourse referencing both Puff Bar and JUUL in the same conversation. Twitter data provide ample opportunity to capture such early trends that could be used to help public health researchers stay abreast of the rapidly changing e-cigarette marketplace. Objective The goal of this study was to examine public discourse referencing both Puff Bar and JUUL products in the same conversation on Twitter. Methods We collected data from Twitter’s streaming application programming interface between July 16, 2019, and August 29, 2020, which included both “Puff Bar” and “JUUL” (n=2632). We then used an inductive approach to become familiar with the data and generate a codebook to identify common themes. Saturation was determined to be reached with 10 themes. Results Posts often mentioned flavors, dual use, design features, youth use, health risks, switching 1 product for the other, price, confusion over the differences between products, longevity of the products, and nicotine concentration. Conclusions By examining the public’s conversations about Puff Bar and JUUL products on Twitter and describing themes in posts, this study aimed to help the tobacco control community stay informed about 2 popular e-cigarette products with different device features, which can be potentially substituted for one another.
Future health communication campaigns may consider targeting the health consequences of using multiple e-cigarette products or dual use to reduce exposure to high levels of nicotine among younger populations.


2020 ◽  
Vol 6 (3) ◽  
pp. 205630512094070 ◽  
Author(s):  
Moreno Mancosu ◽  
Federico Vegetti

In reaction to the Cambridge Analytica scandal, Facebook has restricted access to its Application Programming Interface (API). This new policy has damaged the ability of independent researchers to study relevant topics in political and social behavior. Yet much of the public information that researchers may be interested in is still available on Facebook and can still be systematically collected through web scraping techniques. The goal of this article is twofold. First, we discuss some ethical and legal issues that researchers should consider as they plan their collection and possible publication of Facebook data. In particular, we discuss what kind of information can be ethically gathered about the users (public information), what published data should look like to comply with privacy regulations (like the GDPR), and what consequences violating Facebook’s terms of service may entail for the researcher. Second, we present a scraping routine for public Facebook posts, and discuss some technical adjustments that can be performed for the data to be ethically and legally acceptable. The code employs screen scraping to collect the list of reactions to a Facebook public post, and performs a one-way cryptographic hash function on the users’ identifiers to pseudonymize their personal information, while still keeping them traceable within the data. This article contributes to the debate around freedom of internet research and the ethical concerns that might arise by scraping data from the social web.
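The pseudonymization step can be illustrated with a short sketch: each identifier is passed through a one-way hash, so the same user always maps to the same token and stays traceable within the data set. The choice of HMAC-SHA256 with a secret key is an assumption made here (the abstract only says "one-way cryptographic hash function"), and the identifiers, reaction labels, and function name are invented.

```python
import hashlib
import hmac
import secrets


def pseudonymize(user_id: str, key: bytes) -> str:
    """Return a stable, non-reversible token for a user identifier."""
    # A keyed hash (HMAC) is assumed here; without the key, tokens cannot be
    # reproduced by hashing guessed identifiers.
    return hmac.new(key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


# The key is generated once per data set and kept out of the published data.
key = secrets.token_bytes(32)

# Invented (identifier, reaction) pairs standing in for scraped reactions.
reactions = [("user_123", "LIKE"), ("user_456", "LOVE"), ("user_123", "HAHA")]

pseudonymized = [(pseudonymize(uid, key), reaction) for uid, reaction in reactions]
for token, reaction in pseudonymized:
    print(token[:12], reaction)
```

The first and third rows receive the same token, so repeated activity by one user can still be linked without storing the original identifier.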


2021 ◽  
Vol 23 (06) ◽  
pp. 1672-1681
Author(s):  
Vinay Balamurali ◽  
◽  
Prof. Venkatesh S ◽  

Servers are required to monitor the health of the various I/O cards connected to them so that the appropriate personnel can be alerted to service these cards. The Data Collection Unit (DCU) is responsible for detecting the I/O cards, sending their inventory, and monitoring their health. Currently, the keys required to detect these I/O cards are manually coded into the source code, a task that is highly laborious and time-consuming. To eliminate this manual work, a Software Pluggable Module was devised that reads the I/O card-related information from the I/O component list. This software design uses data science and object-oriented programming (OOP) concepts to automate certain tasks on server systems. The proposed methodology is implemented on a Linux system. The software design is modular in nature and extensible to accommodate future requirements. Such an automation framework can be used to track information maintained in Excel spreadsheets and access it through an Application Programming Interface (API).


2019 ◽  
Vol 8 (3) ◽  
pp. 6996-7001

Data mining is a method of analyzing and exploring large blocks of data to glean meaningful trends and patterns. Today, many people rely on allopathic treatments and medicines. Data mining techniques can be applied to medical databases, which offer vast scope for both textual and visual data. In medical services, there is a large amount of obscure data that needs to be scrutinized, and data mining is the key to gaining useful knowledge from it. This paper provides an application programming interface that recommends drugs to users suffering from a particular disease, which the framework also diagnoses by analyzing the user's symptoms with machine learning algorithms. We use information from the mining procedure to determine the disease most likely to be associated with the symptoms. Patients can easily identify a disease by simply describing their problems, and the application interface reports which disease the user may have. The framework should prove helpful in critical situations where the patient cannot reach a hospital or where no professionals are available in the area. Predictive analysis is performed on the disease, resulting in drug recommendations for the user that take into account various features in the database. The experimental results can also be used in further research and for healthcare tools.

