Integration of EGA secure data access into Galaxy

F1000Research ◽  
2016 ◽  
Vol 5 ◽  
pp. 2841 ◽  
Author(s):  
Youri Hoogstrate ◽  
Chao Zhang ◽  
Alexander Senf ◽  
Jochem Bijlard ◽  
Saskia Hiltemann ◽  
...  

High-throughput molecular profiling techniques routinely generate vast amounts of data for translational medicine studies. Because these data are personally identifiable, secure access-controlled systems are needed to manage, store, transfer and distribute them. The European Genome-phenome Archive (EGA) was created to facilitate the long-term archiving of bio-molecular data and to manage access to it. Each data provider is responsible for ensuring that a Data Access Committee is in place to grant access to data stored in the EGA, and the transfer of data during upload and download is encrypted. ELIXIR, a European research infrastructure for life-science data, initiated a project (2016 Human Data Implementation Study) to understand and document the ELIXIR requirements for secure management of controlled-access data. As part of this project, a full ecosystem was designed to connect archived raw experimental molecular profiling data with interpreted data and the computational workflows, using the CTMM Translational Research IT (CTMM-TraIT) infrastructure http://www.ctmm-trait.nl as an example. Here we present the first outcome of this project: a framework that enables the secure download of EGA data to a Galaxy server. Galaxy provides an intuitive user interface for molecular biologists and bioinformaticians to design and run data analysis workflows. More specifically, we developed a tool, ega_download_streamer, that can download data securely from the EGA into a Galaxy server, where the data can subsequently be processed further. This tool allows a user to run, within the browser, an entire analysis involving sensitive EGA data, and to make that analysis available to other researchers in a reproducible manner, as demonstrated by a proof-of-concept study. The tool ega_download_streamer is available in the Galaxy tool shed: https://toolshed.g2.bx.psu.edu/view/yhoogstrate/ega_download_streamer.
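To make the flow concrete, below is a minimal sketch of the kind of authenticated, streamed download that a tool like ega_download_streamer automates. It assumes a token-based REST API; the base URL, endpoint paths, and field names here are illustrative placeholders, not the actual EGA API.

```python
# Hedged sketch: authenticate against a controlled-access archive and stream a
# file into a directory a Galaxy tool can hand back to the server. Endpoints
# and JSON field names are hypothetical.
import os
import requests

EGA_BASE = "https://ega.example.org/api"  # placeholder base URL, not the real EGA endpoint


def download_ega_file(username: str, password: str, file_id: str, dest_dir: str) -> str:
    # 1. Authenticate and obtain a session token.
    resp = requests.post(f"{EGA_BASE}/login",
                         data={"username": username, "password": password})
    resp.raise_for_status()
    token = resp.json()["token"]

    # 2. Stream the file (encrypted in transit via HTTPS) to disk in chunks,
    #    so large BAM/FASTQ files never have to fit in memory.
    out_path = os.path.join(dest_dir, file_id)
    with requests.get(f"{EGA_BASE}/files/{file_id}",
                      headers={"Authorization": f"Bearer {token}"},
                      stream=True) as r:
        r.raise_for_status()
        with open(out_path, "wb") as fh:
            for chunk in r.iter_content(chunk_size=1 << 20):
                fh.write(chunk)
    return out_path
```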

2009 ◽  
Vol 4 (3) ◽  
pp. 33-34
Author(s):  
Laura Zayatz

Several organizations in the United States have a major interest in creating, testing, and using methods of data presentation that respect privacy and assure confidentiality. The following are among those that do so, and provide up-to-date information on these topics for the benefit of others who conduct human research: (1) The Committee on Privacy and Confidentiality of the American Statistical Association; (2) an interagency committee of the federal government, the Federal Committee on Statistical Methodology, and its subcommittees, the Confidentiality and Data Access Committee and the Committee on Privacy; (3) the Inter-university Consortium for Political and Social Research (University of Michigan), whose core mission is to archive important social science data, provide open and equitable access to data, and promote the effective use of data; and (4) Carnegie Mellon University's Department of Statistics, which has created an open-access online journal, the Journal on Privacy and Confidentiality. These resources are described, and URLs are provided to give readers web access to these resources.


Logistics ◽  
2021 ◽  
Vol 5 (3) ◽  
pp. 46
Author(s):  
Houssein Hellani ◽  
Layth Sliman ◽  
Abed Ellatif Samhat ◽  
Ernesto Exposito

Data transparency is essential in the modern supply chain to improve trust and boost collaboration among partners. In this context, Blockchain is a promising technology to provide full transparency across the entire supply chain. However, Blockchain was originally designed to provide full transparency and uncontrolled data access, which leads many market actors to avoid it for fear of losing confidentiality. In this paper, we highlight the requirements and challenges of supply chain transparency. We then investigate a set of supply chain projects that tackle data transparency issues by utilizing Blockchain in their core platform in different ways, and we analyze the techniques and tools these projects use to customize transparency. Our analysis shows that further enhancements are needed to strike a balance between the data transparency and the process opacity required by different partners, to ensure the confidentiality of their processes, and to control access to sensitive data.
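One widely used technique for this balance, sketched below under stated assumptions, is a commitment scheme: only a salted hash of a record goes on-chain (giving verifiable transparency), while the record itself is disclosed off-chain to authorized partners only. The record fields and function names are illustrative, not taken from any of the surveyed projects.

```python
# Hedged sketch: on-chain commitment, off-chain selective disclosure.
import hashlib
import json
import os


def commit(record: dict) -> tuple[bytes, bytes]:
    # The digest is published on-chain; the salt and record stay off-chain
    # with the data owner, who shares them only with authorized partners.
    salt = os.urandom(16)
    payload = salt + json.dumps(record, sort_keys=True).encode()
    return salt, hashlib.sha256(payload).digest()


def verify(record: dict, salt: bytes, onchain_digest: bytes) -> bool:
    # An authorized partner recomputes the hash to check the disclosed record
    # against the immutable on-chain value.
    payload = salt + json.dumps(record, sort_keys=True).encode()
    return hashlib.sha256(payload).digest() == onchain_digest


shipment = {"sku": "A-1042", "qty": 500, "origin": "plant-3"}  # illustrative record
salt, digest = commit(shipment)
assert verify(shipment, salt, digest)
```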


2020 ◽  
Vol 22 (Supplement_3) ◽  
pp. iii317-iii317
Author(s):  
Emily Owens Pickle ◽  
Ana Aguilar-Bonilla ◽  
Amy Smith

Abstract The current consensus is that diagnosis and treatment of ependymoma should be based upon clinical and molecular classification. As we move into this paradigm, it is important that all ependymoma cases undergo tumor collection, preservation, and molecular profiling at diagnosis. Our group of 6 sites gathered data on a cohort of 72 ependymoma cases. Sites were asked to report known molecular findings; 60/68 eligible cases (88%) did not include genetic findings. The low number of cases with molecular findings was surprising, and since cases were diagnosed as early as 2004, we asked collaborators to share their current profiling practice (e.g., how frequently, and in what setting, ependymomas were sent for testing) to better understand current practice at sites. Since the publication of ependymoma molecular data, sites with a neuro-oncology program report sending almost all newly diagnosed ependymomas for molecular testing, whereas current practices at sites without dedicated neuro-oncology were less consistent. Profiling in the setting of relapse was more frequently reported at all centers. The implementation of molecular testing at diagnosis may need support at sites without dedicated neuro-oncology. Lead investigators for upcoming ependymoma clinical trials will need to think carefully about the logistics of profiling at centers where this is not standard practice at diagnosis.


Plants ◽  
2021 ◽  
Vol 10 (8) ◽  
pp. 1556
Author(s):  
Dimitrios Evangelos Miliordos ◽  
Georgios Merkouropoulos ◽  
Charikleia Kogkou ◽  
Spyridon Arseniou ◽  
Anastasios Alatzas ◽  
...  

Wines produced from autochthonous Vitis vinifera varieties have an essential financial impact on the national economy of Greece. However, scientific data regarding the characteristics and quality aspects of these wines are extremely limited. The aim of the current study is to define the molecular profile and to describe the chemical and sensory characteristics of the wines produced by two autochthonous red grapevine varieties, "Karnachalades" and "Bogialamades", grown in the wider area of Soufli (Thrace, Greece). We used seven microsatellites to define the molecular profile of the two varieties, and then compared their profiles to similar molecular data from other autochthonous as well as international varieties. Grape berries were harvested at optimum technological maturity from a commercial vineyard for two consecutive vintages (2017-2018) and vinification was performed using a common vinification protocol: the 2017 vintage provided wines, from both varieties, with higher levels of phenolics and anthocyanins than 2018, whereas in the sensory analysis the "Bogialamades" wine showed a richer profile than "Karnachalades". To our knowledge, this is the first study that combines molecular profiling with an exploration of the enological potential of the rare Greek varieties "Karnachalades" and "Bogialamades"; they represent two promising varieties for the production of red wines in the historic region of Thrace.


1999 ◽  
Vol 33 (3) ◽  
pp. 55-66 ◽  
Author(s):  
L. Charles Sun

An interactive data access and retrieval system, developed at the U.S. National Oceanographic Data Center (NODC) and available at http://www.nodc.noaa.gov, is presented in this paper. The purposes of this paper are: (1) to illustrate the procedures for quality control and loading of oceanographic data into the NODC ocean databases and (2) to describe the development of a system to manage, visualize, and disseminate the NODC data holdings over the Internet. The objective of the system is to provide ease of access to the data required by data assimilation models. With advances in the scientific understanding of ocean dynamics, data assimilation models require the synthesis of data from a variety of sources. Modern intelligent data systems usually involve integrating distributed heterogeneous data and information sources. As the repository for oceanographic data, NOAA's National Oceanographic Data Center (NODC) is in a unique position to develop such a data system. In support of data assimilation needs, NODC has developed a system to facilitate browsing of the oceanographic environmental data and information available on-line at NODC. Users may select oceanographic data based on geographic areas, time periods and measured parameters. Once the selection is complete, users may produce a station location plot, produce plots of the parameters, or retrieve the data.
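The selection workflow the paper describes (area, time period, parameter, then a station location plot) can be illustrated with a small sketch. The file name and column names below are hypothetical stand-ins for the NODC holdings, and pandas/matplotlib are assumed to be available.

```python
# Hedged sketch of region/time/parameter selection followed by a station plot.
import pandas as pd  # plotting also requires matplotlib to be installed

# Hypothetical flat export with columns: lat, lon, date (ISO strings), parameter, value
stations = pd.read_csv("ocean_profiles.csv")

sel = stations[
    stations["lat"].between(20, 40)                     # geographic area (degrees N)
    & stations["lon"].between(-80, -60)                 # geographic area (degrees E)
    & stations["date"].between("1998-01-01", "1998-12-31")  # time period
    & (stations["parameter"] == "temperature")          # measured parameter
]

# Equivalent of the system's "station location plot" for the selection.
sel.plot.scatter(x="lon", y="lat", title="Station location plot")
```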


2020 ◽  
Vol 6 (1) ◽  
pp. 103-110
Author(s):  
Sidik Sidik ◽  
Ade Sudaryana ◽  
Rame Santoso

Computer networks have become an important point in companies with many branch offices that need to coordinate data transfer. PT Indo Matra Lestari's connection uses a VPN system based on the PPTP method. The Data Center acts as the VPN server, and the clients are the Head Office and the Citereup Branch Office. There is no direct connection between the Head Office and the Citereup Branch Office, so data access between them is slow, because the data must pass through the Data Center before reaching its destination. Moreover, the data accessed is private to the company and is only accessed on the local network. The solution used to create a direct and secure network path between the Head Office and the Branch Office is an EoIP tunnel on the MikroTik router. The tunneling method in EoIP can bridge networks between MikroTik devices: the EoIP tunnel appears as a virtual interface on the MikroTik router, so the routers behave as if they were connected locally, while the tunnel ID on the EoIP tunnel secures the tunneling path. Applying the EoIP tunnel makes the point-to-point connection between MikroTik devices faster in data access, because traffic is routed directly to its destination. For this EoIP tunnel connection to run optimally, network management is also needed to manage internet bandwidth usage.
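As a rough illustration of the setup described, the sketch below scripts the EoIP and bridge configuration on a MikroTik router over SSH using paramiko. The IP addresses, credentials, bridge name, and tunnel ID are examples only, and the RouterOS command syntax should be verified against your RouterOS version.

```python
# Hedged sketch: push EoIP tunnel + bridge-port configuration to a MikroTik
# router via SSH. All host/credential/interface values are placeholders.
import paramiko


def configure_eoip(host: str, user: str, password: str,
                   remote_ip: str, tunnel_id: int) -> None:
    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    client.connect(host, username=user, password=password)
    commands = (
        # Create the EoIP tunnel; the tunnel ID must match on both routers.
        f"/interface eoip add name=eoip-branch remote-address={remote_ip} "
        f"tunnel-id={tunnel_id}",
        # Bridge the tunnel so both LANs appear locally connected.
        "/interface bridge port add bridge=bridge1 interface=eoip-branch",
    )
    for cmd in commands:
        _, stdout, stderr = client.exec_command(cmd)
        print(stdout.read().decode(), stderr.read().decode())
    client.close()


# Run once per router, each pointing at the other's public address.
configure_eoip("198.51.100.1", "admin", "secret",
               remote_ip="198.51.100.2", tunnel_id=33)
```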


2021 ◽  
Author(s):  
Mark Howison ◽  
Mintaka Angell ◽  
Michael Hicklen ◽  
Justine S. Hastings

A Secure Data Enclave is a system that allows data owners to control data access and ensure data security while facilitating approved uses of data by other parties. This model of data use offers additional protections and technical controls for the data owner compared to the more commonly used approach of transferring data from the owner to another party through a data sharing agreement. Under the data use model, the data owner retains full transparency and auditing over the other party’s access, which can be difficult to achieve in practice with even the best legal instrument for data sharing. We describe the key technical requirements for a Secure Data Enclave and provide a reference architecture for its implementation on the Amazon Web Services platform using managed cloud services.
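One concrete building block of such an enclave, sketched below under stated assumptions, is a bucket policy that denies all access except from an approved analyst role, so the data owner keeps technical control while audit logging (e.g., CloudTrail, not shown) records every access. The account ID, role name, and bucket name are placeholders, and a production policy would also carve out the owner's administrative principals.

```python
# Hedged sketch: owner-controlled access to an enclave data bucket with boto3.
import json
import boto3

s3 = boto3.client("s3")

policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Sid": "DenyAllButApprovedRole",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": [
            "arn:aws:s3:::enclave-data",      # placeholder bucket
            "arn:aws:s3:::enclave-data/*",
        ],
        # Deny every principal except the role the data owner approved.
        "Condition": {
            "StringNotLike": {
                "aws:PrincipalArn": "arn:aws:iam::123456789012:role/approved-analyst"
            }
        },
    }],
}

s3.put_bucket_policy(Bucket="enclave-data", Policy=json.dumps(policy))
```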


2014 ◽  
Vol 8 (2) ◽  
pp. 13-24 ◽  
Author(s):  
Arkadiusz Liber

Introduction: Medical documentation ought to be accessible with the preservation of its integrity as well as the protection of personal data. One of the ways of protecting it against disclosure is anonymization. Contemporary methods ensure anonymity without the possibility of sensitive data access control; it seems that the future of sensitive data processing systems belongs to personalized methods. In the first part of the paper the k-Anonymity, (X,Y)-Anonymity, (α,k)-Anonymity, and (k,e)-Anonymity methods are discussed. These methods belong to the well-known elementary methods that are the subject of a significant number of publications. As source papers for this part, the works of Samarati, Sweeney, Wang, Wong and Zhang were used. The selection of these publications is justified by the wider research reviews led, for instance, by Fung, Wang, Fu and Yu. It should be noted, however, that the methods of anonymization derive from the methods of statistical database protection of the 1970s. Due to their interrelated content and literature references, the first and the second part of this article constitute an integral whole.
Aim of the study: The analysis of methods of anonymization and of methods of protecting anonymized data, and the study of a new type of privacy-enabling mechanism that lets the entity whom sensitive data concern control its disclosure.
Material and methods: Analytical methods, algebraic methods.
Results: Material supporting the choice and analysis of ways of anonymizing medical data, and a new privacy protection solution enabling the control of sensitive data by the entities whom the data concern.
Conclusions: In the paper, solutions for data anonymization that ensure privacy protection in medical data sets were analyzed. The methods of k-Anonymity, (X,Y)-Anonymity, (α,k)-Anonymity, (k,e)-Anonymity, (X,Y)-Privacy, LKC-Privacy, l-Diversity, (X,Y)-Linkability, t-Closeness, Confidence Bounding and Personalized Privacy were described, explained and analyzed, together with solutions for controlling sensitive data by their owner. Beyond the existing anonymization methods, methods of protecting anonymized data were also analyzed, in particular δ-Presence, ε-Differential Privacy, (d,γ)-Privacy, (α,β)-Distributing Privacy and protection against (c,t)-isolation. Moreover, the author introduced a new solution for the controlled protection of privacy, based on marking a protected field and the multi-key encryption of the sensitive value. The suggested way of marking the fields conforms to the XML standard. For the encryption, an (n,p) cipher with n different keys was selected; to decipher the content, p of the n keys are used. The proposed solution enables brand-new methods of controlling the privacy of sensitive data disclosure.
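For readers unfamiliar with the most elementary of the properties listed above, the following small sketch checks k-anonymity: every combination of quasi-identifier values in the released table must occur at least k times. The column names and generalized values are illustrative, not from the paper.

```python
# Hedged sketch: verify the k-anonymity property of a generalized table.
import pandas as pd


def is_k_anonymous(df: pd.DataFrame, quasi_identifiers: list[str], k: int) -> bool:
    # Every equivalence class (group of rows sharing the same quasi-identifier
    # values) must contain at least k records.
    return df.groupby(quasi_identifiers).size().min() >= k


records = pd.DataFrame({
    "zip": ["537**", "537**", "537**", "479**", "479**"],  # generalized ZIP codes
    "age": ["2*", "2*", "2*", ">=40", ">=40"],             # generalized ages
    "diagnosis": ["A", "B", "A", "C", "A"],                # sensitive attribute
})

print(is_k_anonymous(records, ["zip", "age"], k=2))  # True: both classes have >= 2 rows
```

Note that k-anonymity alone does not constrain the sensitive column, which is exactly what refinements such as l-Diversity and t-Closeness address.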


Author(s):  
Shirley Wong ◽  
Victoria Schuckel ◽  
Simon Thompson ◽  
David Ford ◽  
Ronan Lyons ◽  
...  

Introduction: There is no power for change greater than a community discovering what it cares about.1 The Health Data Platform (HDP) will democratize British Columbia's (population approximately 4.6 million) health sector data by creating common enabling infrastructure that supports cross-organization analytics and research used by both decision makers and academics. HDP will provide streamlined, proportionate processes that give timelier access to data with increased transparency for the data consumer, and will provide shared data-related services that elevate best practices by enabling consistency across data contributors, while maintaining continued stewardship of their data. HDP will be built in collaboration with Swansea University, following an agile, pragmatic approach starting with a minimum viable product.
Objectives and Approach: Build a data sharing environment that harnesses the data, understanding and expertise about health data across academe, decision makers, and clinicians in the province by enabling a common harmonized approach across the sector on:
- Data stewardship
- Data access
- Data security and privacy
- Data management
- Data standards
in order to:
- Enhance the data consumer's data access experience
- Increase process consistency and transparency
- Reduce the burden of liberating data from a data source
- Build trust in the data and what it is telling us, and therefore in the decisions made
- Increase data accessibility safely and responsibly
Working within the jurisdiction's existing legislation, the Five Safes Privacy and Security Framework will be implemented, tailored to address the requirements of data contributors.
Results: The minimum viable product will provide the necessary enabling infrastructure, including governance, to give a limited set of data consumers timelier, safe access to administrative data. The MVP will be expanded, with another release planned for early 2021.
Conclusion / Implications: Collaboration with Swansea University has enabled BC to accelerate its journey to timelier, safe access to data and to increase the maturity of its analytics by creating the enabling infrastructure that promotes collaboration and the sharing of data and data approaches.
1 Margaret Wheatley


2020 ◽  
Author(s):  
N Goonasekera ◽  
A Mahmoud ◽  
J Chilton ◽  
E Afgan

Abstract
Summary: The existence of more than 100 public Galaxy servers with service quotas is indicative of the need for an increased availability of compute resources for Galaxy to use. The GalaxyCloudRunner enables a Galaxy server to easily expand its available compute capacity by sending user jobs to cloud resources. User jobs are routed to the acquired resources based on a set of configurable rules, and the resources can be dynamically acquired from any of 4 popular cloud providers (AWS, Azure, GCP, or OpenStack) in an automated fashion.
Availability and implementation: GalaxyCloudRunner is implemented in Python and leverages Docker containers. The source code is MIT licensed and available at https://github.com/cloudve/galaxycloudrunner. The documentation is available at http://gcr.cloudve.org/.
Contact: Enis Afgan ([email protected])
Supplementary information: None
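To illustrate the rule-based routing idea in the abstract, here is a conceptual sketch of a destination chooser. It mirrors the general pattern only; the class and function names are hypothetical and are not GalaxyCloudRunner's actual API (see its documentation for the real configuration).

```python
# Hedged, conceptual sketch of rule-based job routing; not GalaxyCloudRunner code.
from dataclasses import dataclass


@dataclass
class Destination:
    name: str        # e.g. local cluster, or a node acquired from AWS/Azure/GCP/OpenStack
    max_cores: int   # capacity advertised by the destination


def choose_destination(tool_id: str, cores_needed: int,
                       destinations: list[Destination]) -> Destination:
    # Simple rule: pick the first destination with enough cores, so small jobs
    # stay local and large jobs overflow to acquired cloud resources.
    for dest in destinations:
        if dest.max_cores >= cores_needed:
            return dest
    raise RuntimeError(f"no destination can run {tool_id} with {cores_needed} cores")


dests = [Destination("local", 4), Destination("cloud-node-1", 16)]
print(choose_destination("bwa_mem", 8, dests).name)  # -> cloud-node-1
```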

