Combining data sets of organochlorines (OCs) in human plasma for the Russian Arctic

2009 ◽  
Vol 407 (19) ◽  
pp. 5216-5222 ◽  
Author(s):  
T.M. Sandanger ◽  
E.E. Anda ◽  
A.A. Dudarev ◽  
E. Nieboer ◽  
A.V. Konoplev ◽  
...  
Author(s):  
Zeynep Baskurt ◽  
Scott Mastromatteo ◽  
Jiafen Gong ◽  
Richard F Wintle ◽  
Stephen W Scherer ◽  
...  

Abstract: Integration of next-generation sequencing (NGS) data across different research studies can improve the power of genetic association testing by increasing sample size and can obviate the need for sequencing controls. If differential genotype uncertainty across studies is not accounted for, combining data sets can produce spurious association results. We developed the Variant Integration Kit for NGS (VikNGS), a fast cross-platform software package, to enable aggregation of several data sets for rare- and common-variant genetic association analysis of quantitative and binary traits with covariate adjustment. VikNGS also includes a graphical user interface, power simulation functionality and data visualization tools. Availability: The VikNGS package can be downloaded at http://www.tcag.ca/tools/index.html. Supplementary information: Supplementary data are available at Bioinformatics online.
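The genotype-uncertainty issue the abstract describes can be made concrete with a small sketch (toy numbers, not VikNGS code): rather than hard genotype calls, each study contributes expected genotype dosages computed from its genotype probabilities, so a low-depth study's uncertainty propagates into the combined analysis instead of being silently discarded.

```python
# Toy sketch (not the VikNGS implementation): combine two sequencing
# studies via expected genotypes (dosages) derived from genotype
# probabilities, so that differential uncertainty between a high-depth
# and a low-depth study is carried into downstream association testing.

def expected_dosage(probs):
    """Expected alternate-allele count from (P(G=0), P(G=1), P(G=2))."""
    p0, p1, p2 = probs
    return 0 * p0 + 1 * p1 + 2 * p2

# Study A: high depth -> confident genotype probabilities.
study_a = [(0.98, 0.02, 0.00), (0.01, 0.97, 0.02)]
# Study B: low depth -> diffuse genotype probabilities.
study_b = [(0.60, 0.30, 0.10), (0.20, 0.50, 0.30)]

dosages = [expected_dosage(p) for p in study_a + study_b]
print(dosages)  # hard calls would give [0, 1, 0, 1]; study B's dosages differ
```

Hard-calling study B's genotypes would overstate its information content; the dosages keep the two studies on a common, uncertainty-aware footing.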


1998 ◽  
Vol 54 (6) ◽  
pp. 1450-1452 ◽  
Author(s):  
Ajay Saxena ◽  
Anna Gries ◽  
Robert Schwarzenbacher ◽  
Gerhard M. Kostner ◽  
Peter Laggner ◽  
...  

Apolipoprotein-H (apo-H, Mw ≃ 50 kDa) is a carbohydrate-rich human plasma protein that exists in blood serum both in free form and distributed among several classes of lipoproteins. Single crystals of apo-H have been obtained and crystallographic data sets have been collected. The crystals belong to the orthorhombic space group C2221, with cell dimensions a = 158.47, b = 169.25, c = 113.28 Å (at 100 K). The data indicate that the crystallographic asymmetric unit contains one tetramer of the protein.
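As a rough consistency check (not taken from the paper; the ~200 kDa tetramer mass is an assumption built from four ~50 kDa chains), one can verify that one tetramer per asymmetric unit yields a Matthews coefficient within the usual range for protein crystals:

```python
# Back-of-the-envelope check: space group C222(1) has 8 asymmetric units
# per cell, and the cell is orthorhombic, so the volume is simply a*b*c.

a, b, c = 158.47, 169.25, 113.28       # cell edges in angstroms
z = 8                                  # asymmetric units per C222(1) cell
mw_tetramer = 4 * 50_000               # Da; four ~50 kDa apo-H chains (assumed)

cell_volume = a * b * c                # angstrom^3 (orthogonal axes)
vm = cell_volume / (z * mw_tetramer)   # Matthews coefficient, A^3/Da
solvent = 1 - 1.23 / vm                # approximate solvent fraction

print(f"Vm = {vm:.2f} A^3/Da, solvent ~ {solvent:.0%}")
```

The resulting Vm of about 1.9 Å³/Da sits at the low (tightly packed) end of the typical 1.7-3.5 Å³/Da range, so a tetramer per asymmetric unit is a physically plausible assignment.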


2020 ◽  
Author(s):  
Christopher Kadow ◽  
David Hall ◽  
Uwe Ulbrich

Nowadays, climate change research relies on climate information from the past. Historical records of temperature observations form global gridded datasets such as HadCRUT4, which is investigated e.g. in the IPCC reports. However, datasets combining such records are sparse in the past, and even today they contain missing values. Here we show that machine learning technology can be applied to refill these missing climate values in observational datasets. We found that the technology of image inpainting using partial convolutions in a CUDA-accelerated deep neural network can be trained on large Earth system model experiments from the NOAA reanalysis (20CR) and the Coupled Model Intercomparison Project phase 5 (CMIP5). The derived deep neural networks are capable of independently refilling artificially introduced missing values in these experiments. The analysis shows a very high degree of reconstruction quality, even when each trained network is cross-evaluated on the other dataset. The network reconstruction outperforms other methods typically used in climate science. In the end, we will present the newly reconstructed observational dataset HadCRUT4 and discuss further investigations.
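A minimal pure-Python sketch of the idea behind partial convolutions (an illustration only, not the abstract's CUDA network): each missing cell is filled with the mean of its observed neighbours, renormalised by how many of them were actually observed, and the mask of valid cells grows with each pass.

```python
# One 3x3 masked-mean pass over a gridded field with gaps; mask[i][j] is
# 1 where data is observed. Repeated passes grow the valid region inward,
# which is the mechanism a partial-convolution layer learns to refine.

def partial_conv_pass(grid, mask):
    h, w = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    new_mask = [row[:] for row in mask]
    for i in range(h):
        for j in range(w):
            if mask[i][j]:
                continue  # keep observed values untouched
            total, count = 0.0, 0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    if 0 <= ni < h and 0 <= nj < w and mask[ni][nj]:
                        total += grid[ni][nj]
                        count += 1
            if count:                      # renormalise by observed count
                out[i][j] = total / count
                new_mask[i][j] = 1         # this cell is now "filled"
    return out, new_mask

field = [[1.0, 1.0, 1.0], [1.0, 0.0, 1.0], [1.0, 1.0, 1.0]]
mask = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]
filled, new_mask = partial_conv_pass(field, mask)
print(filled[1][1])  # mean of the eight observed neighbours
```

The trained network replaces this fixed averaging with learned convolution weights, but the mask-renormalisation step is the same.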


2018 ◽  
Author(s):  
Linh Nguyen ◽  
Stefan Naulaerts ◽  
Alexandra Bomane ◽  
Alejandra Bruna ◽  
Ghita Ghislat ◽  
...  

Abstract: Inter-tumour heterogeneity is one of cancer's most fundamental features. Patient stratification based on drug response prediction is hence needed for effective anti-cancer therapy. However, lessons from the past indicate that single-gene markers of response are rare and/or often fail to achieve a significant impact in the clinic. In this context, Machine Learning (ML) is emerging as a particularly promising complementary approach to precision oncology. Here we leverage comprehensive Patient-Derived Xenograft (PDX) pharmacogenomic data sets with dimensionality-reducing ML algorithms for this purpose. Results show that combining multiple gene alterations via ML leads to better discrimination between sensitive and resistant PDXs in 19 of the 26 analysed cases. Highly predictive ML models employing concise gene lists were found for three cases: Paclitaxel (breast cancer), Binimetinib (breast cancer) and Cetuximab (colorectal cancer). Interestingly, each of these ML models identifies some responsive PDXs that do not harbour the best actionable mutation for that case (such PDXs were missed by those single-gene markers). Moreover, ML multi-gene predictors generally retrieve a much higher proportion of treatment-sensitive PDXs than the corresponding single-gene marker. As PDXs often recapitulate clinical outcomes, these results suggest that many more patients could benefit from precision oncology if multiple ML algorithms were applied to existing clinical pharmacogenomics data, especially those algorithms generating classifiers that combine data-selected gene alterations.
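The retrieval advantage of multi-gene predictors can be illustrated with an invented toy example (the gene names and response labels below are hypothetical, not the study's data): when sensitivity can arise through any of several alterations, a combined rule recovers responders that the best single-gene marker misses.

```python
# Toy PDX cohort: (set of gene alterations present, sensitive to drug?).
pdxs = [
    ({"BRAF"}, True), ({"BRAF"}, True),
    ({"NRAS"}, True), ({"PIK3CA"}, True),
    (set(), False), ({"TP53"}, False), (set(), False),
]

def recall(predict):
    """Fraction of truly sensitive PDXs that the predictor retrieves."""
    sens = [genes for genes, is_sensitive in pdxs if is_sensitive]
    return sum(predict(genes) for genes in sens) / len(sens)

def single_gene(genes):            # best single-gene marker in this toy cohort
    return "BRAF" in genes

def multi_gene(genes):             # combined rule over data-selected genes
    return bool(genes & {"BRAF", "NRAS", "PIK3CA"})

print(recall(single_gene), recall(multi_gene))
```

Here the single-gene marker retrieves half the sensitive PDXs while the multi-gene rule retrieves all of them; the study's ML models play the role of selecting which alterations to combine.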


10.2196/22624 ◽  
2020 ◽  
Vol 22 (10) ◽  
pp. e22624 ◽  
Author(s):  
Ranganathan Chandrasekaran ◽  
Vikalp Mehta ◽  
Tejali Valkunde ◽  
Evangelos Moustakas

Background With restrictions on movement and stay-at-home orders in place due to the COVID-19 pandemic, social media platforms such as Twitter have become an outlet for users to express their concerns, opinions, and feelings about the pandemic. Individuals, health agencies, and governments are using Twitter to communicate about COVID-19. Objective The aims of this study were to examine key themes and topics of English-language COVID-19–related tweets posted by individuals and to explore the trends and variations in how the COVID-19–related tweets, key topics, and associated sentiments changed over a period of time from before to after the disease was declared a pandemic. Methods Building on the emergent stream of studies examining COVID-19–related tweets in English, we performed a temporal assessment covering the time period from January 1 to May 9, 2020, and examined variations in tweet topics and sentiment scores to uncover key trends. Combining data from two publicly available COVID-19 tweet data sets with those obtained in our own search, we compiled a data set of 13.9 million English-language COVID-19–related tweets posted by individuals. We used guided latent Dirichlet allocation (LDA) to infer themes and topics underlying the tweets, and VADER (Valence Aware Dictionary and sEntiment Reasoner) sentiment analysis to compute sentiment scores and examine weekly trends for 17 weeks. Results Topic modeling yielded 26 topics, which were grouped into 10 broader themes underlying the COVID-19–related tweets. Of the 13,937,906 examined tweets, 2,858,316 (20.51%) were about the impact of COVID-19 on the economy and markets, followed by spread and growth in cases (2,154,065, 15.45%), treatment and recovery (1,831,339, 13.14%), impact on the health care sector (1,588,499, 11.40%), and government response (1,559,591, 11.19%). 
Average compound sentiment scores were found to be negative throughout the examined time period for the topics of spread and growth of cases, symptoms, racism, source of the outbreak, and political impact of COVID-19. In contrast, we saw a reversal of sentiments from negative to positive for prevention, impact on the economy and markets, government response, impact on the health care industry, and treatment and recovery. Conclusions Identification of dominant themes, topics, sentiments, and changing trends in tweets about the COVID-19 pandemic can help governments, health care agencies, and policy makers frame appropriate responses to prevent and control the spread of the pandemic.
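The weekly-trend step can be sketched schematically (a toy word-score lexicon stands in for VADER here, and the tweets and dates are invented): score each tweet, bucket the scores by ISO week, and average per bucket to get the weekly sentiment series.

```python
from datetime import date
from collections import defaultdict
from statistics import mean

# Toy stand-in for the VADER lexicon; real compound scores also handle
# negation, intensifiers, punctuation, etc.
LEXICON = {"recovery": 0.6, "hope": 0.5, "deaths": -0.7, "fear": -0.6}

def toy_compound(text):
    return sum(LEXICON.get(w, 0.0) for w in text.lower().split())

tweets = [                                   # invented examples
    (date(2020, 3, 2), "fear and deaths rising"),
    (date(2020, 3, 4), "deaths reported"),
    (date(2020, 4, 6), "hope for recovery"),
]

weekly = defaultdict(list)
for d, text in tweets:
    weekly[d.isocalendar()[1]].append(toy_compound(text))  # bucket by ISO week

trend = {week: mean(scores) for week, scores in sorted(weekly.items())}
print(trend)
```

Plotting such a per-week series for each topic is what reveals the sentiment reversals (negative to positive) the study reports for prevention, the economy, and treatment-related topics.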


2021 ◽  
Vol 40 (2) ◽  
p. 160
Author(s):  
Andrew Geary

Lucy MacGregor highlights her 2021 Distinguished Lecture, “Multiphysics analysis: Extracting the most from diverse data sets.” She discusses how combining data sets can compensate for weaknesses in each, how utilizing gravity data improves the seismic image, the biggest obstacle in utilizing data sets, and more. Hear the full episode at https://seg.org/podcast/post/11276 .


2021 ◽  
Author(s):  
Sergey Assonov

For stable isotope data sets to be compared or combined in biogeochemical studies, their compatibility must be well understood. For δ13C measurements in greenhouse gases, the WMO GAW program has set compatibility targets of 0.010 ‰ for atmospheric CO2 and 0.020 ‰ for atmospheric methane (in background air studies [1, 2]). The direct comparison of samples between laboratories can provide only limited information, such as a snapshot for a specific time period; combining data sets produced over decades requires more effort. To produce high-quality data, reliable calibrations must be made, mutually consistent values of reference materials (RMs) must be used, and a traceability scheme that ensures low uncertainty must be implemented.
The VPDB δ13C scale provides an example of the approaches developed recently. Several problems with the existing implementation of the VPDB scale were identified between 2009 and 2016 [3]: the primary reference material (RM) NBS19 was exhausted and needed to be replaced; the δ13C of LSVEC (used to anchor the VPDB scale at negative δ13C) was found to be drifting, and its use as a RM for δ13C was discontinued [4]; other RMs available in 2016 (e.g., NBS18) could not be used to develop new RMs because their uncertainties were too large. Given that the VPDB scale is artefact-based and not supported by absolute-ratio measurements with uncertainty as low as required, the principles of value assignment on the VPDB scale needed to be revised.
To ensure that a revised scheme did not encounter similar problems (dependence on a single scale anchor), several fundamental metrological principles were considered: (i) traceability of measurement results to the primary RM, (ii) a hierarchy of calibrators and (iii) comprehensive understanding of the measurement method(s) [5]. The revised VPDB scheme [3] was applied to the new primary RM [6] and to three RMs covering a large δ13C range (extending to negative values) [7]. Values were assigned in a mutually consistent way, with uncertainties ranging from 0.010 to 0.015 ‰, depending on the assigned δ13C. Each RM value carries an assigned uncertainty that includes all known instrumental corrections, potential alterations due to storage, and an inhomogeneity assessment [6, 7]. The scheme allows the δ13C range to be expanded by developing new carbonate RMs and to be extended to matrix-based RMs.
The revised realization of the VPDB δ13C scale should provide a robust basis for improving data compatibility. The developed framework can be applied to other measurements of biogeochemical interest, such as small 17O variations (in H2O, carbonates and other samples), clumped isotopes, and various paleoclimate reconstructions. Notably, the traceability principle helps with realistic uncertainty estimation, which provides a tool for understanding constraints and limiting steps in data comparisons.
REFERENCES: [1] WMO, GAW Report No. 229, 2016. [2] WMO, GAW Report No. 242, 2018. [3] Assonov, S. et al., RCM, 2021. https://doi.org/10.1002/rcm.9018. [4] IUPAC, press release of the 2017 IUPAC meeting, https://iupac.org/standard-atomic-weights-of-14-chemical-elements-revised/. [5] De Bievre, P. et al., Pure Appl. Chem., 2011, 83(10), pp. 1873-1935. [6] Assonov, S. et al., RCM, 2020. https://doi.org/10.1002/rcm.8867. [7] Assonov, S. et al., RCM, 2021. https://doi.org/10.1002/rcm.9014
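The two-anchor calibration that such traceability schemes rely on can be sketched as follows (the anchor values below are illustrative placeholders, not the assigned RM values): raw instrument deltas are mapped onto the scale by a linear stretch fixed by two reference materials bracketing the samples.

```python
# Two-point normalisation of a raw delta value onto a reference scale.
# Each anchor is (measured raw value, assigned scale value), in permil.

def two_point_calibrate(measured, rm1, rm2):
    (m1, a1), (m2, a2) = rm1, rm2
    slope = (a2 - a1) / (m2 - m1)      # stretch factor between the anchors
    return a1 + slope * (measured - m1)

# Hypothetical anchors bracketing the working range:
rm_high = (2.10, 1.95)      # carbonate RM near the positive end
rm_low = (-46.00, -47.10)   # RM anchoring the negative d13C range

sample_raw = -8.50
print(round(two_point_calibrate(sample_raw, rm_high, rm_low), 3))
```

The scheme's emphasis on mutually consistent RM values matters precisely because any bias in either anchor propagates linearly into every calibrated sample value.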


2018 ◽  
Vol 10 (9) ◽  
pp. 1340 ◽  
Author(s):  
Dennis Helder ◽  
Brian Markham ◽  
Ron Morfitt ◽  
Jim Storey ◽  
Julia Barsi ◽  
...  

Combining data from multiple sensors into a single seamless time series, also known as data interoperability, has the potential to unlock new understanding of how the Earth functions as a system. However, our ability to produce these advanced data sets is hampered by differences in the design and function of the various optical remote-sensing satellite systems. A key factor is the impact that calibration of these instruments has on data interoperability. To address this issue, a workshop with a panel of experts was convened in conjunction with the Pecora 20 conference to focus on data interoperability between the Landsat and Sentinel-2 sensors. Four major areas of recommendation were the outcome of the workshop. The first was to improve communication between satellite agencies and the remote-sensing community. The second was to adopt a collections-based approach to processing the data. As expected, the third recommendation was to improve calibration methodologies in several specific areas. The last, and most ambitious, of the four was to develop a comprehensive process for validating surface reflectance products produced from the data sets. Collectively, these recommendations have significant potential for improving satellite sensor calibration in a focused manner that can directly catalyze efforts to develop data that are closer to being seamlessly interoperable.
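One common harmonisation step in this kind of interoperability work, a per-band linear gain/offset fitted between near-coincident observations from two sensors, can be sketched with invented reflectance values (the coefficients below are not published Landsat/Sentinel-2 adjustments):

```python
# Fit y = gain * x + offset by ordinary least squares, then apply it to
# map one sensor's surface reflectances onto the other's radiometric scale.

def fit_gain_offset(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    gain = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))
    return gain, my - gain * mx

# Hypothetical near-coincident red-band surface reflectances.
sensor_a = [0.05, 0.10, 0.20, 0.30]      # e.g. a Sentinel-2-like band
sensor_b = [0.054, 0.104, 0.203, 0.302]  # e.g. a Landsat-like band

gain, offset = fit_gain_offset(sensor_a, sensor_b)
harmonised = [gain * x + offset for x in sensor_a]
print(round(gain, 3), round(offset, 4))
```

Residual disagreement after such a fit is exactly what the workshop's calibration and surface-reflectance-validation recommendations aim to reduce.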

