dbTMM: an integrated database of large-scale cohort, genome and clinical data for the Tohoku Medical Megabank Project

2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Soichi Ogishima ◽  
Satoshi Nagaie ◽  
Satoshi Mizuno ◽  
Ryosuke Ishiwata ◽  
Keita Iida ◽  
...  

Abstract To reveal gene-environment interactions underlying common diseases and estimate the risk for common diseases, the Tohoku Medical Megabank (TMM) project has conducted prospective cohort studies and genomic and multiomics analyses. To establish an integrated biobank, we developed an integrated database called “dbTMM” that incorporates both the individual cohort/clinical data and the genome/multiomics data of 157,191 participants in the Tohoku Medical Megabank project. To our knowledge, dbTMM is the first database to store individual whole-genome data on a variant-by-variant basis as well as cohort/clinical data for over one hundred thousand participants in a prospective cohort study. dbTMM enables us to stratify our cohort by both genome-wide genetic factors and environmental factors, and it provides a research and development platform that enables prospective analysis of large-scale data from genome cohorts.

2021 ◽  
Author(s):  
Runqing Yang ◽  
Yuxin Song ◽  
Li Jiang ◽  
Zhiyu Hao ◽  
Runqing Yang

Abstract Complex computation and approximate solutions hinder the application of generalized linear mixed models (GLMM) to genome-wide association studies. We extended GRAMMAR to handle binary diseases by treating genomic breeding values (GBVs) estimated in advance as a known predictor in genomic logit regression, and then controlled polygenic effects by regulating genomic heritability downward. Using simulations and case analyses, we showed that, in optimizing GRAMMAR, polygenic effects and genomic controls could be evaluated using fewer sampled markers, which greatly simplified GLMM-based association analysis in large-scale data. In addition, joint analysis of quantitative trait nucleotide (QTN) candidates chosen by multiple testing offered significantly improved statistical power to detect QTNs over existing methods.


2019 ◽  
Author(s):  
Sankar Subramanian ◽  
Umayal Ramasamy ◽  
David Chen

In the past decades a number of software programs have been developed to deduce the phylogenetic relationship between populations. However, these programs are not suited for large-scale whole genome data. Recently, a few standalone or web applications have been developed to handle genome-wide data, but they were either computationally intensive, dependent on third party software or required significant time and resources of a web server. In the post-genomic era, researchers are able to obtain bioinformatically processed, high-quality, publication-ready whole genome data for many individuals in a population from next generation sequencing companies due to the reduction in the cost of sequencing and analysis. Such genotype data is typically presented in the Variant Call Format (VCF) and there is no simple software available that uses this data to construct the phylogeny of populations in a short time. To address this limitation, we have developed a one-click, user-friendly software, VCF2PopTree, that uses genome-wide SNPs to construct and display phylogenetic trees in seconds to minutes. For example, it reads a 1 GB VCF file and draws a tree in less than 5 minutes. VCF2PopTree accepts genotype data from a local machine, constructs a tree using UPGMA and Neighbour-Joining algorithms and displays it on a web-browser. It also produces a pairwise-diversity matrix in MEGA and PHYLIP file formats as well as trees in the Newick format which could be directly used by other popular phylogenetic software programs. The software including the source code, a test VCF input file and short documentation are available at: https://github.com/sansubs/vcf2pop.
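The abstract above mentions building a tree with UPGMA from a pairwise-diversity matrix and exporting it in Newick format. The following is a minimal sketch of that general technique, not VCF2PopTree's actual code: the function name `upgma_newick`, the population labels, and the toy distance values are all hypothetical and chosen only for illustration.

```python
from itertools import combinations


def upgma_newick(names, dist):
    """Build a UPGMA tree from a symmetric distance matrix and return Newick text.

    `names` and `dist` are illustrative inputs (hypothetical), not a real file format.
    """
    # Active clusters: id -> (newick label, number of leaves, node height)
    clusters = {i: (name, 1, 0.0) for i, name in enumerate(names)}
    # Distances between active clusters, keyed by the pair of cluster ids
    d = {frozenset((i, j)): dist[i][j]
         for i, j in combinations(range(len(names)), 2)}
    next_id = len(names)

    while len(clusters) > 1:
        # Join the two closest clusters
        pair = min(d, key=lambda p: d[p])
        a, b = sorted(pair)
        (la, na, ha), (lb, nb, hb) = clusters[a], clusters[b]
        height = d[pair] / 2.0  # UPGMA places the new node at half the distance
        label = f"({la}:{height - ha:.3f},{lb}:{height - hb:.3f})"
        del clusters[a], clusters[b]

        # Distance from the merged cluster to each remaining cluster is the
        # size-weighted average of the two old distances
        merged = {
            c: (na * d[frozenset((a, c))] + nb * d[frozenset((b, c))]) / (na + nb)
            for c in clusters
        }
        # Drop every distance that involved the two joined clusters
        d = {p: v for p, v in d.items() if a not in p and b not in p}
        for c, v in merged.items():
            d[frozenset((next_id, c))] = v

        clusters[next_id] = (label, na + nb, height)
        next_id += 1

    return next(iter(clusters.values()))[0] + ";"


# Toy pairwise distances between three hypothetical populations
tree = upgma_newick(["A", "B", "C"], [[0, 2, 6], [2, 0, 6], [6, 6, 0]])
print(tree)  # (C:3.000,(A:1.000,B:1.000):2.000);
```

The resulting string can be pasted into any viewer that reads Newick. Neighbour-Joining, the other algorithm named in the abstract, differs in that it does not assume a constant rate along all branches; it is not shown here.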


2019 ◽  
Author(s):  
Sankar Subramanian ◽  
Umayal Ramasamy ◽  
David Chen

In the past decades a number of software programs have been developed to deduce the phylogenetic relationship between populations. However, these programs are not suited for large-scale whole genome data. Recently, a few standalone or web applications have been developed to handle genome-wide data, but they were either computationally intensive, dependent on third party software or required significant time and resources of a web server. In the post-genomic era, researchers are able to obtain bioinformatically processed, high-quality, publication-ready whole genome data for many individuals in a population from next generation sequencing companies due to the reduction in the cost of sequencing and analysis. Such genotype data is typically presented in the Variant Call Format (VCF) and there is no simple software available that uses this data to construct the phylogeny of populations in a short time. To address this limitation, we have developed a one-click, user-friendly software, VCF2PopTree, that uses genome-wide SNPs to construct and display phylogenetic trees in seconds to minutes. For example, it reads a 1 GB VCF file and draws a tree in less than 5 minutes. VCF2PopTree accepts genotype data from a local machine, constructs a tree using UPGMA and Neighbour-Joining algorithms and displays it on a web-browser. It also produces a pairwise-diversity matrix in MEGA and PHYLIP file formats as well as trees in the Newick format which could be directly used by other popular phylogenetic software programs. The software including the source code, a test VCF input file and short documentation are available at: http://sankarsubramanian.net/dat/index.html.


2015 ◽  
Author(s):  
Liya Wang ◽  
Peter Van Buren ◽  
Doreen Ware

Over the past few years, cloud-based platforms have been proposed to address storage, management, and computation of large-scale data, especially in the field of genomics. However, for collaborative efforts involving multiple institutes, data transfer and management, interoperability and standardization among different platforms have imposed new challenges. This paper proposes a distributed bioinformatics platform that can combine local clusters with remote computational clusters for genomic analysis using a unified bioinformatics workflow. The platform is built with a data server configured with iRODS, a computation cluster authenticated with the iPlant Agave system, and a web server to interact with the platform. A Genome-Wide Association Study workflow is integrated to validate the feasibility of the proposed approach.


2020 ◽  
Vol 33 (3-4) ◽  
pp. 160-174 ◽  
Author(s):  
Jacy L. Young

In the late 19th century, the questionnaire was one means of taking the case study into the multitudes. This article engages with Forrester’s idea of thinking in cases as a means of interrogating questionnaire-based research in early American psychology. Questionnaire research was explicitly framed by psychologists as a practice involving both natural historical and statistical forms of scientific reasoning. At the same time, questionnaire projects failed to successfully enact the latter aspiration in terms of synthesizing masses of collected data into a coherent whole. Difficulties in managing the scores of descriptive information questionnaires generated ensured the continuing presence of individuals in the results of this research, as the individual case was excerpted and discussed alongside a cast of others. As a consequence, questionnaire research embodied an amalgam of case, natural historical, and statistical thinking. Ultimately, large-scale data collection undertaken with questionnaires failed in its aim to construct composite exemplars or ‘types’ of particular kinds of individuals; to produce the singular from the multitudes.


Author(s):  
Ronald M. Baecker

Safety is often confused with security. A system or an environment may be secure, but if its normal operation does not achieve the intended goals, it may not be safe. Events will not progress as intended, and could go horribly wrong, even to the extent of grave injuries and loss of life. The more society relies upon digital technologies, the more we count on software to assure our safety. The issue of safety arises in a great variety of circumstances. Our discussion will start with dangers to the individual, then we will widen our focus to the organization, to society, and, finally, to the world. The digital divide that discourages internet use among older adults is due in part to threats posed to safe use of computers by ‘evil’ software such as programs that ‘phish’ for personal information, thereby gaining access to finances and committing identity theft, as we have discussed in the previous chapter. We shall enlarge upon this discussion by speaking of another risk—computer rage, which is caused by frustration when users cannot understand or manage the technology. Such instances are especially dangerous for senior citizens. We shall also discuss two ways in which the internet may not be safe for younger people: cyberbullying and revenge porn. We then examine a topic that arises in daily life: safety threats caused to pedestrians, bicyclists, and drivers by the continual use of distracting mobile devices. Our inability to control the costs of large-scale data processing implementations is a threat to the safety and health of organizations and governments, as is our inability to understand, modify, and fix large software systems that are no longer maintained by their creators. We shall describe several software disasters, both during their development and after they have been deployed and used. 
These include the software crisis at the turn of the century—the Y2K threat—which actually was averted, and several cases in which up to billions of dollars or pounds were wasted, including the decades-long saga of air traffic control in the USA.


Author(s):  
S. Ghatan ◽  
A. Costantini ◽  
R. Li ◽  
C. De Bruin ◽  
N. M. Appelman-Dijkstra ◽  
...  

Abstract Purpose of Review: Fractures are frequently encountered in paediatric practice. Although recurrent fractures in children usually unveil a monogenic syndrome, paediatric fracture risk could be shaped by the individual genetic background influencing the acquisition of bone mineral density, and therefore, the skeletal fragility as shown in adults. Here, we examine paediatric fractures from the perspective of monogenic and complex trait genetics. Recent Findings: Large-scale genome-wide studies in children have identified ~44 genetic loci associated with fracture or bone traits whereas ~35 monogenic diseases characterized by paediatric fractures have been described. Summary: Genetic variation can predispose to paediatric fractures through monogenic risk variants with a large effect and polygenic risk involving many variants of small effects. Studying genetic factors influencing peak bone attainment might help in identifying individuals at higher risk of developing early-onset osteoporosis and discovering drug targets to be used as bone restorative pharmacotherapies to prevent, or even reverse, bone loss later in life.


F1000Research ◽  
2019 ◽  
Vol 7 ◽  
pp. 620 ◽  
Author(s):  
Parashkev Nachev ◽  
Geraint Rees ◽  
Richard Frackowiak

Translation in cognitive neuroscience remains beyond the horizon, brought no closer by supposed major advances in our understanding of the brain. Unless our explanatory models descend to the individual level—a cardinal requirement for any intervention—their real-world applications will always be limited. Drawing on an analysis of the informational properties of the brain, here we argue that adequate individualisation needs models of far greater dimensionality than has been usual in the field. This necessity arises from the widely distributed causality of neural systems, a consequence of the fundamentally adaptive nature of their developmental and physiological mechanisms. We discuss how recent advances in high-performance computing, combined with collections of large-scale data, enable the high-dimensional modelling we argue is critical to successful translation, and urge its adoption if the ultimate goal of impact on the lives of patients is to be achieved.


2020 ◽  
Vol 8 (3) ◽  
pp. 305-319 ◽  
Author(s):  
Dániel Hegedűs

The web 2.0 phenomenon and social media – without question – have reshaped our everyday experiences. These changes that they have generated affect how we consume, communicate and present ourselves, just to name a few aspects of life, and moreover, opened up new perspectives for sociology. Though many social practices persist in a somewhat altered form, brand new types of entities have emerged on different social media platforms: one of them is the video blogger. These actors have gained great visibility through so-called micro-celebrity practices and have become potential large-scale distributors of ideas, values and knowledge. Celebrities, in this case micro-celebrities (video bloggers), may disseminate such cognitive patterns through their constructed discourse which is objectified in the online space through a peculiar digital face (a social media profile) where fans can react, share and comment according to the affordances of the digital space. Most importantly, all of these interactions are accessible for scholars to examine the fan and celebrity practices of our era. This research attempts to reconstruct these discursive interactions on the Facebook pages of ten top Hungarian video bloggers. All findings are based on a large-scale data collection using the Netvizz application. As part of the interpretation of the results, a further consideration was that celebrity discourses may be a sort of disciplinary force in (post)modern society, which normalizes the individual to some extent by providing adequate schemas of attitude, mentality and ways of consumption.

