Sketching and Sublinear Data Structures in Genomics

2019 ◽  
Vol 2 (1) ◽  
pp. 93-118 ◽  
Author(s):  
Guillaume Marçais ◽  
Brad Solomon ◽  
Rob Patro ◽  
Carl Kingsford

Large-scale genomics demands computational methods that scale sublinearly with the growth of data. We review several data structures and sketching techniques that have been used in genomic analysis methods. Specifically, we focus on four key ideas that take different approaches to achieve sublinear space usage and processing time: compressed full-text indices, approximate membership query data structures, locality-sensitive hashing, and minimizers schemes. We describe these techniques at a high level and give several representative applications of each.

2019 ◽  
Author(s):  
Adrian Concepcion Leon ◽  
Markus Endler

Blockchain and Tangle are data structures used to create an immutable public record of data, secured by a network of peer-to-peer participants who maintain a set of constantly growing data records known as ledgers. Blockchain and Tangle technologies offer a decentralized solution that guarantees the exchange of large amounts of trusted messages among billions of connected IoT devices; these messages are valuable because they are valid and complete. This highly encrypted and secure peer-to-peer messaging mechanism is adopted in this project to manage the processing of IoT transactions and the coordination between the devices that interact with the process. To keep transactions private, secure and trustless, distributed consensus algorithms are responsible for validating and choosing transactions and recording them in the global ledger. The results showed that the speed of the consensus algorithms can affect the real-time creation of reliable stories that track the events of the IoT networks. After incorporating Complex Event Processing, which allows selecting only high-level events, an improvement is possible in many situations. The result is a middleware system that provides a framework for the construction of large-scale computer applications that use Complex Event Processing and different decentralized ledgers, such as the Ethereum blockchain or the IOTA Tangle, for secure data storage.


2019 ◽  
Author(s):  
S Arredondo-Alonso ◽  
J Top ◽  
AC Schürch ◽  
A McNally ◽  
S Puranen ◽  
...  

Abstract Enterococcus faecium is a gut commensal of many mammals but is also recognized as a major nosocomial human pathogen, as it is listed on the WHO global priority list of multi-drug resistant organisms. Previous research has suggested that nosocomial strains have multiple zoonotic origins and are only distantly related to those involved in human commensal colonization. Here we present the first comprehensive population-wide joint genomic analysis of hospital, commensal and animal isolates using both short- and long-read sequencing techniques. This enabled us to investigate the population plasmidome, core genome variation and genome architecture in detail, using a combination of machine learning, population genomics and genome-wide co-evolution analysis. We observed a high level of genome plasticity with large-scale inversions and heterogeneous chromosome sizes, collectively painting a high-resolution picture of the adaptive landscape of E. faecium, and identified plasmids as the main indicator for host-specificity. Given the increasing availability of long-read sequencing technologies, our approach could be widely applied to other human and animal pathogen populations to unravel fine-scale mechanisms of their evolution.


Author(s):  
Georgi Derluguian

The author develops ideas about the origin of social inequality during the evolution of human societies and reflects on the possibilities of its overcoming. What makes human beings different from other primates is a high level of egalitarianism and altruism, which contributed to more successful adaptability of human collectives at early stages of the development of society. The transition to agriculture, coupled with substantially increasing population density, was marked by the emergence and institutionalisation of social inequality based on the inequality of tangible assets and symbolic wealth. Then, new institutions of warfare came into existence, and they were aimed at conquering and enslaving the neighbours engaged in productive labour. While exercising control over nature, people also established and strengthened their power over other people. Chiefdom as a new type of polity came into being. Elementary forms of power (political, economic and ideological) served as a basis for the formation of early states. The societies in those states were characterised by social inequality and cruelties, including slavery, mass violence and numerous victims. Nowadays, the old elementary forms of power that are inherent in personalistic chiefdom are still functioning along with modern institutions of public and private bureaucracy. This constitutes the key contradiction of our time, which is the juxtaposition of individual despotic power and public infrastructural one. However, society is evolving towards an ever more efficient combination of social initiatives with the sustainability and viability of large-scale organisations.


Genetics ◽  
2001 ◽  
Vol 159 (4) ◽  
pp. 1765-1778
Author(s):  
Gregory J Budziszewski ◽  
Sharon Potter Lewis ◽  
Lyn Wegrich Glover ◽  
Jennifer Reineke ◽  
Gary Jones ◽  
...  

Abstract We have undertaken a large-scale genetic screen to identify genes with a seedling-lethal mutant phenotype. From screening ~38,000 insertional mutant lines, we identified >500 seedling-lethal mutants, completed cosegregation analysis of the insertion and the lethal phenotype for >200 mutants, molecularly characterized 54 mutants, and provided a detailed description for 22 of them. Most of the seedling-lethal mutants seem to affect chloroplast function because they display altered pigmentation and affect genes encoding proteins predicted to have chloroplast localization. Although a high level of functional redundancy in Arabidopsis might be expected because 65% of genes are members of gene families, we found that 41% of the essential genes found in this study are members of Arabidopsis gene families. In addition, we isolated several interesting classes of mutants and genes. We found three mutants in the recently discovered nonmevalonate isoprenoid biosynthetic pathway and mutants disrupting genes similar to Tic40 and tatC, which are likely to be involved in chloroplast protein translocation. Finally, we directly compared T-DNA and Ac/Ds transposon mutagenesis methods in Arabidopsis on a genome scale. In each population, we found only about one-third of the insertion mutations cosegregated with a mutant phenotype.


1979 ◽  
Vol 6 (2) ◽  
pp. 70-72
Author(s):  
T. A. Coffelt ◽  
F. S. Wright ◽  
J. L. Steele

Abstract A new method of harvesting and curing breeder's seed peanuts in Virginia was initiated that would 1) reduce the labor requirements, 2) maintain a high level of germination, 3) maintain varietal purity at 100%, and 4) reduce the risk of frost damage. Three possible harvesting and curing methods were studied. The traditional stack-pole method satisfied the last 3 objectives, but not the first. The windrow-combine method satisfied the first 2 objectives, but not the last 2. The direct harvesting method satisfied all four objectives. The experimental equipment and curing procedures for direct harvesting had been developed but not tested on a large scale for seed harvesting. This method has been used in Virginia to produce breeder's seed of 3 peanut varieties (Florigiant, VA 72R and VA 61R) over five years. Compared to the stack-pole method, labor requirements have been reduced, satisfactory levels of germination and varietal purity have been obtained, and the risk of frost damage has been minimized.


2012 ◽  
Vol 33 (07) ◽  
pp. 649-656 ◽  
Author(s):  
Mark Holodniy ◽  
Gina Oda ◽  
Patricia L. Schirmer ◽  
Cynthia A. Lucero ◽  
Yury E. Khudyakov ◽  
...  

Objective. To determine whether improper high-level disinfection practices during endoscopy procedures resulted in bloodborne viral infection transmission. Design. Retrospective cohort study. Setting. Four Veterans Affairs medical centers (VAMCs). Patients. Veterans who underwent colonoscopy and laryngoscopy (ear, nose, and throat [ENT]) procedures from 2003 to 2009. Methods. Patients were identified through electronic health record searches and serotested for human immunodeficiency virus (HIV), hepatitis C virus (HCV), and hepatitis B virus (HBV). Newly discovered case patients were linked to a potential source with known identical infection, whose procedure occurred no more than 1 day prior to the case patient's procedure. Viral genetic testing was performed for case/proximate pairs to determine relatedness. Results. Of 10,737 veterans who underwent endoscopy at 4 VAMCs, 9,879 patients agreed to viral testing. Of these, 90 patients were newly diagnosed with 1 or more viral bloodborne pathogens (BBPs). There were no case/proximate pairings found for patients with either HIV or HBV; 24 HCV case/proximate pairings were found, of which 7 case patients and 8 proximate patients had sufficient viral load for further genetic testing. Only 2 of these cases, both of whom underwent laryngoscopy, and their 4 proximates agreed to further testing. None of the 4 remaining proximate patients who underwent colonoscopy agreed to further testing. Mean genetic distance between the 2 case patients and 4 proximate patients ranged from 13.5% to 19.1%. Conclusions. Our investigation revealed that exposure to improperly reprocessed ENT endoscopes did not result in viral transmission in those patients who had viral genetic analysis performed. Any potential transmission of BBPs from colonoscopy remains unknown.


Author(s):  
Brian M Forde ◽  
Andrew Henderson ◽  
Elliott G Playford ◽  
David Looke ◽  
Belinda C Henderson ◽  
...  

Abstract Background Diphtheria is a potentially fatal respiratory disease caused by toxigenic Corynebacterium diphtheriae. Although resistance to erythromycin has been recognized, β-lactam resistance in toxigenic diphtheria has not been described. Here, we report a case of fatal respiratory diphtheria caused by toxigenic C. diphtheriae resistant to penicillin and all other β-lactam antibiotics, and describe a novel mechanism of inducible carbapenem resistance associated with the acquisition of a mobile resistance element. Methods Long-read whole-genome sequencing was performed using Pacific Biosciences Single Molecule Real-Time sequencing to determine the genome sequence of C. diphtheriae BQ11 and the mechanism of β-lactam resistance. To investigate the phenotypic inducibility of meropenem resistance, short-read sequencing was performed using an Illumina NextSeq500 sequencer on the strain both with and without exposure to meropenem. Results BQ11 demonstrated high-level resistance to penicillin (benzylpenicillin minimum inhibitory concentration [MIC] ≥ 256 μg/mL), β-lactam/β-lactamase inhibitors and cephalosporins (amoxicillin/clavulanic acid MIC ≥ 256 μg/mL; ceftriaxone MIC ≥ 8 μg/L). Genomic analysis of BQ11 identified acquisition of a novel transposon carrying the penicillin-binding protein (PBP) Pbp2c, responsible for resistance to penicillin and cephalosporins. When strain BQ11 was exposed to meropenem, selective pressure drove amplification of the transposon in a tandem array and led to a corresponding change from a low-level to a high-level meropenem-resistant phenotype. Conclusions We have identified a novel mechanism of inducible antibiotic resistance whereby isolates that appear to be carbapenem susceptible on initial testing can develop in vivo resistance to carbapenems with repeated exposure. This phenomenon could have significant implications for the treatment of C. diphtheriae infection, and may lead to clinical failure.


2019 ◽  
Vol 9 (1) ◽  
Author(s):  
Mohamed A. Farag ◽  
Moamen M. Elmassry ◽  
Masahiro Baba ◽  
Renée Friedman

Abstract Previous studies have shown that the Ancient Egyptians used malted wheat and barley as the main ingredients in beer brewing, but the chemical determination of the exact recipe is still lacking. To investigate the constituents of ancient beer, we conducted detailed IR- and GC-MS-based metabolite analyses targeting volatile and non-volatile metabolites on the residues recovered from the interior of vats in what is currently the world’s oldest (c. 3600 BCE) installation for large-scale beer production, located at the major pre-pharaonic political center at Hierakonpolis, Egypt. In addition to distinguishing the chemical signatures of various flavoring agents, such as dates, a significant result of our analysis is the finding, for the first time, of phosphoric acid at high levels, probably used as a preservative much like in modern beverages. This suggests that the early brewers had acquired the knowledge needed to efficiently produce and preserve large quantities of beer. This study provides the most detailed chemical profile of an ancient beer using modern spectrometric techniques and provides evidence for the likely starting materials used in beer brewing.


Author(s):  
Lucas Meyer de Freitas ◽  
Oliver Schuemperlin ◽  
Milos Balac ◽  
Francesco Ciari

This paper shows an application of the multiagent, activity-based transport simulation MATSim to evaluate equity effects of a congestion charging scheme. A cordon pricing scheme was set up for a scenario of the city of Zurich, Switzerland, to conduct such an analysis. Equity is one of the most important barriers toward the implementation of a congestion charging system. After the challenges posed by equity evaluations are examined, it is shown that agent-based simulations with heterogeneous values of time allow for an increased level of detail in such evaluations. Such detail is achieved through a high level of disaggregation and with a 24-h simulation period. An important difference from traditional large-scale models is the low degree of correlation between travel time savings and welfare change. While traditional equity analysis is based on travel time savings, MATSim shows that choice dimensions not included in traditional models, such as departure time changes, can also play an important role in equity effects. The analysis of the results in light of evidence from the literature shows that agent-based models are a promising tool to conduct more complete equity evaluations not only of congestion charges but also of transport policies in general.


2015 ◽  
Vol 28 (17) ◽  
pp. 6743-6762 ◽  
Author(s):  
Catherine M. Naud ◽  
Derek J. Posselt ◽  
Susan C. van den Heever

Abstract The distribution of cloud and precipitation properties across oceanic extratropical cyclone cold fronts is examined using four years of combined CloudSat radar and CALIPSO lidar retrievals. The global annual mean cloud and precipitation distributions show that low-level clouds are ubiquitous in the postfrontal zone while higher-level cloud frequency and precipitation peak in the warm sector along the surface front. Increases in temperature and moisture within the cold front region are associated with larger high-level but lower mid-/low-level cloud frequencies and precipitation decreases in the cold sector. This behavior seems to be related to a shift from stratiform to convective clouds and precipitation. Stronger ascent in the warm conveyor belt tends to enhance cloudiness and precipitation across the cold front. A strong temperature contrast between the warm and cold sectors also encourages greater post-cold-frontal cloud occurrence. While the seasonal contrasts in environmental temperature, moisture, and ascent strength are enough to explain most of the variations in cloud and precipitation across cold fronts in both hemispheres, they do not fully explain the differences between Northern and Southern Hemisphere cold fronts. These differences are better explained when the impact of the contrast in temperature across the cold front is also considered. In addition, these large-scale parameters do not explain the relatively large frequency in springtime postfrontal precipitation.

