Working with batches of PDF files

Abstract Read alignment is central to many aspects of modern genomics. Most aligners use heuristics to accelerate processing, but these heuristics can fail to find the optimal alignments of reads. Alignment accuracy is typically measured through simulated reads; however, the simulated location may not be the (only) location with the optimal alignment score. Vargas implements a heuristic-free algorithm guaranteed to find the highest-scoring alignment for real sequencing reads to a linear or graph genome. With semiglobal and local alignment modes and affine gap and quality-scaled mismatch penalties, it can implement the scoring functions of commonly used aligners to calculate optimal alignments. While this is computationally intensive, Vargas uses multi-core parallelization and vectorized (SIMD) instructions to make it practical to optimally align large numbers of reads, achieving a maximum speed of 456 billion cell updates per second. We demonstrate how these “gold standard” Vargas alignments can be used to improve heuristic alignment accuracy by optimizing command-line parameters in Bowtie 2, BWA-MEM, and vg to align more reads correctly. Source code implemented in C++ and compiled binary releases are available at https://github.com/langmead-lab/vargas under the MIT license.


The Design of Eman, an Experiment Manager

Prague Bulletin of Mathematical Linguistics ◽

10.2478/pralin-2013-0003 ◽

2013 ◽

Vol 99 (1) ◽

pp. 39-56 ◽

Cited By ~ 1

Author(s):

Ondřej Bojar ◽

Aleš Tamchyna

Keyword(s):

Machine Translation ◽

Command Line ◽

Tips And Tricks ◽

Field Of Study ◽

The Core ◽

Large Numbers ◽

Experiment Management ◽

Computational Research ◽

The Many ◽

High Level

Abstract We present eman, a tool for managing large numbers of computational experiments. Over the years of our research in machine translation (MT), we have collected a number of ideas for efficient experimenting. We believe these ideas are generally applicable in (computational) research of any field. We incorporated them into eman in order to make them available in a command-line Unix environment. The aim of this article is to highlight the core of the many ideas. We hope the text can serve as a collection of experiment management tips and tricks for anyone, regardless of their field of study or the computer platform they use. The specific examples we provide in eman’s current syntax are less important, but they allow us to use concrete terms. The article thus also fills a gap in the eman documentation by providing a high-level overview.


Vargas: heuristic-free alignment for assessing linear and graph read aligners

Bioinformatics ◽

10.1093/bioinformatics/btaa265 ◽

2020 ◽

Vol 36 (12) ◽

pp. 3712-3718

Author(s):

Charlotte A Darby ◽

Ravi Gaddipati ◽

Michael C Schatz ◽

Ben Langmead

Keyword(s):

Alignment Accuracy ◽

Supplementary Information ◽

Local Alignment ◽

Maximum Speed ◽

Command Line ◽

Scoring Functions ◽

Exact Match ◽

Large Numbers ◽

Computationally Intensive ◽

Optimal Alignments

Abstract Motivation: Read alignment is central to many aspects of modern genomics. Most aligners use heuristics to accelerate processing, but these heuristics can fail to find the optimal alignments of reads. Alignment accuracy is typically measured through simulated reads; however, the simulated location may not be the (only) location with the optimal alignment score. Results: Vargas implements a heuristic-free algorithm guaranteed to find the highest-scoring alignment for real sequencing reads to a linear or graph genome. With semiglobal and local alignment modes and affine gap and quality-scaled mismatch penalties, it can implement the scoring functions of commonly used aligners to calculate optimal alignments. While this is computationally intensive, Vargas uses multi-core parallelization and vectorized (SIMD) instructions to make it practical to optimally align large numbers of reads, achieving a maximum speed of 456 billion cell updates per second. We demonstrate how these ‘gold standard’ Vargas alignments can be used to improve heuristic alignment accuracy by optimizing command-line parameters in Bowtie 2, BWA-MEM and vg to align more reads correctly. Availability and implementation: Source code implemented in C++ and compiled binary releases are available at https://github.com/langmead-lab/vargas under the MIT license. Supplementary information: Supplementary data are available at Bioinformatics online.
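
To make the idea of heuristic-free alignment with affine gap penalties concrete, the following is a minimal Python sketch of an exhaustive local-alignment score computation against a linear reference. It is not the Vargas implementation (which the abstract describes as C++ with SIMD and graph support); the function name `local_affine_score` and the scoring values are illustrative assumptions, and a gap of length k is assumed to cost `gap_open + k * gap_extend`.

```python
def local_affine_score(read, ref, match=2, mismatch=-6, gap_open=-5, gap_extend=-3):
    """Best local alignment score with affine gaps, filling every DP cell.

    Hypothetical helper for illustration only; scoring values are assumptions.
    """
    neg_inf = float("-inf")
    n = len(ref)
    # h_prev[j]: best score of an alignment ending at (previous read base, ref base j).
    # f_prev[j]: same, but the alignment ends with a read base opposite a gap.
    h_prev = [0] * (n + 1)
    f_prev = [neg_inf] * (n + 1)
    best = 0
    for i in range(1, len(read) + 1):
        h_cur = [0] * (n + 1)
        f_cur = [neg_inf] * (n + 1)
        e = neg_inf  # alignment ends with a reference base opposite a gap
        for j in range(1, n + 1):
            # Open a new gap from the best score, or extend an existing gap.
            e = max(h_cur[j - 1] + gap_open + gap_extend, e + gap_extend)
            f_cur[j] = max(h_prev[j] + gap_open + gap_extend, f_prev[j] + gap_extend)
            s = match if read[i - 1] == ref[j - 1] else mismatch
            # Local mode: a score of 0 restarts the alignment at this cell.
            h_cur[j] = max(0, h_prev[j - 1] + s, e, f_cur[j])
            best = max(best, h_cur[j])
        h_prev, f_prev = h_cur, f_cur
    return best


if __name__ == "__main__":
    print(local_affine_score("ACGT", "TTACGTTT"))
```

Because every cell of the read-by-reference matrix is updated, the cost grows with the product of the two lengths, which is why the abstract reports throughput in cell updates per second.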


A2G2: A Python wrapper to perform very large alignments in semi-conserved regions

10.1101/2020.05.21.109009 ◽

2020 ◽

Cited By ~ 1

Author(s):

Jose Sergio Hleap ◽

Melania E. Cristescu ◽

Dirk Steinke

Keyword(s):

Supplementary Information ◽

Command Line ◽

Reference Region ◽

Consensus Sequences ◽

Link Type ◽

Large Numbers ◽

Conserved Genes ◽

Local Reference ◽

Supplementary Material ◽

Efficient Parallelization

Abstract Summary: Amplicons to Global Gene (A2G2) is a Python wrapper that uses MAFFT and an “Amplicon to Gene” strategy to align very large numbers of sequences while improving alignment accuracy. It is specially developed to deal with conserved genes, where traditional aligners introduce a significant number of gaps. A2G2 leverages the add sequences option of MAFFT to align the sequences to a global reference gene and a local reference region. Both of these references can be consensus sequences of trusted sources. Efficient parallelization of these tasks allows A2G2 to align a very large number of sequences (> 500K) in a reasonable amount of time. A2G2 can be imported in Python for easier integration with other software, or can be run via the command line. Availability: A2G2 is implemented in Python 3 (3.6) and depends on MAFFT availability. Other package requirements can be found in the requirements.txt file at https://github.com/jshleap/A2G. A2G2 is also available via PyPI (https://pypi.org/project/A2G). It is licensed under the LGPLv3. Supplementary information: Supplementary material is available on GitHub as a Jupyter notebook.


Cytoplasmic Granules in Lymphocytes from Rats Dosed with SK&F 14336-D

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100066838 ◽

1971 ◽

Vol 29 ◽

pp. 556-557

Author(s):

T. G. Merrill ◽

B. J. Payne ◽

A. J. Tousimis

Keyword(s):

Lipid Storage ◽

Pokeweed Mitogen ◽

Electron Microscopic ◽

Cytoplasmic Granules ◽

Storage Diseases ◽

Tay Sachs Disease ◽

Large Numbers ◽

Microscopic Studies ◽

Neurovisceral Lipidosis

Rats given SK&F 14336-D (9-[3-Dimethylamino propyl]-2-chloroacridane), a tranquilizing drug, developed an increased number of vacuolated lymphocytes as observed by light microscopy. Vacuoles in peripheral blood of rats and humans apparently are rare and are not usually reported in differential counts. Transforming agents such as phytohemagglutinin and pokeweed mitogen induce similar vacuoles in in vitro cultures of lymphocytes. These vacuoles have also been reported in some of the lipid-storage diseases of humans such as amaurotic familial idiocy, familial neurovisceral lipidosis, lipomucopolysaccharidosis and sphingomyelinosis. Electron microscopic studies of Tay-Sachs' disease and of chloroquine-treated swine have demonstrated large numbers of “membranous cytoplasmic granules” in the cytoplasm of neurons, in addition to lymphocytes. The present study was undertaken to characterize the membranous inclusions and to develop an experimental animal model which may be used for the study of lipid storage diseases.


Prolonged Fixation Studies For Spaceflight

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100072101 ◽

1973 ◽

Vol 31 ◽

pp. 332-333

Author(s):

Robert Corbett ◽

Delbert E. Philpott ◽

Sam Black

Keyword(s):

High Pressure ◽

Living Systems ◽

Prolonged Storage ◽

Time Intervals ◽

Early Signs ◽

Large Numbers

Observation of subtle or early signs of change in spaceflight-induced alterations of living systems requires precise methods of sampling. In-flight analysis would be preferable, but constraints of time, equipment, personnel and cost dictate the necessity for prolonged storage before retrieval. Because of this, various tissues have been stored in fixatives and combinations of fixatives and observed at various time intervals. High pressure and the effect of buffer alone have also been tried. Of the various tissues embedded (muscle, cartilage and liver), liver has been the most extensively studied because it contains large numbers of organelles common to all tissues (Fig. 1).


Analysis of the Anterior Secretory Cells of Onchidoris Muricata (Nudibranchia)

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100099490 ◽

1981 ◽

Vol 39 ◽

pp. 614-615

Author(s):

Roy Skidmore

Keyword(s):

Endoplasmic Reticulum ◽

Secretory Granules ◽

Golgi Body ◽

The Body ◽

Secretory Cells ◽

Secretory Vesicles ◽

High Magnification ◽

Large Numbers ◽

Membrane Bound ◽

Sole Of The Foot

The long-necked secretory cells in Onchidoris muricata are distributed in the anterior sole of the foot. These cells are interspersed among ciliated columnar and conical cells as well as short-necked secretory gland cells. The long-necked cells contribute a significant amount of mucoid materials to the slime on which the nudibranch travels. The body of these cells is found in the subepidermal tissues. A long process extends across the basal lamina and in between cells of the epidermis to the surface of the foot. The secretory granules travel along the process and their contents are expelled by exocytosis at the foot surface. The contents of the cell body include the nucleus, some endoplasmic reticulum, and an extensive Golgi body with large numbers of secretory vesicles (Fig. 1). The secretory vesicles are membrane bound and contain a fibrillar matrix. At high magnification the similarity of the contents in the Golgi saccules and the secretory vesicles becomes apparent (Fig. 2).


Selective Sampling and Orientation of Non-Homogeneous Tissues

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100100238 ◽

1968 ◽

Vol 26 ◽

pp. 98-99

Author(s):

C. C. Clawson ◽

L. W. Anderson ◽

R. A. Good

Keyword(s):

Electron Microscopy ◽

Electron Microscope ◽

Light Microscopy ◽

Random Sampling ◽

New Method ◽

Microscope Examination ◽

Small Areas ◽

Selective Sampling ◽

Large Numbers ◽

Specific Orientation

Investigations which require electron microscope examination of a few specific areas of non-homogeneous tissues make random sampling of small blocks an inefficient and unrewarding procedure. Therefore, several investigators have devised methods which allow obtaining sample blocks for electron microscopy from regions of tissue previously identified by light microscopy. We present here techniques which make possible: 1) sampling tissue for electron microscopy from selected areas previously identified by light microscopy of relatively large pieces of tissue; 2) dehydration and embedding of large numbers of individually identified blocks while keeping each one separate; 3) a new method of maintaining specific orientation of blocks during embedding; 4) special light microscopic staining or fluorescent procedures and electron microscopy on immediately adjacent small areas of tissue.


Microanalysis of precipitates in a ferritic steel

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100110234 ◽

1978 ◽

Vol 36 (1) ◽

pp. 618-619

Author(s):

J.M. Titchmarsh

Keyword(s):

Ferritic Steel ◽

Particle Identification ◽

Energy Dispersive ◽

Objective Lens ◽

Ferritic Steels ◽

Replica Technique ◽

X Ray ◽

Large Numbers ◽

Scanning Transmission ◽

Made In

The advances in recent years in the microanalytical capabilities of conventional TEMs fitted with probe-forming lenses allow much more detailed investigations to be made of the microstructures of complex alloys, such as ferritic steels, than have been possible previously. In particular, the identification of individual precipitate particles with dimensions of a few tens of nanometers in alloys containing high densities of several chemically and crystallographically different precipitate types is feasible. The aim of the investigation described in this paper was to establish a method which allowed individual particle identification to be made in a few seconds so that large numbers of particles could be examined in a few hours. A Philips EM400 microscope, fitted with the scanning transmission (STEM) objective lens pole-pieces and an EDAX energy dispersive X-ray analyser, was used at 120 kV with a thermal W hairpin filament. The precipitates examined were extracted using a standard C replica technique from specimens of a 2¼Cr-1Mo ferritic steel in a quenched and tempered condition.


An SEM study of raphide crystal initials in the leaves of Vitis (grape)

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100141299 ◽

1995 ◽

Vol 53 ◽

pp. 984-985

Author(s):

H. J. Arnott ◽

M. A. Webb ◽

L. E. Lopez

Keyword(s):

Calcium Oxalate ◽

Water Soluble ◽

Calcium Oxalate Crystals ◽

Calcium Oxalate Crystal ◽

Oxalate Crystals ◽

Large Numbers ◽

Sem Study ◽

The Matrix ◽

Crystal Cells ◽

Crystal Idioblast

Many papers have been published on the structure of calcium oxalate crystals in plants; however, few deal with the early development of crystals. Large numbers of idioblastic calcium oxalate crystal cells are found in the leaves of Vitis mustangensis, V. labrusca and V. vulpina. A crystal idioblast, or raphide cell, will produce 150-300 needle-like calcium oxalate crystals within a central vacuole. Each raphide crystal is autonomous, having been produced in a separate membrane-defined crystal chamber; the idioblast's crystal complement is collectively embedded in a water-soluble glycoprotein matrix which fills the vacuole. The crystals are twins, each having a pointed and a bidentate end (Fig. 1); when mature they are about 0.5-1.2 μm in diameter and 30-70 μm in length. Crystal bundles, i.e., crystals and their matrix, can be isolated from leaves using 100% EtOH. If the bundles are treated with H2O, the matrix surrounding the crystals rapidly disperses.
