A family-based phasing algorithm for sequence data

2018 ◽  
Author(s):  
Mara Battagin ◽  
Serap Gonen ◽  
Roger Ros-Freixedes ◽  
Andrew Whalen ◽  
Gregor Gorjanc ◽  
...  

This paper describes a family-based phasing algorithm for variable-coverage sequence data that first minimises phasing errors and then maximises the proportion of alleles phased. The algorithm, called AlphaFamSeq, is one of the essential tools that underpin an overall strategy for generating highly accurate sequence data on whole populations at low cost. It uses sequence data on the focal individual and at least two generations of ancestors to phase alleles. In the first step, AlphaFamSeq calculates allele probabilities using iterative peeling. In subsequent steps, the alleles are phased using heuristics that derive information from the sequence data of parents, grandparents and progeny and, if available, from other families in the pedigree. AlphaFamSeq was tested on a range of simulated data sets. It gives low phasing error rates and, if there is sufficient sequence information and haplotype sharing amongst individuals, it can give a high yield of correctly phased alleles. The allele threshold had a large effect and window size had a small effect on performance. When all individuals in a single family were sequenced at different coverages, the percentage of correctly phased alleles reached 90% of the possible maximum (98.9%) at ~1/6 of the maximum aggregate coverage. Adding sequence information from other related individuals increased the percentage of correctly phased alleles. Imputation performance was high across all allele frequencies (average correlation by marker of 0.94), except for a slight decrease at very low frequencies (≤0.01 MAF). Within an overall strategy for generating highly accurate sequence data on whole populations at low cost, the role of AlphaFamSeq is to provide very accurately phased haplotypes on focal individuals, that is, individuals whose haplotypes are very common in the population.
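The idea of phasing from parental information can be illustrated with a minimal sketch. It is not the AlphaFamSeq implementation: the function names, the genotype coding (0, 1 or 2 copies of the alternative allele) and the default threshold of 0.98 are assumptions made for illustration only. A genotype is called only when its probability clears the threshold, and a child's allele is assigned to a parent only when Mendelian inheritance leaves no ambiguity.

```python
from typing import Optional, Sequence, Tuple

# Genotype coding (assumed here): 0, 1 or 2 copies of the alternative
# allele; None means the genotype could not be called.
Genotype = Optional[int]


def call_genotype(probs: Sequence[float], threshold: float = 0.98) -> Genotype:
    """Call the most probable genotype only if it reaches the threshold."""
    best = max(range(3), key=lambda g: probs[g])
    return best if probs[best] >= threshold else None


def transmitted(parent: Genotype) -> Optional[int]:
    """Allele a homozygous parent must transmit; None if het or uncalled."""
    if parent == 0:
        return 0
    if parent == 2:
        return 1
    return None


def phase_trio(child: Genotype, sire: Genotype,
               dam: Genotype) -> Tuple[Optional[int], Optional[int]]:
    """Return (paternal allele, maternal allele) for the child.

    An allele is None when it cannot be assigned without ambiguity.
    Mendelian inconsistencies are left unphased rather than guessed.
    """
    pat, mat = transmitted(sire), transmitted(dam)
    if child in (0, 2):
        allele = child // 2
        # A homozygous parent carrying only the other allele is inconsistent.
        if pat not in (None, allele) or mat not in (None, allele):
            return (None, None)
        return (allele, allele)
    if child == 1:
        if pat is not None and mat is not None:
            return (pat, mat) if pat != mat else (None, None)
        if pat is not None:
            return (pat, 1 - pat)
        if mat is not None:
            return (1 - mat, mat)
        return (None, None)  # both parents heterozygous or uncalled
    # Child uncalled: homozygous parents still determine what was inherited.
    return (pat, mat)
```

Raising the threshold in a sketch like this leaves more genotypes uncalled and therefore fewer alleles phased, but with fewer errors, which is the trade-off the abstract describes between phasing errors and the proportion of alleles phased.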


2017 ◽  
Author(s):  
Andrew Whalen ◽  
Roger Ros-Freixedes ◽  
David L Wilson ◽  
Gregor Gorjanc ◽  
John M Hickey

In this paper we extend multi-locus iterative peeling to be a computationally efficient method for calling, phasing, and imputing sequence data of any coverage in small or large pedigrees. Our method, called hybrid peeling, uses multi-locus iterative peeling to estimate shared chromosome segments between parents and their offspring, and then uses single-locus iterative peeling to aggregate genomic information across multiple generations. Using a synthetic dataset, we first analysed the performance of hybrid peeling for calling and phasing alleles in disconnected families, that is, families that contained only a focal individual and its parents and grandparents. Second, we analysed the performance of hybrid peeling for calling and phasing alleles in the context of the full pedigree. Third, we analysed the performance of hybrid peeling for imputing whole genome sequence data to the remaining individuals in the population. We found that hybrid peeling substantially increased the number of genotypes that were called and phased by leveraging sequence information on related individuals. The calling rate and accuracy increased when the full pedigree was used compared to a reduced pedigree of just parents and grandparents. Finally, hybrid peeling accurately imputed whole genome sequence information to non-sequenced individuals. We believe that this algorithm will enable the generation of low cost and high accuracy whole genome sequence data in many pedigreed populations. We are making this algorithm available as a standalone program called AlphaPeel.
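How single-locus information flows from parents to an offspring can be sketched as follows. This is a simplified illustration, not the AlphaPeel code: the function names, the binomial read model and the sequencing error rate of 0.01 are assumptions, and the two parents are treated as independent. The child's genotype prior comes from Mendelian transmission of the parents' genotype probabilities and is then combined with the likelihood of the child's own sequence reads.

```python
from math import comb
from typing import List, Sequence


def read_likelihood(ref_reads: int, alt_reads: int,
                    error: float = 0.01) -> List[float]:
    """P(reads | genotype) for 0, 1 and 2 copies of the alternative allele.

    Assumes each read independently shows the alternative allele with
    probability `error`, 0.5 or 1 - `error` depending on the genotype.
    """
    n = ref_reads + alt_reads
    p_alt = (error, 0.5, 1.0 - error)
    return [comb(n, alt_reads) * p ** alt_reads * (1.0 - p) ** ref_reads
            for p in p_alt]


def gamete_alt_probability(probs: Sequence[float]) -> float:
    """Probability that a parent transmits the alternative allele."""
    return 0.5 * probs[1] + probs[2]


def child_posterior(sire_probs: Sequence[float], dam_probs: Sequence[float],
                    ref_reads: int, alt_reads: int,
                    error: float = 0.01) -> List[float]:
    """Genotype probabilities of a child from its parents and its own reads."""
    ps = gamete_alt_probability(sire_probs)
    pd = gamete_alt_probability(dam_probs)
    # Mendelian prior on the child's genotype.
    prior = [(1 - ps) * (1 - pd), ps * (1 - pd) + (1 - ps) * pd, ps * pd]
    lik = read_likelihood(ref_reads, alt_reads, error)
    joint = [a * b for a, b in zip(prior, lik)]
    total = sum(joint)
    return [x / total for x in joint]
```

With no reads at all the likelihood is flat, so the child's probabilities are determined entirely by its parents; this is the sense in which information on relatives can support calls for individuals sequenced at low or zero coverage.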



2014 ◽  
Vol 8 (Suppl 1) ◽  
pp. S27 ◽  
Author(s):  
Jing Huang ◽  
Yong Chen ◽  
Michael D Swartz ◽  
Iuliana Ionita-Laza


2019 ◽  
Author(s):  
Christina Nieuwoudt ◽  
Angela Brooks-Wilson ◽  
Jinko Graham

Summary: Family-based studies have several advantages over case-control studies for finding causal rare variants for a disease; these include increased power, smaller sample size requirements, and improved detection of sequencing errors. However, collecting suitable families and compiling their data is time-consuming and expensive. To evaluate methodology to identify causal rare variants in family-based studies, one can use simulated data. For this purpose we present the R package SimRVSequences. Users supply a sample of pedigrees and single-nucleotide variant data from a sample of unrelated individuals representing the pedigree founders. Users may also model genetic heterogeneity among families. For ease of use, SimRVSequences offers methods to import and format single-nucleotide variant data and pedigrees from existing software. Availability and Implementation: SimRVSequences is available as a library for R ≥ 3.5.0 on the Comprehensive R Archive Network.



2019 ◽  
Vol 16 (8) ◽  
pp. 676-682
Author(s):  
Ankusab Noorahmadsab Nadaf ◽  
Kalegowda Shivashankar

The polycyclic dihydropyridine nucleus is a heterocyclic core motif with wide chemical, biological and physical applications. Although compounds of this kind have been extensively synthesized by other groups, their synthesis under CFL light has not been explored. The synthesis of polycyclic dihydropyridine derivatives was achieved through the reaction of 4-hydroxycoumarin, aromatic aldehydes and ammonium acetate under CFL light irradiation. A series of polycyclic dihydropyridine derivatives was prepared under CFL light irradiation in high yield, with short reaction times, under ambient conditions and without the use of a catalyst. The results demonstrate an efficient method for the synthesis of polycyclic dihydropyridine derivatives. A clean reaction profile, short reaction time, low cost and the use of CFL light instead of a catalyst make it a genuinely green protocol.



2019 ◽  
Vol 9 (2) ◽  
pp. 157-160
Author(s):  
Ali Hasani

Background: The laser ablation method produces SWCNHs in high yield and purity, whereas arc discharge methods offer low-cost production. Although these techniques have desirable features, they require special expertise to operate a high-power laser or a high-current discharge, either of which produces very high temperatures. According to previous research, these techniques operate at temperatures above 4727°C in order to vaporize the graphite. To take advantage of SWCNHs, it is therefore necessary to find a way to synthesize them at a lower temperature, so that the reaction field can be extended to moderate temperatures. This paper reports a new way to synthesize SWCNHs at a greatly reduced temperature. Methods: In this study, the role of N2 is to protect the copper holder supporting the graphite rod by increasing heat transfer from the holder. After a current of 70 A was supplied to the system, the temperature of the graphite rod rose to 1600°C. This temperature is somewhat higher than the melting point of palladium, 1555°C, and much lower than the melting point of graphite, 3497°C. Results: The results show transitional precursors alongside the SWCNHs. This composition can be created by distortion of the primary SWCNTs at the higher temperature. Subsequently, each SWCNT tends to break into individual horns. As the concentration of free horns increases, bud-like SWCNHs can be produced. Moreover, there are individual horns almost separated from the mass of single-wall carbon nanohorns. This structure is not common in SWCNHs synthesized by the usual methods, such as arc discharge or laser ablation, which yield SWCNHs as aggregated particles with diameters of about 30-150 nm. Conclusion: Simple heating is sufficient to transform SWCNTs into SWCNHs in the presence of palladium as a catalyst. The proposed mechanism for this transformation is that the SWCNTs are first changed to a highly curled shape and then form small independent horns. Another route to synthesize SWCNHs is the pyrolysis of palm olein at 950°C with the assistance of zinc nitrate and ferrocene. Palm olein was used as a promising, bio-renewable and inexpensive carbon source for the production of carbon nanohorns.



2021 ◽  
Vol 22 (3) ◽  
pp. 1124
Author(s):  
Mafalda Giovanna Reccia ◽  
Floriana Volpicelli ◽  
Eirkiur Benedikz ◽  
Åsa Fex Svenningsen ◽  
Luca Colucci-D’Amato

Neural stem cells represent a powerful tool to study molecules involved in the pathophysiology of the nervous system and to discover new drugs. Although they can be cultured and expanded in vitro as a primary culture, their use is hampered by their heterogeneity and by the cost and time needed for their preparation. Here we report that mes-c-myc A1 cells (A1), a neural cell line, are endowed with stem cell properties. Undifferentiated/proliferating and differentiated/non-proliferating A1 cells are able to generate neurospheres (Ns) in which gene expression parallels the original differentiation status. In fact, Ns derived from undifferentiated A1 cells express higher levels of Nestin, Kruppel-like factor 4 (Klf4) and glial fibrillary acidic protein (GFAP), markers of stemness, while those obtained from differentiated A1 cells show higher levels of the neuronal marker beta III tubulin. Interestingly, Ns differentiation, induced by withdrawal of Epidermal Growth Factor (EGF) and Fibroblast Growth Factor 2 (bFGF), generates oligodendrocytes at high yield, as shown by the expression of the markers Galactosylceramidase (Gal-C), Neuron-Glial antigen 2 (NG2), Receptor-Interacting Protein (RIP) and Myelin Basic Protein (MBP). Finally, upon co-culture, Ns-A1-derived oligodendrocytes cause a redistribution of contactin-associated protein (Caspr/paranodin) on neuronal cells, as primary oligodendrocyte cultures do, suggesting that they are able to form compact myelin. Thus, Ns-A1-derived oligodendrocytes may represent a time-saving and low-cost tool to study the pathophysiology of oligodendrocytes and to test new drugs.



2020 ◽  
Vol 13 (1) ◽  
pp. 235
Author(s):  
Fernando Martín-Consuegra ◽  
Fernando de Frutos ◽  
Ignacio Oteiza ◽  
Carmen Alonso ◽  
Borja Frutos

This study quantified the improvement in energy efficiency following passive renovation of the thermal envelope in highly inefficient residential complexes on the outskirts of the city of Madrid. A case study was conducted of a single-family terrace housing, representative of the smallest size subsidized dwellings built in Spain for workers in the nineteen fifties and sixties. Two units of similar characteristics, one in its original state and the other renovated, were analyzed in detail against their urban setting with an experimental method proposed hereunder for simplified, minimal monitoring. The dwellings were compared on the grounds of indoor environment quality parameters recorded over a period covering both winter and summer months. That information was supplemented with an analysis of the energy consumption metered. The result was a low-cost, reasonably accurate measure of the improvements gained in the renovated unit. The monitoring output data were entered in a theoretical energy efficiency model for the entire neighborhood to obtain an estimate of the potential for energy savings if the entire urban complex were renovated.



2021 ◽  
pp. 2101036
Author(s):  
Hengyi Lu ◽  
Wen Shi ◽  
Fei Zhao ◽  
Wenjing Zhang ◽  
Peixin Zhang ◽  
...  


Energies ◽  
2021 ◽  
Vol 14 (9) ◽  
pp. 2500
Author(s):  
Abdulrahman Alanezi ◽  
Kevin P. Hallinan ◽  
Kefan Huang

Smart WiFi thermostats, when they first reached the market, were touted as a means for achieving substantial heating and cooling energy cost savings. These savings did not materialize until additional features, such as geofencing, were added. Today, average savings from these thermostats of 10–12% in heating and 15% in cooling for a single-family residence have been reported. This research aims to demonstrate additional potential benefit of these thermostats, namely as a potential instrument for conducting virtual energy audits on residences. In this study, archived smart WiFi thermostat measured temperature data in the form of a power spectrum, corresponding historical weather and energy consumption data, building geometry characteristics, and occupancy data were integrated in order to train a machine learning model to predict attic and wall R-Values, furnace efficiency, and air conditioning seasonal energy efficiency ratio (SEER), all of which were known for all residences in this study. The developed model was validated on residences not used for model development. Validation R-squared values of 0.9408, 0.9421, 0.9536, and 0.9053 for predicting attic and wall R-values, furnace efficiency, and AC SEER, respectively, were realized. This research demonstrates promise for low-cost data-based energy auditing of residences reliant upon smart WiFi thermostats.
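The validation R-squared values quoted above are coefficients of determination computed on residences held out from model development. A minimal sketch of that calculation follows; the function name is an assumption, and any numbers passed to it would be hypothetical rather than data from the study.

```python
from typing import Sequence


def r_squared(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when actual values are constant")
    return 1.0 - ss_res / ss_tot
```

A value of 1 means the predictions match the known values exactly, while 0 means the model does no better than predicting the mean of the held-out values.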



Author(s):  
Manjunath K. E. ◽  
Srinivasa Raghavan K. M. ◽  
K. Sreenivasa Rao ◽  
Dinesh Babu Jayagopi ◽  
V. Ramasubramanian

In this study, we evaluate and compare two different approaches for multilingual phone recognition in code-switched and non-code-switched scenarios. The first approach uses a front-end Language Identification (LID) system to switch to a monolingual phone recognizer (LID-Mono), trained individually on each of the languages present in the multilingual dataset. In the second approach, a common multilingual phone-set derived from the International Phonetic Alphabet (IPA) transcription of the multilingual dataset is used to develop a Multilingual Phone Recognition System (Multi-PRS). The bilingual code-switching experiments are conducted using the Kannada and Urdu languages. In the first approach, LID is performed using state-of-the-art i-vectors. Both monolingual and multilingual phone recognition systems are trained using Deep Neural Networks. The performance of the LID-Mono and Multi-PRS approaches is compared and analysed in detail. It is found that the Multi-PRS approach outperforms the more conventional LID-Mono approach in both code-switched and non-code-switched scenarios. For code-switched speech, the effect of the length of the segments used to perform LID on the performance of the LID-Mono system is studied by varying the window size from 500 ms to 5.0 s, and up to the full utterance. The LID-Mono approach depends heavily on the accuracy of the LID system, and LID errors cannot be recovered. The Multi-PRS system, by contrast, does not need front-end LID switching and is built on the common multilingual phone-set derived from several languages, so it is not constrained by the accuracy of the LID system. It therefore performs effectively on code-switched and non-code-switched speech, offering lower Phone Error Rates than the LID-Mono system.
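Phone Error Rate is conventionally computed as the number of substitutions, deletions and insertions needed to turn the recognised phone sequence into the reference, divided by the reference length. The sketch below shows that standard calculation with a Levenshtein alignment; the function names are assumptions, and it is not taken from the authors' evaluation code.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Minimum number of substitutions, deletions and insertions."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference
    for i, r in enumerate(ref, start=1):
        curr = [i]  # aligning i reference phones with an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def phone_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """Edit distance normalised by the number of reference phones."""
    if not ref:
        raise ValueError("reference phone sequence must not be empty")
    return edit_distance(ref, hyp) / len(ref)
```

Because insertions count as errors, the rate can exceed 1.0 when the hypothesis is much longer than the reference.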


