assignment accuracy
Recently Published Documents


2021 ◽  
pp. 153-162
Author(s):  
Lukasz Chmielowski ◽  
Michal Kucharzak


2021 ◽  
Vol 12 (1) ◽  
Author(s):  
Xiao Wang ◽  
Eman Alnabati ◽  
Tunde W. Aderinwale ◽  
Sai Raghavendra Maddhuri Venkata Subramaniya ◽  
Genki Terashi ◽  
...  

An increasing number of density maps of macromolecular structures, including proteins and DNA/RNA complexes, have been determined by cryo-electron microscopy (cryo-EM). Although maps at near-atomic resolution are now routinely reported, a substantial fraction of maps are still determined at intermediate or low resolution, where extracting structural information is not trivial. Here, we report a new computational method, Emap2sec+, which identifies DNA or RNA as well as the secondary structures of proteins in cryo-EM maps of 5 to 10 Å resolution. Emap2sec+ employs a deep residual convolutional neural network. It assigns structural labels with associated probabilities at each voxel in a cryo-EM map, which helps structure modeling in an EM map. When tested on simulated and experimental maps, Emap2sec+ showed stable and high assignment accuracy for nucleotides in low-resolution maps and better performance for protein secondary structure assignment than its earlier version.
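As a rough illustration of what per-voxel assignment accuracy can mean in this setting, the sketch below picks the most probable label for each voxel and scores agreement with reference labels. The label names, the probability values, and both function names are assumptions made for this example; they are not Emap2sec+'s actual classes, output format, or evaluation code.

```python
from typing import Dict, List

def assign_labels(voxel_probs: List[Dict[str, float]]) -> List[str]:
    """Return the most probable label for each voxel."""
    return [max(probs, key=probs.get) for probs in voxel_probs]

def assignment_accuracy(predicted: List[str], reference: List[str]) -> float:
    """Fraction of voxels whose predicted label matches the reference label."""
    if len(predicted) != len(reference):
        raise ValueError("predicted and reference must have the same length")
    if not reference:
        raise ValueError("at least one voxel is required")
    correct = sum(p == r for p, r in zip(predicted, reference))
    return correct / len(reference)

# Invented probabilities for four voxels (hypothetical label set).
voxel_probs = [
    {"helix": 0.70, "sheet": 0.10, "other": 0.10, "nucleotide": 0.10},
    {"helix": 0.05, "sheet": 0.15, "other": 0.10, "nucleotide": 0.70},
    {"helix": 0.20, "sheet": 0.50, "other": 0.20, "nucleotide": 0.10},
    {"helix": 0.40, "sheet": 0.35, "other": 0.15, "nucleotide": 0.10},
]
reference = ["helix", "nucleotide", "sheet", "sheet"]

predicted = assign_labels(voxel_probs)
accuracy = assignment_accuracy(predicted, reference)
print(predicted, accuracy)
```

The last voxel shows why keeping the probabilities is useful: its top two labels are close, so a downstream modeling step could treat that assignment as uncertain rather than simply wrong.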



Author(s):  
Zachary Gold ◽  
Emily Curd ◽  
Kelly Goodwin ◽  
Emma Choi ◽  
Benjamin Frable ◽  
...  

DNA metabarcoding is an important tool for molecular ecology. However, its effectiveness hinges on the quality of reference sequence databases and classification parameters employed. Here we evaluate the performance of MiFish 12S taxonomic assignments using a case study of California Current Large Marine Ecosystem fishes to determine best practices for metabarcoding. Specifically, we use a taxonomy cross-validation by identity framework to compare classification performance between a global database comprised of all available sequences and a curated database that only includes sequences of fishes from the California Current Large Marine Ecosystem. We demonstrate that the curated, regional database provides higher assignment accuracy than the comprehensive global database. We also document a tradeoff between accuracy and misclassification across a range of taxonomic cutoff scores, highlighting the importance of parameter selection for taxonomic classification. Furthermore, we compared assignment accuracy with and without the inclusion of additionally generated reference sequences. To this end, we sequenced tissue from 605 species using the MiFish 12S primers, adding 253 species to GenBank’s existing 550 California Current Large Marine Ecosystem fish sequences. We then compared species and reads identified from seawater environmental DNA samples using global databases with and without our generated references, and the regional database. The addition of new references allowed for the identification of 16 native taxa and 17.0% of total reads from eDNA samples, including species with vast ecological and economic value. Together these results demonstrate the importance of comprehensive and curated reference databases for effective metabarcoding and the need for locus-specific validation efforts.
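The tradeoff across cutoff scores can be pictured with a small sketch: raising the cutoff leaves more sequences unassigned but removes low-confidence wrong calls. The taxon names, scores, and the `evaluate_cutoff` function below are invented for illustration; this is not the study's cross-validation framework or its data.

```python
from typing import Dict, List, Tuple

# (true taxon, predicted taxon, confidence score on a 0-100 scale); all invented.
Assignment = Tuple[str, str, float]

def evaluate_cutoff(assignments: List[Assignment], cutoff: float) -> Dict[str, float]:
    """Classify each sequence as correct, misclassified, or unassigned at a cutoff."""
    if not assignments:
        raise ValueError("at least one assignment is required")
    correct = wrong = unassigned = 0
    for true_taxon, predicted_taxon, score in assignments:
        if score < cutoff:
            unassigned += 1      # below the cutoff: no call is made
        elif predicted_taxon == true_taxon:
            correct += 1
        else:
            wrong += 1
    n = len(assignments)
    return {
        "correct": correct / n,
        "misclassified": wrong / n,
        "unassigned": unassigned / n,
    }

assignments = [
    ("species_A", "species_A", 98.0),
    ("species_B", "species_B", 85.0),
    ("species_C", "species_D", 72.0),  # confident but wrong call
    ("species_E", "species_E", 65.0),
    ("species_F", "species_F", 55.0),
]

for cutoff in (60, 80):
    print(cutoff, evaluate_cutoff(assignments, cutoff))
```

In this toy data, moving the cutoff from 60 to 80 eliminates the one misclassification but also discards a correct assignment, which is the kind of tradeoff a parameter sweep would expose.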



2020 ◽  
Author(s):  
Tom Bodenheimer ◽  
Mahantesh Halappanavar ◽  
Stuart Jefferys ◽  
Ryan Gibson ◽  
Siyao Liu ◽  
...  

Current single-cell experiments can produce datasets with millions of cells. Unsupervised clustering can identify cell populations in single-cell analysis, but at this scale it often leads to impractically long computation times. This problem has previously been mitigated by subsampling cells, which greatly reduces accuracy. We built on the graph-based algorithm PhenoGraph and developed FastPG, which has the same cell assignment accuracy but is on average 27x faster in our tests. FastPG also has higher cell assignment accuracy than two other fast clustering methods, FlowSOM and PARC. Availability: FastPG is available at https://github.com/sararselitsky/FastPG



2020 ◽  
Author(s):  
Qing-Long Fu ◽  
Manabu Fujii ◽  
Thomas Riedel

The growing application of ultrahigh-resolution mass spectrometry (UHR-MS) to natural organic matter (NOM) characterization requires efficient and accurate formula assignment from large numbers of mass data. Herein, we developed two automated batch codes (TRFu and FuJHA) and assessed their formula assignment accuracy alongside frequently used open-access algorithms (i.e., Formularity and WHOI). Overall assignment accuracy for 8,719 NOM-like emerging chemicals with known molecular formulae (mass range from 68 Da to 1,000 Da) was highest (94%) for TRFu. TRFu also showed the highest formula assignment rate (98.6%) for a total of 76,880 UHR-MS peaks from 35 types of NOM (e.g., aquatic, soil/sediment, biochar). Therefore, as a reliable and practically feasible tool, the automated batch TRFu (freely available at ChemRxiv, DOI:10.26434/chemrxiv.9917399) can precisely characterize UHR-MS spectra of various NOM and could be extended to non-target screening of NOM-like emerging chemicals in natural and engineered environments, including drinking water sources and wastewater effluents.
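For readers unfamiliar with formula assignment, the basic idea can be sketched as matching a measured mass against the theoretical monoisotopic masses of candidate formulae within a ppm tolerance. This is a generic illustration and not the TRFu or FuJHA algorithm: the candidate list, the 1 ppm tolerance, the function names, and the assumption that the measured value is already a neutral mass are all choices made for this example.

```python
from typing import Dict, Optional

# Monoisotopic masses (Da) of the most abundant isotopes.
MONOISOTOPIC_MASS = {
    "C": 12.0,
    "H": 1.00782503207,
    "N": 14.0030740048,
    "O": 15.99491461956,
}

def neutral_mass(formula: Dict[str, int]) -> float:
    """Theoretical monoisotopic mass of a formula given as element counts."""
    return sum(MONOISOTOPIC_MASS[element] * count for element, count in formula.items())

def ppm_error(measured: float, theoretical: float) -> float:
    """Signed mass error in parts per million."""
    return (measured - theoretical) / theoretical * 1e6

def assign_formula(
    measured: float,
    candidates: Dict[str, Dict[str, int]],
    tolerance_ppm: float = 1.0,  # assumed tolerance, not taken from the abstract
) -> Optional[str]:
    """Return the candidate with the smallest absolute ppm error within tolerance."""
    best_name, best_error = None, None
    for name, formula in candidates.items():
        error = abs(ppm_error(measured, neutral_mass(formula)))
        if error <= tolerance_ppm and (best_error is None or error < best_error):
            best_name, best_error = name, error
    return best_name

# Hypothetical candidate formulae with the same nominal mass (180 Da).
candidates = {
    "C6H12O6": {"C": 6, "H": 12, "O": 6},
    "C7H16O5": {"C": 7, "H": 16, "O": 5},
}

print(assign_formula(180.0634, candidates))  # measured neutral mass, invented value
```

Assignment accuracy in the sense used above would then be the fraction of compounds with known formulae for which the returned formula matches the known one.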



2019 ◽  
Author(s):  
Qing-Long Fu ◽  
Manabu Fujii ◽  
Thomas Riedel

The growing application of ultrahigh-resolution mass spectrometry (UHR-MS) to natural organic matter (NOM) characterization requires efficient and accurate formula assignment from large numbers of mass data. Herein, we developed two automated batch codes (TRFu and FuJHA) and assessed their formula assignment accuracy alongside frequently used open-access algorithms (i.e., Formularity and WHOI). The overall assignment accuracy for 8,717 NOM-like chemicals with known molecular formulae (mass range from 68 Da to 1,000 Da) was highest (94%) for TRFu. Comparative examination using 35 NOM mass spectrum data sets (78,482 peaks in total, with an m/z range of 69 to 999) revealed that TRFu, FuJHA and Formularity performed better than WHOI (e.g., higher formula assignment ratios and lower mass errors), though performance depended on mass values and molecular compositions. Moreover, among all methods, TRFu showed the smallest deviation from certified data in the 13C-formula assignment analysis. Therefore, as a reliable and practically feasible tool, the automated batch TRFu can precisely characterize UHR-MS spectra of various NOM and could be extended to the non-target screening of NOM-like emerging chemicals in natural and engineered environments.



Author(s):  
Ainara Imaz Agirre

This paper reports on an experiment investigating the processing of accurate gender assignment in canonical and non-canonical inanimate nouns in Spanish by native speakers of Basque with nativelike proficiency in Spanish. 33 Basque/Spanish bilinguals and 32 native speakers of Spanish completed an online and an offline gender assignment task. Participants assigned gender to inanimate nouns with canonical (-o; -a) and non-canonical word endings (-e; consonants). The results revealed that the Basque/Spanish bilingual group obtained high accuracy scores in both tasks, similar to the Spanish native speaker group. Interestingly, unlike the Spanish group, the Basque speakers showed faster reaction times with feminine nouns than masculine ones. Canonicity seems to be a strong cue for both groups, since all participants were more accurate and faster with canonical word endings. Even though quantitatively Basque/Spanish bilinguals and Spanish monolinguals’ gender assignment accuracy rates do not differ, qualitatively, the Basque/Spanish bilinguals’ assignment patterns seem to differ somewhat from those of the native Spanish speakers.


