bioinformatics software
Recently Published Documents

Among viruses, human immunodeficiency virus (HIV) presents the greatest challenge to humans. Here, we retrieved genome sequences from NCBI and ran them through the LALIGN bioinformatics software to compute the E value, bit score, Waterman-Eggert score, and percent identity, four important indicators of how similar the sequences are. The E value was 3.1 x 10^-9, the percent identity was 54.4 percent, and the bit score was 51.9. We also observed that bases 1600 to 1990 in HIV and bases 800 to 910 in FIV have higher than normal similarity. This indicates that, while the DNA sequences of the gag region of the HIV and FIV genomes are rather similar and this similarity is unlikely to be due to random chance, there are a noticeable number of differences. A better understanding of the similarities and differences in the gag region of the genome sequence would improve our understanding of structural and cellular behavioral differences between FIV and HIV and, in the long term, could provide new insights into the differences observed in previous studies or even facilitate the development of an effective HIV treatment.
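
As an illustration of one of the indicators named above, the following is a minimal sketch of how percent identity can be computed from a pairwise alignment. It is not LALIGN's implementation: the function name `percent_identity`, the choice to skip gap columns, and the toy sequences are assumptions made for this example, and tools differ in which columns they count in the denominator.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of ungapped alignment columns in which both sequences match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # assumption: gap columns are excluded from the count
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned pair invented for illustration (not HIV or FIV sequence).
toy_a = "ACGT-ACGTA"
toy_b = "ACGTTACCTA"
print(f"{percent_identity(toy_a, toy_b):.1f}% identity")
```

Counting gap columns as mismatches instead would lower the value, so a reported percent identity is only comparable across tools that share the same convention.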

Assessing and assuring interoperability of a genomics file format

10.1101/2022.01.07.475366 ◽

2022 ◽

Author(s):

Yi Nian Niu ◽

Eric G. Roberts ◽

Danielle Denisko ◽

Michael M. Hoffman

Keyword(s):

Formal Specification ◽

Poor Performance ◽

Multiple Root ◽

Interval Data ◽

Test Suite ◽

File Format ◽

Software Packages ◽

File Formats ◽

Wide Range ◽

Bioinformatics Software

Background: Bioinformatics software tools operate largely through the use of specialized genomics file formats. Often these formats lack formal specification, and only rarely do the creators of these tools robustly test them for correct handling of input and output. This causes problems in interoperability between different tools that, at best, wastes time and frustrates users. At worst, interoperability issues could lead to undetected errors in scientific results. Methods: We sought (1) to assess the interoperability of a wide range of bioinformatics software using a shared genomics file format and (2) to provide a simple, reproducible method for enhancing interoperability. As a focus, we selected the popular BED file format for genomic interval data. Based on the file format's original documentation, we created a formal specification. We developed a new verification system, Acidbio (https://github.com/hoffmangroup/acidbio), which tests for correct behavior in bioinformatics software packages. We crafted tests to unify correct behavior when tools encounter various edge cases—potentially unexpected inputs that exemplify the limits of the format. To analyze the performance of existing software, we tested the input validation of 80 Bioconda packages that parsed the BED format. We also used a fuzzing approach to automatically perform additional testing. Results: Of 80 software packages examined, 75 achieved less than 70% correctness on our test suite. We categorized multiple root causes for the poor performance of different types of software. Fuzzing detected other errors that the manually designed test suite could not. We also created a badge system that developers can use to indicate more precisely which BED variants their software accepts and to advertise the software's performance on the test suite. 
Discussion: Acidbio makes it easy to assess interoperability of software using the BED format, and therefore to identify areas for improvement in individual software packages. Applying our approach to other file formats would increase the reliability of bioinformatics software and data.
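
The kind of input validation described above can be sketched as follows. This is not Acidbio's code or its test suite: the function `check_bed_line` is hypothetical, and it assumes only the widely documented BED conventions that the first three fields are chrom, chromStart, and chromEnd, that coordinates are non-negative integers, and that chromStart does not exceed chromEnd. It also assumes tab delimiters, although some BED variants permit other whitespace.

```python
def check_bed_line(line: str) -> list[str]:
    """Return the problems found in one BED data line (empty list if none)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        return ["fewer than 3 tab-separated fields"]

    problems = []
    chrom, start_text, end_text = fields[:3]
    if not chrom:
        problems.append("empty chrom field")

    # isascii() and isdigit() together reject signs, decimals, and empty strings.
    start_ok = start_text.isascii() and start_text.isdigit()
    end_ok = end_text.isascii() and end_text.isdigit()
    if not start_ok:
        problems.append(f"chromStart is not a non-negative integer: {start_text!r}")
    if not end_ok:
        problems.append(f"chromEnd is not a non-negative integer: {end_text!r}")
    if start_ok and end_ok and int(start_text) > int(end_text):
        problems.append("chromStart is greater than chromEnd")
    return problems


# A few edge cases invented for illustration.
for example in ["chr1\t0\t100", "chr1\t100\t50", "chr1 0 100", "chr1\t-5\t10"]:
    print(repr(example), "->", check_bed_line(example) or "ok")
```

A parser that silently accepts the second or fourth line is the sort of behavior such a test suite is meant to expose.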

Improving bioinformatics software quality through incorporation of software engineering practices

PeerJ Computer Science ◽

10.7717/peerj-cs.839 ◽

2022 ◽

Vol 8 ◽

pp. e839

Author(s):

Adeeb Noor

Keyword(s):

Software Engineering ◽

Software Development ◽

Software Quality ◽

Formal Education ◽

Scientific Software ◽

Cultural Changes ◽

Bioinformatics Software ◽

Software Engineers ◽

Engineering Practices

Background: Bioinformatics software is developed for collecting, analyzing, integrating, and interpreting life science datasets that are often enormous. Bioinformatics engineers often lack the software engineering skills necessary for developing robust, maintainable, reusable software. This study presents a review and discussion of the findings and efforts made to improve the quality of bioinformatics software. Methodology: A systematic review was conducted of related literature that identifies core software engineering concepts for improving bioinformatics software development: requirements gathering, documentation, testing, and integration. The findings are presented with the aim of illuminating trends within the research that could lead to viable solutions to the struggles faced by bioinformatics engineers when developing scientific software. Results: The findings suggest that bioinformatics engineers could significantly benefit from the incorporation of software engineering principles into their development efforts. This suggests both cultural changes within bioinformatics research communities and the adoption of software engineering disciplines into the formal education of bioinformatics engineers. Open management of scientific bioinformatics development projects can result in improved software quality through collaboration between bioinformatics engineers and software engineers. Conclusions: While strides have been made in both identifying and solving issues of particular import to bioinformatics software development, there is still room for improvement in both the formal education of bioinformatics engineers and the culture and approaches of managing scientific bioinformatics research and development efforts.

A quantitative analysis of the FIV and HIV genome using bioinformatics software

10.14293/s2199-1006.1.sor-.pphvwfm.v1 ◽

2022 ◽

Author(s):

Charlotte Siu ◽

Xiao Wen Cheng ◽

Meredith Horn

Keyword(s):

Human Immunodeficiency Virus ◽

Dna Sequences ◽

Illumina Miseq ◽

Hiv Treatment ◽

Behavioral Differences ◽

Percent Identity ◽

Immunodeficiency Virus ◽

Bioinformatics Software ◽

Human Kinds

Among viruses, the human immunodeficiency virus (HIV) presents the greatest challenge to humankind. The HIV and FIV gag genomes were sequenced using the Illumina MiSeq benchtop next-generation sequencer. The DNA sequences obtained were then run through the LALIGN bioinformatics software to compute the E value, bit score, Waterman-Eggert score, and percent identity, four important indicators of how similar the sequences are. The E value was 3.1 x 10^-9, the percent identity was 54.4 percent, and the bit score was 51.9. We also observed that bases 1600 to 1990 in HIV and bases 800 to 910 in FIV have higher than normal similarity. This indicates that, while the DNA sequences of the gag region of the HIV and FIV genomes are rather similar and this similarity is unlikely to be due to random chance, there are a noticeable number of differences. A better understanding of the similarities and differences in the gag region of the genome sequence would improve our understanding of structural and cellular behavioral differences between FIV and HIV and, in the long term, could provide new explanations for differences observed in previous studies or even facilitate the development of an effective HIV treatment.

A novel mutation KCNJ11 R136C caused KCNJ11-MODY

Diabetology & Metabolic Syndrome ◽

10.1186/s13098-021-00708-6 ◽

2021 ◽

Vol 13 (1) ◽

Author(s):

Yaning Chen ◽

Xiaodong Hu ◽

Jia Cui ◽

Mingwei Zhao ◽

Hebin Yao

Keyword(s):

Diabetes Mellitus ◽

Amino Acid ◽

Exome Sequencing ◽

Female Patient ◽

Whole Exome Sequencing ◽

Novel Mutation ◽

Young Female ◽

Whole Exome ◽

Bioinformatics Software

A young female patient, diagnosed with diabetes mellitus in 2009 at the age of 28, was found by whole-exome sequencing to carry KCNJ11 R136C; her daughter does not carry this mutation. Bioinformatics software predicted that the 136th amino acid is highly conserved and that the mutation is deleterious. KCNJ11 R136C can alter the channel port structure of the KATP channel. She was therefore diagnosed with KCNJ11-MODY.

RetroScan: An Easy-to-Use Pipeline for Retrocopy Annotation and Visualization

Frontiers in Genetics ◽

10.3389/fgene.2021.719204 ◽

2021 ◽

Vol 12 ◽

Author(s):

Zhaoyuan Wei ◽

Jiahe Sun ◽

Qinhui Li ◽

Ting Yao ◽

Haiyue Zeng ◽

...

Keyword(s):

Gene Duplication ◽

False Positive ◽

Annotation Tool ◽

Biological Functions ◽

Analysis Pipeline ◽

Shiny App ◽

Bioinformatics Software ◽

New Gene ◽

Visual Interface ◽

Positive Results

Retrocopies, which are considered “junk genes,” are occasionally formed via the insertion of reverse-transcribed mRNAs at new positions in the genome. However, an increasing number of recent studies have shown that some retrocopies exhibit new biological functions and may contribute to genome evolution. Hence, the identification of retrocopies has become important for studying gene duplication and new gene generation. Current pipelines identify retrocopies through complex, step-by-step operations using alignment programs and filter scripts. Therefore, there is an urgent need for a simple and convenient retrocopy annotation tool. Here, we report the development of RetroScan, a publicly available and easy-to-use tool for scanning, annotating and displaying retrocopies, consisting of two components: an analysis pipeline and a visual interface. The pipeline integrates a series of bioinformatics software programs and scripts for identifying retrocopies with a single command. Compared with previous methods, RetroScan increases accuracy and reduces false-positive results. We also provide a Shiny app for visualization. It displays information on retrocopies and their parental genes that can be used for the study of retrocopy structure and evolution. RetroScan is available at https://github.com/Vicky123wzy/RetroScan.

Phenotype and Genotype Study of Chinese POMT2-Related α-Dystroglycanopathy

Frontiers in Genetics ◽

10.3389/fgene.2021.692479 ◽

2021 ◽

Vol 12 ◽

Author(s):

Xiao-Yu Chen ◽

Dan-Yu Song ◽

Li Jiang ◽

Dan-Dan Tan ◽

Yi-Dan Liu ◽

...

Keyword(s):

Muscular Dystrophy ◽

Congenital Muscular Dystrophy ◽

Genetic Characteristics ◽

Apparent Lack ◽

Missense Variants ◽

Novel Variants ◽

Bioinformatics Software ◽

Uncertain Significance ◽

Genetic Features

Objective: Alpha-dystroglycanopathy (α-DGP) is a subtype of muscular dystrophy caused by defects in the posttranslational glycosylation of α-dystroglycan (α-DG). Our study aimed to summarize the clinical and genetic features of POMT2-related α-DGP in a cohort of patients in China. Methods: Pedigrees, clinical data, and laboratory tests of patients diagnosed with POMT2-related α-DGP were analyzed retrospectively. The pathogenicity of variants in POMT2 was predicted by bioinformatics software. The variants with uncertain significance were verified by further analysis. Results: The 11 patients, comprising eight males and three females, were from nine non-consanguineous families. They exhibited different degrees of muscle weakness, ambulation, and intellectual impairment. Among them, three had a muscle-eye-brain disease (MEB)-like phenotype, five presented congenital muscular dystrophy with intellectual disability (CMD-ID), and three presented limb-girdle muscular dystrophy (LGMD). Overall, nine novel variants of POMT2, including two nonsense, one frameshift and six missense variants, were identified. The pathogenicity of two missense variants, c.1891G > C and c.874G > C, was uncertain based on bioinformatics software prediction. In vitro minigene analysis showed that c.1891G > C affects the splicing of POMT2. Immunofluorescence staining with the IIH6C4 antibody of muscle biopsy from the patient carrying the c.874G > C variant showed an apparent lack of expression. Conclusion: This study summarizes the clinical and genetic characteristics of a cohort of POMT2-related α-DGP patients in China for the first time, expanding the mutational spectrum of the disease. Further study of the pathogenicity of some missense variants based on enzyme activity detection is needed.

Maternal Phylogenetic Relationships and Genetic Variation among Rare, Phenotypically Similar Donkey Breeds

Genes ◽

10.3390/genes12081109 ◽

2021 ◽

Vol 12 (8) ◽

pp. 1109

Author(s):

Andrea Mazzatenta ◽

Massimo Vignoli ◽

Maurizio Caputo ◽

Giorgio Vignola ◽

Roberto Tamburro ◽

...

Keyword(s):

Nucleotide Diversity ◽

Total Population ◽

Maternal Inheritance ◽

Genetic Material ◽

Breeding Programs ◽

Phylogenetic Relations ◽

Bioinformatics Software ◽

D Loop ◽

Components Analysis ◽

Relationship Of

The mitochondrial DNA (mtDNA) D-loop of endangered and critically endangered breeds has been studied to identify maternal lineages, characterize genetic inheritance, reconstruct phylogenetic relations among breeds, and develop biodiversity conservation and breeding programs. The aim of the study was to determine the variability remaining and the phylogenetic relationship of Martina Franca (MF, with total population of 160 females and 36 males), Ragusano (RG, 344 females and 30 males), Pantesco (PT, 47 females and 15 males), and Catalonian (CT) donkeys by collecting genetic data from maternal lineages. Genetic material was collected from saliva, and a 350 bp fragment of D-loop mtDNA was amplified and sequenced. Sequences were aligned and evaluated using standard bioinformatics software. A total of 56 haplotypes including 33 polymorphic sites were found in 77 samples (27 MF, 22 RG, 8 PT, 19 CT, 1 crossbred). The breed nucleotide diversity value (π) for all the breeds was 0.128 (MF: 0.162, RG: 0.132, PT: 0.025, CT: 0.038). Principal components analysis grouped most of the haplogroups into two different clusters, I (including all haplotypes from PT and CT, together with haplotypes from MF and RG) and II (including haplotypes from MF and RG only). In conclusion, we found that the primeval haplotypes, haplogroup variability, and a large number of maternal lineages were preserved in MF and RG; thus, these breeds play putative pivotal roles in the phyletic relationships of donkey breeds. Maternal inheritance is indispensable genetic information required to evaluate inheritance, variability, and breeding programs.
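
For readers unfamiliar with the nucleotide diversity value (π) reported above, a minimal sketch of its usual definition follows: the average number of differing sites per position over all pairs of aligned sequences. The abstract does not say which software or gap handling was used, so the function `nucleotide_diversity`, the equal-length requirement, and the toy sequences are assumptions for illustration only.

```python
from itertools import combinations


def nucleotide_diversity(seqs: list[str]) -> float:
    """Mean pairwise differences per site across equal-length aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")

    total_differences = 0
    pair_count = 0
    for a, b in combinations(seqs, 2):
        # Count the sites at which this pair of sequences differs.
        total_differences += sum(x != y for x, y in zip(a, b))
        pair_count += 1
    return total_differences / (pair_count * length)


# Toy alignment invented for illustration (not donkey D-loop data).
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
print(round(nucleotide_diversity(toy), 4))
```

Each sequence is weighted equally here; estimates computed from haplotype frequencies with a sample-size correction can differ slightly.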

GENETIC DIVERSITY OF EGYPTIAN ARABIAN HORSES FROM EL-ZAHRAA STUD BASED ON 14 TKY MICROSATELLITE MARKERS

Slovenian Veterinary Research ◽

10.26873/svr-1041-2020 ◽

2021 ◽

Vol 58 (2) ◽

Author(s):

Mary Sargious ◽

Ragab El-Shawarby ◽

Mohamed Abo-Salem ◽

Elham EL-Shewy ◽

Hanaa Ahmed ◽

...

Keyword(s):

Genetic Diversity ◽

Microsatellite Markers ◽

Polymorphic Information Content ◽

Parentage Assignment ◽

Expected Heterozygosity ◽

Bioinformatics Software ◽

Pcr Products ◽

Arabian Population ◽

Genetic Analyzer

The objectives of this study were, first, to genetically characterize Egyptian Arabian horses using 14 TKY microsatellite markers and, second, to investigate the power of these 14 TKY markers for parentage assignment of Arabian horses. A total of 101 horse samples (Arabian = 71, Thoroughbred = 19, and Nooitgedacht = 11) were analysed with the 14 TKY microsatellite markers. The PCR products were electrophoresed on a Genetic Analyzer 3500 with the aid of the LIZ standard. The basic measures of allele size and genetic diversity were computed using bioinformatics software. The polymorphism of the TKY markers across the Arabian population showed moderate values for the genetic diversity parameters: number of alleles (NA) = 8.143, effective number of alleles (Ne) = 3.694, observed heterozygosity (HO) = 0.599, expected heterozygosity (HE) = 0.691, polymorphic information content (PIC) = 0.636, and inbreeding coefficient (FIS) = 0.128. The combined probability of exclusion (CPE) of the 14 TKY microsatellite loci in our Arabian horses was 0.9999. The results of the current study confirm the applicability and efficiency of the TKY microsatellite panel for evaluating genetic diversity and parentage assignment in Egyptian Arabian horses.
Key words: Arabian horses; genetic diversity; microsatellite; TKY markers
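
The diversity parameters listed in the abstract above have standard textbook definitions for a single locus, sketched below from allele frequencies. The abstract does not name the software or estimators used, so the helper names `locus_stats` and `inbreeding_coefficient` and the toy frequencies are assumptions. Values averaged over 14 loci need not be reproducible from the averaged inputs.

```python
def locus_stats(freqs: list[float]) -> dict[str, float]:
    """Expected heterozygosity, effective allele number, and PIC for one locus."""
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    sum_sq = sum(p * p for p in freqs)  # expected homozygosity
    # Correction term of the PIC formula: sum over allele pairs i < j.
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return {
        "HE": 1.0 - sum_sq,           # expected heterozygosity
        "Ne": 1.0 / sum_sq,           # effective number of alleles
        "PIC": 1.0 - sum_sq - cross,  # polymorphic information content
    }


def inbreeding_coefficient(ho: float, he: float) -> float:
    """FIS as the relative deficit of observed versus expected heterozygosity."""
    return 1.0 - ho / he


# Toy frequencies invented for illustration (not TKY marker data).
print(locus_stats([0.5, 0.3, 0.2]))
print(inbreeding_coefficient(0.5, 0.625))
```

A positive FIS, as reported in the abstract, indicates fewer heterozygotes than expected under random mating.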

Putative Digenic GJB2/MYO7A Inheritance of Hearing Loss Detected in a Patient with 48,XXYY Klinefelter Syndrome

Human Heredity ◽

10.1159/000516854 ◽

2021 ◽

pp. 1-7

Author(s):

Qin Zhang ◽

Tiantian Qin ◽

Wenmu Hu ◽

Muhammad Usman Janjua ◽

Ping Jin

Keyword(s):

Hearing Loss ◽

Exome Sequencing ◽

Whole Exome Sequencing ◽

Klinefelter Syndrome ◽

Pathogenic Variants ◽

Digenic Inheritance ◽

Whole Exome ◽

Bioinformatics Software ◽

Pedigree Verification ◽

Hereditary Hearing Impairment

Objectives: Nonsyndromic hearing loss (NSHL) is the most frequent type of hereditary hearing impairment. Here, we explored the underlying genetic cause of NSHL in a three-generation family using whole-exome sequencing. The proband had concomitant NSHL and rare 48,XXYY Klinefelter syndrome. Material and Methods: Genomic DNA was extracted from the peripheral blood of the proband and their family members. Sanger sequencing and pedigree verification were performed on the pathogenic variants filtered by whole-exome sequencing. The function of the variants was analyzed using bioinformatics software. Results: The proband was digenic heterozygous for p.V37I in the GJB2 gene and p.L347I in the MYO7A gene. The proband’s mother had normal hearing and did not have any variant. The proband’s father and uncle both had NSHL and were compound for the GJB2 p.V37I and MYO7A p.L347I variants, thus indicating a possible GJB2/MYO7A digenic inheritance of NSHL. 48,XXYY Klinefelter syndrome was discovered in the proband after the karyotype analysis, while his parents both had normal karyotypes. Conclusions: Our findings reported a putative GJB2/MYO7A digenic inheritance form of hearing loss, expanding the genotype and phenotype spectrum of NSHL. In addition, this is the first report of concomitant NSHL and 48,XXYY syndrome.


A quantitative analysis of the similarities and differences of the HIV/FIV gag genome
