SeqEditor: an application for primer design and sequence analysis with or without GTF/GFF files

Author(s):  
Ahmed Hafez ◽  
Ricardo Futami ◽  
Amir Arastehfar ◽  
Farnaz Daneshnia ◽  
Ana Miguel ◽  
...  

Abstract Motivation Sequence analyses that investigate specific features, patterns and functions of protein and DNA/RNA sequences usually require tools with graphical interfaces whose main strengths are intuitiveness and interactivity with the user's expertise, especially when curation or primer design tasks are required. However, interface-based tools usually pose certain computational limitations when managing large sequences or complex datasets, such as genome and transcriptome assemblies. With these requirements in mind, we have developed SeqEditor, an interactive software tool for the analysis of nucleotide and protein sequences. Results SeqEditor is a cross-platform desktop application for the analysis of nucleotide and protein sequences. It is managed through a Graphical User Interface and can work either as a graphical sequence browser or as a FASTA task manager for multi-FASTA files. SeqEditor has been optimized for the management of large sequences, such as contigs, scaffolds or even chromosomes, and includes a GTF/GFF viewer to visualize and manage annotation files. In turn, this allows for content mining from reference genomes and transcriptomes with similar efficiency to that of command line tools. SeqEditor also incorporates a set of tools for singleplex and multiplex PCR primer design and pooling that uses a newly optimized and validated search strategy for target and species-specific primers. All these features make SeqEditor a flexible application that can be used to analyze complex sequences, design primers for diagnostic PCR assays, and/or manage, edit and personalize reference sequence datasets. Availability and implementation SeqEditor was developed in Java using Eclipse Rich Client Platform and is publicly available at https://gpro.biotechvana.com/download/SeqEditor as binaries for Windows, Linux and Mac OS. The user manual and tutorials are available online at https://gpro.biotechvana.com/tool/seqeditor/manual.
Supplementary information Supplementary data are available at Bioinformatics online.

2020 ◽  
Vol 36 (9) ◽  
pp. 2690-2696
Author(s):  
Jarkko Toivonen ◽  
Pratyush K Das ◽  
Jussi Taipale ◽  
Esko Ukkonen

Abstract Motivation Position-specific probability matrices (PPMs, also called position-specific weight matrices) have been the dominating model for transcription factor (TF)-binding motifs in DNA. There is, however, increasing recent evidence of better performance of higher order models such as Markov models of order one, also called adjacent dinucleotide matrices (ADMs). ADMs can model dependencies between adjacent nucleotides, unlike PPMs. A modeling technique and software tool that would estimate such models simultaneously both for monomers and their dimers have been missing. Results We present an ADM-based mixture model for monomeric and dimeric TF-binding motifs and an expectation maximization algorithm MODER2 for learning such models from training data and seeds. The model is a mixture that includes monomers and dimers, built from the monomers, with a description of the dimeric structure (spacing, orientation). The technique is modular, meaning that the co-operative effect of dimerization is made explicit by evaluating the difference between expected and observed models. The model is validated using HT-SELEX and generated datasets, and by comparing to some earlier PPM and ADM techniques. The ADM models explain data slightly better than PPM models for 314 tested TFs (or their DNA-binding domains) from four families (bHLH, bZIP, ETS and Homeodomain), the ADM mixture models by MODER2 being the best on average. Availability and implementation Software implementation is available from https://github.com/jttoivon/moder2. Supplementary information Supplementary data are available at Bioinformatics online.
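
The difference between a PPM and an ADM described in the abstract above can be illustrated with a toy example. The following is a minimal sketch, not the MODER2 implementation: the function names `ppm_prob` and `adm_prob`, the two-position motif, and all probability values are assumptions chosen for illustration. The imagined training data consist only of the sites AC and GT in equal proportion.

```python
from math import prod

def ppm_prob(seq, ppm):
    """Probability of seq under a PPM: every position is independent."""
    return prod(ppm[i][base] for i, base in enumerate(seq))

def adm_prob(seq, initial, transitions):
    """Probability of seq under a first-order Markov model (ADM).

    initial[b] is the probability of base b at position 0;
    transitions[i][a][b] is the probability of base b at position i + 1
    given base a at position i.
    """
    p = initial[seq[0]]
    for i in range(1, len(seq)):
        p *= transitions[i - 1][seq[i - 1]][seq[i]]
    return p

# Hypothetical toy motif learned from the sites "AC" and "GT" only.
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
ppm = [
    {"A": 0.5, "C": 0.0, "G": 0.5, "T": 0.0},
    {"A": 0.0, "C": 0.5, "G": 0.0, "T": 0.5},
]
initial = ppm[0]
transitions = [{
    "A": {"A": 0.0, "C": 1.0, "G": 0.0, "T": 0.0},  # A is always followed by C
    "G": {"A": 0.0, "C": 0.0, "G": 0.0, "T": 1.0},  # G is always followed by T
    "C": uniform,  # never observed at position 0; placeholder values
    "T": uniform,  # never observed at position 0; placeholder values
}]

for site in ("AC", "GT", "AT"):
    print(site, ppm_prob(site, ppm), adm_prob(site, initial, transitions))
```

In this sketch the PPM assigns the unobserved site AT the same probability as the observed sites AC and GT, because it cannot represent the dependency between the two positions, whereas the ADM assigns AT a probability of zero.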


2018 ◽  
Vol 35 (15) ◽  
pp. 2686-2689
Author(s):  
Asa Thibodeau ◽  
Dong-Guk Shin

Abstract Summary Current approaches for pathway analyses focus on representing gene expression levels on graph representations of pathways and conducting pathway enrichment among differentially expressed genes. However, gene expression levels by themselves do not reflect the overall picture, as non-coding factors play an important role in regulating gene expression. To incorporate these non-coding factors into pathway analyses and to systematically prioritize genes in a pathway, we introduce a new software tool: Triangulation of Perturbation Origins and Identification of Non-Coding Targets (TriPOINT). TriPOINT is a pathway analysis tool, implemented in Java, that identifies the significance of a gene under a condition (e.g. a disease phenotype) by studying graph representations of pathways, analyzing upstream and downstream gene interactions and integrating non-coding regions that may be regulating gene expression levels. Availability and implementation The TriPOINT open source software is freely available at https://github.uconn.edu/ajt06004/TriPOINT under the GPL v3.0 license. Supplementary information Supplementary data are available at Bioinformatics online.


2018 ◽  
Vol 35 (14) ◽  
pp. 2492-2494
Author(s):  
Tania Cuppens ◽  
Thomas E Ludwig ◽  
Pascal Trouvé ◽  
Emmanuelle Genin

Abstract Summary When analyzing sequence data, genetic variants are usually considered one by one, taking no account of whether or not they are found in the same individual. However, variant combinations might be key players in some diseases, as variants that are neutral on their own can become deleterious when combined. GEMPROT is a new analysis tool that takes a phased VCF file and visualizes the consequences of the genetic variants on the protein. At the level of an individual, the program shows the variants on each of the two protein sequences and the Pfam functional protein domains. When data on several individuals are available, GEMPROT lists the haplotypes found in the sample and can compare the haplotype distributions between different sub-groups of individuals. By offering a global visualization of the gene with the genetic variants present, GEMPROT makes it possible to better understand the impact of combinations of genetic variants on the protein sequence. Availability and implementation GEMPROT is freely available at https://github.com/TaniaCuppens/GEMPROT. An on-line version is also available at http://med-laennec.univ-brest.fr/GEMPROT/. Supplementary information Supplementary data are available at Bioinformatics online.
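
The idea of deriving two protein sequences per individual from phased genotypes can be sketched as follows. This is an illustrative sketch only and not GEMPROT's code: the function names, the tuple layout of a variant, and the use of 0-based offsets into a coding sequence are assumptions (a real VCF file uses 1-based genomic coordinates and would need mapping onto the transcript first). Only single-nucleotide substitutions are handled.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(cds):
    """Translate a coding sequence whose length is a multiple of three."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))

def haplotype_proteins(cds, variants):
    """Return the two protein sequences implied by phased SNVs.

    variants: iterable of (offset, ref, alt, genotype), where offset is a
    0-based position in cds (an assumption of this sketch) and genotype is
    a phased string such as "0|1" or "1|1".
    """
    haplotypes = [list(cds), list(cds)]
    for offset, ref, alt, genotype in variants:
        if cds[offset] != ref:
            raise ValueError(f"reference mismatch at offset {offset}")
        for hap_index, allele in enumerate(genotype.split("|")):
            if allele == "1":
                haplotypes[hap_index][offset] = alt
    return tuple(translate("".join(h)) for h in haplotypes)

# Hypothetical example: two heterozygous SNVs on opposite haplotypes.
reference_cds = "ATGTCGGAA"          # Met-Ser-Glu
phased_variants = [
    (4, "C", "T", "0|1"),            # TCG -> TTG (Ser -> Leu) on haplotype 2
    (6, "G", "A", "1|0"),            # GAA -> AAA (Glu -> Lys) on haplotype 1
]
print(haplotype_proteins(reference_cds, phased_variants))
```

With phase information, the two substitutions end up on different protein copies; without it, one could not tell whether a single copy carries both changes.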


2018 ◽  
Vol 55 ◽  
pp. 612-624 ◽  
Author(s):  
Hamza Djelouat ◽  
Amine Ait Si Ali ◽  
Abbes Amira ◽  
Faycal Bensaali

2017 ◽  
Vol 15 (2) ◽  
pp. 140
Author(s):  
Yatnita Parama Cita ◽  
Dwi Hilda Putri

Tuberculosis (TB) is a serious disease worldwide. According to the WHO, an estimated 3 million or more people die every year as a result of this infectious disease. One factor that makes TB difficult to manage is chemotherapy that is not effective against Mycobacterium tuberculosis, the bacterium that causes TB. The effectiveness of treatment is often hampered by the emergence of resistance in M. tuberculosis to the chemotherapy agents given. Some studies have found that bacterial resistance may occur to more than one type of chemotherapy agent, which is known as multi-drug resistance (MDR). Mycobacterium tuberculosis develops resistance mechanisms that differ from those of other bacteria in general. In prokaryotes, resistance is generally due to the transfer of genetic material through plasmids, transposons and other elements. The reference sequence of the beta subunit of the M. tuberculosis RNAP protein (accession number NP_215181.1) and the M. tuberculosis rpoB gene (accession number NC_000962.3) were used to obtain preliminary information from the databases www.ncbi.nlm.gov and www.uniprot.org. Mutations were made according to several literature studies. The composition, profile, location and structure of the protein were analyzed using www.expasy.org, TMHMM and http://bioinf.cs.ucl.ac.uk/psipred. Primers were designed with Primer Design Program. The analysis of the mutation at codon 531 (Ser -> Leu) in the beta subunit of the M. tuberculosis RNAP protein indicates that the mutation changes some properties and the structure of the protein. These changes may affect the resistance of the bacterium to the antibiotic rifampicin. However, further analysis using docking techniques is needed.


Author(s):  
Mitchell J Sullivan ◽  
Nouri L Ben Zakour ◽  
Brian M Forde ◽  
Mitchell Stanton-Cook ◽  
Scott A Beatson

Contiguity is interactive software for the visualization and manipulation of de novo genome assemblies. Contiguity creates and displays information on contig adjacency, which is contextualized by the simultaneous display of a comparison between assembled contigs and a reference sequence. Where scaffolders allow unambiguous connections between contigs to be resolved into a single scaffold, Contiguity allows the user to create all potential scaffolds in ambiguous regions of the genome. This enables the resolution of novel sequence or structural variants from the assembly. In addition, Contiguity provides a sequencing- and assembly-agnostic approach for the creation of contig adjacency graphs. To maximize the number of contig adjacencies determined, Contiguity combines information from read pair mappings, sequence overlap and De Bruijn graph exploration. We demonstrate how highly sensitive graphs can be achieved using this method. Contig adjacency graphs allow the user to visualize potential arrangements of contigs in unresolvable areas of the genome. By combining adjacency information with comparative genomics, Contiguity provides an intuitive approach for exploring and improving sequence assemblies. It is also useful in guiding manual closure of long read sequence assemblies. Contiguity is an open source application, implemented using Python and the Tkinter GUI package, that can run on any Unix, OSX and Windows operating system. It has been designed and optimized for bacterial assemblies. Contiguity is available at http://mjsull.github.io/Contiguity .


Author(s):  
Mochammad Rajasa Mukti Negara ◽  
Ita Krissanti ◽  
Gita Widya Pradini

BACKGROUND Nucleocapsid (N) protein is one of the four structural proteins of SARS-CoV-2; it is known to be more conserved than the spike protein and is highly immunogenic. This study aimed to analyze the variation of the SARS-CoV-2 N protein sequences in ASEAN countries, including Indonesia. METHODS Complete sequences of SARS-CoV-2 N protein from each ASEAN country were obtained from the Global Initiative on Sharing All Influenza Data (GISAID), while the reference sequence was obtained from GenBank. All sequences collected from December 2019 to March 2021 were grouped into clades according to GISAID, and two representative isolates were chosen from each clade for the analysis. The sequences were aligned with MUSCLE, and phylogenetic trees were built using MEGA-X software based on the nucleotide and translated amino acid (AA) sequences. RESULTS A total of 98 isolates of complete N protein genes from ASEAN countries were analyzed. The nucleotides of all isolates were 97.5% conserved. Of 31 nucleotide changes, 22 led to AA substitutions; thus, the AA sequences were 94.5% conserved. The phylogenetic trees of the nucleotide and AA sequences show similar branches. Nucleotide variations in clade O (C28311T); clade GR (28881–28883 GGG>AAC); and clade GRY (28881–28883 GGG>AAC and C28977T) lead to specific branches corresponding to the clade within both trees. CONCLUSIONS The N protein sequences of SARS-CoV-2 across ASEAN countries are highly conserved. Most isolates were closely related to the reference sequence originating from China, except the isolates representing clades O, GR, and GRY, which formed specific branches in the phylogenetic tree.
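
The abstract above does not state how the conservation percentages were computed. One common convention, shown here purely as an assumption, is to count the alignment columns in which every sequence carries the same residue. The function name `percent_conserved` and the toy alignment are hypothetical and are not taken from the study.

```python
def percent_conserved(aligned_sequences):
    """Percentage of alignment columns where all sequences agree.

    Assumes the sequences are already aligned to equal length; gap
    characters are compared like any other symbol in this sketch.
    """
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("all aligned sequences must have the same length")
    conserved = sum(1 for column in zip(*aligned_sequences) if len(set(column)) == 1)
    return 100.0 * conserved / length

# Toy alignment: only the last of the four columns varies.
toy_alignment = ["ACGT", "ACGA", "ACGT"]
print(percent_conserved(toy_alignment))  # 75.0
```

The same function applies to translated sequences, so nucleotide and amino acid conservation can be compared on the same alignment.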


2012 ◽  
Vol 45 (3) ◽  
pp. 164-169 ◽  
Author(s):  
Sebastián Dormido ◽  
Manuel Beschi ◽  
José Sánchez ◽  
Antonio Visioli

Author(s):  
Rastislav Róka

With the emerging applications and needs of ever increasing bandwidth, it is anticipated that the Next-Generation Passive Optical Network (NG-PON) with much higher bandwidth is a natural path forward to satisfy these demands and for network operators to develop valuable access networks. NG-PON systems present optical access infrastructures to support various applications of many service providers. Therefore, some general requirements for NG-PON networks are characterized and specified. Hybrid Passive Optical Networks (HPON) present a necessary phase of the future transition between PON classes with TDM or WDM multiplexing techniques utilized on the optical transmission medium – the optical fiber. Therefore, some specific requirements for HPON networks are characterized and presented. For developing hybrid passive optical networks, there exist various architectures and directions. They are also specified with emphasis on their basic characteristics and distinctions. Finally, the HPON network configurator as the interactive software tool is introduced in this chapter. Its main aim is helping users, professional workers, network operators and system analysts to design, configure, analyze, and compare various variations of possible hybrid passive optical networks. Some of the executed analysis is presented in detail.

