Working with DNA sequences and other character data.

Author(s):  
Donald L. J. Quicke ◽  
Buntika A. Butcher ◽  
Rachel A. Kruft Welton

Abstract This chapter describes the use of the statistical software environment R for manipulating nucleotide sequences, including complementing and reverse-complementing, transcribing DNA to RNA, and translating base sequences to amino acids. A few examples that introduce some more useful R functions are also included.
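
The sequence operations named in this abstract can be sketched in a few lines. The chapter itself works in R; the version below is a Python stand-in, not the authors' code, and all function names are hypothetical. It assumes an unambiguous A/C/G/T alphabet, the standard genetic code, and the common simplification that "transcription" means replacing T with U on the coding strand.

```python
# Illustrative sketch only: the chapter uses R, and these helper names are hypothetical.

BASES = "TCAG"
# Standard genetic code, one letter per codon, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(dna):
    """Return the base-by-base complement, keeping the original order."""
    return dna.translate(_COMPLEMENT)


def reverse_complement(dna):
    """Return the complement read in the opposite direction."""
    return complement(dna)[::-1]


def transcribe(dna):
    """Convert a coding-strand DNA string to RNA by replacing T with U."""
    return dna.replace("T", "U").replace("t", "u")


def translate(dna):
    """Translate DNA in reading frame 1; unknown codons become 'X'."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3  # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3)
    )
```

For example, `reverse_complement("ATGC")` gives `"GCAT"` and `translate("ATGGCCTAA")` gives `"MA*"`, where `*` marks a stop codon.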


2021 ◽  
Vol 10 (7) ◽  
pp. 1467
Author(s):  
Olga Begou ◽  
Antigoni Pavlaki ◽  
Olga Deda ◽  
Alexander Bollenbach ◽  
Kathrin Drabert ◽  
...  

Congenital anomalies of the urinary tract, particularly obstructive nephropathy such as ureteropelvic junction obstruction (UPJO) in infants, can later lead to chronic kidney disease and hypertension. Fundamental questions regarding underlying mechanisms remain unanswered. The aim of the present study was to quantitate the systemic amino acids metabolome in 21 UPJO infants requiring surgery (Group A) and 21 UPJO infants under conservative treatment (Group B). Nineteen healthy age-matched infants served as controls (Group C). Serum amino acids involved in several pathways and representative metabolites, including the L-arginine-derived nitric oxide (NO) metabolites nitrite and nitrate and the lipid peroxidation biomarker malondialdehyde (MDA), were measured by gas chromatography–mass spectrometry (GC–MS) methods using their stable-isotope labeled analogs as internal standards after derivatization to their methyl esters N-pentafluoropropionic amides (amino acids) and to their pentafluorobenzyl derivatives (nitrite, nitrate, MDA). The concentrations of the majority of the biomarkers were found to be lower in Group A compared to Group B. Statistical analysis revealed clear differentiation between the examined study groups. Univariate statistical analysis highlighted serum homoarginine (q = 0.006), asymmetric dimethylarginine (q = 0.05) and malondialdehyde (q = 0.022) as potential biomarkers for UPJO infants requiring surgery. Group A also differed from Group B with respect to the diameter of the preoperative anterior–posterior renal pelvis (AP) as well as regarding the number and extent of inverse correlations between AP and the serum concentrations of the biomarkers. In Group A, but not in Group B, the AP diameter strongly correlated with hydroxy-proline (r = −0.746, p = 0.0002) and MDA (r = −0.754, p = 0.002). Our results indicate a diminished amino acids metabolome in the serum of UPJO infants requiring surgery compared with the conservatively treated group.


Foods ◽  
2021 ◽  
Vol 10 (6) ◽  
pp. 1377
Author(s):  
Song-Hui Soung ◽  
Sunmin Lee ◽  
Seung-Hwa Lee ◽  
Hae-Jin Kim ◽  
Na-Rae Lee ◽  
...  

Numerous varieties of doenjang are manufactured by many food companies using different ingredients and fermentation processes, and thus qualities such as taste and flavor vary widely. Therefore, in this study, we compared many products, specifically 19 traditional doenjang (TD) and 17 industrial doenjang (ID). Subsequently, we performed non-targeted metabolite profiling and multivariate statistical analysis to discover distinct metabolites in the two types of doenjang. Amino acids, organic acids, isoflavone aglycones, non-DDMP (2,3-dihydro-2,5-dihydroxy-6-methyl-4H-pyran-4-one) soyasaponins, hydroxyisoflavones, and biogenic amines were relatively abundant in TD. On the contrary, contents of dipeptides, lysophospholipids, isoflavone glucosides and DDMP-conjugated soyasaponin, precursors of the above-mentioned metabolites, were comparatively higher in ID. We also observed relatively higher antioxidant, protease, and β-glucosidase activities in TD. Our results may provide valuable information on doenjang to consumers and manufacturers, which can be used while selecting and developing new products.


2009 ◽  
Vol 43 (1) ◽  
pp. 203-205 ◽  
Author(s):  
Chetan Kumar ◽  
K. Sekar

The identification of sequence (amino acids or nucleotides) motifs in a particular order in biological sequences has proved to be of interest. This paper describes a computing server, SSMBS, which can locate and display the occurrences of user-defined biologically important sequence motifs (a maximum of five) present in a specific order in protein and nucleotide sequences. While the server can efficiently locate motifs specified using regular expressions, it can also find occurrences of long and complex motifs. The computation is carried out by an algorithm developed using the concepts of quantifiers in regular expressions. The web server is available to users around the clock at http://dicsoft1.physics.iisc.ernet.in/ssmbs/.
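
The idea of locating regular-expression motifs that must occur in a given order can be illustrated with a minimal sketch. This is not the SSMBS algorithm, which the abstract does not specify beyond its use of quantifiers; the function name, the greedy left-to-right strategy, and the example sequence are all assumptions made for illustration. The sketch returns only the first chain of non-overlapping hits it finds and does not enumerate every occurrence.

```python
import re


def find_ordered_motifs(sequence, motifs):
    """Find one chain of non-overlapping motif hits occurring in the given order.

    Hypothetical helper: each motif is a regular expression, searched from
    the end of the previous hit. Returns a list of
    (pattern, start, end, matched_text) tuples, or None if some motif
    cannot be found after the previous one.
    """
    hits = []
    position = 0
    for pattern in motifs:
        match = re.compile(pattern).search(sequence, position)
        if match is None:
            return None
        hits.append((pattern, match.start(), match.end(), match.group()))
        position = match.end()  # the next motif must start after this hit
    return hits


# Made-up protein fragment and two made-up motifs.
protein = "MKTAYIAKQRGSTNGTWCC"
print(find_ordered_motifs(protein, ["K[TQ]", "N[^P][ST]"]))
# [('K[TQ]', 1, 3, 'KT'), ('N[^P][ST]', 13, 16, 'NGT')]
print(find_ordered_motifs(protein, ["N[^P][ST]", "K[TQ]"]))
# None
```

Because each search takes the leftmost hit with greedy quantifiers, a pattern that matches a long span can hide a valid chain; a fuller implementation would need backtracking over alternative hits.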


2021 ◽  
Author(s):  
Chitral Chatterjee ◽  
Soneya Majumdar ◽  
Sachin Deshpande ◽  
Deepak Pant ◽  
Saravanan Matheshwaran

The transcriptional repressor LexA regulates the “SOS” response, an indispensable bacterial DNA damage repair machinery. Compared to its E. coli ortholog, LexA from Mycobacterium tuberculosis (Mtb) possesses a unique N-terminal extension of 24 additional amino acids in its DNA binding domain (DBD) and an 18-amino-acid insertion at its hinge region that connects the DBD to the C-terminal dimerization/autoproteolysis domain. Despite the importance of LexA in “SOS” regulation, Mtb LexA remains poorly characterized and the functional importance of its additional amino acids remains elusive. In addition, the lack of data on kinetic parameters of Mtb LexA-DNA interaction prompted us to perform kinetic analyses of Mtb LexA and its deletion variants using Bio-layer Interferometry (BLI). Mtb LexA is seen to bind to different “SOS” boxes, DNA sequences present in the operator regions of damage-inducible genes, with comparable nanomolar affinity. Deletion of the 18 amino acids from the linker region is found to affect DNA binding, unlike deletion of the N-terminal stretch of 24 extra amino acids. The conserved RKG motif has been found to be critical for DNA binding. Overall, this study provides insights into the kinetics of the interaction between Mtb LexA and its target “SOS” boxes. The kinetic parameters obtained for DNA binding of Mtb LexA would be instrumental in clearly understanding the mechanism of “SOS” regulation and activation in Mtb.


1997 ◽  
Vol 61 (4) ◽  
pp. 393-410
Author(s):  
M T Gallegos ◽  
R Schleif ◽  
A Bairoch ◽  
K Hofmann ◽  
J L Ramos

The AraC/XylS family of prokaryotic positive transcriptional regulators includes more than 100 proteins and polypeptides derived from open reading frames translated from DNA sequences. Members of this family are widely distributed and have been found in the gamma subgroup of the proteobacteria, low- and high-G + C-content gram-positive bacteria, and cyanobacteria. These proteins are defined by a profile that can be accessed from PROSITE PS01124. Members of the family are about 300 amino acids long and have three main regulatory functions in common: carbon metabolism, stress response, and pathogenesis. Multiple alignments of the proteins of the family define a conserved stretch of 99 amino acids usually located at the C-terminal region of the regulator and connected to a nonconserved region via a linker. The conserved stretch contains all the elements required to bind DNA target sequences and to activate transcription from cognate promoters. Secondary analysis of the conserved region suggests that it contains two potential alpha-helix-turn-alpha-helix DNA binding motifs. The first, and better-fitting, motif is supported by biochemical data, whereas existing biochemical data neither support nor refute the proposal that the second region possesses this structure. The phylogenetic relationship suggests that members of the family have recruited the nonconserved domain(s) into a series of existing domains involved in DNA recognition and transcription stimulation and that this recruited domain governs the role that the regulator carries out. For some regulators, it has been demonstrated that the nonconserved region contains the dimerization domain. For the regulators involved in carbon metabolism, the effector binding determinants are also in this region. Most regulators belonging to the AraC/XylS family recognize multiple binding sites in the regulated promoters. One of the motifs usually overlaps or is adjacent to the -35 region of the cognate promoters. Footprinting assays have suggested that these regulators protect a stretch of up to 20 bp in the target promoters, and multiple alignments of binding sites for a number of regulators have shown that the proteins recognize short motifs within the protected region.


2020 ◽  
Author(s):  
Kuba Nowak ◽  
Paweł Błażej ◽  
Małgorzata Wnetrzak ◽  
Dorota Mackiewicz ◽  
Paweł Mackiewicz

Reprogramming of the standard genetic code in order to include non-canonical amino acids (ncAAs) opens a new perspective in medicine, industry and biotechnology. There are several methods of engineering the code, which allow us to store new genetic information in DNA sequences and transmit it into the protein world. Here, we investigate the problem of optimal genetic code extension from a theoretical perspective. We assume that the new coding system should encode both canonical amino acids and new ncAAs using the 64 classical codons. Moreover, the extended genetic code should be robust to point nucleotide mutations and minimize the possibility of reversion from new to old information. To this end, we use graph theory to study the properties of optimal codon sets, which can encode the 20 canonical amino acids and the stop coding signal. Finally, we describe the set of vacant codons that could be assigned to new amino acids. We also discuss the optimal number of newly incorporated ncAAs and the optimal size of the codon blocks assigned to ncAAs.
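
The notion of robustness to point mutations can be made concrete by treating codons as nodes of a graph in which two codons are joined when they differ at exactly one position. The sketch below is only an illustration of that graph on the standard genetic code, not the authors' optimization method; the function names and the choice of "fraction of synonymous single-nucleotide substitutions" as a summary statistic are assumptions.

```python
BASES = "TCAG"
# Standard genetic code, one letter per codon, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def point_neighbours(codon):
    """Return the 9 codons that differ from `codon` at exactly one position."""
    neighbours = []
    for position in range(3):
        for base in BASES:
            if base != codon[position]:
                neighbours.append(codon[:position] + base + codon[position + 1:])
    return neighbours


def synonymous_fraction(table=CODON_TABLE):
    """Fraction of single-nucleotide substitutions that keep the encoded meaning.

    Hypothetical summary statistic: counts every (codon, neighbour) pair,
    including stop codons, and reports how often both map to the same symbol.
    """
    same = 0
    total = 0
    for codon, meaning in table.items():
        for neighbour in point_neighbours(codon):
            total += 1
            if table[neighbour] == meaning:
                same += 1
    return same / total


print(point_neighbours("ATG"))
print(round(synonymous_fraction(), 3))
```

Passing a modified `table` in which some codons are reassigned to a new symbol would give a rough way to compare candidate extended codes under the same statistic.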


Genetics ◽  
1994 ◽  
Vol 138 (1) ◽  
pp. 227-234 ◽  
Author(s):  
D L Hartl ◽  
E N Moriyama ◽  
S A Sawyer

Abstract The patterns of nonrandom usage of synonymous codons (codon bias) in enteric bacteria were analyzed. Poisson random field (PRF) theory was used to derive the expected distribution of frequencies of nucleotides differing from the ancestral state at aligned sites in a set of DNA sequences. This distribution was applied to synonymous nucleotide polymorphisms and amino acid polymorphisms in the gnd and putP genes of Escherichia coli. For the gnd gene, the average intensity of selection against disfavored synonymous codons was estimated as approximately 7.3 × 10^-9; this value is significantly smaller than the estimated selection intensity against selectively disfavored amino acids in observed polymorphisms (2.0 × 10^-8), but it is approximately of the same order of magnitude. The selection coefficients for optimal synonymous codons estimated from PRF theory were consistent with independent estimates based on codon usage for threonine and glycine. Across 118 genes in E. coli and Salmonella typhimurium, the distribution of estimated selection coefficients, expressed as multiples of the effective population size, has a mean and standard deviation of 0.5 ± 0.4. No significant differences were found in the degree of codon bias between conserved positions and replacement positions, suggesting that translational misincorporation is not an important selective constraint among synonymous polymorphic codons in enteric bacteria. However, across the first 100 codons of the genes, conserved amino acids with identical codons have significantly greater codon bias than that of either synonymous or nonidentical codons, suggesting that there are unique selective constraints, perhaps including mRNA secondary structures, in this part of the coding region.


1986 ◽  
Vol 6 (2) ◽  
pp. 645-652 ◽  
Author(s):  
J Geliebter ◽  
R A Zeff ◽  
D H Schulze ◽  
L R Pease ◽  
E H Weiss ◽  
...  

Genetic interaction as a mechanism for the generation of mutations is suggested by recurrent, multiple nucleotide substitutions that are identical to nucleotide sequences elsewhere in the genome. We have sequenced the mutant K gene from the bm6 mouse, which is one of a series of eight closely related, yet independently occurring mutants known collectively as the "bg series." Two changes from the Kb gene are found, positioned 15 nucleotides apart: an A-to-T change and a T-to-C change in the codons corresponding to amino acids 116 and 121, resulting in Tyr-to-Phe and Cys-to-Arg substitutions, respectively. Hybridization analysis with an oligonucleotide specific for the altered Kbm6 sequence identifies one donor gene, Q4, located in the Qa region of the H-2 complex. The two altered nucleotides that differentiate Kbm6 and Kb are present in Q4 in a region where Kb and Q4 are otherwise identical for 95 nucleotides, delineating the maximum genetic transfer between the two genes. Because the Kbm6 mutation arose in a homozygous mouse, these data indicate that the Q4 gene contains the only donor sequence and demonstrate that Q-region gene sequences can interact with the Kb gene to generate variant K molecules.

