perl script
Recently Published Documents



2021 ◽  
Author(s):  
Loïc Meunier ◽  
Denis Baurain ◽  
Luc Cornet

Summary: To support small and large-scale genome annotation projects, we present AMAW (Automated MAKER2 Annotation Wrapper), a program devised to annotate non-model unicellular eukaryotic genomes by automating the acquisition of evidence data (transcripts and proteins) and facilitating the use of MAKER2, a widely adopted software suite for the annotation of eukaryotic genomes. Moreover, AMAW exists as a Singularity container recipe easy to deploy on a grid computer, thereby overcoming the tricky installation of MAKER2. Availability: AMAW is released both as a Singularity container recipe and a standalone Perl script (https://bitbucket.org/phylogeno/amaw/). Contact: [email protected] or [email protected]. Supplementary information: Supplementary data are available at Bioinformatics online.


2021 ◽  
Vol 6 (2) ◽  
pp. 31-38
Author(s):  
Duy Long Ta ◽  
Huy Hiep Nguyen ◽  
Tuan Khai Nguyen ◽  
Vinh Thanh Tran ◽  
Huu Tiep Nguyen

This paper presents a computational scheme that uses MCNP5 and COBRA-EN for coupled neutronics/thermal-hydraulics calculations of a VVER-1000 fuel assembly. A master program was written in the Perl scripting language to build the corresponding inputs for the MCNP5 and COBRA-EN calculations and to manage the coupling scheme. Hexagonal coolant channels were used in the COBRA-EN thermal-hydraulics model to simplify the coupling scheme. The results of two successive iterations are compared against an assigned convergence criterion, and the iteration loop terminates once the criterion is satisfied. Numerical calculations were performed for a UO2 fuel assembly of the VVER-1000 reactor.
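The iteration logic described in this abstract can be sketched as a fixed-point loop. This is a minimal Python sketch, not the authors' Perl master program: `run_neutronics` and `run_thermal_hydraulics` are hypothetical stand-ins for building inputs and running MCNP5 and COBRA-EN, and their toy formulas, the tolerance, and the node count are assumptions chosen only so the loop has something to converge on.

```python
def run_neutronics(coolant_temps):
    """Hypothetical stand-in for an MCNP5 run: relative power per axial node."""
    # Toy feedback (assumption): hotter coolant slightly lowers local power.
    raw = [1.0 - 0.001 * (t - 560.0) for t in coolant_temps]
    mean = sum(raw) / len(raw)
    return [p / mean for p in raw]  # normalise to a mean power of 1


def run_thermal_hydraulics(power, inlet_temp=560.0, rise_per_unit=10.0):
    """Hypothetical stand-in for a COBRA-EN run: coolant temperature per node."""
    temps, t = [], inlet_temp
    for p in power:
        t += rise_per_unit * p  # coolant heats up as it passes each node
        temps.append(t)
    return temps


def couple(n_nodes=5, tol=1e-4, max_iter=50):
    """Alternate the two solvers until successive power profiles agree within tol."""
    temps = [560.0] * n_nodes
    power = run_neutronics(temps)
    for iteration in range(1, max_iter + 1):
        temps = run_thermal_hydraulics(power)
        new_power = run_neutronics(temps)
        # Compare the results of two successive iterations.
        change = max(abs(a - b) for a, b in zip(new_power, power))
        power = new_power
        if change < tol:  # convergence criterion satisfied: leave the loop
            return power, temps, iteration
    raise RuntimeError("coupling did not converge within max_iter iterations")
```

Calling `couple()` returns the converged power profile, the coolant temperatures, and the number of iterations used. In a real driver the two stand-in functions would write input decks, launch the external codes, and parse their outputs.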


2021 ◽  
Author(s):  
Kyle Fletcher ◽  
Rongkui Han ◽  
Diederik Smilde ◽  
Richard Michelmore

Polyploidy and heterokaryosis are common and consequential genetic phenomena that increase the number of haplotypes in an organism and complicate whole-genome sequence analysis. Allele balance has been used to infer polyploidy and heterokaryosis in diverse organisms using read sets sequenced to greater than 50x whole-genome coverage. However, sequencing to adequate depth is costly if applied to multiple individuals or large genomes. We developed VCFvariance.pl to utilize the variance of allele balance to infer polyploidy and/or heterokaryosis at low sequence coverage. This analysis requires as little as 10x whole-genome coverage and reduces the allele balance profile down to a single value, which can be used to determine if an individual has two or more haplotypes. This approach was validated on simulated, synthetic, and authentic read sets from an oomycete, fungus, and plant. The approach was deployed to ascertain the genome status of multiple isolates of Bremia lactucae and Phytophthora infestans. VCFvariance.pl is a Perl script available at https://github.com/kfletcher88/VCFvariance.
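The idea of collapsing an allele balance profile to one number can be illustrated with a short Python sketch. It is not the VCFvariance.pl implementation: the input format (a list of reference and alternate read depths per heterozygous site), the definition of allele balance as the alternate-read fraction, and the toy data are all assumptions made for illustration.

```python
from statistics import pvariance


def allele_balance(ref_depth, alt_depth):
    """Fraction of reads at a site that support the alternate allele."""
    total = ref_depth + alt_depth
    if total == 0:
        raise ValueError("site has no reads")
    return alt_depth / total


def allele_balance_variance(sites):
    """Reduce a list of (ref_depth, alt_depth) pairs to a single variance value."""
    return pvariance([allele_balance(ref, alt) for ref, alt in sites])


# Toy ~10x sites (assumed data): balances near 1/2 versus near 1/3 and 2/3.
two_haplotype_like = [(5, 5), (6, 4), (4, 6), (5, 5)]
three_haplotype_like = [(7, 3), (3, 7), (6, 3), (3, 6)]

print(allele_balance_variance(two_haplotype_like))
print(allele_balance_variance(three_haplotype_like))
```

In this toy data the second set has the larger variance because its balances sit away from 0.5. Any threshold for calling more than two haplotypes would have to come from the published method, not from this sketch.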


2019 ◽  
Vol 4 (1) ◽  
pp. 16-26
Author(s):  
Iman Hazwam Bin Abd Halim ◽  
Nur Muhammad Irfan Bin Abu Hassan ◽  
Tajul Rosli Razak ◽  
Muhammad Nabil Fikri bin Jamaluddin ◽  
Mohammad Hafiz Bin Ismail

A honeypot is a decoy computer system used to attract and monitor hackers' activities in a network. The honeypot aims to collect information about the hackers in order to create a more secure system. However, the log file generated by a honeypot can grow very large when heavy traffic occurs in the system, such as during a Distributed Denial of Service (DDoS) attack. Such a log is difficult for the network administrator to process and analyze, as doing so requires a lot of time and resources. Therefore, in this paper, we propose an approach that decreases the log size by using a Cron job that runs a Perl script. This approach periodically parses the collected data into a database to decrease the log size. Three DDoS attack cases were conducted in this study to show the growth of the log size by sending a different number of packets per second for 8 hours in each case. The results show that by using the Cron job with the Perl script, the log size is significantly reduced and the disk space used by the system also decreases. Consequently, this approach is capable of speeding up the process of parsing the log file into the database, thus improving overall system performance. This study contributes a pathway for reducing honeypot log storage using a Cron job with a Perl script.
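The periodic parse-and-shrink step described in this abstract might look roughly like the following. This is a sketch in Python with SQLite rather than the authors' Perl script: the three-field log format (timestamp, source IP, destination port), the table layout, and the function name `ingest_log` are assumptions, since the abstract does not specify them.

```python
import sqlite3


def ingest_log(log_path, db_path):
    """Move parsed log lines into SQLite, then empty the log file.

    Assumed line format (hypothetical): "<timestamp> <src_ip> <dst_port>".
    Returns the number of rows stored.
    """
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(
            "CREATE TABLE IF NOT EXISTS events "
            "(ts TEXT, src_ip TEXT, dst_port INTEGER)"
        )
        rows = []
        with open(log_path, "r", encoding="utf-8") as fh:
            for line in fh:
                parts = line.split()
                if len(parts) != 3 or not parts[2].isdigit():
                    continue  # skip malformed lines
                rows.append((parts[0], parts[1], int(parts[2])))
        conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
        conn.commit()
    finally:
        conn.close()
    # Truncate only after the commit succeeded, so a failure keeps the log.
    open(log_path, "w").close()
    return len(rows)
```

A cron entry would then call a small wrapper around `ingest_log` at a fixed interval. Lines written between the read and the truncation would be lost in this simple version, so a production script would rotate the file first or lock it.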


2018 ◽  
Vol 11 (1) ◽  
Author(s):  
Thomas Sauvage ◽  
Sophie Plouviez ◽  
William E. Schmidt ◽  
Suzanne Fredericq

2016 ◽  
Vol 3 (2) ◽  
pp. 207 ◽  
Author(s):  
Asheesh Shanker

Simple sequence repeats (SSRs) consist of short repeat motifs of 1-6 nucleotides and are found in DNA sequences. The present study was conducted to detect SSRs in the chloroplast genome of Tetraphis pellucida (Accession number: NC_024291), downloaded from the National Center for Biotechnology Information (NCBI). The sequence was mined with the help of MISA, a Perl script, to detect SSRs. The length of SSRs was defined as ≥12 for mono-, di-, tri- and tetranucleotide, ≥15 for pentanucleotide and ≥18 for hexanucleotide repeats. In total, 41 perfect microsatellites were identified in the 127.489 kb sequence mined. An average length of 13.56 bp was calculated for the mined SSRs, with a density of 1 SSR/3.04 kb. Depending on the repeat unit, the length of SSRs ranged from 12 to 20 nt. Dinucleotides (14, 34.15%) were the most frequent repeat type, followed by tetranucleotide (10, 24.39%), trinucleotide (7, 17.07%), mononucleotide (6, 14.63%) and pentanucleotide (4, 9.76%) repeats. Hexanucleotide repeats were completely absent from the chloroplast genome of Tetraphis pellucida. The mined SSRs can be used to develop molecular markers and in genetic diversity studies of Tetraphis species.
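The length thresholds stated in this abstract can be turned into a small perfect-SSR scanner. The Python sketch below uses regular expressions and is only an illustration of the thresholds; it does not reproduce MISA's algorithm or output format, and the function names are hypothetical.

```python
import re

# Minimum SSR length in nucleotides per motif size, as given in the abstract.
MIN_LENGTH = {1: 12, 2: 12, 3: 12, 4: 12, 5: 15, 6: 18}


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n)
    )


def find_ssrs(sequence):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for k, min_len in MIN_LENGTH.items():
        min_repeats = -(-min_len // k)  # ceiling division
        # One motif copy captured, then at least (min_repeats - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if is_primitive(motif):  # avoid reporting 'AAAA...' as a 'AA' repeat
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


demo = "GG" + "A" * 12 + "CC" + "AT" * 6 + "GG" + "ACG" * 3 + "TT"
print(find_ssrs(demo))  # → [(2, 'A', 12), (16, 'AT', 6)]
```

The `ACG` run in the demo sequence is only 9 nt long, below the 12 nt threshold for trinucleotides, so it is not reported. Start positions are 0-based.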


2014 ◽  
Author(s):  
Kevin Weitemier ◽  
Shannon C. K. Straub ◽  
Mark Fishbein ◽  
Aaron Liston

Despite knowledge that concerted evolution of high-copy loci is often imperfect, few studies investigate the extent of intragenomic polymorphisms, and comparisons across a large number of species are rarely made. We present a bioinformatic pipeline for characterizing polymorphisms within an individual among copies of a high-copy locus. Results are presented for nuclear ribosomal DNA (nrDNA) across the milkweed genus, Asclepias. The 18S-26S portion of the nrDNA cistron of Asclepias syriaca served as a reference for assembly of the region from 124 samples representing 90 species of Asclepias. Reads were mapped back to each individual's consensus, and at each position reads differing from the consensus were tallied using a custom Perl script. Low frequency polymorphisms existed in all individuals (mean = 5.8%). Most nrDNA positions (91%) were polymorphic in at least one individual, with polymorphic sites being less frequent in subunit regions and loops. Highly polymorphic sites existed in each individual, with highest abundance in the “noncoding” ITS regions. Phylogenetic signal was present in the distribution of intragenomic polymorphisms across the genus. Intragenomic polymorphisms in nrDNA are common in Asclepias, being found at a higher frequency than in any other study to date. The high and variable frequency of polymorphisms across species highlights concerns that phylogenetic applications of nrDNA may be error-prone. The new analytical approach provided here is applicable to other taxa and other high-copy regions characterized by low coverage genome sequencing (genome skimming).
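The per-position tally described in this abstract can be sketched as follows. This is a simplified Python illustration, not the authors' custom Perl script: reads are assumed to be ungapped strings with a known 0-based start on the consensus, and the `min_fraction` cutoff is a hypothetical parameter. A real pipeline would read alignments from a mapper's output.

```python
def tally_differences(consensus, reads):
    """Count read depth and consensus mismatches at every consensus position.

    reads: iterable of (start, sequence) pairs, ungapped, with start >= 0.
    """
    depth = [0] * len(consensus)
    diffs = [0] * len(consensus)
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos >= len(consensus):
                break  # read runs past the end of the consensus
            depth[pos] += 1
            if base != consensus[pos]:
                diffs[pos] += 1
    return depth, diffs


def polymorphic_sites(consensus, reads, min_fraction=0.02):
    """Map position -> fraction of differing reads, for sites at or above the cutoff."""
    depth, diffs = tally_differences(consensus, reads)
    return {
        pos: diffs[pos] / depth[pos]
        for pos in range(len(consensus))
        if depth[pos] and diffs[pos] / depth[pos] >= min_fraction
    }


reads = [(0, "ACGT"), (0, "ACCT"), (2, "GTAC"), (2, "GTAC")]
print(polymorphic_sites("ACGTAC", reads))  # → {2: 0.25}
```

In the example, position 2 is covered by four reads, one of which carries `C` instead of the consensus `G`, giving a differing-read fraction of 0.25.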


2014 ◽  
Vol 3 (3) ◽  
pp. 50-58
Author(s):  
Asheesh Shanker

Microsatellites, also known as simple sequence repeats (SSRs), are found in DNA sequences. These repeats consist of short motifs of 1-6 bp and play an important role in population genetics, phylogenetics and the development of molecular markers. In this study, chloroplastic SSRs (cpSSRs) were detected in the chloroplast genome of Ptilidium pulcherrimum, downloaded from the National Center for Biotechnology Information (NCBI). The chloroplast genome sequence of P. pulcherrimum was mined with the help of a Perl script named MISA. A total of 23 perfect cpSSRs were detected in the 119.007 kb sequence mined, a density of 1 SSR/5.17 kb. Depending on the repeat unit, the length of SSRs was found to be 12 bp for mono- and trinucleotide, 12 to 22 bp for dinucleotide and 12 to 16 bp for tetranucleotide repeats. Penta- and hexanucleotide repeats were completely absent from the chloroplast genome of P. pulcherrimum. Dinucleotide repeats were the most frequent repeat type (47.83%), followed by tri- (21.74%) and tetranucleotide (21.74%) repeats. Out of 23 SSRs detected, PCR primers were successfully designed for 22 (95.65%) cpSSRs. DOI: http://dx.doi.org/10.3126/ije.v3i3.11063 International Journal of Environment Vol.3(3) 2014: 50-58

