perl script Latest Research Papers

AMAW: automated gene annotation for non-model eukaryotic genomes

10.1101/2021.12.07.471566 ◽

2021 ◽

Author(s):

Loïc Meunier ◽

Denis Baurain ◽

Luc Cornet

Keyword(s):

Genome Annotation ◽

Large Scale ◽

Gene Annotation ◽

Supplementary Information ◽

Supplementary Data ◽

Software Suite ◽

Perl Script ◽

Link Type ◽

Eukaryotic Genomes

Abstract. Summary: To support small and large-scale genome annotation projects, we present AMAW (Automated MAKER2 Annotation Wrapper), a program devised to annotate non-model unicellular eukaryotic genomes by automating the acquisition of evidence data (transcripts and proteins) and facilitating the use of MAKER2, a widely adopted software suite for the annotation of eukaryotic genomes. Moreover, AMAW exists as a Singularity container recipe that is easy to deploy on a grid computer, thereby overcoming the tricky installation of MAKER2. Availability: AMAW is released both as a Singularity container recipe and a standalone Perl script (https://bitbucket.org/phylogeno/amaw/). Contact: [email protected] or [email protected]. Supplementary information: Supplementary data are available at Bioinformatics online.

Coupled neutronics/thermal-hydraulics calculation of VVER-1000 fuel assembly

Nuclear Science and Technology ◽

10.53747/jnst.v6i2.153 ◽

2021 ◽

Vol 6 (2) ◽

pp. 31-38

Author(s):

Duy Long Ta ◽

Huy Hiep Nguyen ◽

Tuan Khai Nguyen ◽

Vinh Thanh Tran ◽

Huu Tiep Nguyen

Keyword(s):

Numerical Calculation ◽

Fuel Assembly ◽

Computational Scheme ◽

Convergence Criterion ◽

Coupling Scheme ◽

Thermal Hydraulics ◽

Script Language ◽

Master Program ◽

Perl Script

This paper presents a computational scheme using MCNP5 and COBRA-EN for coupled neutronics/thermal-hydraulics calculation of a VVER-1000 fuel assembly. A master program was written in the Perl scripting language to build the corresponding inputs for the MCNP5 and COBRA-EN calculations and to manage the coupling scheme. Hexagonal coolant channels were used in the COBRA-EN thermal-hydraulics model to simplify the coupling scheme. The results of two successive iterations are compared against an assigned convergence criterion, and the calculation loop is terminated when the criterion is satisfied. A numerical calculation was performed for a UO2 fuel assembly of the VVER-1000 reactor.
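The iterate-until-converged coupling loop described in this abstract can be sketched roughly as follows. This is only an illustration in Python, not the authors' Perl master program: the two solver functions are toy stand-ins (hypothetical) for the MCNP5 and COBRA-EN runs, and the relative-change test and tolerance are assumptions, since the abstract does not state which convergence criterion was used.

```python
from typing import Callable, List, Tuple

Vector = List[float]

def toy_neutronics(temps: Vector) -> Vector:
    # Hypothetical stand-in for a neutronics run: power falls as temperature rises.
    return [1.0 / (1.0 + 0.001 * (t - 500.0)) for t in temps]

def toy_thermal(power: Vector) -> Vector:
    # Hypothetical stand-in for a thermal-hydraulics run: temperature rises with power.
    return [500.0 + 100.0 * p for p in power]

def run_coupled(
    neutronics: Callable[[Vector], Vector],
    thermal: Callable[[Vector], Vector],
    initial_temps: Vector,
    tol: float = 1e-3,
    max_iter: int = 50,
) -> Tuple[Vector, Vector, int]:
    """Alternate the two solvers until successive power profiles agree within tol."""
    temps = list(initial_temps)
    prev_power = None
    for iteration in range(1, max_iter + 1):
        power = neutronics(temps)   # neutronics step uses the latest temperatures
        temps = thermal(power)      # thermal step uses the latest power profile
        if prev_power is not None:
            # Largest relative change between two successive iterations (assumed criterion).
            change = max(
                abs(p - q) / max(abs(q), 1e-12)
                for p, q in zip(power, prev_power)
            )
            if change < tol:
                return power, temps, iteration
        prev_power = power
    raise RuntimeError("coupling loop did not converge within max_iter iterations")
```

In a real workflow each solver call would write an input deck, launch the external code, and parse its output; the loop structure itself would stay the same.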

Variance of allele balance calculated from low coverage sequencing data infers departure from a diploid state.

10.1101/2021.09.14.460322 ◽

2021 ◽

Author(s):

Kyle Fletcher ◽

Rongkui Han ◽

Diederik Smilde ◽

Richard Michelmore

Keyword(s):

Sequence Analysis ◽

Whole Genome Sequence ◽

Whole Genome ◽

Bremia Lactucae ◽

Sequencing Data ◽

Sequence Coverage ◽

Perl Script ◽

Genome Sequence Analysis ◽

Genome Coverage ◽

Low Coverage

Polyploidy and heterokaryosis are common and consequential genetic phenomena that increase the number of haplotypes in an organism and complicate whole-genome sequence analysis. Allele balance has been used to infer polyploidy and heterokaryosis in diverse organisms using read sets sequenced to greater than 50x whole-genome coverage. However, sequencing to adequate depth is costly if applied to multiple individuals or large genomes. We developed VCFvariance.pl to utilize the variance of allele balance to infer polyploidy and/or heterokaryosis at low sequence coverage. This analysis requires as little as 10x whole-genome coverage and reduces the allele balance profile down to a single value, which can be used to determine if an individual has two or more haplotypes. This approach was validated on simulated, synthetic, and authentic read sets from an oomycete, fungus, and plant. The approach was deployed to ascertain the genome status of multiple isolates of Bremia lactucae and Phytophthora infestans. VCFvariance.pl is a Perl script available at https://github.com/kfletcher88/VCFvariance.
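To make the idea of collapsing an allele balance profile into a single variance value concrete, here is a minimal Python sketch. It is not the VCFvariance.pl implementation: the input format (a list of reference/alternate read counts per heterozygous site), the depth filter, and the use of population variance are all assumptions made for illustration, and no decision threshold is implied.

```python
from statistics import pvariance
from typing import Iterable, List, Tuple

def allele_balances(sites: Iterable[Tuple[int, int]], min_depth: int = 10) -> List[float]:
    """Return alt/(ref+alt) for each site with both alleles seen and enough depth."""
    balances = []
    for ref_count, alt_count in sites:
        depth = ref_count + alt_count
        # Keep only sites where both alleles are observed (assumed heterozygous)
        # and the depth meets the assumed minimum.
        if depth >= min_depth and ref_count > 0 and alt_count > 0:
            balances.append(alt_count / depth)
    return balances

def allele_balance_variance(sites: Iterable[Tuple[int, int]], min_depth: int = 10) -> float:
    """Reduce the allele balance profile to a single value: its variance."""
    balances = allele_balances(sites, min_depth)
    if len(balances) < 2:
        raise ValueError("need at least two usable sites")
    return pvariance(balances)

# Hypothetical read counts: balances near 1/2 versus balances near 1/3 and 2/3.
diploid_like = [(5, 5), (6, 4), (4, 6), (5, 5)]
multi_haplotype_like = [(7, 3), (3, 7), (7, 3), (3, 7)]
```

On these made-up counts, the second profile spreads further from 0.5 and therefore yields the larger variance, which is the kind of signal the abstract describes.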

Scripts for Easier Use of Spice (SEUS): A Perl script package for simulating and creating batches of circuit netlists for Monte Carlo simulations when using Ngspice or Ngspice-based simulators

The Journal of Open Source Software ◽

10.21105/joss.02183 ◽

2020 ◽

Vol 5 (53) ◽

pp. 2183

Author(s):

Michael Turi

Keyword(s):

Monte Carlo ◽

Monte Carlo Simulations ◽

Perl Script

Reducing Honeypot Log Storage Capacity Consumption – Cron Job with Perl-Script Approach

Journal of Computing Research and Innovation ◽

10.24191/jcrinn.v4i1.114 ◽

2019 ◽

Vol 4 (1) ◽

pp. 16-26

Author(s):

Iman Hazwam Bin Abd Halim ◽

Nur Muhammad Irfan Bin Abu Hassan ◽

Tajul Rosli Razak ◽

Muhammad Nabil Fikri bin Jamaluddin ◽

Mohammad Hafiz Bin Ismail

Keyword(s):

Computer System ◽

System Performance ◽

Storage Capacity ◽

Heavy Traffic ◽

Perl Script ◽

Disk Space ◽

Ddos Attack ◽

Log File

A honeypot is a decoy computer system used to attract and monitor hackers’ activities in a network. It aims to collect information about the hackers in order to create a more secure system. However, the log file generated by a honeypot can grow very large when the system receives heavy traffic, such as during a Distributed Denial of Service (DDoS) attack. Such a log is difficult for the network administrator to process and analyze because doing so requires a lot of time and resources. Therefore, in this paper, we propose an approach that decreases the log size by using a Cron job that runs a Perl script. This approach periodically parses the collected data into a database to decrease the log size. Three DDoS attack cases were conducted in this study to show the growth of the log size, each sending a different number of packets per second for 8 hours. The results show that by utilizing the Cron job with the Perl script, the log size is significantly reduced and the disk space used in the system also decreases. Consequently, this approach is capable of speeding up the process of parsing the log file into the database, thus improving overall system performance. This study contributes a pathway for reducing honeypot log storage using a Cron job with a Perl script.
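The periodic "parse the log into a database, then shrink the log" step could look roughly like the Python sketch below. It is an illustration only, not the paper's Perl script: the three-field log format (timestamp, source IP, message), the SQLite table name `events`, and the function name are all hypothetical.

```python
import sqlite3
from pathlib import Path

def rotate_log_into_db(log_path: str, db_path: str) -> int:
    """Move parsed log lines into a SQLite table, then empty the log file."""
    log = Path(log_path)
    lines = log.read_text().splitlines() if log.exists() else []
    rows = []
    for line in lines:
        # Assumed format: "<timestamp> <source-ip> <free-text message>".
        parts = line.split(maxsplit=2)
        if len(parts) == 3:
            rows.append(tuple(parts))
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS events (ts TEXT, src_ip TEXT, message TEXT)"
            )
            conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    finally:
        conn.close()
    # Truncate the log only after the rows were committed.
    log.write_text("")
    return len(rows)
```

A scheduler such as cron would invoke this at a fixed interval. Note that lines written by the honeypot between the read and the truncation would be lost in this simple version, and lines that do not match the assumed format are discarded, so a production script would need file locking or log rotation by rename.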

TREE2FASTA: a flexible Perl script for batch extraction of FASTA sequences from exploratory phylogenetic trees

BMC Research Notes ◽

10.1186/s13104-018-3268-y ◽

2018 ◽

Vol 11 (1) ◽

Cited By ~ 3

Author(s):

Thomas Sauvage ◽

Sophie Plouviez ◽

William E. Schmidt ◽

Suzanne Fredericq

Keyword(s):

Phylogenetic Trees ◽

Perl Script ◽

Batch Extraction

Detection of simple sequence repeats in the chloroplast genome of Tetraphis pellucida Hedw.

Plant Science Today ◽

10.14719/pst.2016.3.2.206 ◽

2016 ◽

Vol 3 (2) ◽

pp. 207 ◽

Cited By ~ 1

Author(s):

Asheesh Shanker

Keyword(s):

Chloroplast Genome ◽

Simple Sequence Repeats ◽

Average Length ◽

Accession Number ◽

Repeat Type ◽

Perl Script ◽

Repeat Units ◽

Short Repeat ◽

Diversity Studies ◽

Simple Sequence

Simple sequence repeats (SSRs) consist of short repeat motifs of 1-6 nucleotides and are found in DNA sequences. The present study was conducted to detect SSRs in the chloroplast genome of Tetraphis pellucida (Accession number: NC_024291), downloaded from the National Center for Biotechnology Information (NCBI). The sequence was mined with the help of MISA, a Perl script, to detect SSRs. The length of SSRs was defined as ≥12 for mono-, di-, tri- and tetranucleotide, ≥15 for pentanucleotide and ≥18 for hexanucleotide repeats. In total, 41 perfect microsatellites were identified in the 127.489 kb sequence mined. An average length of 13.56 bp was calculated for mined SSRs with a density of 1 SSR/3.04 kb. Depending on the repeat units, the length of SSRs ranged from 12 to 20 nt. Dinucleotides (14, 34.15%) were the most frequent repeat type, followed by tetranucleotide (10, 24.39%), trinucleotide (7, 17.07%), mononucleotide (6, 14.63%) and pentanucleotide (4, 9.76%) repeats. Hexanucleotide repeats were completely absent in the chloroplast genome of Tetraphis pellucida. The mined SSRs can be used to develop molecular markers and in genetic diversity studies of Tetraphis species.
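The length thresholds in this abstract translate into minimum repeat counts per motif size (for example, a dinucleotide needs 6 copies to reach 12 bases). The Python sketch below shows one way to search for perfect SSRs under those thresholds with regular expressions. It is not MISA itself: the function names are hypothetical, and only perfect (uninterrupted) repeats are reported.

```python
import re
from typing import List, Tuple

# Motif length -> minimum copies, derived from the abstract's thresholds:
# >=12 bases for 1-4 nt motifs, >=15 for 5 nt, >=18 for 6 nt.
MIN_REPEATS = {1: 12, 2: 6, 3: 4, 4: 3, 5: 3, 6: 3}

def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )

def find_ssrs(seq: str) -> List[Tuple[int, str, int]]:
    """Return (1-based start, motif, copy number) for each perfect SSR found."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One motif copy captured, then at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        # Simplification: finditer yields non-overlapping matches per motif
        # length, so two adjacent repeats that share bases may not both be found.
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_primitive(motif):  # skip e.g. 'AA' x6, already counted as 'A' x12
                copies = len(match.group(0)) // unit_len
                hits.append((match.start() + 1, motif, copies))
    return sorted(hits)
```

For a made-up input such as `"GG" + "A" * 12 + "C" + "AT" * 6 + "G" + "AGC" * 3 + "T"`, the sketch reports the 12-base A run and the six AT copies, and skips the three AGC copies because 9 bases fall below the trinucleotide threshold.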

Intragenomic polymorphisms among high-copy loci: A genus-wide study of nuclear ribosomal DNA in Asclepias (Apocynaceae)

10.7287/peerj.preprints.512 ◽

2014 ◽

Author(s):

Kevin Weitemier ◽

Shannon C. K. Straub ◽

Mark Fishbein ◽

Aaron Liston

Keyword(s):

Ribosomal Dna ◽

Analytical Approach ◽

Phylogenetic Signal ◽

Low Frequency ◽

Nuclear Ribosomal Dna ◽

Asclepias Syriaca ◽

Custom Perl Script ◽

Bioinformatic Pipeline ◽

Perl Script ◽

Low Coverage

Despite knowledge that concerted evolution of high-copy loci is often imperfect, few studies investigate the extent of intragenomic polymorphisms, and comparisons across a large number of species are rarely made. We present a bioinformatic pipeline for characterizing polymorphisms within an individual among copies of a high-copy locus. Results are presented for nuclear ribosomal DNA (nrDNA) across the milkweed genus, Asclepias. The 18S-26S portion of the nrDNA cistron of Asclepias syriaca served as a reference for assembly of the region from 124 samples representing 90 species of Asclepias. Reads were mapped back to each individual's consensus, and at each position reads differing from the consensus were tallied using a custom Perl script. Low frequency polymorphisms existed in all individuals (mean = 5.8%). Most nrDNA positions (91%) were polymorphic in at least one individual, with polymorphic sites being less frequent in subunit regions and loops. Highly polymorphic sites existed in each individual, with highest abundance in the “noncoding” ITS regions. Phylogenetic signal was present in the distribution of intragenomic polymorphisms across the genus. Intragenomic polymorphisms in nrDNA are common in Asclepias, being found at higher frequency than in any other study to date. The high and variable frequency of polymorphisms across species highlights concerns that phylogenetic applications of nrDNA may be error-prone. The new analytical approach provided here is applicable to other taxa and other high-copy regions characterized by low coverage genome sequencing (genome skimming).
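The per-position tally of reads that differ from the consensus can be illustrated with a short Python sketch. It is not the authors' custom Perl script: it assumes reads are already aligned without gaps and supplied as (0-based start, sequence) pairs, and the function name is hypothetical.

```python
from typing import Iterable, List, Tuple

def tally_polymorphisms(
    consensus: str, aligned_reads: Iterable[Tuple[int, str]]
) -> List[float]:
    """Return, for each consensus position, the fraction of covering reads that differ."""
    depth = [0] * len(consensus)
    differing = [0] * len(consensus)
    for start, read in aligned_reads:
        for offset, base in enumerate(read):
            pos = start + offset
            # Ignore bases outside the consensus and uncalled bases.
            if pos < 0 or pos >= len(consensus) or base == "N":
                continue
            depth[pos] += 1
            if base != consensus[pos]:
                differing[pos] += 1
    # Positions without coverage get a frequency of 0.0.
    return [d / n if n else 0.0 for d, n in zip(differing, depth)]
```

A real pipeline would read the alignments from a SAM/BAM or pileup file and handle indels and base quality, which this simplified version leaves out.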

Intragenomic polymorphisms among high-copy loci: A genus-wide study of nuclear ribosomal DNA in Asclepias (Apocynaceae)

10.7287/peerj.preprints.512v1 ◽

2014 ◽

Author(s):

Kevin Weitemier ◽

Shannon C. K. Straub ◽

Mark Fishbein ◽

Aaron Liston

Keyword(s):

Ribosomal Dna ◽

Analytical Approach ◽

Phylogenetic Signal ◽

Low Frequency ◽

Nuclear Ribosomal Dna ◽

Asclepias Syriaca ◽

Custom Perl Script ◽

Bioinformatic Pipeline ◽

Perl Script ◽

Low Coverage

Despite knowledge that concerted evolution of high-copy loci is often imperfect, few studies investigate the extent of intragenomic polymorphisms, and comparisons across a large number of species are rarely made. We present a bioinformatic pipeline for characterizing polymorphisms within an individual among copies of a high-copy locus. Results are presented for nuclear ribosomal DNA (nrDNA) across the milkweed genus, Asclepias. The 18S-26S portion of the nrDNA cistron of Asclepias syriaca served as a reference for assembly of the region from 124 samples representing 90 species of Asclepias. Reads were mapped back to each individual's consensus, and at each position reads differing from the consensus were tallied using a custom Perl script. Low frequency polymorphisms existed in all individuals (mean = 5.8%). Most nrDNA positions (91%) were polymorphic in at least one individual, with polymorphic sites being less frequent in subunit regions and loops. Highly polymorphic sites existed in each individual, with highest abundance in the “noncoding” ITS regions. Phylogenetic signal was present in the distribution of intragenomic polymorphisms across the genus. Intragenomic polymorphisms in nrDNA are common in Asclepias, being found at higher frequency than in any other study to date. The high and variable frequency of polymorphisms across species highlights concerns that phylogenetic applications of nrDNA may be error-prone. The new analytical approach provided here is applicable to other taxa and other high-copy regions characterized by low coverage genome sequencing (genome skimming).

Computational Mining of Microsatellites in the Chloroplast Genome of Ptilidium pulcherrimum, a Liverwort

International Journal of Environment ◽

10.3126/ije.v3i3.11063 ◽

2014 ◽

Vol 3 (3) ◽

pp. 50-58

Author(s):

Asheesh Shanker

Keyword(s):

Chloroplast Genome ◽

Dna Sequences ◽

Pcr Primers ◽

Repeat Type ◽

Dinucleotide Repeats ◽

Perl Script ◽

Repeat Units ◽

Chloroplast Genome Sequence ◽

Nucleotide Repeats ◽

Simple Sequence

Microsatellites, also known as simple sequence repeats (SSRs), are found in DNA sequences. These repeats consist of short motifs of 1-6 bp and play an important role in population genetics, phylogenetics and the development of molecular markers. In this study, chloroplastic SSRs (cpSSRs) were detected in the chloroplast genome of Ptilidium pulcherrimum, downloaded from the National Center for Biotechnology Information (NCBI). The chloroplast genome sequence of P. pulcherrimum was mined with the help of a Perl script named MISA. A total of 23 perfect cpSSRs were detected in the 119.007 kb sequence mined, showing a density of 1 SSR/5.17 kb. Depending on the repeat units, the length of SSRs was found to be 12 bp for mono- and trinucleotide, 12 to 22 bp for dinucleotide, and 12 to 16 bp for tetranucleotide repeats. Penta- and hexanucleotide repeats were completely absent in the chloroplast genome of P. pulcherrimum. Dinucleotide repeats were the most frequent repeat type (47.83%), followed by tri- (21.74%) and tetranucleotide (21.74%) repeats. Out of 23 SSRs detected, PCR primers were successfully designed for 22 (95.65%) cpSSRs.
