maximal sequence
Recently Published Documents


TOTAL DOCUMENTS

14
(FIVE YEARS 2)

H-INDEX

3
(FIVE YEARS 0)

Author(s):  
Gokul Yenduri ◽  
B. R. Rajakumar ◽  
K. Praghash ◽  
D. Binu

The identification of opinions and sentiments from tweets is termed “Twitter Sentiment Analysis (TSA)”. The main task of TSA is to determine the sentiment or polarity of a tweet and then classify it as negative or positive. Several methods have been introduced for TSA; however, it remains challenging because of slang words, modern accents, grammatical and spelling mistakes, and other issues that existing techniques have not solved. This work develops a novel customized BERT-oriented sentiment classification that comprises two main phases: pre-processing and tokenization, and a “Customized Bidirectional Encoder Representations from Transformers (BERT)”-based classification. First, the gathered raw tweets are pre-processed by stop-word removal, stemming and blank space removal. After pre-processing, the semantic words are obtained, from which the meaningful words (tokens) are extracted in the tokenization phase. These extracted tokens are then classified via an optimized BERT, whose biases and weights are tuned by Particle-Assisted Circle Updating Position (PA-CUP). Moreover, the maximal sequence length of the BERT encoder is updated using standard PA-CUP. Finally, a performance analysis is carried out to substantiate the improvement offered by the proposed model.
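The pre-processing steps named in this abstract (blank space removal, stop-word removal, stemming) followed by a cap on sequence length can be illustrated roughly as follows. This is only a sketch under stated assumptions: the stop-word list, the suffix-stripping `crude_stem` helper, and the `max_len` parameter are hypothetical stand-ins, not the authors' pipeline, and no BERT tokenizer or PA-CUP optimization is involved.

```python
import re

# Hypothetical toy stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in", "it"}


def crude_stem(word):
    """Strip a few common English suffixes (a stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "s"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(tweet, max_len):
    """Normalize blanks, drop stop words, stem, and cap the token count."""
    # Blank space removal: collapse whitespace runs and trim the ends.
    words = re.sub(r"\s+", " ", tweet).strip().lower().split(" ")
    # Stop-word removal followed by stemming.
    tokens = [crude_stem(w) for w in words if w and w not in STOP_WORDS]
    # Assumed here: sequences longer than max_len are simply truncated.
    return tokens[:max_len]


print(preprocess("  The movie   is amazing and the actors played well ", 4))
```

In this sketch `max_len` is a fixed argument; in the abstract the maximal sequence length is itself a quantity that the optimizer updates.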


2019 ◽  
Vol 15 (06) ◽  
pp. 1219-1236
Author(s):  
Håkan Lennerstad

This paper generalizes the Stern–Brocot tree to a tree that consists of all sequences of [Formula: see text] coprime positive integers. As for [Formula: see text] each sequence [Formula: see text] is the sum of a specific set of other coprime sequences, its Stern–Brocot set [Formula: see text], where [Formula: see text] is the degree of [Formula: see text]. With an orthonormal base as the root, the tree defines a fast iterative structure on the set of distinct directions in [Formula: see text] and a multiresolution partition of [Formula: see text]. Basic proofs rely on a matrix representation of each coprime sequence, where the Stern–Brocot set forms the matrix columns. This induces a finitely generated submonoid [Formula: see text] of [Formula: see text], and a unimodular multidimensional continued fraction algorithm, also generalizing [Formula: see text]. It turns out that the [Formula: see text]-dimensional subtree starting with a sequence [Formula: see text] is isomorphic to the entire [Formula: see text]-dimensional tree. This allows basic combinatorial properties to be established. It also turns out that, in this multidimensional version, Fibonacci-type sequences have maximal sequence sum in each generation.
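The closing claim about Fibonacci-type sequences can be checked by hand in the classical two-dimensional Stern–Brocot tree, which is assumed here to be the case the paper generalizes. The sketch below builds each generation by taking mediants of neighbouring pairs and records the largest sum p + q per generation. The function name `stern_brocot_levels` and the representation of a fraction as a pair `(p, q)` are illustrative choices, not taken from the paper.

```python
from math import gcd


def stern_brocot_levels(depth):
    """Yield the new coprime pairs (p, q) created in each generation."""
    # Start from the boundary pairs 0/1 and 1/0; every new node is the
    # mediant (a + c, b + d) of two neighbouring pairs (a, b) and (c, d).
    row = [(0, 1), (1, 0)]
    for _ in range(depth):
        new_nodes = []
        next_row = [row[0]]
        for (a, b), (c, d) in zip(row, row[1:]):
            mediant = (a + c, b + d)
            new_nodes.append(mediant)
            next_row.extend([mediant, (c, d)])
        row = next_row
        yield new_nodes


levels = list(stern_brocot_levels(5))
max_sums = [max(p + q for p, q in level) for level in levels]
print(max_sums)  # [2, 3, 5, 8, 13]
```

The maxima 2, 3, 5, 8, 13 are consecutive Fibonacci numbers, which is the two-dimensional instance of the property stated in the abstract.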


2017 ◽  
Vol 09 (06) ◽  
pp. 1750077
Author(s):  
Kairi Kangro ◽  
Mozhgan Pourmoradnasseri ◽  
Dirk Oliver Theis

A dispersed Dyck path (DDP) of length [Formula: see text] is a lattice path on [Formula: see text] from [Formula: see text] to [Formula: see text] in which the following steps are allowed: “up” [Formula: see text]; “down” [Formula: see text]; and “right” [Formula: see text]. An ascent in a DDP is an inclusion-wise maximal sequence of consecutive up steps. A 1-ascent is an ascent consisting of exactly 1 up step. We give a closed formula for the total number of 1-ascents in all dispersed Dyck paths of length [Formula: see text], #A191386 in Sloane’s OEIS. Previously, only implicit generating function relations and asymptotics were known.
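The counting problem can be explored by brute force for small lengths. The sketch below assumes the usual definition of a dispersed Dyck path, which is not spelled out in the extracted abstract: the path never goes below height 0, ends at height 0, and takes "right" steps only at height 0. Steps are encoded as the letters `U`, `D` and `R`; the function names are illustrative. It does not implement the closed formula from the paper.

```python
import re


def dispersed_dyck_paths(n):
    """Generate all dispersed Dyck paths of length n as strings over U, D, R."""
    def extend(prefix, height):
        if len(prefix) == n:
            if height == 0:
                yield prefix
            return
        # Prune: too few steps remain to come back down to height 0.
        if height > n - len(prefix):
            return
        yield from extend(prefix + "U", height + 1)
        if height > 0:
            yield from extend(prefix + "D", height - 1)
        else:
            # Assumed rule: right steps are allowed only at height 0.
            yield from extend(prefix + "R", 0)

    yield from extend("", 0)


def count_one_ascents(path):
    """Count maximal runs of consecutive up steps that have length exactly 1."""
    return sum(1 for run in re.findall(r"U+", path) if len(run) == 1)


def total_one_ascents(n):
    """Total number of 1-ascents over all dispersed Dyck paths of length n."""
    return sum(count_one_ascents(p) for p in dispersed_dyck_paths(n))


print([total_one_ascents(n) for n in range(6)])  # [0, 0, 1, 2, 5, 10]
```

For length 4, for example, the six paths are RRRR, RRUD, RUDR, UDRR, UDUD and UUDD, which contribute 0 + 1 + 1 + 1 + 2 + 0 = 5 ascents of length 1. The enumeration grows exponentially, so it is only useful for checking a formula on small cases.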


2015 ◽  
Vol 2015 ◽  
pp. 1-5 ◽  
Author(s):  
Oleg A. Zverkov ◽  
Alexandr V. Seliverstov ◽  
Vassily A. Lyubetsky

We report a database of plastid protein families from red algae, secondary and tertiary rhodophyte-derived plastids, and Apicomplexa, constructed with a novel method for inferring orthology. The families contain proteins with maximal sequence similarity and minimal paralogous content. The database contains 6509 protein entries, 513 families and 278 nonsingletons (of which 230 are paralog-free; among the remaining 48, 46 contain at most two proteins per species and 2 contain at most three proteins per species). The method is compared with other approaches. Expression regulation of the moeB gene is studied using this database and the model of RNA polymerase competition. An analogous database for green algae and their symbiotic descendants, and applications based on it, were published earlier.


2012 ◽  
Vol 14 (04) ◽  
pp. 1250024 ◽  
Author(s):  
R. LABARCA ◽  
C. MOREIRA ◽  
A. PUMARIÑO ◽  
J. A. RODRÍGUEZ

We show the continuity of the topological entropy for the Milnor–Thurston world of interval maps and we compute the minimum and the maximum values for the entropy of a maximal sequence of any given period. We also study (fractal) geometric properties of the bifurcation set in the parameter space and in the associated phase spaces Σ[a, b], and we compare these results with the previously known results about the lexicographic world of interval maps (related to Lorenz-like maps).


10.37236/447 ◽  
2010 ◽  
Vol 17 (1) ◽  
Author(s):  
Fenix W.D. Huang ◽  
Christian M. Reidys

In this paper we present an algorithm that generates $k$-noncrossing, $\sigma$-modular diagrams with uniform probability. A diagram is a labeled graph of degree $\le 1$ over $n$ vertices drawn in a horizontal line with arcs $(i,j)$ in the upper half-plane. A $k$-crossing in a diagram is a set of $k$ distinct arcs $(i_1, j_1), (i_2, j_2),\ldots,(i_k, j_k)$ with the property $i_1 < i_2 < \ldots < i_k < j_1 < j_2 < \ldots < j_k$. A diagram without any $k$-crossings is called a $k$-noncrossing diagram and a stack of length $\sigma$ is a maximal sequence $((i,j),(i+1,j-1),\dots,(i+(\sigma-1),j-(\sigma-1)))$. A diagram is $\sigma$-modular if any arc is contained in a stack of length at least $\sigma$. Our algorithm generates after $O(n^k)$ preprocessing time, $k$-noncrossing, $\sigma$-modular diagrams in $O(n)$ time and space complexity.
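The definitions in this abstract translate directly into small checking routines. The sketch below tests whether a given diagram has a k-crossing and whether it is σ-modular. It is a brute-force checker for illustration only, not the uniform generation algorithm of the paper, and the function names and the representation of a diagram as a list of arcs `(i, j)` with `i < j` are assumptions made for this example.

```python
from itertools import combinations


def has_k_crossing(arcs, k):
    """Return True if some k arcs satisfy i_1 < ... < i_k < j_1 < ... < j_k."""
    # Sorting by start point makes the i's increasing within every subset
    # (each vertex has degree at most 1, so start points are distinct).
    for subset in combinations(sorted(arcs), k):
        starts = [i for i, _ in subset]
        ends = [j for _, j in subset]
        ends_increasing = all(a < b for a, b in zip(ends, ends[1:]))
        if starts[-1] < ends[0] and ends_increasing:
            return True
    return False


def stacks(arcs):
    """Split the arcs into maximal runs (i, j), (i + 1, j - 1), ..."""
    arc_set = set(arcs)
    result = []
    for i, j in sorted(arc_set):
        if (i - 1, j + 1) in arc_set:
            continue  # Not the outermost arc of its stack.
        stack = [(i, j)]
        while (stack[-1][0] + 1, stack[-1][1] - 1) in arc_set:
            stack.append((stack[-1][0] + 1, stack[-1][1] - 1))
        result.append(stack)
    return result


def is_sigma_modular(arcs, sigma):
    """Every arc must lie in a stack of length at least sigma."""
    return all(len(stack) >= sigma for stack in stacks(arcs))


nested = [(1, 8), (2, 7), (3, 6), (9, 12), (10, 11)]
print(has_k_crossing(nested, 2))    # False
print(is_sigma_modular(nested, 2))  # True
print(is_sigma_modular(nested, 3))  # False
```

The crossing test examines all k-subsets of arcs, so it is far slower than the preprocessing and sampling bounds quoted in the abstract; it is meant only to make the definitions concrete.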


2006 ◽  
Vol 74 (2) ◽  
pp. 1161-1170 ◽  
Author(s):  
Maria Giufrè ◽  
Michele Muscillo ◽  
Patrizia Spigaglia ◽  
Rita Cardines ◽  
Paola Mastrantonio ◽  
...  

ABSTRACT The pathogenesis of nontypeable Haemophilus influenzae (NTHi) begins with adhesion to the rhinopharyngeal mucosa. In almost 80% of NTHi clinical isolates, the HMW proteins are the major adhesins. The prototype HMW1 and HMW2 proteins, identified in NTHi strain 12, exhibit different binding specificities. The two binding domains have been localized in regions of maximal sequence dissimilarity (40% identity, 58% similarity). Two areas within these binding domains have been found essential for full level adhesive activity (designated the core-binding domains). To investigate the conservation and diversity of the HMW1 and HMW2 core-binding domains among isolates, PCR and DNA sequencing were used. First, we separately amplified the hmw1A-like and hmw2A-like structural genes in nine invasive NTHi isolates, discovering two new hmwA alleles, whose sequences are herein reported. Then, the hmw1A-like and hmw2A-like PCR products were used as the template in nested PCR to produce amplicons encompassing the encoding sequences of the two core-binding domains. In-depth sequence analysis was then performed among sequences of each group, with the support of specific computer programs. Overall, extensive sequence diversity among isolates was highlighted. However, similarity plots showed patterns consisting of peaks of relatively high similarity alternating with strongly divergent regions. The phylogenetic tree clearly indicated the HMW1-like and HMW2-like core-binding domain sequences as two clusters. Distinct sets of conserved amino acid motifs were identified within each group of sequences using the MEME/MOTIFSEARCH tool. Since HMW adhesins could represent candidates for future vaccines, identification of specific patterns of conserved motifs in otherwise highly variable regions is of great interest.


2001 ◽  
Vol 69 (1) ◽  
pp. 307-314 ◽  
Author(s):  
Suzanne Dawid ◽  
Susan Grass ◽  
Joseph W. St. Geme

ABSTRACT Nontypeable Haemophilus influenzae is an important cause of localized respiratory tract disease, which begins with colonization of the upper respiratory mucosa. In previous work we reported that the nontypeable H. influenzae HMW1 and HMW2 proteins are high-molecular-weight nonpilus adhesins responsible for attachment to human epithelial cells, an essential step in the process of colonization. Interestingly, although HMW1 and HMW2 share significant sequence similarity, they display distinct cellular binding specificities. In order to map the HMW1 and HMW2 binding domains, we generated a series of complementary HMW1-HMW2 chimeric proteins and examined the ability of these proteins to promote in vitro adherence by Escherichia coli DH5α. Using this approach, we localized the HMW1 and HMW2 binding domains to an ∼360-amino-acid region near the N terminus of the mature HMW1 and HMW2 proteins. Experiments with maltose-binding protein fusion proteins containing segments of either HMW1 or HMW2 confirmed these results and suggested that the fully functional binding domains may be conformational structures that require relatively long stretches of sequence. Of note, the HMW1 and HMW2 binding domains correspond to areas of maximal sequence dissimilarity, suggesting that selective advantage associated with broader adhesive potential has been a major driving force during H. influenzae evolution. These findings should facilitate efforts to develop a subcomponent vaccine effective against nontypeable H. influenzae disease.

