Fast parallel construction of variable-length Markov chains

2021 ◽  
Vol 22 (1) ◽  
Author(s):  
Joel Gustafsson ◽  
Peter Norberg ◽  
Jan R. Qvick-Wester ◽  
Alexander Schliep

Abstract

Background: Alignment-free methods are a popular approach for comparing biological sequences, including complete genomes. The methods range from probability distributions of sequence composition to first and higher-order Markov chains, where a k-th order Markov chain over DNA has $$4^k$$ formal parameters. To circumvent this exponential growth in parameters, variable-length Markov chains (VLMCs) have gained popularity for applications in molecular biology and other areas. VLMCs adapt the depth depending on sequence context and thus curtail excesses in the number of parameters. The scarcity of fast, let alone parallel, software tools prompted the development of a parallel implementation using lazy suffix trees and a hash-based alternative.

Results: An extensive evaluation was performed on genomes ranging from 12 Mbp to 22 Gbp. Relevant learning parameters were chosen under the guidance of the Bayesian Information Criterion (BIC) to avoid over-fitting. Our implementation greatly improves upon the state-of-the-art even in serial execution. It exhibits very good parallel scaling, with speed-ups for long sequences close to the optimum indicated by Amdahl’s law: 3 for 4 threads and about 6 for 16 threads.

Conclusions: Our parallel implementation, released as open source under the GPLv3 license, provides a practically useful alternative to the state-of-the-art and allows the construction of VLMCs, even for very large genomes, significantly faster than previously possible. Additionally, our parameter selection based on BIC gives guidance to end-users comparing genomes.
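The two speed-up figures quoted in the abstract can be checked against each other with Amdahl's law. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, and it assumes that the speed-up of 3 on 4 threads lies exactly on the Amdahl curve. Under that assumption the implied parallel fraction is 8/9 (about 0.889), and the same curve gives a bound of 6 on 16 threads, which matches the second figure.

```python
def amdahl_speedup(parallel_fraction: float, threads: int) -> float:
    """Upper bound on speed-up when only `parallel_fraction` of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / threads)


def implied_parallel_fraction(speedup: float, threads: int) -> float:
    """Invert Amdahl's law: the parallel fraction that yields `speedup` on `threads`."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / threads)


# Assumption (not stated in the abstract): a speed-up of 3 on 4 threads
# sits exactly on the Amdahl curve.
p = implied_parallel_fraction(3.0, 4)
predicted_16 = amdahl_speedup(p, 16)

print(f"implied parallel fraction: {p:.3f}")
print(f"Amdahl bound on 16 threads: {predicted_16:.2f}")
```

The inversion follows from rearranging 1/S = (1 - p) + p/n for p, so the serial fraction of roughly one ninth is an inference from the quoted numbers rather than a value reported by the authors.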

Author(s):  
Xiao Ling ◽  
Sameer Singh ◽  
Daniel S. Weld

Recent research on entity linking (EL) has introduced a plethora of promising techniques, ranging from deep neural networks to joint inference. But despite numerous papers there is surprisingly little understanding of the state of the art in EL. We attack this confusion by analyzing differences between several versions of the EL problem and presenting a simple yet effective, modular, unsupervised system, called Vinculum, for entity linking. We conduct an extensive evaluation on nine data sets, comparing Vinculum with two state-of-the-art systems, and elucidate key aspects of the system that include mention extraction, candidate generation, entity type prediction, entity coreference, and coherence.


Irriga ◽  
2019 ◽  
Vol 24 (4) ◽  
pp. 781-801
Author(s):  
Jefferson Vieira Jose ◽  
Lucas da C. Santos ◽  
Daniel S. Alves ◽  
Pablo R. Nitsche ◽  
Marcos V. Folegatti ◽  
...  

SPATIAL ASPECTS OF EVAPOTRANSPIRATION WITH FOCUS ON THE DESIGN OF IRRIGATION SYSTEMS

JEFFERSON VIEIRA JOSÉ1; LUCAS DA COSTA SANTOS2; DANIEL SOARES ALVES3; PABLO RICARDO NITSCHE4; MARCOS VINICIUS FOLEGATTI5 AND WAGNER WOLFF6

1 Multidisciplinary Center, UFAC, Campus Floresta, Rua Estrada da Canela Fina, KM 12 Gleba Formoso - São Francisco, CEP: 69895-000, Cruzeiro do Sul, AC, Brazil
2 Department of Agronomy, UFVJM, Campus JK - Rodovia MGT 367, Km 583, nº5000 - Bairro Alto da Jacuba, CEP: 39100-000, Diamantina, MG, Brazil
3 Department of Agrometeorology, Instituto Agronômico do Paraná - IAPAR, Rodovia Celso Garcia Cid, km375, Bairro Ernani Moura Lima II, CEP: 86047-90, Londrina, PR, Brazil
4 Department of Agrometeorology, Instituto Agronômico do Paraná - IAPAR, Rodovia Celso Garcia Cid, km375, Bairro Ernani Moura Lima II, CEP: 86047-90, Londrina, PR, Brazil
5 Department of Biosystems Engineering, USP, ESALQ, Avenida Pádua Dias, 11, Bairro Agronomia, CEP: 13418-900, Piracicaba, SP, Brazil
6 Department of Biosystems Engineering, USP, ESALQ, Avenida Pádua Dias, 11, Bairro Agronomia, CEP: 13418-900, Piracicaba, SP, Brazil

ABSTRACT

Understanding the spatial aspects of crop water consumption, with a focus on the design of irrigation systems, is essential for rationalizing water use. This work aimed to analyze the frequency distribution of cumulative reference evapotranspiration (EToac) in the State of Paraná by spatializing the parameters of the probability distribution, with a view to the design of irrigation systems. Daily data on meteorological elements (maximum, minimum and mean temperature; mean relative humidity; global solar radiation; insolation; wind speed) from 1980 to 2010 at 33 meteorological stations in the State of Paraná were used to estimate reference evapotranspiration (ETo) by the Penman-Monteith method. The ETo was accumulated over consecutive periods of 5, 10, 20 and 30 days, and its annual maximum values were evaluated and fitted to nine probability distributions (Log-normal, Weibull, Gamma, Cauchy, Normal, Logistic, Birnbaum-Saunders, Gumbel and Gumbel-II). The Gumbel-II distribution, verified by the Akaike Information Criterion, was chosen to generate EToac values at different probability levels by means of maps of the distribution of the parameters a and b representing the State of Paraná.

Keywords: Extreme events; Gumbel; geostatistics; Penman-Monteith; Paraná
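The accumulation step described in the abstract (summing ETo over consecutive 5, 10, 20 and 30 day periods and keeping the largest total) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name and the sample values are hypothetical, and it assumes that "consecutive periods" means a moving window over the daily series rather than non-overlapping blocks.

```python
from itertools import accumulate


def max_accumulated(daily_eto, window):
    """Largest sum of `window` consecutive daily values (moving window)."""
    if window <= 0 or window > len(daily_eto):
        raise ValueError("window must be between 1 and the series length")
    # prefix[i] holds the sum of the first i values, so any window sum
    # is a difference of two prefix sums.
    prefix = [0.0] + list(accumulate(daily_eto))
    return max(
        prefix[start + window] - prefix[start]
        for start in range(len(daily_eto) - window + 1)
    )


# Hypothetical daily ETo values in mm/day (illustrative, not from the study).
daily_eto_mm = [3.0, 4.0, 5.0, 6.0, 5.0, 4.0, 3.0, 2.0, 4.0, 5.0]

for window in (5, 10):
    print(window, max_accumulated(daily_eto_mm, window))
```

Applied to one year of daily data per station, the returned value would be that year's maximum for the chosen window, and the series of such annual maxima is what would then be fitted to the candidate distributions.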


2019 ◽  
Vol 9 (14) ◽  
pp. 2805 ◽  
Author(s):  
Ruber Hernández-García ◽  
Ricardo J. Barrientos ◽  
Cristofher Rojas ◽  
Marco Mora

Biometric identification and verification are essential mechanisms in modern society. Palm vein recognition is an emerging biometric technique with several advantages, especially in terms of security against forgery. Contactless palm vein systems are more suitable for real-world applications, but two major challenges for state-of-the-art contributions are image deformations and time efficiency. In the present work, we propose a new method for palm vein recognition that combines the DAISY descriptor and the Coarse-to-fine PatchMatch (CPM) algorithm in a parallel matching process. Our proposal aims to provide an effective and efficient technique for obtaining the similarity of palm vein images, considering their displacements as discriminatory information. Extensive evaluation on three publicly available databases demonstrates that the discriminability of the proposed approach reaches state-of-the-art results while being considerably superior in time efficiency.


2021 ◽  
Author(s):  
Narina Thakur ◽  
Preeti Nagrath ◽  
Rachna Jain ◽  
Dharmender Saini ◽  
Nitika Sharma ◽  
...  

Abstract Object detection is a key ability required by most computer vision and surveillance applications. Pedestrian detection is a key problem in surveillance, with several applications such as person identification, person counting and tracking. The number of techniques for identifying pedestrians in images has gradually increased in recent years, even as state-of-the-art deep neural network-based frameworks for object detection have advanced significantly. Research in object detection and image classification has made strides in both accuracy, which now exceeds 99%, and level of granularity. A powerful object detector, specifically designed for high-end surveillance applications, is needed that will not only position and label the bounding boxes but also return their relative positions. The size of these bounding boxes can vary depending on the object and how it interacts with the physical world. To address these requirements, an extensive evaluation of state-of-the-art algorithms has been performed in this paper. The work presented here performs detections on the MOT20 dataset using various algorithms and tests them on a custom dataset recorded on our organization's premises using an Unmanned Aerial Vehicle (UAV). The experimental analysis has been performed on Faster-RCNN, SSD and YOLO models. The YOLOv5 model is found to outperform all the other models, with 61% precision and an F-measure of 44%.
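The abstract reports precision and an F-measure but no recall. Assuming the F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state, the implied recall can be recovered by rearranging F1 = 2PR / (P + R). The function name below is hypothetical, and the result is a derived estimate of roughly 0.34, not a figure reported by the authors.

```python
def implied_recall(precision: float, f1: float) -> float:
    """Solve F1 = 2PR / (P + R) for recall R, given precision P and F1."""
    denominator = 2.0 * precision - f1
    if denominator <= 0:
        raise ValueError("no valid recall: F1 must be below 2 * precision")
    return f1 * precision / denominator


# Values quoted in the abstract; treating the F-measure as F1 is an assumption.
precision = 0.61
f1 = 0.44

recall = implied_recall(precision, f1)
print(f"implied recall: {recall:.3f}")

# Sanity check: plugging the recall back in reproduces the quoted F-measure.
f1_check = 2.0 * precision * recall / (precision + recall)
print(f"recomputed F1: {f1_check:.2f}")
```

If the paper used a weighted F-beta score or averaged per-class values instead, this back-calculation would not apply.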


Author(s):  
T. A. Welton

Various authors have emphasized the spatial information resident in an electron micrograph taken with adequately coherent radiation. In view of the completion of at least one such instrument, this opportunity is taken to summarize the state of the art of processing such micrographs. We use the usual symbols for the aberration coefficients, and supplement these with £ and 6 for the transverse coherence length and the fractional energy spread respectively. We also assume a weak, biologically interesting sample, with principal interest lying in the molecular skeleton remaining after obvious hydrogen loss and other radiation damage has occurred.


2003 ◽  
Vol 48 (6) ◽  
pp. 826-829 ◽  
Author(s):  
Eric Amsel

1968 ◽  
Vol 13 (9) ◽  
pp. 479-480
Author(s):  
LEWIS PETRINOVICH

1984 ◽  
Vol 29 (5) ◽  
pp. 426-428
Author(s):  
Anthony R. D'Augelli

1991 ◽  
Vol 36 (2) ◽  
pp. 140-140
Author(s):  
John A. Corson
