hierarchical grouping
Recently Published Documents

Genetic progress and early selection of juvenile physic nut genotypes

Physic nut (Jatropha curcas L.) is a member of the Euphorbiaceae family used in biodiesel production. The species has a narrow genetic base, which hinders the release of cultivars, so early characterization of genotypes is an important step in breeding the crop. This study aimed to perform early selection on morpho-agronomic traits, to predict the genetic progress of those traits, and to indicate potential parents for obtaining progenies. The experiment used a randomized block design with 26 treatments and three replications. The morpho-agronomic traits were analyzed with mixed models, and genetic progress was estimated by direct selection, indirect selection and selection indexes. Genetic dissimilarity was measured by the Mahalanobis distance, with UPGMA hierarchical grouping and a cophenetic correlation coefficient obtained from 1,000 permutations. The estimates showed genetic variability and identified promising juvenile genotypes. Genotypes JCCE034, JCCE014 and JCCE103 showed the greatest genetic progress, and genotypes JCCE036 and JCCE86 showed the greatest genetic divergence, each forming an individual cluster. The physic nut genotypes are promising for early selection and show satisfactory selection gains for the traits evaluated. They are well suited to compose groups of parents in targeted crosses, constituting base populations for the improvement of J. curcas.

Keywords: Jatropha curcas; genetic diversity; selection gains; selection indexes.
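The UPGMA grouping and cophenetic correlation mentioned in this abstract can be illustrated with a minimal sketch. It assumes a precomputed dissimilarity matrix. The 4x4 matrix `D` is invented for illustration and is not the study's data, and the Mahalanobis step and the 1,000-permutation procedure are left out. The function names `upgma` and `cophenetic_correlation` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def upgma(dist):
    """Average-linkage (UPGMA) clustering on a symmetric distance matrix.

    Returns the merge history as (members_a, members_b, height) tuples and
    the cophenetic distance for every pair of original items.
    """
    n = len(dist)
    clusters = {i: (i,) for i in range(n)}  # cluster id -> member tuple
    d = {(i, j): dist[i][j] for i, j in combinations(range(n), 2)}
    coph, merges, next_id = {}, [], n
    while len(clusters) > 1:
        (a, b), height = min(d.items(), key=lambda kv: kv[1])
        # Every cross pair of original items first meets at this height.
        for i in clusters[a]:
            for j in clusters[b]:
                coph[(min(i, j), max(i, j))] = height
        merges.append((clusters[a], clusters[b], height))
        na, nb = len(clusters[a]), len(clusters[b])
        new_d = {}
        for c in clusters:
            if c in (a, b):
                continue
            dac = d[(min(a, c), max(a, c))]
            dbc = d[(min(b, c), max(b, c))]
            # UPGMA update: size-weighted mean of the two old distances.
            new_d[(c, next_id)] = (na * dac + nb * dbc) / (na + nb)
        d = {k: v for k, v in d.items() if a not in k and b not in k}
        d.update(new_d)
        merged = clusters[a] + clusters[b]
        del clusters[a], clusters[b]
        clusters[next_id] = merged
        next_id += 1
    return merges, coph


def cophenetic_correlation(dist, coph):
    """Pearson correlation between original and cophenetic distances."""
    pairs = sorted(coph)
    x = [dist[i][j] for i, j in pairs]
    y = [coph[p] for p in pairs]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Invented dissimilarities between four hypothetical genotypes (0..3).
D = [
    [0, 2, 6, 10],
    [2, 0, 5, 9],
    [6, 5, 0, 8],
    [10, 9, 8, 0],
]
merges, coph = upgma(D)
r = cophenetic_correlation(D, coph)
```

In this toy matrix item 3 joins the tree last, which is the kind of pattern the abstract describes as a genotype forming an individual cluster. A cophenetic correlation close to 1 suggests the dendrogram preserves the original dissimilarities well.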

Ancestral sequences of a large promiscuous enzyme family correspond to bridges in sequence space in a network representation

Journal of The Royal Society Interface ◽

10.1098/rsif.2021.0389 ◽

2021 ◽

Vol 18 (184) ◽

Author(s):

Patrick C. F. Buchholz ◽

Bert van Loo ◽

Bernard D. G. Eenink ◽

Erich Bornberg-Bauer ◽

Jürgen Pleiss

Keyword(s):

Protein Sequence ◽

Protein Sequences ◽

Large Family ◽

Time Axis ◽

Pairwise Sequence Identity ◽

Consensus Sequences ◽

Sequence Identity ◽

Ancestral Sequences ◽

Hierarchical Grouping ◽

Ancestral Protein

Evolutionary relationships of protein families can be characterized either by networks or by trees. Whereas trees allow for hierarchical grouping and reconstruction of the most likely ancestral sequences, networks lack a time axis but allow for thresholds of pairwise sequence identity to be chosen and, therefore, the clustering of family members with presumably more similar functions. Here, we use the large family of arylsulfatases and phosphonate monoester hydrolases to investigate similarities, strengths and weaknesses in tree and network representations. For varying thresholds of pairwise sequence identity, values of betweenness centrality and clustering coefficients were derived for nodes of the reconstructed ancestors to measure the propensity to act as a bridge in a network. Based on these properties, ancestral protein sequences emerge as bridges in protein sequence networks. Interestingly, many ancestral protein sequences appear close to extant sequences. Therefore, reconstructed ancestor sequences might also be interpreted as yet-to-be-identified homologues. The concept of ancestor reconstruction is compared to consensus sequences, too. It was found that hub sequences in a network, e.g. reconstructed ancestral sequences that are connected to many neighbouring sequences, share closer similarity with derived consensus sequences. Therefore, some reconstructed ancestor sequences can also be interpreted as consensus sequences.
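A minimal sketch of the network view described in this abstract: sequences become nodes, an edge is drawn when pairwise identity reaches a chosen threshold, and betweenness centrality and the clustering coefficient are computed per node. The five short sequences, the name `anc` and the thresholds are invented for illustration. They are not arylsulfatase data, and the authors' actual pipeline is not described here.

```python
from collections import deque
from itertools import combinations


def identity(s, t):
    """Fraction of identical positions in two aligned, equal-length sequences."""
    return sum(a == b for a, b in zip(s, t)) / len(s)


def build_network(seqs, threshold):
    """Connect two sequences when their pairwise identity >= threshold."""
    graph = {name: set() for name in seqs}
    for u, v in combinations(seqs, 2):
        if identity(seqs[u], seqs[v]) >= threshold:
            graph[u].add(v)
            graph[v].add(u)
    return graph


def betweenness(graph):
    """Brandes' algorithm for unweighted, undirected graphs (unnormalised)."""
    score = dict.fromkeys(graph, 0.0)
    for s in graph:
        order, preds = [], {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in score.items()}


def clustering_coefficient(graph, v):
    """Share of neighbour pairs of v that are themselves connected."""
    nbrs = graph[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
    return 2 * links / (k * (k - 1))


# Invented toy alignment: two tight groups plus one intermediate sequence.
seqs = {
    "a1": "AAAAAAAAAA",
    "a2": "AAAAAAAAAC",
    "anc": "AAAAACCCCC",
    "b1": "CCCCCCCCCC",
    "b2": "ACCCCCCCCC",
}
net = build_network(seqs, threshold=0.5)
bc = betweenness(net)
cc_anc = clustering_coefficient(net, "anc")
```

At a threshold of 0.5 the intermediate sequence is the only link between the two groups, so it carries all of the betweenness and has a low clustering coefficient. Raising the threshold to 0.7 disconnects it, which shows how strongly such measures depend on the identity cutoff.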

Statistical and Econometric Analysis of Selected Effects of COVID-19 Pandemic

Multidisciplinary Aspects of Production Engineering ◽

10.2478/mape-2021-0036 ◽

2021 ◽

Vol 4 (1) ◽

pp. 395-407

Author(s):

Wojciech Kempa ◽

Joanna Rydarowska-Kurzbauer ◽

Marzena Halama ◽

Elżbieta Smuda ◽

Maciej Biel

Keyword(s):

Unemployment Rate ◽

Statistical Techniques ◽

Hierarchical Grouping ◽

Tourism Sector ◽

Central Statistical ◽

Kolmogorov Smirnov ◽

Data Source ◽

Selected Effects ◽

Key Indicators ◽

The Impact

The paper examines the impact of the COVID-19 pandemic on macroeconomic activity in selected European countries. The study is based on monthly and quarterly indicators of GDP, unemployment rates and key indicators of the tourism sector. To show how COVID-19 has affected these macroeconomic variables, statistical data from three periods are compared: the fourth quarter of 2019, the pre-pandemic reference period; the first quarter of 2020, the beginning of the pandemic; and the second quarter of 2020, during which the pandemic spread to all the analyzed countries. The following statistical techniques are used: regression analysis, agglomerative hierarchical grouping, the k-means method, and selected non-parametric tests (the Kruskal-Wallis test for a selected group of countries and the Kolmogorov-Smirnov test for a selected pair of countries). The results show a significant impact of the pandemic on the level of gross domestic product, the unemployment rate and the tourism sector. In most cases, a correlation between the incidence of COVID-19 infections, the unemployment rate and GDP is observed. The statistical techniques also make it possible to demonstrate the similarities and differences in how the economies responded to the COVID-19 pandemic. The Central Statistical Offices of the selected countries are the main data source, and all calculations use Statistica version 13.3.
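For the pairwise comparison of countries, the two-sample Kolmogorov-Smirnov statistic is the largest gap between the two empirical distribution functions. The sketch below computes only that statistic, with no p-value. The unemployment figures and the names `country_x` and `country_y` are invented and are not taken from the paper, which reports using Statistica rather than custom code.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Evaluates both empirical CDFs at every observed value and returns the
    largest absolute difference between them.
    """
    a, b = sorted(sample_a), sorted(sample_b)
    points = sorted(set(a) | set(b))
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in points
    )


# Invented monthly unemployment rates (percent) for two hypothetical countries.
country_x = [3.0, 3.1, 3.3, 3.4]
country_y = [3.2, 3.5, 3.6, 3.8]
d_stat = ks_statistic(country_x, country_y)
```

A statistic near 0 means the two samples have similar distributions, and a value near 1 means they barely overlap. Turning the statistic into a significance decision requires the sample sizes and a reference distribution, which this sketch omits.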

Optimization Inspired on Herd Immunity Applied to Non-Hierarchical Grouping of Objects

Revista de Informática Teórica e Aplicada ◽

10.22456/2175-2745.107478 ◽

2021 ◽

Vol 28 (2) ◽

pp. 50-65

Author(s):

Alfredo Silveira Araújo Neto

Keyword(s):

Herd Immunity ◽

Distinct Group ◽

Finite Collection ◽

Clustering Methods ◽

Hierarchical Grouping ◽

Or Groups ◽

Grouping Strategies ◽

Thermal Annealing Process

Non-hierarchical grouping is one of the most important operations in data analysis. Given a finite collection of objects, and without any prior information about the elements to be classified, it partitions the items into non-intersecting subsets or groups so that the elements of a group are more similar to each other than to items that belong to a distinct group. In this context, this study proposes applying a meta-heuristic inspired by herd immunity to determine the non-hierarchical grouping of objects, and compares the results of this method with those of four other grouping strategies described in the literature. In particular, 33 benchmark collections were classified by the suggested algorithm, by a meta-heuristic inspired by particle swarms, by a genetic algorithm, by the K-means algorithm and by a meta-heuristic inspired by the thermal annealing process. The resulting arrangements were compared under 10 different evaluation measures, indicating that the partitions established by the herd-immunity meta-heuristic may, in certain respects, be more favorable than the classifications obtained by the other clustering methods.
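To make the idea of a non-hierarchical partition concrete, here is a minimal sketch of K-means (Lloyd's algorithm), one of the baseline methods named in the abstract. It is not the herd-immunity meta-heuristic proposed by the author. The points and the starting centroids are invented, and the within-group sum of squares is used only as one plausible evaluation measure.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, centroids, max_iter=100):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = list(centroids)
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(
                    tuple(sum(xs) / len(members) for xs in zip(*members))
                )
            else:
                updated.append(centroids[c])  # keep an empty group's centroid
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def within_group_ss(points, labels, centroids):
    """Sum of squared distances from each point to its group centroid."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))


# Invented two-dimensional objects forming two obvious groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
labels, centroids = kmeans(points, centroids=[(1, 1), (9, 8)])
wss = within_group_ss(points, labels, centroids)
```

Every point receives exactly one label, so the groups do not intersect. K-means only finds a local optimum that depends on the starting centroids, which is one reason meta-heuristics are explored for this problem.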

Project Rosetta: a childhood social, emotional, and behavioral developmental feature mapping

Journal of Biomedical Semantics ◽

10.1186/s13326-021-00242-4 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Alyson Maslowski ◽

Halim Abbas ◽

Kelley Abrams ◽

Sharief Taraman ◽

Ford Garberson ◽

...

Keyword(s):

Behavioral Assessment ◽

Diagnostic Category ◽

Autism Spectrum ◽

Feature Mapping ◽

Social Emotional ◽

Functional Feature ◽

Hyperactivity Disorder ◽

Hierarchical Grouping ◽

Wide Range ◽

Developmental Feature

Background: A wide array of existing instruments are commonly used to assess childhood behavior and development for the evaluation of social, emotional and behavioral disorders such as Autism Spectrum Disorder (ASD), attention-deficit/hyperactivity disorder (ADHD), and anxiety. Many of these instruments either focus on one diagnostic category or encompass a broad set of childhood behaviors. We analyze a wide range of standardized behavioral instruments and identify a comprehensive, structured semantic hierarchical grouping of child behavioral observational features. We use the hierarchy to create Rosetta: a new set of behavioral assessment questions, designed to be minimal yet comprehensive in its coverage of clinically relevant behaviors. We maintain a full mapping from every functional feature in every covered instrument to a corresponding question in Rosetta. Results: In all, 209 Rosetta questions are shown to cover all the behavioral concepts targeted in the eight existing standardized instruments. Conclusion: The resulting hierarchy can be used to create more concise instruments across various ages and conditions, as well as create more robust overlapping datasets for both clinical and research use.

Synthesis and experimental studies of the method of manufacturing screw spirals with a rotating plug with feasibility study

Scientific journal of the Ternopil national technical university ◽

10.33108/visnyk_tntu2021.03.043 ◽

2021 ◽

Vol 103 (3) ◽

pp. 43-52

Author(s):

Roman Leshchuk ◽

Yuriy Palivoda ◽

Tatiana Navrotska ◽

Bogdan Hevko ◽

Roman Khoroshun ◽

...

Keyword(s):

Morphological Analysis ◽

Economic Effect ◽

Wedge Angle ◽

Experimental Studies ◽

Dominant Factor ◽

Structural Synthesis ◽

Calibration Process ◽

Steel 08Kp ◽

Hierarchical Grouping ◽

Multifactor Experiment

The structural synthesis of methods for winding screw spirals is carried out using hierarchical grouping by means of morphological analysis, yielding a number of alternatives that made it possible to create an improved method of winding screw spirals with a rotating plug. On the basis of a multifactor experiment, the torque of the per-step calibration process of the screw workpiece was studied, and a regression dependence was obtained to determine the influence of winding width, wedge angle and winding thickness on the torque of the calibration process. It is established that the per-step calibration of a turn of the screw workpiece depends on the width and thickness of the spiral and on the wedge angle of the device. For steel 08kp, the torque increases with increasing wedge inclination and winding thickness and reaches 79 N m. The dominant factor influencing the torque is the wedge angle of the device, and the least influential is the winding width. A technical and economic assessment of the method found that the annual economic effect of replacing the basic version (winding on a frame) with the proposed design (winding with a rotating plug), when the equipment operates in one shift, is 16995.79 UAH.

Development of a Trainable Classifier of State of Rail Lines with Multiple Patterns of Image Recognition

Engineering Technologies and Systems ◽

10.15507/2658-4123.030.202004.659-682 ◽

2020 ◽

Vol 30 (4) ◽

pp. 659-682

Author(s):

Evgeniy M. Tarasov ◽

Ivan K. Andronchev ◽

Andrey A. Bulatov ◽

Anna E. Tarasova

Keyword(s):

Hermite Polynomials ◽

Complex Model ◽

The State ◽

Classification Models ◽

Hierarchical Grouping ◽

Multidimensional Approximation ◽

Rail Line ◽

Control Section ◽

The Moment

Introduction. Rail lines are affected by significant damaging factors acting on the sensitive element of the information sensor, yet their state must be classified with assured quality over the required length of the control section. This creates the task of building a classifier with extended functionality. The functionality can be extended by using multidimensional state images with a set of informative features together with training procedures for the classification models. The classical principle of classification with a single model leads to an excessively complicated classification algorithm with low accuracy, because the system of conditional equations is solved inaccurately under multidimensional approximation by Hermite polynomials. Materials and Methods. To solve the problem, the following are considered: principles for reducing the dimension of the feature space, various procedures for training a classifier of rail line states with multidimensional patterns, the selection of decision rules for classification with a hierarchical grouping of classes, and the formation of a set of models of varying complexity trained to solve an incompatible system of equations. Models of varying complexity, based on Hermite polynomials, were obtained and used in the adaptive algorithm for classifying rail line states. Results. The article presents the results of developing 57 classifier models using Hermite polynomials with 2, 3, 4, 5 and 6 arguments (features). As an example, the procedure for developing models with 2–6 features is shown. The results showed that classification quality improves as the number of features increases, as it does when the state space is divided into several classes. Discussion and Conclusion. The results confirm the feasibility of classifying rail line states with a set of classification models and an algorithm that recursively increases classification complexity by moving to a more complex model. The criterion for introducing a new, more complex model is a mismatch between the class computed by the i-th model and the real class of the rail line at that moment in time.

MULTIVARIATE ANALYSIS FOR CLASSIFICATION OF DENDROMETRIC AND ENERGETIC VARIABLES OF Eucalyptus benthamii

FLORESTA ◽

10.5380/rf.v51i1.67423 ◽

2020 ◽

Vol 51 (1) ◽

pp. 118

Author(s):

Cristiane Carla Benin ◽

Luciano Farinha Watzlawick ◽

Vanderlei Aparecido De Lima

Keyword(s):

Energy Use ◽

Basic Density ◽

Multivariate Techniques ◽

Average Height ◽

Multivariate Statistical ◽

Energetic Properties ◽

Hierarchical Grouping ◽

Lower Productivity ◽

Main Components

The objective of this study was to evaluate dendrometric data and energetic properties of E. benthamii in plantations of different ages and production regions in Guarapuava-PR, using multivariate statistical analysis. The data, covering three regions (R1, R2 and R3) and three ages (5, 6 and 7 years), were submitted to multivariate techniques: factor analysis, principal component analysis and hierarchical grouping (cluster) analysis. The dimensionality of the data was reduced from the initial 13 attributes to 5 (average DBH, average height, volume per hectare, basic wood density and energy density), associated with two principal components that represent 95.22% of the data variance. The seven-year-old plantations in region R1 showed excellent energetic properties, while the seven-year-old plantations in R2 and the six-year-old plantations in R3 were the most productive areas according to the dendrometric variables. Older plantations with higher basic wood density were also of higher quality for energy use. It can be concluded that cluster analysis was adequate to stratify regions and ages with higher and lower productivity, as well as those with better energetic properties.

The relevance of ecoregions and mountainous environments in the diversity and endemism of land gastropods

Progress in Physical Geography Earth and Environment ◽

10.1177/0309133320948839 ◽

2020 ◽

pp. 030913332094883

Author(s):

DA Dos Santos ◽

E Domínguez ◽

MJ Miranda ◽

DE Gutiérrez Gregoric ◽

MG Cuezzo

Keyword(s):

Short Range ◽

Minimum Spanning Tree ◽

Physical Nature ◽

Taxonomic Diversity ◽

Species Level ◽

Taxonomic Richness ◽

Chaco Serrano ◽

Hierarchical Grouping ◽

Subtropical Dry Forests ◽

Source Of Information

Argentina, in southern South America, comprises twenty-five sub-ecoregions, which makes it a favorable area for studying the mutual correspondence between environments and biodiversity. Unfortunately, the effort devoted to studying these environments is unbalanced, with the subtropical dry forests less studied than the tropical and subtropical humid ones. Since the limits of ecoregions are based on vegetation criteria, land gastropods represent an independent source of information to test the relevance of sub-ecoregions in different aspects of biodiversity. We ask whether land gastropods mirror these traditional diversity patterns when their distributions are framed in the context of sub-ecoregions. Additionally, we test whether short-range endemic species (SRE) are randomly scattered across the sub-ecoregions. We first built an updated taxonomic checklist and mapped all the valid records compiled to date. Taxonomic richness, taxonomic diversity, and beta-diversity between sub-ecoregions were calculated. We obtained a hierarchical grouping of sub-ecoregions and the respective list of species that significantly support each cluster. We also developed two new analytical resources: a radial plot showing the species composition of clusters resolved at three taxonomic levels, and a mixed coefficient of distributional size useful for identifying SRE from sparse point records. This dimensionless measure of spatial range combines information from both the convex hull area and the length of the minimum spanning tree connecting the point localities of presence. The Southern Andean Yungas and Dry Chaco are the richest ecoregions at the species level. Although the Paranaense Forest harbors half the number of species found in the Chaco Serrano, it reaches the highest score of taxonomic diversity because of the eclectic nature of its genera. SRE species are not randomly distributed across the sub-ecoregions; they broadly overlap with the orographic Peripampasic arc extending over the Chaco Serrano and Yungas Forests. SREs are highly dependent on the physical nature of the landscape.
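The two geometric ingredients named in this abstract, the convex hull area and the minimum spanning tree length of a set of point localities, can be computed as in the sketch below. The abstract does not give the formula that combines them into the mixed coefficient, so no combination is attempted here. The coordinates are invented planar points, not real occurrence records, and the function names are hypothetical.

```python
from math import hypot


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for a simple polygon."""
    closed = vertices[1:] + vertices[:1]
    total = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(vertices, closed))
    return abs(total) / 2


def mst_length(points):
    """Total edge length of the Euclidean minimum spanning tree (Prim)."""
    pts = list(points)
    in_tree = {0}
    # best[i] is the shortest known distance from point i to the tree.
    best = [hypot(p[0] - pts[0][0], p[1] - pts[0][1]) for p in pts]
    total = 0.0
    while len(in_tree) < len(pts):
        j = min((i for i in range(len(pts)) if i not in in_tree),
                key=best.__getitem__)
        total += best[j]
        in_tree.add(j)
        for i in range(len(pts)):
            if i not in in_tree:
                d = hypot(pts[i][0] - pts[j][0], pts[i][1] - pts[j][1])
                best[i] = min(best[i], d)
    return total


# Invented localities: four corners of a square plus its centre.
localities = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1)]
hull_area = polygon_area(convex_hull(localities))
tree_length = mst_length(localities)
```

The hull area reflects the overall extent of a distribution, while the spanning tree length reflects how spread out the individual records are. Real latitude and longitude records would need a projection or a geodesic distance, which this planar sketch ignores.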

Development Grouping of Synonym Set Thesaurus Vocabulary The Qur’an in English Using Hierarchical Clustering Algorithm

JURNAL INFOTEL ◽

10.20895/infotel.v12i3.477 ◽

2020 ◽

Vol 12 (3) ◽

Author(s):

Salma Fauziah ◽

Moch Arif Bijaksana

Keyword(s):

Text Mining ◽

Hierarchical Clustering ◽

English Translation ◽

Clustering Algorithm ◽

Research Development ◽

Research System ◽

Hierarchical Grouping ◽

Grouping Method ◽

Hierarchical Clustering Algorithm ◽

F Measure

Research in text mining that processes entries or words from the Qur'an is very beneficial for Muslims. This study aims to build sets of synonyms for a thesaurus of the words of the Qur'an. The work is motivated by the continuing lack of sources of knowledge about the science of the Qur'an. The dataset is the Corpus Qur'an with its English translation. This research extends a previously published article, "The Development of Al-Qur'an Vocabulary Set Synonyms with WordNet Approach" by Laras Gupitasari. The input to the system is the set of nouns from the English translation of the Qur'an. The output is several groups of words arranged by closeness of meaning; in the first group, the words have close meanings. To produce the output, the study groups words with a hierarchical grouping method, calculates distances using common paths, and then groups the results according to the closeness of meaning of the word entries. The evaluation produced an F-Measure of 76%; the F-Measure evaluates the accuracy of the predictions issued by the system.
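The F-Measure reported here is conventionally the harmonic mean of precision and recall. The sketch below shows that standard definition for a single synonym group, assuming a set-based comparison against a reference group. The abstract does not say exactly how the authors matched predicted and reference groups, and the words used are invented examples.

```python
def f_measure(predicted, gold):
    """Harmonic mean of precision and recall for two sets of items."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Invented example: a predicted synonym group and a reference group.
predicted_group = {"mercy", "compassion", "grace", "light"}
reference_group = {"mercy", "compassion", "grace", "kindness"}
score = f_measure(predicted_group, reference_group)
```

Three of the four predicted words appear in the reference group and three of the four reference words were found, so precision and recall are both 0.75 and so is their harmonic mean.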
