Comparison of methods for selection of castor beans lineages

The choice of the most appropriate method depends on the precision the researcher requires, the ease of the analysis, and the way the data were obtained. To select lineages of low size and high productivity, this study evaluated different methods of cluster analysis for representing genetic divergence and compared them with univariate methods. The variables analyzed were grain yield, plant size and oil yield of 24 castor bean lineages cultivated in 2014 and 2015. The Single and Average methods formed similar groups, which differed from those of the Complete method. For the purposes of this research, the Complete method and principal components analysis, together with discriminant analysis, were considered the most appropriate methods for evaluating the genetic divergence of the castor bean crop. Lineages 18, 19 and 20 showed average grain yields above 1555 kg ha-1, high oil content (above 46.9%), and short plants (below 116 cm).
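
As a rough illustration of how the choice of method can change the groups, the sketch below implements a naive agglomerative clustering and runs it with three linkage rules. It assumes that "Single", "Average" and "Complete" in the abstract refer to the usual hierarchical linkage criteria; the function name `agglomerate` and the toy data are hypothetical and are not taken from the study.

```python
from itertools import combinations
from math import dist


def agglomerate(points, n_clusters, linkage):
    """Naive agglomerative clustering.

    Returns the clusters as a sorted list of sorted point indices.
    """
    # How the pairwise distances between two clusters are summarized.
    reducers = {
        "single": min,                            # nearest pair
        "complete": max,                          # farthest pair
        "average": lambda ds: sum(ds) / len(ds),  # mean over all pairs
    }
    reduce_fn = reducers[linkage]
    clusters = [[i] for i in range(len(points))]

    def cluster_distance(a, b):
        return reduce_fn([dist(points[i], points[j]) for i in a for j in b])

    while len(clusters) > n_clusters:
        # Merge the two clusters that are closest under the chosen linkage.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: cluster_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return sorted(sorted(c) for c in clusters)


# Hypothetical one-variable toy data, not values from the study.
toy_points = [(0.0,), (1.0,), (2.2,), (3.5,), (5.0,)]
for method in ("single", "average", "complete"):
    print(method, agglomerate(toy_points, 2, method))
```

On this toy data, single linkage chains the first four points into one group, while average and complete linkage split the points into a group of two and a group of three. Which methods agree depends on the data, so this does not reproduce the pattern reported in the abstract.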

Comparison of Methods to Display Principal Component Analysis, Focusing on Biplots and the Selection of Biplot Axes

Advances in Systems Analysis, Software Engineering, and High Performance Computing - Handbook of Research on Computational Simulation and Modeling in Engineering ◽

10.4018/978-1-4666-8823-0.ch010 ◽

2016 ◽

pp. 289-332

Author(s):

Carla Barbosa ◽

M. Rui Alves ◽

Beatriz Oliveira

Keyword(s):

Principal Component ◽

Statistical Technique ◽

Practical Case ◽

Comparison Of Methods ◽

Multivariate Statistical ◽

New Developments ◽

Model Complex ◽

Components Analysis ◽

Almost All ◽

Selection Of

Principal components analysis (PCA) is probably the most important multivariate statistical technique, used to model complex problems or simply for data mining in almost all areas of science. Although well known to researchers and available in most statistical packages, it is often misunderstood and poses problems when applied by inexperienced users. A biplot concentrates all the information on sample units and variables in a single display, in an attempt to aid interpretation and avoid overestimations. This chapter covers the main mathematical aspects of PCA, as well as the form and covariance biplots developed by Gabriel and the predictive and interpolative biplots devised by Gower and coworkers. New developments are also presented, involving techniques to automate the production of biplots, with a controlled output in terms of axis predictivities and interpolative accuracies, supported by the AutoBiplot.PCA function developed in R. A practical case is used for illustration and discussion.
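
For orientation, the form and covariance biplots mentioned above are commonly written in terms of the singular value decomposition of the column-centered data matrix. The notation below ($X$, $U$, $\Sigma$, $V$, $\alpha$) is a standard textbook convention assumed here, not necessarily the chapter's own:

```latex
\[
X = U \Sigma V^{\top}, \qquad
G = U \Sigma^{\alpha}, \qquad
H = V \Sigma^{1-\alpha}, \qquad
X = G H^{\top}
\]
% alpha = 1: form biplot (distances between sample units are preserved)
% alpha = 0: covariance biplot (inner products between variables are
%            proportional to their covariances)
```

Keeping only the first two columns of $G$ and $H$ gives the rank-2 approximation that is plotted, with rows of $G$ as sample markers and rows of $H$ as variable markers.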

Application of Cluster Analysis in Breeding Research

Scientific and Technical Bulletin of the Institute of Oilseed Crops NAAS ◽

10.36710/ioc-2020-29-01 ◽

2020 ◽

pp. 6-15

Keyword(s):

Cluster Analysis ◽

Fatty Acid Composition ◽

Oleic Acid ◽

Fatty Acid ◽

Acid Composition ◽

Oil Content ◽

Clustering Method ◽

The Third ◽

Economically Valuable Traits ◽

Selection Of

Varieties and hybrids of agricultural crops are characterized by a large number of traits: morphological, economically valuable and biochemical. Usually, when breeding samples are compared at the initial stage of research, only a few traits are used, and these are assessed with univariate criteria. In rapeseed breeding, an integrated approach is also important for assessing and selecting promising samples; it takes into account the morphological characteristics that are components of productivity, the oil content and quality, and the glucosinolate content of the seeds. Cluster analysis is a multivariate method for determining the optimal values of the estimated indicators. The aim of the research was to analyze and select, at the initial stage, promising breeding samples of winter rapeseed suitable for further work, using the "k-means" clustering method. The material comprised 125 breeding samples of winter rapeseed. The number of pods on the central branch and the oil and glucosinolate content of the seeds were determined, and the fatty acid composition of the oil was analyzed (the content of palmitic, stearic, oleic, linoleic, linolenic and erucic acids). The studies were carried out in 2018-2019 in the southern Steppe of Ukraine. Statistical processing and evaluation of the results used a modified "k-means" clustering method carried out with Data Mining tools. It differs from the classical clustering method in that the optimal number of clusters is selected by the Statistica software package. The material was processed and analyzed in two stages.
At the first stage, "k-means" cluster analysis was applied separately to the economically valuable traits and to the fatty acid composition of the oil, and the clusters of samples with the best ratio of the corresponding indicators were determined. At the second stage, the best samples from these clusters were assessed only by oil content and oleic acid content, and a further clustering identified the group of samples with the maximum values of these indicators. Linoleic acid content was excluded from the cluster analysis of fatty acid composition because of its high correlation with oleic acid, and erucic acid was excluded because its sample distribution deviated from the normal distribution. Before cluster analysis, the samples were reduced to dimensionless form by normalization on the z-scale. The cluster analysis distributed the samples into four clusters by economically valuable traits and into two clusters by fatty acid composition of the oil. The samples that form these clusters were also identified. The first cluster for economically valuable traits contains 26 samples, the second 33, the third 39 and the fourth 27. The first cluster for fatty acid composition contains 72 samples and the second 53. The highest seed oil content and number of pods on the central branch, together with the lowest glucosinolate content in the seeds, are found in the third cluster, and the maximum oleic acid content in the oil is found in the samples of the second cluster. Analysis of variance of the clustering results showed that the mean values of the economically valuable traits and of the fatty acid composition differ statistically significantly between clusters. Thus, "k-means" clustering formed clusters of samples that differ statistically significantly from each other in the studied traits.
Only 15 samples belong both to the third cluster for economically valuable traits and to the second cluster for fatty acid composition of the oil. At the second stage, the best samples from this group were selected for further breeding work based on the oil content of the seeds and the oleic acid content of the oil. Cluster analysis distributed them into four clusters. Finally, for further breeding aimed at a high oleic acid content in the oil, five samples from the first cluster were selected (oleic acid content 69.4-70.6%, oil content 49.0-52.1%), together with three samples from the second cluster with an oil content of 51.1-51.8%. Thus, the modified "k-means" clustering method proved effective for analyzing a large number of winter rapeseed samples on several traits simultaneously in order to select genotypes with an optimal ratio of economically valuable traits.
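
The two preprocessing and clustering steps described above (z-scale normalization followed by k-means) can be sketched as follows. This is plain Lloyd-style k-means with a fixed number of clusters and hand-picked starting centroids, not the modified Statistica procedure used in the study; the function names, the use of the population standard deviation, and the sample values are all assumptions made for illustration.

```python
from math import dist
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each column to zero mean and unit (population) std."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [tuple((x - m) / s for x, m, s in zip(row, mus, sds)) for row in rows]


def kmeans(points, centroids, n_iter=100):
    """Plain k-means starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(n_iter):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            new.append(tuple(mean(c) for c in zip(*members)) if members else centroids[k])
        if new == centroids:
            break
        centroids = new
    return labels, centroids


# Hypothetical samples: (oil content %, glucosinolates), not data from the study.
samples = [(50.1, 12.0), (51.3, 11.0), (49.8, 13.5),
           (44.0, 22.0), (43.2, 24.5), (45.1, 21.0)]
z = zscore_columns(samples)
labels, centers = kmeans(z, [z[0], z[3]])
print(labels)
```

Standardizing first keeps a trait measured on a large numeric scale from dominating the Euclidean distances used in the assignment step.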

The Golf Players' Motivations: The Algarve Case

Tourism and Hospitality Research ◽

10.1057/palgrave.thr.6050014 ◽

2006 ◽

Vol 6 (3) ◽

pp. 227-238 ◽

Cited By ~ 7

Author(s):

Antonia Correia ◽

Pedro Pintassilgo

Keyword(s):

Cluster Analysis ◽

Principal Components Analysis ◽

Social Environment ◽

Principal Components ◽

Golf Courses ◽

The Sun ◽

Market Segments ◽

The Third ◽

Components Analysis ◽

Selection Of

The purpose of this article is to investigate the motivations behind golf demand in the Algarve, one of Europe's most popular golf destinations. The research is based on the results of a survey of golf demand at the Algarve's golf courses, held in 2002. To identify the main motives behind golf demand in the region, a principal components analysis was performed. Four main choice factors were identified to explain the selection of the Algarve's golf courses. The first was designated social environment and is associated with motives such as events and beaches. The second, leisure, is related to restaurants and bars, landscape, weather and accommodation. The third, entitled golf, is directly related to the characteristics of the courses. The fourth, logistics, is associated with variables such as price and accessibility. A cluster analysis also shows that the choice factors can be associated with three market segments: the tourist golfer, who is mostly concerned with the golf courses and the game; the householder golfer, essentially centred on accommodation, gastronomy, landscape, weather, price and accessibility; and finally, the sun-beach tourist, who is mostly interested in tourist opportunities.

GENETIC DIVERGENCE AMONG SPECIES OF PALM TREES Acrocomia Mart. BASED ON MORPHOAGRONOMIC DESCRIPTORS

ENERGIA NA AGRICULTURA ◽

10.17224/energagric.2020v35n4p562-577 ◽

2021 ◽

Vol 35 (4) ◽

pp. 562-577

Author(s):

Paulo Henrique Silva ◽

Suelen Alves Vianna ◽

Cássia Regina Limonta Carvalho ◽

Joaquim Adelino de Azevedo Filho ◽

Carlos Augusto Colombo

Keyword(s):

Cluster Analysis ◽

Genetic Divergence ◽

Geographical Origin ◽

Economic Potential ◽

Breeding Programs ◽

Future Studies ◽

Palm Trees ◽

Acrocomia Aculeata ◽

Large Groups ◽

Selection Of

Author affiliations: Paulo Henrique da Silva, Master's student in the Graduate Program in Tropical and Subtropical Agriculture (Genetics, Breeding and Plant Biotechnology), Instituto Agronômico (IAC), Avenida Barão de Itapura, 1481, Botafogo, Campinas, SP, Brasil, CEP 13.020-902. Suelen Alves Vianna, postdoctoral researcher at the Center for Research & Development of Plant Genetic Resources, Molecular Biology Laboratory, Instituto Agronômico (IAC), same address. Cássia Regina Limonta Carvalho, researcher at the Center for Research & Development of Plant Genetic Resources, Phytochemistry Laboratory, Instituto Agronômico (IAC), same address. Joaquim Adelino de Azevedo Filho, researcher at the Agência Paulista de Tecnologia dos Agronegócios (APTA), Pólo Regional do Leste Paulista, Rua Dr. José Paiva Castro, 1493, Monte Alegre do Sul, SP, CEP 13.910-000. Carlos Augusto Colombo, researcher at the Center for Research & Development of Plant Genetic Resources, Molecular Biology Laboratory, Instituto Agronômico (IAC), same address.
Abstract: The native palm trees Acrocomia aculeata and Acrocomia totai are used for several purposes, mainly the use of fresh or processed pulp for food and the extraction of oil from the pulp and kernel, which has various applications. Given their economic potential and the existing doubt about their taxonomy, 60 individuals were characterized in three populations of each species using 41 morphoagronomic descriptors. The data were analyzed using univariate and multivariate statistics (similarity estimated with the Gower index and clusters formed with the UPGMA method). Great variation was found in most of the descriptors analyzed, within and between populations and species. The population of Luz-MG had the highest values for fruit descriptors and the population of Corumbá-MS the lowest. The cluster analysis revealed two large groups corresponding to the species analyzed, and the subdivision within each group corresponds to geographical origin.
The variation found within each species can guide the selection of more productive individuals in breeding programs, and the divergence between species, in addition to confirming their taxonomy, supports future studies and better use of the species. Keywords: Acrocomia aculeata, Acrocomia totai, Arecaceae, diversity, pre-breeding.
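
To make the Gower index step more concrete, the sketch below computes a Gower-style dissimilarity between individuals described by a mix of numeric and categorical descriptors; the similarity is one minus this value. It follows the common textbook definition (range-scaled absolute difference for numeric descriptors, simple mismatch for categorical ones, averaged with equal weights) and may differ from the exact variant used in the study. The function names and the two descriptors are hypothetical. The resulting dissimilarities would then feed an average-linkage (UPGMA) clustering, which is not shown here.

```python
def descriptor_ranges(rows, is_numeric):
    """Observed range of each numeric descriptor; None marks a categorical one."""
    cols = list(zip(*rows))
    return [max(c) - min(c) if num else None for c, num in zip(cols, is_numeric)]


def gower_distance(a, b, ranges):
    """Equal-weight Gower dissimilarity between two individuals (0 to 1)."""
    parts = []
    for x, y, r in zip(a, b, ranges):
        if r is None:
            # Categorical descriptor: 0 if the states match, 1 otherwise.
            parts.append(0.0 if x == y else 1.0)
        else:
            # Numeric descriptor: absolute difference scaled by the range.
            parts.append(abs(x - y) / r if r else 0.0)
    return sum(parts) / len(parts)


# Hypothetical descriptors: (fruit diameter in mm, pulp colour).
palms = [(40.0, "yellow"), (30.0, "yellow"), (20.0, "orange")]
ranges = descriptor_ranges(palms, is_numeric=[True, False])
for i in range(len(palms)):
    for j in range(i + 1, len(palms)):
        print(i, j, gower_distance(palms[i], palms[j], ranges))
```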

Understanding and utilization of genotype-by-environment interaction in maize breeding

Genetika ◽

10.2298/gensr1001079b ◽

2010 ◽

Vol 42 (1) ◽

pp. 79-90 ◽

Cited By ~ 3

Author(s):

Vojka Babic ◽

Milosav Babic ◽

Mile Ivanovic ◽

Marija Kraljevic-Balalic ◽

Miodrag Dimitrijevic

Keyword(s):

Cluster Analysis ◽

Principal Components ◽

Maximum Yield ◽

Growing Season ◽

Genotype By Environment Interaction ◽

Environment Interaction ◽

Maize Breeding ◽

Good Guide ◽

Components Analysis ◽

Selection Of

Because of interaction and noise in experiments, yield trials for studying varieties are carried out at numerous locations and over several years. Data from such trials serve three principal tasks: to evaluate and predict yield precisely on the basis of limited experimental data; to determine stability and explain variability in the response of genotypes across locations; and to be a good guide for selecting the best genotype for sowing under new agroecological conditions. Yield prediction that does not include the interaction with environments is incomplete and imprecise. Therefore, many breeding and agronomic studies are devoted to observing the interaction in replicated multilocation trials, with the aim of using the interaction to obtain the maximum yield in any environment. Fifteen maize hybrids were analyzed in 24 environments. As the interaction accounts for 6% of the total sum of squares and genotypes for 2%, the interaction deserves more detailed examination than classical analysis of variance (ANOVA) provides. To examine the interaction effect in detail and provide a better understanding of genotypes, environments and their interactions, AMMI (Additive Main Effects and Multiplicative Interaction) and cluster analysis were applied. Partitioning the interaction into principal components by principal components analysis (PCA) revealed a part of the systematic variation in the interaction. This variation is attributed to the length of the growing season in genotypes and to the precipitation sum during the growing season in environments. The groups formed by the cluster analysis agree closely with the grouping observed in the biplot of the AMMI1 model.
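
The sum-of-squares shares quoted above come from splitting a genotype-by-environment table of means into additive main effects and an interaction remainder. The sketch below shows that additive step only, for a table without replicates; AMMI would then apply a singular value decomposition (PCA) to the residual matrix, which is omitted here. The function name `ss_partition` and the 2 x 2 table are hypothetical and are not the maize data.

```python
def ss_partition(table):
    """Split a genotype x environment table of means into SS shares.

    table[g][e] is the mean yield of genotype g in environment e.
    Returns (shares, residuals), where residuals is the interaction matrix.
    """
    n_g, n_e = len(table), len(table[0])
    grand = sum(map(sum, table)) / (n_g * n_e)
    g_means = [sum(row) / n_e for row in table]
    e_means = [sum(col) / n_g for col in zip(*table)]

    # Interaction residuals: what remains after removing both main effects.
    residuals = [
        [table[g][e] - g_means[g] - e_means[e] + grand for e in range(n_e)]
        for g in range(n_g)
    ]
    ss_g = n_e * sum((m - grand) ** 2 for m in g_means)
    ss_e = n_g * sum((m - grand) ** 2 for m in e_means)
    ss_ge = sum(r ** 2 for row in residuals for r in row)
    total = ss_g + ss_e + ss_ge
    shares = {
        "genotype": ss_g / total,
        "environment": ss_e / total,
        "interaction": ss_ge / total,
    }
    return shares, residuals


# Hypothetical yields: 2 genotypes (rows) x 2 environments (columns).
shares, residuals = ss_partition([[1.0, 2.0], [3.0, 6.0]])
print(shares)
```

In this toy table the environment and genotype effects dominate, and the interaction share is the part that an additive model cannot explain.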

Cluster analysis for large data sets: applications to individual aerosol particles from the mid-pacific

Proceedings, annual meeting, Electron Microscopy Society of America ◽

10.1017/s0424820100132078 ◽

1992 ◽

Vol 50 (2) ◽

pp. 1488-1489

Author(s):

Thomas W. Shattuck ◽

James R. Anderson ◽

Neil W. Tindale ◽

Peter R. Buseck

Keyword(s):

Cluster Analysis ◽

Chemical Reactivity ◽

Large Data ◽

Large Data Sets ◽

Particle Analysis ◽

Data Sets ◽

Halogen Chemistry ◽

Complete Study ◽

Components Analysis ◽

Automated Scanning

Individual particle analysis involves the study of tens of thousands of particles using automated scanning electron microscopy and elemental analysis by energy-dispersive X-ray emission spectroscopy (EDS). EDS produces large data sets that must be analyzed using multivariate statistical techniques. A complete study uses cluster analysis, discriminant analysis, and factor or principal components analysis (PCA). The three techniques are used in the study of particles sampled during the FeLine cruise to the mid-Pacific ocean in the summer of 1990. The mid-Pacific aerosol provides information on long range particle transport, iron deposition, sea salt ageing, and halogen chemistry. Aerosol particle data sets suffer from a number of difficulties for pattern recognition using cluster analysis. There is a great disparity in the number of observations per cluster and the range of the variables in each cluster. The variables are not normally distributed, they are subject to considerable experimental error, and many values are zero, because of finite detection limits. Many of the clusters show considerable overlap, because of natural variability, agglomeration, and chemical reactivity.

Cluster Analysis of Antigenic Profiles of Tumors: Selection of Number of Clusters Using Akaike’s Information Criterion

Methods of Information in Medicine ◽

10.1055/s-0038-1634783 ◽

1990 ◽

Vol 29 (03) ◽

pp. 200-204 ◽

Cited By ~ 7

Author(s):

J. A. Koziol

Keyword(s):

Cluster Analysis ◽

Basic Problem ◽

Information Criterion ◽

Akaike's Information Criterion ◽

Cell Surface Antigens ◽

Number Of Clusters ◽

Multinomial Data ◽

Tumor Types ◽

Selection Of

A basic problem of cluster analysis is the determination or selection of the number of clusters evinced in any set of data. We address this issue with multinomial data using Akaike’s information criterion and demonstrate its utility in identifying an appropriate number of clusters of tumor types with similar profiles of cell surface antigens.
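
For reference, Akaike's information criterion in its general form is given below. The symbols are assumptions introduced here: $\hat{L}_g$ is the maximized likelihood of a model with $g$ clusters and $p_g$ is its number of free parameters. The specific multinomial likelihood used in the paper is not given in the abstract.

```latex
\[
\mathrm{AIC}(g) = -2 \ln \hat{L}_g + 2\, p_g,
\qquad
\hat{g} = \arg\min_{g} \mathrm{AIC}(g)
\]
```

The first term rewards fit and the second penalizes the extra parameters that each additional cluster introduces, so the selected number of clusters balances the two.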

Application of Surface Water Quality Classification Models Using Principal Components Analysis and Cluster Analysis

SSRN Electronic Journal ◽

10.2139/ssrn.3364401 ◽

2019 ◽

Cited By ~ 1

Author(s):

Mohamed Ahmed Reda Hamed

Keyword(s):

Water Quality ◽

Cluster Analysis ◽

Surface Water ◽

Principal Components ◽

Surface Water Quality ◽

Classification Models ◽

Quality Classification ◽

Water Quality Classification ◽

Components Analysis ◽

And Cluster Analysis

Using clusterization principles for the selection of repair sections of main oil pipelines based on diagnostic data

Proceedings of the Mavlyutov Institute of Mechanics ◽

10.21662/uim2011.1.019 ◽

2011 ◽

Vol 8 (1) ◽

pp. 201-210

Author(s):

R.M. Bogdanov

Keyword(s):

Cluster Analysis ◽

Defect Density ◽

Density Variation ◽

Distance Functions ◽

Oil Pipeline ◽

Oil Pipelines ◽

Classification Of Images ◽

Main Oil Pipeline ◽

Selection Of

The problem of determining the repair sections of a main oil pipeline is solved on the basis of image classification using distance functions and the clustering principle. The criteria characterizing a cluster are determined by certain given values, and a defect is assigned to a given cluster by comparison with these values. Procedures are provided for redistributing defects among cluster zones and for changing the cluster zone parameters. Calculations demonstrate the range of defect density variation across pipeline sections and the universal capabilities, provided by cluster analysis, for configuring linear objects with arbitrary density.
