An approach to the development of a core set of germplasm using a mixture of qualitative and quantitative data

2014 ◽  
Vol 13 (2) ◽  
pp. 96-103 ◽  
Author(s):  
Rupam Kumar Sarkar ◽  
Prabina Kumar Meher ◽  
S. D. Wahi ◽  
T. Mohapatra ◽  
A. R. Rao

Development of a representative and well-diversified core with minimum duplicate accessions and maximum diversity from a larger population of germplasm is essential for breeders involved in crop improvement programmes. Most of the existing methodologies for the identification of a core set are based on either qualitative or quantitative data. In this study, an approach to the identification of a core set of germplasm based on the response from a mixture of qualitative (single nucleotide polymorphism genotyping) and quantitative data was proposed. For this purpose, six different combined distance measures, three for quantitative data and two for qualitative data, were proposed and evaluated. The combined distance matrices were used as inputs to seven different clustering procedures for classifying the population of germplasm into homogeneous groups. Subsequently, an optimum number of clusters based on all clustering methodologies using different combined distance measures was identified on a consensus basis. Average cluster robustness values across all the identified optimum numbers of clusters under each clustering methodology were calculated. Overall, three different allocation methods were applied to sample the accessions that were selected from the clusters identified under each clustering methodology, with the highest average cluster robustness value being used to formulate a core set. Furthermore, an index was proposed for the evaluation of diversity in the core set. The results reveal that the combined distance measure A1B2 (the distance based on the average of the range-standardized absolute difference for quantitative data, combined with the rescaled distance based on the average absolute difference for qualitative data), with three clusters identified by the k-means clustering algorithm and sampled by the proportional allocation method, was suitable for the identification of a core set from a collection of rice germplasm.
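The A1B2 measure and the proportional allocation step described above can be illustrated with a minimal Python sketch. The abstract does not state how the two components are combined or how SNP genotypes are coded, so the equal-weight mean, the 0/1/2 genotype coding, and all function names below are assumptions made for illustration only.

```python
def range_standardized_distance(x, y, ranges):
    """A1-style component: average of |x_k - y_k| / range_k over quantitative traits."""
    return sum(abs(a - b) / r for a, b, r in zip(x, y, ranges)) / len(ranges)


def average_absolute_difference(g, h, max_diff=2):
    """B2-style component: average |g_k - h_k| over SNP codes, rescaled to [0, 1].

    Assumes genotypes are coded 0, 1, 2, so the largest per-locus difference is 2.
    """
    return sum(abs(a - b) for a, b in zip(g, h)) / (max_diff * len(g))


def combined_distance(x, y, g, h, ranges):
    """Equal-weight mean of the two components (an assumption, not the paper's rule)."""
    quantitative = range_standardized_distance(x, y, ranges)
    qualitative = average_absolute_difference(g, h)
    return 0.5 * (quantitative + qualitative)


def proportional_allocation(cluster_sizes, core_size):
    """Sample from each cluster in proportion to its size.

    Rounding can make the allocations sum to slightly more or less than core_size.
    """
    total = sum(cluster_sizes)
    return [round(core_size * n / total) for n in cluster_sizes]


# Two hypothetical accessions: two quantitative traits and four SNP loci each.
d = combined_distance([2.0, 1.0], [7.0, 3.0], [0, 1, 2, 2], [2, 1, 0, 2], ranges=[10.0, 4.0])
allocation = proportional_allocation([50, 30, 20], core_size=10)
```

Both components lie in [0, 1], so neither data type dominates the combined distance merely because of its scale.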

2019 ◽  
Vol 37 (2) ◽  
pp. 172-179 ◽  
Author(s):  
Gisely Paula Gomes ◽  
Viviane Yumi Baba ◽  
Odair P dos Santos ◽  
Cláudia P Sudré ◽  
Cintia dos S Bento ◽  
...  

Characterization and evaluation of genotypes conserved in germplasm banks have become of great importance due to the gradual loss of genetic variability and the search for more adapted and productive genotypes. Characterization can be carried out in several ways, generating quantitative and qualitative data. Joint analysis of those variables may be considered a strategy for an accurate germplasm characterization. In this study we aimed to evaluate different clustering techniques for characterization and evaluation of Capsicum spp. accessions using combinations of specific measures for quantitative and qualitative variables. A collection of 56 Capsicum spp. accessions was characterized based on 25 morphoagronomic descriptors. Six quantitative distances were used [A1) average of the range-standardized absolute difference (Gower), A2) Pearson correlation, A3) Kulczynski, A4) Canberra, A5) Bray-Curtis, and A6) Morisita] combined with a distance for qualitative data [Simple Coincidence (B1)]. Clustering analyses were performed using agglomerative hierarchical methods (Ward, the nearest neighbor, the farthest neighbor, UPGMA and WPGMA). All combined distances were highly correlated. UPGMA clustering was the most efficient according to cophenetic correlation and 2-norm analyses, which agreed with each other. Six clusters were considered the ideal number by UPGMA clustering, for which the Gower distance showed a better fit. Most combined distances using UPGMA clustering allowed the separation of the accessions by species, using both quantitative and qualitative data, which could be an alternative for simultaneous joint analysis, aiming to compare different clusters.


2019 ◽  
Vol 22 (1) ◽  
pp. 55-58
Author(s):  
Nahla Ibraheem Jabbar

The proposed method addresses a drawback of the mountain algorithm for image clustering: the difficulty of computing its parameter values. All existing clustering algorithms require parameter values before the clustering process can start, and computing these parameters is a major problem. The mountain algorithm is a well-known clustering method that gives the expected number of clusters. In this paper we present a modification of mountain clustering called Spatial Modification in the Parameters of Mountain Image Clustering Algorithm. The modification uses the spatial information of the image: a window mask is taken around each center pixel, and the distance between the pixel and its neighborhood is computed to estimate the values of the parameters σ and β, which give a potential optimum number of clusters required in the image segmentation process. Our experiments show the ability of the proposed algorithm to segment brain images with good quality on large data sets.


2012 ◽  
Vol 21 (01) ◽  
pp. 1250001 ◽  
Author(s):  
PUNIETHAA PRABHU ◽  
K. DURAISWAMY

A central difficulty in the increasingly demanded task of clustering unlabeled objects into similarity groups lies in determining the number of clusters. In addition, the performance of the clustering should be analyzed to ensure that objects are grouped precisely. Available clustering algorithms depend on a specified number of clusters C and a threshold. The method proposed in this paper is Enhanced Visual Assessment of Cluster Tendency, which automatically identifies the number of object groups or clusters in unlabeled datasets. The proposed algorithm relies on visual assessment of cluster tendency (VAT) and combines Euclidean and Mahalanobis distance measures with common image processing techniques. Enhanced VAT produces a binary image, which can be visually assessed for cluster tendency. However, VAT becomes impractical for huge datasets. Enhanced VAT reduces the amount of computation and computes dissimilarities with different distance metrics for an effective visual evaluation process. Validation of our algorithm is performed on several UCI datasets and real-world HIV datasets.
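For readers unfamiliar with VAT, the reordering step on which it relies can be sketched in a few lines of Python. This is the basic VAT ordering as commonly described (a Prim-like traversal of the dissimilarity matrix), not the enhanced variant proposed in the paper; the function names and the toy one-dimensional data are assumptions for illustration.

```python
def vat_order(R):
    """Return the VAT ordering of objects for a symmetric dissimilarity matrix R."""
    n = len(R)
    # Start from an object involved in the largest dissimilarity.
    first = max(range(n), key=lambda i: max(R[i]))
    order = [first]
    remaining = set(range(n)) - {first}
    while remaining:
        # Prim-like step: take the unvisited object closest to any visited one.
        nxt = min(remaining, key=lambda j: min(R[i][j] for i in order))
        order.append(nxt)
        remaining.remove(nxt)
    return order


def reorder(R, order):
    """Permute rows and columns of R so that similar objects sit next to each other."""
    return [[R[i][j] for j in order] for i in order]


# Toy data: four points on a line forming two well-separated groups.
points = [0, 10, 1, 11]
R = [[abs(a - b) for b in points] for a in points]
order = vat_order(R)
R_star = reorder(R, order)
```

When the reordered matrix is displayed as a grayscale image, clusters appear as dark blocks along the diagonal; in this toy case the two groups form two 2 x 2 blocks.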


2011 ◽  
Vol 9 (4) ◽  
pp. 523-527 ◽  
Author(s):  
Rupam Kumar Sarkar ◽  
A. R. Rao ◽  
S. D. Wahi ◽  
K. V. Bhat

Knowledge of the genetic diversity of germplasm of breeding material is invaluable in crop improvement programmes. Frequently, qualitative and quantitative data are used separately to assess genetic diversity of crop genotypes. While assessing diversity based on qualitative and quantitative traits separately, there may occur a problem when the degree of correspondence between the clusters formed does not agree with each other. This study compares five different procedures of clustering based on the criterion of weighted average of observed proportion of misclassification in black gram genotypes using qualitative, quantitative traits and mixture data. The INDOMIX- and PRINQUAL-based clustering procedures, i.e. INDOMIX and PRINQUAL methods in conjunction with the k-means clustering procedure, show better performance compared with other clustering procedures, followed by clustering based on either quantitative or qualitative data alone. The use of the INDOMIX- and PRINQUAL-based procedures can help breeders in capturing the variation present in both qualitative and quantitative trait data simultaneously and solving the problem of ambiguity over the degree of correspondence between clustering based on either qualitative or quantitative traits alone.


2019 ◽  
Vol 8 (3) ◽  
pp. 285-295
Author(s):  
Ratna Kencana Putri ◽  
Budi Warsito ◽  
Mustafid Mustafid

Online social media is a new kind of media that is steadily growing and has become publicly popular. Because it spreads information rapidly and is easy for internet users to access, social media provides a new alternative for advertising and product segmentation. Twitter is one of the most favored social media platforms, with 19.5 million users in Indonesia to date. In this research, text mining is applied to cluster tweets from the @LazadaID Twitter account using the Modified Gustafson-Kessel clustering algorithm. The clustering process is executed five times, with the number of clusters ranging from two to six. The results indicate that the optimum number of clusters based on the Partition Coefficient and Classification Entropy validation indices is three. These three clusters are tweets containing electronics offers, discounts, and prize quizzes. Tweets with the most retweets and likes are prize quiz tweets. PT Lazada Indonesia could use this kind of tweet for advertising on Twitter because prize quiz tweets are liked by followers of the @LazadaID Twitter account.

Keywords: Twitter, advertising, Lazada Indonesia, Gustafson-Kessel Clustering algorithm, validation index
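The two validation indices named in the abstract have standard definitions for a fuzzy membership matrix, sketched below in Python. The layout of the matrix (one row per tweet, one column per cluster) and the function names are assumptions; the abstract does not give the study's own implementation.

```python
import math


def partition_coefficient(U):
    """PC = (1/N) * sum of squared memberships; closer to 1 means a crisper partition."""
    return sum(u * u for row in U for u in row) / len(U)


def classification_entropy(U):
    """CE = -(1/N) * sum of u * ln(u); closer to 0 means a crisper partition."""
    return -sum(u * math.log(u) for row in U for u in row if u > 0) / len(U)


# Hypothetical membership matrices: each row sums to 1.
crisp = [[1.0, 0.0], [0.0, 1.0]]
fuzzy = [[0.5, 0.5], [0.5, 0.5]]

pc_crisp, ce_crisp = partition_coefficient(crisp), classification_entropy(crisp)
pc_fuzzy, ce_fuzzy = partition_coefficient(fuzzy), classification_entropy(fuzzy)
```

In a search over candidate cluster counts such as the two-to-six range used here, one would typically prefer the count with a high PC and a low CE.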


2021 ◽  
Vol 2021 ◽  
pp. 1-13
Author(s):  
Rachid Sammouda ◽  
Ali El-Zaart

Prostate cancer is one of the common cancer types affecting men all over the world. Prostate-specific membrane antigen (PSMA) expressed by type-II is an extremely attractive style for imaging-based diagnosis of prostate cancer. Clinically, photodynamic therapy (PDT) is used as a noninvasive therapy in the treatment of several cancers and some other diseases. This paper aims to segment (cluster) and analyze pixels of histological and near-infrared (NIR) prostate cancer images acquired by PSMA-targeting PDT low weight molecular agents. Such agents can provide image guidance for resection of prostate tumors and permit subsequent PDT to remove remaining or noneradicable cancer cells. The color prostate image segmentation is accomplished using an optimized image segmentation approach. The optimized approach combines the k-means clustering algorithm with the elbow method, which can give better clustering of pixels by automatically determining the best number of clusters. Cluster statistics and pixel ratio results for the segmented images show the applicability of the proposed approach for giving the optimum number of clusters for prostate cancer analysis and diagnosis.
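The combination of k-means with an elbow rule can be sketched as follows. This is a simplified illustration on one-dimensional intensities rather than color pixels, and the abstract does not say how the elbow is detected automatically: the deterministic initialization, the threshold rule (stop when one more cluster reduces the within-cluster sum of squares by less than a fraction `tol` of its value at k = 1), and all names are assumptions.

```python
def kmeans_1d(values, k, iters=50):
    """Plain k-means on scalar values; returns (centers, within-cluster sum of squares)."""
    data = sorted(values)
    # Deterministic start (assumption): centers at evenly spaced positions of the sorted data.
    centers = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for v in data:
            nearest = min(range(k), key=lambda c: abs(v - centers[c]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        new_centers = [sum(g) / len(g) if g else centers[i] for i, g in enumerate(groups)]
        if new_centers == centers:
            break
        centers = new_centers
    wcss = sum(min((v - c) ** 2 for c in centers) for v in data)
    return centers, wcss


def elbow_k(values, k_max, tol=0.05):
    """Smallest k at which adding a cluster gains less than tol * WCSS(1)."""
    wcss = [kmeans_1d(values, k)[1] for k in range(1, k_max + 1)]
    for k in range(1, k_max):
        if wcss[k - 1] - wcss[k] < tol * wcss[0]:
            return k
    return k_max


# Hypothetical pixel intensities forming three groups.
pixels = [10, 11, 12, 100, 101, 102, 200, 201, 202]
centers, _ = kmeans_1d(pixels, 3)
best_k = elbow_k(pixels, k_max=5)
```

For a color image the same loop would run on RGB triples with Euclidean distance; the scalar version is used here only to keep the sketch short.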


Author(s):  
Anupama Chadha

Notice of Retraction

After careful and considered review of the content of this paper by a duly constituted expert committee, this paper has been found to be in violation of APTIKOM's Publication Principles. We hereby retract the content of this paper. Reasonable effort should be made to remove all past references to this paper. The presenting author of this paper has the option to appeal this decision by contacting ij.aptikom@gmail.com.

Clustering of mixed numerical and categorical data has become a challenge nowadays. A number of algorithms dealing with mixed data have been proposed. Speed and simplicity are the two major features that have made the K-Prototype algorithm a well-known partition-based clustering algorithm. This algorithm requires the value of K to be provided initially, and predicting the optimum number of clusters in advance is sometimes practically impossible. In this paper, a new algorithm based on the K-Prototype algorithm for clustering mixed data, with advanced features for automatic generation of an appropriate number of clusters, is presented.


2018 ◽  
Vol 4 (2) ◽  
pp. 43-55
Author(s):  
Ika Yulianti ◽  
Endah Masrunik ◽  
Anam Miftakhul Huda ◽  
Diana Elvianita

This study aims to compare the calculation of the cost of goods manufactured at CV. Mitra Setia Blitar using the company's method and using the Job Order Costing (JOC) method. The method used in this study is quantitative. The types of data used are quantitative and qualitative. The quantitative data are map production cost data, while the qualitative data are information about the map production process. The two methods give production costs for the map that differ by Rp. 306 per unit. The calculation using the company method gives a higher cost than the Job Order Costing method. The cost of goods manufactured using the company method is Rp. 2,205,000,- or Rp. 2,205,- per unit, while using the Job Order Costing (JOC) method it is Rp. 1,899,000,- or Rp. 1,899,- per unit. The appropriate method for calculating the cost of production is therefore the Job Order Costing (JOC) method.
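The reported figures can be checked with a short calculation. The production volume is not stated in the abstract; the 1,000 units used below is inferred from the ratio of total to per-unit cost and should be read as an assumption.

```python
# Totals and per-unit costs as reported in the abstract (in rupiah).
company_total, company_per_unit = 2_205_000, 2_205
joc_total, joc_per_unit = 1_899_000, 1_899

# Production volume implied by total / per-unit cost (not stated in the source).
units = company_total // company_per_unit

per_unit_difference = company_per_unit - joc_per_unit
total_difference = company_total - joc_total
```

The per-unit difference reproduces the Rp. 306 quoted in the abstract, and the same implied volume is obtained from the JOC figures.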


2016 ◽  
pp. 54-73 ◽  
Author(s):  
Anh Doan Ngoc Phi

This study seeks to help fill an important gap in the literature by investigating factors that have facilitated the use of management accounting practices (MAPs) in Vietnam, a transitional economy. Data were collected from 220 medium-to-large enterprises. Follow-up interviews were conducted with 20 accounting heads/vice heads to obtain further information and clarification. The quantitative data collected were analyzed using both descriptive and inferential statistics (including t-tests and structural equation modeling), while the qualitative data were used to shed further light on the various relationships described by the quantitative analysis. This paper reveals that both decentralization and competition have a positive, significant influence on the use of new MAPs, but not on the use of old ones. In turn, the use of MAPs has a positive, significant influence on enterprise performance.

