Klasifikasi Citra Burung Lovebird Menggunakan Decision Tree dengan Empat Jenis Evaluasi

Lovebirds are well-known pets in Indonesia. Their charm comes from the diversity of species, coloration, and body shape, including a variety of rare colors. However, many laypeople have difficulty distinguishing the types of lovebirds. This research aims to improve on the performance of a previous study in classifying lovebird images, using the Decision Tree J48 algorithm with four types of evaluation, and to reduce the feature extraction stage in order to speed up computation. At the same 60:40 split ratio, Decision Tree J48 achieved a precision of 1.000, recall of 1.000, F-measure of 1.000, and accuracy of 100%, whereas the Artificial Neural Network achieved a precision of 0.854, recall of 0.843, F-measure of 0.841, and accuracy of 84.25%. These results show that, when tested with first-level extraction of color features, Decision Tree J48 is superior in classifying images of lovebird species, improves performance, and produces the best accuracy.
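
The four evaluation measures reported above (precision, recall, F-measure, and accuracy) can be computed from predicted and true labels as in the following minimal sketch. The function names and the toy labels are hypothetical and are not taken from the study; the sketch computes per-class values only and does not reproduce the paper's averaging scheme.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return {label: (precision, recall, f_measure)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # p was predicted, but the true label was different
            fn[t] += 1  # the true label t was missed
    metrics = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        total = precision + recall
        f_measure = 2 * precision * recall / total if total else 0.0
        metrics[label] = (precision, recall, f_measure)
    return metrics


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: three classes, one misclassified sample.
y_true = ["class_a", "class_a", "class_b", "class_b", "class_c", "class_c"]
y_pred = ["class_a", "class_b", "class_b", "class_b", "class_c", "class_c"]

for label, (p, r, f) in per_class_metrics(y_true, y_pred).items():
    print(f"{label}: precision={p:.3f} recall={r:.3f} f_measure={f:.3f}")
print(f"accuracy={accuracy(y_true, y_pred):.3f}")
```

A classifier that labels every test image correctly, as reported for J48 at the 60:40 split, would score 1.000 on all three per-class measures and 100% accuracy.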

Merle phenotypes in dogs - SILV SINE insertions from Mc to Mh

10.1101/328690 ◽

2018 ◽

Author(s):

Langevin Mary ◽

Helena Synkova ◽

Tereza Jancuskova ◽

Sona Pekova

Keyword(s):

Coat Color ◽

Study Group ◽

Wild Type ◽

Health Consequences ◽

Breeding Programs ◽

Health Impairment ◽

Color Features ◽

Coat Pattern ◽

Type Sequence ◽

Selection Of

Abstract: It has been recognized that the Merle coat pattern is not only a visually interesting feature but also has an important biological role, in terms of hearing and vision impairments. In 2006, the Merle (M) locus was mapped to the SILV gene with a SINE element in it, and the inserted retroelement was proven causative of the Merle phenotype. Mapping of the M locus was a genetic breakthrough, and many breeders started implementing SILV SINE testing in their breeding programs. Unfortunately, the situation turned out to be complicated, as genotypes of Merle-tested individuals did not always correspond to expected phenotypes, sometimes with undesired health consequences in offspring. Two variants of SILV SINE, allelic to the wild type sequence, have been described so far: Mc and M. Here we report a significantly larger portfolio of existing Merle alleles (Mc, Mc+, Ma, Ma+, M, Mh) in Merle dogs, which are associated with unique coat color features and stratified health impairment risk. The refinement of allelic identification was made possible by systematic, detailed observation of Merle phenotypes in a cohort of 181 dogs from known Merle breeds, by many breeders worldwide, and by the use of advanced molecular technology enabling the discrimination of individual Merle alleles with significantly higher precision than previously available. We also show that mosaicism of Merle alleles is an unexpectedly frequent phenomenon, which was identified in 30 out of 181 (16.6%) dogs in our study group. Importantly, not only major alleles but also minor Merle alleles can be inherited by the offspring. Thus, mosaic findings cannot be neglected and must be reported to the breeder in full. In light of the negative health consequences that may be attributed to certain Merle breeding strategies, we strongly advocate implementation of the refined Merle allele testing for all dogs of Merle breeds, to help breeders select suitable mating partners and produce healthy offspring.

The Hardness of Speeding-up Knapsack

BRICS Report Series ◽

10.7146/brics.v5i14.19286 ◽

1998 ◽

Vol 5 (14) ◽

Cited By ~ 1

Author(s):

Sandeep Sen

Keyword(s):

Decision Tree ◽

Knapsack Problem ◽

Decision Tree Model ◽

Fixed Degree ◽

Tree Model ◽

Pram Model ◽

Speed Up ◽

Decision Version ◽

Polynomial Class ◽

Strongly Polynomial

We show that it is not possible to speed up the Knapsack problem efficiently in the parallel algebraic decision tree model. More specifically, we prove that any parallel algorithm in the fixed-degree algebraic decision tree model that solves the decision version of the Knapsack problem requires Omega(sqrt(n)) rounds, even when using 2^sqrt(n) processors. We extend the result to the PRAM model without bit-operations. These results are consistent with Mulmuley's recent result on the separation of the strongly-polynomial class and the corresponding NC class in the arithmetic PRAM model.

Keywords: lower bounds, parallel algorithms, algebraic decision tree

The effect of inbreeding, body size and morphology on health in dog breeds

Canine Medicine and Genetics ◽

10.1186/s40575-021-00111-4 ◽

2021 ◽

Vol 8 (1) ◽

Author(s):

Danika Bannasch ◽

Thomas Famula ◽

Jonas Donner ◽

Heidi Anderson ◽

Leena Honkanen ◽

...

Keyword(s):

Health Care ◽

Body Weight ◽

Body Size ◽

Body Shape ◽

Robust Regression ◽

Coat Color ◽

Significant Difference ◽

Size And Morphology ◽

Insurance Data ◽

Head Type

Background: Dog breeds are known for their distinctive body shape, size, coat color, head type and behaviors, features that are relatively similar across members of a breed. Unfortunately, dog breeds are also characterized by distinct predispositions to disease. We explored the relationships between inbreeding, morphology and health using genotype-based inbreeding estimates, body weight and insurance data for morbidity. Results: The average inbreeding based on genotype across 227 breeds was Fadj = 0.249 (95% CI 0.235–0.263). There were significant differences in morbidity between breeds with low and high inbreeding (H = 16.49, P = 0.0004). There was also a significant difference in morbidity between brachycephalic breeds and non-brachycephalic breeds (P = 0.0048) and between functionally distinct groups of breeds (H = 14.95, P < 0.0001). Morbidity was modeled using robust regression analysis, and both body weight (P < 0.0001) and inbreeding (P = 0.013) were significant (r2 = 0.77). Smaller, less inbred breeds were healthier than larger, more inbred breeds. Conclusions: In this study, body size and inbreeding, along with deleterious morphologies, contributed to increases in necessary health care in dogs.

Penerapan Decision Tree J48 dan Reptree dalam Menentukan Prediksi Produksi Minyak Kelapa Sawit menggunakan Metode Fuzzy Tsukamoto

Jurnal Teknologi Informasi dan Ilmu Komputer ◽

10.25126/jtiik.2020731870 ◽

2020 ◽

Vol 7 (3) ◽

pp. 483

Author(s):

Tundo Tundo ◽

Shofwatul 'Uyun

Keyword(s):

Decision Tree ◽

Production Process ◽

Palm Oil ◽

Oil Production ◽

Actual Data ◽

Production Data ◽

Actual Production ◽

Speed Up ◽

Rule Making

This study describes the application of the J48 and REPTree decision trees together with the fuzzy Tsukamoto method to determine the amount of palm oil production at the company PT Tapiana Nadenggan. The aim is to find out which decision tree gives results closest to the actual data, so that it can be used to help predict the amount of palm oil production at PT Tapiana Nadenggan before the production process has been carried out. The J48 and REPTree decision trees are used to speed up rule making, without having to consult experts to determine the rules. On the data used, the accuracy of the J48 decision tree is 95.2381%, while the accuracy of REPTree is 90.4762%. In this case, however, REPTree is the more appropriate choice for predicting palm oil production: when tested against actual data for March 2019, REPTree gave 16,355,835 liters and J48 gave 11,844,763 liters, while actual production was 17,920,000 liters. It can therefore be concluded that, for this case, REPTree produces the prediction closest to the actual data, even though its accuracy is lower than that of J48.
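
The comparison against actual production can be checked with simple arithmetic. The sketch below uses the three figures from the abstract (actual production of 17,920,000 liters, as given in the Indonesian text); the helper name `relative_error` is an illustrative assumption, not something from the paper.

```python
# Figures for March 2019 as reported in the abstract (liters).
actual = 17_920_000
predictions = {"REPTree": 16_355_835, "J48": 11_844_763}


def relative_error(predicted, actual_value):
    """Absolute deviation from the actual value, as a fraction of the actual value."""
    return abs(actual_value - predicted) / actual_value


errors = {name: relative_error(value, actual) for name, value in predictions.items()}
for name, err in errors.items():
    print(f"{name}: off by {err:.1%} of actual production")
```

REPTree misses the actual figure by roughly 8.7%, J48 by roughly 33.9%, which is why the tree with the lower classification accuracy is still the better predictor in this case.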

G2Basy: A framework to improve the RNN language model and ease overfitting problem

PLoS ONE ◽

10.1371/journal.pone.0249820 ◽

2021 ◽

Vol 16 (4) ◽

pp. e0249820

Author(s):

Lu Yuwen ◽

Shuyu Chen ◽

Xiaohan Yuan

Keyword(s):

Language Model ◽

Word Embedding ◽

Language Models ◽

Batch Size ◽

Training Process ◽

Improve Performance ◽

Step Size ◽

Learning Rates ◽

Speed Up ◽

Overfitting Problem

Recurrent neural networks are efficient ways of training language models, and various RNN networks have been proposed to improve performance. However, with the increase of network scales, the overfitting problem becomes more urgent. In this paper, we propose a framework—G2Basy—to speed up the training process and ease the overfitting problem. Instead of using predefined hyperparameters, we devise a gradient increasing and decreasing technique that changes the parameters training batch size and input dropout simultaneously by a user-defined step size. Together with a pretrained word embedding initialization procedure and the introduction of different optimizers at different learning rates, our framework speeds up the training process dramatically and improves performance compared with a benchmark model of the same scale. For the word embedding initialization, we propose the concept of “artificial features” to describe the characteristics of the obtained word embeddings. We experiment on two of the most often used corpora—the Penn Treebank and WikiText-2 datasets—and both outperform the benchmark results and show potential towards further improvement. Furthermore, our framework shows better results with the larger and more complicated WikiText-2 corpus than with the Penn Treebank. Compared with other state-of-the-art results, we achieve comparable results with network scales hundreds of times smaller and within fewer training epochs.

Chemformer: A Pre-Trained Transformer for Computational Chemistry

Machine Learning: Science and Technology ◽

10.1088/2632-2153/ac3ffb ◽

2021 ◽

Author(s):

Ross Irwin ◽

Spyridon Dimitriadis ◽

Jiazhen He ◽

Esben Jannik Bjerrum

Keyword(s):

Computational Chemistry ◽

State Of The Art ◽

Direct Synthesis ◽

Single Application ◽

Improve Performance ◽

Molecular Line ◽

Benchmark Datasets ◽

Speed Up

Transformer models coupled with the Simplified Molecular Line Entry System (SMILES) have recently proven to be a powerful combination for solving challenges in cheminformatics. These models, however, are often developed specifically for a single application and can be very resource-intensive to train. In this work we present the Chemformer model, a Transformer-based model which can be quickly applied to both sequence-to-sequence and discriminative cheminformatics tasks. Additionally, we show that self-supervised pre-training can improve performance and significantly speed up convergence on downstream tasks. On direct synthesis and retrosynthesis prediction benchmark datasets we publish state-of-the-art results for top-1 accuracy. We also improve on existing approaches for a molecular optimisation task and show that Chemformer can optimise on multiple discriminative tasks simultaneously. Models, datasets and code will be made available after publication.

USING ATTRIBUTE BEHAVIOR DIVERSITY TO BUILD ACCURATE DECISION TREE COMMITTEES FOR MICROARRAY DATA

Journal of Bioinformatics and Computational Biology ◽

10.1142/s0219720012500059 ◽

2012 ◽

Vol 10 (04) ◽

pp. 1250005 ◽

Cited By ~ 3

Author(s):

QIAN HAN ◽

GUOZHU DONG

Keyword(s):

Decision Tree ◽

Decision Trees ◽

Microarray Data ◽

Dna Microarrays ◽

Ensemble Methods ◽

Medical Studies ◽

Improve Performance ◽

Gene Chips ◽

New Ideas ◽

And Behavior

DNA microarrays (gene chips), frequently used in biological and medical studies, measure the expressions of thousands of genes per sample. Using microarray data to build accurate classifiers for diseases is an important task. This paper introduces an algorithm, called Committee of Decision Trees by Attribute Behavior Diversity (CABD), to build highly accurate ensembles of decision trees for such data. Since a committee's accuracy is greatly influenced by the diversity among its member classifiers, CABD uses two new ideas to "optimize" that diversity, namely (1) the concept of attribute behavior–based similarity between attributes, and (2) the concept of attribute usage diversity among trees. The ideas are effective for microarray data, since such data have many features and behavior similarity between genes can be high. Experiments on microarray data for six cancers show that CABD outperforms previous ensemble methods significantly and outperforms SVM, and show that the diversified features used by CABD's decision tree committee can be used to improve performance of other classifiers such as SVM. CABD has potential for other high-dimensional data, and its ideas may apply to ensembles of other classifier types.

OVERVIEW OF POPULAR APPROACHES IN CREATING CLIENT-SERVER APPLICATIONS BASED ON SCIENTOMETRICS ONAFTS’ PLATFORM

Automation technological and business processes ◽

10.15673/atbp.v10i4.833 ◽

2018 ◽

Vol 9 (4) ◽

Cited By ~ 1

Author(s):

D. Salskyi ◽

A. Kozhukhar ◽

O. Olshevska ◽

N. Povarova

Keyword(s):

Web Applications ◽

Main Idea ◽

Main Theme ◽

Improve Performance ◽

Client Server ◽

Meta Information ◽

Speed Up ◽

Server Architecture ◽

Client Server Architecture ◽

Standard Solutions

Most currently developed systems are based on the client-server architecture. This architecture is used everywhere, from mobile-native development to Web applications. However, implementing an application based on this architectural solution requires considerable effort from the software developer, and so certain standard solutions and approaches have appeared in order to simplify and speed up development. This article discusses the most popular technologies used in the development of Web applications in the context of enterprise development. It also describes a project built on the client-server architecture, ScienceToMetrics. The main theme of this project is the study of scientometric indicators for the structural divisions of the faculty of the Odessa National Academy of Food Technologies. In effect, it is a portal for viewing and editing information on employees; in the future this portal may be extended with subprojects. The project embodies the main idea of this architecture: decomposition of the application into atomic parts so that it can be distributed across several hardware units to improve performance. The client is an independent application, which receives information from an external API through REST requests. In turn, the backend provides this API with certain security restrictions on the content provided. The backend for this architecture provides a layer over the users' data source, whether it is a database (NoSQL, SQL) or an integration API with external aggregation systems. To ensure the necessary level of security, JWT (JSON Web Token) authorization is used, which avoids creating an explicit session between the client and the backend and instead lets them communicate through a token that stores all the necessary meta-information for the user.
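
To illustrate how a token can carry the user's meta-information without a server-side session, here is a minimal sketch of an HMAC-SHA256 signed token in the JWT compact format, using only the Python standard library. The function names, claims, and secret are hypothetical, and nothing here is taken from the ScienceToMetrics code. The sketch checks only the signature; a production system would normally use a maintained JWT library and also validate claims such as expiry.

```python
import base64
import hashlib
import hmac
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, as used in the JWT compact format."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def b64url_decode(text: str) -> bytes:
    """Restore the stripped padding and decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign_token(payload: dict, secret: bytes) -> str:
    """Build header.payload.signature, signed with HMAC-SHA256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode("ascii"), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(signature)}"


def verify_token(token: str, secret: bytes):
    """Return the payload if the signature matches, otherwise None."""
    parts = token.split(".")
    if len(parts) != 3:
        return None
    header_b64, payload_b64, signature_b64 = parts
    expected = hmac.new(
        secret, f"{header_b64}.{payload_b64}".encode("ascii"), hashlib.sha256
    ).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    if not hmac.compare_digest(expected, b64url_decode(signature_b64)):
        return None
    return json.loads(b64url_decode(payload_b64))


# Hypothetical usage: the backend issues a token, the client sends it back.
secret = b"demo-secret"
token = sign_token({"sub": "employee-42", "role": "editor"}, secret)
print(verify_token(token, secret))
```

Because the backend can recompute the signature from the token itself, it needs no stored session to trust the claims inside it.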

Politeness and Stable Infiniteness: Stronger Together

Automated Deduction – CADE 28 - Lecture Notes in Computer Science ◽

10.1007/978-3-030-79876-5_9 ◽

2021 ◽

pp. 148-165

Author(s):

Ying Sheng ◽

Yoni Zohar ◽

Christophe Ringeissen ◽

Andrew Reynolds ◽

Clark Barrett ◽

...

Keyword(s):

Preliminary Evidence ◽

Satisfiability Modulo Theories ◽

The Other ◽

Combination Method ◽

Improve Performance ◽

Worst Case ◽

Smart Contract ◽

Shared Variables ◽

Speed Up ◽

Time Required

We make two contributions to the study of polite combination in satisfiability modulo theories. The first is a separation between politeness and strong politeness, by presenting a polite theory that is not strongly polite. This result shows that proving strong politeness (which is often harder than proving politeness) is sometimes needed in order to use polite combination. The second contribution is an optimization to the polite combination method, obtained by borrowing from the Nelson-Oppen method. The Nelson-Oppen method is based on guessing arrangements over shared variables. In contrast, polite combination requires an arrangement over all variables of the shared sorts. We show that when using polite combination, if the other theory is stably infinite with respect to a shared sort, only the shared variables of that sort need be considered in arrangements, as in the Nelson-Oppen method. The time required to reason about arrangements is exponential in the worst case, so reducing the number of variables considered has the potential to improve performance significantly. We show preliminary evidence for this by demonstrating a speed-up on a smart contract verification benchmark.
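
To give a feel for why the number of variables in an arrangement matters, the sketch below counts arrangements, assuming that an arrangement over n variables of one sort corresponds to a partition of those variables into classes of equal variables, so that the count is the Bell number B_n. The function name and the variable counts in the usage lines are hypothetical and are not taken from the paper's benchmark.

```python
def count_arrangements(n: int) -> int:
    """Number of partitions of n variables into equality classes (Bell number B_n).

    Computed with the Bell triangle: each row starts with the last entry of
    the previous row, and every further entry adds the entry above-left.
    """
    row = [1]  # row for n = 0, so B_0 = 1
    for _ in range(n):
        new_row = [row[-1]]
        for value in row:
            new_row.append(new_row[-1] + value)
        row = new_row
    return row[0]


# Hypothetical sizes: 10 variables of a shared sort, of which 3 are shared.
print("all variables of the sort:", count_arrangements(10))
print("shared variables only:   ", count_arrangements(3))
```

Under these assumed sizes, restricting arrangements to the shared variables reduces the candidates from 115975 to 5.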

Somatotype identification of middle-aged women based on decision tree algorithm

International Journal of Clothing Science and Technology ◽

10.1108/ijcst-12-2019-0193 ◽

2020 ◽

Vol ahead-of-print (ahead-of-print) ◽

Author(s):

Lanmin Wang ◽

Hongmin Wang ◽

Huiyan Zhang ◽

Naiseman Akemujiang ◽

Aimin Xiao

Keyword(s):

Decision Tree ◽

Body Shape ◽

Great Influence ◽

The Body ◽

Decision Tree Algorithm ◽

Middle Aged ◽

Body Type ◽

Tree Algorithm ◽

Content Type ◽

Type Classification

Purpose: Body type classification has a great influence on plate making and the garment sizing system, and the accuracy of the body type classification method greatly affects the fit of garment production. The purpose of this paper is to use the decision tree algorithm to study body classification rules, develop a decision tree body recognition model and judge the body shape of middle-aged women in Xinjiang. Design/methodology/approach: First, dimensionless processing was performed on the collected data of 256 middle-aged women in Xinjiang, and the dimensionless data were used for K-means body clustering; the effectiveness of different classification clusters was then analyzed quantitatively based on the silhouette coefficients. Second, the classified sample data were divided into a training set and a test set at a ratio of 70/30, and the decision tree algorithm selected the best node and the best branch based on the Gini coefficient to construct a classification tree. Last, the overall optimal decision tree was generated by means of hyperparameter pruning. Findings: The body shape of middle-aged women in Xinjiang can be divided into three types: standard body, plump body and obese body. The decision tree model has an excellent effect on body classification of middle-aged women in Xinjiang (precision (macro), 95.46%; precision (micro), 95.95%; recall (macro), 95.46%; recall (micro), 95.95%; F1 (macro), 95.46%; F1 (micro), 95.95%). Originality/value: For scientific research, this paper is conducive to increasing the regional body type theory and stimulating the establishment of a garment sizing subdivision system in Xinjiang. In terms of production practice, this paper not only establishes a model for judging the shape of middle-aged women in Xinjiang, but also provides reference data for intermediates of various sizes. In addition, to facilitate pattern-making and the establishment of a subdivision system for the size of middle-aged women's garments in Xinjiang, this paper provides the grading values of various body control parts of middle-aged women in Xinjiang.
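
The Gini-based choice of the best node can be sketched as follows: for each candidate threshold on a single measurement, compute the size-weighted Gini impurity of the two resulting groups and keep the threshold with the lowest value. The measurement values, the feature they stand for, and the function names are hypothetical; only the three body-type labels come from the abstract, and the paper's own implementation is not shown there.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, weighted_gini) of the best split `value <= threshold`."""
    best = (None, float("inf"))
    # The largest value is skipped because it would leave the right side empty.
    for threshold in sorted(set(values))[:-1]:
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best


# Hypothetical measurements (for example a girth in cm) with cluster labels.
girth = [70, 72, 80, 82, 95, 97]
body_type = ["standard", "standard", "plump", "plump", "obese", "obese"]

print("impurity before split:", round(gini(body_type), 3))
print("best split:", best_threshold(girth, body_type))
```

A full classification tree repeats this search over every feature at every node and then recurses into the two groups.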