Curve Generalization Algorithm Based on Area of Bends using Head/tail Breaks

Abstract. For the early curve generalization algorithms, most of them only consider the reduction of the number of vertices, and do not take into consideration the important role of bends, especially the characteristic bends, on the shape of the curve. And the existing generalization methods based on the bends of the curve have complex algorithms and a large amount of calculation, focus on relationship between adjacent bends excessively and ignore the relationship among the overall bends. In addition, the threshold setting for filtering the bends is based on the unreasonable experience. Aiming at the problems above, a generalization algorithm based on the area of bends is proposed to achieve the purpose of simplifying the curve with the head/tail breaks classification method in this paper. Experiment shows that the algorithm is simple and efficient, and can iteratively take account of the overall bends with reasonable threshold, discarding the small bends and retaining the characteristic bends of the original curve to obtain generalization results which conform the natural law and is highly similar to the original graphics at different levels of detail.Head/tail breaks is a classification method that is always applied to the classification of heavy-tailed data. Heavy-tailed data is universal in nature and human society. For example, there are more small towns than big cities in the world. However, small towns are less important than big cities in the field of economy and politics. Thus, cartographers will mark the big cities on the map and eliminate the small town. Map generalization is a progress of retaining important elements and delete unimportant elements. Head/tail breaks is able to extract significant data which can be retained as a generalization result by arithmetic mean.Figure 1 shows the algorithm flow chart. First of all, we divide the curve into several bends with oblique-dividing-curve method. Secondly, we calculate the area of each bend, and then use head/tail breaks to complete the classification of the area of bends. If the percentage of bends in the head is less than 40%, it means the data conform to heavy-tailed distribution and can be classified with head/tail breaks. If the percentage of bends in the head is greater than 40%, the head/tail breaks is not applicable to this data. After classification, for the bends which is more important in the head, we reserve them directly. For the bends in the tail, we extract the feature points of each bend by retaining the point farthest from the axis so as to maintain the local shape of the original curve. Finally, we merge the bends in the head and the feature points as a generalization result.The experimental result is shown in the Fig 2. The data of this experiment is administrative division map of Gansu Province extracted from a China map with a scale of 1&thinsp;:&thinsp;10,000,000. Because algorithm can be executed iteratively, it can generate results at different levels of detail. We can see that from the result in detail to concise result, the graphic changes progressively and there is no oversimplified result. With comparison of three algorithms in the Fig 3, the generalization results of both this paper and bend group division algorithm have better retention of characteristic bends than Douglas-Peuker algorithm. However, the algorithm of this paper has higher compression ratio and less execution time than bend group division algorithm, as shown in Table 1.The algorithm of this paper is based on nature law rather than empirical threshold, and can generate progressive results at different levels of details by iteration. In addition, it takes overall relationship of bends into consideration, so the generalization result is unique. The experimental result shows this algorithm has not only better retention of characteristic bends than Douglas-Peuker algorithm but also higher compression ratio and less execution time than bend group division algorithm. To further optimize the algorithm, we will study how to evaluate the apparent extent of the curved feature better and how to extract and eliminate the small bend inside of the bend in the head in order to improve compression ratio in the future.

Download Full-text

Development and Evaluation of Fuzzy Criteria for the Diagnosis of Rheumatoid Arthritis

Methods of Information in Medicine ◽

10.1055/s-0038-1634678 ◽

1996 ◽

Vol 35 (04/05) ◽

pp. 334-342 ◽

Cited By ~ 17

Author(s):

K.-P. Adlassnig ◽

G. Kolarz ◽

H. Leitich

Keyword(s):

Rheumatoid Arthritis ◽

Reference Model ◽

American Rheumatism Association ◽

Uniform Definition ◽

Sensitivity Rate ◽

Fuzzy Techniques ◽

Definition Of ◽

Different Levels ◽

Specificity Rate

Abstract:In 1987, the American Rheumatism Association issued a set of criteria for the classification of rheumatoid arthritis (RA) to provide a uniform definition of RA patients. Fuzzy set theory and fuzzy logic were used to transform this set of criteria into a diagnostic tool that offers diagnoses at different levels of confidence: a definite level, which was consistent with the original criteria definition, as well as several possible and superdefinite levels. Two fuzzy models and a reference model which provided results at a definite level only were applied to 292 clinical cases from a hospital for rheumatic diseases. At the definite level, all models yielded a sensitivity rate of 72.6% and a specificity rate of 87.0%. Sensitivity and specificity rates at the possible levels ranged from 73.3% to 85.6% and from 83.6% to 87.0%. At the superdefinite levels, sensitivity rates ranged from 39.0% to 63.7% and specificity rates from 90.4% to 95.2%. Fuzzy techniques were helpful to add flexibility to preexisting diagnostic criteria in order to obtain diagnoses at the desired level of confidence.

Download Full-text

IMAGE THRESHOLDING BASED ON HIERARCHICAL CLUSTERING ANALYSIS AND PERCENTILE METHOD FOR TUNA IMAGE SEGMENTATION

NJCA (Nusantara Journal of Computers and Its Applications) ◽

10.36564/njca.v2i1.24 ◽

2018 ◽

Vol 2 (1) ◽

Author(s):

Alifia Puspaningrum ◽

Nahya Nur ◽

Ozzy Secio Riza ◽

Agus Zainal Arifin

Keyword(s):

Image Segmentation ◽

Hierarchical Clustering ◽

Gabor Filter ◽

Hierarchical Cluster ◽

Experimental Result ◽

Main Process ◽

Percentile Method ◽

Cluster Analysis Method ◽

2D Gabor Filter

Automatic classification of tuna image needs a good segmentation as a main process. Tuna image is taken with textural background and the tuna’s shadow behind the object. This paper proposed a new weighted thresholding method for tuna image segmentation which adapts hierarchical clustering analysisand percentile method. The proposed method considering all part of the image and the several part of the image. It will be used to estimate the object which the proportion has been known. To detect the edge of tuna images, 2D Gabor filter has been implemented to the image. The result image then threshold which the value has been calculated by using HCA and percentile method. The mathematical morphologies are applied into threshold image. In the experimental result, the proposed method can improve the accuracy value up to 20.04%, sensitivity value up to 29.94%, and specificity value up to 17,23% compared to HCA. The result shows that the proposed method cansegment tuna images well and more accurate than hierarchical cluster analysis method.

Download Full-text

Development of a Simulation-Based Process Chain – Strategy for Different Levels of Detail for the Preprocessing Definitions

SNE Simulation Notes Europe ◽

10.11128/sne.21.tn.10081 ◽

2011 ◽

Vol 21 (3-4) ◽

pp. 135-140 ◽

Cited By ~ 5

Author(s):

Toni A. Krol ◽

Sebastian Westhäuser ◽

M. F. Zäh ◽

Johannes Schilp ◽

G. Groth

Keyword(s):

Process Chain ◽

Levels Of Detail ◽

Simulation Based ◽

Different Levels

Download Full-text

A surrogate-based generic classifier for Chinese TV series reviews

Information Discovery and Delivery ◽

10.1108/idd-11-2016-0044 ◽

2017 ◽

Vol 45 (2) ◽

pp. 66-74

Author(s):

Yufeng Ma ◽

Long Xia ◽

Wenqi Shen ◽

Mi Zhou ◽

Weiguo Fan

Keyword(s):

Design Methodology ◽

Experimental Result ◽

Digital Information ◽

Data Set ◽

Content Type ◽

Category Labels ◽

Information Supply Chain ◽

Feature Selection Techniques ◽

Practical Implications

Purpose The purpose of this paper is automatic classification of TV series reviews based on generic categories. Design/methodology/approach What the authors mainly applied is using surrogate instead of specific roles or actors’ name in reviews to make reviews more generic. Besides, feature selection techniques and different kinds of classifiers are incorporated. Findings With roles’ and actors’ names replaced by generic tags, the experimental result showed that it can generalize well to agnostic TV series as compared with reviews keeping the original names. Research limitations/implications The model presented in this paper must be built on top of an already existed knowledge base like Baidu Encyclopedia. Such database takes lots of work. Practical implications Like in digital information supply chain, if reviews are part of the information to be transported or exchanged, then the model presented in this paper can help automatically identify individual review according to different requirements and help the information sharing. Originality/value One originality is that the authors proposed the surrogate-based approach to make reviews more generic. Besides, they also built a review data set of hot Chinese TV series, which includes eight generic category labels for each review.

Download Full-text

Study on Consistency Analysis in Text Categorization

Applied Mechanics and Materials ◽

10.4028/www.scientific.net/amm.539.181 ◽

2014 ◽

Vol 539 ◽

pp. 181-184

Author(s):

Wan Li Zuo ◽

Zhi Yan Wang ◽

Ning Ma ◽

Hong Liang

Keyword(s):

Text Categorization ◽

Training Data ◽

Experimental Result ◽

Final Decision ◽

Consistency Analysis ◽

Training Set ◽

Weak Classifier ◽

Data Set ◽

Basic Premise

Accurate classification of text is a basic premise of extracting various types of information on the Web efficiently and utilizing the network resources properly. In this paper, a brand new text classification method was proposed. Consistency analysis method is a type of iterative algorithm, which mainly trains different classifiers (weak classifier) by aiming at the same training set, and then these classifiers will be gathered for testing the consistency degrees of various classification methods for the same text, thus to manifest the knowledge of each type of classifier. It main determines the weight of each sample according to the fact is the classification of each sample is accurate in each training set, as well as the accuracy of the last overall classification, and then sends the new data set whose weight has been modified to the subordinate classifier for training. In the end, the classifier gained in the training will be integrated as the final decision classifier. The classifier with consistency analysis can eliminate some unnecessary training data characteristics and place the key words on key training data. According to the experimental result, the average accuracy of this method is 91.0%, while the average recall rate is 88.1%.

Download Full-text

A chemical–mineralogical classification of common plutonic rocks and associations

Transactions of the Royal Society of Edinburgh Earth Sciences ◽

10.1017/s0263593300010117 ◽

1983 ◽

Vol 73 (3) ◽

pp. 135-149 ◽

Cited By ~ 405

Author(s):

F. Debon ◽

P. Le Fort

Keyword(s):

Analytical Data ◽

Quartz Content ◽

New Approach ◽

Mineralogical Characteristics ◽

Simple Chemical ◽

Plutonic Rocks ◽

The Individual ◽

New Criteria ◽

Different Levels

ABSTRACTA classification is proposed, based mainly on major element analytical data plotted in a coherent set of three simple chemical-mineralogical diagrams. The procedure follows two complementary steps at two different levels. The first is concerned with the individual sample: the sample is given a name (e.g. granite, adamellite, granodiorite) and its chemical and mineralogical characteristics are determined. The second one is more important: it aims at defining the type of magmatic association (or series) to which the studied sample or group of samples belongs. Three main types of association are distinguished: cafemic (from source-material mainly or completely mantle-derived), aluminous (mainly or completely derived by anatexis of continental crust), and alumino-cafemic (intermediate between the other two types). Subtypes are then distinguished among the cafemic and alumino-cafemic associations: calc-alkaline (or granodioritic), subalkaline (or monzonitic), alkaline (and peralkaline), tholeiitic (or gabbroic-trondhjemitic), etc. In the same way, numerous subtypes and variants are also distinguished among the aluminous associations using a set of complementary criteria such as quartz content, colour index, alkali ratio, quartz–alkalies relationships and alumina index.Although involving a new approach using partly new criteria, this classification is consistent with most of the divisions used in previous typologies. The method may also be used in the classification of the volcanic equivalents of common plutonic rocks.

Download Full-text

Discrimination of Genealogical Groups of Arabica Coffee by the Chemical Composition of the Beans

Journal of Agricultural Science ◽

10.5539/jas.v11n16p141 ◽

2019 ◽

Vol 11 (16) ◽

pp. 141

Author(s):

Larissa O. Fassio ◽

Marcelo R. Malta ◽

Gladyston R. Carvalho ◽

Antônio A. Pereira ◽

Ackson D. Silva ◽

...

Keyword(s):

Fatty Acids ◽

Chemical Composition ◽

Chemical Markers ◽

Total Lipids ◽

Germplasm Bank ◽

Total Sugars ◽

Coffee Beans ◽

Different Levels ◽

Important Marker

This work aimed to characterize and discriminate genealogical groups of coffee as to the chemical composition of the grains through the model created by PLS-DA method. 22 accessions of Coffea arabica, from the Active Germplasm Bank of Minas Gerais, were divided into groups according to the genealogical origin. Samples of ripe fruits were harvested selectively and processed by the wet method, to obtain pulped coffee beans, with 11% (b.u.) of water content. The raw beans were assessed as to the content of polyphenols, total sugars, total lipids, protein, caffeine, sucrose, and fatty acids. The data were submitted the chemometric analysis, PCA and PLS-DA. The results of PLS-DA identified the variables which most influence the classification of genealogical groups and possible chemical markers to accessions processed by the pulped method. The sucrose content was an important marker for the Exotic accession group. However, the content of polyphenols has been identified as a marker for the group Tymor Hybrid, and the caffeine for the bourbon group. The different fatty acids have been identified as markers for all genealogical groups, at different levels. The model PLS-DA is effective in discriminating genealogical groups from the chemical composition of the beans.

Download Full-text

The Worshipers of Stones. Lacandon Sacred Stone Landscape

Ethnologia Actualis ◽

10.2478/eas-2021-0001 ◽

2020 ◽

Vol 20 (1) ◽

pp. 1-27

Author(s):

Milan Kováč

Keyword(s):

Archaeological Site ◽

Theoretical Approaches ◽

Different Types ◽

Spiritual Energy ◽

Different Levels ◽

Rock Shelters

Abstract This article deals with the Lacandon cosmology, one of the few Maya cosmologies which has been exceptionally structured and until today, very well preserved. The present study is based mainly on associations related to stone. There are investigated the emic classifications of the Lacandon. Their classification of divine beings according to their location, and their connection to the stone houses, whether of natural or cultural origin. In the article are analyzed the most sacred Lacandon sites such as the rock shelters, cliffs and caves around the Lake Mensäbäk and Lake Yahaw Petha, as well as Yaxchilan, the archaeological site with the long tradition of Lacandon pilgrimages. The Lacandon believe in different types of transfer of spiritual energy through stone. The stones could be considered on different levels as the seat, heart or embodiment of deities. These relationships and contexts are very complex. The article tries to identify it and to offer some linguistic and theoretical approaches.

Download Full-text

Towards Transparent Human-in-the-Loop Classification of Fraudulent Web Shops

Frontiers in Artificial Intelligence and Applications - Legal Knowledge and Information Systems ◽

10.3233/faia200873 ◽

2020 ◽

Author(s):

Daphne Odekerken ◽

Floris Bex

Keyword(s):

Case Based Reasoning ◽

Agent Architecture ◽

Human In The Loop ◽

Levels Of Detail ◽

Legal Case ◽

Dynamic Argumentation ◽

Case Based

We propose an agent architecture for transparent human-in-the-loop classification. By combining dynamic argumentation with legal case-based reasoning, we create an agent that is able to explain its decisions at various levels of detail and adapts to new situations. It keeps the human analyst in the loop by presenting suggestions for corrections that may change the factors on which the current decision is based and by enabling the analyst to add new factors. We are currently implementing the agent for classification of fraudulent web shops at the Dutch Police.

Download Full-text

Territorial distribution of investments in Russian cities in 2015-2018

Regional'nye issledovaniya ◽

10.5922/1994-5280-2020-3-6 ◽

2020 ◽

pp. 68-78

Author(s):

D. Yu. Zemlianskii ◽

V.A. Chuzhenkova

Keyword(s):

Russian Federation ◽

Small Towns ◽

Regional Heterogeneity ◽

Investment Activity ◽

Regional Economic ◽

Geographical Factors ◽

Nonlinear Character ◽

Big Cities ◽

Oil Gas

This article is devoted to analysis of the territorial investment distribution in Russian Federation cities in depending on population and economics-geographical factors. The main aim is gives raised researchers influence to the regional heterogeneity of investments distribution in cities and investment aspects of big cities development. Herewith, the shortage of works, devoted to investment situation in cities with population less 100 thousand, is remained. The work conclusions are based on the analysis of the investment distributions to the main capital for 1066 Russian Federation cities over the period from 2015 to 2018 years. Based on this analysis, conclusions are made about hard investment territorial distribution dependence from the regional situation. The cities influence to the investment situation firstly manifests in the Moscow and the St. Petersburg agglomerations. But even other cities over a million people mostly depends on its regional economic. It was found, that investment distribution dependence from the city’s population has a nonlinear character. The distortions appear because of small oil-gas cities and the large and largest cities underinvestments. The polarization is especially strong among the small towns: 2% of all the settlement with population less than 50% concentrates almost quarter of investments for this cities group. It was found, that the investment activity for most of the cities doesn’t bring comparable results for cities economics and budget.

Download Full-text