Complexity of possibly gapped histogram and analysis of histogram

2018 ◽  
Vol 5 (2) ◽  
pp. 171026 ◽  
Author(s):  
Hsieh Fushing ◽  
Tania Roy

We demonstrate that gaps and distributional patterns embedded within real-valued measurements are inseparable biological and mechanistic information contents of the system. Such patterns are discovered through a data-driven, possibly gapped histogram, which in turn leads to the geometry-based analysis of histogram (ANOHT). Constructing a possibly gapped histogram is a complex problem of statistical mechanics because the ensemble of candidate histograms is captured by a two-layer Ising model. This construction is also a distinctive problem of information theory from the perspective of data compression via uniformity. By defining a Hamiltonian (or energy) as the sum of the total coding lengths of boundaries and the total decoding errors within bins, the problem of computing the minimum-energy macroscopic states is, surprisingly, resolved by applying the hierarchical clustering algorithm. Thus, a possibly gapped histogram corresponds to a macro-state. The first phase of ANOHT is developed for simultaneous comparison of multiple treatments, while the second phase is developed based on classical empirical process theory for a tree geometry that can check the authenticity of branches of the treatment tree. The well-known Iris data are used to illustrate our technical developments. In addition, a large baseball pitching dataset and a heavily right-censored divorce dataset are analysed to showcase the existential gaps and the utility of ANOHT.
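To make the idea of a "possibly gapped" histogram concrete, here is a minimal Python sketch. It is not the authors' Hamiltonian-based construction: it assumes the simplest form of hierarchical clustering on one-dimensional data, single linkage, where cutting the dendrogram into k clusters amounts to splitting the sorted values at their k - 1 widest gaps. The function name `gapped_histogram` and the toy data are hypothetical.

```python
def gapped_histogram(values, n_bins):
    """Split 1-D data into n_bins bins separated by the widest gaps.

    Assumption: single-linkage clustering stands in for the abstract's
    hierarchical clustering step; the energy function is not modelled.
    Returns a list of (left_edge, right_edge, count) tuples.
    """
    xs = sorted(values)
    if not 1 <= n_bins <= len(xs):
        raise ValueError("n_bins must be between 1 and the number of values")
    # Gap to the next sorted value, paired with the index of the left point.
    gaps = [(xs[i + 1] - xs[i], i) for i in range(len(xs) - 1)]
    # Cut after the left endpoints of the n_bins - 1 widest gaps.
    cut_after = sorted(i for _, i in sorted(gaps, reverse=True)[: n_bins - 1])
    bins, start = [], 0
    for i in cut_after:
        chunk = xs[start : i + 1]
        bins.append((chunk[0], chunk[-1], len(chunk)))
        start = i + 1
    chunk = xs[start:]
    bins.append((chunk[0], chunk[-1], len(chunk)))
    return bins


data = [1.0, 1.1, 1.2, 5.0, 5.1, 9.0]
bins = gapped_histogram(data, 3)
```

Each bin covers only the range its own points occupy, so the empty stretches between consecutive bins are the gaps; a conventional histogram would instead tile the whole range with contiguous bins.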

Author(s):  
Mohana Priya K ◽  
Pooja Ragavi S ◽  
Krishna Priya G

Clustering is the process of grouping objects into subsets that have meaning in the context of a particular problem. It does not rely on predefined classes. It is referred to as an unsupervised learning method because no information is provided about the "right answer" for any of the objects. Many clustering algorithms have been proposed, and the choice among them depends on the application. Sentence clustering is one of the best clustering techniques. A hierarchical clustering algorithm is applied at multiple levels to improve accuracy. A POS tagger and the Porter stemmer are used for tagging. The WordNet dictionary is used to determine similarity by invoking the Jiang-Conrath and cosine similarity measures. Grouping is performed with respect to the highest similarity value, using a mean threshold. This paper incorporates many parameters for finding the similarity between words. To identify the disambiguated words, sense identification is performed for the adjectives and the results are compared. The SemCor and machine learning datasets are employed. Compared with previous results for word sense disambiguation (WSD), our work shows a substantial improvement, reaching 91.2%.
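The grouping step described above can be illustrated with a small Python sketch. It rests on several assumptions not stated in the abstract: sentences are compared by plain bag-of-words cosine similarity (no WordNet, Jiang-Conrath measure, POS tagging, or stemming), the threshold is the mean of all pairwise similarities, and two sentences land in the same group whenever a chain of above-threshold pairs connects them. The function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def cosine(a, b):
    """Cosine similarity between two sentences as bag-of-words vectors."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = math.sqrt(sum(c * c for c in va.values())) * math.sqrt(
        sum(c * c for c in vb.values())
    )
    return dot / norm if norm else 0.0


def group_by_mean_threshold(sentences):
    """Group sentence indices linked by above-average pairwise similarity."""
    sims = {
        (i, j): cosine(sentences[i], sentences[j])
        for i, j in combinations(range(len(sentences)), 2)
    }
    # Assumed rule: the threshold is the mean over all pairwise similarities.
    threshold = sum(sims.values()) / len(sims) if sims else 0.0
    parent = list(range(len(sentences)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for (i, j), s in sims.items():
        if s > threshold:
            parent[find(i)] = find(j)
    groups = {}
    for i in range(len(sentences)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


sentences = [
    "the cat sat on the mat",
    "the cat sat on the rug",
    "stock prices fell sharply",
    "stock prices rose sharply",
]
groups = group_by_mean_threshold(sentences)
```

A WordNet-based measure would let sentences with no shared words still score as similar, which the surface-level cosine used here cannot do.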


Mathematics ◽  
2021 ◽  
Vol 9 (4) ◽  
pp. 370
Author(s):  
Shuangsheng Wu ◽  
Jie Lin ◽  
Zhenyu Zhang ◽  
Yushu Yang

The fuzzy clustering algorithm has become a research hotspot in many fields because of its clustering performance and its ability to express data. However, little research focuses on the clustering of hesitant fuzzy linguistic term sets (HFLTSs). To fill this research gap, we extend the data type of clustering to hesitant fuzzy linguistic information. A hesitant fuzzy linguistic agglomerative hierarchical clustering algorithm is proposed. Furthermore, we propose a hesitant fuzzy linguistic Boole matrix clustering algorithm and compare the two clustering algorithms. The proposed clustering algorithms are applied in the field of judicial execution, where they provide decision support for the executive judge in determining the focus of investigation and control. A clustering example verifies the effectiveness of the clustering algorithms in the context of hesitant fuzzy linguistic decision information.
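As a rough illustration of agglomerative hierarchical clustering over term sets, the Python sketch below merges the closest pair of clusters until a target count is reached. Everything specific here is an assumption rather than the paper's method: each HFLTS is encoded as a set of integer term indices, the distance `mean_index_distance` (absolute difference of mean term indices) is a hypothetical stand-in for a proper HFLTS distance, and average linkage is an arbitrary choice.

```python
def mean_index_distance(a, b):
    """Hypothetical distance: gap between the mean term indices of two sets."""
    return abs(sum(a) / len(a) - sum(b) / len(b))


def agglomerative(items, distance, n_clusters):
    """Average-linkage agglomerative clustering; returns lists of item indices."""
    if not 1 <= n_clusters <= len(items):
        raise ValueError("n_clusters must be between 1 and len(items)")
    clusters = [[i] for i in range(len(items))]

    def linkage(a, b):
        # Average pairwise distance between the members of two clusters.
        total = sum(distance(items[i], items[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge the second into the first.
        _, x, y = min(
            (linkage(clusters[x], clusters[y]), x, y)
            for x in range(len(clusters))
            for y in range(x + 1, len(clusters))
        )
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return sorted(clusters)


# Toy term sets on a 0..6 linguistic scale (illustrative data only).
term_sets = [{0, 1}, {1}, {5, 6}, {6}]
clusters = agglomerative(term_sets, mean_index_distance, 2)
```

Because the distance is passed in as a function, a distance defined for HFLTSs in the literature could be substituted without touching the merging loop.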


2021 ◽  
Vol 13 (3) ◽  
pp. 1089
Author(s):  
Hailin Zheng ◽  
Qinyou Hu ◽  
Chun Yang ◽  
Jinhai Chen ◽  
Qiang Mei

Since the spread of the coronavirus disease 2019 (COVID-19) pandemic, the transportation of cargo by ship has been seriously impacted. To prevent and control maritime COVID-19 transmission, it is of great significance to track and predict ship sailing behavior. As the nodes of cargo ship transportation networks, ports of call can reflect the sailing behavior of cargo ships. An accurate hierarchical division of ports of call can help to clarify the navigation patterns of ships of different types and scales. Among typical cargo ships, those with a deadweight tonnage over 10,000 account for 95.77% of total deadweight, and 592,244 berthing records for such ships were mined from automatic identification system (AIS) data from January to October 2020. Considering ship type and ship scale, port hierarchy classification models are constructed to divide these ports into three kinds of specialized ports: bulk, container, and tanker ports. For all types of specialized ports (considering ship scale), the port call probability for the corresponding ship type is higher than for other ship types; it is positively correlated with ship deadweight if the port scale is larger than the ship scale, and negatively correlated with ship deadweight if the port scale is smaller than the ship scale. Moreover, the port call probability for a port's corresponding ship type is positively correlated with ship deadweight, while the port call probability for other ship types is negatively correlated with ship deadweight. The results indicate that a specialized port hierarchical clustering algorithm can divide the hierarchical structure of the ports called at by typical cargo ships, and is an effective method for tracking the maritime transmission path of the COVID-19 pandemic.
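The notion of a port call probability per ship type can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's classification model: berthing records are reduced to hypothetical `(port, ship_type)` pairs, the probability is taken as a ship type's share of all calls at a port, and a port is labelled with whichever type has the largest share. Ship scale and deadweight are ignored.

```python
from collections import Counter, defaultdict


def port_call_shares(records):
    """Share of each ship type among all recorded calls at each port."""
    counts = defaultdict(Counter)
    for port, ship_type in records:
        counts[port][ship_type] += 1
    return {
        port: {t: n / sum(c.values()) for t, n in c.items()}
        for port, c in counts.items()
    }


def dominant_type(shares):
    """Label each port with the ship type that calls there most often."""
    return {port: max(s, key=s.get) for port, s in shares.items()}


# Illustrative records only; port names and counts are made up.
records = [
    ("A", "bulk"),
    ("A", "bulk"),
    ("A", "tanker"),
    ("B", "container"),
]
shares = port_call_shares(records)
labels = dominant_type(shares)
```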


2020 ◽  
Vol 16 (4) ◽  
pp. 15-29
Author(s):  
Jayalakshmi D. ◽  
Dheeba J.

The incidence of skin cancer has been increasing in recent years, and the disease can become dangerous if not detected early. Computer-aided diagnosis systems can assist dermatologists in detecting skin cancer by examining lesion features more critically. In this article, a detailed review of pre-processing and segmentation methods for skin lesion images is presented, investigating existing and prevalent segmentation methods for the diagnosis of skin cancer. The pre-processing stage is divided into two phases: in the first phase, a median filter is used to remove artifacts, and in the second phase, an improved K-means clustering with outlier removal (KMOR) algorithm is suggested. The proposed method was tested on the publicly available Danderm database. The improved cluster-based algorithm gives an accuracy of 92.8%, with a sensitivity of 93%, a specificity of 90%, and an AUC value of 0.90435. The experimental results show that the clustering algorithm performs well in detecting the border of the lesion and is suitable for pre-processing dermoscopic images.
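The first pre-processing phase, median filtering, can be sketched as follows in plain Python. The details are assumptions, since the abstract does not give them: the image is a grayscale list of rows, the window is square with a configurable `radius`, and border pixels are handled by replicating the nearest edge value. The KMOR clustering phase is not reproduced here.

```python
from statistics import median


def median_filter(image, radius=1):
    """Replace each pixel with the median of its (2*radius + 1)^2 window.

    Assumption: borders are handled by clamping coordinates, which
    replicates the nearest edge pixel.
    """
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            window = [
                image[min(max(r + dr, 0), h - 1)][min(max(c + dc, 0), w - 1)]
                for dr in range(-radius, radius + 1)
                for dc in range(-radius, radius + 1)
            ]
            out[r][c] = median(window)
    return out


# A flat patch with one bright speck, standing in for a thin artifact.
noisy = [
    [10, 10, 10],
    [10, 255, 10],
    [10, 10, 10],
]
clean = median_filter(noisy)
```

Because the median ignores isolated extreme values, the single bright pixel is replaced by its surroundings, whereas a mean filter would smear it across the neighbouring pixels.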


2014 ◽  
Vol 42 (2) ◽  
pp. 174-194 ◽  
Author(s):  
Akil Elkamel ◽  
Mariem Gzara ◽  
Hanêne Ben-Abdallah
