gap statistic Latest Research Papers

Validation of Rapid and Low-Cost Approach for the Delineation of Zone Management Based on Machine Learning Algorithms

Proximal soil sensors are receiving strong attention from several disciplinary fields, and this has led to a rise in their availability in the market in the last two decades. The aim of this work was to agronomically validate a management zone delineation procedure based on electromagnetic induction (EMI) maps, applied to two different rainfed durum wheat fields. The k-means algorithm was applied, with the gap statistic index used to identify the optimal number of management zones and their positions. Traditional statistical analysis was performed to detect significant differences in the soil characteristics and crop response of each management zone. The procedure showed the presence of two management zones at both sites under analysis, and it was agronomically validated by the significant differences in soil texture (+24.17%), bulk density (+6.46%), organic matter (+39.29%), organic carbon (+39.4%), total carbonates (+25.34%), total nitrogen (+30.14%), protein (+1.50%) and yield data (+1.07 t ha−1). Moreover, six unmanned aerial vehicle (UAV) flight missions were performed to investigate the relationship between five vegetation indexes and the EMI maps. The results suggest acquiring the multispectral images during the flowering phenological stage in order to attribute the crop spatial variability to different soil properties.
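The abstract above selects the number of k-means clusters with the gap statistic. As a rough illustration only, the sketch below follows the standard gap statistic recipe (compare log within-cluster dispersion on the data against uniform reference data drawn from the bounding box, then pick the smallest k whose gap is within one standard error of the next). It is not the authors' code: the function names, the plain Lloyd's k-means, and the defaults `k_max`, `n_refs` and `n_init` are assumptions made for this example.

```python
import numpy as np


def kmeans(X, k, rng, n_iter=100):
    """Plain Lloyd's k-means (illustrative); returns labels and centroids."""
    def nearest(centers):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return d.argmin(axis=1)

    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = nearest(centers)
        # Keep the old centroid if a cluster happens to become empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return nearest(centers), centers


def log_wk(X, k, rng, n_init=5):
    """log of the within-cluster sum of squares, best of n_init restarts."""
    best = np.inf
    for _ in range(n_init):
        labels, centers = kmeans(X, k, rng)
        best = min(best, ((X - centers[labels]) ** 2).sum())
    return np.log(best)


def gap_statistic(X, k_max=6, n_refs=10, seed=0):
    """Return (chosen k, list of Gap(k) for k = 1..k_max)."""
    rng = np.random.default_rng(seed)
    lo, hi = X.min(axis=0), X.max(axis=0)
    gaps, sks = [], []
    for k in range(1, k_max + 1):
        # Reference dispersion: uniform samples over the data's bounding box.
        ref = np.array([
            log_wk(rng.uniform(lo, hi, size=X.shape), k, rng)
            for _ in range(n_refs)
        ])
        gaps.append(ref.mean() - log_wk(X, k, rng))
        sks.append(ref.std() * np.sqrt(1 + 1 / n_refs))
    # Smallest k with Gap(k) >= Gap(k+1) - s_{k+1}.
    for k in range(1, k_max):
        if gaps[k - 1] >= gaps[k] - sks[k]:
            return k, gaps
    return k_max, gaps
```

Calling `gap_statistic(X)` on an `(n_samples, n_features)` array returns the selected number of clusters together with the gap curve, which can be plotted against k to inspect the choice.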

Updating incomplete framework of target recognition database based on fuzzy gap statistic

Engineering Applications of Artificial Intelligence ◽

10.1016/j.engappai.2021.104521 ◽

2022 ◽

Vol 107 ◽

pp. 104521

Author(s):

Zichong Chen ◽

Rui Cai

Keyword(s):

Target Recognition ◽

Gap Statistic

On the Cluster Validity Test (s) in Unsupervised Machine Learning TDA Approach for Atmospheric River Patterns on Flood Detection in Nigeria

10.21203/rs.3.rs-459258/v1 ◽

2021 ◽

Author(s):

Felix Obi Ohanuba ◽

Mohd Tahir Ismail ◽

Majid Khan Majahar Ali ◽

Ekele Alih ◽

Precious Ndidiamaka Ezra

Keyword(s):

Spatial Data ◽

Research Area ◽

Topological Data Analysis ◽

Cluster Validity ◽

Shape Information ◽

Gap Statistic ◽

Automated Method ◽

Flood Zone ◽

Negative Effect ◽

Software Codes

Topological Data Analysis (TDA) has recently become a reliable and active research area in statistics for extracting shape information from data. In this study, the researchers propose an automated method that uses TDA and machine learning (ML) to identify floods (atmospheric rivers, ARs) in big data. The process gives vital details on time series trends, which help mitigate the negative effects of ARs, such as flooding. Spatial data (1970 to 2018) from the Nigeria Hydrological Services Agency (NIHSA) on four weather parameters were used. The daily datasets were converted to monthly datasets before the proposed method was applied. Python was used to implement the process. The findings are expected to help reduce disasters due to extreme events such as floods and to support flood-related Sustainable Development Goals (SDGs). The second objective is to identify potential flooding and no flooding in each zone. The work uses a real dataset and four variables that other studies have not used, filling a gap. After the model's training process, the best grouping was obtained at k = 2, which gave the highest Silhouette coefficient in each of the seven states. A reasonable structure was found in the study, considering the total average range (0.3 - 0.8). That gives an efficiency outcome of approximately 80%. A summary of the clustered feature pattern shows the potential flood zone and no flood zone. Cluster validity was checked using R, and the test validated the best grouping at the same k = 2. The gap statistic shows efficiency ranging from 65% to 80% in the seven states. Figure 11 shows that only the Silhouette plot reached its optimum at exactly k = 2. The extent of the spread from the centroid was obtained using Excel.
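The abstract above picks k = 2 because it gives the highest Silhouette coefficient. For readers unfamiliar with that index, the following is a minimal sketch of the mean silhouette coefficient for a given labelling, using Euclidean distances. The function name `mean_silhouette` and the convention of scoring singleton clusters as 0 are choices made for this illustration, not details taken from the paper.

```python
import numpy as np


def mean_silhouette(X, labels):
    """Mean silhouette coefficient over all points (Euclidean distance)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # Full pairwise distance matrix; fine for small illustrative datasets.
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2))
    clusters = np.unique(labels)
    scores = np.zeros(len(X))
    for i in range(len(X)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # singleton cluster: score left at 0 by convention
        # a: mean distance to the other members of the point's own cluster.
        a = dist[i, own].sum() / (n_own - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(dist[i, labels == c].mean() for c in clusters if c != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores.mean()
```

To compare candidate values of k, one would cluster the data for each k, pass the resulting labels to `mean_silhouette`, and prefer the k with the highest score. The function assumes at least two distinct labels.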

Tumor Microenvironment Characteristics of Pancreatic Cancer to Determine Prognosis and Immune-Related Gene Signatures

Frontiers in Molecular Biosciences ◽

10.3389/fmolb.2021.645024 ◽

2021 ◽

Vol 8 ◽

Author(s):

Congjun Zhang ◽

Jun Ding ◽

Xiao Xu ◽

Yangyang Liu ◽

Wei Huang ◽

...

Keyword(s):

Pancreatic Cancer ◽

Tumor Microenvironment ◽

Immune Cell ◽

Inhibitory Effect ◽

Gene Expression Omnibus ◽

The Cancer Genome Atlas ◽

Diagnosis And Prognosis ◽

Gap Statistic ◽

New Strategy ◽

Immune Cell Subpopulations

Background: Pancreatic cancer (PC) is one of the most lethal types of cancer with extremely poor diagnosis and prognosis, and the tumor microenvironment plays a pivotal role during PC progression. Poor prognosis is closely associated with the unsatisfactory results of currently available treatments, which are largely due to the unique pancreatic tumor microenvironment (TME). Methods: In this study, a total of 177 patients with PC from The Cancer Genome Atlas (TCGA) cohort and 65 patients with PC from the GSE62452 cohort in Gene Expression Omnibus (GEO) were included. Based on the proportions of 22 types of infiltrated immune cell subpopulations calculated by cell-type identification by estimating relative subsets of RNA transcripts (CIBERSORT), the TME was classified by K-means clustering and differentially expressed genes (DEGs) were determined. A combination of the elbow method and the gap statistic was used to explore the likely number of distinct clusters in the data. The ConsensusClusterPlus package was utilized to identify radiomics clusters, and the samples were divided into two subtypes. Results: Survival analysis showed that patients with the TMEscore-high phenotype had a better prognosis. In addition, the TMEscore-high phenotype had a better inhibitory effect on the immune checkpoint. A total of 10 miRNAs, 311 DEGs, and 68 methylation sites related to survival were obtained, which could be biomarkers to evaluate the prognosis of patients with pancreatic cancer. Conclusions: Therefore, a comprehensive description of the TME characteristics of pancreatic cancer can help explain the response of pancreatic cancer to immunotherapy and provide a new strategy for cancer treatment.

Color Text Fading Detection

Electronic Imaging ◽

10.2352/issn.2470-1173.2021.16.color-253 ◽

2021 ◽

Vol 2021 (16) ◽

pp. 253-1-253-8

Author(s):

Runzhe Zhang ◽

Eric Maggard ◽

Yousun Bang ◽

Minki Cho ◽

Mark Shaw ◽

...

Keyword(s):

Clustering Algorithm ◽

Detection Method ◽

Region Of Interest ◽

Support Vector ◽

Print Quality ◽

Black And White ◽

Density Reduction ◽

Gap Statistic ◽

The Difference ◽

Defects Analysis

The text fading defect is one of the most common defects in electrophotographic printers, and it dramatically affects print quality. It usually appears in a significant symbol Region of Interest (ROI) and is easily noticed by a user on a print. Text fading can be detected from the density reduction in a black and white printed symbol ROI. It is difficult to detect color text fading by density reduction alone, because a depleted cartridge may cause only color distortion, without density reduction, in the color printed symbol ROI. In our previous work on print quality defects analysis, the text fading detection method only works for black text fading defects [1]. Our new text fading method can detect the color text fading defect and predict the depleted cartridge. In this new method, we use whole page image registration and the median threshold bitmap (MTB) matching method to align the text characters between the master and test symbol ROIs. With the text characters aligned, it is easy to extract the difference between the master and the test text characters and so detect the text fading defect. We use a support vector machine classifier to assign a rank to the overall quality of the printed page. We also use the gap statistic method with the K-means clustering algorithm to extract the distinct colors of the text characters and predict the depleted cartridge.

Block Mining reward prediction with Polynomial Regression, Long short-term memory, and Prophet API for Ethereum blockchain miners

ITM Web of Conferences ◽

10.1051/itmconf/20213701004 ◽

2021 ◽

Vol 37 ◽

pp. 01004

Author(s):

Jeyasheela Rakkini Simon ◽

K Geetha

Keyword(s):

Linear Regression ◽

Short Term Memory ◽

Polynomial Regression ◽

Optimal Number ◽

Time Data ◽

Data Set ◽

Gap Statistic ◽

Block Number ◽

Long Short Term Memory ◽

Turing Complete

The Ethereum blockchain is an open-source, decentralized blockchain with functions triggered by smart contracts, and it has voluminous real-time data for analysis using machine learning and deep learning algorithms. Ether is the cryptocurrency of the Ethereum blockchain. The Ethereum virtual machine is used to run Turing-complete scripts. A data set describing blocks in the Ethereum blockchain (block number, timestamp, crypto address of the miner, and block rewards for the miner) is explored with K-means clustering, which groups miners with a unique crypto address by their rewards. Linear regression and polynomial regression are used to predict the next block reward to the miner. The Long Short-Term Memory (LSTM) algorithm is applied to the Ether market data set to predict the next Ether price in the market. Every kind of price and volume for every four hours is taken for prediction. A root mean square error of 34.9% is obtained for linear regression, and the silhouette score is 71% for K-means clustering of miners with the same rewards, with the optimal number of clusters obtained by the gap statistic method.

Ordinal Approaches to Decomposing Between-Group Test Score Disparities

Journal of Educational and Behavioral Statistics ◽

10.3102/1076998620967726 ◽

2020 ◽

pp. 107699862096772

Author(s):

David M. Quinn ◽

Andrew D. Ho

Keyword(s):

Test Score ◽

Educational Inequality ◽

Decomposition Methods ◽

Ordered Probit ◽

Group Differences ◽

Scale Invariant ◽

Gap Statistic ◽

The Difference ◽

Ordered Probit Models ◽

Test Score Gaps

The estimation of test score “gaps” and gap trends plays an important role in monitoring educational inequality. Researchers decompose gaps and gap changes into within- and between-school portions to generate evidence on the role schools play in shaping these inequalities. However, existing decomposition methods assume an equal-interval test scale and are a poor fit to coarsened data such as proficiency categories. This leaves many potential data sources ill-suited for decomposition applications. We develop two decomposition approaches that overcome these limitations: an extension of V, an ordinal gap statistic, and an extension of ordered probit models. Simulations show V decompositions have negligible bias with small within-school samples. Ordered probit decompositions have negligible bias with large within-school samples but more serious bias with small within-school samples. More broadly, our methods enable analysts to (1) decompose the difference between two groups on any ordinal outcome into portions within- and between some third categorical variable and (2) estimate scale-invariant between-group differences that adjust for a categorical covariate.
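The "gap statistic" in this entry is an ordinal measure of the gap between two groups, not the cluster-count criterion used in the other entries. As a hedged illustration, the sketch below computes the commonly cited base form of V, namely the square root of 2 times the inverse normal CDF of the probability that a randomly chosen member of group a outscores a randomly chosen member of group b. It uses a simple tie-corrected probability from counts in ordered categories. This is an assumption-laden simplification and not the decomposition estimators developed in the paper; the function names are hypothetical.

```python
from statistics import NormalDist


def prob_a_exceeds_b(counts_a, counts_b):
    """P(A > B) + 0.5 * P(A = B) from counts in ordered categories.

    counts_a[i] and counts_b[i] are the numbers of group a and group b
    members in category i, with categories listed from lowest to highest.
    """
    n_a, n_b = sum(counts_a), sum(counts_b)
    total = 0.0
    below_b = 0  # group b members in strictly lower categories so far
    for ca, cb in zip(counts_a, counts_b):
        total += ca * (below_b + 0.5 * cb)
        below_b += cb
    return total / (n_a * n_b)


def v_gap(counts_a, counts_b):
    """V = sqrt(2) * inverse-normal-CDF of P(A > B); needs 0 < P < 1."""
    p = prob_a_exceeds_b(counts_a, counts_b)
    return 2 ** 0.5 * NormalDist().inv_cdf(p)
```

Because V depends only on the ordering of scores, it is unchanged by any monotone rescaling of the test scale, which is the scale-invariance property the abstract refers to. With heavily coarsened categories the simple tie correction used here can understate the gap.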

Temporal gap statistic: A new internal index to validate time series clustering

Chaos Solitons & Fractals ◽

10.1016/j.chaos.2020.110326 ◽

2020 ◽

pp. 110326

Author(s):

Rosana Guimarães Ribeiro ◽

Ricardo Rios

Keyword(s):

Time Series ◽

Time Series Clustering ◽

Gap Statistic

Corrosion Evaluation Using Clustering Method Based on Eddy Current Pulsed Thermography

Studies in Applied Electromagnetics and Mechanics - Electromagnetic Non-Destructive Evaluation (XXIII) ◽

10.3233/saem200041 ◽

2020 ◽

Author(s):

Peizhen Shi ◽

Song Ding ◽

Yuming Chen ◽

Yiqing Wang ◽

Guiyun Tian ◽

...

Keyword(s):

Steel Corrosion ◽

Cluster Center ◽

Clustering Method ◽

Pulsed Thermography ◽

Gap Statistic ◽

Density Peaks ◽

Corrosion Evaluation ◽

Q235 Carbon Steel ◽

State Difference ◽

Corrosion State

As a common defect of steel, corrosion has been a major challenge to industrial safety and structural health. For atmospheric corrosion characterization and evaluation, a clustering by fast search and find of density peaks (CFSFDP) algorithm, combined with the gap statistic (GS) method, is applied to corroded Q235 carbon steel tubes. With the proposed method, three samples corroded in the natural atmosphere are investigated and classified. The proposed method successfully identifies the samples with different service periods. The temperature gradient, which indicates the heat generation and conductivity, is used to analyze cluster center selection. The matching rate is presented as a feature to reflect the corrosion state difference.

Wind Farm Clustering Optimization Method Using Gap Statistic

2020 5th International Conference on Power and Renewable Energy (ICPRE) ◽

10.1109/icpre51194.2020.9233199 ◽

2020 ◽

Author(s):

Li Peng ◽

Hang Su ◽

Hongying Peng ◽

Xiaomin Qiao ◽

Pan Wu ◽

...

Keyword(s):

Wind Farm ◽

Optimization Method ◽

Gap Statistic ◽

Clustering Optimization
