AdvCodeMix: Adversarial Attack on Code-Mixed Data

Hyperspectral unmixing is an important technique for analyzing remote sensing images; it aims to obtain a collection of endmembers and their corresponding abundances. In recent years, non-negative matrix factorization (NMF) has received extensive attention because it adapts well to data with different degrees of mixing. Most existing NMF-based unmixing methods incorporate additional constraints into standard NMF based on the spectral and spatial information of hyperspectral images. However, they do not account for the imbalanced pixels in the data: pixels mixed from imbalanced endmembers may be effectively ignored, and because of the statistical nature of NMF those endmembers generally cannot be estimated accurately. To exploit the information in imbalanced samples during unmixing, this paper proposes a cluster-wise weighted NMF (CW-NMF) method for hyperspectral images with imbalanced data. Specifically, based on a clustering of the hyperspectral image, we construct a weight matrix and introduce it into the standard NMF model. The weight matrix assigns an appropriate weight to the reconstruction error between each original pixel and its reconstruction. The adverse effect of imbalanced samples on the statistical accuracy of NMF is thereby expected to be reduced, because pixels involving imbalanced endmembers receive larger weights and pixels mixed from majority endmembers receive smaller weights. We also extend CW-NMF with a sparsity constraint on the abundances and with graph-based regularization, respectively. Experiments on both synthetic and real hyperspectral data demonstrate the effectiveness of the proposed methods in comparison with several state-of-the-art methods.
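The idea of weighting each pixel's reconstruction error can be illustrated with a minimal sketch. The abstract does not give the weight formula or the optimizer, so everything specific here is an assumption: `cluster_weights` uses plain inverse cluster frequency, and `weighted_nmf` uses standard multiplicative updates for the objective sum_j w_j * ||x_j - A s_j||^2. The function names are hypothetical, and the sparsity and graph regularizers are omitted.

```python
import numpy as np

def cluster_weights(labels):
    """Per-pixel weight from cluster size (assumed inverse-frequency scheme).

    Pixels in small clusters get weights above 1, pixels in large
    clusters get weights below 1; the weights sum to the pixel count.
    """
    labels = np.asarray(labels).ravel()
    _, inverse, counts = np.unique(labels, return_inverse=True, return_counts=True)
    return labels.size / (counts.size * counts[inverse])

def weighted_nmf(X, weights, n_endmembers, n_iter=200, eps=1e-9, seed=0):
    """Factor X (bands x pixels) as A @ S with per-pixel error weights."""
    rng = np.random.default_rng(seed)
    n_bands, n_pixels = X.shape
    A = rng.random((n_bands, n_endmembers)) + eps   # endmember spectra
    S = rng.random((n_endmembers, n_pixels)) + eps  # abundances
    w = np.asarray(weights, dtype=float)
    for _ in range(n_iter):
        # Broadcasting scales column j (pixel j) by w[j].
        A *= ((X * w) @ S.T) / (A @ ((S * w) @ S.T) + eps)
        # A per-pixel weight cancels in the abundance update for that pixel.
        S *= (A.T @ X) / (A.T @ A @ S + eps)
    return A, S

def weighted_error(X, A, S, weights):
    """Weighted sum of squared per-pixel reconstruction errors."""
    residual = X - A @ S
    return float(np.sum(weights * np.sum(residual ** 2, axis=0)))
```

In this sketch the weights influence only the endmember update, which matches the stated aim of keeping minority endmembers from being swamped by majority pixels.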


Visualizing Profiles of Large Datasets of Weighted and Mixed Data

Mathematics ◽ 10.3390/math9080891 ◽ 2021 ◽ Vol 9 (8) ◽ pp. 891

Author(s): Aurea Grané ◽ Alpha A. Sow-Barry

Keyword(s): Multidimensional Scaling ◽ Random Sample ◽ Simulation Study ◽ Clustering Algorithm ◽ Computational Cost ◽ Interpolation Formula ◽ Large Datasets ◽ Mixed Data ◽ Multivariate Techniques ◽ High Computational Cost

This work provides a procedure for constructing and visualizing profiles, i.e., groups of individuals with similar characteristics, for weighted and mixed data by combining two classical multivariate techniques: multidimensional scaling (MDS) and the k-prototypes clustering algorithm. The well-known drawback of classical MDS on large datasets is circumvented by selecting a small random sample of the dataset, whose individuals are clustered with an adapted version of the k-prototypes algorithm and mapped via classical MDS. Gower’s interpolation formula is then used to project the remaining individuals onto this configuration. Throughout the process, Gower’s distance is used to measure the proximity between individuals. The methodology is illustrated on a real dataset from the Survey of Health, Ageing and Retirement in Europe (SHARE), which was carried out in 19 countries and represents over 124 million aged individuals in Europe. The performance of the method was evaluated through a simulation study, whose results indicate that the proposal overcomes the high computational cost of classical MDS with low error.
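A minimal sketch of a Gower-type dissimilarity for mixed records may help make the proximity measure concrete. It uses the common textbook form (range-scaled absolute difference for numeric variables, simple mismatch for categorical ones, averaged over variables); the exact variant and weighting used in the paper may differ. The function name and the `ranges` and `var_weights` parameters are hypothetical.

```python
def gower_distance(a, b, ranges, var_weights=None):
    """Gower-type dissimilarity between two mixed-type records.

    ranges[k] is the observed range of variable k if it is numeric,
    or None if variable k is categorical. var_weights are optional
    per-variable weights (an assumption for illustration).
    """
    if var_weights is None:
        var_weights = [1.0] * len(a)
    total = 0.0
    for x, y, r, w in zip(a, b, ranges, var_weights):
        if r is None:
            d = 0.0 if x == y else 1.0          # categorical: mismatch
        else:
            d = abs(x - y) / r if r > 0 else 0.0  # numeric: scaled gap
        total += w * d
    return total / sum(var_weights)

# Two made-up records: (age, sex, employment status); ages span 40 years.
a = (50, "F", "retired")
b = (70, "F", "employed")
d = gower_distance(a, b, ranges=(40, None, None))
```

Here the age gap contributes 20/40 = 0.5, sex contributes 0 and employment status contributes 1, so `d` is their mean, 0.5. A pairwise matrix of such values over the random subsample is what classical MDS would take as input.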


Head-to-head comparison of clustering methods for heterogeneous data: a simulation-driven benchmark

Scientific Reports ◽ 10.1038/s41598-021-83340-8 ◽ 2021 ◽ Vol 11 (1)

Author(s): Gregoire Preud’homme ◽ Kevin Duarte ◽ Kevin Dalleau ◽ Claire Lacomblez ◽ Emmanuel Bresso ◽ ...

Keyword(s): Hierarchical Clustering ◽ Latent Class ◽ Latent Class Model ◽ Real Life ◽ Heterogeneous Data ◽ Mixed Data ◽ Categorical Variables ◽ Clustering Methods ◽ Model Based ◽ Partitioning Around Medoids

The choice of the most appropriate unsupervised machine-learning method for “heterogeneous” or “mixed” data, i.e. data with both continuous and categorical variables, can be challenging. Our aim was to examine the performance of various clustering strategies for mixed data using both simulated and real-life data. We conducted a benchmark analysis of “ready-to-use” tools in R comparing 4 model-based (Kamila algorithm, Latent Class Analysis, Latent Class Model [LCM] and Clustering by Mixture Modeling) and 5 distance/dissimilarity-based (Gower distance or Unsupervised Extra Trees dissimilarity followed by hierarchical clustering or Partitioning Around Medoids, K-prototypes) clustering methods. Clustering performance was assessed by the Adjusted Rand Index (ARI) on 1000 generated virtual populations of mixed variables under 7 scenarios with varying population sizes, numbers of clusters, numbers of continuous and categorical variables, proportions of relevant (non-noisy) variables and degrees of variable relevance (low, mild, high). The clustering methods were then applied to data from the EPHESUS randomized clinical trial (a heart failure trial evaluating the effect of eplerenone) to illustrate the differences between clustering techniques. The simulations revealed the dominance of K-prototypes, Kamila and LCM models over all other methods. Overall, methods using dissimilarity matrices in classical algorithms such as Partitioning Around Medoids and Hierarchical Clustering had a lower ARI than model-based methods in all scenarios. When the clustering methods were applied to a real-life clinical dataset, LCM showed promising results with regard to differences in (1) clinical profiles across clusters, (2) prognostic performance (highest C-index) and (3) identification of patient subgroups with substantial treatment benefit. The present findings suggest key differences in clustering performance between the tested algorithms (limited to tools readily available in R). In most of the tested scenarios, model-based methods (in particular the Kamila and LCM packages) and K-prototypes typically performed best in the setting of heterogeneous data.
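For readers unfamiliar with the evaluation metric, the Adjusted Rand Index can be computed from the contingency table of two labelings. The sketch below follows the standard Hubert and Arabie formula in plain Python; the study itself used R tooling, so this is an illustration of the metric only. Returning 1.0 in the degenerate case is a convention assumed here.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two labelings of the same items."""
    n = len(labels_true)
    if n < 2:
        return 1.0
    # Pairs of items placed together in both labelings, in the
    # reference labeling, and in the predicted labeling.
    cells = Counter(zip(labels_true, labels_pred))
    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)   # chance agreement
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (sum_cells - expected) / (max_index - expected)

same = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])     # relabeled copy
crossed = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])  # worse than chance
```

The index is invariant to how clusters are named, so `same` equals 1.0 even though the label values are swapped, while `crossed` is negative (-0.5) because the two partitions agree less than chance would predict.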


Characterization of smallholder cattle production systems in South-Kivu province, eastern Democratic Republic of Congo

Pastoralism Research Policy and Practice ◽ 10.1186/s13570-020-00187-w ◽ 2021 ◽ Vol 11 (1)

Author(s): Yannick Mugumaarhahama ◽ Rodrigue Balthazar Basengere Ayagirwe ◽ Valence Bwana Mutwedu ◽ Nadège Cizungu Cirezi ◽ Dieudonné Shukuru Wasso ◽ ...

Keyword(s): Production Systems ◽ Democratic Republic Of Congo ◽ Animal Husbandry ◽ Mixed Data ◽ Medium Size ◽ Fallow Land ◽ Cattle Production ◽ Access To Credit ◽ Cattle Farming ◽ South Kivu

In South-Kivu province, cattle farming is an integral component of farmers’ livelihoods and one of the few income-generating opportunities for smallholders. However, very few studies have characterized smallholders’ cattle production systems. This study documents these systems to better understand their current situation, the constraints they face and the opportunities they offer. To that end, a structured survey questionnaire and participatory interviews were conducted with 863 farmers in South-Kivu province. The collected data were analysed using factorial analysis of mixed data and clustering techniques. The results revealed three types of smallholder cattle farms, differing mainly in herd size and landholding. The first category is the most common and includes farmers raising small herds (6.3 ± 6.7 cattle) of local breeds in a herding system (in this work, “herding system” refers to a rearing system in which the farmer drives and stays with his animals on pastures and fallow land during the day), grazing on community pastures, fallow land and roadside grasses, while land is a scarce resource. In the second category, some farmers have small tracts of land (< 5 ha) and others have large tracts (> 5 ha), but all have medium-size herds (45.1 ± 19.4 cattle) of local breeds, which they rear in a herding system. They also use community pastures, fallow land and roadside fodder for animal feeding. The third and last category includes farmers with large herds (78.1 ± 28.1 cattle) of local, crossbred and exotic breeds raised free range in fenced paddocks on vast areas of land (> 5 ha) in high-altitude regions. Despite these differences, all three categories remain extensive pastoral farms dominated by male farmers. Agriculture and/or animal husbandry is their main source of income, and their livestock also includes goats and poultry besides cattle. Still, the three farming groups require more inputs and improvement strategies to increase productivity in a challenging environment characterized by low land accessibility and high demand for milk and meat. Fodder cultivation and crop-livestock integration through agro-ecological systems, as well as access to credit and extension services, are the proposed strategies for improving this economic sector.


An adversarial attack detection method in deep neural networks based on re-attacking approach

Multimedia Tools and Applications ◽ 10.1007/s11042-020-10261-5 ◽ 2021

Author(s): Morteza Ali Ahmadi ◽ Rouhollah Dianat ◽ Hossein Amirkhani

Keyword(s): Neural Networks ◽ Deep Neural Networks ◽ Detection Method ◽ Attack Detection ◽ Adversarial Attack


An Adversarial Attack Defending System for Securing In-Vehicle Networks

2021 IEEE 18th Annual Consumer Communications & Networking Conference (CCNC) ◽ 10.1109/ccnc49032.2021.9369569 ◽ 2021

Author(s): Yi Li ◽ Jing Lin ◽ Kaiqi Xiong

Keyword(s): Vehicle Networks ◽ Adversarial Attack


A Knitted Sensing Glove for Human Hand Postures Pattern Recognition

Sensors ◽ 10.3390/s21041364 ◽ 2021 ◽ Vol 21 (4) ◽ pp. 1364

Author(s): Seulah Lee ◽ Yuna Choi ◽ Minchang Sung ◽ Jihyun Bae ◽ Youngjin Choi

Keyword(s): Pattern Recognition ◽ Confusion Matrix ◽ Strain Sensors ◽ Mixed Data ◽ Human Hand ◽ Flexible Sensors ◽ Target Hand ◽ The Cost ◽ Grasp Posture ◽ Conductive Yarn

In recent years, flexible sensors for data gloves have been developed with the aim of achieving excellent wearability, but they are difficult to manufacture and to embed into the glove. This study proposes a knitted glove integrated with strain sensors for pattern recognition of hand postures. The proposed sensing glove is fabricated all at once by a knitting technique, without sewing or bonding, and is composed of strain sensors knitted with conductive yarn and a glove body knitted with non-conductive yarn. To verify the performance of the developed glove, variations in electrical resistance were measured as a function of flexion angle and speed. These data showed different values depending on the speed or angle of the movements. We carried out hand posture pattern recognition experiments to verify the practicability of the knitted sensing glove. For this purpose, 10 able-bodied subjects participated in recognition experiments on 10 target hand postures. The average classification accuracy across the 10 subjects reached 94.17% when each subject’s own data were used. An accuracy of up to 97.1% was achieved for the grasp posture among the 10 target postures. When the mixed data from all 10 subjects were used for pattern recognition, the average classification accuracy, as expressed by the confusion matrix, was 89.5%. The experimental results therefore demonstrate the effectiveness of the knitted sensing glove. In addition, the simple manufacturing process of the knitted sensing glove is expected to reduce cost.
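The accuracy figures above are derived from a confusion matrix. As a minimal sketch of that bookkeeping (not the authors' classifier or data), the following plain-Python helpers build a confusion matrix from integer posture labels and read off overall and per-posture accuracy. The function names and the toy labels are hypothetical.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """m[t][p] counts samples of true class t predicted as class p."""
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

def overall_accuracy(m):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in m)
    correct = sum(m[i][i] for i in range(len(m)))
    return correct / total

def per_class_accuracy(m):
    """For each true class, the fraction of its samples predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0 for i, row in enumerate(m)]

# Toy example with 3 posture classes (made-up labels, for illustration only).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
m = confusion_matrix(y_true, y_pred, n_classes=3)
```

With these toy labels, four of six samples lie on the diagonal, and the per-class values show which postures are most often confused, which is the kind of breakdown behind a per-posture figure such as the one reported for the grasp posture.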
