Scalable and Efficient Pairwise Learning to Achieve Statistical Accuracy

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v33i01.33013697 ◽

2019 ◽

Vol 33 ◽

pp. 3697-3704

Author(s):

Bin Gu ◽

Zhouyuan Huo ◽

Heng Huang

Keyword(s):

Learning Community ◽

Learning Algorithms ◽

Metric Learning ◽

Computational Cost ◽

Gradient Algorithm ◽

Statistical Accuracy ◽

Pairwise Learning ◽

Auc Maximization ◽

Real World Datasets ◽

The Relationship

Pairwise learning is an important topic in the machine learning community, where the loss function involves pairs of samples (e.g., AUC maximization and metric learning). Existing pairwise learning algorithms do not achieve generality, scalability and efficiency simultaneously. To address this problem, in this paper, we first analyze the relationship between the statistical accuracy and the regularized empirical risk for pairwise loss. Based on this relationship, we propose a scalable and efficient adaptive doubly stochastic gradient algorithm (AdaDSG) for generalized regularized pairwise learning problems. More importantly, we prove that the overall computational cost of AdaDSG is O(n) to achieve the statistical accuracy on the full training set of size n, which is, to the best of our knowledge, the best theoretical result for pairwise learning. The experimental results on a variety of real-world datasets not only confirm the effectiveness of our AdaDSG algorithm, but also show that AdaDSG has significantly better scalability and efficiency than existing pairwise learning algorithms.
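
To make the idea of a pairwise loss concrete, here is a minimal sketch of stochastic gradient descent on a regularized pairwise objective for AUC maximization. It is not the authors' AdaDSG: the logistic surrogate loss, the 1/sqrt(t) step size, and the names `pairwise_sgd_auc` and `auc` are assumptions chosen for illustration. Each step samples one positive and one negative example, so the gradient is stochastic in both members of the pair.

```python
import numpy as np

def pairwise_sgd_auc(X, y, lam=1e-2, steps=2000, lr=0.1, seed=0):
    """Plain pairwise SGD (illustrative, not AdaDSG) for a linear scorer w."""
    rng = np.random.default_rng(seed)
    pos = X[y == 1]
    neg = X[y == 0]
    w = np.zeros(X.shape[1])
    for t in range(1, steps + 1):
        # Sample one positive and one negative example: a random pair.
        xp = pos[rng.integers(len(pos))]
        xn = neg[rng.integers(len(neg))]
        d = xp - xn
        margin = w @ d
        # Gradient of log(1 + exp(-margin)) + (lam / 2) * ||w||^2.
        grad = -d / (1.0 + np.exp(margin)) + lam * w
        w -= (lr / np.sqrt(t)) * grad
    return w

def auc(scores, y):
    """Empirical AUC: fraction of (positive, negative) pairs ranked correctly."""
    pos = scores[y == 1]
    neg = scores[y == 0]
    diff = pos[:, None] - neg[None, :]
    return (diff > 0).mean() + 0.5 * (diff == 0).mean()
```

Calling `auc(X @ w, y)` on the returned weights gives the empirical AUC of the linear scorer. Note that `auc` enumerates all positive-negative pairs, which is the quadratic cost that sampling-based pairwise methods aim to avoid during training.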

Pairwise Learning with Differential Privacy Guarantees

Proceedings of the AAAI Conference on Artificial Intelligence ◽

10.1609/aaai.v34i01.5411 ◽

2020 ◽

Vol 34 (01) ◽

pp. 694-701

Author(s):

Mengdi Huai ◽

Di Wang ◽

Chenglin Miao ◽

Jinhui Xu ◽

Aidong Zhang

Keyword(s):

Differential Privacy ◽

Metric Learning ◽

Loss Functions ◽

Sensitive Information ◽

Training Set ◽

Pairwise Learning ◽

Auc Maximization ◽

General Convex ◽

Convex Loss ◽

Real World Datasets

Pairwise learning has received much attention recently as it is more capable of modeling the relative relationship between pairs of samples. Many machine learning tasks can be categorized as pairwise learning, such as AUC maximization and metric learning. Existing techniques for pairwise learning all fail to take into consideration a critical issue in their design, i.e., the protection of sensitive information in the training set. Models learned by such algorithms can implicitly memorize the details of sensitive information, which gives malicious parties an opportunity to infer it from the learned models. To address this challenging issue, in this paper, we propose several differentially private pairwise learning algorithms for both online and offline settings. Specifically, for the online setting, we first introduce a differentially private algorithm (called OnPairStrC) for strongly convex loss functions. Then, we extend this algorithm to general convex loss functions and give another differentially private algorithm (called OnPairC). For the offline setting, we also present two differentially private algorithms (called OffPairStrC and OffPairC) for strongly convex and general convex loss functions, respectively. These proposed algorithms can not only learn the model effectively from the data but also provide a strong privacy protection guarantee for sensitive information in the training set. Extensive experiments on real-world datasets are conducted to evaluate the proposed algorithms, and the experimental results support our theoretical analysis.

Generalization Bounds for Regularized Pairwise Learning

Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2018/329 ◽

2018 ◽

Author(s):

Yunwen Lei ◽

Shao-Bo Lin ◽

Ke Tang

Keyword(s):

State Of The Art ◽

Metric Learning ◽

Distance Metric Learning ◽

Generalization Error ◽

Unified Framework ◽

Generalization Bounds ◽

Learning Tasks ◽

Pairwise Learning ◽

Auc Maximization ◽

Learning Schemes

Pairwise learning refers to learning tasks with the associated loss functions depending on pairs of examples. Recently, pairwise learning has received increasing attention since it covers many machine learning schemes, e.g., metric learning, ranking and AUC maximization, in a unified framework. In this paper, we establish a unified generalization error bound for regularized pairwise learning without either Bernstein conditions or capacity assumptions. We apply this general result to typical learning tasks including distance metric learning and ranking, for each of which our discussion is able to improve the state-of-the-art results.

Quadruply Stochastic Gradients for Large Scale Nonlinear Semi-Supervised AUC Optimization

Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence ◽

10.24963/ijcai.2019/474 ◽

2019 ◽

Cited By ~ 3

Author(s):

Wanli Shi ◽

Bin Gu ◽

Xiang Li ◽

Xiang Geng ◽

Heng Huang

Keyword(s):

Learning Community ◽

Supervised Learning ◽

Large Scale ◽

Optimal Solution ◽

Stochastic Gradient ◽

Gradient Algorithm ◽

Maximization Problem ◽

Classification Problems ◽

Auc Maximization ◽

Auc Optimization

Semi-supervised learning is pervasive in real-world applications, where only a few labeled data are available and large amounts of instances remain unlabeled. Since AUC is an important model evaluation metric in classification, directly optimizing AUC in the semi-supervised learning scenario has drawn much attention in the machine learning community. Recently, it has been shown that one could find an unbiased solution for the semi-supervised AUC maximization problem without knowing the class prior distribution. However, this method is hardly scalable for nonlinear classification problems with kernels. To address this problem, in this paper, we propose a novel scalable quadruply stochastic gradient algorithm (QSG-S2AUC) for nonlinear semi-supervised AUC optimization. In each iteration of the stochastic optimization process, our method randomly samples a positive instance, a negative instance, an unlabeled instance and their random features to compute the gradient, and then updates the model using this quadruply stochastic gradient to approach the optimal solution. More importantly, we prove that QSG-S2AUC can converge to the optimal solution at a rate of O(1/t), where t is the iteration number. Extensive experimental results on a variety of benchmark datasets show that QSG-S2AUC is far more efficient than the existing state-of-the-art algorithms for semi-supervised AUC maximization, while retaining similar generalization performance.

Stability and optimization error of stochastic gradient descent for pairwise learning

Analysis and Applications ◽

10.1142/s0219530519400062 ◽

2019 ◽

Vol 18 (05) ◽

pp. 887-927

Author(s):

Wei Shen ◽

Zhenhuan Yang ◽

Yiming Ying ◽

Xiaoming Yuan

Keyword(s):

Gradient Descent ◽

Metric Learning ◽

Stochastic Gradient ◽

Stochastic Gradient Descent ◽

Trade Off ◽

Lower Bounding ◽

Pairwise Learning ◽

Auc Maximization ◽

The Stability ◽

Stability Results

In this paper, we study the stability and its trade-off with optimization error for stochastic gradient descent (SGD) algorithms in the pairwise learning setting. Pairwise learning refers to a learning task which involves a loss function depending on pairs of instances, notable examples of which are bipartite ranking, metric learning, area under the ROC curve (AUC) maximization and the minimum error entropy (MEE) principle. Our contribution is twofold. Firstly, we establish the stability results for SGD for pairwise learning in the convex, strongly convex and non-convex settings, from which generalization errors can be naturally derived. Secondly, we establish the trade-off between stability and optimization error of SGD algorithms for pairwise learning. This is achieved by lower-bounding the sum of stability and optimization error by the minimax statistical error over a prescribed class of pairwise loss functions. From this fundamental trade-off, we obtain lower bounds for the optimization error of SGD algorithms and the excess expected risk over a class of pairwise losses. In addition, we illustrate our stability results by giving some specific examples of AUC maximization, metric learning and MEE.

An Analysis of the Relationship between Teacher Learning Community and Democratic, Collaborative School Culture

Korean Association For Learner-Centered Curriculum And Instruction ◽

10.22251/jlcci.2019.19.2.623 ◽

2019 ◽

Vol 19 (2) ◽

pp. 623-639

Author(s):

Soojung Park ◽

Xiaofei Fang

Keyword(s):

School Culture ◽

Learning Community ◽

Teacher Learning ◽

The Relationship ◽

Teacher Learning Community

Algorithmic and human prediction of success in human collaboration from visual features

Scientific Reports ◽

10.1038/s41598-021-81145-3 ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Martin Saveski ◽

Edmond Awad ◽

Iyad Rahwan ◽

Manuel Cebrian

Keyword(s):

Machine Learning ◽

Visual Cues ◽

Success Factors ◽

Group Performance ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Adventure Game ◽

Group Success ◽

The Relationship ◽

Better Than

As groups are increasingly taking over individual experts in many tasks, it is ever more important to understand the determinants of group success. In this paper, we study the patterns of group success in Escape The Room, a physical adventure game in which a group is tasked with escaping a maze by collectively solving a series of puzzles. We investigate (1) the characteristics of successful groups, and (2) how accurately humans and machines can spot them from a group photo. The relationship between these two questions is based on the hypothesis that the characteristics of successful groups are encoded by features that can be spotted in their photo. We analyze >43K group photos (one photo per group) taken after groups have completed the game, from which all explicit performance-signaling information has been removed. First, we find that groups that are larger, older and more gender but less age diverse are significantly more likely to escape. Second, we compare humans and off-the-shelf machine learning algorithms at predicting whether a group escaped or not based on the completion photo. We find that individual guesses by humans achieve 58.3% accuracy, better than random, but worse than machines, which display 71.6% accuracy. When humans are trained to guess by observing only four labeled photos, their accuracy increases to 64%. However, training humans on more labeled examples (eight or twelve) leads to a slight, but statistically insignificant, improvement in accuracy (67.4%). Humans in the best training condition perform on par with two, but worse than three, out of the five machine learning algorithms we evaluated. Our work illustrates the potential and the limitations of machine learning systems in evaluating group performance and identifying success factors based on sparse visual cues.

Deep autoencoder-based community detection in complex networks with particle swarm optimization and continuation algorithms

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-201342 ◽

2021 ◽

pp. 1-17

Author(s):

Mohammed Al-Andoli ◽

Wooi Ping Cheah ◽

Shing Chiang Tan

Keyword(s):

Particle Swarm Optimization ◽

Complex Networks ◽

Learning Community ◽

Community Detection ◽

Particle Swarm ◽

Premature Convergence ◽

Swarm Optimization ◽

Detection Algorithms ◽

Real World Datasets ◽

The Cost

Detecting communities is an important multidisciplinary research discipline and is considered vital to understand the structure of complex networks. Deep autoencoders have been successfully proposed to solve the problem of community detection. However, existing models in the literature are trained based on gradient descent optimization with the backpropagation algorithm, which is known to converge to local minima and prove inefficient, especially in big data scenarios. To tackle these drawbacks, this work proposed a novel deep autoencoder with Particle Swarm Optimization (PSO) and continuation algorithms to reveal community structures in complex networks. The PSO and continuation algorithms were utilized to avoid the local minimum and premature convergence, and to reduce overall training execution time. Two objective functions were also employed in the proposed model: minimizing the cost function of the autoencoder, and maximizing the modularity function, which refers to the quality of the detected communities. This work also proposed other methods to work in the absence of continuation, and to enable premature convergence. Extensive empirical experiments on 11 publicly available real-world datasets demonstrated that the proposed method is effective and promising for deriving communities in complex networks, and that it outperforms state-of-the-art deep learning community detection algorithms.
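
The modularity function mentioned above can be illustrated with a short sketch. The abstract does not give its exact form, so this assumes the standard Newman modularity for an undirected, unweighted graph, Q = (1/2m) Σ_ij (A_ij − k_i k_j / 2m) δ(c_i, c_j); the function name `modularity` and its arguments are hypothetical.

```python
import numpy as np

def modularity(A, labels):
    """Newman modularity of a partition (assumed form, not taken from the paper).

    A      : symmetric adjacency matrix of an undirected graph
    labels : community label of each node
    """
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    k = A.sum(axis=1)                      # node degrees
    two_m = k.sum()                        # twice the number of edges
    same = labels[:, None] == labels[None, :]
    B = A - np.outer(k, k) / two_m         # observed minus expected edges
    return (B * same).sum() / two_m
```

Higher values indicate that more edges fall inside communities than a degree-preserving random graph would predict; placing every node in a single community gives a modularity of zero.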

Factor-Bounded Nonnegative Matrix Factorization

ACM Transactions on Knowledge Discovery from Data ◽

10.1145/3451395 ◽

2021 ◽

Vol 15 (6) ◽

pp. 1-18

Author(s):

Kai Liu ◽

Xiangyu Li ◽

Zhihui Zhu ◽

Lodewijk Brand ◽

Hua Wang

Keyword(s):

Matrix Factorization ◽

Clustering Algorithm ◽

Nonnegative Matrix Factorization ◽

Nonnegative Matrix ◽

Optimization Methods ◽

Auxiliary Function ◽

Image Clustering ◽

Real World Datasets ◽

The Relationship ◽

Matrix Factors

Nonnegative Matrix Factorization (NMF) is broadly used to determine class membership in a variety of clustering applications. From movie recommendations and image clustering to visual feature extraction, NMF has been applied to a large number of knowledge discovery and data mining problems. Traditional optimization methods, such as the Multiplicative Updating Algorithm (MUA), solve the NMF problem by utilizing an auxiliary function to ensure that the objective monotonically decreases. Although the objective in MUA converges, there exists no proof to show that the learned matrix factors converge as well. Without this rigorous analysis, the clustering performance and stability of the NMF algorithms cannot be guaranteed. To address this knowledge gap, in this article, we study the factor-bounded NMF problem and provide a solution algorithm with proven convergence by rigorous mathematical analysis, which ensures that both the objective and matrix factors converge. In addition, we show the relationship between MUA and our solution, followed by an analysis of the convergence of MUA. Experiments on both toy data and real-world datasets validate the correctness of our proposed method and its utility as an effective clustering algorithm.
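
As a point of reference for the MUA baseline discussed above, the following is a minimal sketch of the classic multiplicative updates for the Frobenius objective ||V − WH||². It is not the factor-bounded algorithm of the article; the function name `nmf_mua`, the random initialization, and the small constant `eps` guarding against division by zero are assumptions.

```python
import numpy as np

def nmf_mua(V, rank, iters=200, eps=1e-10, seed=0):
    """Classic multiplicative updates for V ~ W @ H with W, H >= 0 (illustrative)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    history = []
    for _ in range(iters):
        # Elementwise updates keep both factors nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        history.append(np.linalg.norm(V - W @ H) ** 2)
    return W, H, history
```

The recorded `history` makes it easy to check that the objective does not increase from one iteration to the next, which is the property the abstract attributes to MUA. The values in `W` and `H` themselves are not tracked here, and their convergence is the separate question the article addresses.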

Chi-Squared Distance Metric Learning for Histogram Data

Mathematical Problems in Engineering ◽

10.1155/2015/352849 ◽

2015 ◽

Vol 2015 ◽

pp. 1-12 ◽

Cited By ~ 2

Author(s):

Wei Yang ◽

Luhui Xu ◽

Xiaopan Chen ◽

Fengbin Zheng ◽

Yang Liu

Keyword(s):

Nearest Neighbor ◽

State Of The Art ◽

Metric Learning ◽

Nearest Neighbors ◽

Distance Metric Learning ◽

Distance Metric ◽

Projected Gradient Method ◽

Proper Distance ◽

Chi Squared ◽

Real World Datasets

Learning a proper distance metric for histogram data plays a crucial role in many computer vision tasks. The chi-squared distance is a nonlinear metric and is widely used to compare histograms. In this paper, we show how to learn a general form of chi-squared distance based on the nearest neighbor model. In our method, the margin of a sample is first defined with respect to the nearest hits (nearest neighbors from the same class) and the nearest misses (nearest neighbors from different classes), and then the simplex-preserving linear transformation is trained by maximizing the margin while minimizing the distance between each sample and its nearest hits. With the iterative projected gradient method for optimization, we naturally introduce the ℓ2,1 norm regularization into the proposed method for sparse metric learning. Comparative studies with the state-of-the-art approaches on five real-world datasets verify the effectiveness of the proposed method.
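
For readers unfamiliar with the chi-squared distance, the sketch below computes it between two histograms. It uses one common convention, χ²(p, q) = ½ Σ_i (p_i − q_i)² / (p_i + q_i), which is an assumption here since the abstract does not state the formula; it is the plain distance, not the learned general form proposed in the paper. The function name `chi2_distance` is hypothetical.

```python
import numpy as np

def chi2_distance(p, q, eps=1e-12):
    """Chi-squared distance between two nonnegative histograms (assumed convention)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    denom = p + q
    mask = denom > eps          # skip bins that are empty in both histograms
    return 0.5 * np.sum((p[mask] - q[mask]) ** 2 / denom[mask])
```

With this convention, two normalized histograms with disjoint support are at distance 1, and identical histograms are at distance 0.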

Heterogeneous Influence Maximization Through Community Detection in Social Networks

International Journal of Ambient Computing and Intelligence ◽

10.4018/ijaci.2021100107 ◽

2021 ◽

Vol 12 (4) ◽

pp. 118-131

Author(s):

Jaya Krishna Raguru ◽

Devi Prasad Sharma

Keyword(s):

Community Detection ◽

Greedy Algorithms ◽

Computational Cost ◽

Optimal Solution ◽

Influence Maximization ◽

Centrality Measures ◽

Influence Spread ◽

Real World Datasets ◽

Initial Seed ◽

High Computational Cost

The problem of identifying a seed set composed of K nodes that maximizes influence spread over a social network is known as influence maximization (IM). Past works showed this problem to be NP-hard, and solutions obtained with greedy algorithms achieved only 63% of the optimal spread. Moreover, this approach is expensive and suffers from performance issues such as high computational cost. Furthermore, in a network with communities, IM spread is not always certain. In this paper, a heterogeneous influence maximization through community detection (HIMCD) algorithm is proposed. This approach addresses initial seed node selection in communities using various centrality measures, and these seed nodes act as sources for influence spread. A parallel influence maximization is applied with the aid of the seed node set contained in each group. In this approach, the graph is partitioned and IM computations are done in a distributed manner. Extensive experiments with two real-world datasets reveal that HIMCD achieves substantial performance improvement over state-of-the-art techniques.
