Tackling challenges of neural purchase stage identification from imbalanced twitter data

2019 ◽  
Vol 26 (4) ◽  
pp. 383-411
Author(s):  
Heike Adel ◽  
Francine Chen ◽  
Yan-Ying Chen

Abstract. Twitter and other social media platforms are often used for sharing interest in products. The identification of purchase decision stages, such as in the AIDA model (Awareness, Interest, Desire, and Action), can enable more personalized e-commerce services and a finer-grained targeting of advertisements than predicting purchase intent only. In this paper, we propose and analyze neural models for identifying the purchase stage of single tweets in a user’s tweet sequence. In particular, we identify three challenges of purchase stage identification: imbalanced label distribution with a high number of non-purchase-stage instances, limited amount of training data, and domain adaptation with no or only little target domain data. Our experiments reveal that the imbalanced label distribution is the main challenge for our models. We address it with ranking loss and perform detailed investigations of the performance of our models on the different output classes. In order to improve the generalization of the models and augment the limited amount of training data, we examine the use of sentiment analysis as a complementary, secondary task in a multitask framework. For applying our models to tweets from another product domain, we consider two scenarios: for the first scenario without any labeled data in the target product domain, we show that learning domain-invariant representations with adversarial training is most promising, while for the second scenario with a small number of labeled target examples, fine-tuning the source model weights performs best. Finally, we conduct several analyses, including extracting attention weights and representative phrases for the different purchase stages. The results suggest that the model is learning features indicative of purchase stages and that the confusion errors are sensible.
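The ranking loss mentioned in the abstract above can be illustrated with a margin-based pairwise form. This is only a sketch: the abstract does not say which variant the authors use, so the function name `ranking_loss`, the margins `m_pos` and `m_neg`, the scale `gamma`, and the handling of the non-purchase-stage class (`gold=None`, for which only the negative term is computed) are all assumptions made for illustration.

```python
import math
from typing import Optional, Sequence


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def ranking_loss(scores: Sequence[float], gold: Optional[int],
                 gamma: float = 2.0, m_pos: float = 2.5,
                 m_neg: float = 0.5) -> float:
    """Margin-based ranking loss for one example (illustrative values).

    scores: one score per purchase-stage class (at least two classes).
    gold:   index of the correct class, or None for a tweet that
            belongs to no purchase stage (assumed convention).
    """
    # Scores of all classes other than the gold one; with gold=None
    # every class counts as "wrong".
    wrong = [s for i, s in enumerate(scores) if i != gold]
    # Push the best-scoring wrong class below -m_neg.
    loss = softplus(gamma * (m_neg + max(wrong)))
    if gold is not None:
        # Push the gold class score above m_pos.
        loss += softplus(gamma * (m_pos - scores[gold]))
    return loss


scores = [3.0, -1.0, -2.0]
loss_correct = ranking_loss(scores, gold=0)  # gold class already scored highest
loss_wrong = ranking_loss(scores, gold=1)    # gold class scored low
```

Under this convention, a non-purchase-stage tweet is penalized only when some purchase-stage class scores high, so the dominant class needs no score of its own. That is one plausible reason such a loss suits imbalanced labels, not a claim about the authors' reasoning.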

Author(s):  
Jianqun Zhang ◽  
Qing Zhang ◽  
Xianrong Qin ◽  
Yuantao Sun

To identify rolling bearing faults under variable load conditions, this paper proposes a method named DISA-KNN, which follows a three-step strategy of feature extraction, domain adaptation, and classification. Specifically, time-domain and frequency-domain indicators are used for feature extraction. Discriminative and domain invariant subspace alignment (DISA) is used to minimize the distribution discrepancy between the training data (source domain) and the testing data (target domain). K-nearest neighbor (KNN) is then applied to identify rolling bearing faults. DISA-KNN is validated on experimental signals collected under different load conditions. The identification accuracies obtained by the DISA-KNN method exceed 90% on four datasets, including one dataset with 99.5% accuracy. The strength of the proposed method is further highlighted by comparisons with eight other methods. These results suggest that the proposed method is promising for rolling bearing fault diagnosis in real rotating machinery.
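The feature extraction step can be pictured with a few standard time-domain indicators. The abstract does not list which indicators the authors compute, so the selection here (RMS, peak, crest factor, kurtosis) and the function name `time_domain_features` are assumptions; the sketch covers only feature extraction, not DISA or KNN.

```python
import math
from statistics import fmean


def time_domain_features(signal):
    """Common time-domain indicators of a vibration signal.

    Assumes a non-constant signal, so RMS and variance are non-zero.
    """
    mean = fmean(signal)
    rms = math.sqrt(fmean([x * x for x in signal]))
    peak = max(abs(x) for x in signal)
    # Central moments used for the (non-excess) kurtosis.
    m2 = fmean([(x - mean) ** 2 for x in signal])
    m4 = fmean([(x - mean) ** 4 for x in signal])
    return {
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms,
        "kurtosis": m4 / (m2 ** 2),
    }


features = time_domain_features([1.0, -1.0, 1.0, -1.0])
```

Impulsive bearing faults tend to raise kurtosis and crest factor, which is why indicators of this kind are often fed into a later classifier.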


Author(s):  
Haidi Hasan Badr ◽  
Nayer Mahmoud Wanas ◽  
Magda Fayek

Since labeled data availability differs greatly across domains, Domain Adaptation focuses on learning in new and unfamiliar domains by reducing distribution divergence. Recent research suggests that the adversarial learning approach could be a promising way to achieve the domain adaptation objective. Adversarial learning is a strategy for learning domain-transferable features in robust deep networks. This paper introduces the TSAL paradigm, a two-step adversarial learning framework. It addresses the real-world problem of text classification, where the source domain(s) have labeled data but the target domain(s) have only unlabeled data. TSAL utilizes joint adversarial learning with class information and a domain alignment deep network architecture to learn both domain-invariant and domain-specific feature extractors. It consists of two training steps that are similar to the fine-tuning paradigm, in which pre-trained model weights are used as initialization for training with new data. TSAL’s two training phases, however, are based on the same data, not different data, as is the case with fine-tuning. Furthermore, TSAL only uses the learned domain-invariant feature extractor from the first training as an initialization for its peer in subsequent training. By doubling the training, TSAL can emphasize the leverage of the small unlabeled target domain and learn effectively what to share between various domains. A detailed analysis of many benchmark datasets reveals that our model consistently outperforms the prior art across a wide range of dataset distributions.


2021 ◽  
Author(s):  
Jiahao Fan ◽  
Hangyu Zhu ◽  
Xinyu Jiang ◽  
Long Meng ◽  
Cong Fu ◽  
...  

Deep sleep staging networks have reached top performance on large-scale datasets. However, these models perform worse when trained and tested on small sleep cohorts due to data inefficiency. Transferring well-trained models from large-scale datasets (source domain) to small sleep cohorts (target domain) is a promising solution but remains challenging due to the domain-shift issue. In this work, an unsupervised domain adaptation approach, domain statistics alignment (DSA), is developed to bridge the gap between the data distributions of the source and target domains. DSA adapts the source models to the target domain by modulating the domain-specific statistics of deep features stored in the Batch Normalization (BN) layers. Furthermore, we have extended DSA by introducing cross-domain statistics in each BN layer to perform DSA adaptively (AdaDSA). The proposed methods need only the well-trained source model, without access to the source data, which may be proprietary and inaccessible. DSA and AdaDSA are universally applicable to various deep sleep staging networks that have BN layers. We have validated the proposed methods by extensive experiments on two state-of-the-art deep sleep staging networks, DeepSleepNet+ and U-time. The performance was evaluated by conducting various transfer tasks on six sleep databases, with two large-scale databases, MASS and SHHS, as the source domain and four small sleep databases as the target domain. Among these, clinical sleep records acquired at Huashan Hospital, Shanghai, were used. The results show that both DSA and AdaDSA could significantly improve the performance of source models on target domains, providing novel insights into the domain generalization problem in sleep staging tasks.
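The idea of adapting a model through its BN statistics can be sketched in its simplest form: re-estimate the per-channel mean and variance on target-domain features and use them in place of the stored source statistics. This is not necessarily the authors' DSA or AdaDSA procedure, whose details the abstract does not give; the helper names `channel_stats` and `batch_norm` and all numbers are hypothetical.

```python
from statistics import fmean, pvariance


def channel_stats(batch):
    """Per-channel mean and population variance over a batch of feature vectors."""
    channels = list(zip(*batch))
    return [fmean(c) for c in channels], [pvariance(c) for c in channels]


def batch_norm(batch, mean, var, gamma=1.0, beta=0.0, eps=1e-5):
    """Inference-time batch normalization with fixed statistics."""
    return [
        [gamma * (x - m) / (v + eps) ** 0.5 + beta
         for x, m, v in zip(row, mean, var)]
        for row in batch
    ]


# Statistics stored in the BN layer of the source model (made-up values).
source_mean, source_var = [0.0, 0.0], [1.0, 1.0]

# Deep features of a target-domain batch, shifted relative to the source.
target_batch = [[4.0, 10.0], [6.0, 14.0], [5.0, 12.0], [5.0, 12.0]]

# Without adaptation, the source statistics leave the features off-center.
unadapted = batch_norm(target_batch, source_mean, source_var)

# With adaptation, target statistics replace the stored source statistics.
target_mean, target_var = channel_stats(target_batch)
adapted = batch_norm(target_batch, target_mean, target_var)
```

Only the statistics change here; the learned weights, including `gamma` and `beta`, stay as trained on the source data, which is consistent with the abstract's point that no source data is needed at adaptation time.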


Author(s):  
D. Gritzner ◽  
J. Ostermann

Abstract. Modern machine learning, especially deep learning, which is used in a variety of applications, requires a lot of labelled data for model training. Having an insufficient amount of training examples leads to models which do not generalize well to new input instances. This is a particularly significant problem for tasks involving aerial images: often training data is only available for a limited geographical area and a narrow time window, thus leading to models which perform poorly in different regions, at different times of day, or during different seasons. Domain adaptation can mitigate this issue by using labelled source domain training examples and unlabelled target domain images to train a model which performs well on both domains. Modern adversarial domain adaptation approaches use unpaired data. We propose using pairs of semantically similar images, i.e., images whose segmentations are accurate predictions of each other, for improved model performance. In this paper we show that, as an upper limit based on ground truth, using semantically paired aerial images during training almost always increases model performance, with an average improvement of 4.2% accuracy and .036 mean intersection-over-union (mIoU). Using a practical estimate of semantic similarity, we still achieve improvements in more than half of all cases, with average improvements of 2.5% accuracy and .017 mIoU in those cases.


2020 ◽  
Vol 34 (07) ◽  
pp. 12975-12983
Author(s):  
Sicheng Zhao ◽  
Guangzhi Wang ◽  
Shanghang Zhang ◽  
Yang Gu ◽  
Yaxian Li ◽  
...  

Deep neural networks suffer from performance decay when there is domain shift between the labeled source domain and unlabeled target domain, which motivates the research on domain adaptation (DA). Conventional DA methods usually assume that the labeled data is sampled from a single source distribution. However, in practice, labeled data may be collected from multiple sources, while naive application of the single-source DA algorithms may lead to suboptimal solutions. In this paper, we propose a novel multi-source distilling domain adaptation (MDDA) network, which not only considers the different distances among multiple sources and the target, but also investigates the different similarities of the source samples to the target ones. Specifically, the proposed MDDA includes four stages: (1) pre-train the source classifiers separately using the training data from each source; (2) adversarially map the target into the feature space of each source respectively by minimizing the empirical Wasserstein distance between source and target; (3) select the source training samples that are closer to the target to fine-tune the source classifiers; and (4) classify each encoded target feature by corresponding source classifier, and aggregate different predictions using respective domain weight, which corresponds to the discrepancy between each source and target. Extensive experiments are conducted on public DA benchmarks, and the results demonstrate that the proposed MDDA significantly outperforms the state-of-the-art approaches. Our source code is released at: https://github.com/daoyuan98/MDDA.
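Stage (4) of the abstract above, aggregating per-source predictions with domain weights, could look like the following sketch. The abstract only says that each weight corresponds to the discrepancy between a source and the target, so the softmax-over-negative-distance weighting, the `temperature` parameter, the function names, and the example numbers are all assumptions, not the formula used in MDDA.

```python
import math


def domain_weights(distances, temperature=1.0):
    """Turn source-target discrepancies into weights that sum to 1.

    Smaller discrepancy gives a larger weight (assumed mapping).
    """
    scores = [-d / temperature for d in distances]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(predictions, weights):
    """Weighted average of class-probability vectors, one per source classifier."""
    n_classes = len(predictions[0])
    return [
        sum(w * p[c] for w, p in zip(weights, predictions))
        for c in range(n_classes)
    ]


# Hypothetical estimated discrepancies between two sources and the target.
distances = [0.2, 1.5]
# Class probabilities for one target sample from each source classifier.
predictions = [[0.9, 0.1], [0.3, 0.7]]

weights = domain_weights(distances)
combined = aggregate(predictions, weights)
```

Here the first source is closer to the target, so its classifier dominates the combined prediction.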


Author(s):  
Alejandro Moreo Fernández ◽  
Andrea Esuli ◽  
Fabrizio Sebastiani

Domain Adaptation (DA) techniques aim at enabling machine learning methods to learn effective classifiers for a “target” domain when the only available training data belongs to a different “source” domain. In this extended abstract, we briefly describe our new DA method called Distributional Correspondence Indexing (DCI) for sentiment classification. DCI derives term representations in a vector space common to both domains where each dimension reflects its distributional correspondence to a pivot, i.e., to a highly predictive term that behaves similarly across domains. The experiments we have conducted show that DCI obtains better performance than current state-of-the-art techniques for cross-lingual and cross-domain sentiment classification.


Author(s):  
A. Paul ◽  
F. Rottensteiner ◽  
C. Heipke

Domain adaptation techniques in transfer learning try to reduce the amount of training data required for classification by adapting a classifier trained on samples from a source domain to a new data set (target domain) where the features may have different distributions. In this paper, we propose a new technique for domain adaptation based on logistic regression. Starting with a classifier trained on training data from the source domain, we iteratively include target domain samples for which class labels have been obtained from the current state of the classifier, while at the same time removing source domain samples. In each iteration the classifier is re-trained, so that the decision boundaries are slowly transferred to the distribution of the target features. To make the transfer procedure more robust we introduce weights as a function of distance from the decision boundary and a new way of regularisation. Our methodology is evaluated using a benchmark data set consisting of aerial images and digital surface models. The experimental results show that in the majority of cases our domain adaptation approach can lead to an improvement of the classification accuracy without additional training data, but also indicate remaining problems if the difference in the feature distributions becomes too large.


Author(s):  
D. Wittich ◽  
F. Rottensteiner

Abstract. Domain adaptation (DA) can drastically decrease the amount of training data needed to obtain good classification models by leveraging available data from a source domain for the classification of a new (target) domain. In this paper, we address deep DA, i.e. DA with deep convolutional neural networks (CNN), a problem that has not been addressed frequently in remote sensing. We present a new method for semi-supervised DA for the task of pixel-based classification by a CNN. After proposing an encoder-decoder-based fully convolutional neural network (FCN), we adapt a method for adversarial discriminative DA to be applicable to the pixel-based classification of remotely sensed data based on this network. It tries to learn a feature representation that is domain invariant; domain-invariance is measured by a classifier’s incapability of predicting from which domain a sample was generated. We evaluate our FCN on the ISPRS labelling challenge, showing that it is close to the best-performing models. DA is evaluated on the basis of three domains. We compare different network configurations and perform the representation transfer at different layers of the network. We show that when using a proper layer for adaptation, our method achieves a positive transfer and thus an improved classification accuracy in the target domain for all evaluated combinations of source and target domains.


2021 ◽  
Vol 2021 ◽  
pp. 1-21
Author(s):  
Fei Dong ◽  
Xiao Yu ◽  
Xinguo Shi ◽  
Ke Liu ◽  
Zhaoli Wu ◽  
...  

In actual industrial scenarios, most existing fault diagnosis approaches face two challenges: insufficient labeled training data and distribution divergence between training and testing datasets. To address these issues, this article proposes a new transferable fault diagnosis approach for rotating machinery based on a deep autoencoder and dominant feature selection. First, the maximal overlap discrete wavelet packet transform is applied for signal processing and mixed-domain statistical feature extraction. Second, dominant feature selection by importance score and differences between domains is proposed to select dominant features with high fault-discriminative ability and domain invariance. Then, the selected dominant features are used for pretraining a deep autoencoder (source model), which helps enhance the fault representation ability of deep features. The parameters of the source model are transferred to the target model, and normal state features from the target domain are adopted for fine-tuning the target model. Finally, the target model is applied for fault pattern classification. Motor and bearing fault datasets are used for a series of experiments, and the results verify that the proposed methods have better cross-domain diagnosis performance than comparative models.


2020 ◽  
Vol 12 (7) ◽  
pp. 1099 ◽  
Author(s):  
Ahram Song ◽  
Yongil Kim

Change detection (CD) networks based on supervised learning have been used in diverse CD tasks. However, such supervised CD networks require a large amount of data and only use information from current images. In addition, it is time consuming to manually acquire the ground truth data for newly obtained images. Here, we propose a novel method for CD when training data are lacking for an area that lies near another area with available ground truth data. The proposed method entails automatically generating training data and fine-tuning the CD network. To detect changes in target images without ground truth data, the difference images were generated using a spectral similarity measure, and the training data were selected via fuzzy c-means clustering. Recurrent fully convolutional networks with multiscale three-dimensional filters were used to extract objects of various sizes from unmanned aerial vehicle (UAV) images. The CD network was pre-trained on labeled source domain data; then, the network was fine-tuned on target images using generated training data. Two further CD networks were trained with a combined weighted loss function. The training data in the target domain were iteratively updated using the prediction map of the CD network. Experiments on two hyperspectral UAV datasets confirmed that the proposed method is capable of transferring change rules and improving CD results based on training data extracted in an unsupervised way.

