Phrase Embedding and Clustering for Sub-Feature Extraction from Online Data

2021 ◽  
pp. 1-11
Author(s):  
Seyoung Park ◽  
Harrison Kim

Abstract Recently, online user-generated data has been used as an efficient resource for customer analysis. In the product design area, various methods for analyzing customer preference for product features have been suggested. However, most of them focus on feature categories rather than product components, which are crucial in practical applications. To address that limitation, this paper proposes a new methodology for extracting part-level features from online data. First, the method detects phrases in the data and filters them using product manual documents. The filtered phrases are embedded into vectors and then divided into several groups by two clustering methods. The resulting clusters are labeled by analyzing the items in each cluster. Finally, cue phrases for sub-features are obtained by selecting clusters whose labels represent product features. The proposed methodology was tested on smartphone review data. The result provides feature clusters containing sub-feature phrases with high accuracy. The obtained cue phrases will be used in analyzing customer preferences for part-level features, which can help product designers determine the optimal component configuration in embodiment design.
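The first two steps of the abstract above (phrase detection, then filtering against product manuals) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how phrases are detected or filtered, so this sketch assumes that a phrase is a frequently repeated adjacent word pair and that filtering keeps phrases whose words all occur in the manual text. The function names, the `min_count` threshold, and the sample reviews and manual are hypothetical.

```python
from collections import Counter


def detect_phrases(reviews, min_count=2):
    """Return adjacent word pairs (bigrams) seen at least min_count times.

    Assumption: frequency-based bigram detection stands in for whatever
    phrase detector the paper actually uses.
    """
    counts = Counter()
    for review in reviews:
        tokens = review.lower().split()
        counts.update(zip(tokens, tokens[1:]))
    return {" ".join(pair) for pair, n in counts.items() if n >= min_count}


def filter_by_manual(phrases, manual_text):
    """Keep only phrases whose every word also appears in the manual text.

    Assumption: vocabulary overlap is used here as the filtering rule.
    """
    manual_vocab = set(manual_text.lower().split())
    return {p for p in phrases if all(w in manual_vocab for w in p.split())}


# Hypothetical toy data, not taken from the paper.
reviews = [
    "the battery life is great",
    "battery life could be better",
    "love the front camera",
    "the front camera is blurry",
    "is great value",
]
manual = "front camera rear camera battery life charging port"

candidates = detect_phrases(reviews)
cue_phrases = filter_by_manual(candidates, manual)
print(sorted(cue_phrases))
```

In this toy run, generic pairs such as "is great" are detected as frequent but dropped by the manual filter, while component-like pairs survive. The embedding and clustering steps are left out because the abstract does not name the models involved.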

Author(s):  
Seyoung Park ◽  
Harrison M. Kim

Abstract In product design, it is essential to understand customers’ preferences for product features. Traditional methods such as surveys and interviews are time-consuming and costly. As an alternative, research on utilizing online data for user analysis has been actively conducted. Although various methods have been proposed in this domain, most of them focus on the main features or usages of the product. From the manufacturer’s perspective, however, sub-features are as crucial as main features or usages, because the preference for sub-features is necessary for component configuration in actual product development. As a first step toward solving this problem, this paper proposes a methodology to extract and cluster sub-features by incorporating phrase embedding into the previous word embedding. The presented methodology also increases the accuracy and diversity of the clustering result by using X-means clustering as a noise filter and adopting spectral clustering.


2021 ◽  
Vol 1 ◽  
pp. 457-466
Author(s):  
Jinju Kim ◽  
Seyoung Park ◽  
Harrison Kim

Abstract The outbreak of the coronavirus disease not only caused many deaths worldwide but also severely affected the global economy through supply chain disruptions, plummeting demand, unemployment, and other effects. These social changes have led to changes in customers' purchasing patterns. Therefore, it is more important than ever for manufacturers to quickly identify and respond to changing customer purchasing patterns and requirements. However, few studies have examined dynamic changes in customer preferences for product features following the spread of COVID-19. This study aims to investigate the dynamic change of customer sentiment on product features following COVID-19 through sentiment analysis based on online reviews. The proposed methodology consists of two main processes: feature extraction and sentiment analysis. After a specific feature of the product is found through feature extraction, the words used to mention the feature in the review are analyzed to determine customer sentiment. To demonstrate the methodology, a case study is conducted using new and refurbished smartphone reviews to investigate the dynamic changes in customer sentiment during COVID-19.


2021 ◽  
Vol 13 (15) ◽  
pp. 2901
Author(s):  
Zhiqiang Zeng ◽  
Jinping Sun ◽  
Congan Xu ◽  
Haiyang Wang

Recently, deep learning (DL) has been successfully applied in automatic target recognition (ATR) tasks of synthetic aperture radar (SAR) images. However, limited by the lack of SAR image target datasets and the high cost of labeling, existing DL-based approaches can only accurately recognize targets that appear in the training dataset. Therefore, high-precision identification of unknown SAR targets in practical applications is one of the important capabilities that a SAR–ATR system should possess. To this end, we propose a novel DL-based identification method for unknown SAR targets with joint discrimination. First, a feature extraction network (FEN) trained on a limited dataset is used to extract the SAR target features, and the unknown targets are roughly separated from the known targets by computing the Kullback–Leibler divergence (KLD) of the target feature vectors. For the targets that cannot be distinguished by KLD, their feature vectors undergo t-distributed stochastic neighbor embedding (t-SNE) dimensionality reduction to calculate the relative position angle (RPA). Finally, the known and unknown targets are finely identified based on RPA. Experimental results on the MSTAR dataset demonstrate that the proposed method achieves higher identification accuracy for unknown SAR targets than existing methods while maintaining high recognition accuracy for known targets.
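The coarse KLD screening step can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how feature vectors are turned into probability distributions, so a softmax is assumed here, and the per-class template vectors, the `threshold` value, and all function names are hypothetical. The t-SNE and RPA stages are not reproduced.

```python
import math


def softmax(values):
    """Turn a raw feature vector into a probability distribution."""
    shift = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) for two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def is_unknown(feature, known_templates, threshold):
    """Flag a target as unknown if it is far, in KLD, from every known class.

    Assumption: one template feature vector per known class and a single
    fixed threshold; neither is specified in the abstract.
    """
    p = softmax(feature)
    nearest = min(kl_divergence(p, softmax(t)) for t in known_templates)
    return nearest > threshold


# Hypothetical templates for two known classes.
templates = [[4.0, 0.0, 0.0], [0.0, 4.0, 0.0]]
print(is_unknown([3.8, 0.1, 0.0], templates, threshold=0.5))
print(is_unknown([0.0, 0.0, 4.0], templates, threshold=0.5))
```

A feature vector close to a known template yields a divergence near zero and is treated as known, while one concentrated on an unseen direction yields a large divergence and is flagged for the finer stage.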


Data Mining ◽  
2013 ◽  
pp. 142-158
Author(s):  
Baoying Wang ◽  
Aijuan Dong

Clustering and outlier detection are important data mining areas. Online clustering and outlier detection generally work with continuous data streams generated at a rapid rate and have many practical applications, such as network intrusion detection and online fraud detection. This chapter first reviews the background of online clustering and outlier detection. Then, an incremental clustering and outlier detection method for market-basket data is proposed and presented in detail. The proposed method consists of two phases: weighted affinity measure clustering (WC clustering) and outlier detection. Specifically, given a data set, the WC clustering phase analyzes the data set and groups data items into clusters. The outlier detection phase then examines each newly arrived transaction against the item clusters formed in the WC clustering phase and determines whether the new transaction is an outlier. Periodically, the newly collected transactions are analyzed using WC clustering to produce an updated set of clusters, against which transactions that arrive afterwards are examined. The process is carried out continuously and incrementally. Finally, future research trends in online data mining are explored at the end of the chapter.
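The outlier detection phase described above, in which each new transaction is checked against existing item clusters, might look roughly like the sketch below. The chapter's weighted affinity measure is not given in the abstract, so this sketch substitutes a simple overlap rule: a transaction is an outlier when no single cluster contains at least a given fraction of its items. The function names, the `min_overlap` parameter, and the sample clusters are hypothetical.

```python
def best_cluster_overlap(transaction, item_clusters):
    """Largest fraction of the transaction's items found in one cluster."""
    items = set(transaction)
    if not items:
        return 0.0
    return max(
        (len(items & cluster) / len(items) for cluster in item_clusters),
        default=0.0,  # no clusters yet: nothing can match
    )


def is_outlier(transaction, item_clusters, min_overlap=0.5):
    """Assumed rule: outlier if no cluster covers min_overlap of the items."""
    return best_cluster_overlap(transaction, item_clusters) < min_overlap


# Hypothetical item clusters, standing in for the output of WC clustering.
item_clusters = [
    {"bread", "butter", "milk"},
    {"beer", "chips", "salsa"},
]

print(is_outlier(["bread", "milk"], item_clusters))
print(is_outlier(["bread", "beer", "diapers", "batteries"], item_clusters))
```

In an incremental setting, `item_clusters` would be replaced periodically with the result of re-clustering the newly collected transactions, and later arrivals would be checked against the updated set.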


2019 ◽  
Vol 9 (1) ◽  
pp. 390-396
Author(s):  
Marek Potkány ◽  
Monika Škultétyová

Abstract The purpose of this paper is to present the results of research into the customer preferences of potential buyers of a simple wood-based house for the purpose of using the Target Costing methodology. The opinions of 280 respondents were obtained through direct interviews at a specialized exhibition of furniture and timber constructions in Slovakia. The object of the research was a simple wood-based house, namely a weekend garden cottage made of northern spruce, at a target price of 9,320 €. The paper's contribution is the use of the Target Costing methodology in the wood-processing industry, defining the customer preference for a specific product and then using it for a functional cost analysis and for determining the target cost index of the evaluated components. The presented results can be used as an information base for make-or-buy decisions at the level of fixed purchase prices of individual components, or as the upper price limit for producing the components in-house.
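The target cost index mentioned above can be illustrated with the form commonly used in Target Costing, where a component's share of customer-rated importance is divided by its share of cost. The abstract does not state the exact formula or any component data, so the formula choice, the component names, and all numbers below are assumptions for illustration only.

```python
def target_cost_indices(importance, current_cost):
    """Index = importance share / cost share, per component.

    Assumption: this common Target Costing form is used. An index below 1
    suggests the component costs more than its importance justifies.
    """
    total_importance = sum(importance.values())
    total_cost = sum(current_cost.values())
    return {
        name: (importance[name] / total_importance)
        / (current_cost[name] / total_cost)
        for name in importance
    }


def allowable_costs(importance, target_cost):
    """Split a total target cost across components by importance share."""
    total_importance = sum(importance.values())
    return {
        name: target_cost * weight / total_importance
        for name, weight in importance.items()
    }


# Hypothetical components and figures, not taken from the paper.
importance = {"walls": 50, "roof": 30, "windows": 20}
current_cost = {"walls": 400.0, "roof": 400.0, "windows": 200.0}

indices = target_cost_indices(importance, current_cost)
limits = allowable_costs(importance, target_cost=900.0)
print(indices)
print(limits)
```

Under these made-up numbers the roof has an index below 1, which would mark it as a candidate for cost reduction, and the allowable costs act as per-component upper limits of the kind the abstract describes for make-or-buy decisions.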


2020 ◽  
Vol 10 (20) ◽  
pp. 7068
Author(s):  
Minh Tuan Pham ◽  
Jong-Myon Kim ◽  
Cheol Hong Kim

Recent convolutional neural network (CNN) models in image processing can be used as feature-extraction methods to achieve high accuracy as well as automatic processing in bearing fault diagnosis. The combination of deep learning methods with appropriate signal representation techniques has proven its efficiency compared with traditional algorithms. Vital electrical machines require a strict monitoring system, and the accuracy of these machines’ monitoring systems takes precedence over any other factors. In this paper, we propose a new method for diagnosing bearing faults under variable shaft speeds using acoustic emission (AE) signals. Our proposed method predicts not only bearing fault types but also the degradation level of bearings. In the proposed technique, AE signals acquired from bearings are represented by spectrograms to obtain as much information as possible in the time–frequency domain. Feature extraction and classification processes are performed by deep learning using EfficientNet and a stochastic line-search optimizer. According to our various experiments, the proposed method can provide high accuracy and robustness under noisy environments compared with existing AE-based bearing fault diagnosis methods.
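Representing a signal as a spectrogram, as described above for AE signals, amounts to a short-time Fourier transform: the signal is cut into overlapping frames, each frame is windowed, and the magnitude of its discrete Fourier transform is stored. The sketch below shows this with the standard library only. The frame length, hop size, and Hann window are assumptions, since the abstract gives no spectrogram parameters, and the EfficientNet classifier and optimizer are not reproduced.

```python
import cmath
import math


def spectrogram(signal, frame_len=32, hop=16):
    """Magnitude spectrogram: one list of frequency-bin magnitudes per frame.

    Assumptions: Hann window, 50% overlap, direct DFT (fine for short frames;
    an FFT library would be used for real AE recordings).
    """
    window = [
        0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_len - 1))
        for n in range(frame_len)
    ]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        magnitudes = []
        for k in range(frame_len // 2 + 1):  # non-negative frequencies only
            acc = sum(
                frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)
            )
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Synthetic test tone: exactly 4 cycles per 32-sample frame.
tone = [math.sin(2 * math.pi * 4 * n / 32) for n in range(128)]
spec = spectrogram(tone)
peak_bins = [frame.index(max(frame)) for frame in spec]
print(len(spec), len(spec[0]), peak_bins)
```

Each row of `spec` is one time step and each column one frequency bin, so the result can be treated as a single-channel image and passed to an image model.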


2019 ◽  
Vol 19 (4) ◽  
pp. 967-986
Author(s):  
Xintian Chi ◽  
Dario Di Maio ◽  
Nicholas AJ Lieven

This research focuses on the development of a damage detection algorithm based on modal testing, vibrothermography, and feature extraction. The theoretical development of mathematical models is presented to illustrate the principles supporting the associated algorithms, through which the importance of the three components contributing to this approach is demonstrated. Experimental tests and analytical simulations have been performed in laboratory conditions to show that the proposed damage detection algorithm is able to detect, locate, and extract the features generated due to the presence of sub-surface damage in aerospace grade composite materials captured by an infrared camera. Through tests and analyses, the reliability and repeatability of this damage detection algorithm are verified. In the concluding observations of this article, suggestions are proposed for this algorithm’s practical applications in an operational environment.


Author(s):  
Wurood A. Jbara

Biometric verification based on ear features is a modern field of scientific research. Many biometric identifiers, such as fingerprints, iris, and speech, can be used to identify people. In this paper, the focus is placed on the ear biometric model for verifying the identity of persons. The main idea is to use moments as ear feature extractors. The proposed approach includes the following operations: image capturing, edge detection, erosion, feature extraction, and matching. The proposed approach has been tested using many images of ears in different states. Experimental results over several trials verified that the proposed approach achieves a high level of accuracy over a wide variety of ear images. The verification process is completed by matching the query ear image against the ear images kept in a database in real time.

