Semi-supervised clustering techniques for categorization of text documents

2015 ◽  
Author(s):  
Yang Yan

At present, a huge quantity of data is recorded in a variety of forms such as text, image, video, and audio, and this quantity is expected to grow in the future. The major text-related tasks, such as entity extraction, information extraction, entity relation modeling, and document summarization, are performed using text mining. The main focus of this paper is document clustering, a subtask of text mining, and measuring the performance of different clustering techniques. In this paper we use an enhanced feature selection for clustering text documents to show that it produces better results than traditional feature selection.
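As a rough illustration of the pipeline this abstract refers to (feature selection followed by document clustering), the following minimal sketch may help. It is not the authors' method: the abstract does not specify the enhanced feature selection, so a plain document-frequency filter stands in for a traditional one, and the clustering step is an assumed spherical k-means on TF-IDF vectors. All function names, the `min_df` parameter, and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace, keeping alphabetic tokens only."""
    return [w for w in text.lower().split() if w.isalpha()]


def select_features(docs_tokens, min_df=2):
    """Assumed baseline feature selection: keep terms found in >= min_df documents."""
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    return sorted(t for t, count in df.items() if count >= min_df)


def tfidf_vectors(docs_tokens, vocab):
    """Build L2-normalised TF-IDF vectors (smoothed IDF) over the selected vocabulary."""
    n_docs = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    vectors = []
    for tokens in docs_tokens:
        tf = Counter(tokens)
        vec = [tf[t] * (math.log((1 + n_docs) / (1 + df[t])) + 1.0) for t in vocab]
        norm = math.sqrt(sum(x * x for x in vec))
        vectors.append([x / norm for x in vec] if norm > 0 else vec)
    return vectors


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def kmeans_cosine(vectors, k, max_iter=20):
    """Spherical k-means: assign each vector to the most cosine-similar centroid."""
    # Deterministic farthest-first initialisation, starting from the first document.
    centroids = [vectors[0]]
    while len(centroids) < k:
        idx = min(range(len(vectors)),
                  key=lambda i: max(dot(vectors[i], c) for c in centroids))
        centroids.append(vectors[idx])
    labels = None
    for _ in range(max_iter):
        new_labels = [max(range(k), key=lambda j: dot(v, centroids[j])) for v in vectors]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == j]
            if members:
                mean = [sum(col) / len(members) for col in zip(*members)]
                norm = math.sqrt(sum(x * x for x in mean))
                centroids[j] = [x / norm for x in mean] if norm > 0 else mean
    return labels


# Toy corpus (hypothetical): two documents about animals, two about finance.
docs = [
    "cats chase mice",
    "cats and mice play",
    "stocks and bonds fall",
    "bonds and stocks rally",
]
docs_tokens = [tokenize(d) for d in docs]
vocab = select_features(docs_tokens, min_df=2)
vectors = tfidf_vectors(docs_tokens, vocab)
labels = kmeans_cosine(vectors, k=2)
print(labels)  # [0, 0, 1, 1]
```

Comparing two feature selection schemes, as the abstract describes, would amount to swapping `select_features` for another selector and scoring the resulting labels against known categories.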


2016 ◽  
Vol 3 (4) ◽  
pp. 21-40 ◽  
Author(s):  
Ali Fallah Tehrani ◽  
Diane Ahrens

Clustering techniques typically group similar instances based on their individual attributes, supposing that similar instances have similar attribute characteristics. In contrast, clustering similar instances with respect to a specific behavior is framed through supervised learning. An example is the question of which fashion products behave similarly in terms of sales. Unfortunately, conventional clustering methods cannot tackle this case, since they treat all attributes in the same manner. In fact, conventional clustering approaches do not consider any response, and moreover they assume that all attributes have the same importance. However, clustering instances with respect to responses leads to better data analytics. In this research, the authors introduce an approach aimed at supervised clustering and show its advantage in terms of data analytics as well as prediction. To verify the feasibility and performance of this approach, the authors conducted several experiments on a real dataset derived from the apparel industry.


2020 ◽  
Author(s):  
Andrea Giani ◽  
de Souza Patricia Borges ◽  
Stefania Bartoletti ◽  
Flavio Morselli ◽  
Andrea Conti ◽  
...  

2019 ◽  
Vol 7 (3) ◽  
pp. 50-54
Author(s):  
N. Thilagavathi ◽  
Christy Wood ◽  
V. Hemalakshumi ◽  
V. Mathumiithaa

Author(s):  
Wing Chiu Tam ◽  
Osei Poku ◽  
R. D. (Shawn) Blanton

Systematic defects due to design-process interactions are a dominant component of integrated circuit (IC) yield loss in nano-scaled technologies. Test structures do not adequately represent the product in terms of feature diversity and feature volume, and are therefore unable to identify all the systematic defects that affect the product. This paper describes a method that uses diagnosis to identify layout features that do not yield as expected. Specifically, clustering techniques are applied to layout snippets of diagnosis-implicated regions from (ideally) a statistically significant number of IC failures to identify feature commonalities. Experiments involving an industrial chip demonstrate the identification of possible systematic yield loss due to lithographic hotspots.

