Non-hierarchical document clustering using the ICL Distributed Array Processor

This paper presents two advances towards the automated three-dimensional (3-D) analysis of thick and heavily overlapped regions in cytological preparations such as cervical/vaginal smears. First, a high-speed 3-D brightfield microscope has been developed, allowing the acquisition of image data at speeds approaching 30 optical slices per second. Second, algorithms have been developed to detect and segment nuclei in spite of the extremely high image variability and low contrast typical of such regions. The analysis of such regions is inherently a 3-D problem that cannot be solved reliably with conventional 2-D imaging and image analysis methods. High-speed 3-D imaging of the specimen is accomplished by moving the specimen axially relative to the objective lens of a standard microscope (Zeiss) at a speed of 30 steps per second, where the step size is adjustable from 0.2 to 5 μm. The specimen is mounted on a computer-controlled, piezoelectric microstage (Burleigh PZS-100, 68/μm displacement). At each step, an optical slice is acquired using a CCD camera (SONY XC-11/71 IP, Dalsa CA-D1-0256, and CA-D2-0512 have been used) connected to a 4-node array processor system based on the Intel i860 chip.

New array processor architectures for two-dimensional FIR digital filters

IEE Proceedings E Computers and Digital Techniques ◽

10.1049/ip-e.1989.0032 ◽

1989 ◽

Vol 136 (4) ◽

pp. 234 ◽

Cited By ~ 1

Author(s):

M.M. Fahmy ◽

Y. Wan

Keyword(s):

Digital Filters ◽

Two Dimensional ◽

Array Processor ◽

Processor Architectures ◽

Fir Digital Filters

Comparision of Different Distance Measure Methods in Text Document Clustering

INTERNATIONAL JOURNAL OF RESEARCH AND ENGINEERING ◽

10.21276/ijre.2018.5.7.2 ◽

2018 ◽

Vol 5 (7) ◽

Author(s):

Yin Min Tun ◽

Keyword(s):

Distance Measure ◽

Document Clustering ◽

Text Document ◽

Measure Methods

An Improved β-hill Climbing Optimization Technique for Solving the Text Documents Clustering Problem

Current Medical Imaging (formerly Current Medical Imaging Reviews) ◽

10.2174/1573405614666180903112541 ◽

2020 ◽

Vol 16 (4) ◽

pp. 296-306 ◽

Cited By ~ 3

Author(s):

Laith Mohammad Abualigah ◽

Essam Said Hanandeh ◽

Ahamad Tajudin Khader ◽

Mohammed Abdallh Otair ◽

Shishir Kumar Shandilya

Keyword(s):

Optimization Technique ◽

Document Clustering ◽

Text Clustering ◽

Hill Climbing ◽

Text Documents ◽

Clustering Problem ◽

Text Document ◽

Text Information ◽

Amount Of Knowledge ◽

The Hill

Background: The volume of text documents on Internet pages keeps increasing, and the sheer size of this information makes it difficult to manage. Text clustering is a common optimization problem in which a large amount of text information is organized into a set of comparable and coherent clusters. Aims: This paper presents a novel local search clustering technique, β-hill climbing, and models it for the text document clustering problem so that similar documents are partitioned into the same cluster. Methods: The β parameter is the primary innovation in the β-hill climbing technique. It was introduced to balance local and global search. Local search methods such as k-medoid and k-means have been applied successfully to the text document clustering problem. Results: Experiments were conducted on eight standard benchmark text datasets with different characteristics taken from the Laboratory of Computational Intelligence (LABIC). The proposed β-hill climbing achieved better results than the original hill climbing technique on the text clustering problem. Conclusion: Adding the β operator to hill climbing improves text clustering performance.
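
The abstract above does not give the algorithm's details, so the following is only a minimal sketch of how a β-hill climbing loop for clustering might look, assuming the usual description of the method: a local neighbourhood move, followed by a β operator that resets each decision variable at random with probability β, followed by greedy acceptance. The objective (sum of squared distances to cluster centroids), the function names `sse` and `beta_hill_climbing`, the parameter defaults, and the toy 2-D points standing in for document vectors are all assumptions made for illustration, not the authors' implementation.

```python
import random


def sse(points, labels, k):
    """Sum of squared distances from each point to its cluster centroid."""
    total = 0.0
    for c in range(k):
        members = [p for p, lab in zip(points, labels) if lab == c]
        if not members:
            continue  # an empty cluster contributes nothing
        dim = len(members[0])
        centroid = [sum(p[d] for p in members) / len(members) for d in range(dim)]
        total += sum((p[d] - centroid[d]) ** 2 for p in members for d in range(dim))
    return total


def beta_hill_climbing(points, k, beta=0.05, iterations=2000, seed=0):
    """Hypothetical beta-hill climbing over cluster assignments (illustrative only)."""
    rng = random.Random(seed)
    n = len(points)
    current = [rng.randrange(k) for _ in range(n)]
    current_cost = sse(points, current, k)
    for _ in range(iterations):
        candidate = current[:]
        # Neighbourhood move (local search): reassign one randomly chosen item.
        candidate[rng.randrange(n)] = rng.randrange(k)
        # Beta operator (global search): reset each assignment with probability beta.
        for i in range(n):
            if rng.random() < beta:
                candidate[i] = rng.randrange(k)
        candidate_cost = sse(points, candidate, k)
        # Greedy acceptance: keep the candidate only if it is not worse.
        if candidate_cost <= current_cost:
            current, current_cost = candidate, candidate_cost
    return current, current_cost


# Toy data: two loose groups of 2-D points standing in for document vectors.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
labels, cost = beta_hill_climbing(points, k=2)
```

Setting `beta=0` reduces the loop to plain hill climbing with a single-reassignment neighbourhood, which is one way to compare the two variants on the same data.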

Deep Multi-view Document Clustering with Enhanced Semantic Embedding

Information Sciences ◽

10.1016/j.ins.2021.02.027 ◽

2021 ◽

Author(s):

Ruina Bai ◽

Ruizhang Huang ◽

Yanping Chen ◽

Yongbin Qin

Keyword(s):

Document Clustering ◽

Semantic Embedding

Execution Array Memory Array Processor (XarMa)

2020 SoutheastCon ◽

10.1109/southeastcon44009.2020.9368300 ◽

2020 ◽

Author(s):

Gerald G. Pechanek

Keyword(s):

Array Processor ◽

Memory Array

Automatic trend detection: Time-biased document clustering

Knowledge-Based Systems ◽

10.1016/j.knosys.2021.106907 ◽

2021 ◽

pp. 106907

Author(s):

Sahar Behpour ◽

Mohammadmahdi Mohammadi ◽

Mark V. Albert ◽

Zinat S. Alam ◽

Lingling Wang ◽

...

Keyword(s):

Document Clustering ◽

Trend Detection ◽

Detection Time

Similarity Measure Approaches Applied in Text Document Clustering for Information Retrieval

2020 Sixth International Conference on Parallel, Distributed and Grid Computing (PDGC) ◽

10.1109/pdgc50313.2020.9315851 ◽

2020 ◽

Author(s):

Naveen Kumar ◽

Sanjay Kumar Yadav ◽

Divakar Singh Yadav

Keyword(s):

Information Retrieval ◽

Similarity Measure ◽

Document Clustering ◽

Text Document

Unsupervised neural networks for automatic Arabic text summarization using document clustering and topic modeling

Expert Systems with Applications ◽

10.1016/j.eswa.2021.114652 ◽

2021 ◽

Vol 172 ◽

pp. 114652

Author(s):

Nabil Alami ◽

Mohammed Meknassi ◽

Noureddine En-nahnahi ◽

Yassine El Adlouni ◽

Ouafae Ammor

Keyword(s):

Neural Networks ◽

Topic Modeling ◽

Document Clustering ◽

Text Summarization ◽

Arabic Text ◽

Arabic Text Summarization ◽

Unsupervised Neural Networks

Natural language processing methods for knowledge management—Applying document clustering for fast search and grouping of engineering documents

Concurrent Engineering ◽

10.1177/1063293x20982973 ◽

2021 ◽

pp. 1063293X2098297

Author(s):

Ivar Örn Arnarsson ◽

Otto Frost ◽

Emil Gustavsson ◽

Mats Jirstrand ◽

Johan Malmqvist

Keyword(s):

Natural Language Processing ◽

Natural Language ◽

Language Processing ◽

Domain Knowledge ◽

Clustering Algorithms ◽

Document Clustering ◽

Unstructured Data ◽

Free Text ◽

Engineering Change ◽

Engineering Documents

Product development companies collect data in the form of Engineering Change Requests for logged design issues, tests, and product iterations. These documents are rich in unstructured data (e.g. free text). Previous research affirms that product developers find that current IT systems lack the capabilities to accurately retrieve relevant documents with unstructured data. In this research, we demonstrate a method using Natural Language Processing and document clustering algorithms to find structurally or contextually related documents in databases containing Engineering Change Request documents. The aim is to radically decrease the time needed to search effectively for related engineering documents, organize search results, and create labeled clusters from these documents by utilizing Natural Language Processing algorithms. A domain knowledge expert at the case company evaluated the results and confirmed that the algorithms we applied managed to find relevant document clusters given the queries tested.
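
The abstract does not name the specific algorithms used, so the sketch below only illustrates the general idea of finding related free-text documents, assuming a plain TF-IDF weighting with cosine similarity. The helper names (`tfidf_vectors`, `cosine`, `most_similar`) and the four example texts are hypothetical and are not taken from the paper or its data.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) from whitespace-tokenized documents."""
    tokenized = [doc.lower().split() for doc in docs]
    n = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * (math.log(n / df[term]) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b[term] for term, weight in a.items() if term in b)
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(vectors, i):
    """Index of the document most similar to document i (excluding i itself)."""
    others = (j for j in range(len(vectors)) if j != i)
    return max(others, key=lambda j: cosine(vectors[i], vectors[j]))


# Hypothetical change-request texts, invented for this example.
docs = [
    "bracket cracked during vibration test",
    "vibration test shows bracket crack near weld",
    "update paint color for door panel",
    "door panel paint specification changed",
]
vectors = tfidf_vectors(docs)
```

A real pipeline would likely add stemming or lemmatization (so that "cracked" and "crack" match) and a clustering step on top of the similarity scores; both are left out here to keep the sketch short.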
