Restrictive Methods and Meta Methods for Thematically Focused Web Exploration

Handbook of Research on Web Information Systems Quality ◽

10.4018/978-1-59904-847-5.ch023 ◽

2011 ◽

pp. 405-423

Author(s):

Sergej Sizov ◽

Stefan Siersdorfer

Keyword(s):

Experimental Study ◽

Probabilistic Model ◽

Supervised Classification ◽

Unsupervised Clustering ◽

Use Case ◽

Clustering Methods ◽

Web Crawler ◽

Web Documents ◽

Case Examples ◽

Expert Search

This chapter addresses the problem of automatically organizing heterogeneous collections of Web documents for the generation of thematically focused expert search engines and portals. As a possible application scenario for our techniques, we consider a focused Web crawler that aims to populate topics of interest by automatically categorizing newly fetched documents. A higher accuracy of the underlying supervised (classification) and unsupervised (clustering) methods is achieved by leaving out uncertain documents rather than assigning them to inappropriate topics or clusters with low confidence. We introduce a formal probabilistic model for ensemble-based meta methods and explain how it can be used for constructing estimators and for quality-oriented tuning. Furthermore, we provide a comprehensive experimental study of the proposed meta methodology and realistic use-case examples.
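The idea of leaving out uncertain documents can be illustrated with a minimal sketch. The abstract does not state how the ensemble members are combined, so the agreement threshold used here is an assumption, and `restrictive_vote`, `keyword_classifier`, and the toy classifiers are hypothetical names introduced only for illustration.

```python
from collections import Counter
from typing import Callable, Optional, Sequence

# A classifier maps a document's text to a topic label.
Classifier = Callable[[str], str]


def restrictive_vote(
    doc: str,
    classifiers: Sequence[Classifier],
    min_agreement: float = 1.0,
) -> Optional[str]:
    """Return the most frequent label, or None to abstain.

    The ensemble abstains when the share of classifiers voting for the
    top label is below min_agreement (assumed rule, 1.0 = unanimity).
    """
    votes = Counter(clf(doc) for clf in classifiers)
    label, count = votes.most_common(1)[0]
    if count / len(classifiers) >= min_agreement:
        return label
    return None  # uncertain document: leave it unassigned


def keyword_classifier(keyword: str, topic: str, other: str = "other") -> Classifier:
    """Toy stand-in for a trained classifier (hypothetical)."""
    return lambda doc: topic if keyword in doc.lower() else other


classifiers = [
    keyword_classifier("crawler", "web"),
    keyword_classifier("web", "web"),
    keyword_classifier("search", "web"),
]

confident = restrictive_vote("A focused web crawler for expert search", classifiers)
uncertain = restrictive_vote("Web pages about olive oil", classifiers)
relaxed = restrictive_vote("Web pages about olive oil", classifiers, min_agreement=0.6)
```

In this sketch, raising `min_agreement` trades coverage for accuracy: fewer documents receive a label, but the labels that are assigned rest on stronger agreement.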

Evaluation of Unsupervised Clustering Methods on Hyperspectral Image Data Sets

2018 IEEE International Conference on Progress in Informatics and Computing (PIC) ◽

10.1109/pic.2018.8706315 ◽

2018 ◽

Author(s):

Wei Zhang ◽

Zhichao Lian ◽

Chanying Huang

Keyword(s):

Hyperspectral Image ◽

Image Data ◽

Unsupervised Clustering ◽

Data Sets ◽

Clustering Methods ◽

Hyperspectral Image Data

Identifying Promising Application Areas for Cyber-Physical and Complex Event Processing in Logistics Practice

Logistics ◽

10.3390/logistics2040023 ◽

2018 ◽

Vol 2 (4) ◽

pp. 23

Author(s):

Cyril Alias ◽

Frank Alarcón Olalla ◽

Hauke Iwersen ◽

Julius Ollesch ◽

Bernd Noche

Keyword(s):

Cyber Physical Systems ◽

Complex Event Processing ◽

Event Processing ◽

Use Case ◽

Logistics Industry ◽

Physical Systems ◽

Case Examples ◽

Promising Application ◽

Cost Efficient ◽

Huge Challenge

In the ongoing era of digitization, cyber-physical systems and complex event processing are among the most discussed technologies. The responsible organizations largely accept the huge challenge that digitization poses to the transportation and logistics sector. Although initial steps are being taken towards digitized value creation, many professionals wonder how to realize the ideas and stumble over the precise steps to take. With the vision of smart logistics in mind and cost-efficient technologies available, they require a systematic methodology to exploit the potential that accompanies digitization. With an effective and targeted workshop procedure, suitable application areas with promising benefits can be identified. Such a workshop procedure needs to be stepwise in order to consider all relevant aspects carefully and to allow organizational acceptance to grow. In three real-world use case examples from different areas of the transportation and logistics industry, promising applications of cyber-physical systems and complex event processing are identified, and the pertaining event patterns of critical situations are developed to make realization easier at a later stage. Each use case example exhibits a frequently occurring problem that can be effectively addressed using the above-mentioned technologies.

Battery Sizing for Different Loads and RES Production Scenarios through Unsupervised Clustering Methods

Forecasting ◽

10.3390/forecast3040041 ◽

2021 ◽

Vol 3 (4) ◽

pp. 663-681

Author(s):

Alfredo Nespoli ◽

Andrea Matteri ◽

Silvia Pretto ◽

Luca De Ciechi ◽

Emanuele Ogliari

Keyword(s):

Power Generation ◽

Renewable Energy Sources ◽

Storage System ◽

Energy Storage System ◽

Hybrid Plant ◽

Unsupervised Clustering ◽

Clustering Methods ◽

Pv System ◽

Feasible Solutions ◽

Battery Energy

The increasing penetration of Renewable Energy Sources (RESs) in the energy mix is leading to an energy scenario characterized by decentralized power production. Among RES power generation technologies, solar PhotoVoltaic (PV) systems constitute a very promising option, but their production is not programmable due to the intermittent nature of solar energy. Coupling a PV facility with a Battery Energy Storage System (BESS) allows greater flexibility in power generation. However, the design phase of a PV+BESS hybrid plant is challenging due to the large number of possible configurations. The present paper proposes a preliminary procedure aimed at predicting a family of batteries that is suitable to be coupled with a given PV plant configuration. The proposed procedure is applied to new hypothetical plants built to fulfill the energy requirements of a commercial and an industrial load. The energy produced by the PV system is estimated on the basis of a performance analysis carried out on similar real plants. The battery operations are established through two decision-tree-like structures regulating charge and discharge, respectively. Finally, unsupervised clustering is applied to all the possible PV+BESS configurations in order to identify the family of feasible solutions.
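A rough sketch of what rule-based charge and discharge logic can look like is given below. The abstract does not describe the actual decision structures, so the rules here (charge from PV surplus, discharge to cover a deficit) are assumptions, as are the function name `battery_step` and its parameters. Conversion losses and power limits are deliberately ignored.

```python
from typing import Tuple


def battery_step(
    soc_kwh: float,
    pv_kwh: float,
    load_kwh: float,
    capacity_kwh: float,
) -> Tuple[float, float, float]:
    """Advance the battery by one time step.

    Returns (new state of charge, energy imported from the grid,
    energy exported to the grid), all in kWh. Assumed rules only.
    """
    surplus = pv_kwh - load_kwh
    if surplus >= 0:
        # Charge branch: store the surplus until the battery is full,
        # export whatever does not fit.
        charge = min(surplus, capacity_kwh - soc_kwh)
        return soc_kwh + charge, 0.0, surplus - charge
    # Discharge branch: cover the deficit from the battery until it is
    # empty, import the remainder from the grid.
    deficit = -surplus
    discharge = min(deficit, soc_kwh)
    return soc_kwh - discharge, deficit - discharge, 0.0


sunny = battery_step(soc_kwh=2.0, pv_kwh=5.0, load_kwh=3.0, capacity_kwh=3.0)
night = battery_step(soc_kwh=1.0, pv_kwh=0.0, load_kwh=3.0, capacity_kwh=3.0)
```

Running such a step function over a load and production profile for each candidate battery size would yield per-configuration features (for example, total grid import) that a clustering step could then group.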

Unsupervised Clustering Methods for Medical Data: An Application to Thyroid Gland Data

Artificial Neural Networks and Neural Information Processing — ICANN/ICONIP 2003 - Lecture Notes in Computer Science ◽

10.1007/3-540-44989-2_83 ◽

2003 ◽

pp. 695-701 ◽

Cited By ~ 4

Author(s):

Songül Albayrak

Keyword(s):

Thyroid Gland ◽

Medical Data ◽

Unsupervised Clustering ◽

Clustering Methods

An integrated approach for network traffic analysis using unsupervised clustering and supervised classification

International Journal of Internet Technology and Secured Transactions ◽

10.1504/ijitst.2019.102797 ◽

2019 ◽

Vol 9 (4) ◽

pp. 517

Author(s):

Kothandapani Chokkanathan ◽

S. Koteeswaran

Keyword(s):

Network Traffic ◽

Supervised Classification ◽

Integrated Approach ◽

Traffic Analysis ◽

Unsupervised Clustering ◽

Network Traffic Analysis

Data Clustering

Web Data Management Practices ◽

10.4018/978-1-59904-228-2.ch001 ◽

2007 ◽

pp. 1-33 ◽

Cited By ~ 4

Author(s):

Dušan Husek ◽

Jaroslav Pokorny ◽

Hana Rezankova ◽

Václav Snasel

Keyword(s):

Information Retrieval ◽

Data Clustering ◽

Important Task ◽

Clustering Methods ◽

Web Documents ◽

Web Communities

Document and information retrieval (IR) is an important task for Web communities. In this chapter, we introduce some clustering methods and focus on their use for the clustering, classification, and retrieval of Web documents.

Embodied ethnography in psychology: Learning points from expatriate migration research

Culture & Psychology ◽

10.1177/1354067x19898677 ◽

2020 ◽

Vol 26 (4) ◽

pp. 803-818

Author(s):

Sanna Schliewe

Keyword(s):

Participatory Research ◽

Use Case ◽

Psychological Processes ◽

Case Examples ◽

Sensory Experiences ◽

Migration Research ◽

Recent Developments ◽

Learning Points

Interviews and observation are often the preferred methods when psychologists conduct fieldwork. However, psychology can learn from recent developments in anthropology and sociology. Here, researchers use their own embodied sensations in participatory research as a way to investigate less verbalized, more hidden, sensorial, and affective aspects of the life-worlds they are studying. In this article, I use case examples from research on privileged migrants (expatriates) to demonstrate how significant insights can emerge when we apply an embodied approach in our research. Migration consists not only of behavioral, social, verbal, or imaginative events but also involves the migrant’s body, with its sensory experiences and emotions. Thus, we need to embrace additional methods to investigate multifaceted psychological processes such as migration.

A Probabilistic Model for Classification of Multiple-Record Web Documents

OOIS 2000 ◽

10.1007/978-1-4471-0299-1_29 ◽

2001 ◽

pp. 349-357

Author(s):

June Tang ◽

Yiu-Kai Ng

Keyword(s):

Probabilistic Model ◽

Web Documents

The Influence of Growing Region on Fatty Acids and Sterol Composition of Iranian Olive Oils by Unsupervised Clustering Methods

Journal of the American Oil Chemists Society ◽

10.1007/s11746-011-1922-9 ◽

2011 ◽

Vol 89 (3) ◽

pp. 371-378 ◽

Cited By ~ 17

Author(s):

Z. Piravi-Vanak ◽

Jahan B. Ghasemi ◽

M. Ghavami ◽

H. Ezzatpanah ◽

E. Zolfonoun

Keyword(s):

Fatty Acids ◽

Sterol Composition ◽

Unsupervised Clustering ◽

Clustering Methods ◽

Olive Oils ◽

Growing Region

Design and implementation of crawling algorithm to collect deep web information for web archiving

Data Technologies and Applications ◽

10.1108/dta-07-2017-0053 ◽

2018 ◽

Vol 52 (2) ◽

pp. 266-277 ◽

Cited By ~ 2

Author(s):

Hyo-Jung Oh ◽

Dong-Hyun Won ◽

Chonghyuck Kim ◽

Sung-Hee Park ◽

Yong Kim

Keyword(s):

Deep Web ◽

Web Crawler ◽

Web Archiving ◽

Web Browser ◽

Web Documents ◽

Content Type ◽

Web Document ◽

Web Information ◽

Web Crawlers ◽

The Web

Purpose: The purpose of this paper is to describe the development of an algorithm for realizing web crawlers that automatically collect dynamically generated webpages from the deep web.

Design/methodology/approach: This study proposes and develops an algorithm that collects web information as if the web crawler were gathering static webpages, by managing script commands as links. The algorithm is tested experimentally by using the proposed web crawler to collect deep webpages.

Findings: Among the findings of this study is that if the actual crawling process provides search results as script pages, the outcome only collects the first page. However, the proposed algorithm can collect deep webpages in this case.

Research limitations/implications: To use a script as a link, a human must first analyze the web document. This study uses the web browser object provided by Microsoft Visual Studio as a script launcher, so it cannot collect deep webpages if the web browser object cannot launch the script, or if the web document contains script errors.

Practical implications: The research results show that the deep web is estimated to hold 450 to 550 times more information than surface webpages, and that its web documents are difficult to collect. However, this algorithm helps to enable deep web collection through script runs.

Originality/value: This study presents a new method to be utilized with script links instead of adopting previous keywords. The proposed algorithm is available as an ordinary URL. From the conducted experiment, analysis of scripts on individual websites is needed to employ them as links.
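One way to picture "managing script commands as links" is a crawl frontier that holds ordinary URLs and script commands side by side. The sketch below is an illustration under assumptions, not the authors' algorithm: the `Link` type, the `crawl` function, the `expand` callback, and the toy site with `goPage(...)` commands are all hypothetical, and no real browser or script launcher is involved.

```python
from collections import deque
from typing import Callable, Iterable, List, NamedTuple


class Link(NamedTuple):
    kind: str    # "url" for an ordinary link, "script" for a script command
    target: str  # the URL, or the text of the script command


def crawl(
    seeds: List[Link],
    expand: Callable[[Link], Iterable[Link]],
    max_pages: int = 100,
) -> List[Link]:
    """Breadth-first crawl that queues script commands like static links.

    expand(link) stands in for fetching a URL or launching a script and
    extracting the links and script commands found on the resulting page.
    """
    frontier = deque(seeds)
    seen = set(seeds)
    visited: List[Link] = []
    while frontier and len(visited) < max_pages:
        link = frontier.popleft()
        visited.append(link)
        for nxt in expand(link):
            if nxt not in seen:  # avoid revisiting the same page or command
                seen.add(nxt)
                frontier.append(nxt)
    return visited


# Toy site: a search page whose result pages are reachable only via scripts.
search = Link("url", "/search")
page2 = Link("script", "goPage(2)")
page3 = Link("script", "goPage(3)")
site = {search: [page2], page2: [page3, search]}

order = crawl([search], lambda link: site.get(link, []))
```

A crawler that followed only `"url"` entries would stop after the first result page in this toy site, which mirrors the limitation the abstract describes for script-based result pages.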
