Restrictive Methods and Meta Methods for Thematically Focused Web Exploration

Author(s):  
Sergej Sizov ◽  
Stefan Siersdorfer

This chapter addresses the problem of automatically organizing heterogeneous collections of Web documents for the generation of thematically focused expert search engines and portals. As a possible application scenario for our techniques, we consider a focused Web crawler that aims to populate topics of interest by automatically categorizing newly fetched documents. A higher accuracy of the underlying supervised (classification) and unsupervised (clustering) methods is achieved by leaving out uncertain documents rather than assigning them to inappropriate topics or clusters with low confidence. We introduce a formal probabilistic model for ensemble-based meta methods and explain how it can be used for constructing estimators and for quality-oriented tuning. Furthermore, we provide a comprehensive experimental study of the proposed meta methodology and realistic use-case examples.
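The idea of leaving out uncertain documents can be illustrated with a small voting ensemble that abstains when its members disagree. This is only a sketch under stated assumptions, not the chapter's probabilistic model: the function `restrictive_vote`, the `min_agreement` parameter, and the keyword-based toy classifiers are all hypothetical names introduced for illustration.

```python
from typing import Callable, Optional, Sequence

# Hypothetical type: a base classifier maps a document to +1 (on topic) or -1 (off topic).
Classifier = Callable[[str], int]


def restrictive_vote(
    doc: str,
    classifiers: Sequence[Classifier],
    min_agreement: float = 1.0,
) -> Optional[int]:
    """Return +1 or -1 if enough base classifiers agree, otherwise None (abstain)."""
    if not classifiers:
        raise ValueError("at least one base classifier is required")
    votes = [clf(doc) for clf in classifiers]
    positives = sum(1 for v in votes if v > 0)
    negatives = len(votes) - positives
    if positives == negatives:
        return None  # a tie is always treated as uncertain
    share = max(positives, negatives) / len(votes)
    if share < min_agreement:
        return None  # not enough agreement: leave the document out
    return 1 if positives > negatives else -1


def keyword_classifier(keyword: str) -> Classifier:
    """Toy stand-in for a trained classifier (assumption, for illustration only)."""
    return lambda doc: 1 if keyword in doc.lower() else -1


ensemble = [
    keyword_classifier("crawler"),
    keyword_classifier("web"),
    keyword_classifier("search"),
]
```

With the default `min_agreement=1.0` the ensemble accepts a decision only when it is unanimous, so a document such as "Web design tips" is left unclassified. Lowering the threshold trades fewer abstentions for a higher risk of wrong assignments.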

Logistics ◽  
2018 ◽  
Vol 2 (4) ◽  
pp. 23
Author(s):  
Cyril Alias ◽  
Frank Alarcón Olalla ◽  
Hauke Iwersen ◽  
Julius Ollesch ◽  
Bernd Noche

In the ongoing era of digitization, cyber-physical systems and complex event processing are among the most discussed technologies. The huge challenge that digitization poses to the transportation and logistics sector is largely accepted by the responsible organizations. Although initial steps towards digitized value creation have been taken, many professionals wonder how to realize the ideas and struggle with the precise steps to take. With the vision of smart logistics in mind and cost-efficient technologies available, they require a systematic methodology to exploit the potential of digitization. An effective and targeted workshop procedure can identify appropriate application areas with promising benefits. Such a procedure needs to be stepwise in order to carefully consider all relevant aspects and to allow organizational acceptance to grow. In three real-world use cases from different areas of the transportation and logistics industry, promising applications of cyber-physical systems and complex event processing are identified, and the pertaining event patterns of critical situations are developed to ease realization at a later stage. Each use case exhibits a frequently occurring problem that can be effectively addressed with these technologies.


Forecasting ◽  
2021 ◽  
Vol 3 (4) ◽  
pp. 663-681
Author(s):  
Alfredo Nespoli ◽  
Andrea Matteri ◽  
Silvia Pretto ◽  
Luca De Ciechi ◽  
Emanuele Ogliari

The increasing penetration of Renewable Energy Sources (RESs) in the energy mix is leading to an energy scenario characterized by decentralized power production. Among RES power generation technologies, solar PhotoVoltaic (PV) systems are a very promising option, but their production is not programmable due to the intermittent nature of solar energy. Coupling a PV facility with a Battery Energy Storage System (BESS) allows greater flexibility in power generation. However, the design phase of a PV+BESS hybrid plant is challenging due to the large number of possible configurations. The present paper proposes a preliminary procedure aimed at predicting a family of batteries suitable for coupling with a given PV plant configuration. The proposed procedure is applied to new hypothetical plants built to fulfill the energy requirements of a commercial and an industrial load. The energy produced by the PV system is estimated on the basis of a performance analysis carried out on similar real plants. The battery operations are established through two decision-tree-like structures regulating charge and discharge, respectively. Finally, an unsupervised clustering is applied to all the possible PV+BESS configurations in order to identify the family of feasible solutions.
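The abstract does not spell out the two decision structures, so the following is only a generic sketch of what rule-based charge and discharge logic for a PV+BESS plant might look like. The `Battery` class, the `step` function, and the rules themselves are assumptions made for illustration; conversion efficiency and power limits are deliberately ignored.

```python
from dataclasses import dataclass


@dataclass
class Battery:
    capacity_kwh: float      # usable capacity (hypothetical parameter)
    soc_kwh: float = 0.0     # energy currently stored


def step(battery: Battery, pv_kwh: float, load_kwh: float) -> float:
    """Advance one time step and return the grid exchange in kWh.

    Positive values mean energy imported from the grid, negative values
    mean energy exported to it. Assumed rules, not the paper's:
      - charge branch: PV surplus goes into the battery until it is full
      - discharge branch: a deficit is covered by the battery until it is empty
    """
    surplus = pv_kwh - load_kwh
    if surplus >= 0:
        charge = min(surplus, battery.capacity_kwh - battery.soc_kwh)
        battery.soc_kwh += charge
        return charge - surplus          # leftover surplus is exported
    deficit = -surplus
    discharge = min(deficit, battery.soc_kwh)
    battery.soc_kwh -= discharge
    return deficit - discharge           # uncovered deficit is imported
```

Running such a function over a year of PV and load profiles for each candidate battery size would yield per-configuration indicators (for example total grid import) that a clustering step could then group.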


Author(s):  
Dušan Husek ◽  
Jaroslav Pokorny ◽  
Hana Rezankova ◽  
Václav Snasel

Document and information retrieval (IR) is an important task for Web communities. In this chapter, we introduce some clustering methods and focus on their use for the clustering, classification, and retrieval of Web documents.


2020 ◽  
Vol 26 (4) ◽  
pp. 803-818
Author(s):  
Sanna Schliewe

Interviews and observation are often the preferred methods when psychologists conduct fieldwork. However, psychology can learn from recent developments in anthropology and sociology. There, researchers use their own embodied sensations in participatory research as a way to investigate less verbalized, more hidden, sensorial, and affective aspects of the life-worlds they are studying. In this article, I use case examples from research on privileged migrants (expatriates) to demonstrate how significant insights can emerge when we apply an embodied approach in our research. Migration does not consist only of behavioral, social, verbal, or imaginative events but also involves the migrant's body, with its sensory experiences and emotions. Thus, we need to embrace additional methods to investigate multifaceted psychological processes such as migration.


2018 ◽  
Vol 52 (2) ◽  
pp. 266-277
Author(s):  
Hyo-Jung Oh ◽  
Dong-Hyun Won ◽  
Chonghyuck Kim ◽  
Sung-Hee Park ◽  
Yong Kim

Purpose: The purpose of this paper is to describe the development of an algorithm for realizing web crawlers that automatically collect dynamically generated webpages from the deep web.

Design/methodology/approach: This study proposes and develops an algorithm that manages script commands as links, so that the web crawler collects web information as if it were gathering static webpages. The algorithm is tested by having the proposed web crawler collect deep webpages.

Findings: Among the findings of this study is that if search results are provided as script pages, the actual crawling process collects only the first page. However, the proposed algorithm can collect deep webpages in this case.

Research limitations/implications: To use a script as a link, a human must first analyze the web document. This study uses the web browser object provided by Microsoft Visual Studio as a script launcher, so it cannot collect deep webpages if the web browser object cannot launch the script or if the web document contains script errors.

Practical implications: Research results show that the deep web is estimated to hold 450 to 550 times more information than the surface web, yet its documents are difficult to collect. This algorithm helps enable deep web collection through script runs.

Originality/value: This study presents a new method that uses script links instead of the keywords adopted in previous approaches. The proposed algorithm is available as an ordinary URL. The conducted experiment shows that scripts on individual websites must be analyzed before they can be employed as links.
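Managing script commands as links can be pictured as a crawl frontier that stores script calls next to ordinary URLs. The sketch below uses Python's standard `html.parser` module and is not the paper's implementation: the class `LinkAndScriptExtractor`, the function `extract_targets`, and the sample markup are hypothetical. It only collects the targets; actually running a script still requires a browser engine, which is the role the paper assigns to the web browser object.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class LinkAndScriptExtractor(HTMLParser):
    """Collect ordinary URLs and script commands as uniform crawl targets."""

    def __init__(self) -> None:
        super().__init__()
        self.targets: List[Tuple[str, str]] = []  # (kind, value) pairs

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        href = attr.get("href")
        if href and href.lower().startswith("javascript:"):
            # A script call hidden in an href is queued like a link.
            self.targets.append(("script", href[len("javascript:"):]))
        elif href:
            self.targets.append(("url", href))
        onclick = attr.get("onclick")
        if onclick:
            # Event handlers that trigger navigation are queued as well.
            self.targets.append(("script", onclick))


def extract_targets(html: str) -> List[Tuple[str, str]]:
    parser = LinkAndScriptExtractor()
    parser.feed(html)
    return parser.targets


sample = (
    '<a href="/page1">1</a>'
    '<a href="javascript:goPage(2)">2</a>'
    '<button onclick="goPage(3)">3</button>'
)
```

For the sample markup, `extract_targets(sample)` yields one `"url"` entry and two `"script"` entries, so result pages reachable only through `goPage(...)` calls are no longer skipped by the frontier.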

