RECEIPT

2020 ◽  
Vol 14 (3) ◽  
pp. 404-417
Author(s):  
Kartik Lakhotia ◽  
Rajgopal Kannan ◽  
Viktor Prasanna ◽  
Cesar A. F. De Rose

Tip decomposition is a crucial kernel for mining dense subgraphs in bipartite networks, with applications in spam detection, analysis of affiliation networks, etc. It creates a hierarchy of vertex-induced subgraphs with varying densities determined by the participation of vertices in butterflies (2,2-bicliques). To build the hierarchy, existing algorithms iteratively follow a delete-update (peeling) process: deleting vertices with the minimum number of butterflies and correspondingly updating the butterfly count of their 2-hop neighbors. The need to explore the 2-hop neighborhood renders tip decomposition computationally very expensive. Furthermore, the inherent sequentiality in peeling only minimum-butterfly vertices makes derived parallel algorithms prone to heavy synchronization. In this paper, we propose a novel parallel tip-decomposition algorithm - REfine CoarsE-grained Independent Tasks (RECEIPT) - that relaxes the peeling order restrictions by partitioning the vertices into multiple independent subsets that can be concurrently peeled. This enables RECEIPT to simultaneously achieve a high degree of parallelism and a dramatic reduction in synchronizations. Further, RECEIPT employs a hybrid peeling strategy along with other optimizations that drastically reduce the amount of wedge exploration and execution time. We perform a detailed experimental evaluation of RECEIPT on a shared-memory multicore server. It can process some of the largest publicly available bipartite datasets orders of magnitude faster than the state-of-the-art algorithms - achieving up to 1100× and 64× reductions in the number of thread synchronizations and traversed wedges, respectively. Using 36 threads, RECEIPT can provide up to 17.1× self-relative speedup.
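For readers unfamiliar with the kernel, the per-vertex butterfly counts that drive the peeling process can be sketched with a plain wedge-enumeration counter. This is a minimal single-threaded illustration in Python, not RECEIPT's optimized parallel algorithm; all names are illustrative:

```python
from collections import defaultdict
from math import comb

def butterfly_counts(edges):
    """Per-vertex butterfly counts for the left side of a bipartite graph.

    A butterfly is a (2,2)-biclique: two left vertices sharing two right
    vertices. For each left vertex u we enumerate wedges u - w - u2 to
    obtain the number of right vertices shared with every other left
    vertex; each pair of shared right vertices forms one butterfly.
    """
    adj = defaultdict(set)    # left vertex  -> right neighbours
    radj = defaultdict(set)   # right vertex -> left neighbours
    for u, v in edges:
        adj[u].add(v)
        radj[v].add(u)

    counts = defaultdict(int)
    for u in adj:
        shared = defaultdict(int)   # other left vertex -> #common right vertices
        for w in adj[u]:
            for u2 in radj[w]:
                if u2 != u:
                    shared[u2] += 1
        counts[u] = sum(comb(c, 2) for c in shared.values())
    return dict(counts)
```

RECEIPT's contribution lies in how vertices carrying these counts are partitioned and peeled concurrently; the counting itself is the standard wedge-based formulation.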

2021 ◽  
Author(s):  
Panagiotis Bouros ◽  
Nikos Mamoulis ◽  
Dimitrios Tsitsigkos ◽  
Manolis Terrovitis

Abstract
The interval join is a popular operation in temporal, spatial, and uncertain databases. The majority of interval join algorithms assume that input data reside on disk and so their focus is to minimize I/O accesses. Recently, an in-memory approach based on plane sweep (PS) for modern hardware was proposed that greatly outperforms previous work. However, this approach relies on a complex data structure, and its parallelization has not been adequately studied. In this article, we investigate in-memory interval joins in two directions. First, we explore the applicability of a largely ignored forward-scan (FS)-based plane sweep algorithm for single-threaded join evaluation. We propose four optimizations for FS that greatly reduce its cost, making it competitive with or even faster than the state-of-the-art. Second, we study in depth the parallel computation of interval joins. We design a non-partitioning-based approach that determines independent tasks of the join algorithm to run in parallel. Then, we address the drawbacks of the previously proposed hash-based partitioning and suggest a domain-based partitioning approach that does not produce duplicate results. Within our approach, we propose a novel breakdown of the partition-joins into mini-joins to be scheduled in the available CPU threads, and propose an adaptive domain partitioning aiming at load balancing. We also investigate how the partitioning phase can benefit from modern parallel hardware. Our thorough experimental analysis demonstrates the advantage of our novel partitioning-based approach for parallel computation.
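The forward-scan plane sweep that the first direction builds on can be sketched in a few lines. This is the textbook single-threaded FS algorithm, without the paper's four optimizations; the closed-interval overlap convention and all names are assumptions:

```python
def fs_interval_join(R, S):
    """Forward-scan (FS) plane-sweep interval join.

    R and S are lists of (start, end) tuples with start <= end. Both
    inputs are swept in increasing order of start; the interval with
    the smaller start scans forward in the opposite list, reporting
    every interval that begins before it ends (closed intervals, so
    touching endpoints count as overlapping).
    """
    R = sorted(R)
    S = sorted(S)
    out = []
    i = j = 0
    while i < len(R) and j < len(S):
        if R[i][0] <= S[j][0]:
            # r starts first: it overlaps every s that starts before r ends
            k = j
            while k < len(S) and S[k][0] <= R[i][1]:
                out.append((R[i], S[k]))
                k += 1
            i += 1
        else:
            k = i
            while k < len(R) and R[k][0] <= S[j][1]:
                out.append((R[k], S[j]))
                k += 1
            j += 1
    return out
```

The repeated forward scans over already-seen intervals are exactly the cost that the paper's optimizations (and its mini-join scheduling) target.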


Sensors ◽  
2021 ◽  
Vol 21 (14) ◽  
pp. 4890
Author(s):  
Athanasios Dimitriadis ◽  
Christos Prassas ◽  
Jose Luis Flores ◽  
Boonserm Kulvatunyou ◽  
Nenad Ivezic ◽  
...  

Cyber threat information sharing is an imperative process towards achieving collaborative security, but it poses several challenges. One crucial challenge is the sheer volume of shared threat information. Therefore, there is a need to advance the filtering of such information. While the state-of-the-art in filtering relies primarily on keyword- and domain-based searching, these approaches require sizable human involvement and rarely available domain expertise. Recent research revealed the need to harvest business information to fill the gap in filtering, although using such information yielded only coarse-grained filtering. This paper presents a novel contextualized filtering approach that exploits standardized and multi-level contextual information of business processes. The contextual information describes the conditions under which a given piece of threat information is actionable from an organization's perspective. Therefore, it can automate filtering by measuring the equivalence between the context of the shared threat information and the context of the consuming organization. The paper directly contributes to the filtering challenge and indirectly to automated customized threat information sharing. Moreover, the paper proposes the architecture of a cyber threat information sharing ecosystem that operates according to the proposed filtering approach and defines the characteristics that are advantageous to filtering approaches. Implementation of the proposed approach can support compliance with Special Publication 800-150 of the National Institute of Standards and Technology.
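As a toy illustration of measuring the equivalence between a shared threat report's context and a consuming organization's context, one could score the overlap per context level. The dict structure and level names below are assumptions for illustration only, not the paper's standardized schema:

```python
def context_match(threat_ctx, org_ctx):
    """Toy contextual filter: average Jaccard overlap between the
    multi-level context of a shared threat report and that of the
    consuming organization. Both arguments map a context level
    (e.g. 'sector', 'process') to a set of tags; this structure is
    purely illustrative, not the paper's schema.
    """
    levels = set(threat_ctx) | set(org_ctx)
    if not levels:
        return 1.0  # no context constraints: trivially matching
    score = 0.0
    for lvl in levels:
        a = set(threat_ctx.get(lvl, ()))
        b = set(org_ctx.get(lvl, ()))
        if a | b:
            score += len(a & b) / len(a | b)
    return score / len(levels)
```

A consuming organization could then keep only reports whose score exceeds a threshold, automating the filtering step that keyword-based searching leaves to human analysts.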


2020 ◽  
Vol 34 (05) ◽  
pp. 8697-8704
Author(s):  
Pengjie Ren ◽  
Zhumin Chen ◽  
Christof Monz ◽  
Jun Ma ◽  
Maarten De Rijke

Background Based Conversations (BBCs) have been introduced to help conversational systems avoid generating overly generic responses. In a BBC, the conversation is grounded in a knowledge source. A key challenge in BBCs is Knowledge Selection (KS): given a conversational context, find the appropriate background knowledge (a text fragment containing related facts or comments, etc.) based on which to generate the next response. Previous work addresses KS by employing attention and/or pointer mechanisms. These mechanisms use a local perspective, i.e., they select a token at a time based solely on the current decoding state. We argue for the adoption of a global perspective, i.e., pre-selecting some text fragments from the background knowledge that could help determine the topic of the next response. We enhance KS in BBCs by introducing a Global-to-Local Knowledge Selection (GLKS) mechanism. Given a conversational context and background knowledge, we first learn a topic transition vector to encode the most likely text fragments to be used in the next response, which is then used to guide the local KS at each decoding timestamp. In order to effectively learn the topic transition vector, we propose a distantly supervised learning schema. Experimental results show that the GLKS model significantly outperforms state-of-the-art methods in terms of both automatic and human evaluation. More importantly, GLKS achieves this without requiring any extra annotations, which demonstrates its high degree of scalability.


2021 ◽  
Vol 8 (1) ◽  
Author(s):  
Asmaa El Hannani ◽  
Rahhal Errattahi ◽  
Fatima Zahra Salmam ◽  
Thomas Hain ◽  
Hassan Ouahmane

Abstract
Speech-based human-machine interaction and natural language understanding applications have seen rapid development and wide adoption over the last few decades. This has led to a proliferation of studies that investigate error detection and classification in Automatic Speech Recognition (ASR) systems. However, different data sets and evaluation protocols are used, making direct comparisons of the proposed approaches (e.g. features and models) difficult. In this paper we perform an extensive evaluation of the effectiveness and efficiency of state-of-the-art approaches in a unified framework for both error detection and error type classification. We make three primary contributions throughout this paper: (1) we have compared our Variant Recurrent Neural Network (V-RNN) model with three other state-of-the-art neural-based models, and have shown that the V-RNN model is the most effective classifier for ASR error detection in terms of accuracy and speed; (2) we have compared four feature settings, corresponding to different categories of predictor features, and have shown that the generic features are particularly suitable for real-time ASR error detection applications; and (3) we have examined the generalization ability of our error detection framework and performed a detailed post-detection analysis in order to understand the recognition errors that are difficult to detect.


2020 ◽  
Author(s):  
Charly Empereur-mot ◽  
Luca Pesce ◽  
Davide Bochicchio ◽  
Claudio Perego ◽  
Giovanni M. Pavan

We present Swarm-CG, a versatile software for the automatic parametrization of bonded parameters in coarse-grained (CG) models. By coupling state-of-the-art metaheuristics to Boltzmann inversion, Swarm-CG performs accurate parametrization of bonded terms in CG models composed of up to 200 pseudoatoms within 4-24 h on standard desktop machines, using an all-atom (AA) trajectory as reference and default settings of the software. The software benefits from a user-friendly interface and two different usage modes (default and advanced). We particularly expect Swarm-CG to support and facilitate the development of new CG models for the study of molecular systems interesting for bio- and nanotechnology. Excellent performances are demonstrated using a benchmark of 9 molecules of diverse nature, structural complexity, and size. Swarm-CG usage is ideal in combination with popular CG force fields, such as MARTINI. However, we anticipate that in principle its versatility makes it well suited for the optimization of models built on other CG schemes as well. Swarm-CG is available with all its dependencies via the Python Package Index (PIP package: swarm-cg). Tutorials and demonstration data are available at: www.github.com/GMPavanLab/SwarmCG.
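The Boltzmann-inversion starting point that Swarm-CG couples with metaheuristics derives an effective bonded potential from a reference distribution, U(x) = -kB·T·ln P(x). A minimal NumPy sketch of direct Boltzmann inversion follows; it is not Swarm-CG's implementation, and the bin count and unit choices are arbitrary:

```python
import numpy as np

KB = 0.008314462618  # Boltzmann constant in kJ/(mol*K)

def boltzmann_invert(samples, bins=50, temperature=300.0):
    """Direct Boltzmann inversion of a bonded degree of freedom.

    Given samples (e.g. bond lengths) from a reference trajectory, the
    effective potential is U(x) = -kB*T*ln P(x), shifted so that its
    minimum is zero. Empty histogram bins are dropped.
    """
    hist, edges = np.histogram(samples, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    mask = hist > 0
    u = -KB * temperature * np.log(hist[mask])
    return centers[mask], u - u.min()
```

The resulting tabulated potential is the kind of initial guess whose parameters a metaheuristic optimizer can then refine against the reference trajectory.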


2021 ◽  
Vol 9 ◽  
pp. 929-944
Author(s):  
Omar Khattab ◽  
Christopher Potts ◽  
Matei Zaharia

Abstract
Systems for Open-Domain Question Answering (OpenQA) generally depend on a retriever for finding candidate passages in a large corpus and a reader for extracting answers from those passages. In much recent work, the retriever is a learned component that uses coarse-grained vector representations of questions and passages. We argue that this modeling choice is insufficiently expressive for dealing with the complexity of natural language questions. To address this, we define ColBERT-QA, which adapts the scalable neural retrieval model ColBERT to OpenQA. ColBERT creates fine-grained interactions between questions and passages. We propose an efficient weak supervision strategy that iteratively uses ColBERT to create its own training data. This greatly improves OpenQA retrieval on Natural Questions, SQuAD, and TriviaQA, and the resulting system attains state-of-the-art extractive OpenQA performance on all three datasets.
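ColBERT's fine-grained interaction is the late-interaction MaxSim operator: each query token embedding is matched against its most similar passage token, and the per-token maxima are summed. A minimal NumPy sketch, assuming precomputed token embeddings (in the real model these come from BERT):

```python
import numpy as np

def maxsim_score(q_emb, p_emb):
    """ColBERT-style late interaction.

    q_emb: (num_query_tokens, d) array of query token embeddings.
    p_emb: (num_passage_tokens, d) array of passage token embeddings.
    Rows are L2-normalized, token-level cosine similarities are
    computed, and each query token contributes its best match.
    """
    q = q_emb / np.linalg.norm(q_emb, axis=1, keepdims=True)
    p = p_emb / np.linalg.norm(p_emb, axis=1, keepdims=True)
    sim = q @ p.T                 # (num_query_tokens, num_passage_tokens)
    return sim.max(axis=1).sum()  # MaxSim per query token, summed
```

Because passage embeddings can be indexed offline, this scoring stays scalable while remaining far more expressive than a single coarse-grained vector per passage.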


2018 ◽  
Vol 10 (10) ◽  
pp. 3398 ◽  
Author(s):  
Maria-Lizbeth Uriarte-Miranda ◽  
Santiago-Omar Caballero-Morales ◽  
Jose-Luis Martinez-Flores ◽  
Patricia Cano-Olivos ◽  
Anastasia-Alexandrovna Akulova

Management of tire waste is an important aspect of sustainable development due to its environmental, economic, and social impacts. Key aspects of Reverse Logistics (RL) and Green Logistics (GL), such as recycling, re-manufacturing, and reusable packaging, can improve the management of tire waste and support sustainability. Although these processes have been performed with a high degree of efficiency in other countries such as Japan, Spain, and Germany, their application in Mexico and Russia has faced setbacks due to the absence of guidelines regarding legislation, RL processes, and social responsibility. Within this context, the present work aims to develop an integrated RL model to improve on these processes by considering the RL models from Russia and Mexico. For this, a review focused on RL in Mexico, Russia, Japan, and the European Union (EU) was performed. Hence, the integrated model considers the regulations and policies of each country to assign responsibilities regarding RL processes for the management of tire waste. As discussed, the implementation of efficient RL processes for the management of tire waste depends on different social entities, such as the user (customer), private and public companies, and manufacturers, as well as on state-of-the-art approaches that transform waste into different products (diversification), so that the RL scheme can be considered a total economic system.


Arts ◽  
2020 ◽  
Vol 9 (4) ◽  
pp. 131
Author(s):  
Matthew Barr

The Star Wars films have probably spawned more video game adaptations than any other franchise. From the 1982 release of The Empire Strikes Back on the Atari 2600 to 2019’s Jedi: Fallen Order, around one hundred officially licensed Star Wars games have been published to date. Inevitably, the quality of these adaptations has varied, ranging from timeless classics such as Star Wars: Knights of the Old Republic, to such lamentable cash grabs as the Attack of the Clones movie tie-in. But what makes certain ludic adaptations of George Lucas’ space opera more successful than others? To answer this question, the critical response to some of the best-reviewed Star Wars games is analysed here, revealing a number of potential factors to consider, including the audio-visual quality of the games, the attendant story, and aspects of the gameplay. The tension between what constitutes a good game and what makes for a good Star Wars adaptation is also discussed. It is concluded that, while many well-received adaptations share certain characteristics—such as John Williams’ iconic score, a high degree of visual fidelity, and certain mythic story elements—the very best Star Wars games are those which advance the state of the art in video games, while simultaneously evoking something of Lucas’ cinematic saga.


1992 ◽  
Vol 283 ◽  
Author(s):  
C. Manfredotti ◽  
F. Fizzotti ◽  
G. Amato ◽  
L. Boarino ◽  
M. Abbas

ABSTRACT
Both B- and P-doped silicon films deposited by Low Pressure Chemical Vapor Deposition (LPCVD) at 300 °C (p-type) and 420 °C (n-type) have been characterized by optical absorption, Photothermal Deflection Spectroscopy (PDS), resistivity, Elastic Recoil Detection Analysis (ERDA), Transmission Electron Microscopy (TEM), Convergent-Beam Electron Diffraction (CBED), and Raman spectroscopy measurements. P-doped films, deposited at large PH3 flux rates, show a high degree of microcrystallinity, indicating that P activates the nucleation process even at low temperatures. In this case, values of the activation energy of resistivity as low as 0.007 eV were obtained. Both TEM and Raman results confirm a volume percentage of microcrystallinity above 30%. On the contrary, B-doped samples are not microcrystalline, at least in the doping range investigated, and show a behaviour not different from samples deposited by PECVD.


Author(s):  
Vijay Srivatsan ◽  
Bartosz Powałka ◽  
Reuven Katz ◽  
John Agapiou

This paper presents a methodology for the inspection of geometric features on an internal combustion engine valve seat. Inspection of valve seat geometry using a high-precision non-contact range sensor is investigated. A method is presented that extracts the cone angle, the valve seat length, and the roundness of the cone surface. In-line implementation requires a methodology to analyze data from a minimum number of parallel cross-sectional profiles of the valve seat. An in-line valve seat inspection prototype machine with two axes of motion, which utilizes this method, is also described. Validation of the method on several valve seat samples shows a high degree of repeatability, and the results are comparable to coordinate measurement machine measurements of the same samples.
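As an illustration of the kind of feature extraction involved, the cone angle could be estimated from a single cross-sectional profile by fitting a line to radius-versus-depth samples. This sketch is illustrative only, not the authors' exact procedure; the profile parametrization is an assumption:

```python
import math
import numpy as np

def cone_angle_deg(z, r):
    """Estimate the full cone angle of a conical surface from one
    cross-sectional profile.

    z: axial positions along the cone axis; r: measured radii of the
    seat surface at those positions. A line r = a*z + b is fit by
    least squares, and the full cone angle is twice the angle between
    the surface and the cone axis.
    """
    a, b = np.polyfit(z, r, 1)          # slope of radius vs. depth
    return 2.0 * math.degrees(math.atan(abs(a)))
```

In practice, several parallel profiles would be fit this way and combined, which is also how seat length and roundness deviations could be derived from the residuals.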

