Information diffusion, spreading of infectious diseases, and spreading of rumors are fundamental processes occurring in real-life networks. In many practical cases, one can observe when nodes become infected, but the underlying network, over which a contagion or information propagates, is hidden. Inferring properties of the underlying network is important since these properties can be used for constraining infections, forecasting, viral marketing, and so on. Moreover, for many applications, it is sufficient to recover only coarse high-level properties of this network rather than all its edges. This article conducts a systematic and extensive analysis of the following problem: Given only the infection times, find communities of highly interconnected nodes. This task significantly differs from the well-studied community detection problem since we do not observe a graph to be clustered. We carry out a thorough comparison between existing and new approaches on several large datasets and cover methodological challenges specific to this problem. One of the main conclusions is that the most stable performance and the most significant improvement on the current state-of-the-art are achieved by our proposed simple heuristic approaches agnostic to a particular graph structure and epidemic model. We also show that some well-known community detection algorithms can be enhanced by including edge weights based on the cascade data.
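As a rough illustration of how cascade data might be turned into edge weights, the sketch below counts how often two nodes become infected within a short time window of each other and then groups nodes by thresholding those counts. This is an assumption made for illustration, not the heuristics proposed in the article: the function names, the `window` and `min_weight` parameters, and the toy cascades are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def co_infection_weights(cascades, window):
    """Count, for each node pair, the cascades in which both nodes
    were infected within `window` time units of each other.

    `cascades` is a list of dicts mapping node -> infection time.
    """
    weights = defaultdict(float)
    for cascade in cascades:
        # Sorting by node gives every pair a canonical (u, v) order with u < v.
        for (u, tu), (v, tv) in combinations(sorted(cascade.items()), 2):
            if abs(tu - tv) <= window:
                weights[(u, v)] += 1.0
    return dict(weights)


def threshold_components(nodes, weights, min_weight):
    """Hypothetical clustering step: keep pairs whose weight is at least
    `min_weight` and return the connected components of what remains."""
    adjacency = {n: set() for n in nodes}
    for (u, v), w in weights.items():
        if w >= min_weight:
            adjacency[u].add(v)
            adjacency[v].add(u)
    seen, components = set(), []
    for start in sorted(nodes):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            component.add(node)
            stack.extend(adjacency[node] - seen)
        components.append(component)
    return components


# Toy cascades (made up): nodes a, b, c tend to be infected together, as do d, e, f.
cascades = [
    {"a": 0.0, "b": 1.0, "c": 2.0, "d": 10.0},
    {"a": 5.0, "b": 5.5, "c": 6.0},
    {"d": 0.0, "e": 1.0, "f": 1.5},
    {"d": 3.0, "e": 3.2, "f": 4.0, "a": 9.0},
]
weights = co_infection_weights(cascades, window=2.0)
communities = threshold_components("abcdef", weights, min_weight=2.0)
```

In practice the weights could instead be passed to a weighted community detection algorithm; connected components are used here only to keep the sketch self-contained.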
Several families of methods exist for categorizing images, most of which are statistical, geometric, model-based, or structural. In this paper, a new method for describing images based on complex network models is presented. Each image contains a number of key points that can be identified through standard edge detection algorithms, and these points can be used to build a graph of the image. To make the graphs easier to work with, they are generated in the form of small-world complex networks. Complex network features, such as topological and dynamic features, can then be used to represent image-related features. These features are normalized and used as input features for categorizing images. For this purpose, they are fed to a neural network, which is then used to compare new images. The results show that this method performs well in identifying similarities between images and, ultimately, in categorizing them.
With the rise of deep learning technology, salient object detection algorithms based on convolutional neural networks (CNNs) are gradually replacing traditional methods. The majority of existing studies, however, have focused on the integration of multi-scale features, thereby ignoring the characteristics of other significant features. To address this problem, we fully utilize the features to alleviate redundancy. In this paper, a novel CNN named local and global feature aggregation-aware network (LGFAN) is proposed. It combines a Visual Geometry Group (VGG) backbone for feature extraction, an attention module for high-quality feature filtering, and an aggregation module with a mechanism for rich salient features to ease the dilution process on the top-down pathway. Experimental results on five public datasets demonstrate that the proposed method improves computational efficiency while maintaining favorable performance.
Detection of partial discharge (PD) in switchgears requires extensive data collection and time-consuming analyses. Data from live operational environments pose great challenges in the development of robust and efficient detection algorithms due to overlapping PDs and the strong presence of random white noise. This paper presents a novel approach using clustering for data cleaning and feature extraction of phase-resolved partial discharge (PRPD) plots derived from live operational data. A total of 452 PRPD 2D plots collected from distribution substations over a six-month period were used to test the proposed technique. The output of the clustering technique is evaluated with different types of machine learning classifiers, and their accuracy is compared using the balanced accuracy score. The proposed technique extends the measurement abilities of a portable PD measurement tool for diagnostics of switchgear condition, helping utilities to quickly detect potential PD activities with minimal manual analysis and higher accuracy.
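The balanced accuracy score mentioned above is the mean of the per-class recalls, which keeps a majority class from dominating the score. The following is a minimal sketch under that standard definition; the class labels and counts in the example are made up for illustration and do not come from the paper.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, prediction in zip(y_true, y_pred):
        total[truth] += 1
        if truth == prediction:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical imbalanced example: 2 "pd" plots and 8 "noise" plots.
y_true = ["pd"] * 2 + ["noise"] * 8
y_pred = ["noise"] * 10  # a classifier that never predicts "pd"

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)
```

Here the plain accuracy is 0.8 even though every "pd" example is missed, while the balanced accuracy is 0.5.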
In recent years, face detection has received considerable attention in the field of computer vision, using both traditional machine learning techniques and deep learning techniques. The most recent and powerful face detection algorithms are built on deep learning. However, partial face detection has yet to achieve remarkable performance. Partial faces are occluded by hair, hats, glasses, hands, or mobile phones, or appear in images captured from a side angle, so fewer facial features can be identified from such images. In this paper, we present a deep convolutional neural network face detection method using an anchor box selection strategy. We limited the number of anchor boxes and scales and chose only those relevant to the face shape. The proposed model was trained and tested on a popular and challenging face detection benchmark dataset, the Face Detection Dataset and Benchmark (FDDB), and can also detect partially covered faces with better accuracy and precision. Extensive experiments were performed, with evaluation metrics including accuracy, precision, recall, F1 score, inference time, and FPS. The results show that the proposed model is able to detect the face in the image, including occluded features, more precisely than other state-of-the-art approaches, achieving 94.8% accuracy and 98.7% precision on the FDDB dataset at 21 frames per second (FPS).
Recognition of anomalous events is a challenging but critical task in many scientific and industrial fields, especially when the properties of anomalies are unknown. In this paper, we introduce a new anomaly concept called “unicorn” or unique event and present a new, model-free, unsupervised detection algorithm to detect unicorns. The key component of the new algorithm is the Temporal Outlier Factor (TOF) to measure the uniqueness of events in continuous data sets from dynamic systems. The concept of unique events differs significantly from traditional outliers in many aspects: while repetitive outliers are no longer unique events, a unique event is not necessarily an outlier; it does not necessarily fall out from the distribution of normal activity. The performance of our algorithm was examined in recognizing unique events on different types of simulated data sets with anomalies and it was compared with the Local Outlier Factor (LOF) and discord discovery algorithms. TOF had superior performance compared to LOF and discord detection algorithms even in recognizing traditional outliers and it also detected unique events that those did not. The benefits of the unicorn concept and the new detection method were illustrated by example data sets from very different scientific fields. Our algorithm successfully retrieved unique events in those cases where they were already known such as the gravitational waves of a binary black hole merger on LIGO detector data and the signs of respiratory failure on ECG data series. Furthermore, unique events were found on the LIBOR data set of the last 30 years.
The web has evolved, and the industry strives to work better every day. The need for data to be accessible at any moment keeps growing, and with it the need for a meaningful query technique on the web has become a major concern. To transmit meaningful data or rich semantics, machines need the ability to reach the correct information and make adequate connections. This problem has been addressed since the emergence of Web 3.0: the semantic web is developing, and an immense amount of information is being collected and must be processed, which poses a big data management challenge if an ideal result is to be provided whenever it is needed. Accordingly, in this article, we present a system for managing huge volumes of information using MapReduce, which internally helps an engine retrieve information by splitting the processing into smaller map jobs and applying link discovery measures. Similarity computation can be challenging; this work runs five similarity detection algorithms and measures the time each takes, in order to determine which is the better choice. The proposed framework is built on a recent and widespread data format, JSON, and uses the HIVE query language to retrieve and process the data according to the user's needs, together with algorithms for link discovery. Finally, the results are made available on a web page that lets a user load JSON data and create links between dataset 1 and dataset 2. The results are examined on two different sets and show that the proposed approach interlinks significantly faster; regardless of how large the data is, the time taken does not increase drastically. The results also demonstrate that the interlinking of dataset 1 and dataset 2 is most effective using LD and JW, with favorable running time for both algorithms. This paper thus automates the interlinking process via a web page, where users can merge two sets of data that should be linked and used together.
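The abstract does not expand the abbreviation LD. Assuming it refers to the Levenshtein (edit) distance, a string similarity measure of that kind could be sketched as follows. The normalization to a score in [0, 1], the `link_records` helper, its `threshold` parameter, and the example records are hypothetical and only illustrate how two datasets might be linked by name similarity.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def similarity(a, b):
    """Normalize the edit distance to a score in [0, 1]; 1 means identical."""
    if not a and not b:
        return 1.0
    return 1.0 - levenshtein(a, b) / max(len(a), len(b))


def link_records(dataset_1, dataset_2, threshold=0.8):
    """Hypothetical linking step: pair up entries whose similarity
    reaches the threshold."""
    return [(x, y) for x in dataset_1 for y in dataset_2
            if similarity(x.lower(), y.lower()) >= threshold]


# Made-up records for illustration.
links = link_records(["Berlin", "Munich"], ["berlin", "Muenich", "Paris"])
```

The nested loop compares every pair and is only suitable for small inputs; a distributed setting such as the one described above would partition these comparisons across workers.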
In this paper, we analyze a massive dataset with registers of the movement of vehicles in the bus rapid transit system Metrobús in Mexico City from February 2020 to April 2021. With these records and a division of the system into 214 geographical regions (segments), we characterize the vehicles’ activity through the statistical analysis of speeds in each zone. We use the Kullback–Leibler distance to compare the movement of vehicles in each segment and its evolution. The results for the dynamics in different zones are represented as a network where nodes define segments of the system Metrobús and edges describe similarity in the activity of vehicles. Community detection algorithms in this network allow the identification of patterns considering different levels of similarity in the distribution of speeds providing a framework for unsupervised classification of the movement of vehicles. The methods developed in this research are general and can be implemented to describe the activity of different transportation systems with detailed records of the movement of users or vehicles.
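To make the comparison of speed distributions concrete, the sketch below bins the speeds recorded in two segments into histograms and computes the Kullback–Leibler divergence between them. The bin layout, the small smoothing constant `eps` (added so empty bins do not produce a division by zero), and the function names are assumptions for illustration; the abstract does not specify these details.

```python
import math


def speed_histogram(speeds, lo, hi, n_bins):
    """Count speeds falling into n_bins equal-width bins on [lo, hi]."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for speed in speeds:
        if lo <= speed <= hi:
            # Clamp so that speed == hi falls into the last bin.
            counts[min(int((speed - lo) / width), n_bins - 1)] += 1
    return counts


def kl_divergence(p_counts, q_counts, eps=1e-9):
    """KL divergence D(P || Q) between two histograms, in nats.

    eps is a smoothing constant (an assumption of this sketch).
    """
    p = [count + eps for count in p_counts]
    q = [count + eps for count in q_counts]
    p_total, q_total = sum(p), sum(q)
    return sum((pi / p_total) * math.log((pi / p_total) / (qi / q_total))
               for pi, qi in zip(p, q))


# Made-up speed samples (km/h) for two segments.
segment_a = speed_histogram([5, 12, 18, 22, 25, 31], lo=0, hi=40, n_bins=4)
segment_b = speed_histogram([2, 4, 7, 9, 15, 33], lo=0, hi=40, n_bins=4)
divergence = kl_divergence(segment_a, segment_b)
```

The divergence is not symmetric in its two arguments. If a symmetric similarity is needed for network edges, one option (again an assumption) is to average `kl_divergence(a, b)` and `kl_divergence(b, a)`.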
In low-resolution wide-area aerial imagery, object detection algorithms are categorized as feature extraction and machine learning approaches, where the former often requires a post-processing scheme to reduce false detections and the latter demands multi-stage learning followed by post-processing. In this paper, we present an approach for selecting post-processing schemes for aerial object detection. We evaluated combinations of each of ten vehicle detection algorithms with each of seven post-processing schemes, where the best three schemes for each algorithm were determined using the average F-score metric. The performance improvement is quantified using basic information retrieval metrics as well as the classification of events, activities and relationships (CLEAR) metrics. We also implemented a two-stage learning algorithm using a hundred-layer densely connected convolutional neural network for small object detection and evaluated its degree of improvement when combined with the various post-processing schemes. The highest average F-scores after post-processing are 0.902, 0.704 and 0.891 for the Tucson, Phoenix and online VEDAI datasets, respectively. The combined results show that our enhanced three-stage post-processing scheme achieves a mean average precision (mAP) of 63.9% for feature extraction methods and 82.8% for the machine learning approach.
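The F-score used to rank the post-processing schemes combines precision and recall, both of which are computed from counts of true positives, false positives, and false negatives. The sketch below uses the standard definition; the detection counts are hypothetical and only illustrate how removing false detections can raise the score.

```python
def f_score(tp, fp, fn, beta=1.0):
    """F-beta score from detection counts; beta=1 gives the usual F1."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


# Hypothetical counts: post-processing removes many false detections
# at the cost of losing a few true ones.
before = f_score(tp=90, fp=60, fn=10)  # precision 0.60, recall 0.90
after = f_score(tp=88, fp=12, fn=12)   # precision 0.88, recall 0.88
```

With these made-up counts the score rises from about 0.72 to 0.88, even though recall drops slightly.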