ShadingNet: Image Intrinsics by Fine-Grained Shading Decomposition

Author(s):  
Anil S. Baslamisli ◽  
Partha Das ◽  
Hoang-An Le ◽  
Sezer Karaoglu ◽  
Theo Gevers

In general, intrinsic image decomposition algorithms interpret shading as one unified component including all photometric effects. As shading transitions are generally smoother than reflectance (albedo) changes, these methods may fail to distinguish strong photometric effects from reflectance variations. Therefore, in this paper, we propose to decompose the shading component into direct shading (illumination) and indirect shading (ambient light and shadows) subcomponents. The aim is to distinguish strong photometric effects from reflectance variations. An end-to-end deep convolutional neural network (ShadingNet) is proposed that operates in a fine-to-coarse manner with a specialized fusion and refinement unit exploiting the fine-grained shading model. It is designed to learn specific reflectance cues separated from specific photometric effects to analyze its disentanglement capability. A large-scale dataset of scene-level synthetic images of outdoor natural environments is provided with fine-grained intrinsic image ground truths. Large-scale experiments show that our approach using fine-grained shading decompositions outperforms state-of-the-art algorithms utilizing unified shading on the NED, MPI Sintel, GTA V, IIW, MIT Intrinsic Images, 3DRMS and SRD datasets.
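As a minimal sketch of the fine-grained shading model underlying this decomposition (assuming the common formulation in which an image is the albedo multiplied by the sum of direct and indirect shading; array names and shapes are illustrative, not the authors' code):

```python
import numpy as np

# Minimal sketch of the assumed fine-grained intrinsic image model:
# an image is reconstructed as albedo * (direct shading + indirect shading).

def compose_image(albedo, direct_shading, indirect_shading):
    """Reconstruct an image from fine-grained intrinsic components."""
    return albedo * (direct_shading + indirect_shading)

def unified_shading(direct_shading, indirect_shading):
    """The single 'unified' shading a conventional decomposition would predict."""
    return direct_shading + indirect_shading

# Toy example with random components (H x W x 3 albedo, H x W x 1 shadings).
H, W = 4, 4
albedo = np.random.rand(H, W, 3)
direct = np.random.rand(H, W, 1)          # direct illumination
indirect = 0.2 * np.random.rand(H, W, 1)  # ambient light and shadows

image = compose_image(albedo, direct, indirect)
assert np.allclose(image, albedo * unified_shading(direct, indirect))
```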

2021 ◽  
Vol 7 (3) ◽  
pp. 50
Author(s):  
Anselmo Ferreira ◽  
Ehsan Nowroozi ◽  
Mauro Barni

The possibility of carrying out a meaningful forensic analysis on printed and scanned images plays a major role in many applications. First of all, printed documents are often associated with criminal activities, such as terrorist plans, child pornography, and even fake packages. Additionally, printing and scanning can be used to hide the traces of image manipulation or the synthetic nature of images, since the artifacts commonly found in manipulated and synthetic images are gone after the images are printed and scanned. A problem hindering research in this area is the lack of large-scale reference datasets to be used for algorithm development and benchmarking. Motivated by this issue, we present a new dataset composed of a large number of synthetic and natural printed face images. To highlight the difficulties associated with the analysis of the images of the dataset, we carried out an extensive set of experiments comparing several printer attribution methods. We also verified that state-of-the-art methods for distinguishing natural and synthetic face images fail when applied to printed and scanned images. We envision that the availability of the new dataset and the preliminary experiments we carried out will motivate and facilitate further research in this area.


2021 ◽  
Vol 40 (3) ◽  
pp. 1-13
Author(s):  
Lumin Yang ◽  
Jiajie Zhuang ◽  
Hongbo Fu ◽  
Xiangzhi Wei ◽  
Kun Zhou ◽  
...  

We introduce SketchGNN, a convolutional graph neural network for semantic segmentation and labeling of freehand vector sketches. We treat an input stroke-based sketch as a graph with nodes representing the sampled points along input strokes and edges encoding the stroke structure information. To predict the per-node labels, our SketchGNN uses graph convolution and a static-dynamic branching network architecture to extract features at three levels, i.e., point-level, stroke-level, and sketch-level. SketchGNN significantly improves the accuracy of the state-of-the-art methods for semantic sketch segmentation (by 11.2% in the pixel-based metric and 18.2% in the component-based metric on the large-scale challenging SPG dataset) and has orders of magnitude fewer parameters than both image-based and sequence-based methods.
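A rough sketch of the graph construction and a single graph-convolution step, under the assumption that nodes are points sampled along strokes and edges link consecutive points of the same stroke; this is illustrative, not the SketchGNN implementation:

```python
import numpy as np

# Illustrative sketch: a stroke-based sketch as a graph whose nodes are sampled
# points, with chain edges along each stroke, plus one normalized graph convolution.

def stroke_graph(strokes):
    """strokes: list of (n_i, 2) point arrays. Returns node features X and adjacency A."""
    X = np.concatenate(strokes, axis=0)            # point coordinates as node features
    n = X.shape[0]
    A = np.eye(n)                                  # self-loops
    offset = 0
    for s in strokes:
        for i in range(len(s) - 1):                # chain edges along each stroke
            a, b = offset + i, offset + i + 1
            A[a, b] = A[b, a] = 1.0
        offset += len(s)
    return X, A

def graph_conv(X, A, W):
    """One GCN-style layer: ReLU(D^-1/2 A D^-1/2 X W)."""
    d = A.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(D_inv_sqrt @ A @ D_inv_sqrt @ X @ W, 0.0)

strokes = [np.random.rand(5, 2), np.random.rand(7, 2)]
X, A = stroke_graph(strokes)
H1 = graph_conv(X, A, np.random.randn(2, 16))      # point-level features
print(H1.shape)                                    # (12, 16)
```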


Author(s):  
Yuheng Hu ◽  
Yili Hong

Residents often rely on newspapers and television to gather hyperlocal news for community awareness and engagement. More recently, social media have emerged as an increasingly important source of hyperlocal news. Thus far, the literature on using social media to create desirable societal benefits, such as civic awareness and engagement, is still in its infancy. One key challenge in this research stream is to timely and accurately distill information from noisy social media data streams to community members. In this work, we develop SHEDR (social media–based hyperlocal event detection and recommendation), an end-to-end neural event detection and recommendation framework with a particular use case for Twitter to facilitate residents' information seeking of hyperlocal events. The key model innovation in SHEDR lies in the design of the hyperlocal event detector and the event recommender. First, we harness the power of two popular deep neural network models, the convolutional neural network (CNN) and long short-term memory (LSTM), in a novel joint CNN-LSTM model to characterize spatiotemporal dependencies for capturing unusualness in a region of interest, which is classified as a hyperlocal event. Next, we develop a neural pairwise ranking algorithm for recommending detected hyperlocal events to residents based on their interests. To alleviate the sparsity issue and improve personalization, our algorithm incorporates several types of contextual information covering topical, social, and geographical proximities. We perform comprehensive evaluations based on two large-scale data sets comprising geotagged tweets covering Seattle and Chicago. We demonstrate the effectiveness of our framework in comparison with several state-of-the-art approaches. We show that our hyperlocal event detection and recommendation models consistently and significantly outperform other approaches in terms of precision, recall, and F1 scores.

Summary of Contribution: In this paper, we focus on a novel and important, yet largely underexplored application of computing: how to improve civic engagement in local neighborhoods via local news sharing and consumption based on social media feeds. To address this question, we propose two new computational and data-driven methods: (1) a deep learning–based hyperlocal event detection algorithm that scans spatially and temporally to detect hyperlocal events from geotagged Twitter feeds; and (2) a personalized deep learning–based hyperlocal event recommender system that systematically integrates several contextual cues, such as topical, geographical, and social proximity, to recommend the detected hyperlocal events to potential users. We conduct a series of experiments to examine our proposed models. The outcomes demonstrate that our algorithms are significantly better than the state-of-the-art models and can provide users with more relevant information about the local neighborhoods that they live in, which in turn may boost their community engagement.
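A hypothetical sketch of the pairwise ranking idea for the recommender, assuming a score that combines a latent user-event match with topical, social, and geographical proximities; the weights and feature names are assumptions, not SHEDR's actual design:

```python
import numpy as np

# Hypothetical sketch: score a (user, event) pair from a latent match term plus
# contextual proximities, and train with a pairwise (BPR-style) loss that prefers
# an observed event over an unobserved one.

def event_score(user_vec, event_vec, topic_prox, social_prox, geo_prox,
                w=(1.0, 0.5, 0.3, 0.2)):
    latent = user_vec @ event_vec
    return w[0] * latent + w[1] * topic_prox + w[2] * social_prox + w[3] * geo_prox

def pairwise_ranking_loss(score_pos, score_neg):
    """Negative log-sigmoid of the margin between a positive and a negative event."""
    return -np.log(1.0 / (1.0 + np.exp(-(score_pos - score_neg))))

u = np.random.rand(8)
e_pos, e_neg = np.random.rand(8), np.random.rand(8)
s_pos = event_score(u, e_pos, topic_prox=0.8, social_prox=0.6, geo_prox=0.9)
s_neg = event_score(u, e_neg, topic_prox=0.1, social_prox=0.0, geo_prox=0.2)
print(pairwise_ranking_loss(s_pos, s_neg))
```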


2020 ◽  
Vol 34 (07) ◽  
pp. 11604-11611 ◽  
Author(s):  
Qiao Liu ◽  
Xin Li ◽  
Zhenyu He ◽  
Nana Fan ◽  
Di Yuan ◽  
...  

Existing deep Thermal InfraRed (TIR) trackers usually use the feature models of RGB trackers for representation. However, these feature models learned on RGB images are neither effective in representing TIR objects nor able to take fine-grained TIR information into consideration. To this end, we develop a multi-task framework to learn the TIR-specific discriminative features and fine-grained correlation features for TIR tracking. Specifically, we first use an auxiliary classification network to guide the generation of TIR-specific discriminative features for distinguishing TIR objects belonging to different classes. Second, we design a fine-grained aware module to capture more subtle information for distinguishing TIR objects belonging to the same class. These two kinds of features complement each other and recognize TIR objects at the inter-class and intra-class levels, respectively. The two feature models are learned using a multi-task matching framework and are jointly optimized on the TIR tracking task. In addition, we develop a large-scale TIR training dataset to train the network for adapting the model to the TIR domain. Extensive experimental results on three benchmarks show that the proposed algorithm achieves a relative gain of 10% over the baseline and performs favorably against the state-of-the-art methods. Code and the proposed TIR dataset are available at https://github.com/QiaoLiuHit/MMNet.
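The following is an illustrative sketch, not the released MMNet code, of a multi-task objective in which a classification loss (for TIR-specific discriminative features) and a matching loss (for fine-grained correlation features) are jointly minimized; the loss weighting is an assumption:

```python
import numpy as np

# Illustrative multi-task objective: an auxiliary classification loss and a
# matching loss over template/search features are combined and jointly minimized.

def cross_entropy(logits, label):
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return -np.log(p[label])

def matching_loss(feat_template, feat_search, y):
    """Hinge-style loss: y=1 for the same object, y=-1 otherwise."""
    sim = feat_template @ feat_search / (
        np.linalg.norm(feat_template) * np.linalg.norm(feat_search))
    return max(0.0, 1.0 - y * sim)

def multi_task_loss(cls_logits, cls_label, feat_t, feat_s, match_label, lam=0.5):
    # lam balances the two tasks; its value here is an assumption.
    return cross_entropy(cls_logits, cls_label) + lam * matching_loss(feat_t, feat_s, match_label)

loss = multi_task_loss(np.random.randn(10), 3, np.random.rand(64), np.random.rand(64), 1)
print(loss)
```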


Author(s):  
Chenggang Yan ◽  
Tong Teng ◽  
Yutao Liu ◽  
Yongbing Zhang ◽  
Haoqian Wang ◽  
...  

The difficulty of no-reference image quality assessment (NR IQA) often lies in the lack of knowledge about the distortion in the image, which makes quality assessment blind and thus inefficient. To tackle this issue, in this article, we propose a novel scheme for precise NR IQA, which includes two successive steps, i.e., distortion identification and targeted quality evaluation. In the first step, we employ the well-known Inception-ResNet-v2 neural network to train a classifier that classifies the possible distortion in the image into the four most common distortion types, i.e., Gaussian white noise (WN), Gaussian blur (GB), JPEG compression (JPEG), and JPEG 2000 compression (JP2K). Specifically, the deep neural network is trained on the large-scale Waterloo Exploration database, which ensures the robustness and high performance of distortion classification. In the second step, after determining the distortion type of the image, we design a distortion-specific approach to quantify the image distortion level, which estimates the image quality more precisely. Extensive experiments performed on the LIVE, TID2013, CSIQ, and Waterloo Exploration databases demonstrate that (1) the accuracy of our distortion classification is higher than that of the state-of-the-art distortion classification methods, and (2) the proposed NR IQA method outperforms the state-of-the-art NR IQA methods in quantifying the image quality.
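A sketch of the two-step pipeline as described, with placeholder functions standing in for the trained Inception-ResNet-v2 classifier and the four distortion-specific quality estimators:

```python
# Two-step NR IQA sketch: predict the distortion type, then dispatch to a
# distortion-specific quality estimator. All functions below are placeholders.

def classify_distortion(image):
    """Placeholder for the Inception-ResNet-v2 distortion classifier."""
    return "WN"  # e.g., Gaussian white noise

def estimate_quality_wn(image):
    return 0.0   # placeholder score for Gaussian white noise

def estimate_quality_gb(image):
    return 0.0   # placeholder score for Gaussian blur

def estimate_quality_jpeg(image):
    return 0.0   # placeholder score for JPEG compression

def estimate_quality_jp2k(image):
    return 0.0   # placeholder score for JPEG 2000 compression

ESTIMATORS = {
    "WN": estimate_quality_wn,
    "GB": estimate_quality_gb,
    "JPEG": estimate_quality_jpeg,
    "JP2K": estimate_quality_jp2k,
}

def assess_quality(image):
    """Identify the distortion, then run the matching targeted estimator."""
    return ESTIMATORS[classify_distortion(image)](image)
```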


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-12
Author(s):  
Kongfan Zhu ◽  
Rundong Guo ◽  
Weifeng Hu ◽  
Zeqiang Li ◽  
Yujun Li

Legal judgment prediction (LJP), as an effective and critical application in legal assistant systems, aims to determine the judgment results from the information established during fact determination. In real-world scenarios, when dealing with criminal cases, judges not only rely on the fact description but also consider external information, such as the defendant's basic information and the court view. However, most existing works take the fact description as the sole input for LJP and ignore the external information. We propose a Transformer-Hierarchical-Attention-Multi-Extra (THME) Network to make full use of the information involved in fact determination. We conduct experiments on a real-world large-scale dataset of criminal cases in the civil law system. Experimental results show that our method outperforms state-of-the-art LJP methods on all judgment prediction tasks.
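A hypothetical sketch of the fusion idea: attend over encoded external sources with the fact representation as the query, then concatenate for prediction. Dimensions and names are assumptions, not the THME architecture:

```python
import numpy as np

# Hypothetical sketch: fuse the encoded fact description with attention-weighted
# external information (defendant profile, court view, ...) before prediction.

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse_fact_with_extras(fact_vec, extra_vecs):
    """fact_vec: (d,), extra_vecs: (m, d). Returns a fused (2d,) representation."""
    attn = softmax(extra_vecs @ fact_vec)          # attention weights over extra sources
    context = attn @ extra_vecs                    # weighted sum of external information
    return np.concatenate([fact_vec, context])

fact = np.random.rand(128)                         # encoded fact description
extras = np.random.rand(3, 128)                    # defendant info, court view, etc.
fused = fuse_fact_with_extras(fact, extras)
print(fused.shape)                                 # (256,)
```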


2020 ◽  
Vol 17 (6) ◽  
pp. 172988142096696
Author(s):  
Jie Niu ◽  
Kun Qian

In this work, we propose a robust place recognition method for natural environments based on salient landmark screening and convolutional neural network (CNN) features. First, the salient objects in the image are segmented as candidate landmarks. Then, a category screening network is designed to remove specific object types that are not suitable for environmental modeling. Finally, a three-layer CNN is used to obtain highly representative features of the salient landmarks. For similarity measurement, a Siamese network is chosen to calculate the similarity between images. Experiments were conducted on three challenging benchmark place recognition datasets, and superior performance was achieved compared to other state-of-the-art methods, including FABMAP, SeqSLAM, SeqCNNSLAM, and PlaceCNN. Our method obtains the best results on the precision–recall curves, and its average precision reaches 78.43%, the best among the compared methods. This demonstrates that CNN features computed on the screened salient landmarks are robust against strong viewpoint and condition variations.
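An illustrative sketch of the matching step, where cosine similarity between landmark CNN features stands in for the learned Siamese metric; the aggregation rule is an assumption:

```python
import numpy as np

# Illustrative sketch: score place similarity from CNN features of the screened
# salient landmarks in two images (cosine similarity replaces the Siamese network).

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def place_similarity(landmarks_a, landmarks_b):
    """landmarks_*: (n_i, d) landmark features; average the best match in each direction."""
    sims = np.array([[cosine(a, b) for b in landmarks_b] for a in landmarks_a])
    return 0.5 * (sims.max(axis=1).mean() + sims.max(axis=0).mean())

query = np.random.rand(4, 256)       # landmark features from the query image
reference = np.random.rand(5, 256)   # landmark features from a map image
print(place_similarity(query, reference))
```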


Sensors ◽  
2020 ◽  
Vol 20 (11) ◽  
pp. 3305 ◽  
Author(s):  
Huogen Wang ◽  
Zhanjie Song ◽  
Wanqing Li ◽  
Pichao Wang

The paper presents a novel hybrid network for large-scale action recognition from multiple modalities. The network is built upon the proposed weighted dynamic images. It effectively leverages the strengths of the emerging Convolutional Neural Network (CNN) and Recurrent Neural Network (RNN) based approaches to specifically address the challenges that occur in large-scale action recognition and are not fully dealt with by the state-of-the-art methods. Specifically, the proposed hybrid network consists of a CNN based component and an RNN based component. Features extracted by the two components are fused through canonical correlation analysis and then fed to a linear Support Vector Machine (SVM) for classification. The proposed network achieved state-of-the-art results on the ChaLearn LAP IsoGD, NTU RGB+D and Multi-modal & Multi-view & Interactive (M2I) datasets and outperformed existing methods by a large margin (over 10 percentage points in some cases).
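A sketch of the fusion-and-classification stage under the assumption that per-clip CNN and RNN features have already been extracted; feature dimensions and the number of CCA components are illustrative:

```python
import numpy as np
from sklearn.cross_decomposition import CCA
from sklearn.svm import LinearSVC

# Sketch: fuse CNN-branch and RNN-branch features with canonical correlation
# analysis, then classify actions with a linear SVM.

rng = np.random.default_rng(0)
n_clips = 200
cnn_feats = rng.normal(size=(n_clips, 512))    # stand-in CNN-branch features
rnn_feats = rng.normal(size=(n_clips, 256))    # stand-in RNN-branch features
labels = rng.integers(0, 10, size=n_clips)     # action classes

cca = CCA(n_components=64)
cnn_proj, rnn_proj = cca.fit_transform(cnn_feats, rnn_feats)
fused = np.concatenate([cnn_proj, rnn_proj], axis=1)   # fused representation

clf = LinearSVC().fit(fused, labels)
print(clf.predict(fused[:5]))
```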


2021 ◽  
Author(s):  
Mohsen Rezvani ◽  
Mojtaba Rezvani

Recent studies have shown that social networks exhibit interesting characteristics such as community structures, i.e., vertices can be clustered into communities that are densely connected internally and loosely connected to other vertices. In order to identify communities, several definitions have been proposed that can characterize the density of connections among vertices in the networks. Dense triangle cores, also known as $k$-trusses, are subgraphs in which every edge participates in at least $k-2$ triangles (cliques of size 3), exhibiting a high degree of cohesiveness among vertices. There are a number of research works that propose $k$-truss decomposition algorithms. However, existing in-memory algorithms for computing $k$-trusses are inefficient for handling today's massive networks. In this paper, we propose an efficient, scalable algorithm for finding $k$-trusses in a large-scale network. To this end, we propose a new structure, called the triangle graph, to speed up the process of finding $k$-trusses, and we prove the correctness and efficiency of our method. We also evaluate the performance of the proposed algorithms through extensive experiments using real-world networks. The results of comprehensive experiments show that the proposed algorithms outperform the state-of-the-art methods by several orders of magnitude in running time.
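For illustration, a simple support-based peeling routine for $k$-trusses (not the paper's triangle-graph algorithm): edges whose support falls below $k-2$ are repeatedly removed:

```python
from itertools import combinations

# Illustrative k-truss peeling: remove edges whose support (number of triangles
# containing them) is below k - 2 until every remaining edge meets the threshold.

def k_truss(edges, k):
    edges = {frozenset(e) for e in edges}
    changed = True
    while changed:
        changed = False
        adj = {}
        for e in edges:
            u, v = tuple(e)
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
        for e in list(edges):
            u, v = tuple(e)
            support = len(adj[u] & adj[v])   # common neighbours = triangles on (u, v)
            if support < k - 2:
                edges.remove(e)
                changed = True
    return edges

# A 4-clique is a 4-truss; the pendant edge (3, 4) is peeled away.
g = list(combinations(range(4), 2)) + [(3, 4)]
print(sorted(tuple(sorted(e)) for e in k_truss(g, 4)))
```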

