Separate in Latent Space: Unsupervised Single Image Layer Separation

2020 ◽  
Vol 34 (07) ◽  
pp. 11661-11668 ◽  
Author(s):  
Yunfei Liu ◽  
Feng Lu

Many real-world vision tasks, such as reflection removal from a transparent surface and intrinsic image decomposition, can be modeled as single image layer separation. However, this problem is highly ill-posed, requiring accurately aligned and hard-to-collect triplet data to train CNN models. To address this problem, this paper proposes an unsupervised method that requires no ground-truth data triplets for training. At the core of the method are two assumptions about data distributions in the latent spaces of different layers, from which a novel unsupervised layer separation pipeline can be derived. The method is then constructed on the GAN framework with self-supervision and cycle-consistency constraints, among others. Experimental results demonstrate that it outperforms existing unsupervised methods on both synthetic and real-world tasks. The method also shows its ability to solve a more challenging multi-layer separation task.
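As a rough illustration of the cycle-consistency idea mentioned in this abstract, the sketch below (PyTorch; the `separator` and `mixer` networks are hypothetical placeholders, not the authors' architecture or loss) shows how a mixed image can be split into two layers and re-composed, with an L1 penalty tying the reconstruction back to the input so that no ground-truth layers are needed.

```python
import torch
import torch.nn.functional as F

def cycle_consistency_loss(separator, mixer, mixed_image: torch.Tensor) -> torch.Tensor:
    """Illustrative two-layer separation cycle loss (not the paper's exact formulation).

    separator: network mapping a mixed image to two predicted layers.
    mixer:     network (or fixed operator) re-composing two layers into an image.
    """
    layer_a, layer_b = separator(mixed_image)   # e.g. transmission and reflection layers
    reconstructed = mixer(layer_a, layer_b)     # re-compose the predicted layers
    # Self-supervised constraint: the recomposed layers should match the input image.
    return F.l1_loss(reconstructed, mixed_image)
```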

Author(s):  
Risheng Liu ◽  
Zhiying Jiang ◽  
Xin Fan ◽  
Haojie Li ◽  
Zhongxuan Luo

2021 ◽  
Vol 14 (6) ◽  
pp. 997-1005
Author(s):  
Sandeep Tata ◽  
Navneet Potti ◽  
James B. Wendt ◽  
Lauro Beltrão Costa ◽  
Marc Najork ◽  
...  

Extracting structured information from templatic documents is an important problem with the potential to automate many real-world business workflows such as payment, procurement, and payroll. The core challenge is that such documents can be laid out in virtually infinitely many ways. A good solution to this problem is one that generalizes well not only to known templates such as invoices from a known vendor, but also to unseen ones. We developed a system called Glean to tackle this problem. Given a target schema for a document type and some labeled documents of that type, Glean uses machine learning to automatically extract structured information from other documents of that type. In this paper, we describe the overall architecture of Glean and discuss three key data management challenges: 1) managing the quality of ground truth data, 2) generating training data for the machine learning model using labeled documents, and 3) building tools that help a developer rapidly build and improve a model for a given document type. Through empirical studies on a real-world dataset, we show that these data management techniques allow us to train a model that is over 5 F1 points better than the exact same model architecture without the techniques we describe. We argue that for such information-extraction problems, designing abstractions that carefully manage the training data is at least as important as choosing a good model architecture.
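For context on the "5 F1 points" figure, extraction quality in such systems is typically scored per field against ground-truth labels. The minimal sketch below uses a hypothetical flat (doc_id, field, value) representation, not Glean's actual schema or evaluation code, to compute a field-level F1 score.

```python
def field_level_f1(predictions: set, ground_truth: set) -> float:
    """Score extracted (doc_id, field_name, normalized_value) triples against labels.

    This is a generic field-level metric, not Glean's internal evaluation pipeline.
    """
    true_positives = len(predictions & ground_truth)
    precision = true_positives / len(predictions) if predictions else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Example: one invoice with two labeled fields, one of which is extracted correctly.
pred = {("doc1", "invoice_date", "2021-03-01"), ("doc1", "total", "97.20")}
gold = {("doc1", "invoice_date", "2021-03-01"), ("doc1", "total", "98.20")}
print(field_level_f1(pred, gold))  # 0.5
```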


Author(s):  
Jose Paredes ◽  
Gerardo Simari ◽  
Maria Vanina Martinez ◽  
Marcelo Falappa

In traditional databases, the entity resolution problem (also known as deduplication) refers to the task of mapping multiple manifestations of virtual objects to their corresponding real-world entities. When addressing this problem, in both theory and practice, it is widely assumed that such sets of virtual objects appear as the result of clerical errors, transliterations, missing or updated attributes, abbreviations, and so forth. In this paper, we address this problem under the assumption that this situation is caused by malicious actors operating in domains in which they do not wish to be identified, such as hacker forums and markets in which the participants are motivated to remain semi-anonymous (though they wish to keep their true identities secret, they find it useful for customers to identify their products and services). We are therefore in the presence of a different, and even more challenging, problem that we refer to as adversarial deduplication. We study this problem via examples drawn from real-world data on malicious hacker forums and markets, gathered through collaborations with a cyber threat intelligence company that focuses on understanding this kind of behavior. We argue that it is very difficult, if not impossible, to find ground truth data on which to build solutions to this problem, and we develop a set of preliminary experiments based on training machine learning classifiers that leverage text analysis to detect potential cases of duplicate entities. Our results are encouraging as a first step towards building tools that human analysts can use to enhance their capabilities for fighting cyber threats.
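As a hedged illustration of the kind of text-analysis approach described here (the features, toy posts, and threshold below are illustrative assumptions, not the authors' actual pipeline), the scikit-learn sketch scores pairs of forum handles by TF-IDF cosine similarity over their posts and flags high-similarity pairs as candidate duplicate entities for an analyst to review.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy posts from (allegedly distinct) forum handles; real data would come from
# the hacker forums and markets described in the paper.
posts = {
    "vendor_a": "offering custom malware builds, escrow accepted, contact via jabber",
    "vendor_b": "custom malware builds for sale, escrow ok, jabber contact only",
    "vendor_c": "ddos-for-hire service, hourly rates, tor access only",
}

handles = list(posts)
vectors = TfidfVectorizer(ngram_range=(1, 2)).fit_transform(posts.values())
similarity = cosine_similarity(vectors)

# Flag candidate duplicate identities above an (illustrative) similarity threshold.
THRESHOLD = 0.4
for i in range(len(handles)):
    for j in range(i + 1, len(handles)):
        if similarity[i, j] >= THRESHOLD:
            print(f"possible duplicate: {handles[i]} ~ {handles[j]} "
                  f"(cosine={similarity[i, j]:.2f})")
```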


Sensors ◽  
2021 ◽  
Vol 21 (23) ◽  
pp. 7922
Author(s):  
Xin Jiang ◽  
Chunlei Zhao ◽  
Ming Zhu ◽  
Zhicheng Hao ◽  
Wen Gao

Single image dehazing is a highly challenging ill-posed problem. Existing methods, both prior-based and learning-based, rely heavily on the conceptually simplified atmospheric scattering model, estimating the so-called medium transmission map and atmospheric light. However, the formation of haze in the real world is much more complicated, and inaccurate estimates further degrade dehazing performance, producing color distortion, artifacts, and insufficient haze removal. Moreover, most dehazing networks treat spatial-wise and channel-wise features equally, but haze is in practice unevenly distributed across an image, so regions with different haze concentrations require different attention. To solve these problems, we propose an end-to-end trainable densely connected residual spatial and channel attention network based on the conditional generative adversarial framework to directly restore a haze-free image from an input hazy image, without explicit estimation of any atmospheric scattering parameters. Specifically, a novel residual attention module is proposed that combines spatial and channel attention mechanisms and adaptively recalibrates spatial-wise and channel-wise feature weights by considering interdependencies among spatial and channel information. Such a mechanism allows the network to concentrate on more useful pixels and channels. Meanwhile, the dense network maximizes the information flow among features from different levels to encourage feature reuse and strengthen feature propagation. In addition, the network is trained with a multi-loss function, in which a contrastive loss and a registration loss are newly refined to restore sharper structures and ensure better visual quality. Experimental results demonstrate that the proposed method achieves state-of-the-art performance on both public synthetic datasets and real-world images, with more visually pleasing dehazed results.
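To make the combined attention idea concrete, the PyTorch sketch below implements a generic residual channel-plus-spatial attention block in the spirit of CBAM-style modules; the layer sizes, ordering, and residual connection are assumptions for illustration, not the authors' exact residual attention module.

```python
import torch
import torch.nn as nn

class SpatialChannelAttention(nn.Module):
    """Generic residual channel + spatial attention block (illustrative only)."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Channel attention: squeeze spatial dimensions, then re-weight channels.
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial attention: pool across channels, then produce a per-pixel weight map.
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        identity = x
        out = x * self.channel_gate(x)  # channel-wise recalibration
        pooled = torch.cat(
            [out.mean(dim=1, keepdim=True), out.max(dim=1, keepdim=True).values], dim=1
        )
        out = out * self.spatial_gate(pooled)  # spatial recalibration
        return out + identity  # residual connection

# Example: recalibrate a batch of 64-channel feature maps.
features = torch.randn(2, 64, 32, 32)
print(SpatialChannelAttention(64)(features).shape)  # torch.Size([2, 64, 32, 32])
```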


Information ◽  
2018 ◽  
Vol 9 (8) ◽  
pp. 189 ◽  
Author(s):  
Jose Paredes ◽  
Gerardo Simari ◽  
Maria Martinez ◽  
Marcelo Falappa

In traditional databases, the entity resolution problem (which is also known as deduplication) refers to the task of mapping multiple manifestations of virtual objects to their corresponding real-world entities. When addressing this problem, in both theory and practice, it is widely assumed that such sets of virtual objects appear as the result of clerical errors, transliterations, missing or updated attributes, abbreviations, and so forth. In this paper, we address this problem under the assumption that this situation is caused by malicious actors operating in domains in which they do not wish to be identified, such as hacker forums and markets in which the participants are motivated to remain semi-anonymous (though they wish to keep their true identities secret, they find it useful for customers to identify their products and services). We are therefore in the presence of a different, and even more challenging, problem that we refer to as adversarial deduplication. In this paper, we study this problem via examples that arise from real-world data on malicious hacker forums and markets arising from collaborations with a cyber threat intelligence company focusing on understanding this kind of behavior. We argue that it is very difficult—if not impossible—to find ground truth data on which to build solutions to this problem, and develop a set of preliminary experiments based on training machine learning classifiers that leverage text analysis to detect potential cases of duplicate entities. Our results are encouraging as a first step towards building tools that human analysts can use to enhance their capabilities towards fighting cyber threats.


2019 ◽  
Author(s):  
Gerhard Aigner ◽  
Bernd Grimm ◽  
Christian Lederer ◽  
Martin Daumer

Background. Physical activity (PA) is increasingly being recognized as a major factor in the development or prevention of many diseases, as an intervention to cure or delay disease, and for patient assessment in diagnostics, as a clinical outcome measure or clinical trial endpoint. Thus, wearable sensors and signal algorithms to monitor PA in the free-living (real-world) environment are becoming popular in medicine and clinical research. This is especially true for walking speed, a parameter of PA behaviour with increasing evidence supporting its use as a patient outcome and clinical trial endpoint in many diseases. The development and validation of sensor signal algorithms for PA classification, in particular walking, and for deriving specific PA parameters, such as real-world walking speed, depend on the availability of large reference data sets with ground truth values. In this study, a novel, reliable, scalable (high-throughput), and user-friendly device and method for generating such ground truth data for real-world walking speed, other physical activity types, and further gait-related parameters in a real-world environment is described and validated. Methods. A surveyor's wheel was instrumented with a rotating 3D accelerometer (actibelt). A signal processing algorithm is described to derive distance and speed values. In addition, a high-resolution camera was attached via an active gimbal to video-record context and detail. Validation was performed in three main parts: 1) walking distance measurement was compared to the wheel's built-in mechanical counter, 2) walking speed measurement was analysed on a treadmill at various speed settings, and 3) speed measurement accuracy was analysed by an independent, DAkkS-accredited calibration laboratory applying standardised test procedures. Results. The mean relative error for distance measurements between our method and the built-in counter was 0.12%. Comparison of the speed values algorithmically extracted from accelerometry data and true treadmill speed revealed a mean adjusted absolute error of 0.01 m/s (relative error: 0.71%). The calibration laboratory found a mean relative error between values algorithmically extracted from accelerometry data and the laboratory gold standard of 0.36% (min/max: 0.17-0.64%), which is below the resolution of the laboratory. An official certificate was issued. Discussion. Error values were an order of magnitude smaller than any clinically important difference for walking speed. Conclusion. Besides its high accuracy, the presented method can be deployed in a real-world setting and can be integrated into the digital data flow.
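As a simple illustration of how a rotation-counting wheel yields ground-truth distance and speed (plain Python; the wheel circumference and the rotation timestamps are example values for this sketch and the rotation-detection step itself is not shown, since it belongs to the actibelt signal algorithm), consider:

```python
# Illustrative conversion of detected wheel rotations to distance and speed.
# Rotation timestamps would in practice be extracted from the rotating 3D
# accelerometer signal; here they are just example values (in seconds).
WHEEL_CIRCUMFERENCE_M = 1.0  # hypothetical calibrated circumference of the wheel

rotation_timestamps = [0.0, 0.9, 1.8, 2.6, 3.4, 4.3]  # one entry per full rotation

distance_m = (len(rotation_timestamps) - 1) * WHEEL_CIRCUMFERENCE_M
elapsed_s = rotation_timestamps[-1] - rotation_timestamps[0]
mean_speed = distance_m / elapsed_s

print(f"distance: {distance_m:.1f} m, mean walking speed: {mean_speed:.2f} m/s")
# distance: 5.0 m, mean walking speed: 1.16 m/s
```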


2019 ◽  
Vol 6 (3) ◽  
pp. 279-289 ◽  
Author(s):  
Congyue Deng ◽  
Jiahui Huang ◽  
Yong-Liang Yang

Modeling the complete geometry of general shapes from a single image is an ill-posed problem. User hints are often incorporated to resolve ambiguities and provide guidance during the modeling process. In this work, we present a novel interactive approach for extracting high-quality freeform shapes from a single image. This is inspired by the popular lofting technique in many CAD systems, and only requires minimal user input. Given an input image, the user only needs to sketch several projected cross sections, provide a “main axis”, and specify some geometric relations. Our algorithm then automatically optimizes the common normal to the sections with respect to these constraints, and interpolates between the sections, resulting in a high-quality 3D model that conforms to both the original image and the user input. The entire modeling session is efficient and intuitive. We demonstrate the effectiveness of our approach based on qualitative tests on a variety of images, and quantitative comparisons with the ground truth using synthetic images.
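To illustrate the lofting idea underlying this approach, here is a minimal NumPy sketch in which a plain linear blend stands in for the constrained interpolation the paper actually performs: two user-sketched cross sections placed along a main axis are interpolated into intermediate sections that sweep out the surface.

```python
import numpy as np

def loft_between_sections(section_a: np.ndarray, section_b: np.ndarray, num_steps: int = 10) -> np.ndarray:
    """Linearly blend two cross sections (N x 2 arrays of matched boundary points).

    A toy stand-in for the paper's constrained interpolation. Returns an array
    of shape (num_steps, N, 3): intermediate sections stacked along a straight
    main axis (here the z-axis).
    """
    sections = []
    for t in np.linspace(0.0, 1.0, num_steps):
        blended = (1.0 - t) * section_a + t * section_b  # interpolate the profile
        z = np.full((len(blended), 1), t)                # position along the main axis
        sections.append(np.hstack([blended, z]))
    return np.stack(sections)

# Example: loft from a square profile to a wider one.
square = np.array([[1, 1], [-1, 1], [-1, -1], [1, -1]], dtype=float)
wide = 1.5 * square
print(loft_between_sections(square, wide).shape)  # (10, 4, 3)
```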

