Multi-Task Classification and Segmentation for Explicable Capsule Endoscopy Diagnostics

2021 ◽  
Vol 8 ◽  
Author(s):  
Zishang Kong ◽  
Min He ◽  
Qianjiang Luo ◽  
Xiansong Huang ◽  
Pengxu Wei ◽  
...  

Capsule endoscopy is a leading diagnostic tool for small bowel lesions, but it faces challenges such as time-consuming interpretation and the harsh optical environment inside the small intestine. Specialists inevitably spend considerable time searching for images with a high clearness degree to make an accurate diagnosis. However, current clearness degree classification methods are based either on traditional attributes or on an unexplainable deep neural network. In this paper, we propose a multi-task framework, called the multi-task classification and segmentation network (MTCSN), to achieve joint learning of clearness degree (CD) classification and tissue semantic segmentation (TSS) for the first time. In the MTCSN, the CD helps to generate better-refined TSS, while TSS provides an explicable semantic map that helps to classify the CD. In addition, we present a new benchmark, named the Capsule-Endoscopy Crohn's Disease dataset, which introduces the challenges faced in the real world, including motion blur, excreta occlusion, reflection, and the various complex alimentary scenes that are widely acknowledged in endoscopy examination. Extensive experiments and ablation studies report significant performance gains of the MTCSN over state-of-the-art methods.
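The joint objective described above can be sketched as a weighted sum of a clearness-degree classification loss and a per-pixel segmentation loss. This is a minimal illustration; the function names and the simple weighting scheme are assumptions, not the MTCSN's actual formulation:

```python
import numpy as np

def cross_entropy(probs, label):
    # negative log-likelihood of the true label under predicted probabilities
    return -np.log(probs[label] + 1e-12)

def joint_loss(cd_probs, cd_label, seg_probs, seg_labels, lam=1.0):
    """Joint objective: clearness-degree (CD) classification loss plus a
    tissue-semantic-segmentation (TSS) loss averaged over pixels.
    seg_probs: (n_pixels, n_classes); seg_labels: (n_pixels,) ints."""
    l_cls = cross_entropy(cd_probs, cd_label)
    l_seg = float(np.mean([cross_entropy(p, y)
                           for p, y in zip(seg_probs, seg_labels)]))
    return l_cls + lam * l_seg
```

Training both heads against this single scalar is what lets each task's gradients refine the shared features used by the other.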

2020 ◽  
Vol 34 (07) ◽  
pp. 10460-10469 ◽  
Author(s):  
Ankan Bansal ◽  
Sai Saketh Rambhatla ◽  
Abhinav Shrivastava ◽  
Rama Chellappa

We present an approach for detecting human-object interactions (HOIs) in images, based on the idea that humans interact with functionally similar objects in a similar manner. The proposed model is simple and makes efficient use of the data: the visual features of the human, the relative spatial orientation of the human and the object, and the knowledge that functionally similar objects take part in similar interactions with humans. We provide extensive experimental validation for our approach and demonstrate state-of-the-art results for HOI detection. On the HICO-Det dataset, our method achieves an absolute gain of over 2.5% in mean average precision (mAP) over the state of the art. We also show that our approach leads to significant performance gains for zero-shot HOI detection in the seen-object setting. We further demonstrate that, using a generic object detector, our model can generalize to interactions involving previously unseen objects.
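The core intuition, borrowing interaction knowledge from functionally similar objects, can be sketched as a nearest-neighbour transfer over object embeddings. Everything here (the `transfer_hoi_scores` helper, the toy verbs and embeddings) is illustrative and not the paper's model:

```python
import numpy as np

def transfer_hoi_scores(scores, obj_embeddings, target, k=2):
    """Score interactions for `target` by averaging the per-verb scores of its
    k most functionally similar known objects (cosine similarity over object
    embeddings). A toy stand-in for similarity-based interaction transfer."""
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    t = obj_embeddings[target]
    known = [o for o in scores if o != target]
    neighbours = sorted(known, key=lambda o: -cos(obj_embeddings[o], t))[:k]
    verbs = next(iter(scores.values())).keys()
    return {v: float(np.mean([scores[o][v] for o in neighbours])) for v in verbs}
```

With a generic object detector supplying the target object, this kind of transfer is what lets interactions generalize to objects never seen with a given verb.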


Author(s):  
Jingjing Li ◽  
Mengmeng Jing ◽  
Ke Lu ◽  
Lei Zhu ◽  
Yang Yang ◽  
...  

Zero-shot learning (ZSL) and cold-start recommendation (CSR) are two challenging problems in computer vision and recommender systems, respectively. In general, they are investigated independently in different communities. This paper, however, reveals that ZSL and CSR are two extensions of the same intension. Both of them, for instance, attempt to predict unseen classes and involve two spaces, one for direct feature representation and the other for supplementary description. Yet no existing approach addresses CSR from the ZSL perspective. This work, for the first time, formulates CSR as a ZSL problem, and a tailor-made ZSL method is proposed to handle CSR. Specifically, we propose a Low-rank Linear Auto-Encoder (LLAE), which tackles three cruxes: domain shift, spurious correlations, and computational efficiency. LLAE consists of two parts: a low-rank encoder that maps user behavior into user attributes, and a symmetric decoder that reconstructs user behavior from user attributes. Extensive experiments on both ZSL and CSR tasks verify that the proposed method is a win-win formulation, i.e., not only can CSR be handled by ZSL models with a significant performance improvement compared with several conventional state-of-the-art methods, but the consideration of CSR can benefit ZSL as well.
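The encoder/decoder idea can be illustrated with a drastically simplified stand-in: fit a linear map W from user behavior to user attributes by ridge regression, with the tied transpose Wᵀ acting as the symmetric decoder. The actual LLAE optimizes a low-rank objective with its own solver; this sketch only shows the two-space mapping:

```python
import numpy as np

def fit_linear_autoencoder(B, A, lam=1.0):
    """Fit encoder W so that A ~ W @ B (behavior B: (d_b, n) -> attributes
    A: (d_a, n)) via ridge regression; the tied decoder W.T maps attributes
    back to behavior. A simplified stand-in for LLAE's low-rank objective."""
    d_b = B.shape[0]
    W = A @ B.T @ np.linalg.inv(B @ B.T + lam * np.eye(d_b))
    return W

# Cold start: a brand-new user has no behavior, only attributes a_new,
# so predicted behavior is the decoder output b_hat = W.T @ a_new.
```

The cold-start comment is the CSR-as-ZSL connection: attributes play the role of class descriptions, behavior plays the role of visual features.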


Author(s):  
Paul D. Wilcox ◽  
Anthony J. Croxford ◽  
Nicolas Budyn ◽  
Rhodri L. T. Bevan ◽  
Jie Zhang ◽  
...  

State-of-the-art ultrasonic non-destructive evaluation (NDE) uses an array to rapidly generate multiple, information-rich views at each test position on a safety-critical component. However, the information for detecting potential defects is dispersed across views, and a typical inspection may involve thousands of test positions. Interpretation requires painstaking analysis by a skilled operator. In this paper, various methods for fusing multi-view data are developed. Compared with any single view, all methods are shown to yield significant performance gains, which can be understood in terms of the general and edge cases for NDE. In the general case, a defect is clearly detectable in at least one individual view, but which view(s) depends on the defect location and orientation. Here, the performance gain from data fusion mainly results from the selective use of information from the most appropriate view(s), and fusion provides a means to substantially reduce operator burden. The edge cases are defects that cannot be reliably detected in any one individual view without false alarms. Here, certain fusion methods are shown to enable detection with reduced false alarms. In this context, fusion allows NDE capability to be extended, with potential implications for the design and operation of engineering assets.
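Simple pixel-wise fusion rules convey the idea of combining per-view evidence. These are generic illustrations (max and noisy-OR), not the specific fusion methods developed in the paper:

```python
import numpy as np

def fuse_views(view_maps, method="noisy_or"):
    """Fuse per-view defect-probability maps (list of HxW arrays in [0, 1]).
    'max' keeps the most confident view at each pixel, matching the general
    case where one view sees the defect clearly; 'noisy_or' pools weak,
    independent evidence, closer in spirit to the edge cases."""
    v = np.stack(view_maps)
    if method == "max":
        return v.max(axis=0)
    if method == "noisy_or":
        return 1.0 - np.prod(1.0 - v, axis=0)
    raise ValueError(f"unknown fusion method: {method}")
```

A single fused map per test position is also what reduces operator burden: one image to inspect instead of dozens of views.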


2021 ◽  
Vol 13 (6) ◽  
pp. 1049
Author(s):  
Cheng Liao ◽  
Han Hu ◽  
Haifeng Li ◽  
Xuming Ge ◽  
Min Chen ◽  
...  

Most existing approaches to the extraction of buildings from high-resolution orthoimages treat the problem as semantic segmentation, which extracts a pixel-wise mask for buildings and trains end-to-end with manually labeled building maps. However, as buildings are highly structured, such a strategy suffers from several problems, such as blurred boundaries and adhesion to nearby objects. To alleviate these problems, we propose a new strategy that also considers the contours of the buildings. Both the contours and the structures of the buildings are jointly learned in the same network. The contours are learnable because the boundary of the mask labels of buildings implicitly represents the contours of buildings. We utilized the building contour information embedded in the labels to optimize the representation of building boundaries, then combined the contour information with multi-scale semantic features to enhance robustness to image spatial resolution. The experimental results showed that the proposed method achieved 91.64%, 81.34%, and 74.51% intersection over union (IoU) on the WHU, Aerial, and Massachusetts building datasets, respectively, and outperformed state-of-the-art (SOTA) methods. It significantly improved the accuracy of building boundaries, especially for the edges of adjacent buildings. The code is made publicly available.
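The claim that mask labels implicitly encode contours can be made concrete by deriving a one-pixel contour label directly from a binary mask: a foreground pixel is contour if any of its 4-neighbours is background. This is a generic sketch of label derivation, not the paper's training pipeline:

```python
import numpy as np

def mask_to_contour(mask):
    """Derive a 1-pixel-wide contour label from a binary building mask:
    a pixel is contour iff it is foreground and at least one 4-neighbour
    (up/down/left/right, with zero padding at the image border) is background."""
    padded = np.pad(mask, 1)
    nb_min = np.minimum.reduce([
        padded[:-2, 1:-1], padded[2:, 1:-1],   # up, down neighbours
        padded[1:-1, :-2], padded[1:-1, 2:],   # left, right neighbours
    ])
    return (mask == 1) & (nb_min == 0)
```

Because the contour target is computed from the existing mask labels, joint contour learning needs no extra annotation.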


Author(s):  
Nian-Ze Lee ◽  
Yen-Shi Wang ◽  
Jie-Hong R. Jiang

Stochastic Boolean satisfiability (SSAT) is an expressive language to formulate decision problems with randomness. Solving SSAT formulas has the same PSPACE-complete computational complexity as solving quantified Boolean formulas (QBFs). Despite its broad applications and profound theoretical values, SSAT has received relatively little attention compared to QBF. In this paper, we focus on exist-random quantified SSAT formulas, also known as E-MAJSAT, which is a special fragment of SSAT commonly applied in probabilistic conformant planning, posteriori hypothesis, and maximum expected utility. Based on clause selection, a recently proposed QBF technique, we propose an algorithm to solve E-MAJSAT. Moreover, our method can provide an approximate solution to E-MAJSAT with a lower bound when an exact answer is too expensive to compute. Experiments show that the proposed algorithm achieves significant performance gains and memory savings over the state-of-the-art SSAT solvers on a number of benchmark formulas, and provides useful lower bounds for cases where prior methods fail to compute exact answers.
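The semantics of E-MAJSAT (maximize over existential assignments the probability, under independent random variables, that the formula is satisfied) can be made concrete with a tiny brute-force evaluator. This is exponential and purely illustrative; the paper's clause-selection algorithm is far more sophisticated:

```python
import itertools

def eval_clauses(clauses, assign):
    # CNF check: every clause has some literal set to its required polarity
    return all(any(assign[abs(l)] == (l > 0) for l in c) for c in clauses)

def e_majsat(clauses, exist_vars, rand_probs):
    """Brute-force E-MAJSAT over a CNF with integer literals: maximize over
    existential assignments the satisfaction probability with respect to the
    independent random variables in rand_probs (var -> Pr[var is True])."""
    rvars = list(rand_probs)
    best = 0.0
    for xs in itertools.product([False, True], repeat=len(exist_vars)):
        assign = dict(zip(exist_vars, xs))
        p = 0.0
        for rs in itertools.product([False, True], repeat=len(rvars)):
            assign.update(zip(rvars, rs))
            if eval_clauses(clauses, assign):
                w = 1.0
                for v, val in zip(rvars, rs):
                    w *= rand_probs[v] if val else 1.0 - rand_probs[v]
                p += w
        best = max(best, p)
    return best
```

For (x1 ∨ r2) ∧ (¬x1 ∨ r3) with Pr[r2] = 0.5 and Pr[r3] = 0.8, choosing x1 = True yields the optimum 0.8.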


2020 ◽  
Vol 34 (04) ◽  
pp. 5981-5988
Author(s):  
Yunhao Tang ◽  
Shipra Agrawal

In this work, we show that discretizing action space for continuous control is a simple yet powerful technique for on-policy optimization. The explosion in the number of discrete actions can be efficiently addressed by a policy with factorized distribution across action dimensions. We show that the discrete policy achieves significant performance gains with state-of-the-art on-policy optimization algorithms (PPO, TRPO, ACKTR) especially on high-dimensional tasks with complex dynamics. Additionally, we show that an ordinal parameterization of the discrete distribution can introduce the inductive bias that encodes the natural ordering between discrete actions. This ordinal architecture further significantly improves the performance of PPO/TRPO.
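A factorized discrete policy can be sketched as independent categorical distributions, one per action dimension, whose sampled bin indices map back to continuous values; the total log-probability is the sum over dimensions. The bin layout and shapes here are illustrative choices, not those of the paper:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def sample_factorized(logits, rng):
    """logits: (action_dim, n_bins). Sample each action dimension from its own
    categorical distribution and return (continuous_action, total_log_prob).
    The n_bins discrete bins evenly discretize the interval [-1, 1]."""
    probs = softmax(logits)
    n_bins = logits.shape[1]
    bins = np.linspace(-1.0, 1.0, n_bins)
    idx = np.array([rng.choice(n_bins, p=p) for p in probs])
    log_prob = float(np.sum(np.log(probs[np.arange(len(idx)), idx])))
    return bins[idx], log_prob
```

The factorization is what keeps the parameter count linear in the action dimension instead of exponential in the number of joint discrete actions.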


Processes ◽  
2021 ◽  
Vol 9 (1) ◽  
pp. 87
Author(s):  
Ali Umut Şen ◽  
Helena Pereira

In recent years, there has been a surge of interest in char production from lignocellulosic biomass owing to char's interesting technological properties. Global char production in 2019 reached 53.6 million tons. Barks are among the most important and understudied lignocellulosic feedstocks with a large potential for exploitation, given that global bark production is estimated to be as high as 400 million cubic meters per year. Chars can be produced from barks; however, to obtain the desired char yields and to simulate the pyrolysis process, it is important to understand the differences between barks, woods, and other lignocellulosic materials, in addition to selecting a proper thermochemical method for bark-based char production. In this state-of-the-art review, after analyzing the main char production methods, barks were characterized for their chemical composition and compared with other important lignocellulosic materials. Previous bark-based char production studies were then analyzed, and different barks and process types were evaluated for the first time to guide future char production process designs based on bark feedstocks. The dry and wet pyrolysis and gasification results of barks revealed that different particle sizes, heating rates, and solid residence times resulted in highly variable char yields in the temperature range of 220 °C to 600 °C. Bark-based char production should primarily be performed via a slow pyrolysis route, considering the superior surface properties of slow pyrolysis chars.


Sensors ◽  
2021 ◽  
Vol 21 (6) ◽  
pp. 1977
Author(s):  
Ricardo Oliveira ◽  
Liliana M. Sousa ◽  
Ana M. Rocha ◽  
Rogério Nogueira ◽  
Lúcia Bilro

In this work, we demonstrate for the first time the capability to inscribe long-period gratings (LPGs) with UV radiation using simple, low-cost amplitude masks fabricated with a consumer-grade 3D printer. The spectrum obtained for a grating with a 690 µm period and 38 mm length presented good quality, showing sharp resonances (i.e., 3 dB bandwidth < 3 nm), low out-of-band loss (~0.2 dB), and dip losses up to 18 dB. Furthermore, the capability to select the resonance wavelength has been demonstrated using different amplitude mask periods. The customization of the masks makes it possible to fabricate gratings with complex structures. Additionally, the simplicity of 3D printing an amplitude mask solves the problem of the lack of amplitude masks on the market and avoids the use of high-resolution motorized stages, as in the point-by-point technique. Finally, the 3D-printed masks were also used to induce LPGs using the mechanical pressing method. Owing to the better resolution of these masks compared with those described in the state of the art, we were able to induce gratings of higher quality, with low out-of-band loss (0.6 dB), reduced spectral ripples, and narrow bandwidths (~3 nm).
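The dependence of resonance wavelength on mask period follows the standard LPG phase-matching condition, λ_res = (n_eff,core − n_eff,cladding) · Λ. A quick numerical check, where the effective-index difference of 2.25 × 10⁻³ is an illustrative assumption rather than a value from the paper:

```python
def lpg_resonance(delta_n_eff, period_m):
    """Standard LPG phase-matching condition:
    lambda_res = (n_eff_core - n_eff_cladding_mode) * grating_period.
    Inputs and output in metres; delta_n_eff is dimensionless."""
    return delta_n_eff * period_m

# With the paper's 690 um period, an assumed index difference of 2.25e-3
# places one resonance near 1552 nm; changing the printed mask period
# shifts the resonance accordingly.
```

This linear scaling is why simply printing masks with different periods is enough to select the resonance wavelength.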


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Shreeya Sriram ◽  
Shitij Avlani ◽  
Matthew P. Ward ◽  
Shreyas Sen

Continuous multi-channel monitoring of biopotential signals is vital to understanding the body as a whole, facilitating accurate models and predictions in neural research. The current state of the art in wireless technologies for untethered biopotential recording relies on radiative electromagnetic (EM) fields. In such transmissions, only a small fraction of the energy is received, since the EM fields are widely radiated, resulting in lossy, inefficient systems. Using the body as a communication medium (similar to a 'wire') allows the energy to be contained within the body, yielding order(s) of magnitude lower energy than radiative EM communication. In this work, we introduce Animal Body Communication (ABC), which brings the concept of using the body as a medium into the domain of untethered animal biopotential recording. This work, for the first time, develops the theory and models for animal body communication circuitry and channel loss. Using this theoretical model, a sub-inch³ [1″ × 1″ × 0.4″], custom-designed sensor node is built using off-the-shelf components, capable of sensing and transmitting biopotential signals through the body of a rat at significantly lower power than traditional wireless transmissions. In-vivo experimental analysis proves that ABC successfully transmits acquired electrocardiogram (EKG) signals through the body with correlation > 99% compared with traditional wireless communication modalities, and with a 50× reduction in power consumption.


2021 ◽  
Vol 40 (3) ◽  
pp. 1-13
Author(s):  
Lumin Yang ◽  
Jiajie Zhuang ◽  
Hongbo Fu ◽  
Xiangzhi Wei ◽  
Kun Zhou ◽  
...  

We introduce SketchGNN, a convolutional graph neural network for semantic segmentation and labeling of freehand vector sketches. We treat an input stroke-based sketch as a graph, with nodes representing the sampled points along input strokes and edges encoding the stroke structure information. To predict the per-node labels, SketchGNN uses graph convolution and a static-dynamic branching network architecture to extract features at three levels: point-level, stroke-level, and sketch-level. SketchGNN significantly improves on the accuracy of state-of-the-art methods for semantic sketch segmentation (by 11.2% in the pixel-based metric and 18.2% in the component-based metric on the large-scale, challenging SPG dataset) and has orders of magnitude fewer parameters than both image-based and sequence-based methods.
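The graph construction described above, one node per sampled point with edges following stroke order, can be sketched as follows. This is a minimal static version; SketchGNN's actual graph (including its dynamic branch) is richer:

```python
def sketch_to_graph(strokes):
    """Turn a stroke-based sketch into (nodes, edges): one node per sampled
    point, and one undirected edge between consecutive points of the same
    stroke, so edges encode the stroke structure.
    strokes: list of strokes, each a list of (x, y) points."""
    nodes, edges = [], []
    for stroke in strokes:
        start = len(nodes)          # node indices for this stroke
        nodes.extend(stroke)
        edges.extend((start + i, start + i + 1)
                     for i in range(len(stroke) - 1))
    return nodes, edges
```

A graph convolution then aggregates each node's features over these stroke-structure edges to produce the per-node (per-point) labels.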

