image streams
Recently Published Documents


Total documents: 67 (19 in the last five years)
H-index: 10 (3 in the last five years)

Author(s): Raúl Pedro Aceñero Eixarch, Raúl Díaz-Usechi Laplaza, Rafael Berlanga Llavori

This paper presents a study on screening the large radiological image streams produced in hospitals for earlier detection of lung nodules. Since this is one of the most difficult classification tasks in the literature, our objective is to measure how well state-of-the-art classifiers can filter the image stream so that as many positive cases as possible remain in an output stream to be inspected by clinicians. We performed several experiments with different image resolutions and training datasets from different sources, always taking ResNet-152 as the base neural network. Results over existing datasets show that, contrary to other diseases such as pneumonia, detecting nodules is a hard task when using only radiographs. Indeed, final diagnosis by clinicians is usually performed with much more precise imaging such as computed tomography.
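A minimal PyTorch sketch of the kind of screening pipeline described above, with ResNet-152 as the base network; the binary head, decision threshold, and preprocessing are illustrative assumptions, not the paper's exact configuration:

```python
import torch
import torch.nn as nn
from torchvision import models

# ResNet-152 backbone with a binary head: nodule vs. no-nodule (assumed setup).
model = models.resnet152(weights=models.ResNet152_Weights.IMAGENET1K_V1)
model.fc = nn.Linear(model.fc.in_features, 2)
model.eval()

def screen(batch: torch.Tensor, threshold: float = 0.2) -> torch.Tensor:
    """Flag images to forward to clinicians. A low threshold favors recall,
    keeping as many positive cases as possible in the output stream."""
    with torch.no_grad():
        probs = torch.softmax(model(batch), dim=1)[:, 1]
    return probs >= threshold
```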


2021
Author(s): Huzheng Yang, Shanghang Zhang, Yifan Wu, Yuanning Li, Shi Gu

This report reviews our submissions to the Algonauts Challenge 2021. In this challenge, neural responses in the visual cortex were recorded using functional neuroimaging while participants watched naturalistic videos. The goal of the challenge is to develop voxel-wise encoding models that predict such neural signals from the input videos. Here we built an ensemble of models that extract representations of the input videos from four perspectives: image streams, motion, edges, and audio. We show that adding new modules to the ensemble consistently improved our prediction performance. Our methods achieved state-of-the-art performance on both the mini track and the full track tasks.
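The challenge submissions themselves are more elaborate, but the basic voxel-wise encoding idea can be sketched with concatenated per-modality features and a linear map per voxel; all feature dimensions below are placeholders:

```python
import numpy as np
from sklearn.linear_model import Ridge

n_clips, n_voxels = 1000, 500
features = {
    "image":  np.random.randn(n_clips, 2048),  # e.g. CNN frame embeddings
    "motion": np.random.randn(n_clips, 1024),  # e.g. optical-flow features
    "edges":  np.random.randn(n_clips, 256),
    "audio":  np.random.randn(n_clips, 128),
}
X = np.hstack(list(features.values()))  # ensemble by feature concatenation
y = np.random.randn(n_clips, n_voxels)  # recorded voxel responses (placeholder)

encoder = Ridge(alpha=10.0).fit(X, y)   # one linear encoding model per voxel
predicted = encoder.predict(X)          # predicted neural signals
```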


Author(s): Mojtaba Khanzadeh, Matthew Dantin, Wenmeng Tian, Matthew W. Priddy, Haley Doude, ...

The objective of this research is to study an effective thermal history prediction method for additive manufacturing (AM) processes using thermal image streams in a layer-wise manner. The need for seamless integration of in-process sensing and data-driven approaches to monitor process dynamics in AM has been clearly stated in blueprint reports released by various U.S. agencies, such as NIST and DoD, over the past five years. Reliable physics-based models have been developed to delineate the underlying thermo-mechanical dynamics of AM processes; however, their computational cost is extremely high. We propose a tensor-based surrogate modeling methodology to predict the layer-wise relationship in the thermal history of AM parts, which is time-efficient compared to available physics-based prediction models. We construct a network-tensor structure for freeform shapes based on thermal image streams obtained in a metal-based AM process. Subsequently, we simplify the network-tensor structure by concatenating images to reach a layer-wise structure. Each succeeding layer is then predicted from its antecedent layer using a tensor regression model, whose parameters are estimated with a generalized multilinear structure called higher-order partial least squares (HOPLS). Through the proposed method, the high-dimensional thermal history of AM components can be predicted accurately in a computationally efficient manner. The proposed thermal history prediction is applied to simulated thermal images from finite element method (FEM) simulations, showing that the proposed model can be used alongside simulation-based models to enhance their performance.
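HOPLS itself is a multilinear generalization of partial least squares; as a simplified, matricized stand-in, the layer-to-layer regression can be sketched with ordinary PLS on unfolded thermal images (shapes and component count are assumptions):

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

n_layers, h, w = 40, 32, 32
thermal = np.random.rand(n_layers, h, w)    # layer-wise thermal images (placeholder)

X = thermal[:-1].reshape(n_layers - 1, -1)  # antecedent layers, unfolded
Y = thermal[1:].reshape(n_layers - 1, -1)   # succeeding layers, unfolded

# Surrogate regression: predict each layer's thermal image from the previous one.
pls = PLSRegression(n_components=10).fit(X, Y)
next_layer = pls.predict(X[-1:]).reshape(h, w)
```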


Sensors, 2021, Vol. 21 (10), pp. 3465
Author(s): Madina Abdrakhmanova, Askat Kuzdeuov, Sheikh Jarju, Yerbolat Khassanov, Michael Lewis, ...

We present SpeakingFaces, a publicly available large-scale multimodal dataset developed to support machine learning research in contexts that utilize a combination of thermal, visual, and audio data streams; examples include human–computer interaction, biometric authentication, recognition systems, domain transfer, and speech recognition. SpeakingFaces comprises aligned high-resolution thermal and visual spectrum image streams of fully-framed faces synchronized with audio recordings of each subject speaking approximately 100 imperative phrases. Data were collected from 142 subjects, yielding over 13,000 instances of synchronized data (∼3.8 TB). For technical validation, we demonstrate two baseline examples. The first baseline is classification by gender, utilizing different combinations of the three data streams in both clean and noisy environments. The second is thermal-to-visual facial image translation, as an instance of domain transfer.
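A late-fusion baseline in the spirit of the first example can be sketched by concatenating per-stream embeddings and training a linear classifier; the embedding extractors and dimensions below are assumptions:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

n_samples = 200
thermal = np.random.randn(n_samples, 512)   # thermal-stream embeddings (placeholder)
visual  = np.random.randn(n_samples, 512)   # visual-stream embeddings (placeholder)
audio   = np.random.randn(n_samples, 128)   # audio embeddings (placeholder)
labels  = np.random.randint(0, 2, n_samples)

X = np.hstack([thermal, visual, audio])     # fuse the three data streams
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print("training accuracy:", clf.score(X, labels))
```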


2021, Vol. 5 (2), pp. 50-61
Author(s): Uroš Hudomalj, Christopher Mandla, Markus Plattner

This paper presents FPGA implementations of image filtering and image averaging, two widely applied image preprocessing algorithms. The implementations target real-time processing of high-frame-rate, high-resolution image streams and are evaluated in terms of resource usage, power consumption, and achievable frame rates. For the evaluation, Microsemi's SmartFusion2 Advanced Development Kit, which includes a SmartFusion2 M2S150 SoC FPGA, is used. The performance of the developed image filtering implementation is compared to a solution provided by MATLAB's Vision HDL Toolbox, evaluated on the same platform. The developed implementations are also compared with FPGA implementations from existing publications, although those were evaluated on different FPGA platforms. Difficulties in comparing performance across different platforms are addressed, and limitations of processing image streams on FPGA platforms are discussed.
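For reference, the two preprocessing algorithms targeted by the FPGA designs behave like the following software sketch; the 3x3 box kernel and border handling are assumptions, not the paper's parameters:

```python
import numpy as np
from scipy.ndimage import convolve

KERNEL = np.ones((3, 3), dtype=np.float32) / 9.0   # 3x3 box filter (assumed)

def filter_frame(frame: np.ndarray) -> np.ndarray:
    """2-D image filtering: convolve a frame with a fixed kernel."""
    return convolve(frame.astype(np.float32), KERNEL, mode="nearest")

def average_frames(frames: list[np.ndarray]) -> np.ndarray:
    """Image averaging over a stream: accumulate, then divide once,
    mirroring an accumulator-based hardware design."""
    acc = np.zeros_like(frames[0], dtype=np.float32)
    for f in frames:
        acc += f
    return acc / len(frames)
```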


PLoS ONE, 2021, Vol. 16 (2), pp. e0246336
Author(s): Håkan Wieslander, Carolina Wählby, Ida-Maria Sintorn

Microscopy imaging experiments generate vast amounts of data, and there is a high demand for smart acquisition and analysis methods. This is especially true for transmission electron microscopy (TEM), where terabytes of data are produced when imaging a full sample at high resolution, and analysis can take several hours. One way to tackle this issue is to collect a continuous stream of low-resolution images while moving the sample under the microscope, and thereafter use these data to find the parts of the sample deemed most valuable for high-resolution imaging. However, such image streams are degraded by both motion blur and noise. Building on deep learning approaches developed for deblurring videos of natural scenes, we explore the opportunities and limitations of deblurring and denoising images captured from a fast image stream collected by a TEM. We start from existing neural network architectures and adjust the convolution blocks and loss functions to better fit TEM data. We present deblurring results on two real datasets of images of kidney tissue and a calibration grid. Both datasets consist of low-quality images from a fast image stream captured by moving the sample under the microscope, and the corresponding high-quality images of the same regions, captured after stopping the movement at each position to let all motion settle. We also explore generalizability and overfitting on real and synthetically generated data. The quality of the restored images, evaluated both quantitatively and visually, shows that using deep learning for restoring TEM live image streams has great potential but also comes with some limitations.
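The restoration setup can be illustrated with a toy residual CNN trained with an L1 loss to map fast-stream frames to their stopped-stage counterparts; the architecture, sizes, and loss choice here are assumptions, not the adjusted networks from the paper:

```python
import torch
import torch.nn as nn

class DeblurNet(nn.Module):
    """Small residual CNN: predicts a correction added to the degraded frame."""
    def __init__(self, ch: int = 32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(1, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.ReLU(),
            nn.Conv2d(ch, 1, 3, padding=1),
        )

    def forward(self, x):
        return x + self.body(x)

net = DeblurNet()
blurred = torch.rand(4, 1, 64, 64)   # fast-stream frames (placeholder)
sharp = torch.rand(4, 1, 64, 64)     # stopped-stage ground truth (placeholder)
loss = nn.L1Loss()(net(blurred), sharp)
loss.backward()
```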


Sensors, 2021, Vol. 21 (3), pp. 888
Author(s): Xiqi Wang, Shunyi Zheng, Ce Zhang, Rui Li, Li Gui

Accurate and efficient text detection in natural scenes is a fundamental yet challenging task in computer vision, especially when dealing with arbitrarily-oriented text. Most contemporary text detection methods are designed to identify horizontal or approximately horizontal text, which cannot satisfy practical detection requirements for various real-world inputs such as image streams or videos. To address this lacuna, we propose a novel method called Rotational You Only Look Once (R-YOLO), a robust real-time convolutional neural network (CNN) model for detecting arbitrarily-oriented text in natural scenes. First, a rotated anchor box with angle information is used as the text bounding box over various orientations. Second, features at various scales are extracted from the input image to determine the probability, confidence, and inclined bounding boxes of the text. Finally, Rotational Distance Intersection over Union Non-Maximum Suppression is used to eliminate redundancy and obtain the most accurate detections. Benchmark experiments are conducted on four popular datasets, i.e., ICDAR2015, ICDAR2013, MSRA-TD500, and ICDAR2017-MLT. The results indicate that R-YOLO significantly outperforms state-of-the-art methods in detection efficiency while maintaining high accuracy; for example, it achieves an F-measure of 82.3% at 62.5 fps at 720p resolution on the ICDAR2015 dataset.
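The rotation-aware suppression step can be sketched with polygon IoU over rotated boxes; this simplified version omits the paper's distance term, and the (cx, cy, w, h, angle) box format is an assumption:

```python
import numpy as np
from shapely.geometry import Polygon

def rbox_to_poly(cx, cy, w, h, angle):
    """Corners of a box rotated by `angle` radians about its center."""
    c, s = np.cos(angle), np.sin(angle)
    pts = np.array([[-w/2, -h/2], [w/2, -h/2], [w/2, h/2], [-w/2, h/2]])
    return Polygon(pts @ np.array([[c, -s], [s, c]]).T + [cx, cy])

def rotated_nms(boxes, scores, iou_thr=0.5):
    """Keep the highest-scoring boxes whose rotated IoU with every
    already-kept box stays below the threshold."""
    polys = [rbox_to_poly(*b) for b in boxes]
    keep = []
    for i in np.argsort(scores)[::-1]:
        if all(polys[i].intersection(polys[j]).area /
               polys[i].union(polys[j]).area < iou_thr for j in keep):
            keep.append(i)
    return keep
```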


2020, Vol. 26 (4), pp. 3037-3055
Author(s): Neal A Patel, Perry N Alagappan, Chuanbo Pan, Peter Karth

Here we present a mobile application that accurately determines the distance between an optical sensor and the human corneal limbus for visual acuity assessment. The application uses digital image processing and randomized circle detection to locate the cornea; a reference scaling measurement is then employed to calculate the distance from the sensor to the user. To determine accuracy and generalizability, testing was conducted both on 200 static images from a facial image database (25 images each of males and females across four ethnic groups) and on live image streams from a test subject. The average absolute corneal radius error over 10 trials on the static images was 6.36%, while the average absolute distance error on the live image streams was less than 1%. Subsequently, the distance measurements were used to scale letter sizes for a Snellen chart-based visual acuity assessment. This system enables monitoring of chronic retinal diseases, as patients can quickly and accurately measure their visual acuity through the mobile eye exam suite.
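The distance step reduces to similar-triangles scaling once the corneal circle is found; below is a sketch using OpenCV's Hough circle detector as a stand-in for the paper's randomized circle detection, with an illustrative focal length and a nominal corneal radius (both assumptions, not the paper's calibration):

```python
import cv2

FOCAL_PX = 1400.0    # camera focal length in pixels (assumed calibration)
CORNEA_MM = 5.9      # nominal human corneal radius (assumed reference)

def cornea_distance_mm(gray):
    """Detect the corneal circle in an 8-bit grayscale frame and convert its
    pixel radius to distance: distance = focal * real_radius / pixel_radius."""
    circles = cv2.HoughCircles(gray, cv2.HOUGH_GRADIENT, dp=1.2, minDist=100,
                               param1=100, param2=30,
                               minRadius=10, maxRadius=200)
    if circles is None:
        return None
    r_px = float(circles[0, 0, 2])   # radius of the strongest detection
    return FOCAL_PX * CORNEA_MM / r_px
```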

