Scene Image: Recently Published Documents


TOTAL DOCUMENTS: 235 (five years: 60)
H-INDEX: 12 (five years: 3)

2021 · Vol. 40 (12-14) · pp. 1435-1466
Author(s): Danny Driess, Jung-Su Ha, Marc Toussaint

In this article, we propose deep visual reasoning: a convolutional recurrent neural network that predicts discrete action sequences from an initial scene image for sequential manipulation problems that arise, for example, in task and motion planning (TAMP). Typical TAMP problems are formalized by combining reasoning on a symbolic, discrete level (e.g., first-order logic) with continuous motion planning such as nonlinear trajectory optimization. The action sequences represent the discrete decisions on the symbolic level, which, in turn, parameterize a nonlinear trajectory optimization problem. Owing to the great combinatorial complexity of possible discrete action sequences, a large number of optimization/motion planning problems have to be solved to find a solution, which limits the scalability of these approaches. To circumvent this combinatorial complexity, we introduce deep visual reasoning: based on a segmented initial image of the scene, a neural network directly predicts promising discrete action sequences such that, ideally, only a single motion planning problem has to be solved to find a solution to the overall TAMP problem. Our method generalizes to scenes with many and varying numbers of objects, despite being trained on only two objects at a time. This is possible because the objects of the scene and the goal are encoded in (segmented) images as input to the neural network, instead of a fixed feature vector. We show that the framework can handle not only kinematic problems such as pick-and-place (typical in TAMP), but also tool-use scenarios for planar pushing under quasi-static dynamics. Here, the image-based representation enables generalization to shapes not seen during training. Results show runtime improvements of several orders of magnitude by, in many cases, removing the need to search over discrete action sequences.
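A minimal sketch of the core idea, under assumptions: the class below (hypothetical, not the authors' exact architecture) encodes segmented object/goal masks with a small convolutional network and decodes per-step logits over a discrete action vocabulary with a GRU, which is the general shape of an image-to-action-sequence predictor.

```python
import torch
import torch.nn as nn

class ActionSequencePredictor(nn.Module):
    """Hypothetical CNN-RNN sketch: segmented scene masks -> action-sequence logits."""
    def __init__(self, n_actions: int, hidden: int = 128):
        super().__init__()
        # Two input channels: one segmentation mask for the objects, one for the goal.
        self.encoder = nn.Sequential(
            nn.Conv2d(2, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, hidden),
        )
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)  # recurrent decoder
        self.head = nn.Linear(hidden, n_actions)             # per-step action logits

    def forward(self, masks: torch.Tensor, seq_len: int) -> torch.Tensor:
        z = self.encoder(masks)                      # (B, hidden) scene embedding
        steps = z.unsqueeze(1).repeat(1, seq_len, 1)
        out, _ = self.rnn(steps)                     # (B, T, hidden)
        return self.head(out)                        # (B, T, n_actions)

# Usage: two 64x64 masks, score a sequence of 4 discrete actions.
logits = ActionSequencePredictor(n_actions=8)(torch.rand(1, 2, 64, 64), 4)
print(logits.shape)  # torch.Size([1, 4, 8])
```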


2021 · pp. 561-570
Author(s): Mridul Ghosh, Somnath Chatterjee, Himadri Mukherjee, Shibaprasad Sen, Sk Md Obaidullah

2021 · Vol. 2021 · pp. 1-11
Author(s): Xiaofeng Yang

Noise in tourist street scene images arises from many sources, and removing it effectively remains a major challenge. Although many denoising methods for such images have been proposed over the past few decades, the problem remains an active and fundamental research topic in digital image processing. Evolutionary diffusion methods based on partial differential equations (PDEs) help improve the quality of noisy tourist street scene images, because they can process an image according to a prescribed diffusion behavior. The adaptive total variation model proposed in this paper improves on the total variation model and the Gaussian heat diffusion model. We analyze classic variational PDE-based denoising models and derive a unified variational PDE energy functional. We also analyze the diffusion behavior of the total variation model in detail and then propose an adaptive total variation diffusion model. By improving the diffusion coefficient and introducing a curvature operator that can distinguish details such as edges, the model effectively denoises tourist street scene images while avoiding the staircase effect. Building on the ROF model, we parameterize its fidelity and regularization terms to establish the adaptive total variation denoising model of this paper, which we analyze in detail. Experimental results show that, compared with several traditional denoising models, our model effectively suppresses the staircase effect during denoising while preserving the texture details in edge regions of the image, and it outperforms traditional models in both denoising performance and texture-structure preservation.
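For reference, the classical ROF total variation energy that such adaptive models build on can be written as below; the adaptive variant is a sketch under the assumption (the abstract does not spell out the exact parameterization) that the fidelity weight becomes spatially varying and the regularizer gains an edge-dependent coefficient.

```latex
% Classical ROF model: f is the noisy image, u the denoised estimate,
% lambda a constant fidelity weight.
\min_u \; E(u) = \int_\Omega |\nabla u| \, dx
  + \frac{\lambda}{2} \int_\Omega (u - f)^2 \, dx

% Adaptive variant (hypothetical parameterization): lambda(x) varies in
% space, and g(x) is an edge/curvature-dependent coefficient that slows
% diffusion near edges to avoid the staircase effect.
\min_u \; E(u) = \int_\Omega g(x)\, |\nabla u| \, dx
  + \int_\Omega \frac{\lambda(x)}{2} \, (u - f)^2 \, dx
```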


2021 · Vol. 13 (21) · pp. 4379
Author(s): Cuiping Shi, Xinlei Zhang, Jingwei Sun, Liguo Wang

For remote sensing scene image classification, many convolutional neural networks improve classification accuracy at the cost of higher time and space complexity. This slows the model down and prevents a trade-off between accuracy and running speed. As the network deepens, a simple double-branch structure struggles to extract the key features, and shallow features are lost, which is unfavorable for remote sensing scene classification. To solve this problem, we propose a dual-branch multi-level feature dense fusion-based lightweight convolutional neural network (BMDF-LCNN). The network fully extracts the information of the current layer through 3 × 3 depthwise separable convolution, 1 × 1 standard convolution, and identity branches, and fuses it with features extracted from the previous layer by 1 × 1 standard convolution, thus avoiding the loss of shallow information as the network deepens. In addition, we propose a downsampling structure better suited to extracting the shallow features of the network: a pooling branch performs the downsampling while a convolution branch compensates for the pooled features. Experiments on four open and challenging remote sensing scene data sets show that the proposed method achieves higher classification accuracy and lower model complexity than several state-of-the-art classification methods, realizing a trade-off between model accuracy and running speed.
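A minimal sketch of one such dual-branch fusion block, under assumptions (the block name, channel handling, and fusion by summation are hypothetical simplifications of the paper's design): a 3 × 3 depthwise separable branch, a 1 × 1 branch, and an identity branch are fused with 1 × 1-projected features from the previous level.

```python
import torch
import torch.nn as nn

class DualBranchFusion(nn.Module):
    """Hypothetical sketch of a dual-branch dense-fusion block (BMDF-LCNN style)."""
    def __init__(self, ch: int):
        super().__init__()
        self.dw = nn.Sequential(                 # 3x3 depthwise separable conv
            nn.Conv2d(ch, ch, 3, padding=1, groups=ch),
            nn.Conv2d(ch, ch, 1),
        )
        self.pw = nn.Conv2d(ch, ch, 1)           # 1x1 standard conv branch
        self.prev_proj = nn.Conv2d(ch, ch, 1)    # 1x1 projection of previous-layer features
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor, prev: torch.Tensor) -> torch.Tensor:
        # Fuse current-layer branches (incl. identity) with projected shallow features.
        return self.act(self.dw(x) + self.pw(x) + x + self.prev_proj(prev))

x = torch.rand(1, 32, 56, 56)
print(DualBranchFusion(32)(x, x).shape)  # torch.Size([1, 32, 56, 56])
```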


2021 · Vol. 2021 · pp. 1-7
Author(s): Tianfang Ma, Shuoyan Liu

At present, the information acquired for image synthesis in urban virtual geographic scenes is complex, and existing acquisition techniques are easily disturbed during information positioning and transmission, resulting in large acquisition delays that degrade the final quality of the synthesized image. To address these problems, we study a method for synchronously acquiring the information used in urban virtual geographic scene image synthesis based on wireless network technology. After constructing an urban virtual geographic environment, we spatially locate the landmark targets for geographic scene image synthesis and register the urban geographic scene images using the optical flow method. A synchronous wireless network route for scene image synthesis is designed with a beacon-based greedy algorithm; after partitioning the wireless network in the acquisition area into a grid, the packets of each acquisition node are collected and transmitted along the designed route, realizing synchronous acquisition of the image synthesis information. Comparative experiments show that the acquisition delay of the proposed method is below 0.5 s, the acquisition synchronization rate is significantly improved, and images synthesized from the acquired information are of higher quality, making the method more practical.
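The abstract does not detail its beacon-based greedy routing, but a common reading is greedy geographic forwarding: each node hands the packet to the neighbor closest to the sink beacon. A minimal sketch under that assumption (node/neighbor layout is illustrative):

```python
import math

def greedy_route(nodes, neighbors, src, sink):
    """Greedy geographic routing sketch. nodes: id -> (x, y); neighbors: id -> [ids]."""
    path, cur = [src], src
    while cur != sink:
        best = min(neighbors[cur],
                   key=lambda n: math.dist(nodes[n], nodes[sink]),
                   default=None)
        # Stop if no neighbor makes progress toward the sink (local minimum).
        if best is None or math.dist(nodes[best], nodes[sink]) >= math.dist(nodes[cur], nodes[sink]):
            break
        path.append(best)
        cur = best
    return path

nodes = {0: (0, 0), 1: (1, 0), 2: (2, 0), 3: (2, 1)}
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(greedy_route(nodes, neighbors, 0, 3))  # [0, 1, 2, 3]
```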


2021
Author(s): Zhang Chuyin, Zhao Hui Koh, Regan Gallagher, Shinji Nishimoto, Naotsugu Tsuchiya

Previous studies have established the view that human observers can perceive only coarse information from a natural scene image when it is presented rapidly (<100 ms, masked). In these studies, participants were usually forced to choose an answer from options preselected by the experimenters, which can underestimate what participants experience and can report. Here, we used a novel free-report paradigm to examine what people can freely report after a rapidly presented natural scene image (67/133/267 ms, masked). N = 670 online participants typed up to five words describing what they saw in the image, together with a confidence rating for each response. We developed a novel index, Intersubjective Agreement (IA), which quantifies how specifically a response word describes the target image: a high value means the word is rarely reported for other images. IA eliminates the need for experimenters to preselect response options. Using IA, and contrary to common belief, we demonstrate that participants report highly specific and detailed aspects of a briefly shown image (even at 67 ms, masked). Further, IA correlates positively with confidence, indicating metacognitive conscious access to the reported aspects of the image. These findings challenge the dominant view that the content of rapid scene experience is limited to a global, coarse gist. Our novel paradigm opens a door to investigating diverse contents of consciousness with free report.
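One way to make the IA idea concrete, as a sketch only (the paper's exact formula may differ): score a word by how often it is reported for the target image minus how often it appears in reports for all other images.

```python
def ia_score(word: str, target: str, reports: dict) -> float:
    """IA-style specificity sketch. reports: image_id -> list of reported words."""
    # Fraction of the target image's reports that use this word.
    in_target = reports[target].count(word) / max(len(reports[target]), 1)
    # Fraction of all other images' reports that use this word.
    others = [w for img, ws in reports.items() if img != target for w in ws]
    elsewhere = others.count(word) / max(len(others), 1)
    return in_target - elsewhere  # > 0: word is specific to the target image

reports = {"img1": ["dog", "park", "dog"], "img2": ["car", "street", "dog"]}
print(round(ia_score("dog", "img1", reports), 3))  # 0.333
```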


Author(s): Wenfang Zhang, Chi Xu

Traditional methods for denoising blurred images have low feature resolution. To improve noise removal and analysis for defocused, blurred night scene images, a noise removal algorithm based on bilateral filtering is proposed. The method comprises the following steps: build an acquisition model for the out-of-focus blurred night scene image with grid-block feature matching; enhance the image information with a high-resolution detail feature enhancement technique; collect the edge contour features of the image; design the grid-block feature matching of the image with a bilateral filtering information reconstruction technique; and build a gray-level histogram localization model for the image. A fuzzy pixel information fusion method is used to collect the gray-level features of the defocused blurred night scene images, and based on these features, the bilateral filtering algorithm automatically optimizes the noise removal. Simulation results show that images processed with this method achieve better noise removal performance, shorter running time, and a higher output peak signal-to-noise ratio.
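For orientation, a baseline sketch using OpenCV's standard bilateral filter (the paper's full pipeline adds feature matching and histogram localization, not shown here; the synthetic image and parameter values are illustrative):

```python
import cv2
import numpy as np

# Synthetic dark "night scene" with additive noise (illustrative only).
img = (np.random.rand(128, 128) * 40).astype(np.uint8)
noise = np.clip(np.random.normal(0, 10, img.shape), 0, 255).astype(np.uint8)
noisy = cv2.add(img, noise)

# Edge-preserving smoothing: d is the neighborhood diameter;
# sigmaColor/sigmaSpace control the range and spatial Gaussian falloff.
denoised = cv2.bilateralFilter(noisy, d=9, sigmaColor=50, sigmaSpace=50)

# Peak signal-to-noise ratio as a simple quality proxy, as the abstract evaluates.
print("PSNR:", cv2.PSNR(img, denoised))
```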

