Urban street scene analysis using lightweight multi-level multi-path feature aggregation network

2021 ◽  
Vol 17 (3) ◽  
pp. 249-271
Author(s):  
Tanmay Singha ◽  
Duc-Son Pham ◽  
Aneesh Krishna

Urban street scene analysis is an important problem in computer vision, with many off-line models achieving outstanding semantic segmentation results. However, it remains an ongoing challenge for the research community to develop and optimize deep neural architectures that run in real time with low computing requirements whilst maintaining good performance. Balancing model complexity and performance has been a major hurdle, with many models dropping too much accuracy for a slight reduction in model size and being unable to handle high-resolution input images. The study aims to address this issue with a novel model, named M2FANet, that provides a much better balance between efficiency and accuracy for scene segmentation than other alternatives. The proposed optimised backbone helps to increase the model’s efficiency, whereas the suggested Multi-level Multi-path (M2) feature aggregation approach enhances the model’s performance in the real-time environment. By exploiting a multi-feature scaling technique, M2FANet produces state-of-the-art results in resource-constrained situations while handling full input resolution. On the Cityscapes benchmark data set, the proposed model produces 68.5% and 68.3% class accuracy on the validation and test sets respectively, whilst having only 1.3 million parameters. Compared with all real-time models of less than 5 million parameters, the proposed model is the most competitive in both performance and real-time capability.

Author(s):  
Pranav Kale ◽  
Mayuresh Panchpor ◽  
Saloni Dingore ◽  
Saloni Gaikwad ◽  
Prof. Dr. Laxmi Bewoor

In today's world, deep learning is advancing at increasing speed, and many innovations and algorithms are being developed. In computer vision for the autonomous driving sector, traffic signs play an important role in providing real-time data about the environment. Different algorithms have been developed to classify these signs, but performance still needs to improve for real-time environments, and the computational power required to train such models is high. In this paper, a Convolutional Neural Network model is used to classify traffic signs. The experiments are conducted on a real-world data set with images and videos captured from ordinary car driving as well as on the GTSRB dataset [15] available on Kaggle. The proposed model outperforms previous models, with an accuracy of 99.6% on the validation set. This idea has been granted an Innovation Patent by Australian IP to the authors of this research paper. [24]


2020 ◽  
Vol 13 (5) ◽  
pp. 957-964
Author(s):  
Siva Rama Krishna ◽  
Mohammed Ali Hussain

Background: In recent years, computational memory and energy conservation have become a major problem in the cloud computing environment due to the increase in data size and computing resources. Most cloud providers offer different cloud services and resources, which are used by a limited number of user applications. Objective: The main objective of this work is to design and implement a cloud resource allocation and resource scheduling model for the cloud environment. Methods: A novel cloud server-to-resource management technique is proposed for a real-time cloud environment to minimize cost and time. In this model, different types of cloud resources and their services are scheduled using multi-level objective constraint programming. The proposed cloud server-based resource allocation model is based on optimization functions that minimize resource allocation time and cost. Results: Experimental results showed that the proposed model performs better in resource allocation time and cost than the existing resource allocation models. Conclusion: This cloud service and resource optimization model is efficiently implemented and tested in real-time cloud instances with different types of services and resource sets.


Author(s):  
Ping Kuang ◽  
Tingsong Ma ◽  
Fan Li ◽  
Ziwei Chen

Pedestrian detection gives the managers of a smart city a great opportunity to manage their city effectively and automatically. Specifically, pedestrian detection technology can improve security and make traffic more efficient. In this paper, all of our modifications and improvements are based on YOLO, a real-time Convolutional Neural Network detector. We extend YOLO’s original network structure and give a new definition of the loss function to boost performance for pedestrian detection, especially when the targets are small, which is exactly where YOLO is weak. In our experiments, the proposed model is tested on INRIA, the UCF YouTube Action Data Set and the Caltech Pedestrian Detection Benchmark. Experimental results indicate that, after our modifications and improvements, the revised YOLO network outperforms the original version and is also better than other solutions.


2020 ◽  
Vol 2 (1) ◽  
pp. 25-34 ◽  
Author(s):  
Dr. Dhaya R.

Monitoring of traffic and unprecedented violence has become necessary in urban as well as rural areas, so this paper attempts to develop a CCTV surveillance system for unprecedented violence and traffic monitoring. The proposed method synchronizes and aligns the videos using motion detection and contour filtering algorithms. The motion detection steps identify the movement of objects such as vehicles and unprecedented activities, whereas the filtering identifies the object itself using its color. The synchronization and alignment process provides the details of each object in the scene. The proposed algorithm is developed in Java, with the support of an open-source library. The proposed model was validated on a data set acquired in real time. Moreover, the results were compared with algorithms created earlier; the comparison showed that the proposed model obtained results faster by a factor of 12.3912 than the existing methods, for a test video resolution of 240.01 x 320.01 at 40 frames per second with high-definition cameras. Further, the results were computed by running the application on embedded CPU and GPU processors.
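
The motion detection step described above can be illustrated with simple frame differencing. The following is only a minimal Python sketch, not the authors' Java implementation: the function names `motion_mask` and `bounding_box`, the threshold of 25, and the tiny grayscale frames are all assumptions made for illustration.

```python
def motion_mask(prev_frame, curr_frame, threshold=25):
    """Mark pixels whose grayscale intensity changed by more than `threshold`.

    Frames are lists of rows of integer intensities (an assumed toy format).
    """
    return [
        [1 if abs(c - p) > threshold else 0 for p, c in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def bounding_box(mask):
    """Return (top, left, bottom, right) of the changed region, or None."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not rows:
        return None  # no pixel exceeded the threshold, so nothing moved
    return (min(rows), min(cols), max(rows), max(cols))


# Two hypothetical 3x4 frames: a bright object appears in the second one.
prev_frame = [[10, 10, 10, 10], [10, 10, 10, 10], [10, 10, 10, 10]]
curr_frame = [[10, 10, 10, 10], [10, 200, 200, 10], [10, 10, 200, 10]]

mask = motion_mask(prev_frame, curr_frame)
box = bounding_box(mask)
print(box)
```

A real system would typically add noise suppression and a contour or color filter on top of the raw mask, as the abstract suggests.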


2016 ◽  
Vol 14 (1) ◽  
pp. 172988141668270 ◽  
Author(s):  
Di Guo ◽  
Fuchun Sun ◽  
Tao Kong ◽  
Huaping Liu

Grasping has always been a great challenge for robots because of their limited ability to understand perceived sensing data. In this work, we propose an end-to-end deep vision network model to predict possible good grasps from real-world images in real time. In order to accelerate grasp detection, reference rectangles are designed to suggest potential grasp locations and are then refined to indicate robotic grasps in the image. With the proposed model, the graspable scores for each location in the image and the corresponding predicted grasp rectangles can be obtained in real time at a rate of 80 frames per second on a graphics processing unit. The model is evaluated on a real robot-collected data set, and different reference rectangle settings are compared to yield the best detection performance. The experimental results demonstrate that the proposed approach can help the robot quickly learn the graspable part of an object from the image.


2019 ◽  
Vol XVI (2) ◽  
pp. 1-11
Author(s):  
Farrukh Jamal ◽  
Hesham Mohammed Reyad ◽  
Soha Othman Ahmed ◽  
Muhammad Akbar Ali Shah ◽  
Emrah Altun

A new three-parameter continuous model called the exponentiated half-logistic Lomax distribution is introduced in this paper. Basic mathematical properties of the proposed model are investigated, including raw and incomplete moments, skewness, kurtosis, generating functions, Rényi entropy, Lorenz, Bonferroni and Zenga curves, probability weighted moments, the stress-strength model, order statistics, and record statistics. The model parameters are estimated by maximum likelihood, and the behaviour of these estimates is examined in a simulation study. The applicability of the new model is illustrated by applying it to a real data set.


Author(s):  
Kyungkoo Jun

Background & Objective: This paper proposes a Fourier transform-inspired method to classify human activities from time series sensor data. Methods: Our method begins by decomposing a 1D input signal into 2D patterns, an idea motivated by the Fourier transform. The decomposition is aided by a Long Short-Term Memory (LSTM) network, which captures the temporal dependency in the signal and then produces encoded sequences. The sequences, once arranged into a 2D array, can represent the fingerprints of the signals. The benefit of such a transformation is that we can exploit recent advances in deep learning models for image classification, such as the Convolutional Neural Network (CNN). Results: The proposed model is thus a combination of LSTM and CNN. We evaluate the model on two data sets. For the first data set, which is more standardized than the other, our model outperforms or at least equals previous works. For the second data set, we devise schemes to generate training and testing data by changing the window size, the sliding size, and the labeling scheme. Conclusion: The evaluation results show that the accuracy is over 95% in some cases. We also analyze the effect of the parameters on the performance.
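
The window size, sliding size, and labeling scheme mentioned above can be sketched as a sliding-window segmentation of the sensor signal. This Python sketch is illustrative only: the function name `sliding_windows`, the majority-vote labeling rule, and the sample data are assumptions, and the paper may use a different labeling scheme.

```python
from collections import Counter


def sliding_windows(signal, labels, window_size, stride):
    """Cut a 1D signal into overlapping windows with one label per window.

    Each window is labeled by majority vote over its per-sample labels
    (an assumed labeling scheme, chosen here for simplicity).
    """
    windows = []
    for start in range(0, len(signal) - window_size + 1, stride):
        end = start + window_size
        majority = Counter(labels[start:end]).most_common(1)[0][0]
        windows.append((signal[start:end], majority))
    return windows


# Hypothetical accelerometer magnitudes with per-sample activity labels.
signal = [0.1, 0.2, 0.3, 0.9, 1.0, 1.1, 1.2, 0.2]
labels = ["walk", "walk", "walk", "run", "run", "run", "run", "walk"]

windows = sliding_windows(signal, labels, window_size=4, stride=2)
for segment, label in windows:
    print(label, segment)
```

A smaller stride produces more, heavily overlapping training windows, while a larger window gives each example more temporal context, which is the trade-off the parameter analysis in the abstract refers to.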


Author(s):  
Dhilsath Fathima.M ◽  
S. Justin Samuel ◽  
R. Hari Haran

Aim: This work develops an improved and robust machine learning model for predicting Myocardial Infarction (MI), which could have substantial clinical impact. Objectives: This paper explains how to build a machine learning-based computer-aided analysis system for early and accurate prediction of Myocardial Infarction (MI), using the Framingham Heart Study dataset for validation and evaluation. The proposed computer-aided analysis model will support medical professionals in predicting myocardial infarction proficiently. Methods: The proposed model uses mean imputation to fill in the missing values in the data set, then applies principal component analysis (PCA) to extract the optimal features and enhance the performance of the classifiers. After PCA, the reduced features are partitioned into a training dataset and a testing dataset: 70% of the data is given as input to train four popular classifiers (support vector machine, k-nearest neighbor, logistic regression and decision tree), and the remaining 30% is used to evaluate the output of the machine learning model with the following performance metrics: confusion matrix, classifier accuracy, precision, sensitivity, F1-score and AUC-ROC curve. Results: The output of the classifiers is evaluated using these performance measures, and we observed that logistic regression provides higher accuracy than the k-NN, SVM and decision tree classifiers, and that PCA performs well as a feature extraction method for enhancing the performance of the proposed model. From these analyses, we conclude that logistic regression has a good mean accuracy and standard deviation of accuracy compared with the other three algorithms. The AUC-ROC curves of the classifiers (Figures 4 and 5) show that logistic regression exhibits a good AUC-ROC score, around 70%, compared to the k-NN and decision tree algorithms. Conclusion: From the result analysis, we infer that the proposed machine learning model will act as an optimal decision-making system that predicts acute myocardial infarction at an earlier stage than existing machine learning-based prediction models. It is capable of predicting the presence of acute myocardial infarction in humans from heart disease risk factors, in order to decide when to start lifestyle modification and medical treatment to prevent heart disease.
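
Parts of the pipeline described above (mean imputation, the 70/30 split, and the confusion-matrix metrics) can be sketched in plain Python. This is a hedged illustration and not the authors' code: the function names, the fixed random seed, and the toy data are assumptions, and the PCA and classifier training steps are deliberately omitted.

```python
import random


def mean_impute(rows):
    """Replace each missing value (None) with the mean of its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if row[j] is None else row[j] for j in range(n_cols)]
            for row in rows]


def split_70_30(rows, seed=0):
    """Shuffle a copy of the rows and split it 70% train / 30% test."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(round(0.7 * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def binary_metrics(y_true, y_pred):
    """Accuracy, precision, sensitivity (recall) and F1 for 0/1 labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"accuracy": (tp + tn) / len(y_true), "precision": precision,
            "sensitivity": sensitivity, "f1": f1}


# Toy data (hypothetical): two features with missing entries.
imputed = mean_impute([[1.0, None], [3.0, 4.0], [None, 8.0]])
train, test = split_70_30(list(range(10)))
metrics = binary_metrics([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 1, 1, 0])
print(imputed, len(train), len(test), metrics)
```

In practice the imputation means would be computed on the training portion only and then applied to the test portion, to avoid leaking test information into training.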


2019 ◽  
Vol 55 (13) ◽  
pp. 742-745 ◽  
Author(s):  
Kang Yang ◽  
Huihui Song ◽  
Kaihua Zhang ◽  
Jiaqing Fan

Sensors ◽  
2020 ◽  
Vol 20 (6) ◽  
pp. 1633 ◽  
Author(s):  
Beom-Su Kim ◽  
Sangdae Kim ◽  
Kyong Hoon Kim ◽  
Tae-Eung Sung ◽  
Babar Shah ◽  
...  

Many applications are able to obtain enriched information by employing a wireless multimedia sensor network (WMSN) in industrial environments, which consists of nodes that are capable of processing multimedia data. However, as many aspects of WMSNs still need to be refined, this remains a potential research area. An efficient application needs the ability to capture and store the latest information about an object or event, which requires real-time multimedia data to be delivered to the sink in a timely manner. Motivated by this goal, we developed a new adaptive QoS routing protocol based on the (m,k)-firm model. The proposed model processes captured information by employing a multimedia stream in the (m,k)-firm format. In addition, the model includes a new adaptive real-time protocol and traffic handling scheme that transmit event information by selecting the next hop according to the flow status as well as the requirement of the (m,k)-firm model. Unlike the previous approach, two-level adjustment in the routing protocol and traffic management increases the number of packets delivered successfully within the deadline, and path setup schemes along the previous route reduce packet loss until a new path is established. Our simulation results demonstrate that the proposed schemes improve the stream dynamic success ratio and network lifetime compared to previous work by meeting the requirement of the (m,k)-firm model regardless of the amount of traffic.
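
The (m,k)-firm requirement referred to above is commonly defined as follows: in any window of k consecutive packets, at least m must meet their deadline. The Python sketch below checks that condition over a stream of delivery outcomes. The class name `MKFirmMonitor` and the choice to treat not-yet-seen packets as met are assumptions for illustration; it does not reproduce the routing protocol itself.

```python
from collections import deque


class MKFirmMonitor:
    """Track whether a packet stream currently satisfies an (m,k)-firm constraint."""

    def __init__(self, m, k):
        if not 0 < m <= k:
            raise ValueError("require 0 < m <= k")
        self.m = m
        self.k = k
        # Only the k most recent outcomes matter; older ones are dropped.
        self.history = deque(maxlen=k)

    def record(self, met_deadline):
        """Record one packet outcome and return the current constraint status."""
        self.history.append(bool(met_deadline))
        return self.is_satisfied()

    def is_satisfied(self):
        # At least m of the last k met  <=>  at most k - m missed.
        # Before k packets are seen, unseen slots are assumed to be met.
        misses = self.history.count(False)
        return misses <= self.k - self.m


# Hypothetical stream under a (2,3)-firm constraint.
monitor = MKFirmMonitor(m=2, k=3)
outcomes = [True, False, True, False, False]
status = [monitor.record(o) for o in outcomes]
print(status)
```

A routing layer could consult such a monitor per flow and prioritize streams that are one miss away from violating the constraint.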

