MODIFICATION OF ALEXNET ARCHITECTURE FOR DETECTION OF CAR PARKING AVAILABILITY IN VIDEO CCTV

2020 ◽  
Vol 13 (2) ◽  
pp. 47-55
Author(s):  
Evan Tanuwijaya ◽  
Chastine Fatichah

Finding a parking space in public places, especially during peak hours, is a common problem for drivers. To help drivers find an available space, a system is needed to monitor parking availability. One line of research detects parking availability from CCTV footage. However, existing CCTV-based approaches have several problems: parking slots are marked manually, which is inefficient when the system is applied to a different parking lot, and approaches that use existing Convolutional Neural Network (CNN) architectures have many parameters. Therefore, this study proposes a system that detects car parking availability using You Only Look Once (YOLO) V3 to mark the parking spaces and a new CNN architecture, called Lite AlexNet, which has fewer parameters than other methods, to speed up the detection of parking space availability. The best accuracy of the marking stage using YOLO V3 is 92.31%, obtained in cloudy weather. The proposed Lite AlexNet achieves the best average training time, 7 seconds, compared with other existing methods, and its average accuracy across all conditions is 92.33%, better than the other methods.
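The parameter argument in this abstract comes down to simple arithmetic, which the sketch below illustrates. The abstract does not give the Lite AlexNet layer sizes, so every layer shape here is a hypothetical value chosen only to show how kernel size and layer width drive the parameter count.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, out_channels, bias=True):
    """Number of trainable weights in a standard 2-D convolution layer."""
    weights = kernel_h * kernel_w * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def dense_params(in_features, out_features, bias=True):
    """Number of trainable weights in a fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)


# Hypothetical layer sizes, chosen only to show the arithmetic;
# they are not taken from the paper.
first_conv = conv2d_params(11, 11, 3, 96)   # 11x11 kernels, RGB input, 96 filters
wide_dense = dense_params(4096, 4096)       # a wide fully connected layer
narrow_dense = dense_params(256, 256)       # a much narrower one
```

Under these assumed shapes, the fully connected layers dominate the count, which is why shrinking them is a common way to make a network lighter.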

2021 ◽  
Vol 2021 ◽  
pp. 1-8
Author(s):  
Wu Yanchen

Recent advancements in deep learning offer an effective approach to machine vision using optical images. In this paper, a convolutional neural network is applied to sonar target detection, and the performance of several neural network models on the detection and recognition of underwater boxes and tires in sonar images is compared. The simulation results show that the neural network method proposed in this paper outperforms traditional machine learning methods and SSD network models. The average accuracy of the proposed method for sonar image target recognition is 93%, and the detection time for a single image is only 0.3 seconds.


2020 ◽  
Vol 7 (5) ◽  
pp. 191517
Author(s):  
Ganchao Bao ◽  
Yuan Wei ◽  
Xin Sun ◽  
Hongli Zhang

Answer selection is one of the key steps in many question answering (QA) applications. In this paper, a new deep model with two kinds of attention is proposed for answer selection: the double attention recurrent convolution neural network (DARCNN). Double attention means self-attention and cross-attention. The design of this model was inspired by the transformer used in machine translation. Self-attention can directly calculate dependencies between words regardless of their distance. However, self-attention does not distinguish between a word's surrounding words and other words. Thus, we design a decay self-attention that prioritizes local words in a sentence. In addition, cross-attention is established to achieve interaction between the question and the candidate answer. With the outputs of self-attention and decay self-attention, we obtain two kinds of interactive information via cross-attention. Finally, the feature vectors of the question and the answer are combined by elementwise multiplication, and a multilayer perceptron predicts the matching score. Experimental results on four QA datasets containing Chinese and English show that DARCNN performs better than other answer selection models, thereby demonstrating the effectiveness of self-attention, decay self-attention and cross-attention in answer selection tasks.
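The idea of prioritizing local words can be sketched as follows. The abstract does not give the decay formula, so this is only one plausible reading: a linear distance penalty is subtracted from the raw attention scores before the softmax. The function names, the penalty form, and `decay_rate` are all assumptions made for illustration, not the DARCNN definition.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def decay_attention_weights(scores, query_pos, decay_rate=0.5):
    """Penalize each score by its distance from the query position, then normalize.

    The linear penalty and decay_rate are assumptions made for illustration.
    """
    penalized = [s - decay_rate * abs(j - query_pos) for j, s in enumerate(scores)]
    return softmax(penalized)


# With equal raw scores, plain self-attention is uniform, while the decayed
# variant concentrates weight on the words nearest the query position.
raw_scores = [1.0, 1.0, 1.0, 1.0, 1.0]
plain = softmax(raw_scores)
decayed = decay_attention_weights(raw_scores, query_pos=2)
```

Setting `decay_rate` to 0 recovers plain softmax attention, so the penalty can be read as a tunable locality bias.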


Sensors ◽  
2021 ◽  
Vol 21 (1) ◽  
pp. 259
Author(s):  
Kang Zhang ◽  
Shengchang Lan ◽  
Guiyuan Zhang

The purpose of this paper was to investigate the effect of training state-of-the-art convolutional neural networks (CNNs) for millimeter-wave radar-based hand gesture recognition (MR-HGR). Focusing on the small training dataset problem in MR-HGR, this paper first proposed transferring the knowledge of CNN models from computer vision to MR-HGR by fine-tuning the models with radar data samples. Meanwhile, to handle the different data modality in MR-HGR, a parameterized representation of the temporal space-velocity (TSV) spectrogram was proposed as an integrated data modality of the time-evolving hand gesture features in the radar echo signals. TSV spectrograms representing six common gestures in human–computer interaction (HCI) from nine volunteers were used as the data samples in the experiment. The evaluated models included ResNet with 50, 101, and 152 layers, DenseNet with 121, 161, and 169 layers, as well as the lightweight MobileNet V2 and ShuffleNet V2, most of which were proposed in recent publications. In the experiment, both self-testing (ST) and the more persuasive cross-testing (CT) were used to evaluate whether the fine-tuned models generalize to the radar data samples. The CT results show that the best fine-tuned models can reach an average accuracy higher than 93%, with a comparable ST average accuracy of almost 100%. Moreover, to alleviate the problem caused by individual gesture habits, an auxiliary test was performed by adding four shots of the most heavily misclassified gestures to the training set. This enriching test resembles the scenario in which a tablet adapts to a new user. The results for two different volunteers in the enriching test show that the average accuracy of the enriched gestures improved from 55.59% and 65.58% to 90.66% and 95.95%, respectively. Compared with some baseline work in MR-HGR, the investigation in this paper can be beneficial in promoting MR-HGR in future industry applications and consumer electronics design.


2019 ◽  
Vol 6 (4) ◽  
pp. 413 ◽  
Author(s):  
Sisco Jupiyandi ◽  
Fadhil Rizqullah Saniputra ◽  
Yoga Pratama ◽  
Muhammad Robby Dharmawan ◽  
Imam Cholissodin

The size of a parking lot and the number of cars in it can make it hard for motorists to know which parking spaces are still empty. Existing parking systems still make less than optimal use of parking space and time. As the number of vehicles grows, so does the need for parking space. Many existing parking systems cannot yet handle these problems. This system can accurately determine the number of slots in the parking lot, making it easier for operators to know which spaces are empty. In addition, the system is designed so that users spend very little time searching for a parking space. The system combines GPU programming with Modified Yolo (M-Yolo). The GPU is needed in M-Yolo to process images and data in parallel in order to detect cars and count them. The test results show that using the GPU instead of the CPU speeds up the average computing time by 0.179 seconds, with an average accuracy of 100%.


Author(s):  
Pengyuan Bai ◽  
Hua Xu ◽  
Li Sun

The recognition of modulation schemes for communication signals is an important part of communication surveillance and spectrum monitoring. An algorithm based on deep learning and spectrum texture is proposed to recognize modulation schemes. Exploiting the subtle differences among the spectra of different modulation schemes, the algorithm uses a convolutional neural network to capture image texture features and then classifies them with a softmax classifier. The experiment shows that the algorithm performs better than the traditional algorithm based on feature parameters, while the captured features better reveal signal detail and reduce the effort spent on feature parameter design.


2021 ◽  
Vol 12 ◽  
Author(s):  
Hua Zhang ◽  
Ruoyun Gou ◽  
Jili Shang ◽  
Fangyao Shen ◽  
Yifan Wu ◽  
...  

Speech emotion recognition (SER) is a challenging task because of the affective variance between different speakers. The performance of SER depends heavily on the features extracted from speech signals, and establishing an effective feature extraction and classification model remains difficult. In this paper, we propose a new method for SER based on a Deep Convolutional Neural Network (DCNN) and a Bidirectional Long Short-Term Memory with Attention (BLSTMwA) model (DCNN-BLSTMwA). We first preprocess the speech samples by data enhancement and dataset balancing. Second, we extract three channels of log Mel-spectrograms (static, delta, and delta-delta) as the DCNN input. Then the DCNN model pre-trained on the ImageNet dataset is applied to generate segment-level features. We stack these features of a sentence into utterance-level features. Next, we adopt the BLSTM to learn high-level emotional features for temporal summarization, followed by an attention layer that focuses on emotionally relevant features. Finally, the learned high-level emotional features are fed into a Deep Neural Network (DNN) to predict the final emotion. Experiments on the EMO-DB and IEMOCAP databases obtain unweighted average recall (UAR) values of 87.86% and 68.50%, respectively, which are better than most popular SER methods and demonstrate the effectiveness of our proposed method.
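The three-channel input described above can be sketched as follows, assuming the log Mel-spectrogram has already been computed and is given as a list of per-frame feature vectors. The regression-style delta with a window of one frame and repeated edge frames is a common convention, not a detail taken from the paper; `delta` and `three_channel` are hypothetical helper names.

```python
def delta(frames, window=1):
    """Regression-style delta of a feature sequence, with edge frames repeated.

    frames: list of per-frame feature vectors (lists of floats).
    The window size is an assumption; the paper does not state it.
    """
    n_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, window + 1))
    out = []
    for t in range(n_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, window + 1):
                ahead = frames[min(t + n, n_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def three_channel(log_mel):
    """Stack static, delta, and delta-delta features as three channels."""
    d1 = delta(log_mel)
    d2 = delta(d1)
    return [log_mel, d1, d2]
```

Stacking the three channels gives the features the same shape as a three-channel image, which is what allows a DCNN pre-trained on ImageNet to be reused.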


Author(s):  
Triando Hamonangan Saragih ◽  
Diny Melsye Nurul Fajri ◽  
Wayan Firdaus Mahmudy ◽  
Abdul Latief Abadi ◽  
Yusuf Priyo Anggodo

Jatropha is a plant with many uses, but it can be attacked by various diseases. Expert systems can be applied to disease identification to help both farmers and extension workers identify the disease. One method that can be used is the Extreme Learning Machine, a neural network learning method in which each process requires only a single iteration. This study obtained a maximum accuracy of 66.67% with an average accuracy of 60.61%. This shows that identification using the Extreme Learning Machine is better than the comparison method used previously.


Author(s):  
Januar Adi Putra ◽  
Nanik Suciati ◽  
Arya Yudhi Wijaya

A local binary pattern is a binary code that describes a local texture pattern. It is built by thresholding a neighbourhood against the gray value of its center. The traditional local binary pattern has some disadvantages: it is variant to rotation, and its pixel thresholding process is sensitive to noise. This study proposes a new feature extraction method to solve these problems, called the full neighbor local binary pattern (fnlbp). The method is combined with the discrete wavelet transform to extract features from mammogram images, with a Backpropagation Neural Network (BPNN) as the classification method. In the experiments, the proposed method achieves a better average accuracy than the traditional local binary pattern, whether or not it is combined with the discrete wavelet transform. The proposed full neighbor local binary pattern can reach a perfect accuracy of 100%, both with and without the discrete wavelet transform, while the lowest accuracy obtained is 90.49%.
Keywords: feature extraction, local binary pattern, wavelet, mammogram classification.
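For reference, the traditional local binary pattern that this abstract takes as its baseline can be sketched as below. This is the basic 3x3 operator, not the proposed fnlbp, whose definition the abstract does not give. The clockwise bit order starting at the top-left corner is one common convention, assumed here.

```python
def lbp_code(patch):
    """Basic 8-neighbour LBP code for the centre pixel of a 3x3 patch."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner
    # (an assumed bit order; other orders are also in use).
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if patch[r][c] >= center:
            code |= 1 << bit
    return code


def lbp_image(image):
    """LBP codes for every interior pixel of a 2-D grayscale image (list of rows)."""
    rows, cols = len(image), len(image[0])
    return [
        [lbp_code([image[r + dr][c - 1:c + 2] for dr in (-1, 0, 1)])
         for c in range(1, cols - 1)]
        for r in range(1, rows - 1)
    ]
```

Because each bit is tied to a fixed neighbour position, rotating the patch changes the code, and a small change in one pixel near the centre value can flip a bit. These are the rotation and noise weaknesses the abstract refers to.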

