Hybrid deep neural network for Bangla automated image descriptor

Author(s):  
Md Asifuzzaman Jishan ◽  
Khan Raqib Mahmud ◽  
Abul Kalam Al Azad ◽  
Md Shahabub Alam ◽  
Anif Minhaz Khan

Automated image-to-text generation is a computationally challenging computer vision task that requires sufficient comprehension of both the syntactic and semantic meaning of an image to generate a meaningful description. Until recently, it has been studied only to a limited extent because of the lack of visual-descriptor datasets and of functional models able to capture the intrinsic complexities of image features. In this study, a novel dataset called Bangla Natural Language Image to Text (BNLIT) was constructed by generating Bangla textual descriptors from visual input; it incorporates 100 annotated classes. A deep neural network-based image captioning model was proposed to generate image descriptions. The model employs a Convolutional Neural Network (CNN) to classify the whole dataset, while a Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) capture the sequential semantic representation of text-based sentences and generate pertinent descriptions based on the modular complexities of an image. When tested on the new dataset, the model achieves a significant performance improvement on the image semantic retrieval task. For this experiment, we implemented a hybrid image captioning model that achieved a remarkable result on the new self-made dataset; the task itself is new from the Bangladeshi perspective. In brief, the model provides benchmark precision in reconstructing characteristic Bangla syntax, and the study presents a comprehensive numerical analysis of the model's results on the dataset.

Author(s):  
Anish Banda

Abstract: We examine a deep neural network-based technique for image caption generation. Given an image as input, the proposed model produces output in three forms: a sentence describing the image in three different languages, an MP3 audio file, and an image file. The model uses techniques from both computer vision and natural language processing. We aim to build a caption generator using a Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM). The CNN compares the target image with the images in a large training dataset, and the model generates a decent description using the trained data. The CNN serves as the encoder, extracting features from images, and the LSTM serves as the decoder, generating the image description. To evaluate the accuracy of the generated caption we use the BLEU metric, which grades the quality of the generated content. Performance is calculated using standard evaluation metrics. Keywords: CNN, RNN, LSTM, BLEU score, encoder, decoder, captions, image description.
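The BLEU evaluation mentioned in this abstract can be illustrated with a minimal sketch. It assumes a single reference caption, whitespace tokenisation, and no smoothing; the function names `ngrams` and `bleu` and the example sentences are hypothetical and are not taken from the paper, which does not state which BLEU variant or implementation it used.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against one reference, without smoothing (an assumption)."""
    if not candidate:
        return 0.0
    log_precision_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(count, ref_counts[gram]) for gram, count in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # any zero precision makes the geometric mean zero
        log_precision_sum += math.log(overlap / total)
    # Brevity penalty discourages candidates shorter than the reference.
    if len(candidate) > len(reference):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1.0 - len(reference) / len(candidate))
    return brevity_penalty * math.exp(log_precision_sum / max_n)


# Hypothetical captions, scored with unigrams and bigrams only.
generated = "the cat sat on the mat".split()
reference = "the cat is on the mat".split()
score = bleu(generated, reference, max_n=2)
```

In practice, corpus-level BLEU with smoothing is usually preferred for short captions, because a single missing higher-order n-gram drives the unsmoothed sentence score to zero.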


2020 ◽  
Vol 50 (8) ◽  
pp. 2339-2351 ◽  
Author(s):  
Tianshi Wang ◽  
Li Liu ◽  
Naiwen Liu ◽  
Huaxiang Zhang ◽  
Long Zhang ◽  
...  

Kybernetes ◽  
2019 ◽  
Vol 49 (9) ◽  
pp. 2335-2348 ◽  
Author(s):  
Milad Yousefi ◽  
Moslem Yousefi ◽  
Masood Fathi ◽  
Flavio S. Fogliatto

Purpose: This study aims to investigate the factors affecting daily demand in an emergency department (ED) and to provide a forecasting tool in a public hospital for horizons of up to seven days. Design/methodology/approach: First, the important factors influencing demand in EDs were extracted from the literature, and those relevant to the study were selected. Then, a deep neural network was applied to construct a reliable predictor. Findings: Although many statistical approaches have been proposed for tackling this issue, better forecasts are achievable using machine learning algorithms. Results indicate that the proposed approach outperforms statistical alternatives available in the literature such as multiple linear regression, autoregressive integrated moving average (ARIMA), support vector regression, generalized linear models, generalized estimating equations, seasonal ARIMA and combined ARIMA and linear regression. Research limitations/implications: The authors applied this study in a single ED to forecast patient visits. Applying the same method in different EDs may give a better understanding of the model's performance. The same approach can be applied to other demand forecasting problems after minor modifications. Originality/value: To the best of the authors' knowledge, this is the first study to propose the use of long short-term memory for constructing a predictor of the number of patient visits in EDs.
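Training a sequence model for multi-day forecasts typically starts by cutting the daily series into input and target windows. The sketch below shows one common way to do this. The seven-day horizon comes from the abstract, but the function name `make_windows`, the lookback length, and the visit counts are hypothetical illustrations, not details from the study.

```python
def make_windows(series, lookback, horizon):
    """Split a daily series into (input window, target window) training pairs."""
    pairs = []
    last_start = len(series) - lookback - horizon
    for start in range(last_start + 1):
        inputs = series[start:start + lookback]
        targets = series[start + lookback:start + lookback + horizon]
        pairs.append((inputs, targets))
    return pairs


# Hypothetical daily ED visit counts (made-up numbers for illustration only).
daily_visits = [120, 135, 128, 140, 150, 110, 105, 125, 138, 131,
                144, 152, 112, 108, 127, 139, 133, 146, 155, 115, 109]

# Assumed lookback of 14 days; horizon of 7 days as in the abstract.
training_pairs = make_windows(daily_visits, lookback=14, horizon=7)
```

Each pair can then be fed to a recurrent model, with the input window as the sequence and the target window as the multi-step label.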


2020 ◽  
Vol 10 (16) ◽  
pp. 5622
Author(s):  
Zitong Zhou ◽  
Yanyang Zi ◽  
Jingsong Xie ◽  
Jinglong Chen ◽  
Tong An

The escalator is one of the most popular means of travel in public places, so its safe operation is important. Accurately predicting the escalator failure time can provide scientific guidance for maintenance to avoid accidents. However, failure data are short, non-uniformly sampled, and subject to random interference, which makes data modeling difficult. Therefore, this paper proposes a strategy that combines data quality enhancement with deep neural networks for escalator failure time prediction. First, a comprehensive selection indicator (CSI) that can describe the stationarity and complexity of time series is established to select inherently excellent failure sequences. According to the CSI, failure sequences with high stationarity and low complexity are selected as the reference sequences to enhance the quality of other failure sequences by using dynamic time warping preprocessing. Second, a deep neural network combining the advantages of a convolutional neural network and long short-term memory is built to train on and predict quality-enhanced failure sequences. Finally, the failure-recall record of six escalators used for 6 years is analyzed with the proposed method as a case study, and the results show that the proposed method can reduce the average prediction error of failure time to less than one month.
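Dynamic time warping, named in this abstract as the preprocessing step, aligns two sequences of possibly different lengths by allowing points to be stretched or repeated. The following is a minimal sketch of the classic distance computation with an absolute-difference cost. It shows only the alignment cost, not the paper's quality-enhancement procedure, and the function name `dtw_distance` and the sample sequences are hypothetical.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping distance with absolute-difference cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of: repeat b[j-1], repeat a[i-1], or advance both.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Hypothetical reference sequence and a time-stretched copy of it.
reference_sequence = [1, 2, 3]
stretched_sequence = [1, 2, 2, 3]
distance = dtw_distance(reference_sequence, stretched_sequence)
```

Because the stretched copy differs only by a repeated value, warping absorbs the difference, which is the property that makes the method attractive for non-uniformly sampled data.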


2018 ◽  
Vol 2018 ◽  
pp. 1-20 ◽  
Author(s):  
Abdullah-Al Nahid ◽  
Mohamad Ali Mehrabi ◽  
Yinan Kong

Breast cancer is a serious threat and one of the largest causes of death among women throughout the world. The identification of cancer largely depends on analysis of digital biomedical photography, such as histopathological images, by doctors and physicians. Analyzing histopathological images is a nontrivial task, and decisions based on these kinds of images always require specialised knowledge. However, Computer Aided Diagnosis (CAD) techniques can help the doctor make more reliable decisions. The state-of-the-art Deep Neural Network (DNN) has recently been introduced for biomedical image analysis. Normally each image contains structural and statistical information. This paper classifies a set of biomedical breast cancer images (BreakHis dataset) using novel DNN techniques guided by structural and statistical information derived from the images. Specifically, a Convolutional Neural Network (CNN), a Long Short-Term Memory (LSTM), and a combination of CNN and LSTM are proposed for breast cancer image classification. Softmax and Support Vector Machine (SVM) layers have been used for the decision-making stage after extracting features with the proposed novel DNN models. In this experiment, the best Accuracy value of 91.00% is achieved on the 200x dataset, the best Precision value of 96.00% is achieved on the 40x dataset, and the best F-Measure value is achieved on both the 40x and 100x datasets.
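Accuracy, Precision, and F-Measure, the metrics reported in this abstract, all derive from the counts of a confusion matrix. The sketch below computes them for a binary labelling. The function name `binary_metrics` and the label lists are hypothetical, and the numbers are unrelated to the BreakHis results quoted above.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F-measure for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F-measure is the harmonic mean of precision and recall.
    denominator = precision + recall
    f_measure = 2 * precision * recall / denominator if denominator else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}


# Hypothetical labels: 1 = malignant, 0 = benign.
true_labels = [1, 1, 1, 0, 0, 0, 0, 1]
predicted_labels = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(true_labels, predicted_labels)
```

Reporting precision and F-measure alongside accuracy matters for medical images, where class imbalance can make accuracy alone misleading.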


Processes ◽  
2021 ◽  
Vol 10 (1) ◽  
pp. 55
Author(s):  
Jae Eon Kwon ◽  
Tanvir Alam Shifat ◽  
Akeem Bayo Kareem ◽  
Jang-Wook Hur

Switched-mode power supplies (SMPS) are of vital importance in the power management of industrial equipment, offering much-improved efficiency and reliability. Given the diverse range of loading and operating conditions of an SMPS, several anomalies can occur in the device, resulting from over-voltage, overloading, erratic atmospheric conditions, etc. Electrical over-stress (EOS) is one of the most common causes of failure among power electronic devices. Because an SMPS is limited in input voltage and current (the two means of controlling it), the device was subjected to an accelerated aging test using EOS. This study presents a two-fold approach to evaluate the overall state of health of an SMPS by integrating an extended Kalman filter (EKF) with a deep neural network. First, the EKF algorithm fuses fault features to obtain a comprehensive degradation trend. Second, the degradation pattern of the SMPS is monitored for four different electrical loadings, and a bi-directional long short-term memory (BiLSTM) deep neural network is trained for future predictions. The proposed model provides a unique and accurate approach to SMPS fault indication with the aid of electrical parameters.
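For readers unfamiliar with the extended Kalman filter, the predict and update cycle can be sketched on a deliberately simple scalar problem. Everything here is an assumption made for illustration: the random-walk state model, the measurement model z = x² + noise, the noise variances, and the names `ekf_step` and `run_filter`. The study fuses several fault features, which this one-dimensional example does not attempt to reproduce.

```python
def ekf_step(x, p, z, q, r):
    """One predict/update cycle of a scalar EKF with measurement z = x**2 + noise."""
    # Predict: random-walk state model, so the estimate carries over
    # and its variance grows by the process noise q.
    x_pred = x
    p_pred = p + q
    # Linearise the measurement function h(x) = x**2 around the prediction.
    h_jacobian = 2.0 * x_pred
    innovation = z - x_pred ** 2
    innovation_var = h_jacobian * p_pred * h_jacobian + r
    gain = p_pred * h_jacobian / innovation_var
    # Update the state estimate and its variance.
    x_new = x_pred + gain * innovation
    p_new = (1.0 - gain * h_jacobian) * p_pred
    return x_new, p_new


def run_filter(measurements, x0=1.5, p0=1.0, q=0.01, r=0.1):
    """Run the scalar EKF over a measurement list and return the estimates."""
    x, p = x0, p0
    history = []
    for z in measurements:
        x, p = ekf_step(x, p, z, q, r)
        history.append(x)
    return history


# Hypothetical noiseless measurements of a true state of 2.0 (so z = 4.0).
estimates = run_filter([4.0] * 20)
```

The only difference from a linear Kalman filter is the Jacobian: the nonlinear measurement function is replaced by its local slope at each step.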


2021 ◽  
Vol 2021 ◽  
pp. 1-12
Author(s):  
Yangzi Zhao

The stock market is affected by the economy, policy, and other factors, and the laws governing its internal changes are extremely complex. With the rapid development of the stock market and the growing number of investors, the market has produced a large amount of transaction data, which makes it more difficult to obtain valuable information. Because deep neural networks are good at prediction problems with large amounts of data and complex nonlinear mapping relationships, this paper proposes an attention-guided deep neural network stock prediction algorithm. The paper combines a daily stock sentiment index derived from social media text with stock technical indicators as the data source and applies them to a long short-term memory (LSTM) neural network model to predict the stock market. The stock sentiment index is extracted by constructing a social text sentiment classification model: a bidirectional long short-term memory (Bi-LSTM) neural network based on an attention mechanism and the GloVe word vector representation algorithm. In addition, a dimensionality reduction model based on decision tree (DT) and principal component analysis (PCA) is constructed to reduce the dimensionality of the stock technical indicators and extract the main data information. Furthermore, this paper proposes a model based on NASNet for pattern recognition. The recognition results can be used to automatically identify short-term K-line patterns, predict reliable trading signals, and help investors customize short-term high-efficiency investment strategies. The experimental results show that the prediction accuracy of the proposed algorithm can reach 98.6%, which indicates high application value.
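The attention mechanism mentioned in this abstract can be sketched as a weighted pooling of recurrent hidden states. The abstract does not say which attention form the paper uses, so the dot-product scoring below is an assumption, and the function names `softmax` and `attention_pool` and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weight each hidden state by its dot-product score with the query and sum."""
    scores = [sum(h_d * q_d for h_d, q_d in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(query)
    context = [sum(w * h[d] for w, h in zip(weights, hidden_states))
               for d in range(dim)]
    return weights, context


# Hypothetical two-dimensional hidden states for a three-token sentence.
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query_vector = [1.0, 1.0]
weights, context = attention_pool(states, query_vector)
```

The resulting context vector would then feed a classifier layer, so tokens whose hidden states align with the query contribute more to the sentiment decision.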


2019 ◽  
Author(s):  
Kangkang Zhang ◽  
Tong Liu ◽  
Shengjing Song ◽  
Xin Zhao ◽  
Shijun Sun ◽  
...  

Abstract: Acquiring clear and usable audio recordings is critical for acoustic analysis of animal vocalizations. Bioacoustics studies commonly face the problem of overlapping signals, but the issue is often ignored, as there is currently no satisfactory solution. This study presents a bi-directional long short-term memory (BLSTM) network to separate overlapping bat calls and reconstruct waveform audio sounds. The separation quality was evaluated using seven temporal-spectrum parameters. The applicability of this method for bat calls was assessed using six different species. In addition, clustering analysis was conducted with separated echolocation calls from each population. Results showed that all syllables in the overlapping calls were separated with high robustness across species. A comparison of the seven temporal-spectrum parameters showed no significant difference and negligible deviation between the extracted and original calls, indicating high separation quality. Clustering analysis of the separated echolocation calls also produced an accuracy of 93.8%, suggesting the reconstructed waveform sounds could be reliably used. These results suggest the proposed technique is a convenient and automated approach for separating overlapping calls using a BLSTM network. This powerful deep neural network approach has the potential to solve complex problems in bioacoustics.
Author summary: In recent years, the development of recording techniques and devices for animal acoustic experiments and population monitoring has led to a sharp increase in the volume of sound data. However, the collected sounds often overlap because multiple individuals are present, which restricts full use of the experimental data. In addition, more convenient and automatic methods are needed to cope with the large datasets in animal acoustics.
The echolocation calls and communication calls of bats are variable and often overlap with each other in recordings from both the field and the laboratory, which provides an excellent template for research on animal sound separation. Here, for the first time, we successfully solved the problem of overlapping calls in bats using a deep neural network. We built a network to separate the overlapping calls of six bat species. All the syllables in overlapping calls were separated, and we found no significant difference between the separated syllables and non-overlapping syllables. We also demonstrated an instance of applying our method to species classification. Our study provides a useful and efficient model for sound data processing in acoustic research, and the proposed method has the potential to be generalized to other animal species.


2018 ◽  
Vol 35 (3) ◽  
pp. 445-470 ◽  
Author(s):  
Xiaoxiao Liu ◽  
Qingyang Xu ◽  
Ning Wang
