To the question of realization of machine vision technology based on FPGA-architectures

2021 · Vol 2142 (1) · pp. 012022
Author(s): K A Timakov

Abstract In recent years, machine learning and machine vision technologies have gained increasing popularity, and the field now occupies one of the leading positions in information technology. The paper is devoted to the development of a machine vision algorithm based on new generations of FPGAs for recognizing handwritten Cyrillic characters in images and, in particular, in video streams. The article addresses the use of an FPGA as an image segmentation accelerator, the organization of work with the video stream, the choice of the most suitable FPGA platform, the creation of training samples of handwritten characters, and work with the AlexNet convolutional neural network.
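As an illustration of the network side of such a pipeline (the FPGA deployment itself is out of scope here), a minimal sketch of adapting torchvision's AlexNet to a Cyrillic character classifier might look as follows; the 33-class output and input size are assumptions, not details from the paper:

```python
# Hypothetical sketch: adapting torchvision's AlexNet to classify
# handwritten Cyrillic characters (33 classes, one per Russian letter,
# is an assumption). Requires torchvision >= 0.13 for the weights API.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 33  # assumed: one class per letter of the Russian alphabet

model = models.alexnet(weights=models.AlexNet_Weights.DEFAULT)
# Replace the final fully connected layer with a 33-way classifier.
model.classifier[6] = nn.Linear(4096, NUM_CLASSES)

# A 224x224 RGB crop of a segmented character, as AlexNet expects.
dummy = torch.randn(1, 3, 224, 224)
logits = model(dummy)
print(logits.shape)  # torch.Size([1, 33])
```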

2021 · Vol 4 (1)
Author(s): Peter M. Maloca, Philipp L. Müller, Aaron Y. Lee, Adnan Tufail, Konstantinos Balaskas, ...

Abstract Machine learning has greatly facilitated the analysis of medical data, yet its internal operations usually remain opaque. To better comprehend these opaque procedures, a convolutional neural network for optical coherence tomography image segmentation was enhanced with a Traceable Relevance Explainability (T-REX) technique. The proposed application was based on three components: ground truth generation by multiple graders, calculation of Hamming distances among the graders and the machine learning algorithm, and a smart data visualization ('neural recording'). An overall average variability of 1.75% between the human graders and the algorithm was found, slightly lower than the 2.02% among the human graders themselves. The ambiguity in the ground truth had a noteworthy impact on the machine learning results, which could be visualized. The convolutional neural network balanced between graders and allowed for modifiable predictions dependent on the compartment. Using the proposed T-REX setup, machine learning processes could be rendered more transparent and understandable, possibly leading to optimized applications.
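The disagreement measure used here, the Hamming distance between segmentation masks, can be sketched in a few lines; the mask sizes and random masks below are placeholders, not the study's data:

```python
# Hypothetical sketch: pairwise Hamming distance between binary
# segmentation masks, expressed as the fraction of disagreeing pixels.
import numpy as np

def hamming_distance(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Fraction of pixels on which two binary masks disagree."""
    assert mask_a.shape == mask_b.shape
    return float(np.mean(mask_a != mask_b))

rng = np.random.default_rng(0)
grader_1 = rng.integers(0, 2, size=(256, 256))   # assumed mask size
grader_2 = rng.integers(0, 2, size=(256, 256))
algorithm = rng.integers(0, 2, size=(256, 256))

print(f"grader vs grader:    {hamming_distance(grader_1, grader_2):.2%}")
print(f"grader vs algorithm: {hamming_distance(grader_1, algorithm):.2%}")
```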


Complexity · 2020 · Vol 2020 · pp. 1-13
Author(s): Feng-Ping An, Jun-e Liu

Medical image segmentation is a key technology for image guidance, so the quality of segmentation plays an important role in image-guided surgery. Traditional machine learning methods have achieved certain beneficial effects in medical image segmentation, but they suffer from low classification accuracy and poor robustness. Deep learning offers good generalizability and feature extraction ability, which provides a new approach to medical image segmentation. However, deep learning faces two problems in this application: the network structure is not constructed according to the characteristics of medical images, and the generalizability of the trained model is weak. To address these issues, this paper first adapts a neural network to medical image features by adding cross-layer connections to a traditional convolutional neural network, establishing an optimized convolutional neural network model that can segment medical images using features at two scales simultaneously. Second, to address the generalizability problem, an adaptive distribution function is designed according to the position of each hidden layer, and the activation probability of the neurons in each layer is set accordingly; this enhances the generalizability of dropout and yields an adaptive dropout model. Based on these ideas, the paper proposes a medical image segmentation algorithm built on the optimized convolutional neural network with adaptive dropout. An ultrasonic tomographic image and a lumbar CT image were segmented separately with the proposed method. The experimental results show that the method not only improves segmentation over traditional machine learning and other deep learning methods but also adapts well to various kinds of medical images. This work provides a new perspective for research on medical image segmentation.
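The adaptive dropout idea, a dropout rate that varies with a hidden layer's position, might be sketched as follows; the linear depth-dependent schedule and layer sizes are illustrative assumptions, not the authors' exact distribution function:

```python
# Hypothetical sketch: dropout whose rate depends on the hidden layer's
# depth, echoing the adaptive-distribution idea. The linear schedule
# below is an illustrative assumption.
import torch
import torch.nn as nn

def depth_dependent_rate(layer_idx: int, num_layers: int,
                         p_min: float = 0.1, p_max: float = 0.5) -> float:
    """Interpolate the dropout rate from p_min (first layer) to p_max (last)."""
    frac = layer_idx / max(num_layers - 1, 1)
    return p_min + (p_max - p_min) * frac

class AdaptiveDropoutMLP(nn.Module):
    def __init__(self, dims=(784, 512, 256, 10)):
        super().__init__()
        layers = []
        num_hidden = len(dims) - 2
        for i in range(len(dims) - 1):
            layers.append(nn.Linear(dims[i], dims[i + 1]))
            if i < num_hidden:  # no dropout after the output layer
                p = depth_dependent_rate(i, num_hidden)
                layers += [nn.ReLU(), nn.Dropout(p)]
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)

model = AdaptiveDropoutMLP()
print(model(torch.randn(4, 784)).shape)  # torch.Size([4, 10])
```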


Author(s): Satoru Tsuiki, Takuya Nagaoka, Tatsuya Fukuda, Yuki Sakamoto, Fernanda R. Almeida, ...

Abstract
Purpose: In 2-dimensional lateral cephalometric radiographs, patients with severe obstructive sleep apnea (OSA) exhibit a more crowded oropharynx than non-OSA subjects. We tested the hypothesis that machine learning, an application of artificial intelligence (AI), could be used to detect patients with severe OSA from 2-dimensional images.
Methods: A deep convolutional neural network was developed (n = 1258; 90%) and tested (n = 131; 10%) using data from 1389 (100%) lateral cephalometric radiographs obtained from individuals diagnosed with severe OSA (n = 867; apnea-hypopnea index > 30 events/h of sleep) or non-OSA (n = 522; apnea-hypopnea index < 5 events/h of sleep) at a single center for sleep disorders. Three kinds of data sets were prepared by changing the area of interest of a single image: the original image without any modification (full image); an image containing the facial profile, upper airway, and craniofacial soft/hard tissues (main region); and an image containing part of the occipital region (head only). A radiologist also performed a conventional manual cephalometric analysis of the full image for comparison.
Results: Sensitivity/specificity was 0.87/0.82 for the full image, 0.88/0.75 for the main region, 0.71/0.63 for head only, and 0.54/0.80 for the manual analysis. The area under the receiver-operating characteristic curve was highest for the main region (0.92), followed by the full image (0.89), manual cephalometric analysis (0.75), and head only (0.70).
Conclusions: A deep convolutional neural network identified individuals with severe OSA with high accuracy. These results encourage further research on AI-based image analysis for the triage of OSA.
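For reference, the reported operating metrics can be reproduced from predicted probabilities in a few lines; the labels and scores below are synthetic placeholders, not the study's data:

```python
# Hypothetical sketch: computing sensitivity, specificity, and ROC-AUC
# from binary labels and predicted probabilities with scikit-learn.
import numpy as np
from sklearn.metrics import confusion_matrix, roc_auc_score

y_true = np.array([1, 1, 1, 0, 0, 1, 0, 0])  # 1 = severe OSA (synthetic)
y_prob = np.array([0.9, 0.8, 0.4, 0.3, 0.2, 0.7, 0.6, 0.1])
y_pred = (y_prob >= 0.5).astype(int)         # assumed decision threshold

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
auc = roc_auc_score(y_true, y_prob)
print(f"sensitivity={sensitivity:.2f}  specificity={specificity:.2f}  AUC={auc:.2f}")
```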


2021 · Vol 7 (2) · pp. 37
Author(s): Isah Charles Saidu, Lehel Csató

We present a sample-efficient image segmentation method using active learning, which we call Active Bayesian UNet, or AB-UNet. This is a convolutional neural network using batch normalization and max-pool dropout. The Bayesian setup is achieved by exploiting the probabilistic extension of the dropout mechanism, making it possible to use the uncertainty inherently present in the system. We set up our experiments on various medical image datasets and show that, with a smaller annotation effort, AB-UNet leads to stable training and better generalization. In addition, the method can efficiently choose which images to annotate from an unlabelled dataset.
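The Bayesian behaviour comes from keeping dropout active at inference time (Monte Carlo dropout); a minimal sketch of per-pixel uncertainty estimation under that setup follows, with a tiny stand-in network instead of the full AB-UNet:

```python
# Hypothetical sketch: Monte Carlo dropout at inference time. Dropout
# stays active so repeated forward passes sample from an approximate
# posterior; the per-pixel variance serves as an uncertainty map that
# active learning can use to pick informative unlabelled images.
import torch
import torch.nn as nn

net = nn.Sequential(              # tiny stand-in for the full AB-UNet
    nn.Conv2d(1, 8, 3, padding=1), nn.ReLU(),
    nn.Dropout2d(p=0.5),
    nn.Conv2d(8, 1, 3, padding=1), nn.Sigmoid(),
)
net.train()  # keep dropout stochastic during inference

x = torch.randn(1, 1, 64, 64)     # one unlabelled image (assumed size)
with torch.no_grad():
    samples = torch.stack([net(x) for _ in range(20)])  # 20 MC passes

mean_pred = samples.mean(dim=0)    # segmentation estimate
uncertainty = samples.var(dim=0)   # high variance = informative to label
print(mean_pred.shape, uncertainty.max().item())
```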


Sensors · 2019 · Vol 19 (1) · pp. 210
Author(s): Zied Tayeb, Juri Fedjaev, Nejla Ghaboosi, Christoph Richter, Lukas Everding, ...

Non-invasive, electroencephalography (EEG)-based brain-computer interfaces (BCIs) for motor imagery translate the subject's motor intention into control signals by classifying the EEG patterns caused by different imagination tasks, e.g., hand movements. This type of BCI has been widely studied and used as an alternative mode of communication and environmental control for disabled patients, such as those suffering from a brainstem stroke or a spinal cord injury (SCI). Notwithstanding the success of traditional machine learning methods in classifying EEG signals, these methods still rely on hand-crafted features, whose extraction is difficult due to the high non-stationarity of EEG signals and is a major cause of the stagnating progress in classification performance. Remarkable advances in deep learning allow end-to-end learning without any feature engineering, which could benefit BCI motor imagery applications. We developed three deep learning models for decoding motor imagery movements directly from raw EEG signals without any manual feature engineering: (1) a long short-term memory network (LSTM); (2) a spectrogram-based convolutional neural network (CNN); and (3) a recurrent convolutional neural network (RCNN). Results were evaluated on our own publicly available EEG data collected from 20 subjects and on the existing 2b EEG dataset from "BCI Competition IV". Overall, better classification performance was achieved with the deep learning models than with state-of-the-art machine learning techniques, which could chart a route ahead for developing new robust techniques for EEG signal decoding. We underpin this point by demonstrating the successful real-time control of a robotic arm using our CNN-based BCI.
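As an illustration of end-to-end decoding without hand-crafted features, a compact 1-D convolutional classifier over raw EEG windows might look as follows; the channel count, window length, and architecture are assumptions, not the authors' exact models:

```python
# Hypothetical sketch: a compact 1-D CNN that maps raw EEG windows
# (channels x samples) directly to motor-imagery classes, with no
# hand-crafted features. Shapes and layer sizes are assumptions.
import torch
import torch.nn as nn

N_CHANNELS, N_SAMPLES, N_CLASSES = 3, 512, 2  # assumed EEG window format

model = nn.Sequential(
    nn.Conv1d(N_CHANNELS, 16, kernel_size=11, padding=5), nn.ReLU(),
    nn.MaxPool1d(4),
    nn.Conv1d(16, 32, kernel_size=11, padding=5), nn.ReLU(),
    nn.MaxPool1d(4),
    nn.Flatten(),
    nn.Linear(32 * (N_SAMPLES // 16), N_CLASSES),
)

batch = torch.randn(8, N_CHANNELS, N_SAMPLES)  # 8 raw EEG windows
print(model(batch).shape)  # torch.Size([8, 2])
```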


Author(s): E. Yu. Shchetinin

The recognition of human emotions is one of the most active and dynamically developing areas of modern speech technology, and recognition of emotions in speech (RER) is its most in-demand part. In this paper, we propose a computer model of emotion recognition based on an ensemble of a bidirectional recurrent neural network with LSTM memory cells and the deep convolutional neural network ResNet18. Computational experiments are carried out on the RAVDESS database of emotional human speech. RAVDESS is a data set containing 7356 files; the entries cover the following emotions: 0 – neutral, 1 – calm, 2 – happiness, 3 – sadness, 4 – anger, 5 – fear, 6 – disgust, 7 – surprise. In total, the database contains 16 classes (8 emotions split by male and female speakers) for a total of 1440 speech-only samples. To train machine learning algorithms and deep neural networks to recognize emotions, the audio recordings must be pre-processed so as to extract the characteristic features of each emotion. This was done using Mel-frequency cepstral coefficients (MFCCs), chroma coefficients, and characteristics of the frequency spectrum of the recordings. Various neural network models for emotion recognition are studied on this data, and classical machine learning algorithms are used for comparison. The following models were trained in the experiments: logistic regression (LR), a support vector machine classifier (SVM), a decision tree (DT), a random forest (RF), gradient boosting over trees (XGBoost), a convolutional neural network (CNN), a recurrent neural network (RNN), ResNet18, and an ensemble of convolutional and recurrent networks (stacked CNN-RNN). The results show that the neural networks achieved much higher accuracy in recognizing and classifying emotions than the classical machine learning algorithms, and among the neural network models the CNN + BLSTM ensemble showed the highest accuracy.
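The feature-extraction step described above might be sketched with librosa as follows; the file path and the choice to average coefficients over time are assumptions:

```python
# Hypothetical sketch: extracting MFCC and chroma features from one
# audio file with librosa, averaged over time into a fixed-size vector.
import numpy as np
import librosa

path = "ravdess_sample.wav"  # placeholder path to one RAVDESS recording
y, sr = librosa.load(path, sr=None)

mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=40)   # (40, frames)
chroma = librosa.feature.chroma_stft(y=y, sr=sr)     # (12, frames)

# Average each coefficient over time to get one feature vector per file.
features = np.concatenate([mfcc.mean(axis=1), chroma.mean(axis=1)])
print(features.shape)  # (52,)
```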

