An Hybrid Attention-Based System for the Prediction of Facial Attributes

Author(s):  
Souad Khellat-Kihel ◽  
Zhenan Sun ◽  
Massimo Tistarelli

Abstract: Recent research on face analysis has demonstrated the richness of information embedded in feature vectors extracted from a deep convolutional neural network. Even though deep learning has achieved very high performance on several challenging visual tasks, such as determining identity, age, gender and race, it still lacks a well-grounded theory that explains the processes taking place inside the network layers. Therefore, most of the underlying processes are unknown and not easy to control. On the other hand, the human visual system follows a well-understood process in analyzing a scene or an object, such as a face. The eye gaze is repeatedly directed, through purposively planned saccadic movements, towards salient regions to capture several details. In this paper we propose to capitalize on the knowledge of human saccadic visual processes to design a system that predicts facial attributes by embedding a biologically inspired network architecture, the HMAX. The architecture is tailored to predict attributes with different textural information and conveying different semantic meanings, such as attributes related and unrelated to the subject’s identity. Salient points on the face are extracted from the outputs of the S2 layer of the HMAX architecture and fed to a local texture characterization module based on LBP (Local Binary Pattern). The resulting feature vector is used to perform a binary classification on a set of pre-defined visual attributes. The devised system distills a very informative, yet robust, representation of the imaged faces, achieving high performance with a much simpler architecture than a deep convolutional neural network. Several experiments performed on large, challenging, publicly available datasets demonstrate the validity of the proposed approach.
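The local texture step described above can be illustrated with a minimal sketch of a basic 8-neighbour LBP descriptor computed around a list of salient points. This is not the authors' implementation: the abstract does not state which LBP variant, patch size, or histogram layout is used, the HMAX S2 stage is not modelled, and the names `lbp_code`, `lbp_histogram`, `describe`, and the `radius` parameter are hypothetical.

```python
def lbp_code(img, r, c):
    """Basic 8-neighbour LBP code for pixel (r, c) of a 2-D list of grey levels."""
    center = img[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbour at least as bright as the centre sets its bit.
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img, point, radius):
    """256-bin histogram of LBP codes in a square patch around one point.

    Border pixels are skipped because they lack a full 3x3 neighbourhood.
    """
    hist = [0] * 256
    pr, pc = point
    rows, cols = len(img), len(img[0])
    for r in range(max(1, pr - radius), min(rows - 1, pr + radius + 1)):
        for c in range(max(1, pc - radius), min(cols - 1, pc + radius + 1)):
            hist[lbp_code(img, r, c)] += 1
    return hist


def describe(img, salient_points, radius=1):
    """Concatenate the per-point histograms into one feature vector.

    `salient_points` is assumed to come from an upstream saliency stage
    (the S2 layer of HMAX in the abstract); here it is just a list of (row, col).
    """
    feature = []
    for point in salient_points:
        feature.extend(lbp_histogram(img, point, radius))
    return feature


if __name__ == "__main__":
    toy = [[1, 2, 3],
           [4, 5, 6],
           [7, 8, 9]]
    print(lbp_code(toy, 1, 1))
    print(len(describe(toy, [(1, 1)])))
```

The concatenated vector would then be passed to one binary classifier per attribute; the choice of classifier is left open here because the abstract does not specify it.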

2018 ◽  
Vol 38 (6) ◽  
Author(s):  
Binbin Wang ◽  
Li Xiao ◽  
Yang Liu ◽  
Jing Wang ◽  
Beihong Liu ◽  
...  

There is a disparity between the increasing application of digital retinal imaging to neonatal ocular screening and the slowly growing number of pediatric ophthalmologists. Assistive tools that can automatically detect ocular disorders may be needed. In the present study, we develop a deep convolutional neural network (DCNN) for automated classification and grading of retinal hemorrhage. We used 48,996 digital fundus images from 3770 newborns with retinal hemorrhage of different severity (grades 1, 2 and 3) and normal controls from a large cross-sectional investigation in China. The DCNN was trained for automated grading of retinal hemorrhage (a multiclass classification problem: hemorrhage-free and grades 1, 2 and 3) and then validated for its performance level. The DCNN yielded an accuracy of 97.85 to 99.96%, and the area under the receiver operating characteristic curve was 0.989–1.000, in the binary classification of neonatal retinal hemorrhage (i.e., one class vs. the others). The overall accuracy on the multiclass classification problem was 97.44%. This is the first study to show that a DCNN can detect and grade neonatal retinal hemorrhage at high performance levels. Artificial intelligence will play an increasingly positive role in the ocular healthcare of newborns and children.
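The relationship between the one-vs-rest figures and the overall multiclass accuracy can be made concrete with a small sketch. The labels below are made-up toy data, not the study's results, and the function names are hypothetical. The sketch shows why every one-vs-rest accuracy is at least as high as the multiclass accuracy: a correct multiclass prediction is also correct in every binary view, while a confusion between two grades still counts as correct for the remaining classes.

```python
def multiclass_accuracy(y_true, y_pred):
    """Fraction of samples whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def one_vs_rest_accuracy(y_true, y_pred, positive):
    """Accuracy after collapsing the labels to `positive` vs. everything else."""
    hits = sum((t == positive) == (p == positive)
               for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


if __name__ == "__main__":
    # Toy labels: 0 = hemorrhage-free, 1-3 = grades (illustrative only).
    y_true = [0, 1, 2, 3, 1, 2]
    y_pred = [0, 1, 2, 3, 2, 2]  # one grade-1 image mistaken for grade 2
    print("multiclass:", multiclass_accuracy(y_true, y_pred))
    for cls in range(4):
        print("class", cls, "vs rest:", one_vs_rest_accuracy(y_true, y_pred, cls))
```

In this toy case the single grade 1 vs. grade 2 confusion lowers the multiclass accuracy and the one-vs-rest accuracies of classes 1 and 2, but leaves classes 0 and 3 at 1.0.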


2021 ◽  
Vol 11 (15) ◽  
pp. 6845
Author(s):  
Abu Sayeed ◽  
Jungpil Shin ◽  
Md. Al Mehedi Hasan ◽  
Azmain Yakin Srizon ◽  
Md. Mehedi Hasan

Because Bengali is the seventh most-spoken language and the fifth most-spoken native language in the world, Bengali handwritten character recognition has fascinated researchers for decades. Although other popular languages, such as English, Chinese, Hindi, and Spanish, have received many contributions in the area of handwritten character recognition, Bengali has received few noteworthy contributions in this domain because of the complex curvatures and similar writing styles of Bengali characters. Previous studies used different approaches based on traditional learning and deep learning. In this research, we propose a novel low-cost convolutional neural network architecture for the recognition of Bengali characters with only 2.24 to 2.43 million parameters, depending on the number of output classes. We considered 8 different formations of the CMATERdb datasets, based on previous studies, for the training phase. Through experimental analysis, we show that our proposed system outperforms previous works by a noteworthy margin on all 8 datasets. Moreover, we tested our trained models on other available Bengali character datasets, namely Ekush, BanglaLekha, and NumtaDB. Our proposed architecture achieved 96–99% overall accuracy on these datasets as well. We believe our contributions will be beneficial for developing an automated high-performance recognition tool for Bengali handwritten characters.


2022 ◽  
Vol 10 (1) ◽  
pp. 0-0

A brain tumor is a severe disease caused by the uncontrolled, abnormal division of cells. Timely detection and treatment planning increase the life expectancy of patients. Detecting and classifying brain tumors is a challenging process that relies on the clinician’s knowledge and experience. For this reason, deep learning is one of the most practical and important techniques. Recent progress in deep learning has helped clinicians use medical imaging for the diagnosis of brain tumors. In this paper, we present a comparison of deep convolutional neural network models for the automatic binary classification of MRI images, with the goal of bringing precision tools to health professionals, based on fine-tuned recent versions of DenseNet, Xception, NASNet-A, and VGGNet. The experiments were conducted using an open MRI dataset of 3,762 images. Other performance measures used in the study are the area under precision, recall, and specificity.


Author(s):  
Kun Xu ◽  
Shunming Li ◽  
Jinrui Wang ◽  
Zenghui An ◽  
Yu Xin

Deep learning methods are increasingly applied to fault diagnosis of mechanical equipment because they can automatically learn complex and useful features from vibration signals. Among the many intelligent diagnostic models, the convolutional neural network has been increasingly applied to intelligent fault diagnosis of bearings because of its local connectivity and weight sharing. However, some drawbacks remain. (1) The training process of a convolutional neural network is slow and unstable, and it has many training parameters. (2) It does not perform well under varying working conditions, such as noisy environments and different workloads. In this paper, a novel model named adaptive and fast convolutional neural network with wide receptive field is presented to overcome these deficiencies. The main innovations are as follows. First, a deep convolutional neural network architecture is constructed using the scaled exponential linear unit activation function and global average pooling. The model has fewer training parameters and converges rapidly and stably. Second, the model has a wide receptive field, with two medium-length and three small-length convolutional kernels. Compared with other models, it also has high diagnostic accuracy and robustness when the environment is noisy and workloads change. Furthermore, to demonstrate how the wide receptive field convolutional neural network model works, the reasons for its high performance are analyzed and the learned features are visualized. Finally, the model is verified on a vibration dataset collected against a background of high noise, and the results indicate that it has high diagnostic performance.
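The two building blocks named in the abstract, the scaled exponential linear unit (SELU) and global average pooling, can be sketched in a few lines. This is a minimal illustration, not the authors' model: the layer sizes in the usage example (64 channels, 100 time steps, 10 classes) are assumptions chosen only to show how pooling each channel to a single value shrinks the final classification layer, and `dense_params` is a hypothetical helper.

```python
import math

# Standard SELU constants (Klambauer et al., self-normalizing networks).
SELU_ALPHA = 1.6732632423543772
SELU_SCALE = 1.0507009873554805


def selu(x):
    """Scaled exponential linear unit applied to a single activation."""
    return SELU_SCALE * (x if x > 0 else SELU_ALPHA * (math.exp(x) - 1.0))


def global_average_pool(feature_maps):
    """Reduce each channel (a list of activations over time) to its mean."""
    return [sum(channel) / len(channel) for channel in feature_maps]


def dense_params(n_inputs, n_classes):
    """Weights plus biases of one fully connected layer."""
    return (n_inputs + 1) * n_classes


if __name__ == "__main__":
    maps = [[selu(v) for v in channel]
            for channel in [[-1.0, 0.5, 2.0], [0.0, 1.0, 3.0]]]
    print(global_average_pool(maps))

    # Assumed sizes for illustration: 64 channels, 100 time steps, 10 classes.
    channels, length, classes = 64, 100, 10
    print("flatten + dense:", dense_params(channels * length, classes))
    print("GAP + dense:    ", dense_params(channels, classes))
```

Under these assumed sizes the classifier after flattening needs 64,010 parameters, whereas the classifier after global average pooling needs 650, which is one common reason such architectures have fewer trainable parameters.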


2020 ◽  
Vol 7 ◽  
Author(s):  
Hayden Gunraj ◽  
Linda Wang ◽  
Alexander Wong

The coronavirus disease 2019 (COVID-19) pandemic continues to have a tremendous impact on patients and healthcare systems around the world. In the fight against this novel disease, there is a pressing need for rapid and effective screening tools to identify patients infected with COVID-19. To this end, CT imaging has been proposed as one of the key screening methods that may be used as a complement to RT-PCR testing, particularly in situations where patients undergo routine CT scans for non-COVID-19 related reasons, patients have worsening respiratory status or developing complications that require expedited care, or patients are suspected to be COVID-19-positive but have negative RT-PCR test results. Early studies on CT-based screening have reported abnormalities in chest CT images which are characteristic of COVID-19 infection, but these abnormalities may be difficult to distinguish from abnormalities caused by other lung conditions. Motivated by this, in this study we introduce COVIDNet-CT, a deep convolutional neural network architecture that is tailored for detection of COVID-19 cases from chest CT images via a machine-driven design exploration approach. Additionally, we introduce COVIDx-CT, a benchmark CT image dataset derived from CT imaging data collected by the China National Center for Bioinformation comprising 104,009 images across 1,489 patient cases. Furthermore, in the interest of reliability and transparency, we leverage an explainability-driven performance validation strategy to investigate the decision-making behavior of COVIDNet-CT, and in doing so ensure that COVIDNet-CT makes predictions based on relevant indicators in CT images. Both COVIDNet-CT and the COVIDx-CT dataset are available to the general public in an open-source and open access manner as part of the COVID-Net initiative.
While COVIDNet-CT is not yet a production-ready screening solution, we hope that releasing the model and dataset will encourage researchers, clinicians, and citizen data scientists alike to leverage and build upon them.


2019 ◽  
Author(s):  
Nicholas K. DeWind

Summary: Humans and many non-human animals have the “number sense,” an ability to estimate the number of items in a set without counting. This innate sense of number is hypothesized to provide a foundation for more complex numerical and mathematical concepts. Here I investigated whether we also share the number sense with a deep convolutional neural network (DCNN) trained for object recognition. These in silico networks have revolutionized machine learning over the last seven years, allowing computers to reach human-level performance on object recognition tasks for the first time. Their architecture is based on the structure of mammalian visual cortex, and after they are trained, they provide a highly predictive model of responses in primate visual cortex, suggesting deep homologies. I found that the DCNN demonstrates three key hallmarks of the number sense: numerosity-selective units (analogous to biological neurons), the behavioral ratio effect, and ordinality over representational space. Because the DCNN was not trained to enumerate, I conclude that the number sense is an emergent property of the network, the result of some combination of the network architecture and the constraint to develop the complex representational structure necessary for object recognition. By analogy I conclude that the number sense in animals was not necessarily the result of direct selective pressure to enumerate but might have “come for free” with the evolution of a complex visual system that evolved to identify objects and scenes in the real world.

