Multidimensional face representation in deep convolutional neural network reveals the mechanism underlying AI racism

2020 ◽  
Author(s):  
Jinhua Tian ◽  
Hailun Xie ◽  
Siyuan Hu ◽  
Jia Liu

Abstract: The increasingly popular application of AI runs the risk of amplifying social bias, such as classifying non-white faces as animals. Recent research has attributed the bias largely to the training data. However, the underlying mechanism is little known, and therefore strategies to rectify the bias are unresolved. Here we examined a typical deep convolutional neural network (DCNN), VGG-Face, which was trained with a face dataset consisting of more white faces than black and Asian faces. The transfer learning result showed significantly better performance in identifying white faces, just like the well-known social bias in humans, the other-race effect (ORE). To test whether the effect resulted from the imbalance of face images, we retrained the VGG-Face with a dataset containing more Asian faces, and found a reverse ORE: the newly trained VGG-Face identified Asian faces more accurately than white faces. In addition, when the numbers of Asian faces and white faces were matched in the dataset, the DCNN did not show any bias. To further examine how imbalanced image input led to the ORE, we performed a representational similarity analysis on VGG-Face’s activation. We found that when the dataset contained more white faces, the representation of white faces was more distinct, indexed by smaller in-group similarity and larger representational Euclidean distance. That is, white faces were scattered more sparsely in the representational face space of the VGG-Face than the other faces. Importantly, the distinctiveness of faces was positively correlated with the identification accuracy, which explained the ORE observed in the VGG-Face. In sum, our study revealed the mechanism underlying the ORE in DCNNs, which provides a novel approach to studying AI ethics. In addition, the face multidimensional representation theory discovered in humans was found to be applicable to DCNNs as well, advocating future studies that apply more cognitive theories to understand DCNNs’ behavior.

2021 ◽  
Vol 15 ◽  
Author(s):  
Jinhua Tian ◽  
Hailun Xie ◽  
Siyuan Hu ◽  
Jia Liu

The increasingly popular application of AI runs the risk of amplifying social bias, such as classifying non-white faces as animals. Recent research has largely attributed this bias to the training data. However, the underlying mechanism is poorly understood; therefore, strategies to rectify the bias are unresolved. Here, we examined a typical deep convolutional neural network (DCNN), VGG-Face, which was trained with a face dataset consisting of more white faces than black and Asian faces. The transfer learning result showed significantly better performance in identifying white faces, similar to the well-known social bias in humans, the other-race effect (ORE). To test whether the effect resulted from the imbalance of face images, we retrained the VGG-Face with a dataset containing more Asian faces, and found a reverse ORE: the newly trained VGG-Face identified Asian faces more accurately than white faces. Additionally, when the numbers of Asian faces and white faces were matched in the dataset, the DCNN did not show any bias. To further examine how imbalanced image input led to the ORE, we performed a representational similarity analysis on VGG-Face's activation. We found that when the dataset contained more white faces, the representation of white faces was more distinct, indexed by smaller in-group similarity and larger representational Euclidean distance. That is, white faces were scattered more sparsely in the representational face space of the VGG-Face than the other faces. Importantly, the distinctiveness of faces was positively correlated with identification accuracy, which explained the ORE observed in the VGG-Face. In summary, our study revealed the mechanism underlying the ORE in DCNNs, which provides a novel approach to studying AI ethics. In addition, the face multidimensional representation theory discovered in humans was also applicable to DCNNs, advocating for future studies to apply more cognitive theories to understand DCNNs' behavior.
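
The two distinctiveness indices named in the abstract, in-group similarity and representational Euclidean distance, can be illustrated with a minimal sketch. It assumes each face is already represented as an activation vector, and it assumes Pearson correlation as the similarity measure, which the abstract does not specify. The function names and the toy vectors are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def pearson(u, v):
    """Pearson correlation between two equal-length, non-constant vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = math.sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = math.sqrt(sum((b - mean_v) ** 2 for b in v))
    return cov / (norm_u * norm_v)


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def group_distinctiveness(activations):
    """Return (mean in-group similarity, mean pairwise distance) for one group.

    Lower similarity and larger distance mean the group's faces are
    scattered more sparsely in the representational space.
    """
    pairs = list(combinations(activations, 2))
    mean_similarity = sum(pearson(u, v) for u, v in pairs) / len(pairs)
    mean_distance = sum(euclidean(u, v) for u, v in pairs) / len(pairs)
    return mean_similarity, mean_distance


# Hypothetical activation vectors, invented purely for illustration.
scattered_group = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0]]
clustered_group = [[1.0, 0.9, 0.1, 0.0], [0.9, 1.0, 0.0, 0.1], [1.0, 1.0, 0.1, 0.1]]

scattered_sim, scattered_dist = group_distinctiveness(scattered_group)
clustered_sim, clustered_dist = group_distinctiveness(clustered_group)
print(f"scattered: similarity={scattered_sim:.3f}, distance={scattered_dist:.3f}")
print(f"clustered: similarity={clustered_sim:.3f}, distance={clustered_dist:.3f}")
```

In this toy setup the scattered group plays the role of the over-represented race: its faces are less similar to one another and lie farther apart, which is the pattern the abstract links to higher identification accuracy.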


Author(s):  
Zhixian Chen ◽  
Jialin Tang ◽  
Xueyuan Gong ◽  
Qinglang Su

To improve the low accuracy of face recognition methods in e-health settings, this paper proposes a novel face recognition approach based on a convolutional neural network (CNN). Specifically, by resolving the convolutional kernel and applying the rectified linear unit (ReLU) activation function, dropout, and batch normalization, the approach reduces the number of parameters of the CNN model, improves its non-linearity, and alleviates overfitting. In these ways, the accuracy of face recognition is increased. In the experiments, the proposed approach is compared with principal component analysis (PCA) and support vector machine (SVM) on the ORL, Cohn-Kanade, and extended Yale-B face recognition data sets, and the results indicate that the approach is promising.


2020 ◽  
Author(s):  
Shan Xu ◽  
Yiyuan Zhang ◽  
Zonglei Zhen ◽  
Jia Liu

Abstract: Can we recognize faces with zero experience of faces? This question is critical because it examines the role of experience in the formation of domain-specific modules in the brain. Investigations of this issue with humans and non-human animals cannot easily dissociate the effect of visual experience from that of hardwired domain-specificity. Therefore, the present study built a model of selective deprivation of experience with faces using a representative deep convolutional neural network, AlexNet, by removing all images containing faces from its training stimuli. This model did not show significant deficits in face categorization and discrimination, and face-selective modules automatically emerged. However, the deprivation reduced the domain-specificity of the face module. In sum, our study provides undisputable evidence on the role of nature versus nurture in developing domain-specific modules: domain-specificity may evolve from non-specific experience without genetic predisposition, and is further fine-tuned by domain-specific experience.


2019 ◽  
Vol 11 (15) ◽  
pp. 1774 ◽  
Author(s):  
Yaning Yi ◽  
Zhijie Zhang ◽  
Wanchang Zhang ◽  
Chuanrong Zhang ◽  
Weidong Li ◽  
...  

Urban building segmentation is a prevalent research domain for very high resolution (VHR) remote sensing; however, the varied appearance and complicated backgrounds of VHR remote sensing imagery make accurate semantic segmentation of urban buildings a challenge in relevant applications. Following the basic architecture of U-Net, an end-to-end deep convolutional neural network (denoted as DeepResUnet) was proposed, which can effectively perform urban building segmentation at pixel scale from VHR imagery and generate accurate segmentation results. The method contains two sub-networks: one is a cascade down-sampling network for extracting feature maps of buildings from the VHR image, and the other is an up-sampling network for reconstructing those extracted feature maps back to the same size as the input VHR image. The deep residual learning approach was adopted to facilitate training and to alleviate the degradation problem that often occurs in the model training process. The proposed DeepResUnet was tested with aerial images with a spatial resolution of 0.075 m and was compared in performance under the exact same conditions with six other state-of-the-art networks: FCN-8s, SegNet, DeconvNet, U-Net, ResUNet and DeepUNet. Results of extensive experiments indicated that the proposed DeepResUnet outperformed the other six existing networks in semantic segmentation of urban buildings in terms of visual and quantitative evaluation, especially in labeling irregular-shape and small-size buildings with higher accuracy and entirety. Compared with the U-Net, the F1 score, Kappa coefficient and overall accuracy of DeepResUnet were improved by 3.52%, 4.67% and 1.72%, respectively. Moreover, the proposed DeepResUnet required far fewer parameters than the U-Net, highlighting its significant improvement among U-Net applications. Nevertheless, the inference time of DeepResUnet is slightly longer than that of the U-Net, which is subject to further improvement.
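
For readers unfamiliar with the three reported metrics, the sketch below computes the F1 score, Cohen's Kappa coefficient, and overall accuracy from a binary (building versus background) pixel confusion matrix using their standard definitions. The function name and the pixel counts are hypothetical; the abstract does not describe how the authors implemented their evaluation.

```python
def segmentation_metrics(tp, fp, fn, tn):
    """Compute F1, Cohen's Kappa and overall accuracy from pixel counts.

    tp: building pixels labeled as building
    fp: background pixels labeled as building
    fn: building pixels labeled as background
    tn: background pixels labeled as background
    """
    total = tp + fp + fn + tn
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    overall_accuracy = (tp + tn) / total
    # Agreement expected by chance, from the row and column marginals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / total ** 2
    kappa = (overall_accuracy - expected) / (1 - expected)
    return {"f1": f1, "kappa": kappa, "overall_accuracy": overall_accuracy}


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = segmentation_metrics(tp=40, fp=10, fn=10, tn=40)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Kappa discounts the agreement that would occur by chance, so on imagery dominated by background pixels it can be a stricter measure than overall accuracy.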


2020 ◽  
Vol 8 (4) ◽  
pp. 78-95
Author(s):  
Neeru Jindal ◽  
Harpreet Kaur

Doctored video generation with easily accessible editing software has proven to be a major problem for maintaining video authenticity. This article focuses on a highly efficient method for exposing inter-frame tampering in videos by means of a deep convolutional neural network (DCNN). The proposed algorithm detects forgery without requiring additional pre-embedded information about the frame. Another difference from pre-existing learning techniques is that the algorithm classifies the forged frames on the basis of the correlation between the frames and the observed abnormalities using the DCNN. The decoders used for batch normalization of input improve the training speed. Simulation results obtained on the REWIND and GRIP video datasets, with an average accuracy of 98%, show the superiority of the proposed algorithm over the existing one. The proposed algorithm is capable of detecting forged content in YouTube compressed video with an accuracy reaching up to 100% for the GRIP dataset and 98.99% for the REWIND dataset.
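
The idea of using inter-frame correlation as a tampering cue can be illustrated with a much simpler stand-in than the DCNN classifier described above. The sketch below flags transitions where the Pearson correlation between consecutive frames drops below a fixed threshold. It is not the authors' method: the function names, the threshold value, and the toy frames are all hypothetical, and frames are assumed to be flattened lists of pixel intensities that are not constant.

```python
import math


def frame_correlation(frame_a, frame_b):
    """Pearson correlation between two flattened, non-constant frames."""
    n = len(frame_a)
    mean_a, mean_b = sum(frame_a) / n, sum(frame_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(frame_a, frame_b))
    norm_a = math.sqrt(sum((a - mean_a) ** 2 for a in frame_a))
    norm_b = math.sqrt(sum((b - mean_b) ** 2 for b in frame_b))
    return cov / (norm_a * norm_b)


def flag_suspect_transitions(frames, threshold=0.5):
    """Return indices i where frame i correlates poorly with frame i - 1.

    The threshold is an arbitrary illustrative value, not a tuned parameter.
    """
    return [
        i
        for i in range(1, len(frames))
        if frame_correlation(frames[i - 1], frames[i]) < threshold
    ]


# Hypothetical 4-pixel frames: smooth motion, then an abrupt content change
# at index 3, as might follow frame insertion or deletion.
frames = [
    [10, 20, 30, 40],
    [11, 21, 31, 41],
    [12, 22, 32, 42],
    [40, 10, 35, 5],
    [41, 11, 36, 6],
]
suspects = flag_suspect_transitions(frames)
print("suspect transitions at frame indices:", suspects)
```

A fixed threshold like this would also fire on legitimate scene cuts, which is one reason a learned classifier over correlation patterns, as in the article, is the more plausible design for real footage.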


2020 ◽  
Vol 12 (3) ◽  
pp. 77-95
Author(s):  
Amar B. Deshmukh ◽  
N. Usha Rani

One of the major challenges in video surveillance is recognizing or identifying persons in low-resolution videos. Image enhancement methods play a significant role in enhancing the resolution of the video. This article introduces a technique for face super resolution based on a deep convolutional neural network (Deep CNN). First, the video frames are extracted from the input video and face detection is performed using the Viola-Jones algorithm. The detected face image and the scaling factors are fed separately into the Fractional-Grey Wolf Optimizer (FGWO)-based kernel weighted regression model and the proposed Deep CNN. Finally, the results obtained from the two techniques are integrated using a fuzzy logic system, yielding a face image with enhanced resolution. Experimentation is carried out using the UCSD face video dataset. The effectiveness of the proposed Deep CNN is checked for different block sizes and upscaling factor values, and it is evaluated to be the best when compared to other existing techniques, with an improved SDME value of 80.888.

