Accelerating Convolutional Neural Network Training for Colon Histopathology Images by Customizing Deep Learning Framework

Author(s): Toto Haryanto, Heru Suhartanto, Aniati Murni Arymurthy, Kusmardi Kusmardi
2020 · Vol 69 (1) · pp. 24-34
Author(s): Mohammad K. Al-Sharman, Yahya Zweiri, Mohammad Abdel Kareem Jaradat, Raghad Al-Husari, Dongming Gan, et al.

IEEE Access · 2020 · Vol 8 · pp. 129889-129898
Author(s): Xin Dong, Yizhao Zhou, Lantian Wang, Jingfeng Peng, Yanbo Lou, et al.

2020 · Vol 12 (5) · pp. 1-15
Author(s): Zhenghao Han, Li Li, Weiqi Jin, Xia Wang, Gangcheng Jiao, et al.

Sensors · 2020 · Vol 21 (1) · pp. 174
Author(s): Minkoo Kang, Gyeongsik Yang, Yeonho Yoo, Chuck Yoo

This paper presents "Proactive Congestion Notification" (PCN), a congestion-avoidance technique for distributed deep learning (DDL). DDL is widely used to scale out and accelerate deep neural network training: each worker trains a copy of the model on different training inputs and synchronizes the model gradients at the end of each iteration. It is well known that this synchronization traffic is the main bottleneck in DDL. Our key observation is that the DDL architecture makes every worker generate burst traffic at each iteration boundary, which causes network congestion and in turn degrades the throughput of DDL traffic. Based on this observation, the key idea behind PCN is to prevent congestion by proactively regulating the switch queue length before the DDL burst traffic arrives, so that the switches are prepared to absorb the incoming bursts. In our evaluation, PCN improves the throughput of DDL traffic by 72% on average.
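To make the queue-regulation idea concrete, here is a toy discrete-time simulation contrasting a plain tail-drop switch with one that proactively drains its queue before the gradient-synchronization burst arrives. The queue capacity, drain target, traffic volumes, and the drain-to-threshold policy are all illustrative assumptions, not the mechanism evaluated in the paper.

```python
# Toy sketch of the PCN idea: before each DDL burst arrives, the switch
# shrinks its queue so the burst finds headroom. Illustrative only; all
# constants and the drain policy are assumptions, not the paper's design.
from collections import deque

QUEUE_CAPACITY = 100   # switch buffer size in packets (assumed)
DRAIN_TARGET = 20      # queue length to drain to before a burst (assumed)
LINE_RATE = 10         # packets forwarded per simulation tick (assumed)

class Switch:
    def __init__(self):
        self.queue = deque()
        self.dropped = 0

    def enqueue(self, n_packets, label):
        for _ in range(n_packets):
            if len(self.queue) >= QUEUE_CAPACITY:
                self.dropped += 1          # congestion: tail drop
            else:
                self.queue.append(label)

    def forward(self, ticks=1):
        for _ in range(ticks * LINE_RATE):
            if self.queue:
                self.queue.popleft()

    def pcn_prepare(self):
        # Proactive step: forward queued packets until the backlog is
        # below DRAIN_TARGET, creating headroom for the incoming burst.
        while len(self.queue) > DRAIN_TARGET:
            self.queue.popleft()

def run_iteration(switch, use_pcn):
    switch.enqueue(60, "background")       # steady cross traffic
    if use_pcn:
        switch.pcn_prepare()               # workers notify switch pre-burst
    switch.enqueue(80, "gradient-sync")    # DDL burst at iteration boundary
    switch.forward(ticks=10)

for use_pcn in (False, True):
    sw = Switch()
    for _ in range(5):
        run_iteration(sw, use_pcn)
    print(f"PCN={use_pcn}: dropped {sw.dropped} packets")
```

In the no-PCN run the burst overflows the buffer and packets are tail-dropped every iteration, while the pre-drained queue absorbs the same burst without loss, which is the effect PCN's proactive regulation aims for.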


2021
Author(s): Martin Mundt

Deep learning with neural networks seems to have largely replaced the traditional design of computer vision systems. Automated methods that learn a plethora of parameters are now used in favor of the previously practiced selection of explicit mathematical operators for a specific task. The implied promise is that practitioners no longer need to take care of every individual step and can instead focus on gathering large amounts of data for neural network training. As a consequence, one can observe both a shift in mindset towards big datasets and a wave of conceivable applications based exclusively on deep learning.

This PhD dissertation aims to uncover some of the deep learning aspects that are only implicitly mentioned or overlooked, to highlight unstated assumptions, and to introduce methods that address the resulting weaknesses. In the author's view, these prevalent shortcomings can be tied to the fact that the steps of the machine learning workflow are frequently decoupled: success is predominantly measured with accuracy metrics designed for static benchmark test sets, and individual workflow components are assessed in isolation with respect to available data, choice of neural network architecture, and a particular learning algorithm, rather than viewing the machine learning system as a whole in the context of a particular application. Correspondingly, the dissertation identifies three key challenges:

1. Choice and flexibility of a neural network architecture.
2. Identification and rejection of unseen, unknown data to avoid false predictions.
3. Continual learning without forgetting already learned information.

These challenges were already crucial topics in older literature, yet they seem to require a renaissance in modern deep learning research. Although they may initially appear to pose independent research questions, the thesis posits that they are intertwined and require a joint perspective in machine-learning-based systems. In summary, the essential question is how to pick a suitable neural network architecture for a specific task, how to recognize which data inputs belong to this context and which originate from other tasks, and ultimately how to continuously include such identified novel data in neural network training over time without overwriting existing knowledge. The central emphasis of this dissertation is thus to build on existing deep learning strengths while acknowledging the mentioned weaknesses, in an effort to establish a deeper understanding of their interdependencies and synergies towards unified solution mechanisms. For this purpose, the main portion of the thesis is in cumulative form, and the respective publications can be grouped according to the three challenges outlined above.

Chapter 1 focuses on the choice and extendability of neural network architectures, analyzed in the context of popular image classification tasks. An algorithm to automatically determine neural network layer width is introduced and first contrasted with static architectures found in the literature; a generic sketch of the underlying idea follows below. The importance of neural architecture design is then further showcased on a real-world application: defect detection in concrete bridges.
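The dissertation's specific width-determination algorithm is not reproduced here. As a generic illustration of the underlying idea, the sketch below grows a hidden layer in a function-preserving way, in the spirit of Net2Net-style widening; the layer sizes, growth amount, plateau trigger, and zero-initialization choice are all assumptions made for illustration.

```python
# Generic sketch: grow a hidden layer's width during training without
# changing the network's current function. Illustrates data-driven width
# selection in general; it is NOT the dissertation's specific algorithm.
import torch
import torch.nn as nn

def widen_linear(fc_in: nn.Linear, fc_out: nn.Linear, extra: int):
    """Insert `extra` hidden units between two Linear layers, keeping weights."""
    new_in = nn.Linear(fc_in.in_features, fc_in.out_features + extra)
    new_out = nn.Linear(fc_out.in_features + extra, fc_out.out_features)
    with torch.no_grad():
        new_in.weight[: fc_in.out_features] = fc_in.weight   # copy old rows
        new_in.bias[: fc_in.out_features] = fc_in.bias
        new_out.weight[:, : fc_out.in_features] = fc_out.weight
        new_out.bias.copy_(fc_out.bias)
        # Outgoing weights of the new units start at zero, so the widened
        # network computes exactly the same outputs as before growing.
        new_out.weight[:, fc_out.in_features:].zero_()
    return new_in, new_out

# Hypothetical usage: widen when training loss plateaus (trigger assumed).
fc1, fc2 = nn.Linear(784, 64), nn.Linear(64, 10)
fc1, fc2 = widen_linear(fc1, fc2, extra=16)      # 64 -> 80 hidden units
print(fc1.weight.shape, fc2.weight.shape)        # [80, 784] and [10, 80]
```

Zero-initializing the new outgoing weights keeps predictions unchanged at the moment of growth, so training resumes smoothly; alternative initializations trade that stability for faster utilization of the new capacity.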
Chapter 2 comprises the complementary questions of how to identify unknown concepts and how to subsequently incorporate them into continual learning. A joint central mechanism is proposed that distinguishes unseen concepts from what is known in classification tasks, while enabling consecutive training without forgetting or revisiting older classes. Once more, the role of the chosen neural network architecture is quantitatively reassessed. Finally, chapter 3 culminates in an overarching view in which the developed parts are connected. Here, an extensive survey serves to embed the gained insights in the broader literature landscape and emphasizes the importance of a common frame of thought. The ultimately presented approach reflects the thesis' overall contribution: advancing neural-network-based machine learning towards a unified solution that ties the choice of neural architecture to the ability to learn continually and the capability to automatically separate known from unknown data.
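Since chapters 2 and 3 argue that unknown-data rejection and forgetting-free training belong in one loop, a minimal generic sketch of how those pieces can interact follows. It pairs predictive-entropy thresholding, a common stand-in for open-set recognition rather than the dissertation's actual mechanism, with exemplar rehearsal; the threshold value, the ReplayBuffer helper, and all sizes are assumptions.

```python
# Generic sketch: reject inputs the model does not recognize, bank them as
# novel data, and retrain with a rehearsal buffer so old classes are kept.
# Entropy thresholding and rehearsal stand in for the thesis's mechanisms.
import torch
import torch.nn as nn
import torch.nn.functional as F

ENTROPY_THRESHOLD = 1.0   # assumed cutoff; tuned on held-out data in practice

def split_known_unknown(model, x):
    """Route inputs by predictive entropy: low = known class, high = novel."""
    probs = F.softmax(model(x), dim=1)
    entropy = -(probs * probs.clamp_min(1e-8).log()).sum(dim=1)
    known = entropy < ENTROPY_THRESHOLD
    return x[known], x[~known]

class ReplayBuffer:
    """Tiny exemplar store for rehearsal (hypothetical helper)."""
    def __init__(self):
        self.x, self.y = [], []
    def add(self, x, y):
        self.x.append(x); self.y.append(y)
    def sample(self, n):
        x, y = torch.cat(self.x), torch.cat(self.y)
        idx = torch.randint(len(x), (n,))
        return x[idx], y[idx]

def continual_step(model, optimizer, new_x, new_y, buffer):
    """One update on novel data mixed with old exemplars (rehearsal)."""
    old_x, old_y = buffer.sample(len(new_x))
    x, y = torch.cat([new_x, old_x]), torch.cat([new_y, old_y])
    loss = F.cross_entropy(model(x), y)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()

# Hypothetical usage with a stand-in classifier.
model = nn.Linear(8, 4)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
buf = ReplayBuffer()
buf.add(torch.randn(16, 8), torch.randint(4, (16,)))   # "old task" exemplars
known, novel = split_known_unknown(model, torch.randn(32, 8))
if len(known) > 0:                                     # guard empty batch
    continual_step(model, opt, known, torch.randint(4, (len(known),)), buf)
print(f"{len(known)} accepted, {len(novel)} routed to the novelty pool")
```

The point of the sketch is the coupling: the same confidence signal that protects the classifier from false predictions also decides which data eventually enters continual training, mirroring the joint perspective the thesis advocates.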

