A Technique for Approximate Communication in Network-on-Chips for Image Classification

10.36227/techrxiv.16438506 ◽

2021 ◽

Author(s):

Yuechen Chen ◽

Shanshan Liu ◽

Fabrizio Lombardi ◽

Ahmed Louri

Keyword(s):

Quality Control ◽

Power Consumption ◽

Image Classification ◽

Classification Accuracy ◽

Packet Size ◽

Data Approximation ◽

Communication Technique ◽

Network Latency ◽

On Chip ◽

Approximate Communication

Approximation is an effective technique for reducing the power consumption and latency of on-chip communication in many computing applications. However, existing approximation techniques either achieve only modest improvements in these metrics or require retraining after approximation, such as when convolutional neural networks (CNNs) are employed. Since classifying many images introduces intensive on-chip communication, reductions in both network latency and power consumption are highly desirable. In this paper, we propose an approximate communication technique (ACT) to improve the efficiency of on-chip communication for image classification applications. The proposed technique exploits the error tolerance of the image classification process to reduce the power consumption and latency of on-chip communication, resulting in better overall performance for image classification computation. This is achieved by incorporating novel quality control and data approximation mechanisms that reduce the packet size. In particular, the proposed quality control mechanisms identify the error-resilient variables and automatically adjust the error thresholds of those variables based on the image classification accuracy. The proposed data approximation mechanisms significantly reduce the packet size when the variables are transmitted. The proposed technique reduces the number of flits in each data packet, and with it the overall on-chip communication, while maintaining excellent image classification accuracy. Cycle-accurate simulation results show that ACT achieves a 23% reduction in network latency and a 24% reduction in dynamic power compared to the existing approximate communication technique, with less than 0.99% classification accuracy loss.
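
The abstract does not give the mechanisms in detail, so the following is only a minimal sketch of the general idea: trading a bounded relative error for fewer bits per value, and hence fewer flits per packet. The mantissa-truncation scheme, the 64-bit flit size, the single header flit, and the threshold-halving rule are all illustrative assumptions, not the authors' design, and every function name is hypothetical.

```python
import math
import struct

MANTISSA_BITS = 23  # IEEE 754 single-precision mantissa width


def droppable_bits(rel_error_threshold):
    """Number of low mantissa bits that can be zeroed within the threshold.

    For normalized float32 values, zeroing k low mantissa bits gives a
    relative error below 2**(k - 23), so we pick the largest such k.
    """
    if rel_error_threshold <= 0:
        return 0
    k = math.floor(MANTISSA_BITS + math.log2(rel_error_threshold))
    return max(0, min(MANTISSA_BITS, k))


def truncate_float32(value, k):
    """Zero the k least significant mantissa bits of a float32 value."""
    bits = struct.unpack("<I", struct.pack("<f", value))[0]
    bits &= 0xFFFFFFFF ^ ((1 << k) - 1)
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def flits_needed(num_values, bits_per_value, flit_bits=64, header_flits=1):
    """Flits per packet, assuming a hypothetical 64-bit flit and one header flit."""
    return header_flits + math.ceil(num_values * bits_per_value / flit_bits)


def adjust_threshold(threshold, accuracy_loss, max_loss):
    """Hypothetical feedback rule: tighten the threshold if accuracy drops too far."""
    return threshold / 2 if accuracy_loss > max_loss else threshold
```

Under these assumptions, a 1% relative error threshold allows 16 mantissa bits to be dropped, so a packet carrying 16 values shrinks from 9 flits to 5.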

On Performance Optimization and Quality Control for Approximate-communication-enabled Networks-on-Chip

IEEE Transactions on Computers ◽

10.1109/tc.2020.3027182 ◽

2020 ◽

pp. 1-1

Author(s):

Siyuan Xiao ◽

Xiaohang Wang ◽

Maurizio Palesi ◽

Amit Singh ◽

Liang Wang ◽

...

Keyword(s):

Quality Control ◽

Performance Optimization ◽

Networks On Chip ◽

On Chip ◽

Approximate Communication

Area-efficient programmable arbiter for inter-layer communications in 3-D network-on-chip

Open Computer Science ◽

10.2478/s13537-012-0006-8 ◽

2012 ◽

Vol 2 (1) ◽

Cited By ~ 1

Author(s):

Mohammad Khan ◽

Abdul Ansari

Keyword(s):

Power Consumption ◽

Clock Cycle ◽

Network On Chip ◽

Fixed Number ◽

Clock Frequency ◽

Mesh Topology ◽

Communication Technique ◽

Ip Cores ◽

On Chip ◽

Area Efficient

The Network-on-Chip (NoC) is an emerging communication technique for System-on-Chip (SoC) communications. The NoC uses multiple processors, usually targeted at embedded and other applications [3, 13]. The performance of a bus is degraded by the increasing number of processing elements and by the transaction-oriented model [13]. This has attracted much attention to applying wireless network protocols such as CDMA, TDMA, and dTDMA in SoCs. TDMA systems use a fixed number of timeslots, so the protocol wastes bandwidth when some timeslots are allocated but not used. The dynamic TDMA (dTDMA) bus arbiter dynamically grows and shrinks the number of timeslots to match the number of active transmitters [14]. In this paper, we present the design of an area-efficient switch for inter-layer communications in a 3-D NoC. The arbitration logic in the switch is based on a programmable priority encoder. A 640-bit message with a uniform random destination data pattern was injected per IP per machine clock cycle. We have obtained a maximum clock frequency of 2.09 GHz for 96 (4 × 8 × 3) IP cores connected in a mesh topology. The presented architecture demonstrates its superior functionality in terms of speed, latency, area, and power consumption compared with the existing implementation [14]. The maximum power consumption of the proposed area-efficient programmable arbiter is 0.625 mW. The design is synthesized using 180 nm TSMC technology.
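
As a rough behavioural illustration of the two ideas mentioned in this abstract, the sketch below models a programmable priority encoder (the highest-priority position is set by a pointer rather than fixed at index 0) and a dTDMA-style slot table that holds one timeslot per active transmitter. It is a software model under assumed semantics, not the authors' hardware design; the function names are hypothetical.

```python
def programmable_priority_encode(requests, start):
    """Grant the first asserted request at or after `start`, wrapping around.

    `requests` is a list of booleans, one per requester; `start` is the
    programmable index that currently has the highest priority.
    Returns the granted index, or None if nothing is requesting.
    """
    n = len(requests)
    for offset in range(n):
        idx = (start + offset) % n
        if requests[idx]:
            return idx
    return None


def dtdma_slot_table(active):
    """One timeslot per active transmitter, so no slot is allocated unused."""
    return [idx for idx, is_active in enumerate(active) if is_active]
```

For example, with requests from ports 0 and 2 and the priority pointer at 1, port 2 is granted; moving the pointer to 3 grants port 0 instead.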

Efficient Instruction and Data Caching for High Performance Embedded Processors

Jornada de Jóvenes Investigadores del I3A ◽

10.26754/jji-i3a.201201788 ◽

1970 ◽

pp. 9

Author(s):

A. Ferrerón Labari ◽

D. Suárez Gracia ◽

V. Viñals Yúfera

Keyword(s):

Embedded Systems ◽

Power Consumption ◽

Low Power ◽

Interconnection Networks ◽

High Performance ◽

Critical Issue ◽

Content Management ◽

Structure Design ◽

Portable Devices ◽

On Chip

In recent years, embedded systems have evolved to offer capabilities that could previously be found only in high-performance systems. Portable devices already have on-chip multiprocessors (such as the PowerPC 476FP or ARM Cortex A9 MP), usually multi-threaded, and a powerful on-chip multi-level cache memory hierarchy. As most of these systems are battery-powered, power consumption becomes a critical issue. Achieving high performance and low power consumption is a highly complex challenge for which some proposals have already been made. Suarez et al. proposed a new on-chip cache hierarchy, the LP-NUCA (Low Power NUCA), which reduces access latency by taking advantage of NUCA (Non-Uniform Cache Architecture) properties. The key points are decoupling the functionality and utilizing three specialized on-chip networks. This structure has proven efficient for data hierarchies, achieving good performance and reducing energy consumption. On the other hand, instruction caches have different requirements and characteristics than data caches, which conflict with low-power embedded system requirements, especially in SMT (simultaneous multi-threading) environments. We want to study the benefits of utilizing small tiled caches for the instruction hierarchy, so we propose a new design, ID-LP-NUCAs. Thus, we need to completely re-evaluate our previous design in terms of structure design, interconnection networks (including topologies, flow control, and routing), content management (with special interest in hardware/software content allocation policies), and structure sharing. In CMP (chip multiprocessor) environments with parallel workloads, coherence plays an important role and must be taken into consideration.

Ultracompact and low-power-consumption silicon thermo-optic switch for high-speed data

Nanophotonics ◽

10.1515/nanoph-2020-0496 ◽

2020 ◽

Vol 10 (2) ◽

pp. 937-945

Author(s):

Ruihuan Zhang ◽

Yu He ◽

Yong Zhang ◽

Shaohua An ◽

Qingming Zhu ◽

...

Keyword(s):

Power Consumption ◽

Low Power ◽

High Speed ◽

High Performance ◽

Pulse Amplitude ◽

Telecommunication Networks ◽

Low Power Consumption ◽

Power Efficient ◽

High Speed Data ◽

On Chip

Ultracompact and low-power-consumption optical switches are desired for high-performance telecommunication networks and data centers. Here, we demonstrate an on-chip power-efficient 2 × 2 thermo-optic switch unit by using a suspended photonic crystal nanobeam structure. A submilliwatt switching power of 0.15 mW is obtained with a tuning efficiency of 7.71 nm/mW in a compact footprint of 60 μm × 16 μm. The bandwidth of the switch is properly designed for a four-level pulse amplitude modulation signal with a 124 Gb/s raw data rate. To the best of our knowledge, the proposed switch is the most power-efficient resonator-based thermo-optic switch unit, with the highest tuning efficiency and data rate ever reported.

Exploring a New Adaptive Routing Based on the Dijkstra Algorithm in Optical Networks-on-Chip

Micromachines ◽

10.3390/mi12010054 ◽

2021 ◽

Vol 12 (1) ◽

pp. 54

Author(s):

Yan-Li Zheng ◽

Ting-Ting Song ◽

Jun-Xiong Chai ◽

Xiao-Ping Yang ◽

Meng-Meng Yu ◽

...

Keyword(s):

Power Consumption ◽

Power Control ◽

Optical Networks ◽

Output Power ◽

Network Performance ◽

Transmission Loss ◽

Adaptive Routing ◽

Dijkstra Algorithm ◽

Networks On Chip ◽

On Chip

The photoelectric hybrid network has been proposed to achieve ultrahigh bandwidth, lower delay, and lower power consumption for chip multiprocessor (CMP) systems. However, the large number of optical elements used in optical networks-on-chip (ONoCs) generates high transmission loss, which severely degrades network performance and increases power consumption. In this paper, the Dijkstra algorithm is adopted to realize adaptive routing with minimum link transmission loss and to reduce the output power of the link transmitter in mesh-based ONoCs. The numerical simulation results demonstrate that the transmission loss of a link in optimized power control based on the Dijkstra algorithm could be maximally reduced compared with traditional power control based on the dimensional routing algorithm. Additionally, it has a greater advantage in saving the average output power of the optical transmitter compared to the adaptive power control in previous studies as the network size expands. Network performance simulations with the OPNET simulation software revealed that the end-to-end (ETE) latency and throughput of the optimized network are not significantly affected compared with a traditional network. Hence, the optimized power control proposed in this paper can greatly reduce the power consumption of a network without having a big impact on network performance.
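
To make the routing idea concrete, here is a minimal sketch of Dijkstra's algorithm used to find the path with the lowest accumulated transmission loss. It assumes that per-link losses are non-negative, expressed in dB, and additive along a path; the graph representation, the function name, and the example loss values are hypothetical and not taken from the paper.

```python
import heapq


def min_loss_path(links, src, dst):
    """Return (total_loss_db, path) for the lowest-loss route, or None.

    `links` maps each node to a dict of {neighbour: link_loss_db}.
    Nodes can be any hashable, orderable value, e.g. (x, y) mesh coordinates.
    """
    best = {src: 0.0}          # lowest known loss to each node
    prev = {}                  # predecessor on the best known path
    heap = [(0.0, src)]
    while heap:
        loss, node = heapq.heappop(heap)
        if node == dst:
            break
        if loss > best.get(node, float("inf")):
            continue           # stale heap entry
        for nxt, link_loss in links.get(node, {}).items():
            cand = loss + link_loss
            if cand < best.get(nxt, float("inf")):
                best[nxt] = cand
                prev[nxt] = node
                heapq.heappush(heap, (cand, nxt))
    if dst not in best:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    path.reverse()
    return best[dst], path


# Hypothetical 2 x 2 mesh with made-up link losses in dB.
mesh = {
    (0, 0): {(1, 0): 1.0, (0, 1): 0.5},
    (1, 0): {(0, 0): 1.0, (1, 1): 3.0},
    (0, 1): {(0, 0): 0.5, (1, 1): 1.0},
    (1, 1): {(1, 0): 3.0, (0, 1): 1.0},
}
route = min_loss_path(mesh, (0, 0), (1, 1))
```

In this toy mesh the route through (0, 1) accumulates 1.5 dB, against 4.0 dB through (1, 0), so the former is selected even though both paths have the same hop count.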

1.0 V-0.18 µm CMOS Tunable Low Pass Filters with 73 dB DR for On-Chip Sensing Acquisition Systems

Electronics ◽

10.3390/electronics10050563 ◽

2021 ◽

Vol 10 (5) ◽

pp. 563

Author(s):

Jorge Pérez-Bailón ◽

Belén Calvo ◽

Nicolás Medrano

Keyword(s):

Power Consumption ◽

Dynamic Range ◽

Low Voltage ◽

Cutoff Frequency ◽

Cmos Technology ◽

Active Area ◽

Current Steering ◽

Low Pass ◽

On Chip ◽

Low Pass Filters

This paper presents a new approach based on the use of a Current Steering (CS) technique for the design of fully integrated Gm–C Low Pass Filters (LPF) with sub-Hz to kHz tunable cut-off frequencies and an enhanced power-area-dynamic range trade-off. The proposed approach has been experimentally validated by two different first-order single-ended LPFs designed in a 0.18 µm CMOS technology powered by a 1.0 V single supply: a folded-OTA based LPF and a mirrored-OTA based LPF. The first one exhibits a constant power consumption of 180 nW at 100 nA bias current with an active area of 0.00135 mm² and a tunable cutoff frequency that spans over 4 orders of magnitude (~100 mHz–152 Hz @ CL = 50 pF) preserving dynamic figures greater than 78 dB. The second one exhibits a power consumption of 1.75 µW at 500 nA with an active area of 0.0137 mm² and a tunable cutoff frequency that spans over 5 orders of magnitude (~80 mHz–~1.2 kHz @ CL = 50 pF) preserving a dynamic range greater than 73 dB. Compared with previously reported filters, this proposal is a competitive solution while satisfying the low-voltage low-power on-chip constraints, becoming a preferable choice for general-purpose reconfigurable front-end sensor interfaces.
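
For orientation, the textbook relation for an ideal first-order Gm–C low-pass filter links the cutoff frequency to the effective transconductance and the load capacitance. The worked numbers below assume this ideal model and use the quoted CL = 50 pF with the quoted endpoints of the first filter; the resulting transconductance values are illustrative estimates, not figures reported by the authors.

```latex
f_c = \frac{G_m}{2\pi C_L}
\quad\Longrightarrow\quad
G_m = 2\pi f_c C_L

% f_c = 100 mHz, C_L = 50 pF:
G_m \approx 2\pi \times 0.1\,\mathrm{Hz} \times 50\,\mathrm{pF} \approx 31.4\,\mathrm{pS}

% f_c = 152 Hz, C_L = 50 pF:
G_m \approx 2\pi \times 152\,\mathrm{Hz} \times 50\,\mathrm{pF} \approx 47.8\,\mathrm{nS}
```

Under this model, sub-Hz cutoffs with an on-chip capacitor of tens of picofarads call for effective transconductances in the picosiemens range, which is the regime a current-division technique such as current steering is meant to reach.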

An Imbalanced Image Classification Method for the Cell Cycle Phase

Information ◽

10.3390/info12060249 ◽

2021 ◽

Vol 12 (6) ◽

pp. 249

Author(s):

Xin Jin ◽

Yuanwen Zou ◽

Zhongbing Huang

Keyword(s):

Cell Cycle ◽

Deep Learning ◽

Image Classification ◽

Classification Accuracy ◽

Data Augmentation ◽

Cycle Phase ◽

Generative Adversarial Network ◽

Adversarial Network ◽

Cellular Life

The cell cycle is an important process in cellular life. In recent years, some image processing methods have been developed to determine the cell cycle stages of individual cells. However, in most of these methods, cells have to be segmented and their features need to be extracted. During feature extraction, some important information may be lost, resulting in lower classification accuracy. Thus, we used a deep learning method to retain all cell features. To address the insufficient number and imbalanced distribution of original images, we used the Wasserstein generative adversarial network-gradient penalty (WGAN-GP) for data augmentation. At the same time, a residual network (ResNet), one of the most widely used deep learning classification networks, was used for image classification. Our method classified cell cycle images more accurately, reaching an accuracy of 83.88%. Compared with an accuracy of 79.40% in previous experiments, our accuracy increased by 4.48 percentage points. Another dataset was used to verify the effect of our model and, compared with the accuracy from previous results, our accuracy increased by 12.52%. The results showed that our new cell cycle image classification system based on WGAN-GP and ResNet is useful for the classification of imbalanced images. Moreover, our method could potentially address the low classification accuracy in biomedical images caused by insufficient numbers of original images and their imbalanced distribution.

Deep Learning-Based Hepatocellular Carcinoma Histopathology Image Classification: Accuracy versus Training Dataset Size

IEEE Access ◽

10.1109/access.2021.3060765 ◽

2021 ◽

pp. 1-1

Author(s):

Yu-Shiang Lin ◽

Pei-Hsin Huang ◽

Yung-Yaw Chen

Keyword(s):

Hepatocellular Carcinoma ◽

Deep Learning ◽

Image Classification ◽

Classification Accuracy ◽

Training Dataset ◽

Dataset Size

Hyperspectral Image Classification Based on Multi-Scale Residual Network with Attention Mechanism

Remote Sensing ◽

10.3390/rs13030335 ◽

2021 ◽

Vol 13 (3) ◽

pp. 335

Author(s):

Yuhao Qing ◽

Wenyi Liu

Keyword(s):

Neural Network ◽

Deep Learning ◽

Convolutional Neural Network ◽

Image Classification ◽

Classification Accuracy ◽

Hyperspectral Image ◽

Principal Component ◽

Hyperspectral Image Classification ◽

Deep Network ◽

Multi Scale

In recent years, image classification on hyperspectral imagery utilizing deep learning algorithms has attained good results. Spurred by that finding, and to further improve deep learning classification accuracy, we propose a multi-scale residual convolutional neural network model fused with an efficient channel attention network (MRA-NET) that is appropriate for hyperspectral image classification. The suggested technique comprises a multi-staged architecture, where initially the spectral information of the hyperspectral image is reduced into a two-dimensional tensor, utilizing a principal component analysis (PCA) scheme. Then, the constructed low-dimensional image is input to our proposed ECA-NET deep network, which exploits the advantages of its core components, i.e., multi-scale residual structure and attention mechanisms. We evaluate the performance of the proposed MRA-NET on three publicly available hyperspectral datasets and demonstrate that, overall, the classification accuracy of our method is 99.82%, 99.81%, and 99.37%, respectively, which is higher than the corresponding accuracy of current networks such as the 3D convolutional neural network (CNN), the three-dimensional residual convolution structure (RES-3D-CNN), and the space–spectrum joint deep network (SSRN).
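
The PCA stage described above can be sketched as follows, assuming NumPy is available. This is a generic spectral PCA on a height × width × bands cube, not the authors' preprocessing code; the function name, the array layout, and the number of retained components are assumptions made for illustration.

```python
import numpy as np


def pca_reduce_bands(cube, n_components):
    """Project each pixel's spectrum onto the top principal components.

    `cube` has shape (height, width, bands); the result has shape
    (height, width, n_components).
    """
    h, w, b = cube.shape
    pixels = cube.reshape(-1, b).astype(np.float64)  # one row per pixel (a copy)
    pixels -= pixels.mean(axis=0)                    # centre each band
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(pixels, full_matrices=False)
    reduced = pixels @ vt[:n_components].T
    return reduced.reshape(h, w, n_components)


# Hypothetical toy cube: 4 x 5 pixels with 10 spectral bands.
rng = np.random.default_rng(0)
toy_cube = rng.normal(size=(4, 5, 10))
toy_reduced = pca_reduce_bands(toy_cube, 3)
```

The spatial layout is preserved, so the reduced cube can be fed to a convolutional network exactly like a multi-channel image.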
