An Efficient Hardware Architecture with Adjustable Precision and Extensible Range to Implement Sigmoid and Tanh Functions

Efficient and precise hardware implementations of the tanh and sigmoid functions play an important role in many neural network algorithms. Different applications have different accuracy requirements, but traditional methods make it difficult to adjust precision. We therefore propose, for the first time, a hardware-efficient, high-speed architecture with adjustable precision to implement them. We present two methods to implement the sigmoid and tanh functions. One is based on the rotation mode of hyperbolic CORDIC and the vector mode of linear CORDIC (called RHC-VLC); the other is based on the carry-save method and the vector mode of linear CORDIC (called CSM-VLC). We validate the two methods with MATLAB and RTL implementations. When synthesized in TSMC 40 nm CMOS technology, AR∣VR(3,0), a special case based on the RHC-VLC method, occupies an area of 4290.98 μm² and consumes 1.69 mW at a frequency of 1.5 GHz. At the same frequency, AR∣VC(3), a special case based on the CSM-VLC method, occupies 3196.36 μm² and consumes 1.38 mW. Both are superior to existing methods for implementing such an architecture with adjustable precision.
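The idea of combining a hyperbolic CORDIC rotation with a linear CORDIC vectoring step can be illustrated in software. The following is a minimal floating-point sketch, not the paper's RTL architecture: the function names, the iteration count, and the use of the identity sigmoid(v) = (1 + tanh(v/2)) / 2 are assumptions made here for illustration, and the sketch only covers the basic CORDIC convergence range (|v| below about 1.118 for tanh), without the range extension the title refers to.

```python
import math

def hyperbolic_cordic_rotation(z, n_iter=16):
    """Approximate (cosh z, sinh z) with rotation-mode hyperbolic CORDIC.

    Valid for |z| below about 1.118. Indices start at 1, and indices
    4, 13, 40, ... are executed twice, as hyperbolic CORDIC requires.
    """
    indices = []
    i, next_repeat = 1, 4
    while len(indices) < n_iter:
        indices.append(i)
        if i == next_repeat and len(indices) < n_iter:
            indices.append(i)               # repeated iteration
            next_repeat = 3 * next_repeat + 1
        i += 1

    # Pre-compensate the hyperbolic gain so the outputs need no scaling.
    gain = 1.0
    for i in indices:
        gain *= math.sqrt(1.0 - 2.0 ** (-2 * i))

    x, y = 1.0 / gain, 0.0
    for i in indices:
        d = 1.0 if z >= 0 else -1.0
        t = 2.0 ** (-i)                     # a shift in hardware
        x, y = x + d * y * t, y + d * x * t
        z -= d * math.atanh(t)              # a lookup table in hardware
    return x, y

def linear_cordic_vectoring(x, y, n_iter=16):
    """Approximate y / x (x > 0, |y / x| < 2) with vectoring-mode linear CORDIC."""
    z = 0.0
    for i in range(n_iter):
        d = 1.0 if y >= 0 else -1.0
        t = 2.0 ** (-i)
        y -= d * x * t                      # drive y toward zero
        z += d * t                          # accumulate the quotient
    return z

def tanh_cordic(v):
    """tanh(v) as sinh(v) / cosh(v), both obtained from CORDIC."""
    c, s = hyperbolic_cordic_rotation(v)
    return linear_cordic_vectoring(c, s)

def sigmoid_cordic(v):
    """sigmoid(v) through the identity (1 + tanh(v / 2)) / 2."""
    return 0.5 * (1.0 + tanh_cordic(0.5 * v))
```

In this sketch the iteration count plays the role of the precision knob: each additional iteration adds roughly one bit of accuracy, which is the general property that makes CORDIC attractive for adjustable-precision designs.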

A Hidden DCT-Based Invisible Watermarking Method for Low-Cost Hardware Implementations

Electronics ◽

10.3390/electronics10121465 ◽

2021 ◽

Vol 10 (12) ◽

pp. 1465

Author(s):

Yuxuan Wang ◽

Yuanyong Luo ◽

Zhongfeng Wang ◽

Hongbing Pan

Keyword(s):

Resource Sharing ◽

Power Efficiency ◽

High Speed ◽

Low Cost ◽

Color Space ◽

Computational Cost ◽

CMOS Technology ◽

Hardware Implementations ◽

Invisible Watermarking ◽

Speed Performance

This paper presents an invisible and robust watermarking method and its hardware implementation. The proposed architecture is based on the discrete cosine transform (DCT) algorithm. Novel techniques are applied to reduce the computational cost of the DCT and of color space conversion, achieving low-cost and high-speed performance. In addition, a watermark embedder and a blind extractor are implemented in the same circuit using a resource-sharing method. Our approach is compatible with various watermark embedding ratios, such as 1/16 and 1/64, with a PSNR of over 45 dB and an NC value of 1. After Joint Photographic Experts Group (JPEG) compression with a quality factor (QF) of 50, our method achieves an NC value of 0.99. Results from Design Compiler (DC) with TSMC 90 nm CMOS technology show that our design achieves a frequency of 2.32 GHz with an area of 304,980.08 μm² and a power consumption of 508.1835 mW. For the FPGA implementation, our method achieves a frequency of 421.94 MHz. Compared with state-of-the-art works, our design improves the frequency by 4.26 times, saves 90.2% of the area and increases the power efficiency more than 1000-fold.
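The two quality metrics quoted above, PSNR and NC, can be computed as in the following minimal sketch. The abstract does not state which definitions the authors use, so the formulas here are the common textbook ones and are an assumption: PSNR for 8-bit samples with a peak value of 255, and NC as the normalized cross-correlation between the embedded and the extracted watermark. The function names are hypothetical.

```python
import math

def psnr(original, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, distorted)) / len(original)
    if mse == 0:
        return math.inf                      # identical signals
    return 10.0 * math.log10(peak * peak / mse)

def normalized_correlation(watermark, extracted):
    """Normalized cross-correlation; 1.0 means the extraction is a perfect match."""
    num = sum(a * b for a, b in zip(watermark, extracted))
    den = math.sqrt(sum(a * a for a in watermark)) * math.sqrt(
        sum(b * b for b in extracted)
    )
    return num / den if den else 0.0
```

For example, changing one of four pixels by a single gray level gives `psnr([100, 100, 100, 100], [101, 100, 100, 100])` well above the 45 dB level mentioned in the abstract, and a watermark compared with itself gives an NC of 1 up to rounding.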

Isolation and HPLC Determination of the Chemical Components of Gentianella acuta (Michx.) Hulten

Current Analytical Chemistry ◽

10.2174/1573411014666180730113804 ◽

2018 ◽

Vol 15 (1) ◽

pp. 21-33

Author(s):

Ying Wei ◽

Yongqiao Liu ◽

Yifan Hele ◽

Weiwei Sun ◽

Yang Wang ◽

...

Keyword(s):

Medicinal Plant ◽

High Speed ◽

Chemical Constituents ◽

Folk Medicine ◽

Gradient Elution ◽

Chemical Components ◽

Isolation And Characterization ◽

Major Chemical ◽

Main Components ◽

First Time

Background: Gentianella acuta (Michx.) Hulten is an important medicinal plant found in several Chinese provinces. It has been widely used in folk medicine to treat various illnesses. However, there is not enough detailed information about the chemical constituents of this plant or about methods for determining their content. Objective: The focus of this work is the isolation and characterization of the major chemical constituents of Gentianella acuta and the development of an analytical method for their determination. Methods: The components of Gentianella acuta were isolated using (1) ethanol extraction and adsorption on macroporous resin, and (2) ethyl acetate extraction and high-speed countercurrent chromatography. An HPLC-DAD method was developed using a C18 column and water-acetonitrile as the mobile phase. Based on compound polarities, both isocratic and gradient elution methods were developed. Results: A total of 29 compounds were isolated from this plant, of which 17 were isolated from this genus for the first time. The main components of this plant were found to be xanthones. The HPLC-DAD method was developed and validated for their determination and was found to show good sensitivity and reliability. Conclusion: The results of this work add to the limited body of work available on this important medicinal plant. The findings will be useful for further investigation and development of Gentianella acuta for its valuable medicinal properties.

Projected Performance of Sub-10 nm GaN-based Double Gate MOSFETs

Circulation in Computer Science ◽

10.22632/ccs-2017-251-50 ◽

2017 ◽

Vol 2 (2) ◽

pp. 15-19 ◽

Cited By ~ 3

Author(s):

Md. Saud Al Faisal ◽

Md. Rokib Hasan ◽

Marwan Hossain ◽

Mohammad Saiful Islam

Keyword(s):

High Speed ◽

Field Effect Transistors ◽

CMOS Technology ◽

Channel Length ◽

Capacitive Coupling ◽

Oxide Semiconductor ◽

Double Gate ◽

Short Channel ◽

Channel Effects ◽

Silvaco Atlas

GaN-based double-gate metal-oxide-semiconductor field-effect transistors (DG-MOSFETs) in the sub-10 nm regime have been designed for next-generation logic applications. To rigorously evaluate the device performance, simulations based on the non-equilibrium Green’s function formalism are performed using SILVACO ATLAS. The device turns on at a gate voltage of VGS = 1 V and turns off at VGS = 0 V. The ON-state and OFF-state drain currents are found to be 12 mA/μm and ~10⁻⁸ A/μm, respectively, at a drain voltage of VDS = 0.75 V. The sub-threshold slope (SS) and drain-induced barrier lowering (DIBL) are ~69 mV/decade and ~43 mV/V, which are compatible with CMOS technology. To improve the figures of merit of the proposed device, the source-to-gate (S-G) and gate-to-drain (G-D) distances, referred to as the underlap, are varied. The lengths are kept equal on both sides of the gate. The SS and DIBL decrease as the underlap length (LUN) increases. Although the source-to-drain resistance increases with the longer channel, the underlap architectures perform better because of the reduced capacitive coupling between the contacts (S-G and G-D), which minimizes short-channel effects. The proposed GaN-based DG-MOSFETs are therefore promising candidates to replace currently used MOSFETs in future high-speed applications.
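For reference, the two short-channel metrics quoted above are conventionally defined as follows. The threshold-voltage symbol and the two drain-bias labels are notation assumed here rather than taken from the abstract; the on/off ratio in the last line follows directly from the reported currents.

```latex
\begin{align}
SS &= \left( \frac{\partial \log_{10} I_D}{\partial V_{GS}} \right)^{-1}
      \quad \text{(mV/decade)} \\
DIBL &= -\,\frac{V_{th}\!\left(V_{DS}^{\mathrm{high}}\right) - V_{th}\!\left(V_{DS}^{\mathrm{low}}\right)}
               {V_{DS}^{\mathrm{high}} - V_{DS}^{\mathrm{low}}}
      \quad \text{(mV/V)} \\
\frac{I_{ON}}{I_{OFF}} &\approx \frac{12 \times 10^{-3}\ \mathrm{A/\mu m}}{10^{-8}\ \mathrm{A/\mu m}}
      = 1.2 \times 10^{6}
\end{align}
```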

Getting High: High Fidelity Simulation of High Granularity Calorimeters with High Speed

Computing and Software for Big Science ◽

10.1007/s41781-021-00056-0 ◽

2021 ◽

Vol 5 (1) ◽

Author(s):

Erik Buhmann ◽

Sascha Diefenbacher ◽

Engin Eren ◽

Frank Gaede ◽

Gregor Kasieczka ◽

...

Keyword(s):

Particle Physics ◽

High Speed ◽

Deep Neural Networks ◽

High Fidelity ◽

High Fidelity Simulation ◽

Physical Processes ◽

Electromagnetic Showers ◽

Information Bottleneck ◽

First Time ◽

Processing Network

Accurate simulation of physical processes is crucial for the success of modern particle physics. However, simulating the development and interaction of particle showers with calorimeter detectors is a time-consuming process and drives the computing needs of large experiments at the LHC and future colliders. Recently, generative machine learning models based on deep neural networks have shown promise in speeding up this task by several orders of magnitude. We investigate the use of a new architecture, the Bounded Information Bottleneck Autoencoder, for modelling electromagnetic showers in the central region of the Silicon-Tungsten calorimeter of the proposed International Large Detector. Combined with a novel second post-processing network, this approach achieves an accurate simulation of differential distributions including, for the first time, the shape of the minimum-ionizing-particle peak compared to a full Geant4 simulation for a high-granularity calorimeter with 27k simulated channels. The results are validated by comparing to established architectures. Our results further strengthen the case for using generative networks for fast simulation and demonstrate that physically relevant differential distributions can be described with high accuracy.

The Demonstration of S2P (Serial-to-Parallel) Converter with Address Allocation Method Using 28 nm CMOS Technology

Applied Sciences ◽

10.3390/app11010429 ◽

2021 ◽

Vol 11 (1) ◽

pp. 429

Author(s):

Min-Su Kim ◽

Youngoo Yang ◽

Hyungmo Koo ◽

Hansik Oh

Keyword(s):

Integrated Circuits ◽

Embedded System ◽

Low Power ◽

High Speed ◽

CMOS Technology ◽

Clock Frequency ◽

Clock Generation ◽

Allocation Method ◽

Evaluation Board ◽

28 nm

Advanced CMOS technology has been widely used to improve the performance of analog, RF, and digital integrated circuits. We designed and implemented a high-speed, low-power serial-to-parallel (S2P) converter for 5G applications in 28 nm CMOS technology. It can update data easily and quickly using the proposed address allocation method. To verify its performance, an embedded system (NI-FPGA) was also used for fast clock generation at the evaluation-board level. The proposed S2P converter circuit shows an extremely low power consumption of 28.1 μW at 0.91 V with a core die area of 60 × 60 μm² and operates successfully over a wide clock frequency range from 5 MHz to 40 MHz.

Convergence properties of the Broyden-like method for mixed linear–nonlinear systems of equations

Numerical Algorithms ◽

10.1007/s11075-020-01060-y ◽

2021 ◽

Author(s):

Florian Mannel

Keyword(s):

Linear Independence ◽

Affine Subspace ◽

Systems Of Equations ◽

Nonlinear Systems Of Equations ◽

Convergence Properties ◽

Initial Matrix ◽

First Time ◽

Special Case ◽

Dimensional Mapping ◽

Finite Number Of Iterations

We consider the Broyden-like method for a nonlinear mapping $F:\mathbb{R}^{n}\rightarrow\mathbb{R}^{n}$ that has some affine component functions, using an initial matrix $B_0$ that agrees with the Jacobian of $F$ in the rows that correspond to affine components of $F$. We show that in this setting, the iterates belong to an affine subspace and can be viewed as the outcome of the Broyden-like method applied to a lower-dimensional mapping $G:\mathbb{R}^{d}\rightarrow\mathbb{R}^{d}$, where $d$ is the dimension of the affine subspace. We use this subspace property to make some small contributions to the decades-old question of whether the Broyden-like matrices converge: First, we observe that the only available result concerning this question cannot be applied if the iterates belong to a subspace because the required uniform linear independence does not hold. By generalizing the notion of uniform linear independence to subspaces, we can extend the available result to this setting. Second, we infer from the extended result that if at most one component of $F$ is nonlinear while the others are affine and the associated $n-1$ rows of the Jacobian of $F$ agree with those of $B_0$, then the Broyden-like matrices converge if the iterates converge; this holds whether the Jacobian at the root is invertible or not. In particular, this is the first time that convergence of the Broyden-like matrices is proven for $n>1$, albeit for a special case only. Third, under the additional assumption that the Broyden-like method turns into Broyden’s method after a finite number of iterations, we prove that the convergence order of iterates and matrix updates is bounded from below by $\frac{\sqrt{5}+1}{2}$ if the Jacobian at the root is invertible. If the nonlinear component of $F$ is actually affine, we show finite convergence. We provide high-precision numerical experiments to confirm the results.
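The setting described above can be tried out on a small example. The following Python sketch is an illustration under stated assumptions, not the author's code: it uses the standard Broyden-like update B ← B + σ (y − B s) sᵀ / ‖s‖² with a fixed σ (σ = 1 gives Broyden's method), a hypothetical two-dimensional test mapping `F_example` whose first component is affine, and an initial matrix whose first row equals the Jacobian row of that affine component. Since B s = −F(x), the term y − B s reduces to F at the new iterate, which is what the code uses.

```python
def F_example(x):
    """Hypothetical test mapping: one affine and one nonlinear component."""
    return [x[0] + x[1] - 3.0,               # affine component
            x[0] ** 2 + x[1] ** 2 - 5.0]     # nonlinear component

def solve2(B, r):
    """Solve the 2x2 system B s = r by Cramer's rule (fails if B is singular)."""
    (a, b), (c, d) = B
    det = a * d - b * c
    return [(d * r[0] - b * r[1]) / det, (a * r[1] - c * r[0]) / det]

def broyden_like(F, x, B, sigma=1.0, tol=1e-8, max_iter=50):
    """Broyden-like iteration in two dimensions; returns the final x and B."""
    Fx = F(x)
    for _ in range(max_iter):
        s = solve2(B, [-Fx[0], -Fx[1]])      # quasi-Newton step
        x = [x[0] + s[0], x[1] + s[1]]
        Fx = F(x)
        if max(abs(Fx[0]), abs(Fx[1])) < tol:
            break
        ss = s[0] * s[0] + s[1] * s[1]
        # Rank-one update; y - B s equals F at the new point.
        B = [[B[i][j] + sigma * Fx[i] * s[j] / ss for j in range(2)]
             for i in range(2)]
    return x, B

# Start off the affine subspace; row 0 of B0 is the exact Jacobian row [1, 1].
x_final, B_final = broyden_like(F_example, [1.0, 2.5], [[1.0, 1.0], [2.0, 5.0]])
```

In this example the affine residual vanishes after the first step, so the row of B that belongs to the affine component is left essentially unchanged and all later iterates stay on the line x₀ + x₁ = 3, which is the subspace property the abstract describes.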

Investigation of PVT-Aware STT-MRAM Sensing Circuits for Low-VDD Scenario

Micromachines ◽

10.3390/mi12050551 ◽

2021 ◽

Vol 12 (5) ◽

pp. 551

Author(s):

Zhongjian Bian ◽

Xiaofeng Hong ◽

Yanan Guo ◽

Lirida Naviner ◽

Wei Ge ◽

...

Keyword(s):

Nonvolatile Memory ◽

High Speed ◽

Low Voltage ◽

Supply Voltage ◽

Random Access ◽

Magnetic Tunnel Junction ◽

Complementary Metal Oxide Semiconductor ◽

Current Mode ◽

CMOS Technology ◽

Oxide Semiconductor

Spintronic-based embedded magnetic random access memory (eMRAM) is becoming a foundry-validated solution for next-generation nonvolatile memory applications. Hybrid complementary metal-oxide-semiconductor (CMOS)/magnetic tunnel junction (MTJ) integration has been selected as a suitable candidate for energy-harvesting, area-constrained and energy-efficient Internet of Things (IoT) systems-on-chips. Multi-VDD (low supply voltage) techniques have been adopted to minimize energy dissipation in MRAM, at the cost of reduced writing/sensing speed and margin. Meanwhile, yield can be severely affected by variations in process parameters. In this work, we conduct a thorough analysis of MRAM sensing margin and yield. We propose a current-mode sensing amplifier (CSA), named the high-sensing-margin, high-speed and stability sensing amplifier (HMSS-SA), with a reconfigured reference path and pre-charge transistor. Process-voltage-temperature (PVT) aware analysis is performed based on an MTJ compact model and an industrial 28 nm CMOS technology, explicitly considering a low-voltage (0.7 V), low tunneling magnetoresistance (TMR) (50%) and high-temperature (85 °C) scenario as the worst sensing case. A case study takes a brief look at the sensing circuits as applied to in-memory bit-wise computing. Simulation results indicate that the proposed HMSS-SA achieves remarkable performance, with a sensing frequency of up to 2.5 GHz. At a 0.65 V supply voltage, it can achieve a 1 GHz operation frequency with a failure rate of only 0.3%.

A manufacturable 0.30 μm gate CMOS technology for high speed microprocessors

1996 Symposium on VLSI Technology. Digest of Technical Papers ◽

10.1109/vlsit.1996.507857 ◽

2002 ◽

Cited By ~ 1

Author(s):

A. Appel ◽

S. Crank ◽

Y. Kim ◽

C. Scharrer ◽

D. Spratt ◽

...

Keyword(s):

High Speed ◽

CMOS Technology

A high density matched hexagonal transistor structure in standard CMOS technology for high speed applications

ICMTS 1999. Proceedings of 1999 International Conference on Microelectronic Test Structures (Cat. No.99CH36307) ◽

10.1109/icmts.1999.766245 ◽

2003 ◽

Cited By ~ 3

Author(s):

A. Van den Bosch ◽

M. Steyaert ◽

W. Sansen

Keyword(s):

High Speed ◽

CMOS Technology ◽

High Density ◽

Transistor Structure

A 25 Gbps VCSEL driving ASIC: an attempt for ultra-high-speed front-end readout applications

Journal of Instrumentation ◽

10.1088/1748-0221/17/01/c01040 ◽

2022 ◽

Vol 17 (01) ◽

pp. C01040

Author(s):

C. Zhao ◽

D. Guo ◽

Q. Chen ◽

N. Fang ◽

Y. Gan ◽

...

Keyword(s):

High Speed ◽

CMOS Technology ◽

Test Results ◽

Eye Diagram ◽

Optical Links ◽

Optical Module ◽

Active Feedback ◽

High Bandwidth ◽

Output Driver ◽

Very High

This paper presents the design and test results of a 25 Gbps VCSEL driving ASIC fabricated in a 55 nm CMOS technology as a step toward future very-high-speed optical links. The VCSEL driving ASIC is composed of an input equalizer stage, a pre-driver stage and a novel output driver stage. To achieve high bandwidth, the pre-driver stage combines the inductor-shared peaking structure with the active-feedback technique. The output driver stage uses a pseudo-differential CML driver structure and an adjustable FFE pre-emphasis technique to improve the bandwidth. The VCSEL driver has been integrated in a customized optical module with a VCSEL array. Both the electrical function and the optical performance have been fully evaluated. The output optical eye diagram has passed the eye mask test at a data rate of 25 Gbps. The peak-to-peak jitter of the 25 Gbps optical eye is 19.5 ps and the RMS jitter is 2.9 ps.
