Genetic Parallel Programming: Design and Implementation

This paper presents a novel Genetic Parallel Programming (GPP) paradigm for evolving parallel programs that run on a Multi-Arithmetic-Logic-Unit (Multi-ALU) Processor (MAP). The MAP is a Multiple Instruction-streams, Multiple Data-streams (MIMD), general-purpose register machine that can be implemented on modern Very Large-Scale Integrated Circuits (VLSIs) in order to evaluate genetic programs at high speed. For human programmers, writing parallel programs is more difficult than writing sequential programs. However, experimental results show that GPP evolves parallel programs with less computational effort than their sequential counterparts. This suggests a new approach: evolve a feasible problem solution in parallel program form and then serialize it into a sequential program if required. The effectiveness and efficiency of GPP are investigated using a suite of 14 well-studied benchmark problems. Experimental results show that GPP speeds up evolution substantially.

Plasma Surface Modification of Silica and its Application in Epoxy Molding Compounds for Large-Scale Integrated Circuits Packaging

Advanced Materials Research ◽

10.4028/www.scientific.net/amr.158.184 ◽

2010 ◽

Vol 158 ◽

pp. 184-188 ◽

Cited By ~ 1

Author(s):

Ming Shan Yang ◽

Lin Kai Li ◽

Jian Guo Zhang

Keyword(s):

Surface Modification ◽

Acrylic Acid ◽

Integrated Circuits ◽

Plasma Polymerization ◽

High Speed ◽

Large Scale ◽

Treatment Time ◽

Rf Plasma ◽

Formaldehyde Resin ◽

Molding Compounds

The surface modification of silica for epoxy molding compounds (EMCs) was conducted by plasma polymerization using RF plasma (13.56 MHz), and modification factors such as plasma power, gas pressure and treatment time were investigated systematically. The monomers used for the plasma polymer coatings were pyrrole, 1,3-diaminopropane, acrylic acid and urea. The plasma-polymerized coating on the silica was characterized by FTIR and contact angle measurements. Using the plasma-treated silica as filler, ortho-cresol novolac epoxy as the main resin, novolac phenolic-formaldehyde resin as cross-linking agent and 2-methylimidazole as curing accelerator, EMCs for the packaging of large-scale integrated circuits were prepared by high-speed premixing and twin-roller mixing. The results show that the silica surface can be coated by plasma polymerization of pyrrole, 1,3-diaminopropane, acrylic acid and urea, and that the comprehensive properties of the EMC were improved.

Performance Comparison of OpenMP, MPI, and MapReduce in Practical Problems

Advances in Multimedia ◽

10.1155/2015/575687 ◽

2015 ◽

Vol 2015 ◽

pp. 1-9 ◽

Cited By ~ 30

Author(s):

Sol Ji Kang ◽

Sang Yeon Lee ◽

Keon Myung Lee

Keyword(s):

Parallel Programming ◽

Large Scale ◽

Memory Systems ◽

Performance Comparison ◽

Benchmark Problems ◽

Distributed Programming ◽

Problem Size ◽

Good Picture ◽

Data Intensive ◽

The Right

As problem size and complexity increase, several parallel and distributed programming models and frameworks have been developed to handle such problems efficiently. This paper briefly reviews the parallel computing models and describes three widely recognized parallel programming frameworks: OpenMP, MPI, and MapReduce. OpenMP is the de facto standard for parallel programming on shared memory systems. MPI is the de facto industry standard for distributed memory systems. The MapReduce framework has become the de facto standard for large-scale data-intensive applications. The qualitative pros and cons of each framework are known, but quantitative performance indexes help give a good picture of which framework to use for a given application. Two benchmark problems are chosen to compare the frameworks: the all-pairs-shortest-path problem and the data join problem. This paper presents parallel programs for both problems implemented on each of the three frameworks and reports experimental results on a cluster of computers. It also discusses which is the right tool for each job by analyzing the characteristics and performance of the paradigms.
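
For readers unfamiliar with the first benchmark, the following is a minimal sequential sketch of all-pairs shortest paths using the Floyd-Warshall algorithm. The abstract does not say which algorithm the authors parallelized, so the choice of Floyd-Warshall, the function name `floyd_warshall`, and the sample graph are assumptions made for illustration only.

```python
INF = float("inf")


def floyd_warshall(weights):
    """Return the matrix of shortest-path distances for a dense weight matrix.

    weights[i][j] is the edge weight from i to j, INF if there is no edge,
    and 0 on the diagonal. Assumes the graph has no negative cycles.
    """
    n = len(weights)
    dist = [row[:] for row in weights]  # work on a copy
    for k in range(n):
        row_k = dist[k]
        # For a fixed k, each row i can be updated independently of the
        # others, which is the loop a shared-memory or message-passing
        # version would typically split across workers.
        for i in range(n):
            d_ik = dist[i][k]
            if d_ik == INF:
                continue
            row_i = dist[i]
            for j in range(n):
                candidate = d_ik + row_k[j]
                if candidate < row_i[j]:
                    row_i[j] = candidate
    return dist


# Hypothetical 4-node directed graph used only as an example.
graph = [
    [0, 3, INF, 7],
    [8, 0, 2, INF],
    [5, INF, 0, 1],
    [2, INF, INF, 0],
]
shortest = floyd_warshall(graph)
```

The triple loop costs O(n^3) time, which is why the problem is a common target for comparing parallel frameworks.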

AN EFFICIENT MULTI-SPIN CODING ALGORITHM FOR NEURAL NETWORKS

International Journal of Modern Physics C ◽

10.1142/s0129183191000925 ◽

1991 ◽

Vol 02 (02) ◽

pp. 623-636 ◽

Cited By ~ 1

Author(s):

MARTIN NESCHEN

Keyword(s):

Neural Networks ◽

Numerical Simulations ◽

High Speed ◽

Large Scale ◽

Effective Rate ◽

Computational Effort ◽

Local Fields ◽

New Method ◽

Coupling Matrix ◽

Horizontal Structure

A new method for large-scale numerical simulations of neural networks is proposed which reduces the computational effort by incrementally updating the local fields, thus restricting the operations to flipped spins only. A highly optimized multi-spin algorithm is described that employs words oriented along the columns of the coupling matrix, unlike the horizontal structure in existing high-speed algorithms. An effective rate of 35 × 10^9 couplings/s can be attained on a Cray-YMP, which is about five times as fast as the best existing multi-spin implementations.
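
The incremental update idea can be illustrated with a small sketch. It assumes the usual definition of the local field, h_i = sum_j J[i][j] * s[j] with spins s[j] in {-1, +1}; the function names and the random test network are hypothetical, and the bit-packed multi-spin coding of the paper is not reproduced here.

```python
import random


def local_fields(J, s):
    """Recompute every local field from scratch: O(N^2) work."""
    n = len(s)
    return [sum(J[i][j] * s[j] for j in range(n)) for i in range(n)]


def flip_and_update(J, s, h, k):
    """Flip spin k in place and patch all local fields in O(N).

    Flipping s[k] changes it by -2 * (old s[k]), so each field h[i]
    changes by J[i][k] * (-2 * old s[k]). Only column k of J is read.
    """
    delta = -2 * s[k]
    s[k] = -s[k]
    for i in range(len(h)):
        h[i] += J[i][k] * delta


def demo(n=8, flips=(2, 5, 2, 7), seed=0):
    """Check that incremental updates agree with full recomputation."""
    rng = random.Random(seed)
    J = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            J[i][j] = J[j][i] = rng.randint(-3, 3)  # symmetric couplings
    s = [rng.choice((-1, 1)) for _ in range(n)]
    h = local_fields(J, s)
    for k in flips:
        flip_and_update(J, s, h, k)
    return h == local_fields(J, s)
```

Each flip touches a single column of the coupling matrix, which is consistent with the column-oriented word layout mentioned in the abstract.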

Implementation of Pruned Backpropagation Neural Network Based on Photonic Integrated Circuits

Photonics ◽

10.3390/photonics8090363 ◽

2021 ◽

Vol 8 (9) ◽

pp. 363

Author(s):

Qi Zhang ◽

Zhuangzhuang Xing ◽

Duan Huang

Keyword(s):

Neural Network ◽

Neural Networks ◽

Integrated Circuits ◽

Energy Efficient ◽

High Speed ◽

Large Scale ◽

Matrix Operation ◽

Optical Neural Networks ◽

Optical Neural Network ◽

Random Initialization

We demonstrate a pruned, high-speed and energy-efficient optical backpropagation (BP) neural network. The micro-ring resonator (MRR) banks, as the core of the weight matrix operation, are used for large-scale weighted summation. We find that tuning a pruned MRR weight-bank model gives training performance equivalent to that of a randomly initialized model. Results show that the overall accuracy of the optical neural network on the MNIST dataset is 93.49% after pruning six-layer MRR weight banks under the condition of low insertion loss. This work is scalable to much more complex networks, such as convolutional neural networks and recurrent neural networks, and provides a potential guide for truly large-scale optical neural networks.

High-speed programmable photonic circuits in a cryogenically compatible, visible–near-infrared 200 mm CMOS architecture

Nature Photonics ◽

10.1038/s41566-021-00903-x ◽

2021 ◽

Author(s):

Mark Dong ◽

Genevieve Clark ◽

Andrew J. Leenheer ◽

Matthew Zimmermann ◽

Daniel Dominguez ◽

...

Keyword(s):

Integrated Circuits ◽

Power Consumption ◽

High Speed ◽

Large Scale ◽

Near Infrared ◽

Response Times ◽

Photonic Integrated Circuits ◽

Complementary Metal Oxide Semiconductor ◽

Phase Shifters ◽

Oxide Semiconductor

Recent advances in photonic integrated circuits have enabled a new generation of programmable Mach–Zehnder meshes (MZMs) realized by using cascaded Mach–Zehnder interferometers capable of universal linear-optical transformations on N input/output optical modes. MZMs serve critical functions in photonic quantum information processing, quantum-enhanced sensor networks, machine learning and other applications. However, MZM implementations reported to date rely on thermo-optic phase shifters, which limit applications due to slow response times and high power consumption. Here we introduce a large-scale MZM platform made in a 200 mm complementary metal–oxide–semiconductor foundry, which uses aluminium nitride piezo-optomechanical actuators coupled to silicon nitride waveguides, enabling low-loss propagation with phase modulation at greater than 100 MHz in the visible–near-infrared wavelengths. Moreover, the vanishingly low hold-power consumption of the piezo-actuators enables these photonic integrated circuits to operate at cryogenic temperatures, paving the way for a fully integrated device architecture for a range of quantum applications.
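
As background on the building block of such meshes, the sketch below models a single lossless Mach–Zehnder interferometer as a 2 × 2 unitary built from two 50:50 beamsplitters, an internal phase `theta` and an external phase `phi`. This is one common textbook convention and an assumption here; it is not taken from the paper, and the function names are hypothetical.

```python
import cmath
import math


def matmul2(a, b):
    """Multiply two 2x2 complex matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def phase(angle):
    """Phase shifter acting on the upper arm only."""
    return [[cmath.exp(1j * angle), 0], [0, 1]]


def mzi(theta, phi):
    """Transfer matrix U = B * P(theta) * B * P(phi) of an ideal MZI.

    B is a symmetric 50:50 beamsplitter. Light first sees the external
    phase phi, then B, then the internal phase theta, then B again.
    """
    r = 1 / math.sqrt(2)
    bs = [[r, 1j * r], [1j * r, r]]
    return matmul2(bs, matmul2(phase(theta), matmul2(bs, phase(phi))))


def is_unitary(u, tol=1e-12):
    """Check that the conjugate transpose of u times u is the identity."""
    dagger = [[u[j][i].conjugate() for j in range(2)] for i in range(2)]
    p = matmul2(dagger, u)
    return all(
        abs(p[i][j] - (1 if i == j else 0)) < tol
        for i in range(2)
        for j in range(2)
    )
```

In this convention the power that stays in the upper port is sin^2(theta / 2), so tuning `theta` sets the splitting ratio while `phi` sets a relative input phase.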

A high-speed fixed width floating-point multiplier using residue logarithmic number system algorithm

International Journal of Electrical Engineering Education ◽

10.1177/0020720918813836 ◽

2018 ◽

Vol 57 (4) ◽

pp. 361-375 ◽

Cited By ~ 2

Author(s):

J Jency Rubia ◽

GA Sathish Kumar

Keyword(s):

Integrated Circuits ◽

High Speed ◽

Large Scale ◽

Digital Signal ◽

Number System ◽

Residue Number System ◽

Floating Point ◽

Hardware Complexity ◽

Logarithmic Number System ◽

Logarithmic Number

The Residue Logarithmic Number System (RLNS) in digital arithmetic allows multiplication and division to be performed considerably faster and more precisely than the widely used Floating-Point number formats. RLNS applications in large-scale integrated circuits, digital signal processing, multimedia, scientific computing and artificial neural networks have the Fixed Width property, meaning the input and output bit widths are equal; hence, these applications need a Fixed Width multiplier. In this paper, a Fixed Width Floating-Point multiplier based on RLNS is proposed to increase the processing speed. Truncation errors are reduced by using a Taylor series. RLNS combines the residue number system and the logarithmic number system, and uses a table lookup including all bits for expansion. The proposed scheme is effective with regard to speed, area and power consumption compared with conventional Floating-Point arithmetic designs. Synthesis results were obtained using the Xilinx ISE 14.7 simulator. The area is 16,668 µm², power is 37 mW, delay is 6.160 ns, and the truncation error is reduced by 89% compared with the direct-truncated multiplier. The proposed Fixed Width RLNS multiplier has lower compensation error and minimal hardware complexity, particularly as the number of multiplier input bits increases.
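
The two ingredients that RLNS combines can each be sketched in a few lines. The example below shows multiplication in a residue number system (independent per-modulus products, reconstructed with the Chinese Remainder Theorem) and in a logarithmic number system (multiplication becomes addition of logarithms). The moduli, function names and floating-point logarithms are illustrative assumptions; the paper's fixed-width hardware design and table lookup are not reproduced.

```python
import math

# Hypothetical pairwise-coprime moduli; representable range is 0..1000.
MODULI = (7, 11, 13)


def to_rns(x):
    """Encode a non-negative integer as its residues modulo each modulus."""
    return tuple(x % m for m in MODULI)


def rns_mul(a, b):
    """Multiply two RNS numbers; each residue channel is independent."""
    return tuple((x * y) % m for x, y, m in zip(a, b, MODULI))


def from_rns(residues):
    """Decode with the Chinese Remainder Theorem (Python 3.8+ for pow)."""
    big_m = math.prod(MODULI)
    total = 0
    for r, m in zip(residues, MODULI):
        m_i = big_m // m
        total += r * m_i * pow(m_i, -1, m)  # modular inverse of m_i mod m
    return total % big_m


def lns_mul(x, y):
    """Multiply two positive reals by adding their base-2 logarithms."""
    return 2.0 ** (math.log2(x) + math.log2(y))


product_rns = from_rns(rns_mul(to_rns(23), to_rns(41)))
product_lns = lns_mul(8.0, 4.0)
```

The RNS result is only correct while the true product stays below the product of the moduli (1001 here), which is one reason practical designs choose the moduli set carefully.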

Polarization-Insensitive Waveguide Schottky Photodetectors Based on Mode Hybridization Effects in Asymmetric Plasmonic Waveguides

Sensors ◽

10.3390/s20236885 ◽

2020 ◽

Vol 20 (23) ◽

pp. 6885

Author(s):

Qian Li ◽

Junjie Tu ◽

Yang Tian ◽

Yanli Zhao

Keyword(s):

Integrated Circuits ◽

High Speed ◽

Large Scale ◽

Layer Structure ◽

Single Layer ◽

Absorption Enhancement ◽

Plasmonic Waveguides ◽

Wavelength Band ◽

Te Mode ◽

Polarization Insensitive

Two types of configurations are theoretically proposed to achieve high responsivity polarization-insensitive waveguide Schottky photodetectors, i.e., a dual-layer structure for 1.55 µm and a single-layer structure for 2 µm wavelength band. Mode hybridization effects between quasi-TM modes and sab1 modes in plasmonic waveguides are first presented and further investigated under diverse metal types with different thicknesses in this work. By utilizing the mode hybridization effects between quasi-TE mode and aab0 mode, and also quasi-TM and sab1 mode in our proposed hybrid plasmonic waveguide, light absorption enhancement can be achieved under both TE and TM incidence within ultrathin and short metal stripes, thus resulting in a considerable responsivity for Si-based sub-bandgap photodetection. For 1.55 µm wavelength, the Au-6 nm-thick device can achieve absorptance of 99.6%/87.6% and responsivity of 138 mA·W−1/121.2 mA·W−1 under TE/TM incidence. Meanwhile, the Au-5 nm-thick device can achieve absorptance of 98.4%/90.2% and responsivity of 89 mA·W−1/81.7 mA·W−1 under TE/TM incidence in 2 µm wavelength band. The ultra-compact polarization-insensitive waveguide Schottky photodetectors may have promising applications in large scale all-Si photonic integrated circuits for high-speed optical communication.

Multi-Layer Ceramic Packaging for High Frequency Mixed-Signal VLSI ASICS

Journal of Microelectronics and Electronic Packaging ◽

10.4071/1551-4897-6.1.38 ◽

2009 ◽

Vol 6 (1) ◽

pp. 38-41

Author(s):

Lewis Dove

Keyword(s):

Integrated Circuits ◽

High Frequency ◽

High Speed ◽

Large Scale ◽

Flip Chip ◽

Data Converters ◽

Mixed Signal ◽

Large Scale Integration ◽

Circuit Techniques ◽

Scale Integration

Mixed-signal Application Specific Integrated Circuits (ASICs) have traditionally been used in test and measurement applications for a variety of functions such as data converters, pin electronics circuitry, drivers, and receivers. Over the past several years, the complexity, power density, and bandwidth of these chips have increased dramatically. This has necessitated dramatic changes in the way these chips are packaged. As the chips have become true VLSI (Very Large Scale Integration) ICs, the number of I/Os has become too large to interconnect with wire bonds. Thus, it has become necessary to use flip chip interconnects. Also, the bandwidth of the high-speed signal paths and clocks has increased into the multi-Gbit or GHz ranges. This requires packages with good high-frequency performance that are designed using microwave circuit techniques to optimize signal integrity and to minimize signal crosstalk and noise.
