OPTIMIZED METHOD OF NETWORK PACKET ROUTING

Author(s):  
A. S. Kotlyarov

In this paper, we review the possibility of using a block search method for the network packet routing task in high-speed computer networks. The method minimizes hardware costs for large-scale routing tables. To sustain the maximum data transfer rate, packets must be routed in real time. The existing solution presumes concurrent use of several routing devices, each performing an independent search of records by the bisection method. However, when the channel rate exceeds 10 Gb/s and the number of routes exceeds 2^20, this leads to high hardware costs. To reduce them, we suggest a modified block search method, which differs from the classic one by its parallel-pipeline form of search. We present an evaluation of the minimum field-programmable gate array hardware costs for the network packet routing task. Analysis of the results confirms the efficiency of the suggested method in comparison with existing solutions: the hardware costs were reduced by a factor of five.
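For orientation only (this sketch is not from the paper), the following Python snippet shows the classic sequential block search over a sorted routing table that the proposed parallel-pipeline FPGA variant builds on; the table layout, block size and field names are illustrative assumptions.

    import math

    def block_search(table, key):
        # Classic block (jump) search over a sorted list of (prefix, next_hop) records.
        # The paper's contribution is a parallel-pipeline hardware form of this search;
        # only the sequential software baseline is sketched here.
        n = len(table)
        if n == 0:
            return None
        step = math.isqrt(n) or 1                         # block size ~ sqrt(n)
        block_end = step
        while block_end < n and table[block_end - 1][0] < key:
            block_end += step                             # 1) locate the candidate block
        for prefix, next_hop in table[block_end - step:min(block_end, n)]:
            if prefix == key:                             # 2) scan inside that block
                return next_hop
        return None

    # toy routing table sorted by destination prefix (illustrative values)
    routes = sorted([(0x0A000000, "if0"), (0x0A000100, "if1"), (0xC0A80000, "if2")])
    print(block_search(routes, 0xC0A80000))               # -> "if2"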

Author(s):  
Esma Yildirim ◽  
Tevfik Kosar

The emerging petascale increase in the data produced by large-scale scientific applications necessitates innovative solutions for efficient transfer of data through the advanced infrastructure provided by today's high-speed networks and complex computer architectures (e.g. supercomputers, parallel storage systems). Although current optical networking technology has reached transport speeds of 100 Gbps, applications still suffer from inadequate transport protocols and end-system bottlenecks such as processor speed, disk I/O speed and network interface card limits, which cause underutilization of the existing network infrastructure and let applications achieve only a small portion of the theoretical performance. Fortunately, with the parallelism provided by the multiple CPUs/nodes and multiple disks present in today's systems, these bottlenecks can be eliminated. However, it is necessary to understand the characteristics of the end systems and the transport protocol used. In this book chapter, we analyze methodologies that improve the data transfer speed of applications and provide the maximal speeds that can be obtained from the available end-system resources and high-speed networks through the use of end-to-end dataflow parallelism.
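As a rough, hypothetical illustration of the end-system dataflow parallelism discussed in this chapter (not code from it), the sketch below copies a file in fixed-size chunks using several worker threads so that disk and CPU activity can overlap; the paths, chunk size and worker count are placeholders.

    import os
    from concurrent.futures import ThreadPoolExecutor

    CHUNK = 8 * 1024 * 1024                     # 8 MiB chunks (illustrative)

    def copy_chunk(src, dst, offset, length):
        # each worker handles one chunk, so several disk reads/writes are in flight
        with open(src, "rb") as fin, open(dst, "r+b") as fout:
            fin.seek(offset)
            fout.seek(offset)
            fout.write(fin.read(length))

    def parallel_copy(src, dst, workers=4):
        size = os.path.getsize(src)
        with open(dst, "wb") as f:              # pre-allocate the destination file
            f.truncate(size)
        with ThreadPoolExecutor(max_workers=workers) as pool:
            for off in range(0, size, CHUNK):
                pool.submit(copy_chunk, src, dst, off, min(CHUNK, size - off))

    # parallel_copy("/data/in.bin", "/scratch/out.bin", workers=8)   # hypothetical paths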


2019 ◽  
Vol 36 (1) ◽  
pp. 1-9 ◽  
Author(s):  
Vahid Jalili ◽  
Enis Afgan ◽  
James Taylor ◽  
Jeremy Goecks

Abstract Motivation Large biomedical datasets, such as those from genomics and imaging, are increasingly being stored on commercial and institutional cloud computing platforms. This is because cloud-scale computing resources, from robust backup to high-speed data transfer to scalable compute and storage, are needed to make these large datasets usable. However, one challenge for large-scale biomedical data on the cloud is providing secure access, especially when datasets are distributed across platforms. While there are open Web protocols for secure authentication and authorization, these protocols are not in wide use in bioinformatics and are difficult to use even for technologically sophisticated users. Results We have developed a generic and extensible approach for securely accessing biomedical datasets distributed across cloud computing platforms. Our approach combines OpenID Connect and OAuth2, best-practice Web protocols for authentication and authorization, together with Galaxy (https://galaxyproject.org), a web-based computational workbench used by thousands of scientists across the world. With our enhanced version of Galaxy, users can access and analyze data distributed across multiple cloud computing providers without any special knowledge of access/authorization protocols. Our approach does not require users to share permanent credentials (e.g. username, password, API key), instead relying on automatically generated temporary tokens that refresh as needed. Our approach is generalizable to most identity providers and cloud computing platforms. To the best of our knowledge, Galaxy is the only computational workbench where users can access biomedical datasets across multiple cloud computing platforms using best-practice Web security approaches and thereby minimize risks of unauthorized data access and credential use. Availability and implementation Freely available for academic and commercial use under the open-source Academic Free License (https://opensource.org/licenses/AFL-3.0) from the following GitHub repositories: https://github.com/galaxyproject/galaxy and https://github.com/galaxyproject/cloudauthz.
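As a generic illustration of the token-based flow described above (a sketch, not the Galaxy/CloudAuthz implementation), the snippet below exchanges a long-lived refresh token for a short-lived access token at an OAuth2 token endpoint; the endpoint URL and client identifiers are placeholders.

    import requests

    TOKEN_ENDPOINT = "https://identity.example.org/oauth2/token"   # placeholder IdP

    def get_temporary_token(refresh_token, client_id, client_secret):
        # Standard OAuth2 refresh-token grant: no permanent user credentials
        # (username, password, API key) are transmitted; the returned access
        # token expires and is refreshed again as needed.
        resp = requests.post(
            TOKEN_ENDPOINT,
            data={
                "grant_type": "refresh_token",
                "refresh_token": refresh_token,
                "client_id": client_id,
                "client_secret": client_secret,
            },
            timeout=10,
        )
        resp.raise_for_status()
        payload = resp.json()
        return payload["access_token"], payload.get("expires_in")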


2021 ◽  
Vol 2089 (1) ◽  
pp. 012069
Author(s):  
A. Pradeep kumar ◽  
Y. Devendar Reddy ◽  
T. Srinivas Reddy ◽  
K. Jamal

Abstract Large-scale neural network (NN) accelerators typically have multiple processing nodes that can be implemented as a multi-core chip and organized on a network-on-chip (NoC), with heavy traffic between the nodes corresponding to neurons. Several NoC-based NN chips are linked through chip-to-chip interconnection networks to further enhance overall neural acceleration capacity. Large volumes of on-chip or cross-chip multicast traffic further complicate the construction of such interconnection networks and create a barrier of device capacity and resources for the NN. In this paper, we refer to the intra-chip and inter-chip communication strategies for NN accelerators as neuron connection. A fault-tolerant routing system for the neural NoC interconnect is implemented in this paper. For intra-chip communication, we recommend crossbar arbitration placement, virtual channels, and path-based parallelization strategies for virtual channel routing, resulting in higher NoC throughput at lower hardware cost. For inter-chip communication, a lightweight NoC-compatible chip-to-chip interconnection scheme is proposed for multicast-based data traffic to enable efficient interconnection of NoC-based NN chips. The proposed methods are tested with four Field Programmable Gate Arrays (FPGAs) hosting four hard-wired deep neural network (DNN) chips. The experimental results illustrate that the proposed interconnection network effectively achieves high throughput in handling the DNN data traffic over the chip-to-chip links.
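The routing hardware itself is not given in the abstract; purely as a conceptual sketch of the path-based multicast idea mentioned above, the snippet below routes a single packet along one ordered path that visits every destination router of a 2D-mesh NoC, instead of replicating the packet per destination. Mesh coordinates and the visiting order are illustrative assumptions.

    def xy_route(src, dst):
        # dimension-ordered (XY) hop sequence between two mesh routers
        (x, y), (dx, dy) = src, dst
        hops = []
        while x != dx:
            x += 1 if dx > x else -1
            hops.append((x, y))
        while y != dy:
            y += 1 if dy > y else -1
            hops.append((x, y))
        return hops

    def path_based_multicast(src, destinations):
        # one worm visits all destinations along a single ordered path,
        # so the multicast needs only one packet instead of one per destination
        path, current = [src], src
        for dst in sorted(destinations):        # simple ordering heuristic
            path += xy_route(current, dst)
            current = dst
        return path

    # one multicast path from router (0, 0) to three destination routers (illustrative)
    print(path_based_multicast((0, 0), [(2, 1), (1, 3), (3, 3)]))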


2013 ◽  
Vol 22 (04) ◽  
pp. 1350025
Author(s):  
STAVROS P. DOKOUZYANNIS ◽  
ARGIRIS P. MOKIOS

This paper analyzes the design automation of embedded Systolic Array Processors (SAPs) on large-scale Field Programmable Gate Array (FPGA) devices. SAPs are hardware implementations of a class of iterative, high-level language algorithms for applications where processing speed is the principal design goal. Embedding SAPs onto FPGAs is a complex process. The optimization phase of this process reduces the SAP significantly, so that less FPGA area is occupied by the embedded design without any loss in final performance. The present paper examines the effect of Projection Vectors (PVs) and Task Scheduling Vectors (TSVs) on the optimization process. Two optimization approaches are examined: technology mapping using the FlowMap and FlowPack algorithms, and optimization via logic synthesis using the Xilinx Synthesis Tool. The multiplication of matrices, with entries of up to 32-bit integers, has been taken as the sample space for the experiments conducted. The results confirm that the selection of PV and TSV greatly affects the number of input/output signal connections of the FPGA, while the selection of an optimization approach affects the final number of logic resources occupied on the targeted device.
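For readers unfamiliar with SAPs, the snippet below (not from the paper) emulates an output-stationary systolic array for matrix multiplication, the benchmark used here: each cell keeps one accumulator and performs one multiply-accumulate per step as operands stream through. Choosing a PV and TSV corresponds to choosing how these loop indices are mapped onto physical cells and clock cycles.

    def systolic_matmul(A, B):
        # software emulation of an output-stationary systolic array:
        # cell (i, j) receives a[i][k] from the left and b[k][j] from above
        # at step k and performs one multiply-accumulate
        n, m, p = len(A), len(B), len(B[0])
        acc = [[0] * p for _ in range(n)]       # one accumulator per cell
        for k in range(m):                      # systolic time steps
            for i in range(n):
                for j in range(p):
                    acc[i][j] += A[i][k] * B[k][j]
        return acc

    print(systolic_matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))   # [[19, 22], [43, 50]]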


Author(s):  
Nguyen Trinh ◽  
Anh Le Thi Kim ◽  
Hung Nguyen ◽  
Linh Tran

Content addressable memory (CAM) and ternary content addressable memory (TCAM) are specialized high-speed memories for data searching. CAM and TCAM have many applications in network routing, packet forwarding and Internet data centers. These types of memories have drawbacks in power dissipation and area. As field-programmable gate arrays (FPGAs) are increasingly used for network acceleration applications, the demand to integrate TCAM and CAM on FPGA is growing. Because most FPGAs do not support native TCAM and CAM hardware, methods of implementing algorithmic TCAM using FPGA resources have been proposed in recent years. Algorithmic TCAM on FPGA has the advantages of the FPGA's low power consumption and high integration scalability. This paper proposes a scalable algorithmic TCAM design on FPGA. The design uses memory blocks to negate the power dissipation issue and data collision to save area. The paper also presents the design of a 256 x 104-bit algorithmic TCAM on an Intel Cyclone V FPGA and evaluates the performance and applicability of the design at large scale and in future developments.
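The register-transfer-level design is in the paper itself; only to illustrate the ternary-match semantics that an algorithmic TCAM has to reproduce with RAM blocks, the sketch below stores each entry as a (value, mask) pair in which zero mask bits are "don't care" and returns the highest-priority match. Widths and rules are illustrative.

    def tcam_lookup(entries, key):
        # entries: list of (value, mask) pairs; bits where mask == 0 are don't-care.
        # A hardware TCAM evaluates all entries in parallel; an algorithmic TCAM
        # on FPGA reproduces the same first-match result using RAM-block lookups.
        for priority, (value, mask) in enumerate(entries):
            if (key & mask) == (value & mask):
                return priority
        return None

    rules = [
        (0b10100000, 0b11110000),   # matches 1010xxxx (highest priority)
        (0b10000000, 0b11000000),   # matches 10xxxxxx
    ]
    print(tcam_lookup(rules, 0b10101011))   # -> 0
    print(tcam_lookup(rules, 0b10010000))   # -> 1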


In DSP, the most common function is the Finite Impulse Response (FIR) filter, which is realized in Field Programmable Gate Arrays (FPGAs). The systolic FIR filter architecture offers attractive models for efficient Very Large Scale Integration (VLSI) computation. High speed is the major concern for fast computation in real-time Digital Signal Processing (DSP) applications. The conventional systolic FIR filter method uses a general array multiplier structure, which takes more time to compute and has high design complexity at low power. To overcome this problem, a systolic FIR filter utilizing the Bypass Feed Direct Multiplier (BFDM) is proposed. The proposed 16-tap systolic FIR filter with parallel processing offers lower delay and lower design complexity, and is suitable for image and signal processing applications. The proposed method is simulated using the Xilinx ISE 12.4 tool and the functions are evaluated with ModelSim 6.3c.
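The BFDM hardware cannot be reconstructed from this summary; as a behavioural reference only, the sketch below computes the response a 16-tap systolic FIR filter must produce, with one multiply-accumulate per tap so that each loop iteration corresponds to one processing element of the systolic chain. The coefficients and test input are placeholders.

    def fir_filter(samples, coeffs):
        # behavioural model of an N-tap FIR filter: y[n] = sum_k h[k] * x[n - k];
        # in the systolic realization each tap k is one processing element that
        # performs a single multiply-accumulate per clock and passes data on
        taps = len(coeffs)
        delay_line = [0] * taps
        out = []
        for x in samples:
            delay_line = [x] + delay_line[:-1]                     # shift register
            out.append(sum(h * d for h, d in zip(coeffs, delay_line)))
        return out

    coeffs16 = [1] * 16                          # placeholder 16-tap impulse response
    print(fir_filter([1, 0, 0, 0, 2], coeffs16)) # -> [1, 1, 1, 1, 3]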


Author(s):  
Masaya Yoshikawa ◽  
Hidekazu Terai

The floorplanning problem, a basic step in the layout design of very large-scale integrated circuits (VLSI), deals with placing rectangular modules at maximum density. Many studies have addressed this problem using sequence pairs with genetic algorithms (GAs), but this generally requires much calculation time. We propose an architecture for high-speed floorplanning using a sequence pair based on a GA. The proposed architecture, implemented on a field-programmable gate array (FPGA), achieves high-speed processing. Measurements evaluating the proposed architecture demonstrated speeds 37.1 times greater than software processing.
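To make the representation concrete (this is the textbook sequence-pair semantics, not the authors' FPGA architecture): module a lies left of b exactly when a precedes b in both sequences, and above/below b when the orders differ; coordinates then follow from longest paths over module sizes, as in the sketch below with illustrative widths.

    def sequence_pair_x(gamma_plus, gamma_minus, widths):
        # a must lie left of b  <=>  a precedes b in both gamma_plus and gamma_minus;
        # x[b] is the longest path over the widths of all modules forced left of b.
        # (y-coordinates are obtained symmetrically from the vertical constraints.)
        pos_minus = {m: i for i, m in enumerate(gamma_minus)}
        x = {}
        for i, b in enumerate(gamma_plus):      # gamma_plus order is topological
            left = [a for a in gamma_plus[:i] if pos_minus[a] < pos_minus[b]]
            x[b] = max((x[a] + widths[a] for a in left), default=0)
        return x

    # three modules with illustrative widths
    print(sequence_pair_x(("a", "b", "c"), ("b", "a", "c"), {"a": 4, "b": 2, "c": 3}))
    # -> {'a': 0, 'b': 0, 'c': 4}   (a and b stacked vertically, c to their right)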


Author(s):  
Carlos Lago-Peñas ◽  
Anton Kalén ◽  
Miguel Lorenzo-Martinez ◽  
Roberto López-Del Campo ◽  
Ricardo Resta ◽  
...  

This study aimed to evaluate the effects of playing position, match location (home or away), quality of opposition (strong or weak), effective playing time (total time minus stoppages), and score-line on physical match performance in professional soccer players using a large-scale analysis. A total of 10,739 individual match observations of outfield players competing in the Spanish La Liga during the 2018–2019 season were recorded using a computerized tracking system (TRACAB, ChyronHego, New York, USA). The players were classified into five positions (central defenders, players = 94; external defenders, players = 82; central midfielders, players = 101; external midfielders, players = 72; and forwards, players = 67) and the following match running performance categories were considered: total distance covered, low-speed running (LSR) distance (0–14 km/h), medium-speed running (MSR) distance (14–21 km/h), high-speed running (HSR) distance (>21 km/h), very HSR (VHSR) distance (21–24 km/h), and sprint distance (>24 km/h). Overall, match running performance was highly dependent on situational variables, especially the score-line condition (winning, drawing, losing). Moreover, the score-line affected players' running performance differently depending on their playing position. Losing status increased the total distance and the distance covered at MSR, HSR, VHSR and sprint by defenders, while attacking players showed the opposite trend. These findings may help coaches and managers to better understand the effects of situational variables on physical performance in La Liga and could be used to develop a model for predicting the physical activity profile in competition.


2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Florian Roessler ◽  
André Streek

Abstract In laser processing, the achievable throughput scales directly with the available average laser power. To avoid unwanted thermal damage due to high pulse energy or heat accumulation at MHz repetition rates, the energy must be distributed over the workpiece. Polygon mirror scanners enable high deflection speeds and thus a proper energy distribution within a short processing time. The requirements of laser micro processing with average laser powers of up to 10 kW and scan speeds of up to 1000 m/s result in a 30 mm aperture, two-dimensional polygon mirror scanner with a patented low-distortion mirror configuration. In combination with a field programmable gate array-based real-time logic, position-true high-accuracy laser switching is enabled for 2D, 2.5D, or 3D laser processing, capable of drilling holes in multi-pass ablation or engraving. A specially developed real-time shifter module within the high-speed logic allows, in combination with external axes, material processing on the fly and hence the processing of workpieces much larger than the scan field.
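The FPGA logic itself is not described in code here; the snippet below only illustrates, under assumed names and a constant feed rate, the coordinate bookkeeping such a real-time shifter module performs for on-the-fly processing: the commanded scan-field position is shifted by the distance the external axis has already travelled, so that switching points remain position-true on the moving workpiece.

    def shifted_scan_position(target_xy, axis_speed_mm_s, t_s):
        # target_xy       : commanded position within the static scan field, in mm
        # axis_speed_mm_s : assumed constant feed of the external axis along x, in mm/s
        # t_s             : time since the job segment started, in s
        x, y = target_xy
        return (x - axis_speed_mm_s * t_s, y)   # compensate workpiece motion along x

    # switch-on point for a feature nominally at (12.0 mm, 3.5 mm), 40 ms into a pass
    # with the workpiece moving at 500 mm/s (illustrative values)
    print(shifted_scan_position((12.0, 3.5), 500.0, 0.040))    # -> (-8.0, 3.5)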

