central processor Latest Research Papers

A GPU-based Orthogonal Matrix Factorization Algorithm that Produces a Two-Diagonal Shape

2021 ◽ Vol 4 ◽ pp. 10-15

Author(s): Gennadii Malaschonok ◽ Serhii Sukharskyi

Keyword(s): Graphics Processing Unit ◽ Matrix Decomposition ◽ Recent Decade ◽ Orthogonal Matrix ◽ Processing Unit ◽ Central Processor ◽ Graphic Processing Units ◽ QR Algorithm ◽ Java Language ◽ Graphics Processing

With the growth of Big Data and of the fields related to artificial intelligence, fast and efficient computing has become one of today's most important needs. Over the past decade, graphics processing unit (GPU) computing has therefore developed actively, letting scientists and developers use the thousands of cores that GPUs offer for intensive computations. The goal of this research is to implement the orthogonal decomposition of a matrix by applying a series of Householder transformations in Java with the JCuda library, and to study the benefits of this approach. Several related papers were examined. Malaschonok and Savchenko introduced an improved version of the QR algorithm for this purpose [4] and achieved better results; however, according to another team of researchers, Lahabar and Narayanan [6], the Householder algorithm is more promising for GPUs. They used Float numbers, whereas we use Double, and we are also working on a new BigDecimal type for CUDA. In addition, there is still no solution for handling huge matrices, where calculation errors might occur. This work studies and implements the algorithm of orthogonal matrix decomposition, which is the first part of the SVD algorithm. We present an implementation of matrix bidiagonalization and of the calculation of the orthogonal factors by the Householder method in the JCuda environment on a graphics processor, together with a central processor implementation for comparison. We experimentally measured the acceleration obtained with the graphics processor relative to the central processor implementation. We show a speedup of up to 53 times over the CPU implementation on a large matrix (size 2048), and even better results with more advanced GPUs.
At the same time, we still observe larger calculation errors on graphics processing units because of synchronization problems. We compared execution on different platforms (Windows 10 and Arch Linux) and found that the computation speed is almost the same on both. The results show that the GPU achieves better performance, but this approach involves more implementation difficulties.
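
The bidiagonalization step described above can be illustrated with a minimal CPU-only sketch in Python. This is not the authors' JCuda code: the function names `householder_vector` and `bidiagonalize` are hypothetical, the matrix is assumed to have at least as many rows as columns, and the orthogonal factors are applied but not accumulated.

```python
import math


def householder_vector(x):
    """Unit vector v such that (I - 2 v v^T) x is a multiple of e1 (None if x is zero)."""
    norm = math.sqrt(sum(t * t for t in x))
    if norm == 0.0:
        return None
    v = list(x)
    # Adding sign(x[0]) * norm to the first entry avoids cancellation.
    v[0] += math.copysign(norm, x[0])
    v_norm = math.sqrt(sum(t * t for t in v))
    return [t / v_norm for t in v]


def bidiagonalize(a):
    """Return the upper-bidiagonal form of a (assumes rows >= columns)."""
    b = [row[:] for row in a]
    m, n = len(b), len(b[0])
    for k in range(n):
        # Left reflection: zero column k below the diagonal.
        v = householder_vector([b[i][k] for i in range(k, m)])
        if v is not None:
            for j in range(k, n):
                dot = sum(v[i - k] * b[i][j] for i in range(k, m))
                for i in range(k, m):
                    b[i][j] -= 2.0 * v[i - k] * dot
        # Right reflection: zero row k to the right of the superdiagonal.
        if k < n - 2:
            v = householder_vector(b[k][k + 1:])
            if v is not None:
                for i in range(k, m):
                    dot = sum(v[j - k - 1] * b[i][j] for j in range(k + 1, n))
                    for j in range(k + 1, n):
                        b[i][j] -= 2.0 * v[j - k - 1] * dot
    return b


if __name__ == "__main__":
    matrix = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0], [2.0, 1.0, 0.0]]
    for row in bidiagonalize(matrix):
        print(["%.4f" % x for x in row])
```

Each reflection touches every remaining column (or row) independently, which is the part a GPU implementation would typically distribute across threads.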

Real-Time Strain and Elasticity Imaging in Phase-Sensitive Optical Coherence Elastography Using a Computationally Efficient Realization of the Vector Method

Photonics ◽ 10.3390/photonics8120527 ◽ 2021 ◽ Vol 8 (12) ◽ pp. 527

Author(s): Vladimir Y. Zaitsev ◽ Sergey Y. Ksenofontov ◽ Alexander A. Sovetsky ◽ Alexander L. Matveyev ◽ Lev A. Matveev ◽ ...

Keyword(s): Real Time ◽ Least Squares Method ◽ Local Strain ◽ Biological Tissues ◽ Central Processor ◽ Optical Coherence ◽ Computationally Efficient ◽ Vector Method ◽ Real Time Processing ◽ Phase Sensitive

We present a real-time realization of OCT-based elastographic mapping of local strains and of the distribution of the Young’s modulus in biological tissues, which is in high demand for biomedical use. The described variant exploits the principle of Compression Optical Coherence Elastography (C-OCE) and processes phase-sensitive OCT signals. The strain is estimated by finding local axial gradients of interframe phase variations. Instead of the popular least-squares method for finding these gradients, we use the vector approach, one advantage of which is increased computational efficiency. Here, we present a modified, especially fast variant of this approach. In contrast to conventional correlation-based methods and previously used phase-resolved methods, the described method does not use any search operations or local calculations over a sliding window. Rather, it obtains local strain maps (and then elasticity maps) using several transformations represented as matrix operations applied to entire complex-valued OCT scans. We first elucidate how the proposed method differs from the previously used correlational and phase-resolved methods and then describe its realization in a medical OCT device, in which a “typical” central processor (e.g., Intel Core i7-8850H) is sufficient for real-time processing. Representative examples of elastographic images obtained on the fly are given. These results open prospects for broad use of affordable OCT devices for high-resolution elastographic visualization in numerous biomedical applications, including clinical use.
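
The basic idea of estimating an axial phase gradient from complex-valued signals, without unwrapping the phase or fitting a line, can be sketched as follows. This is only an illustration of the general vector principle on a single synthetic A-scan, not the fast variant the abstract describes; the function name `phase_gradient_vector` and the synthetic signals are assumptions.

```python
import cmath


def phase_gradient_vector(scan_before, scan_after):
    """Estimate the mean axial gradient (rad per pixel) of the interframe phase.

    Both inputs are complex-valued A-scans of equal length.
    """
    # Phasors whose angle is the interframe phase variation at each depth.
    inter = [after * before.conjugate()
             for after, before in zip(scan_after, scan_before)]
    # Products of depth neighbours: each angle is a local phase increment.
    # Summing the phasors averages the increments without phase unwrapping.
    total = sum(inter[z + 1] * inter[z].conjugate()
                for z in range(len(inter) - 1))
    return cmath.phase(total)


if __name__ == "__main__":
    depth = 200
    true_gradient = 0.05  # rad per pixel, assumed for the synthetic example
    before = [cmath.exp(1j * 0.3 * z) for z in range(depth)]
    after = [s * cmath.exp(1j * true_gradient * z) for z, s in enumerate(before)]
    print(phase_gradient_vector(before, after))
```

In the synthetic example the accumulated phase exceeds 2π over the scan, yet the estimate is unaffected because only neighbour-to-neighbour increments enter the sum.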

Multilink Internet-of-Things Sensor Communication Based on Bluetooth Low Energy Considering Scalability

Electronics ◽ 10.3390/electronics10192335 ◽ 2021 ◽ Vol 10 (19) ◽ pp. 2335

Author(s): Dong-Suk Ryu ◽ Yeung-Mo Yeon ◽ Seung-Hee Kim

Keyword(s): Internet Of Things ◽ Low Energy ◽ Sensor Nodes ◽ Sensor Data ◽ Central Processor ◽ Bluetooth Low Energy ◽ Switching Algorithm ◽ Data Errors ◽ The Internet Of Things ◽ Central Device

As the growth rate of the internet-of-things (IoT) sensor market is expected to exceed 30%, technology that can easily collect and process large volumes of diverse sensor data is increasingly required. However, conventional multilink IoT sensor communication based on Bluetooth Low Energy (BLE) can handle only up to 19 peripheral nodes per central device. This study proposes a way to increase the number of IoT sensor nodes while minimizing the addition of central processors: a new BLE-based group-switching algorithm expands the number of peripheral nodes that each central device can handle. The study also verifies the relevance of the approach to industrial applications. The resulting device environment lowered the likelihood of data errors and equipment trouble caused by communication interference between central processors, which is a critical advantage in industrial use. The scalability and other benefits of the group-switching algorithm are expected to help accelerate various services based on BLE 5 wireless communication by overcoming the limit of 19 nodes per central device in conventional multilink IoT sensor communication.
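
The abstract does not give the details of the group-switching algorithm. One plausible reading, sketched below purely as an assumption, is that peripheral nodes are partitioned into groups no larger than the 19-link limit and the central device serves one group per time slot in rotation. The names `make_groups` and `polling_schedule` are hypothetical.

```python
MAX_LINKS = 19  # per-central-device limit quoted in the abstract


def make_groups(nodes, max_links=MAX_LINKS):
    """Split the node list into consecutive groups of at most max_links nodes."""
    return [nodes[i:i + max_links] for i in range(0, len(nodes), max_links)]


def polling_schedule(nodes, slots, max_links=MAX_LINKS):
    """Return the group served in each time slot, rotating round-robin (assumed policy)."""
    groups = make_groups(nodes, max_links)
    if not groups:
        return []
    return [groups[slot % len(groups)] for slot in range(slots)]


if __name__ == "__main__":
    sensors = ["node-%02d" % i for i in range(50)]
    for slot, group in enumerate(polling_schedule(sensors, 4)):
        print(slot, len(group), group[0], "...", group[-1])
```

Under this reading a single central device reaches every node once per rotation, at the cost of each node being connected for only a fraction of the time.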

Development of software and algorithms of parallel learning of artificial neural networks using CUDA technologies

Technology audit and production reserves ◽ 10.15587/2706-5448.2021.239784 ◽ 2021 ◽ Vol 5 (2(61)) ◽ pp. 21-25

Author(s): Yaroslav Sokolovskyy ◽ Denys Manokhin ◽ Yaroslav Kaplunsky ◽ Olha Mokrytska

Keyword(s): Neural Network ◽ Neural Networks ◽ Artificial Neural Networks ◽ Medical Image Analysis ◽ Cloud Services ◽ Small Data ◽ Central Processor ◽ CUDA Technology ◽ Artificial Neural ◽ Cloud Technologies

The object of research is the parallelization of the learning process of artificial neural networks in order to automate medical image analysis using the Python programming language, the PyTorch framework and Compute Unified Device Architecture (CUDA) technology. The operation of this framework is based on the Define-by-Run model. The available cloud technologies for the task and the algorithms for training artificial neural networks are analyzed. A modified U-Net architecture from the MedicalTorch library was used. It was chosen because the task needs a network that can learn effectively from small data sets: in medicine, large datasets are hard to obtain because of the confidentiality requirements for data of this kind. The resulting information system is able to perform the tasks set for it and provides a highly user-friendly interface and all the tools needed to simplify and automate the visualization and analysis of data. The efficiency of neural network training on the central processor (CPU) and on the graphics processor (GPU) using CUDA technology is compared. Cloud technology was used in the study, with Google Colab and Microsoft Azure considered among the cloud services. Colab was first used to build a prototype. The Azure service was then used to train the finished architecture of the artificial neural network efficiently. Measurements were performed using cloud technologies in both services. The Adam optimizer was used to train the model. CPU training times were also measured in order to estimate the acceleration obtained through GPU computing with CUDA and cloud technologies.
The model developed during the research showed satisfactory results on the Jaccard and Dice metrics for the problem. A key factor in the success of this study was cloud computing services.
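
For reference, the two segmentation metrics named above can be computed for binary masks as in the following sketch. The function names and the convention of returning 1.0 when both masks are empty are assumptions, not details taken from the paper.

```python
def jaccard_index(pred, target):
    """Jaccard index |A ∩ B| / |A ∪ B| for two flat binary masks."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    union = sum(1 for p, t in zip(pred, target) if p or t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


def dice_coefficient(pred, target):
    """Dice coefficient 2 |A ∩ B| / (|A| + |B|) for two flat binary masks."""
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    size_sum = sum(1 for p in pred if p) + sum(1 for t in target if t)
    return 1.0 if size_sum == 0 else 2.0 * intersection / size_sum


if __name__ == "__main__":
    predicted = [1, 1, 0, 0]
    reference = [1, 0, 1, 0]
    print(jaccard_index(predicted, reference), dice_coefficient(predicted, reference))
```

The two metrics are monotonically related by D = 2J / (1 + J), so they rank segmentations identically; reporting both mainly helps comparison with other papers.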

Power Minimization Using Rate Splitting With Statistical CSI in Cloud-Radio Access Networks

Frontiers in Communications and Networks ◽ 10.3389/frcmn.2021.716618 ◽ 2021 ◽ Vol 2

Author(s): Alaa Alameer Ahmad ◽ Hayssam Dahrouj ◽ Anas Chaaban ◽ Tareq Y. Al-Naffouri ◽ Aydin Sezgin ◽ ...

Keyword(s): Mean Squared Error ◽ Sample Average Approximation ◽ Base Stations ◽ Practical Case ◽ Central Processor ◽ Power Minimization ◽ Estimation Errors ◽ Minimum Mean Squared Error ◽ Radio Access ◽ Rate Splitting

Minimizing the power consumption in mobile communication networks while ensuring a minimum quality of service (QoS) for applications is essential in light of the unprecedented expected increase in the number of connected devices and the associated data traffic beyond the fifth generation of wireless networks (B5G). This paper considers a cloud-radio access network (C-RAN) model where a central processor (CP) is connected to the base stations (BSs) via limited capacity fronthaul links. In the context of our C-RAN setting, we consider the practical case where the CP has only statistical knowledge of channel state information (CSI). While conventional wireless systems adopt the treating interference as noise (TIN) strategy to deal with the interference in the network, this paper instead considers that the CP applies the rate splitting (RS) strategy by dividing each user’s message into two parts: a private part to be decoded by the intended user only and a common part to be decoded by a subset of users, for the sole reason of interference mitigation in the network. To best account for the channel estimation errors, this paper addresses the problem of transmit power minimization under minimum QoS constraints on the achievable ergodic rate per user, so as to determine the beamforming vectors of the private and common messages as well as the rate allocated to all the users. The considered problem is of stochastic, complex, and non-convex nature. This paper addresses the problem intricacies through an iterative approach that leverages both the sample average approximation (SAA) technique and the weighted minimum mean squared error (WMMSE) algorithm to obtain a stationary point of the optimization problem in the asymptotic regime. The numerical results demonstrate the gain achieved with the RS strategy as compared to TIN, especially under high QoS requirements.
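
The sample average approximation (SAA) idea mentioned above replaces an expectation over the random channel with a mean over drawn samples. The sketch below shows only that idea for a single link; it is far simpler than the paper's multi-user beamforming problem. The Rayleigh-fading model, the function name `ergodic_rate_saa` and all parameter values are assumptions.

```python
import math
import random


def ergodic_rate_saa(snr, num_samples=20000, seed=0):
    """Approximate E[log2(1 + snr * |h|^2)] by averaging over channel samples.

    Assumes Rayleigh fading, so the channel gain |h|^2 is exponential with unit mean.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        gain = rng.expovariate(1.0)
        total += math.log2(1.0 + snr * gain)
    return total / num_samples


if __name__ == "__main__":
    for snr in (1.0, 10.0, 100.0):
        print(snr, ergodic_rate_saa(snr))
```

Because the same samples are reused for every candidate design, the approximated objective becomes deterministic, which is what lets a deterministic iterative algorithm such as WMMSE be applied to it.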

USING THE CUDA TECHNOLOGY TO SPEED UP COMPUTATIONS IN PROBLEMS OF CHEMICAL KINETICS

NEWS OF THE NATIONAL ACADEMY OF SCIENCES OF THE REPUBLIC OF KAZAKHSTAN ◽ 10.32014/2021.2518-1726.19 ◽ 2021 ◽ Vol 2 (336) ◽ pp. 39-47

Author(s): M. Sarsembayev ◽ B. Urmashev ◽ O. Mamyrbayev ◽ M. Turdalyuly ◽ T. Sarsembayeva

Keyword(s): Main Idea ◽ Time Interval ◽ Central Processor ◽ Web Browser ◽ Elementary Reactions ◽ CUDA Technology ◽ Speed Up ◽ Reacting Systems ◽ To Receive ◽ 5Th Grade

The main idea of the implementation is to reduce calculation time and thereby support a multi-user mode by hosting the software on a server accessible through a web browser. Fourth- and fifth-order Runge-Kutta methods were used to model the kinetics of chemically reacting systems. To quantify the benefit of this approach, programs were written in C# for sequential computation on a central processor, and the CUDA platform was used for parallel computation on graphics processors. On the GPU, the computation was parallelized by assigning each reaction to an individual thread, which calculated the change in concentration of a given substance over a given time interval. Parallelization is performed over all elementary reactions, so as the number of reactions in the mechanism increases, the GPU computation shows a noticeable gain in time.
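
A fourth-order Runge-Kutta step for a small reaction mechanism can be sketched as follows. This is a sequential Python illustration, not the authors' C# or CUDA code; the mechanism format (rate constant, reactant index, product index), the restriction to first-order reactions and all names are assumptions.

```python
def reaction_rates(mechanism, conc):
    """Rate of every elementary first-order reaction.

    Each entry is independent of the others, so this is the loop a GPU
    version could map to one thread per reaction.
    """
    return [k * conc[src] for k, src, _ in mechanism]


def derivatives(mechanism, conc):
    """Time derivative of each concentration from the reaction rates."""
    d = [0.0] * len(conc)
    for (_, src, dst), rate in zip(mechanism, reaction_rates(mechanism, conc)):
        d[src] -= rate
        d[dst] += rate
    return d


def rk4_step(mechanism, conc, h):
    """Advance the concentrations by one classical RK4 step of size h."""
    def shifted(slope, scale):
        return [c + scale * s for c, s in zip(conc, slope)]

    k1 = derivatives(mechanism, conc)
    k2 = derivatives(mechanism, shifted(k1, h / 2))
    k3 = derivatives(mechanism, shifted(k2, h / 2))
    k4 = derivatives(mechanism, shifted(k3, h))
    return [c + h / 6 * (a + 2 * b + 2 * e + f)
            for c, a, b, e, f in zip(conc, k1, k2, k3, k4)]


if __name__ == "__main__":
    # Hypothetical chain A -> B -> C with rate constants 1.0 and 0.5.
    mech = [(1.0, 0, 1), (0.5, 1, 2)]
    state = [1.0, 0.0, 0.0]
    for _ in range(100):
        state = rk4_step(mech, state, 0.01)
    print(state)
```

With only two reactions the per-reaction work is trivial; the gain reported in the abstract comes from mechanisms with many elementary reactions, where the rate evaluation dominates.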

IoT based F-RAN Architecture using Cloud and Edge Detection System

Journal of ISMAC - June 2019 ◽ 10.36548/jismac.2021.1.003 ◽ 2021 ◽ Vol 3 (1) ◽ pp. 31-39

Author(s): Subarna Shakya

Keyword(s): Edge Detection ◽ Detection System ◽ Access Network ◽ Cell System ◽ Free Access ◽ Diagnostic Purpose ◽ Central Processor ◽ Probability Of Error ◽ Radio Access ◽ IoT Devices

This work considers a multi-cell Fog-Radio Access Network (F-RAN) architecture that takes into account the noisy interference from Internet of Things (IoT) devices, with transmission taking place in the uplink with grant-free access. An edge node is used to connect the devices present in every cell and will hold a reasonable capacity in the central processor. The readings obtained from the IoT devices are used to determine the field of correlated Quality of Interests in every cell and are transmitted using the Type-Based Multiple Access (TBMA) protocol. This is in contrast to the conventional protocols that are used for diagnostic purposes. In the proposed work, we implement the multi-cell F-RAN using cloud or edge detection and analyse it as a form of information-centric radio access. Cloud and edge detection are implemented and analysed in a multi-cell system. We implement model-based detectors and determine the asymptotic behavior of the probability of error at the edge as well as in the cloud. Similarly, data-driven cloud and edge detectors are used when statistical models are not available.

Computational Load Balancing Method for Hybrid Computing Systems

Russian Digital Libraries Journal ◽ 10.26907/1562-5419-2021-24-1-42-56 ◽ 2021 ◽ Vol 24 (1) ◽ pp. 42-56

Author(s): Татьяна Петровна Баранова ◽ Александр Борисович Бугеря ◽ Кирилл Николаевич Ефимкин

Keyword(s): Dynamic Problem ◽ Computing System ◽ Program Execution ◽ Central Processor ◽ Computing Systems ◽ Hybrid Computing ◽ Computational Load ◽ Gas Dynamic ◽ Applied Program ◽ Automatic Balancing

The paper considers the distribution of computations within one node of a hybrid computing system for applied programs with computation-intensive operations. A method is proposed for the static distribution of computations, as well as a method for automatically balancing the computational load during program execution, which is based on periodically analyzing the CPU load produced by the executing program and deciding whether to redistribute the computational load. The proposed methods are implemented in an applied program that solves a gas dynamics problem using the computing resources of a multicore central processor and graphics accelerators. The results of program execution with various data distributions were obtained and analyzed, both with and without the mechanism for automatic balancing of the computational load.
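
The abstract bases its balancing decision on periodic analysis of CPU load and does not give the rule itself. The sketch below substitutes a simpler, commonly used heuristic, stated here as an assumption: measure how long each device took for its share in the last period and reassign shares in proportion to the observed throughput. The names `rebalance` and `needs_rebalance` and the tolerance value are hypothetical.

```python
def rebalance(shares, elapsed):
    """Return new work shares proportional to each device's observed throughput.

    shares  -- fraction of the work each device processed in the last period
    elapsed -- time each device needed for its share (same order, positive)
    """
    speeds = [share / time for share, time in zip(shares, elapsed)]
    total = sum(speeds)
    return [speed / total for speed in speeds]


def needs_rebalance(elapsed, tolerance=0.1):
    """Redistribute only if the device times differ by more than the tolerance."""
    slowest, fastest = max(elapsed), min(elapsed)
    return (slowest - fastest) / slowest > tolerance


if __name__ == "__main__":
    # Hypothetical period: CPU and GPU each got half the cells.
    shares = [0.5, 0.5]
    elapsed = [0.5, 0.125]
    if needs_rebalance(elapsed):
        shares = rebalance(shares, elapsed)
    print(shares)
```

The tolerance check keeps the program from paying the cost of moving data between the CPU and the accelerators when the imbalance is small.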

Means of Ensuring the Optimal Functioning of the Electrical System of a Local Object (ЗАСОБИ ЗАБЕЗПЕЧЕННЯ ОПТИМАЛЬНОГО ФУНКЦІОНУВАННЯ ЕЛЕКТРИЧНОЇ СИСТЕМИ ЛОКАЛЬНОГО ОБ’ЄКТУ)

Bulletin of the Kyiv National University of Technologies and Design Technical Science Series ◽ 10.30857/1813-6796.2020.4.5 ◽ 2021 ◽ Vol 148 (4) ◽ pp. 59-66

Author(s): О. П. Кравченко ◽ Е. Г. Манойлов ◽ Г. О. Бабич ◽ Я. С. Малий

Keyword(s): Renewable Energy ◽ Power System ◽ Real Time ◽ Electronic Monitoring ◽ Renewable Energy Sources ◽ Electrical Energy ◽ Electronic System ◽ Energy Sources ◽ Central Processor ◽ Energy Profiles

Purpose. To develop an electronic monitoring and control system that achieves an effective ratio between electrical energy generation and consumption in the power supply system of a local object. Methodology. The theory of electrical circuits and electronic circuits was used. Obtained results. An electronic system for monitoring and controlling power supply in the local object power system was developed. The system comprises three modules: a central processor, a module for monitoring environmental parameters, and an executive module consisting of measuring (current, voltage) and relay blocks. The central processor processes signals from the monitoring and measuring blocks and issues commands to the relay block to switch consumer loads and electric generators on and off. The developed system allows both maximal power take-off from distributed (renewable) energy sources and flexible regulation of power consumption, achieving an effective ratio between the electrical energy supplied by renewable energy sources and the general distribution network and the total consumption of the load devices in the local object power system. Originality. The electronic monitoring and control system allows generated and consumed loads in the local object power system to be monitored in real time. The system can form real-time energy profiles, on the basis of which the control algorithm for the executive block is formed in order to achieve an effective ratio between the generation and consumption of electricity in the power system of the local facility.
Practical value. As a result of the presented work, an electronic system for monitoring and controlling electricity supply in the local object power system was developed. It forms the generation profiles of distributed energy sources and the required consumption profiles in real time, providing efficient energy consumption in line with the concepts of distributed electrical networks with renewable energy sources and the Smart House.

The functioning investigation of the network semi-active damping system for wheeled vehicle

Automation. Modern Technologies ◽ 10.36652/0869-4931-2021-75-1-21-28 ◽ 2021 ◽ pp. 21-28

Keyword(s): System Identification ◽ Physical Model ◽ Network Model ◽ Active Damping ◽ Central Processor ◽ Active System ◽ Physical Network ◽ Wheeled Vehicle ◽ Road Profile

The tasks related to constructing a unified semi-active system for damping vibrations of the supporting platform (chassis) of a wheeled vehicle (WV), taking the real road profile into account, are considered. The influence of the network on the resulting performance of the entire unified damping system is estimated. A network combining the model of one wheelset, a possible suspension control law, the central processor and a physical model of the CAN network is modeled using National Instruments equipment. The results of the experiments, both purely mathematical and with a physical network model, demonstrated the viability of the proposed solutions. Keywords: CAN bus; semi-active suspension system; identification; modeling
