Study on progress threads placement and dedicated cores for overlapping MPI nonblocking collectives on manycore processor

Author(s):  
Alexandre Denis
Julien Jaeger
Emmanuel Jeannot
Marc Pérache
Hugo Taboada

To amortize the cost of MPI collective operations, nonblocking collectives have been proposed to allow communications to be overlapped with computation. Unfortunately, collective communications are more CPU-hungry than point-to-point communications, and running them in a communication thread on a dedicated CPU core makes them slow. On the other hand, running collective communications on the application cores leads to no overlap. In this article, we propose placement algorithms for progress threads that do not degrade performance when running on cores dedicated to communications, so as to obtain communication/computation overlap. We first show that even simple collective operations, such as those based on a chain topology, are not straightforward to progress in the background on a dedicated core. Then, we propose an algorithm for tree-based collective operations that splits the tree between communication cores and application cores. To get the best of both worlds, the algorithm runs the short but heavy part of the tree on application cores and the long but narrow part of the tree on one or several communication cores, so as to reach a trade-off between overlap and absolute performance. We provide a model to study and predict its behavior and to tune its parameters. We implemented both algorithms in the MultiProcessor Computing (MPC) framework, a thread-based MPI implementation. We ran benchmarks on manycore processors such as the KNL and Skylake and obtained good results for both performance and overlap.
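As a rough illustration of the tree-splitting idea, the sketch below builds a binomial broadcast tree and assigns ranks near the root (the long but narrow part) to a dedicated communication core, while ranks near the leaves (the short but heavy part) stay on the application cores. It is a simplified model under assumed names (`binomial_tree`, `place_progress`, `split_depth`), not the MPC implementation or its actual placement policy.

```python
# Illustrative sketch (not the MPC implementation): split a binomial
# broadcast tree between a dedicated communication core and the
# application cores, controlled by a split depth.

def binomial_tree(n_ranks):
    """Return children lists and the depth of each rank in a binomial
    broadcast tree rooted at rank 0."""
    children = {r: [] for r in range(n_ranks)}
    depth = {0: 0}
    d = 0
    while (1 << d) < n_ranks:
        # Round d: every rank that already holds the data sends to rank + 2**d.
        for src in list(depth):
            dst = src + (1 << d)
            if dst < n_ranks and dst not in depth:
                children[src].append(dst)
                depth[dst] = d + 1
        d += 1
    return children, depth

def place_progress(depth, split_depth):
    """Assign each rank's progress work either to the dedicated
    communication core ('comm') or to the local application core ('app').

    Ranks close to the root form the long but narrow part of the tree and
    go to the communication core; ranks close to the leaves form the short
    but heavy part and stay on the application cores."""
    return {r: ("comm" if d < split_depth else "app") for r, d in depth.items()}

if __name__ == "__main__":
    children, depth = binomial_tree(16)
    placement = place_progress(depth, split_depth=2)
    for rank in sorted(placement):
        print(f"rank {rank:2d}  depth {depth[rank]}  -> {placement[rank]}")
```

In this toy model, raising `split_depth` moves more of the tree onto the communication core (more overlap, slower collective) while lowering it keeps more work on the application cores (faster collective, less overlap), which mirrors the trade-off discussed in the article.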

2020
Vol 12 (7)
pp. 2767
Author(s):
Víctor Yepes
José V. Martí
José García

The optimization of cost and CO2 emissions in earth-retaining walls is of practical relevance, since these structures are widely used in civil engineering. Cost optimization is essential for the competitiveness of the construction company, while emission optimization matters for the environmental impact of construction. To address the optimization, black hole metaheuristics were used, along with a discretization mechanism based on min–max normalization. The stability of the algorithm was evaluated with respect to the solutions obtained, and the steel and concrete quantities obtained in both optimizations were analyzed. Additionally, the geometric variables of the structure were compared. Finally, the results were compared with those of another algorithm applied to the same problem. The results show that there is a trade-off between the use of steel and concrete: the solutions that minimize CO2 emissions favor the use of concrete, in contrast to those that optimize cost. On the other hand, when comparing the geometric variables, most remain similar in both optimizations except for the distance between buttresses. When compared with the other algorithm, the black hole algorithm shows good optimization performance.
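As an illustration of the discretization step, the sketch below applies min–max normalization to map a continuous candidate solution onto discrete design values. The variable names, bounds, and candidate value lists are assumptions made for the example; they are not the paper's actual design variables or parameter ranges.

```python
# Sketch of a min-max-normalization discretization step for a continuous
# metaheuristic (conceptually, a black hole algorithm candidate "star").
# Bounds and allowed values below are invented for illustration.

def min_max_normalize(value, lower, upper):
    """Scale a continuous value into [0, 1] using its variable bounds."""
    if upper == lower:
        return 0.0
    return min(max((value - lower) / (upper - lower), 0.0), 1.0)

def discretize(position, bounds, allowed_values):
    """Map each continuous component onto one of its allowed discrete values."""
    discrete = []
    for value, (lo, hi), options in zip(position, bounds, allowed_values):
        frac = min_max_normalize(value, lo, hi)
        index = min(int(frac * len(options)), len(options) - 1)
        discrete.append(options[index])
    return discrete

if __name__ == "__main__":
    # Hypothetical candidate: footing width (m), wall thickness (m), bar diameter (mm).
    position = [3.7, 0.42, 14.0]
    bounds = [(2.0, 4.0), (0.3, 0.6), (8.0, 20.0)]
    allowed = [
        [2.0, 2.5, 3.0, 3.5, 4.0],
        [0.3, 0.4, 0.5, 0.6],
        [8, 10, 12, 16, 20],
    ]
    print(discretize(position, bounds, allowed))  # -> [4.0, 0.4, 12]
```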


Author(s):  
Tomoyuki Miyashita
Hiroshi Yamakawa

Abstract In recent years, financial difficulties have led engineers to look not only at the efficiency of a product's function but also at the cost of its development. To reduce development time, engineers in each discipline have to develop and improve their objectives collaboratively. Sometimes they have to cooperate with people who have no knowledge of their own disciplines. Collaborative design has been studied to solve these kinds of problems, but most approaches require some form of negotiation among disciplines and assume that these negotiations will succeed. However, in most real designs, the manager of each discipline does not want to give up his or her own objectives in favor of the others. To carry out these negotiations smoothly, we need evaluation criteria that show the efficiency of the product, taking into account the designs of each division and, if possible, the products of competing companies as well. In this study, we use Data Envelopment Analysis (DEA) to calculate the efficiency of a design and to show every decision maker the directions in which the design should be developed. We call such systems supervisor systems and have implemented them on computer networks so that every decision maker can use them conveniently. Through simple numerical examples, we show the effectiveness of the proposed method.
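For readers unfamiliar with DEA, the sketch below sets up the standard input-oriented CCR model (multiplier form) with SciPy's linear-programming solver; the design alternatives, inputs, and outputs are invented for illustration and are not taken from the study.

```python
# Minimal CCR DEA efficiency calculation (multiplier form). Each design
# alternative is a decision-making unit; its efficiency is the best achievable
# weighted-output score given that its weighted inputs are normalized to 1
# and no unit may exceed an efficiency of 1 under the same weights.
import numpy as np
from scipy.optimize import linprog

def ccr_efficiency(inputs, outputs, unit):
    """Efficiency of decision-making unit `unit` given input and output matrices
    of shape (n_units, n_inputs) and (n_units, n_outputs)."""
    X, Y = np.asarray(inputs, float), np.asarray(outputs, float)
    n, n_in, n_out = X.shape[0], X.shape[1], Y.shape[1]
    # Variables: output weights u (n_out entries), then input weights v (n_in entries).
    c = np.concatenate([-Y[unit], np.zeros(n_in)])            # maximize u . y_unit
    A_ub = np.hstack([Y, -X])                                  # u . y_j - v . x_j <= 0
    b_ub = np.zeros(n)
    A_eq = np.concatenate([np.zeros(n_out), X[unit]])[None]    # v . x_unit = 1
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=[1.0],
                  bounds=[(0, None)] * (n_out + n_in))
    return -res.fun                                            # efficiency in (0, 1]

if __name__ == "__main__":
    # Hypothetical design alternatives: inputs = [cost, mass], outputs = [stiffness].
    X = [[100, 12], [80, 15], [120, 10]]
    Y = [[30], [28], [33]]
    for k in range(len(X)):
        print(f"design {k}: efficiency = {ccr_efficiency(X, Y, k):.3f}")
```

In a supervisor-system setting of the kind described above, such scores give each decision maker a neutral basis for negotiation: a unit with efficiency below 1 can be pointed toward the input reductions or output improvements that would move it onto the efficient frontier.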


1876
Vol 166
pp. 269-313

1. Structure of the Medusæ. - Although it is not my intention in this preliminary notice to enter into the literature of my subject, it is nevertheless desirable to quote the well-known statements of Prof. L. Agassiz regarding the nature and distribution of the nervous system which he describes as occurring in the Medusæ. He says:- “There is unquestionably a nervous system in the Medusæ, but this nervous system does not form large central masses to which all the activity of the body is referred, or from which it emanates.... In Medusæ the nervous system consists of a simple cord, of a string of ovate cells, forming a ring round the lower margin of the animal, extending from one eye-speck to the other, following the circular chymiferous tube, and also its vertical branches, round the upper portion of which they form another circle. The substance of this nervous system, however, is throughout cellular, and strictly so, and the cells are ovate. There is no appearance in any of its parts of true fibres. “I do not wonder, therefore, that the very existence of a nervous system in the Medusæ should have been denied, and should not be at all surprised if it were even now further questioned. I would only urge those interested in this question to look carefully along the inner margin of the chymiferous tubes, and to search there for a cord of cells of a peculiar ovate form, arranged in six or seven rows, forming a sort of string, or rather similar to a chain of ovate beads placed side by side and point to point, but in such a manner that the individual cells would overlap each other for one half, one third, or a quarter of their length, being from five to seven side by side at any given point upon a transverse section of the row; and would ask those who do not recognize at once such a string as the nervous system to trace it for its whole extent, especially to the base of the eye-speck, where these cells accumulate in a larger heap, with intervening coloured pigment forming a sort of ganglion; then, further, to follow it up along the inner side of the radiating chymiferous tubes which extend from the summit of the vault of the body, and to ascertain that here, again, it forms another circle round the central digestive cavity, from which other threads, or rather isolated series of elongated cells, run to the proboscis; they will then be satisfied that this apparatus, in all its complication, is really a nervous system of a peculiar structure and adaptation, with peculiar relations to the other systems of organs.... and such a nervous system I have already traced in all its details, as here described, in the genera Hippocrene (Bougainvillia), Tiaropsis, and Staurophora”.


2020
Author(s):
Sebastian Fehrler
Moritz Janas

We study the choice of a principal either to delegate a decision to a group of careerist experts or to consult them individually and keep the decision-making power. Our model predicts a trade-off between information acquisition and information aggregation. On the one hand, the expected benefit from being informed is larger when the experts are consulted individually; hence, depending on the cost of information, the experts acquire either the same amount of information as under delegation or more. On the other hand, any acquired information is better aggregated under delegation, where the experts can deliberate secretly. To test the model’s key predictions, we run an experiment. The results from the laboratory confirm the predicted trade-off, despite some deviations from theory at the individual level. This paper was accepted by Yan Chen, decision analysis.


Electronics
2020
Vol 9 (12)
pp. 2143
Author(s):  
Alex Ming Hui Wong
Masahiro Furukawa
Taro Maeda

Authentication has three basic factors: knowledge, ownership, and inherence. Biometrics is considered the inherence factor and is widely used for authentication because of its convenience. Biometrics consists of static biometrics (physical characteristics) and dynamic biometrics (behavioral characteristics), and there is a trade-off between robustness and security. Static biometrics, such as fingerprints and face recognition, are often reliable as they are known to be more robust, but once stolen they are difficult to reset. On the other hand, dynamic biometrics are usually considered more secure because of the constant changes in behavior, but at the cost of robustness. In this paper, we propose a multi-factor authentication method based on rhythmic dynamic hand gestures, where the rhythmic pattern is the knowledge factor and the gesture behavior is the inherence factor, and we evaluate the robustness of the proposed method. Our proposal can easily be combined with other input methods because the rhythmic pattern can also be observed, for example, during typing. It is also expected to improve the robustness of the gesture behavior, as the rhythmic pattern acts as a symbolic cue for the gesture. The results show that our method is able to authenticate genuine users with a best accuracy of 0.9301 ± 0.0280 and, when mimicked by impostors, keeps the false acceptance rate (FAR) as low as 0.1038 ± 0.0179.
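The multi-factor decision rule can be pictured with a small sketch: an attempt is accepted only when both the rhythm (knowledge factor) and the gesture behavior (inherence factor) match an enrolled profile. The matching rules, feature formats, and thresholds below are assumptions made for illustration; they are not the authors' recognizer.

```python
# Illustrative two-factor decision rule: rhythm (knowledge) AND gesture (inherence).
# Features, tolerances, and the similarity score are placeholders.

def rhythm_matches(entered_intervals, enrolled_intervals, tolerance=0.15):
    """Knowledge factor: compare inter-tap intervals (seconds) within a tolerance."""
    if len(entered_intervals) != len(enrolled_intervals):
        return False
    return all(abs(a - b) <= tolerance
               for a, b in zip(entered_intervals, enrolled_intervals))

def gesture_score(sample_features, enrolled_features):
    """Inherence factor: toy similarity (1 = identical) between gesture feature vectors."""
    dist = sum((a - b) ** 2 for a, b in zip(sample_features, enrolled_features)) ** 0.5
    return 1.0 / (1.0 + dist)

def authenticate(attempt, profile, gesture_threshold=0.8):
    """Accept only when both factors pass."""
    return (rhythm_matches(attempt["rhythm"], profile["rhythm"])
            and gesture_score(attempt["gesture"], profile["gesture"]) >= gesture_threshold)

if __name__ == "__main__":
    profile = {"rhythm": [0.30, 0.30, 0.60], "gesture": [0.1, 0.4, 0.9, 0.2]}
    genuine = {"rhythm": [0.32, 0.28, 0.58], "gesture": [0.12, 0.38, 0.88, 0.22]}
    impostor = {"rhythm": [0.50, 0.20, 0.40], "gesture": [0.6, 0.1, 0.3, 0.8]}
    print("genuine accepted:", authenticate(genuine, profile))    # True
    print("impostor accepted:", authenticate(impostor, profile))  # False
```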


2020
Vol 4 (02)
pp. 34-45
Author(s):  
Naufal Dzikri Afifi
Ika Arum Puspita
Mohammad Deni Akbar

Shift to The Front II Komplek Sukamukti Banjaran is one of the projects implemented by a company engaged in telecommunications. Like every project, Shift to The Front II Komplek Sukamukti Banjaran has a time limit specified in the contract. Project scheduling plays an important role in predicting both the cost and the duration of a project, and every project should be completed on or before the time specified in the contract. Delays can be anticipated by accelerating the completion time using the crashing method with linear programming. Linear programming streamlines the crashing calculation, which would otherwise require repeated manual iterations. The objective function in this scheduling problem is to minimize cost. This study aims to find a trade-off between cost and the minimum time expected to complete the project. The acceleration of the project duration was evaluated for the addition of 4, 3, 2, and 1 hours of overtime work. The normal duration of the project is 35 days with a service fee of Rp. 52,335,690. From the results of the crashing analysis, the chosen alternative is to add 1 hour of overtime, reducing the duration to 34 days with a total service cost of Rp. 52,375,492. This acceleration affects the entire project: 33 different locations are worked on in Shift to The Front II, and if all of these locations can be accelerated, the completion of the entire project will be shortened accordingly.
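The crashing formulation can be written as a small linear program: choose how many days to crash each activity so that the project meets the deadline at minimum extra cost. The toy example below uses a serial chain of three invented activities with made-up durations and crash costs; the real project involves 33 locations, overtime-hour alternatives, and a full precedence network.

```python
# Toy crashing LP for a serial chain of activities (invented numbers).
# Minimize total crash cost subject to the contractual deadline.
from scipy.optimize import linprog

# (name, normal duration in days, max days crashable, crash cost per day in Rp)
activities = [
    ("site A", 8, 2, 1_500_000),
    ("site B", 12, 3, 1_200_000),
    ("site C", 9, 2, 2_000_000),
]
deadline = 26  # contractual limit in days (toy value)

normal = [dur for _, dur, _, _ in activities]
max_crash = [m for _, _, m, _ in activities]
cost = [c for _, _, _, c in activities]

# Decision variables y_i = days crashed on activity i. For a serial chain the
# project duration is sum(normal) - sum(y), so "duration <= deadline" becomes
# -sum(y) <= deadline - sum(normal).
res = linprog(
    c=cost,
    A_ub=[[-1.0] * len(activities)],
    b_ub=[deadline - sum(normal)],
    bounds=[(0, m) for m in max_crash],
)

for (name, dur, _, _), y in zip(activities, res.x):
    print(f"{name}: crash {y:.1f} day(s) -> {dur - y:.1f} days")
print(f"extra crashing cost: Rp {res.fun:,.0f}")
```

With a real precedence network, one additional start-time variable per activity and one constraint per precedence edge turn the same pattern into the full crashing model.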


2020
Vol 3 (1)
pp. 61
Author(s):  
Kazuhiro Aruga

In this study, two operational methodologies to extract thinned wood were investigated in the Nasunogahara area, Tochigi Prefecture, Japan. Methodology one consisted of manual extraction and light-truck transportation. Methodology two consisted of mini-forwarder forwarding and four-ton truck transportation. Furthermore, a newly introduced chipper was investigated. As a result, the costs of manual extraction within 10 m and 20 m were JPY 942/m³ and JPY 1040/m³, respectively. On the other hand, the forwarding cost of the mini-forwarder was JPY 499/m³, which was significantly lower than the cost of manual extraction. Transportation costs with light trucks and four-ton trucks were JPY 7224/m³ and JPY 1298/m³, respectively, for a 28 km transportation distance. Chipping operation costs were JPY 1036/m³ and JPY 1160/m³ with three and two persons, respectively. Finally, the total costs of methodologies one and two, from extraction within 20 m to chipping, were estimated as JPY 9300/m³ and JPY 2833/m³, respectively, for a 28 km transportation distance and a three-person chipping operation (EUR 1 = JPY 126, as of 12 August 2020).
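As a quick arithmetic check, the reported totals follow directly from the per-stage unit costs quoted above:

```python
# Recomputing the reported totals (JPY per cubic metre, 28 km transport,
# three-person chipping) from the per-stage unit costs in the abstract.
methodology_one = 1040 + 7224 + 1036  # manual extraction (20 m) + light truck + chipping
methodology_two = 499 + 1298 + 1036   # mini-forwarder + four-ton truck + chipping
print(methodology_one, methodology_two)  # 9300 2833, matching the reported totals
```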


Author(s):  
Vincent E. Castillo
John E. Bell
Diane A. Mollenkopf
Theodore P. Stank

2021
Vol 11 (1)
Author(s):  
Jeonghyuk Park
Yul Ri Chung
Seo Taek Kong
Yeong Won Kim
Hyunho Park
...  

Abstract There have been substantial efforts in using deep learning (DL) to diagnose cancer from digital images of pathology slides. Existing algorithms typically operate by training deep neural networks either specialized to specific cohorts or on an aggregate of all cohorts when only a few images are available for the target cohort. A trade-off between decreasing the number of models and their cancer detection performance was evident in our experiments with The Cancer Genome Atlas dataset, with the cohort-specialized approach achieving higher performance at the cost of having to acquire large datasets from the cohort of interest. Constructing annotated datasets for individual cohorts is extremely time-consuming, and the acquisition cost of such datasets grows linearly with the number of cohorts. Another issue associated with developing cohort-specific models is the difficulty of maintenance: all cohort-specific models may need to be adjusted when a new DL algorithm is to be used, where training even a single model may require a non-negligible amount of computation, or when more data are added to some cohorts. To resolve the sub-optimal behavior of a universal cancer detection model trained on an aggregate of cohorts, we investigated how cohorts can be grouped to augment a dataset without the number of models increasing linearly with the number of cohorts. This study introduces several metrics that measure the morphological similarities between cohort pairs and demonstrates how these metrics can be used to control the trade-off between performance and the number of models.
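One way to picture the grouping step is a greedy merge over a pairwise cohort-similarity matrix, as sketched below. The similarity values and the single-linkage threshold rule are placeholders: the abstract does not spell out the actual morphological metrics or the grouping procedure, only that such metrics control the trade-off between performance and the number of models.

```python
# Placeholder sketch: group cohorts whose pairwise similarity exceeds a
# threshold, so that one detection model is trained per group rather than
# per cohort. Similarities below are invented.

def group_cohorts(cohorts, similarity, threshold):
    """Greedy single-linkage merging of cohorts above a similarity threshold."""
    groups = [[c] for c in cohorts]
    merged = True
    while merged:
        merged = False
        for i in range(len(groups)):
            for j in range(i + 1, len(groups)):
                if any(similarity[a][b] >= threshold
                       for a in groups[i] for b in groups[j]):
                    groups[i].extend(groups[j])
                    del groups[j]
                    merged = True
                    break
            if merged:
                break
    return groups

if __name__ == "__main__":
    cohorts = ["stomach", "colon", "lung", "breast"]
    similarity = {  # symmetric, hypothetical values in [0, 1]
        "stomach": {"stomach": 1.0, "colon": 0.82, "lung": 0.35, "breast": 0.30},
        "colon":   {"stomach": 0.82, "colon": 1.0, "lung": 0.33, "breast": 0.28},
        "lung":    {"stomach": 0.35, "colon": 0.33, "lung": 1.0, "breast": 0.61},
        "breast":  {"stomach": 0.30, "colon": 0.28, "lung": 0.61, "breast": 1.0},
    }
    # Higher thresholds -> more, smaller groups (more models, less data per model);
    # lower thresholds -> fewer, larger groups (fewer models, more data per model).
    print(group_cohorts(cohorts, similarity, threshold=0.6))
```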


2020
Vol 15 (1)
pp. 4-17
Author(s):  
Jean-François Biasse
Xavier Bonnetain
Benjamin Pring
André Schrottenloher
William Youmans

Abstract We propose a heuristic algorithm to solve the underlying hard problem of the CSIDH cryptosystem (and other isogeny-based cryptosystems using elliptic curves with endomorphism ring isomorphic to an imaginary quadratic order 𝒪). Let Δ = Disc(𝒪) (in CSIDH, Δ = −4p for p the security parameter). Let 0 < α < 1/2; our algorithm requires:

- a classical circuit of size $2^{\tilde{O}\left(\log(|\Delta|)^{1-\alpha}\right)}$;
- a quantum circuit of size $2^{\tilde{O}\left(\log(|\Delta|)^{\alpha}\right)}$;
- polynomial classical and quantum memory.

Essentially, we propose to reduce the size of the quantum circuit below the state-of-the-art complexity $2^{\tilde{O}\left(\log(|\Delta|)^{1/2}\right)}$ at the cost of increasing the required classical circuit size. The required classical circuit remains subexponential, which is a superpolynomial improvement over the classical state-of-the-art exponential solutions to these problems. Our method requires polynomial memory, both classical and quantum.
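For readability, the classical/quantum trade-off quoted above can be written side by side; the display below only restates the circuit sizes from the abstract and notes the boundary behavior as α approaches 1/2.

```latex
% Restatement of the circuit-size trade-off, for a parameter 0 < \alpha < 1/2:
\[
  C_{\mathrm{classical}}(\alpha) = 2^{\tilde{O}\left(\log(|\Delta|)^{1-\alpha}\right)},
  \qquad
  C_{\mathrm{quantum}}(\alpha) = 2^{\tilde{O}\left(\log(|\Delta|)^{\alpha}\right)}.
\]
% Decreasing \alpha shrinks the quantum circuit while enlarging the classical one;
% as \alpha tends to 1/2, the quantum cost approaches the previous state-of-the-art
% bound $2^{\tilde{O}\left(\log(|\Delta|)^{1/2}\right)}$.
```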

