high radix
Recently Published Documents


TOTAL DOCUMENTS

239
(FIVE YEARS 24)

H-INDEX

20
(FIVE YEARS 2)

2022 ◽  
Vol 71 (2) ◽  
pp. 436-449
Author(s):  
Bo Zhang ◽  
Zeming Cheng ◽  
Massoud Pedram

Nanophotonics ◽  
2021 ◽  
Vol 0 (0) ◽  
Author(s):  
Ye Tian ◽  
Yang Zhao ◽  
Shengping Liu ◽  
Qiang Li ◽  
Wei Wang ◽  
...  

Abstract Photonic computation has garnered huge attention due to its great potential to accelerate artificial neural network tasks at much higher clock rates than digital electronic alternatives. In particular, a reconfigurable photonic processor consisting of a Mach–Zehnder interferometer (MZI) mesh is promising as a photonic matrix multiplier. A high-radix MZI mesh is desirable to boost the computation capability. Conventionally, three cascaded MZI meshes (two universal N × N unitary MZI meshes and one diagonal MZI mesh) are needed to express an N × N weight matrix, requiring O(N²) MZIs, which seriously limits scalability. Here, we propose a photonic matrix architecture using the real part of one nonuniversal N × N unitary MZI mesh to represent the real-valued matrix. In applications like photonic neural networks, it can potentially reduce the required MZIs to the O(N log₂ N) level at a low cost in learning capability. Experimentally, we implement a 4 × 4 photonic neural chip and benchmark its performance in a convolutional neural network for a handwriting recognition task. Low learning-capability loss is observed in our 4 × 4 chip compared to its counterpart based on the conventional architecture using O(N²) MZIs. Regarding optical loss, chip size, power consumption, and encoding error, our architecture exhibits all-round superiority.
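To make the scaling claim concrete, the following rough sketch compares MZI counts for the two approaches. The abstract gives only asymptotic orders, so the exact formulas are assumptions: the conventional count uses the standard N(N-1)/2 MZIs per universal unitary mesh plus N for the diagonal stage, and the nonuniversal count assumes a butterfly-style layout with (N/2)·log₂ N MZIs, which the abstract does not confirm as the authors' layout. The function names are hypothetical.

```python
def conventional_mzi_count(n: int) -> int:
    """MZIs for two universal N x N unitary meshes plus one diagonal mesh.

    Assumes N(N-1)/2 MZIs per universal mesh and N MZIs for the diagonal.
    """
    universal = n * (n - 1) // 2
    return 2 * universal + n  # equals n * n


def butterfly_mzi_count(n: int) -> int:
    """MZIs for an assumed butterfly-style nonuniversal mesh: (N/2) * log2(N).

    This layout is an illustrative assumption, not taken from the abstract.
    """
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    stages = n.bit_length() - 1  # exact log2 for powers of two
    return (n // 2) * stages


if __name__ == "__main__":
    for n in (4, 16, 64):
        print(n, conventional_mzi_count(n), butterfly_mzi_count(n))
```

Under these assumptions the gap widens quickly with N, which is the scalability point the abstract makes.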


Author(s):  
Julio Villalba ◽  
Javier Hormigo

Abstract This article proposes a family of high-radix floating-point representations to deal efficiently with floating-point addition in FPGA devices with no native floating-point support. Since the variable shifter (required in any FP adder) is very costly to implement in an FPGA, high-radix formats considerably reduce the number of possible shifts, greatly decreasing execution time and area. Although the high-radix format also imposes a significant penalty on the implementation of multipliers, the experimental results show that the adder improvement outweighs the multiplication penalty for most practical and common cases (digital filters, matrix multiplications, etc.). We also provide the designer with guidelines for selecting a suitable radix as a function of the ratio between the number of additions and multiplications in the targeted algorithm. For applications with similar numbers of additions and multiplications, the high-radix version may be up to 26% faster while also offering a wider dynamic range and a higher number of significant bits. Furthermore, thanks to the proposed efficient converters between the standard IEEE-754 format and our internal high-radix format, the cost of the input/output conversions in FPGA accelerators is negligible.
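The reduction in possible shifts can be illustrated with a small sketch. It assumes a format whose exponent is a power of the radix 2^k, so that operand alignment only ever shifts by whole k-bit digits, and it ignores guard and rounding bits. The function names and the 24-bit significand are illustrative choices, not details taken from the article.

```python
def alignment_shifts(significand_bits: int, digit_bits: int) -> list[int]:
    """Distinct alignment shift distances (in bits) for a radix-2**digit_bits format.

    Shifts at or beyond the significand width are omitted, since the smaller
    operand is then shifted out entirely (guard bits are ignored here).
    """
    return list(range(0, significand_bits, digit_bits))


def mux_stages(num_shifts: int) -> int:
    """Number of 2:1 multiplexer stages a logarithmic shifter needs."""
    return (num_shifts - 1).bit_length()


if __name__ == "__main__":
    for k in (1, 4, 8):  # radix 2, 16 and 256
        shifts = alignment_shifts(24, k)
        print(f"radix 2^{k}: {len(shifts)} shifts, {mux_stages(len(shifts))} mux stages")
```

With a 24-bit significand, this model gives 24 possible shifts for radix 2 but only 3 for radix 256, which is the kind of shifter simplification the abstract refers to.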


2021 ◽  
Author(s):  
Venkata Reddy Kolagatla ◽  
Vivian Desalphine ◽  
David Selvakumar

Electronics ◽  
2021 ◽  
Vol 10 (16) ◽  
pp. 1988
Author(s):  
Yuheng Yang ◽  
Qing Yuan ◽  
Jian Liu

In this paper, we propose an efficient architecture for a floating-point square-root circuit with low area cost that conforms to the IEEE-754 standard. We extend the principle of the standard SRT algorithm so that the latency and area cost of the proposed circuit are linear with the radix. In addition, no extra computation cycles are required. With 65 nm technology, the area cost of the single-precision floating-point square-root circuit based on the proposed architecture is only 6450.84 μm², and the dynamic power consumption is only 0.764 mW at 300 MHz. The implementation results show that the proposed square-root circuit can reduce the area cost by 60% to 90% compared with other designs in the literature.
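For readers unfamiliar with digit-recurrence square root, the sketch below shows the basic idea on integers. It is a plain radix-2 restoring recurrence, which is a simpler relative of SRT: it is not the SRT algorithm itself (no redundant digit set, no quotient-selection table) and not the architecture proposed in the paper. The function name is hypothetical.

```python
def sqrt_digit_recurrence(x: int) -> tuple[int, int]:
    """Integer square root by radix-2 restoring digit recurrence.

    Returns (root, remainder) with root*root + remainder == x.
    One result bit is produced per iteration, consuming two input bits.
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    nbits = x.bit_length()
    if nbits % 2:
        nbits += 1  # process the input in aligned 2-bit groups
    root = 0
    rem = 0
    for shift in range(nbits - 2, -1, -2):
        # Bring down the next two bits of the radicand.
        rem = (rem << 2) | ((x >> shift) & 0b11)
        # Trial subtrahend: (2 * root) * 2 + 1, i.e. the cost of appending a 1 bit.
        trial = (root << 2) | 1
        root <<= 1
        if rem >= trial:
            rem -= trial
            root |= 1
    return root, rem


if __name__ == "__main__":
    print(sqrt_digit_recurrence(10))  # (3, 1)
```

A higher-radix recurrence retires several result bits per iteration, which reduces the iteration count but makes digit selection harder; that trade-off is what SRT-style designs address.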


2021 ◽  
Vol 18 (4) ◽  
pp. 1-21
Author(s):  
Cunlu Li ◽  
Dezun Dong ◽  
Shazhou Yang ◽  
Xiangke Liao ◽  
Guangyu Sun ◽  
...  

Hierarchical organization is widely used in high-radix routers to enable efficient scaling to higher switch port counts. A general-purpose hierarchical router must be symmetrically designed with the same input buffer depth, resulting in a large amount of unused input buffers due to the different link lengths. Sharing input buffers between different input ports can improve buffer utilization, but the implementation overhead also increases with the number of shared ports. Previous work allowed input buffers to be shared among all router ports, which maximizes buffer utilization but also introduces higher implementation complexity. Moreover, such a design can impair performance when faced with long packets, due to head-of-line blocking in intermediate buffers. In this work, we explain that sharing unused buffers between a subset of router ports is a more efficient design. Based on this observation, we propose Centralized Input Buffer Design in Hierarchical High-radix Routers (CIB-HIER), a novel centralized input buffer design for hierarchical high-radix routers. CIB-HIER integrates multiple input ports onto a single tile and organizes all unused input buffers in the tile as a centralized input buffer. CIB-HIER only allows the centralized input buffer to be shared between ports on the same tile, without introducing additional intermediate virtual channels or global scheduling circuits. Going beyond the basic design of CIB-HIER, the centralized input buffer can be used to relieve the head-of-line blocking caused by shallow intermediate buffers, by stashing long packets in the centralized input buffer. Experimental results show that CIB-HIER is highly effective and can significantly increase the throughput of high-radix routers.
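The tile-local sharing idea can be pictured with a toy occupancy model. This is only a sketch under simplifying assumptions: it counts flits rather than storing them, ignores virtual channels, credits and packet boundaries, and the class name, method names and depths are hypothetical, not taken from the CIB-HIER design.

```python
class Tile:
    """Toy model of one router tile with per-port buffers plus a shared pool.

    Only ports on this tile may use the pool, mirroring the restriction that
    the centralized buffer is shared within a tile rather than router-wide.
    """

    def __init__(self, num_ports: int, private_depth: int, shared_depth: int):
        self.private_depth = private_depth
        self.shared_free = shared_depth
        self.private_used = [0] * num_ports
        self.shared_used = [0] * num_ports

    def enqueue(self, port: int):
        """Accept one flit; return where it was stored, or None if refused."""
        if self.private_used[port] < self.private_depth:
            self.private_used[port] += 1
            return "private"
        if self.shared_free > 0:
            self.shared_free -= 1
            self.shared_used[port] += 1
            return "shared"
        return None

    def dequeue(self, port: int) -> bool:
        """Drain one flit from the port; return False if the port is empty."""
        if self.shared_used[port] > 0:
            # The head flit leaves the private buffer and one overflow flit
            # moves up to take its slot, so a shared slot is freed.
            self.shared_used[port] -= 1
            self.shared_free += 1
            return True
        if self.private_used[port] > 0:
            self.private_used[port] -= 1
            return True
        return False


if __name__ == "__main__":
    tile = Tile(num_ports=2, private_depth=2, shared_depth=3)
    print([tile.enqueue(0) for _ in range(6)])
```

In this model a port facing a long packet can absorb more flits than its private depth allows, while a port on another tile could never consume this tile's pool.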


2021 ◽  
Vol 37 (01) ◽  
pp. 045-052
Author(s):  
Mario Bazanelli Junqueira Ferraz ◽  
Guilherme Constante Preis Sella

Abstract Nasal dorsal preservation surgery was described more than 100 years ago, but has recently gained prominence. Our objective is to show the surgical technique, the main indications and contraindications, and the complications. It is a technique that does not cause the detachment of the upper lateral cartilage (ULC) from the nasal septum, and follows this main sequence: preparation of the septum, whose resection can be at different levels (high or low, i.e., SPAR [septum pyramidal adjustment and repositioning] A or B); preparation of the pyramid; transversal osteotomy; lateral osteotomy(s); and septopyramidal adjustment. The result is a nose with a lower radix than the original; a deprojection of the nasal dorsum tending to maintain its original shape; an increase in the interalar distance (IAD) and enlargement of the nasal middle ⅓; and loss of projection of the nasal tip and roundness of the nostrils. Thus, the ideal candidate is the one who benefits from such side effects: the tension nose, that is, high radix with projected dorsum, projected anterior nasal septal angle (ANSA), narrow middle ⅓, narrow IAD, thin nostrils and straight perpendicular plate of the ethmoid (PPE), and, depending on the characteristics, the deviated nose. The contraindications are low radix, irregularities in the nasal dorsum, ANSA lower than rhinion, and a wide middle ⅓. The main stigmas are a nose with a very low radix, an enlarged middle ⅓, residual hump, and saddling of the supratip area. Other issues with this technique are the shape of the radix; whether or not the PPE needs to be removed; wide dorsum; irregular dorsum; ANSA lower than rhinion; weak cartilages; long nasal bone; deviated PPE; and the obsessive patient. We conclude that this is a great technique for noses with characteristics suitable to it; care must be taken with the stigmas it can cause.


Author(s):  
Yi Dai ◽  
Kai Lu ◽  
Junsheng Chang ◽  
Xingyun Qi ◽  
Jijun Cao ◽  
...  