Natural statistics in language modelling

1998 ◽  
Vol 5 (3) ◽  
pp. 246-255 ◽  
Author(s):  
Royal Skousen

Author(s):  
Saad Irtza ◽  
Vidhyasaharan Sethu ◽  
Sarith Fernando ◽  
Eliathamby Ambikairajah ◽  
Haizhou Li

PLoS ONE ◽  
2020 ◽  
Vol 15 (3) ◽  
pp. e0229963 ◽  
Author(s):  
Ignat Drozdov ◽  
Daniel Forbes ◽  
Benjamin Szubert ◽  
Mark Hall ◽  
Chris Carlin ◽  
...  
Keyword(s):  
X Ray ◽  

Author(s):  
Sarah Samson Juan ◽  
Muhamad Fikri Che Ismail ◽  
Hamimah Ujir ◽  
Irwandi Hipiny

2021 ◽  
Author(s):  
Shi Pui Donald Li ◽  
Michael F. Bonner

The scene-preferring portion of the human ventral visual stream, known as the parahippocampal place area (PPA), responds to scenes and landmark objects, which tend to be large in real-world size, fixed in location, and inanimate. However, the PPA also exhibits preferences for low-level contour statistics, including rectilinearity and cardinal orientations, that are not directly predicted by theories of scene- and landmark-selectivity. It is unknown whether these divergent findings of both low- and high-level selectivity in the PPA can be explained by a unified computational theory. To address this issue, we fit hierarchical computational models of mid-level tuning to the image-evoked fMRI responses of the PPA, and we performed a series of high-throughput experiments on these models. Our findings show that hierarchical encoding models of the PPA exhibit emergent selectivity across multiple levels of complexity, giving rise to high-level preferences along dimensions of real-world size, fixedness, and naturalness/animacy as well as low-level preferences for rectilinear shapes and cardinal orientations. These results reconcile disparate theories of PPA function in a unified model of mid-level visual representation, and they demonstrate how multifaceted selectivity profiles naturally emerge from the hierarchical computations of visual cortex and the natural statistics of images.
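As one illustration of the general encoding-model approach described above (not the authors' actual pipeline), the sketch below fits a cross-validated ridge regression from hierarchical model features to voxelwise fMRI responses. The feature matrix, voxel responses, and dimensions are synthetic placeholders, assumed only for the example.

```python
# Minimal sketch of a voxelwise encoding-model fit: map hierarchical image
# features to image-evoked fMRI responses with cross-validated ridge regression.
# The feature matrix and voxel responses below are random placeholders, not the
# authors' model activations or PPA data.
import numpy as np
from sklearn.linear_model import RidgeCV
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_images, n_features, n_voxels = 500, 1024, 200

# Placeholder for features from one layer of a hierarchical model
# (e.g., mid-level activations flattened per image).
X = rng.standard_normal((n_images, n_features))

# Placeholder for PPA voxel responses to the same images (images x voxels).
Y = rng.standard_normal((n_images, n_voxels))

X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.2, random_state=0
)

# Ridge regression with a small grid of penalties, fit jointly for all voxels.
encoder = RidgeCV(alphas=np.logspace(-2, 4, 7))
encoder.fit(X_train, Y_train)

# Evaluate prediction accuracy per voxel as the correlation between
# predicted and held-out responses.
Y_pred = encoder.predict(X_test)
r_per_voxel = np.array([
    np.corrcoef(Y_pred[:, v], Y_test[:, v])[0, 1] for v in range(n_voxels)
])
print(f"median held-out correlation: {np.median(r_per_voxel):.3f}")
```

With real data, the per-voxel correlations (or a comparable cross-validated metric) are what allow model layers of different complexity to be compared against the PPA's responses.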


Author(s):  
Ye Lin ◽  
Yanyang Li ◽  
Tengbo Liu ◽  
Tong Xiao ◽  
Tongran Liu ◽  
...  

8-bit integer inference, a promising direction for reducing both the latency and storage of deep neural networks, has made great progress recently. However, previous systems still rely on 32-bit floating point for certain functions in complex models (e.g., Softmax in Transformer) and make heavy use of quantization and de-quantization. In this work, we show that after a principled modification of the Transformer architecture, dubbed Integer Transformer, an (almost) fully 8-bit integer inference algorithm, Scale Propagation, can be derived. De-quantization is adopted only when necessary, which makes the network more efficient. Our experiments on the WMT16 En<->Ro, WMT14 En<->De and En->Fr translation tasks, as well as the WikiText-103 language modelling task, show that the fully 8-bit Transformer system achieves performance comparable to the floating-point baseline while requiring a nearly 4x smaller memory footprint.
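To make the quantization, de-quantization, and scale-propagation ideas concrete, the sketch below shows a toy int8 linear layer in which the matmul is accumulated in int32, the output scale is carried forward as the product of the input scales, and de-quantization back to float happens only once at the end. This is an illustrative assumption-laden example, not the paper's Scale Propagation algorithm or Integer Transformer implementation.

```python
# Toy 8-bit integer inference with scale propagation (illustrative only,
# not the paper's Integer Transformer / Scale Propagation code).
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor quantization: x ~ scale * q, with q in int8."""
    scale = np.max(np.abs(x)) / 127.0 + 1e-12
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def int8_matmul(q_a, scale_a, q_b, scale_b):
    """Integer matmul with int32 accumulation. The output scale is simply the
    product of the input scales (scale propagation), so no intermediate
    de-quantization is needed."""
    acc = q_a.astype(np.int32) @ q_b.astype(np.int32)
    return acc, scale_a * scale_b

def dequantize(acc, scale):
    """Convert the int32 accumulator back to float once, at the output."""
    return acc.astype(np.float32) * scale

# Toy example: a single linear layer.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 16)).astype(np.float32)   # activations
w = rng.standard_normal((16, 8)).astype(np.float32)   # weights

q_x, s_x = quantize_int8(x)
q_w, s_w = quantize_int8(w)
acc, s_out = int8_matmul(q_x, s_x, q_w, s_w)
y_int8_path = dequantize(acc, s_out)

y_float_path = x @ w
print("max abs error vs. float32:", np.max(np.abs(y_int8_path - y_float_path)))
```

The memory saving in the abstract follows from storing int8 values in place of 32-bit floats (4 bytes down to 1 byte per value), while the deferred de-quantization keeps most of the computation in the integer domain.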

