1-bit stochastic gradient descent and its application to data-parallel distributed training of speech DNNs

Author(s): Frank Seide, Hao Fu, Jasha Droppo, Gang Li, Dong Yu