Abstract: Quantization is a widely used technique to compress neural networks. Assigning uniform bit-widths across all layers can result in significant accuracy degradation at low precision and ...
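The uniform bit-width scheme the abstract contrasts with mixed precision can be sketched as symmetric uniform quantization applied with the same bit-width everywhere. This is a generic illustration, not the paper's algorithm; `quantize_uniform` and the sample weights are hypothetical:

```python
import numpy as np

def quantize_uniform(w: np.ndarray, bits: int) -> np.ndarray:
    """Symmetric uniform quantization: map weights to a signed integer
    grid with 2**(bits-1) - 1 positive levels, then dequantize."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(w)) / qmax          # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale                          # dequantized approximation of w

w = np.array([0.5, -1.0, 0.25, 0.9])
err8 = np.max(np.abs(w - quantize_uniform(w, 8)))  # high precision: tiny error
err2 = np.max(np.abs(w - quantize_uniform(w, 2)))  # low precision: large error
```

At 2 bits the grid has only one positive level, so values like 0.5 collapse to 0 or 1.0 and the reconstruction error is large; this is the low-precision degradation that motivates assigning different bit-widths per layer.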