ENOT Reference Documentation
You can find the ENOT reference documentation here. If you find the documentation lacking, please contact the ENOT team so we can complete it or make it clearer.
Before proceeding to the examples or the documentation, we recommend reading this page carefully.
ENOT framework supports quantization and pruning.
Neural network quantization replaces the floating-point weights of a network (usually stored in float32 format) with discrete values. Quantization decreases model size by a factor of 4 by using the int8 data type. On NVIDIA GPUs and Intel CPUs, quantization may also noticeably decrease inference time.
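To make the idea concrete, here is a minimal NumPy sketch of symmetric int8 quantization. This is an illustration of the general technique only, not ENOT's API; the layer shape and the use of the maximum absolute value as the scale are assumptions for the example.

```python
import numpy as np

# Float32 weights of a toy layer (shape is arbitrary for this sketch).
weights = np.random.randn(64, 64).astype(np.float32)

# Symmetric quantization: map [-max|w|, +max|w|] onto the int8 range [-127, 127].
scale = np.abs(weights).max() / 127.0
quantized = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)

# Dequantize to inspect the approximation error introduced by quantization.
dequantized = quantized.astype(np.float32) * scale

print(weights.nbytes / quantized.nbytes)  # 4.0 -- the 4x size reduction
print(np.abs(weights - dequantized).max())  # at most half a quantization step
```

The size reduction is exactly 4x because each float32 weight occupies 4 bytes while each int8 weight occupies 1 byte; the price is a bounded rounding error of at most half a quantization step per weight.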
Neural network pruning removes redundant (or least important) channels or features from a neural network. Our framework implements structured pruning: it removes whole channels or neurons rather than individual connections.
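The following NumPy sketch illustrates what structured channel pruning means in general; it is not ENOT's implementation. The L1-norm importance criterion, the layer shape, and the 25% pruning ratio are assumptions chosen for the example.

```python
import numpy as np

# Weight tensor of a toy conv layer: (out_channels, in_channels, kh, kw).
weight = np.random.randn(16, 8, 3, 3).astype(np.float32)

# Score each output channel by the L1 norm of its filter -- a common
# importance proxy in structured pruning.
importance = np.abs(weight).reshape(16, -1).sum(axis=1)

# Keep the 12 most important channels, i.e. prune 25% of them,
# preserving the original channel order.
keep = np.sort(np.argsort(importance)[-12:])
pruned = weight[keep]

print(pruned.shape)  # (12, 8, 3, 3) -- whole channels removed, not single weights
```

Because entire channels are removed, the pruned tensor is still a dense, regularly shaped array, which is why structured pruning yields real speedups on standard hardware without sparse kernels.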
To improve the metrics of a baseline model or to fine-tune a model, it may be helpful to try our optimizer.
Before reading the documentation, we recommend looking through the Tutorials to clarify the basic notions and concepts of the framework.
Table of contents
Packages are listed in descending order of importance.