Latency optimization package¶
The enot.latency package contains utilities that help you profile your
model, measure model or search space latency, or obtain statistical
information about the latency distribution in a search space.
Main tools¶
- initialize_latency(latency_type, search_space, inputs, keyword_inputs=None, **kwargs)¶
Initializes latency of type latency_type in a search space. To calculate the latency of a module, either the module should inherit from LatencyMixin, or latency_type should correspond to an available SearchSpaceLatencyCalculator. To list the available latency_type values (calculators), use the available_calculators() function. If the LatencyMixin interface is implemented, it is always used for the calculation.
- Parameters
latency_type (str) – The type of latency to be initialized in search space.
search_space (SearchSpaceModel) – Search space for latency calculation.
inputs (Tuple) – Model input.
keyword_inputs (Optional[Dict[str, Any]]) – Model keyword input arguments.
**kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.
- Returns
Calculated latency of search_space as a SearchSpaceLatencyContainer. This container can be analyzed with the statistical tools from this module.
- Return type
SearchSpaceLatencyContainer
To select latency_type, see the Search space latency calculators section.
- available_calculators()¶
Returns the available search space calculators as strings.
- Returns
List of available calculators.
- Return type
- reset_latency(search_space)¶
Resets all latency parameters of the search space.
- Parameters
search_space (SearchSpaceModel) – Search space whose latency will be reset.
- Return type
SearchSpaceLatencyContainer¶
- class SearchSpaceLatencyContainer(latency_type, constant_latency, operations_latencies)¶
Latency storage for SearchSpaceModel.
- __init__(latency_type, constant_latency, operations_latencies)¶
- Parameters
latency_type (str) – Type of latency that container holds.
constant_latency (float) – Constant latency of SearchSpaceModel.
operations_latencies (List[List[float]]) – Latencies of all operations in NAS blocks.
- property constant_latency: float¶
Returns the constant latency of SearchSpaceModel.
- Return type
float
- classmethod load_from_bytes(data)¶
Creates a SearchSpaceLatencyContainer from a bytes object.
- Parameters
data (bytes) – Bytes object from which container will be created.
- Returns
- Return type
- classmethod load_from_file(filename)¶
Creates a SearchSpaceLatencyContainer from a file.
- Parameters
filename (Union[str, Path]) – Name of the file with the dumped SearchSpaceLatencyContainer.
- Returns
- Return type
- property operations_latencies: List[List[float]]¶
Returns latencies of all operations in NAS blocks.
Statistical tools for SearchSpaceLatencyContainer¶
- min_latency(arg)¶
Sum of minimum latencies over all containers supplemented by constant part.
- Return type
- max_latency(arg)¶
Sum of maximum latencies over all containers supplemented by constant part.
- Return type
- mean_latency(arg)¶
Sum of mean latencies over all containers supplemented by constant part.
- Return type
- median_latency(arg, n=100)¶
Computes the median latency of n sampled architectures of a SearchSpaceModel or SearchSpaceLatencyContainer.
- sample_latencies(arg, n=100)¶
Returns the latencies of n sampled architectures of a SearchSpaceModel or SearchSpaceLatencyContainer.
- current_latency(arg, arch=None)¶
Returns the current latency of a SearchSpaceModel or SearchSpaceLatencyContainer.
- best_arch_latency(search_space)¶
Returns latency of best architecture of search space.
- Parameters
search_space (SearchSpaceModel) – SearchSpaceModel for calculating latency of best architecture. Best architecture is taken from search space.
- Returns
- Return type
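The statistics above can be reproduced by hand from the two fields of a container. The following is a minimal sketch, not the library's implementation. It assumes that an architecture picks exactly one operation per NAS block, that its latency is the constant part plus the latencies of the picked operations, and that architectures are sampled uniformly at random. All function names and the example numbers are hypothetical.

```python
import random
import statistics
from typing import List

# Hypothetical stand-ins for the two container fields described above.
constant_latency: float = 2.0
operations_latencies: List[List[float]] = [
    [1.0, 3.0, 5.0],  # block 0: latency of each candidate operation
    [2.0, 4.0],       # block 1
]


def min_latency_sketch(constant: float, ops: List[List[float]]) -> float:
    """Constant part plus the cheapest operation of every block."""
    return constant + sum(min(block) for block in ops)


def max_latency_sketch(constant: float, ops: List[List[float]]) -> float:
    """Constant part plus the most expensive operation of every block."""
    return constant + sum(max(block) for block in ops)


def mean_latency_sketch(constant: float, ops: List[List[float]]) -> float:
    """Constant part plus the mean operation latency of every block."""
    return constant + sum(statistics.mean(block) for block in ops)


def sample_latencies_sketch(
    constant: float, ops: List[List[float]], n: int = 100, seed: int = 0
) -> List[float]:
    """Latencies of n architectures sampled uniformly (an assumption)."""
    rng = random.Random(seed)
    return [constant + sum(rng.choice(block) for block in ops) for _ in range(n)]


def median_latency_sketch(
    constant: float, ops: List[List[float]], n: int = 100, seed: int = 0
) -> float:
    """Median over the sampled architecture latencies."""
    return statistics.median(sample_latencies_sketch(constant, ops, n, seed))


lowest = min_latency_sketch(constant_latency, operations_latencies)    # 5.0
highest = max_latency_sketch(constant_latency, operations_latencies)   # 11.0
average = mean_latency_sketch(constant_latency, operations_latencies)  # 8.0
samples = sample_latencies_sketch(constant_latency, operations_latencies)
median = median_latency_sketch(constant_latency, operations_latencies)
```

Every sampled latency lies between the minimum and the maximum, which makes these two values useful bounds when choosing a latency constraint.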
latency_mixin¶
- class LatencyMixin¶
Base class for operations with latency.
Implement latency calculators as methods of the form def latency_<name>(self, inputs) -> float: … and then call them as op.forward_latency(inputs, '<name>').
- forward_latency(inputs, latency_type)¶
- Parameters
inputs (Tuple[Tensor, ...]) – Inputs.
latency_type (str) – Type of latency to compute. The user must implement the corresponding latency_<latency_type> method after inheriting from this class.
- Returns
Latency of the user-defined op.
- Return type
- Raises
ValueError – If the user has not implemented the latency_<latency_type> method in the inheriting class.
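The naming convention described for LatencyMixin amounts to a name-based dispatch. The sketch below re-creates that pattern in a standalone class so it can be run without the library. LatencyMixinSketch and ScaleOp are hypothetical names, and the real class may differ in details such as the error message.

```python
from typing import Any, Tuple


class LatencyMixinSketch:
    """Hypothetical re-creation of the dispatch pattern, for illustration only."""

    def forward_latency(self, inputs: Tuple[Any, ...], latency_type: str) -> float:
        # Look up a method named latency_<latency_type> on the subclass.
        method = getattr(self, f"latency_{latency_type}", None)
        if method is None:
            raise ValueError(
                f"latency_{latency_type} is not implemented in {type(self).__name__}"
            )
        return float(method(inputs))


class ScaleOp(LatencyMixinSketch):
    """Toy op that multiplies every element of its input by a constant."""

    def latency_mac(self, inputs: Tuple[Any, ...]) -> float:
        # One multiplication per element of the first input.
        return float(len(inputs[0]))


op = ScaleOp()
mac = op.forward_latency(([1.0, 2.0, 3.0],), "mac")  # dispatches to latency_mac
```

Requesting a latency type without a matching method, for example op.forward_latency(inputs, "ms") here, raises ValueError.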
Search space latency calculators¶
We support two types of calculators.
The first type counts the number of multiply-accumulate (mac) operations
(calculators inherited from SearchSpaceCommonCalculator).
The second type estimates the actual latency (in ms) of operations on the target device.
The number of multiply-accumulate operations may correlate poorly with latency (in ms) on a device
because of differences in compiler optimizations, device architectures, etc.
If you know which type of device you want to optimize the neural network for, and you can run a calculator on it, use
the second type of calculators: SearchSpacePytorchCudaLatencyCalculator,
SearchSpacePytorchCpuLatencyCalculator.
Otherwise, use the mac calculators, which roughly estimate the complexity of a neural network
on an abstract device.
Use the initialize_latency() function to calculate the latency of a SearchSpaceModel.
Pass the latency_type corresponding to the selected calculator (see available_calculators()).
- class SearchSpaceLatencyCalculator(search_space, **kwargs)¶
Search space latency calculator interface.
- __init__(search_space, **kwargs)¶
Inits SearchSpaceLatencyCalculator with SearchSpaceModel.
- Parameters
search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.
**kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.
- abstract compute(inputs, keyword_inputs=None)¶
Computes latency of SearchSpaceModel.
- Parameters
inputs (Tuple) – Model input.
keyword_inputs (Optional[Dict[str, Any]]) – Model keyword input arguments.
- Returns
- Return type
- class SearchSpaceCommonCalculator(search_space, **kwargs)¶
Bases: enot.latency.search_space_latency_calculator.SearchSpaceLatencyCalculator
Search space common latency calculator.
The calculator is based on the LatencyMixin and SearchVariantsContainer classes, but it can also calculate the latency of modules that the built-in calculator does not support, with the help of third-party calculators.
- __init__(search_space, **kwargs)¶
Inits SearchSpaceLatencyCalculator with SearchSpaceModel.
- Parameters
search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.
**kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.
- compute(inputs, keyword_inputs=None)¶
Computes latency of SearchSpaceModel.
- Parameters
inputs (Tuple) – Model input.
keyword_inputs (Optional[Dict[str, Any]]) – Model keyword input arguments.
- Returns
- Return type
- class SearchSpaceMacCalculator(search_space, **kwargs)¶
Bases: enot.latency.search_space_latency_calculator.SearchSpaceCommonCalculator
Search space MAC calculator based on LatencyMixin only.
- __init__(search_space, **kwargs)¶
Inits SearchSpaceLatencyCalculator with SearchSpaceModel.
- Parameters
search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.
**kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.
- class SearchSpaceMacThopCalculator(search_space, **kwargs)¶
Bases: enot.latency.search_space_latency_calculator.SearchSpaceCommonCalculator
Search space MAC calculator based on LatencyMixin and the thop third-party calculator.
- __init__(search_space, **kwargs)¶
Inits SearchSpaceLatencyCalculator with SearchSpaceModel.
- Parameters
search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.
**kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.
- class SearchSpaceMacPthflopsCalculator(search_space, **kwargs)¶
Bases: enot.latency.search_space_latency_calculator.SearchSpaceCommonCalculator
Search space MAC calculator based on LatencyMixin and the pthflops third-party calculator.
- __init__(search_space, **kwargs)¶
Inits SearchSpaceLatencyCalculator with SearchSpaceModel.
- Parameters
search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.
**kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.
- class SearchSpaceMacFvcoreCalculator(search_space, **kwargs)¶
Bases: enot.latency.search_space_latency_calculator.SearchSpaceCommonCalculator
Search space MAC calculator based on LatencyMixin and the fvcore third-party calculator.
- __init__(search_space, **kwargs)¶
Inits SearchSpaceLatencyCalculator with SearchSpaceModel.
- Parameters
search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.
**kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.
- class SearchSpacePytorchLatencyCalculator(search_space, **kwargs)¶
Bases: enot.latency.search_space_latency_calculator.SearchSpaceLatencyCalculator
PyTorch latency calculator for SearchSpaceModel. The calculator measures the latency (time in ms) of SearchSpaceModel and supports two types of devices: cpu and cuda.
- __init__(search_space, **kwargs)¶
- Parameters
search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.
**kwargs – Keyword arguments for SearchSpacePytorchLatencyCalculator. Pass warmup_iterations to set the number of warmup iterations before measuring (default 10). Pass run_iterations to set the number of measured iterations (default 10). Pass get_base_samples to provide a function that accepts inputs and keyword_inputs and returns the number of samples. The default implementation assumes that inputs is a Tuple of torch.Tensor and that the number of samples does not depend on keyword_inputs: inputs[0].shape[0].
- compute(inputs, keyword_inputs=None)¶
Computes latency of SearchSpaceModel.
- Parameters
inputs (Tuple) – Model input.
keyword_inputs (Optional[Dict[str, Any]]) – Model keyword input arguments.
- Returns
- Return type
- class SearchSpacePytorchCpuLatencyCalculator(search_space, **kwargs)¶
Bases: enot.latency.search_space_latency_calculator.SearchSpacePytorchLatencyCalculator
- __init__(search_space, **kwargs)¶
- Parameters
search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.
**kwargs – Keyword arguments for SearchSpacePytorchLatencyCalculator. Pass warmup_iterations to set the number of warmup iterations before measuring (default 10). Pass run_iterations to set the number of measured iterations (default 10). Pass get_base_samples to provide a function that accepts inputs and keyword_inputs and returns the number of samples. The default implementation assumes that inputs is a Tuple of torch.Tensor and that the number of samples does not depend on keyword_inputs: inputs[0].shape[0].
- class SearchSpacePytorchCudaLatencyCalculator(search_space, **kwargs)¶
Bases: enot.latency.search_space_latency_calculator.SearchSpacePytorchLatencyCalculator
- __init__(search_space, **kwargs)¶
- Parameters
search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.
**kwargs – Keyword arguments for SearchSpacePytorchLatencyCalculator. Pass warmup_iterations to set the number of warmup iterations before measuring (default 10). Pass run_iterations to set the number of measured iterations (default 10). Pass get_base_samples to provide a function that accepts inputs and keyword_inputs and returns the number of samples. The default implementation assumes that inputs is a Tuple of torch.Tensor and that the number of samples does not depend on keyword_inputs: inputs[0].shape[0].
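The roles of warmup_iterations, run_iterations and get_base_samples can be illustrated with a plain timing loop. This is a rough sketch under stated assumptions, not the library's measurement code: the function name measure_latency_ms is hypothetical, and both the averaging over runs and the division by the number of samples are assumptions about how the result might be normalized. It times an ordinary Python callable so that it runs without PyTorch.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


def measure_latency_ms(
    fn: Callable[..., Any],
    inputs: Tuple[Any, ...],
    keyword_inputs: Optional[Dict[str, Any]] = None,
    warmup_iterations: int = 10,
    run_iterations: int = 10,
    get_base_samples: Optional[Callable[[Tuple[Any, ...], Optional[Dict[str, Any]]], int]] = None,
) -> float:
    """Average wall-clock time of fn in milliseconds (hypothetical helper)."""
    kwargs = keyword_inputs or {}

    # Warmup calls are executed but not timed, so caches and lazy
    # initialization do not distort the measurement.
    for _ in range(warmup_iterations):
        fn(*inputs, **kwargs)

    # On a CUDA device one would also call torch.cuda.synchronize()
    # before reading the clock; this toy example is CPU only.
    start = time.perf_counter()
    for _ in range(run_iterations):
        fn(*inputs, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0

    per_call_ms = elapsed_ms / run_iterations
    if get_base_samples is None:
        return per_call_ms
    # Assumption: normalize by the number of samples in the batch.
    return per_call_ms / get_base_samples(inputs, keyword_inputs)


calls = {"count": 0}


def toy_model(batch):
    calls["count"] += 1
    return [x * 2.0 for x in batch]


latency = measure_latency_ms(
    toy_model,
    ([1.0] * 8,),
    warmup_iterations=3,
    run_iterations=5,
    get_base_samples=lambda inputs, keyword_inputs: len(inputs[0]),
)
```

With PyTorch tensors, the documented default for get_base_samples corresponds to lambda inputs, keyword_inputs: inputs[0].shape[0].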
Latency calculators¶
- class LatencyCalculator¶
Base class for latency calculators.
- abstract calculate(model, inputs, keyword_inputs=None, ignore_modules=None, **options)¶
Calculates model latency.
- Parameters
model (torch.nn.Module) – Model for latency calculation.
inputs (Tuple[torch.Tensor, ...]) – Model input.
ignore_modules (Optional[List[Type[nn.Module]]]) – List of types of modules that will be ignored in latency calculation.
- Returns
Model latency.
- Return type
- class MacCalculator¶
Bases: enot.latency.latency_calculator.LatencyCalculator
Wrapper for third-party MAC calculators.
- calculate(model, inputs, keyword_inputs=None, ignore_modules=None, **options)¶
Calculates number of Multiply-Accumulate operations in model.
- Parameters
model (torch.nn.Module) – Model for MAC operations calculation.
inputs (Tuple[torch.Tensor, ...]) – Model inputs stored in tuple.
keyword_inputs (Optional[Dict[str, Any]]) – Model keyword inputs.
ignore_modules (Optional[List[Type[nn.Module]]]) – List of types of modules that will be ignored in MAC calculation.
- Returns
Number of MAC operations in model (in millions).
- Return type
- class MacCalculatorThop¶
Bases: enot.latency.latency_calculator.MacCalculator
Wrapper for https://github.com/Lyken17/pytorch-OpCounter
- calculate(model, inputs, keyword_inputs=None, ignore_modules=None, **options)¶
Calculates number of Multiply-Accumulate operations in model.
- Parameters
model (torch.nn.Module) – Model for MAC operations calculation.
inputs (Tuple[torch.Tensor, ...]) – Model inputs stored in tuple.
keyword_inputs (Optional[Dict[str, Any]]) – Model keyword inputs.
ignore_modules (Optional[List[Type[nn.Module]]]) – List of types of modules that will be ignored in MAC calculation.
- Returns
Number of MAC operations in model (in millions).
- Return type
- class MacCalculatorPthflops¶
Bases: enot.latency.latency_calculator.MacCalculator
Wrapper for https://github.com/1adrianb/pytorch-estimate-flops
- calculate(model, inputs, keyword_inputs=None, ignore_modules=None, **options)¶
Calculates number of Multiply-Accumulate operations in model.
- Parameters
model (torch.nn.Module) – Model for MAC operations calculation.
inputs (Tuple[torch.Tensor, ...]) – Model inputs stored in tuple.
keyword_inputs (Optional[Dict[str, Any]]) – Model keyword inputs.
ignore_modules (Optional[List[Type[nn.Module]]]) – List of types of modules that will be ignored in MAC calculation.
- Returns
Number of MAC operations in model (in millions).
- Return type
- class MacCalculatorFvcore¶
Bases: enot.latency.latency_calculator.MacCalculator
Wrapper for https://github.com/facebookresearch/fvcore
- calculate(model, inputs, keyword_inputs=None, ignore_modules=None, **options)¶
Calculates number of Multiply-Accumulate operations in model.
- Parameters
model (torch.nn.Module) – Model for MAC operations calculation.
inputs (Tuple[torch.Tensor, ...]) – Model inputs stored in tuple.
keyword_inputs (Optional[Dict[str, Any]]) – Model keyword inputs.
ignore_modules (Optional[List[Type[nn.Module]]]) – List of types of modules that will be ignored in MAC calculation.
- Returns
Number of MAC operations in model (in millions).
- Return type
mac_calculator¶
- conv_out_spatial_size(input_size, kernel_size, stride=(1, 1), padding=(0, 0), dilation=(1, 1))¶
Calculates output spatial size of convolution.
- Parameters
input_size (TRepeatableInt) – Height and width of input image.
kernel_size (TRepeatableInt) – Height and width of convolution kernel.
stride (TRepeatableInt) – Stride along the corresponding axes.
padding (TRepeatableInt) – Number of pixels for symmetrical padding.
dilation (TRepeatableInt) – Dilation pixel numbers.
- Returns
Height and width of the convolution output.
- Return type
Notes
All parameters of type TRepeatableInt are repeated as needed to form an [int, int] pair.
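For reference, the output size of a convolution along one axis is floor((input + 2 * padding - dilation * (kernel - 1) - 1) / stride) + 1, the same formula PyTorch documents for Conv2d. The sketch below applies it to both axes and repeats scalar arguments into pairs. The names to_pair, RepeatableInt and conv_out_spatial_size_sketch are hypothetical stand-ins, and the library function may differ in its handling of edge cases.

```python
from typing import Tuple, Union

# Hypothetical stand-in for TRepeatableInt: a single int or an (h, w) pair.
RepeatableInt = Union[int, Tuple[int, int]]


def to_pair(value: RepeatableInt) -> Tuple[int, int]:
    """Repeat a single int into an (h, w) pair; pass pairs through."""
    if isinstance(value, int):
        return (value, value)
    return (value[0], value[1])


def conv_out_spatial_size_sketch(
    input_size: RepeatableInt,
    kernel_size: RepeatableInt,
    stride: RepeatableInt = (1, 1),
    padding: RepeatableInt = (0, 0),
    dilation: RepeatableInt = (1, 1),
) -> Tuple[int, int]:
    """Output (height, width) of a 2D convolution."""
    out = []
    for i, k, s, p, d in zip(
        to_pair(input_size),
        to_pair(kernel_size),
        to_pair(stride),
        to_pair(padding),
        to_pair(dilation),
    ):
        # Standard convolution arithmetic with floor division.
        out.append((i + 2 * p - d * (k - 1) - 1) // s + 1)
    return (out[0], out[1])


halved = conv_out_spatial_size_sketch(224, 3, stride=2, padding=1)  # (112, 112)
valid = conv_out_spatial_size_sketch(32, 3)                         # (30, 30)
dilated = conv_out_spatial_size_sketch(32, 3, dilation=2)           # (28, 28)
```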
- conv_mac_count(spatial_size, kernel_size, stride, in_channels, padding=(0, 0), out_channels=None, groups=1, calculate_in_millions=True)¶
Calculates the number of multiply-accumulates and the output shape of a convolution.
- Parameters
spatial_size (TRepeatableInt) – Height and width of input image.
kernel_size (TRepeatableInt) – Height and width of convolution kernel.
stride (TRepeatableInt) – Stride along the corresponding axes.
in_channels (int) – Number of input channels.
padding (TRepeatableInt) – Number of pixels for symmetrical padding.
out_channels (int) – Number of output channels.
groups (int) – Number of groups.
calculate_in_millions (bool) – Whether to follow the common practice of reporting MACs in millions.
- Returns
MACs, and the output height and width of the image.
- Return type
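The MAC count of a convolution follows from its output size: each output element of each output channel needs kernel_h * kernel_w * (in_channels / groups) multiply-accumulates. The sketch below is a minimal illustration of that arithmetic, not the library code. It takes explicit (h, w) pairs, assumes no dilation, assumes out_channels defaults to in_channels when omitted, and assumes that calculate_in_millions means dividing the count by 1e6. The function name is hypothetical.

```python
from typing import Optional, Tuple


def conv_mac_count_sketch(
    spatial_size: Tuple[int, int],
    kernel_size: Tuple[int, int],
    stride: Tuple[int, int],
    in_channels: int,
    padding: Tuple[int, int] = (0, 0),
    out_channels: Optional[int] = None,
    groups: int = 1,
    calculate_in_millions: bool = True,
) -> Tuple[float, int, int]:
    """Return (macs, out_height, out_width) for a 2D convolution."""
    if out_channels is None:
        # Assumption: keep the channel count when out_channels is omitted.
        out_channels = in_channels

    # Output size without dilation.
    out_h = (spatial_size[0] + 2 * padding[0] - kernel_size[0]) // stride[0] + 1
    out_w = (spatial_size[1] + 2 * padding[1] - kernel_size[1]) // stride[1] + 1

    # Each output element sums over the kernel window and the input
    # channels of its group.
    macs_per_output = kernel_size[0] * kernel_size[1] * (in_channels // groups)
    macs = float(out_h * out_w * out_channels * macs_per_output)

    if calculate_in_millions:
        macs /= 1e6  # assumption: "in millions" means dividing by 1e6
    return macs, out_h, out_w


# Regular 3x3 convolution, 16 -> 32 channels on a 32x32 input.
dense = conv_mac_count_sketch(
    (32, 32), (3, 3), (1, 1), 16, padding=(1, 1), out_channels=32,
    calculate_in_millions=False,
)
# Depthwise 3x3 convolution: groups equals the number of channels.
depthwise = conv_mac_count_sketch(
    (32, 32), (3, 3), (1, 1), 16, padding=(1, 1), groups=16,
    calculate_in_millions=False,
)
```

Here the dense convolution needs 4,718,592 MACs and the depthwise one 147,456, which shows how grouping reduces the cost.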
- mib_mac_count(spatial_size, kernel_size, expand_ratio, stride, in_channels, out_channels, padding=None, calculate_in_millions=True)¶
Calculates Mobile Inverted Bottleneck MAC operations.
Notes
https://arxiv.org/pdf/1801.04381.pdf
- Parameters
spatial_size (TRepeatableInt) – Height and width of input image.
kernel_size (TRepeatableInt) – Height and width of convolution kernel.
expand_ratio (float) – Expansion ratio after the first conv1x1 in MIB. The size after expansion equals round(expand_ratio * in_channels). If None, MIB behaves as DWS + conv1x1.
stride (TRepeatableInt) – Stride along the corresponding axes.
in_channels (int) – Number of input channels.
out_channels (int) – Number of output channels.
padding (TRepeatableInt) – Number of pixels for symmetrical padding.
calculate_in_millions (bool) – Whether to follow the common practice of reporting MACs in millions.
- Returns
Number of MACs.
- Return type
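A Mobile Inverted Bottleneck can be costed as three convolutions in sequence: a 1x1 expansion, a depthwise convolution that applies the stride, and a 1x1 projection. The sketch below adds up their MACs under several assumptions that are not taken from the library: square integer kernel and stride, padding defaulting to kernel_size // 2, no squeeze-and-excitation or residual cost, and "in millions" meaning division by 1e6. The function name is hypothetical.

```python
from typing import Optional, Tuple


def mib_mac_count_sketch(
    spatial_size: Tuple[int, int],
    kernel_size: int,
    expand_ratio: Optional[float],
    stride: int,
    in_channels: int,
    out_channels: int,
    padding: Optional[int] = None,
    calculate_in_millions: bool = True,
) -> float:
    """MACs of expansion + depthwise + projection convolutions."""
    height, width = spatial_size
    if padding is None:
        padding = kernel_size // 2  # assumption: "same"-style padding

    out_h = (height + 2 * padding - kernel_size) // stride + 1
    out_w = (width + 2 * padding - kernel_size) // stride + 1

    macs = 0
    hidden = in_channels
    if expand_ratio is not None:
        # 1x1 expansion at the input resolution.
        hidden = round(expand_ratio * in_channels)
        macs += height * width * in_channels * hidden

    # Depthwise convolution: one kernel per hidden channel.
    macs += out_h * out_w * kernel_size * kernel_size * hidden
    # 1x1 projection to the output channels.
    macs += out_h * out_w * hidden * out_channels

    return macs / 1e6 if calculate_in_millions else float(macs)


expanded = mib_mac_count_sketch((16, 16), 3, 6, 1, 8, 8, calculate_in_millions=False)
separable = mib_mac_count_sketch((16, 16), 3, None, 1, 8, 8, calculate_in_millions=False)
```

With expand_ratio set to None the expansion term disappears, leaving the depthwise-separable (DWS + conv1x1) cost.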
- rn_mac_count(spatial_size, squeeze_kernel_size, expand_kernel_size, expand_ratio, stride, in_channels, out_channels, padding=None, calculate_in_millions=True)¶
Calculates MAC for ResNet-v2 block.
Notes
See https://arxiv.org/pdf/1603.05027.pdf
- Parameters
spatial_size (TRepeatableInt) – Height and width of input image.
squeeze_kernel_size (TRepeatableInt) – Height and width of convolution kernel which reduces channel number.
expand_kernel_size (TRepeatableInt) – Height and width of convolution kernel which expands channel number.
expand_ratio (float) – Expansion ratio after last conv.
stride (TRepeatableInt) – Stride along the corresponding axes.
in_channels (int) – Number of input channels.
out_channels (int) – Number of output channels.
padding (TRepeatableInt) – Number of pixels for symmetrical padding.
calculate_in_millions (bool) – Whether to follow the common practice of reporting MACs in millions.
- Returns
Number of MACs.
- Return type