Latency optimization package¶

The enot.latency package contains utilities for profiling a model, measuring model or search space latency, and obtaining statistics on the latency distribution in a search space.

Main tools¶

initialize_latency(latency_type, search_space, inputs, keyword_inputs=None, **kwargs)¶

Initializes latency of type latency_type in the search space. To calculate the latency of a module, either the module must inherit from LatencyMixin or latency_type must correspond to an available SearchSpaceLatencyCalculator. To list the available latency_type values (calculators), use the available_calculators() function. If a module implements the LatencyMixin interface, that implementation is always used for the calculation.

Parameters

latency_type (str) – The type of latency to be initialized in search space.
search_space (SearchSpaceModel) – Search space for latency calculation.
inputs (Tuple) – Model input.
keyword_inputs (Optional[Dict[str, Any]]) – Model keyword input arguments.
**kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.

Returns

Calculated latency of search_space as SearchSpaceLatencyContainer. This container can be analyzed with the help of statistical tools from this module.

Return type

SearchSpaceLatencyContainer

To select latency_type, see the Search space latency calculators section.

available_calculators()¶

Returns the available search space calculators as a string.

Returns: List of available calculators.
Return type: str

reset_latency(search_space)¶

Reset all latency parameters of search space.

Parameters: search_space (SearchSpaceModel) – Search space whose latency will be reset.
Return type: None

SearchSpaceLatencyContainer¶

class SearchSpaceLatencyContainer(latency_type, constant_latency, operations_latencies)¶

Latency storage for SearchSpaceModel.

__init__(latency_type, constant_latency, operations_latencies)¶

Parameters

latency_type (str) – Type of latency that container holds.
constant_latency (float) – Constant latency of SearchSpaceModel.
operations_latencies (List[List[float]]) – Latencies of all operations in NAS blocks.

property constant_latency: float¶

Returns constant latency of SearchSpaceModel.

Return type: float

property latency_type: str¶

Returns latency type.

Return type: str

classmethod load_from_bytes(data)¶

Creates SearchSpaceLatencyContainer from bytes object.

Parameters: data (bytes) – Bytes object from which container will be created.
Returns: Container restored from the bytes object.
Return type: SearchSpaceLatencyContainer

classmethod load_from_file(filename)¶

Creates SearchSpaceLatencyContainer from file.

Parameters: filename (Union[str, Path]) – Filename of file with dumped SearchSpaceLatencyContainer.
Returns: Container restored from the file.
Return type: SearchSpaceLatencyContainer

property operations_latencies: List[List[float]]¶

Returns latencies of all operations in NAS blocks.

Return type: List[List[float]]

save_to_bytes()¶

Dumps latency container to bytes object.

Return type: bytes

save_to_file(filename)¶

Saves latency container to file.

Parameters: filename (Union[str, Path]) – Filename of file for dumping SearchSpaceLatencyContainer.
Return type: None

Statistical tools for SearchSpaceLatencyContainer¶

min_latency(arg)¶

Sum of the minimum latencies over all containers, plus the constant part.

Return type: float

max_latency(arg)¶

Sum of the maximum latencies over all containers, plus the constant part.

Return type: float

mean_latency(arg)¶

Sum of the mean latencies over all containers, plus the constant part.

Return type: float
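
The three aggregates above can be illustrated with a minimal pure-Python sketch. It assumes that each inner list of operations_latencies holds the latencies of the candidate operations of one NAS block; the function names with the _sketch suffix and the sample data are hypothetical stand-ins, not the enot API.

```python
from statistics import mean
from typing import List


def min_latency_sketch(constant_latency: float, operations_latencies: List[List[float]]) -> float:
    """Constant part plus the cheapest operation of every block."""
    return constant_latency + sum(min(block) for block in operations_latencies)


def max_latency_sketch(constant_latency: float, operations_latencies: List[List[float]]) -> float:
    """Constant part plus the most expensive operation of every block."""
    return constant_latency + sum(max(block) for block in operations_latencies)


def mean_latency_sketch(constant_latency: float, operations_latencies: List[List[float]]) -> float:
    """Constant part plus the average operation latency of every block."""
    return constant_latency + sum(mean(block) for block in operations_latencies)


# Hypothetical data: two blocks with three and two candidate operations.
constant = 1.0
ops = [[2.0, 4.0, 6.0], [1.0, 3.0]]

print(min_latency_sketch(constant, ops))   # 1 + 2 + 1 = 4.0
print(max_latency_sketch(constant, ops))   # 1 + 6 + 3 = 10.0
print(mean_latency_sketch(constant, ops))  # 1 + 4 + 2 = 7.0
```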

median_latency(arg, n=100)¶

Compute median latency of n sampled architectures of SearchSpaceModel or SearchSpaceLatencyContainer.

Parameters: n (int) – Number of architectures to sample.
Return type: float

sample_latencies(arg, n=100)¶

Returns latencies of n sampled architectures of SearchSpaceModel or SearchSpaceLatencyContainer.

Parameters: n (int) – Number of architectures to sample.
Return type: List[float]
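
A rough sketch of sampling-based statistics follows. It assumes that an architecture is sampled by picking one operation per block uniformly at random; the sampling distribution actually used by the library is not stated here, and the function names and the seed parameter are hypothetical.

```python
import random
from statistics import median
from typing import List, Optional


def sample_latencies_sketch(
    constant_latency: float,
    operations_latencies: List[List[float]],
    n: int = 100,
    seed: Optional[int] = None,
) -> List[float]:
    """Latencies of n randomly sampled architectures (uniform choice per block)."""
    rng = random.Random(seed)
    return [
        constant_latency + sum(rng.choice(block) for block in operations_latencies)
        for _ in range(n)
    ]


def median_latency_sketch(
    constant_latency: float,
    operations_latencies: List[List[float]],
    n: int = 100,
    seed: Optional[int] = None,
) -> float:
    """Median over the sampled latencies."""
    return median(sample_latencies_sketch(constant_latency, operations_latencies, n, seed))


# Hypothetical data: two blocks with three and two candidate operations.
ops = [[2.0, 4.0, 6.0], [1.0, 3.0]]
samples = sample_latencies_sketch(1.0, ops, n=5, seed=0)
print(len(samples))
print(median_latency_sketch(1.0, ops, n=101, seed=0))
```

Because the estimate is based on random samples, a larger n gives a more stable median at the cost of more computation.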

current_latency(arg, arch=None)¶

Returns current latency of SearchSpaceModel or SearchSpaceLatencyContainer.

Parameters: arch (Optional[List[Union[List[int], int]]]) –
Return type: float
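
As an illustration only, the latency of one fixed architecture can be sketched as the constant part plus the latency of the selected operation in each block. The sketch assumes that arch holds one integer index per block; the nested-list form allowed by the type annotation is not covered, and arch_latency_sketch is a hypothetical name.

```python
from typing import List


def arch_latency_sketch(
    constant_latency: float,
    operations_latencies: List[List[float]],
    arch: List[int],
) -> float:
    """Constant part plus the latency of the chosen operation in every block."""
    if len(arch) != len(operations_latencies):
        raise ValueError("arch must contain exactly one index per block")
    return constant_latency + sum(
        block[index] for block, index in zip(operations_latencies, arch)
    )


# Hypothetical data: pick operation 1 in the first block and operation 0 in the second.
ops = [[2.0, 4.0, 6.0], [1.0, 3.0]]
print(arch_latency_sketch(1.0, ops, [1, 0]))  # 1 + 4 + 1 = 6.0
```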

best_arch_latency(search_space)¶

Returns latency of best architecture of search space.

Parameters: search_space (SearchSpaceModel) – SearchSpaceModel for calculating latency of best architecture. Best architecture is taken from search space.
Returns: Latency of the best architecture of the search space.
Return type: float

latency_mixin¶

class LatencyMixin[source]¶

Base class for operations with latency.

Implement latency calculators as methods of the form def latency_<name>(self, inputs) -> float: … and then call them as op.forward_latency(inputs, '<name>').

forward_latency(inputs, latency_type)[source]¶

Parameters

inputs (Tuple[Tensor, ...]) – Inputs.
latency_type (str) – Type of latency to compute. You must implement the corresponding latency_<latency_type> method yourself in the class that inherits from LatencyMixin.

Returns

Latency of user defined op.

Return type

float

Raises

ValueError – If the inheriting class does not implement the latency_<latency_type> method.
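
The name-based dispatch described for forward_latency can be sketched with a small stand-in class. LatencyMixinSketch, ScaleOp and the latency type name "mac" are hypothetical and only mimic the documented behaviour; this is not the enot implementation.

```python
from typing import Any, Tuple


class LatencyMixinSketch:
    """Stand-in that dispatches to latency_<latency_type> by name."""

    def forward_latency(self, inputs: Tuple[Any, ...], latency_type: str) -> float:
        method = getattr(self, f"latency_{latency_type}", None)
        if method is None:
            raise ValueError(
                f"{type(self).__name__} does not implement latency_{latency_type}"
            )
        return float(method(inputs))


class ScaleOp(LatencyMixinSketch):
    """Hypothetical operation that multiplies every input element by a constant."""

    def latency_mac(self, inputs: Tuple[Any, ...]) -> float:
        # One multiplication per element, reported in millions.
        (values,) = inputs
        return len(values) / 1e6


op = ScaleOp()
print(op.forward_latency(([0.0] * 1000,), "mac"))  # 0.001
```

Requesting a latency type without a matching method, for example op.forward_latency(inputs, "ms"), raises ValueError in this sketch, which mirrors the documented contract.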

Search space latency calculators¶

Two types of calculators are supported. The first type counts multiply-accumulate (MAC) operations (these calculators inherit from SearchSpaceCommonCalculator). The second type measures the actual execution time (latency in ms) of operations on the target device. The number of MAC operations may correlate poorly with on-device latency (in ms) because of differences in compiler optimizations, device architectures, etc. If you know which type of device you want to optimize the neural network for and you can run a calculator on it, use the second type: SearchSpacePytorchCudaLatencyCalculator or SearchSpacePytorchCpuLatencyCalculator. Otherwise, use the MAC calculators, which roughly estimate the complexity of a neural network on an abstract device.

Use the initialize_latency() function to calculate the latency of a SearchSpaceModel. Pass the latency_type corresponding to the selected calculator (see available_calculators()).

class SearchSpaceLatencyCalculator(search_space, **kwargs)¶

Search space latency calculator interface.

__init__(search_space, **kwargs)¶

Inits SearchSpaceLatencyCalculator with SearchSpaceModel.

Parameters

search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.
**kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.

abstract compute(inputs, keyword_inputs=None)¶

Computes latency of SearchSpaceModel.

Parameters

inputs (Tuple) – Model input.
keyword_inputs (Optional[Dict[str, Any]]) – Model keyword input arguments.

Returns

Calculated latency of the search space.

Return type

SearchSpaceLatencyContainer

class SearchSpaceCommonCalculator(search_space, **kwargs)¶

Bases: enot.latency.search_space_latency_calculator.SearchSpaceLatencyCalculator

Search space common latency calculator.

The calculator is based on the LatencyMixin and SearchVariantsContainer classes, and it can also calculate the latency of modules that the built-in calculator does not support, with the help of third-party calculators.

__init__(search_space, **kwargs)¶

Inits SearchSpaceLatencyCalculator with SearchSpaceModel.

Parameters

search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.
**kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.

compute(inputs, keyword_inputs=None)¶

Computes latency of SearchSpaceModel.

Parameters

inputs (Tuple) – Model input.
keyword_inputs (Optional[Dict[str, Any]]) – Model keyword input arguments.

Returns

Calculated latency of the search space.

Return type

SearchSpaceLatencyContainer

class SearchSpaceMacCalculator(search_space, **kwargs)¶

Bases: enot.latency.search_space_latency_calculator.SearchSpaceCommonCalculator

Search space MAC calculator based on LatencyMixin only.

__init__(search_space, **kwargs)¶

Inits SearchSpaceLatencyCalculator with SearchSpaceModel.

Parameters

search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.
**kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.

class SearchSpaceMacThopCalculator(search_space, **kwargs)¶

Bases: enot.latency.search_space_latency_calculator.SearchSpaceCommonCalculator

Search space MAC calculator based on LatencyMixin and thop third-party calculator.

__init__(search_space, **kwargs)¶

Inits SearchSpaceLatencyCalculator with SearchSpaceModel.

Parameters

search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.
**kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.

class SearchSpaceMacPthflopsCalculator(search_space, **kwargs)¶

Bases: enot.latency.search_space_latency_calculator.SearchSpaceCommonCalculator

Search space MAC calculator based on LatencyMixin and pthflops third-party calculator.

__init__(search_space, **kwargs)¶

Inits SearchSpaceLatencyCalculator with SearchSpaceModel.

Parameters

search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.
**kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.

class SearchSpaceMacFvcoreCalculator(search_space, **kwargs)¶

Bases: enot.latency.search_space_latency_calculator.SearchSpaceCommonCalculator

Search space MAC calculator based on LatencyMixin and fvcore third-party calculator.

__init__(search_space, **kwargs)¶

Inits SearchSpaceLatencyCalculator with SearchSpaceModel.

Parameters

search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.
**kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.

class SearchSpacePytorchLatencyCalculator(search_space, **kwargs)¶

Bases: enot.latency.search_space_latency_calculator.SearchSpaceLatencyCalculator

PyTorch latency calculator for SearchSpaceModel. The calculator measures the latency (time in ms) of SearchSpaceModel and supports two types of devices: cpu and cuda.

__init__(search_space, **kwargs)¶

Parameters

search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.
**kwargs – Keyword arguments for SearchSpacePytorchLatencyCalculator. Pass warmup_iterations to set number of warmup iterations before measuring, default 10. Pass run_iterations to set number of iterations for measuring, default 10. Pass get_base_samples to provide function that accepts inputs, keyword_inputs and returns samples number. Default implementation is based on the knowledge that inputs is Tuple of torch.Tensor and number of samples doesn’t depend on keyword_inputs: inputs[0].shape[0].
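
The role of warmup_iterations and run_iterations can be illustrated with a generic timing loop. This is a minimal sketch of the warm-up-then-measure pattern, not the calculator's actual code: measure_latency_ms is a hypothetical helper, it times an arbitrary Python callable, and it does not model get_base_samples.

```python
import time
from typing import Callable, List


def measure_latency_ms(
    fn: Callable[[], object],
    warmup_iterations: int = 10,
    run_iterations: int = 10,
) -> float:
    """Average wall-clock time of fn in milliseconds after a warm-up phase."""
    # Warm-up calls are executed but not timed (caches, lazy initialization).
    for _ in range(warmup_iterations):
        fn()
    start = time.perf_counter()
    for _ in range(run_iterations):
        fn()
    elapsed_seconds = time.perf_counter() - start
    return elapsed_seconds * 1000.0 / run_iterations


# Hypothetical workload: record every call so the iteration count is visible.
calls: List[int] = []
latency_ms = measure_latency_ms(lambda: calls.append(sum(range(1000))))
print(len(calls))  # 20 (10 warm-up calls plus 10 measured calls)
print(latency_ms >= 0.0)  # True
```

When timing CUDA kernels with PyTorch, the device generally has to be synchronized (for example with torch.cuda.synchronize()) before reading the clock, because kernel launches are asynchronous.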

compute(inputs, keyword_inputs=None)¶

Computes latency of SearchSpaceModel.

Parameters

inputs (Tuple) – Model input.
keyword_inputs (Optional[Dict[str, Any]]) – Model keyword input arguments.

Returns

Calculated latency of the search space.

Return type

SearchSpaceLatencyContainer

class SearchSpacePytorchCpuLatencyCalculator(search_space, **kwargs)¶

Bases: enot.latency.search_space_latency_calculator.SearchSpacePytorchLatencyCalculator

__init__(search_space, **kwargs)¶

Parameters

search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.
**kwargs – Keyword arguments for SearchSpacePytorchLatencyCalculator. Pass warmup_iterations to set number of warmup iterations before measuring, default 10. Pass run_iterations to set number of iterations for measuring, default 10. Pass get_base_samples to provide function that accepts inputs, keyword_inputs and returns samples number. Default implementation is based on the knowledge that inputs is Tuple of torch.Tensor and number of samples doesn’t depend on keyword_inputs: inputs[0].shape[0].

class SearchSpacePytorchCudaLatencyCalculator(search_space, **kwargs)¶

Bases: enot.latency.search_space_latency_calculator.SearchSpacePytorchLatencyCalculator

__init__(search_space, **kwargs)¶

Parameters

search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.
**kwargs – Keyword arguments for SearchSpacePytorchLatencyCalculator. Pass warmup_iterations to set number of warmup iterations before measuring, default 10. Pass run_iterations to set number of iterations for measuring, default 10. Pass get_base_samples to provide function that accepts inputs, keyword_inputs and returns samples number. Default implementation is based on the knowledge that inputs is Tuple of torch.Tensor and number of samples doesn’t depend on keyword_inputs: inputs[0].shape[0].

Latency calculators¶

class LatencyCalculator¶

Base class for latency calculators.

abstract calculate(model, inputs, keyword_inputs=None, ignore_modules=None, **options)¶

Calculates model latency.

Parameters

model (torch.nn.Module) – Model for latency calculation.
inputs (Tuple[torch.Tensor, ...]) – Model input.
keyword_inputs (Optional[Dict[str, Any]]) – Model keyword inputs.
ignore_modules (Optional[List[Type[nn.Module]]]) – List of types of modules that will be ignored in latency calculation.

Returns

Model latency.

Return type

float

class MacCalculator¶

Bases: enot.latency.latency_calculator.LatencyCalculator

Wrapper for third-party MAC calculators.

calculate(model, inputs, keyword_inputs=None, ignore_modules=None, **options)¶

Calculates number of Multiply-Accumulate operations in model.

Parameters

model (torch.nn.Module) – Model for MAC operations calculation.
inputs (Tuple[torch.Tensor, ...]) – Model inputs stored in tuple.
keyword_inputs (Optional[Dict[str, Any]]) – Model keyword inputs.
ignore_modules (Optional[List[Type[nn.Module]]]) – List of types of modules, that will be ignored in MAC calculation.

Returns

Number of MAC operations in model (in millions).

Return type

float

class MacCalculatorThop¶

Bases: enot.latency.latency_calculator.MacCalculator

Wrapper for https://github.com/Lyken17/pytorch-OpCounter

calculate(model, inputs, keyword_inputs=None, ignore_modules=None, **options)¶

Calculates number of Multiply-Accumulate operations in model.

Parameters

model (torch.nn.Module) – Model for MAC operations calculation.
inputs (Tuple[torch.Tensor, ...]) – Model inputs stored in tuple.
keyword_inputs (Optional[Dict[str, Any]]) – Model keyword inputs.
ignore_modules (Optional[List[Type[nn.Module]]]) – List of types of modules, that will be ignored in MAC calculation.

Returns

Number of MAC operations in model (in millions).

Return type

float

class MacCalculatorPthflops¶

Bases: enot.latency.latency_calculator.MacCalculator

Wrapper for https://github.com/1adrianb/pytorch-estimate-flops

calculate(model, inputs, keyword_inputs=None, ignore_modules=None, **options)¶

Calculates number of Multiply-Accumulate operations in model.

Parameters

model (torch.nn.Module) – Model for MAC operations calculation.
inputs (Tuple[torch.Tensor, ...]) – Model inputs stored in tuple.
keyword_inputs (Optional[Dict[str, Any]]) – Model keyword inputs.
ignore_modules (Optional[List[Type[nn.Module]]]) – List of types of modules, that will be ignored in MAC calculation.

Returns

Number of MAC operations in model (in millions).

Return type

float

class MacCalculatorFvcore¶

Bases: enot.latency.latency_calculator.MacCalculator

Wrapper for https://github.com/facebookresearch/fvcore

calculate(model, inputs, keyword_inputs=None, ignore_modules=None, **options)¶

Calculates number of Multiply-Accumulate operations in model.

Parameters

model (torch.nn.Module) – Model for MAC operations calculation.
inputs (Tuple[torch.Tensor, ...]) – Model inputs stored in tuple.
keyword_inputs (Optional[Dict[str, Any]]) – Model keyword inputs.
ignore_modules (Optional[List[Type[nn.Module]]]) – List of types of modules, that will be ignored in MAC calculation.

Returns

Number of MAC operations in model (in millions).

Return type

float

mac_calculator¶

conv_out_spatial_size(input_size, kernel_size, stride=(1, 1), padding=(0, 0), dilation=(1, 1))¶

Calculates output spatial size of convolution.

Parameters

input_size (TRepeatableInt) – Height and width of input image.
kernel_size (TRepeatableInt) – Height and width of convolution kernel.
stride (TRepeatableInt) – Stride along correspondent axes.
padding (TRepeatableInt) – Number of pixels for symmetrical padding.
dilation (TRepeatableInt) – Dilation pixel numbers.

Returns

Height and width of the convolution output.

Return type

Tuple[int, int]

Notes

Any parameter of type TRepeatableInt that is given as a single int is repeated to form an (int, int) pair.
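
For reference, the standard output-size formula for a 2D convolution (the same one given in the torch.nn.Conv2d documentation) can be sketched as follows. conv_out_size_sketch and the IntOrPair alias are hypothetical names used for illustration; whether the library function uses exactly this rounding is an assumption.

```python
from typing import Tuple, Union

IntOrPair = Union[int, Tuple[int, int]]


def _pair(value: IntOrPair) -> Tuple[int, int]:
    """Repeat a single int so that it covers both height and width."""
    return (value, value) if isinstance(value, int) else (value[0], value[1])


def conv_out_size_sketch(
    input_size: IntOrPair,
    kernel_size: IntOrPair,
    stride: IntOrPair = 1,
    padding: IntOrPair = 0,
    dilation: IntOrPair = 1,
) -> Tuple[int, int]:
    """out = floor((in + 2 * padding - dilation * (kernel - 1) - 1) / stride) + 1 per axis."""
    axes = zip(_pair(input_size), _pair(kernel_size), _pair(stride), _pair(padding), _pair(dilation))
    out = [(i + 2 * p - d * (k - 1) - 1) // s + 1 for i, k, s, p, d in axes]
    return out[0], out[1]


print(conv_out_size_sketch(224, 3, stride=2, padding=1))  # (112, 112)
```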

conv_mac_count(spatial_size, kernel_size, stride, in_channels, padding=(0, 0), out_channels=None, groups=1, calculate_in_millions=True)¶

Calculates the number of multiply-accumulate operations and the output shape of a convolution.

Parameters

spatial_size (TRepeatableInt) – Height and width of input image.
kernel_size (TRepeatableInt) – Height and width of convolution kernel.
stride (TRepeatableInt) – Stride along correspondent axes.
in_channels (int) – Number of input channels.
padding (TRepeatableInt) – Number of pixels for symmetrical padding.
out_channels (int) – Number of output channels.
groups (int) – Number of groups.
calculate_in_millions (bool) – Whether to apply common practice for calculation macs in millions.

Returns

MACs and the output height and width of the image.

Return type

Tuple[float, Tuple[int, int]]
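
A common way to count the MACs of a convolution is out_h * out_w * kernel_h * kernel_w * (in_channels / groups) * out_channels. The sketch below applies that formula under stated assumptions: dilation is 1, bias is ignored, and a missing out_channels falls back to in_channels. conv_macs_sketch is a hypothetical name, and the library function may differ in details.

```python
from typing import Optional, Tuple


def conv_macs_sketch(
    spatial_size: Tuple[int, int],
    kernel_size: Tuple[int, int],
    stride: Tuple[int, int],
    in_channels: int,
    padding: Tuple[int, int] = (0, 0),
    out_channels: Optional[int] = None,
    groups: int = 1,
    in_millions: bool = True,
) -> Tuple[float, Tuple[int, int]]:
    """MAC count and output size of a 2D convolution with dilation 1."""
    # Assumption: when out_channels is omitted, the channel number is preserved.
    out_channels = in_channels if out_channels is None else out_channels
    out_h = (spatial_size[0] + 2 * padding[0] - kernel_size[0]) // stride[0] + 1
    out_w = (spatial_size[1] + 2 * padding[1] - kernel_size[1]) // stride[1] + 1
    # Each output element needs kernel_h * kernel_w * (in_channels / groups) multiply-accumulates.
    macs = out_h * out_w * kernel_size[0] * kernel_size[1] * (in_channels // groups) * out_channels
    return (macs / 1e6 if in_millions else float(macs)), (out_h, out_w)


# 3x3 convolution, 16 -> 32 channels, on a 32x32 feature map with padding 1.
print(conv_macs_sketch((32, 32), (3, 3), (1, 1), 16, padding=(1, 1), out_channels=32))
# (4.718592, (32, 32))
```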

mib_mac_count(spatial_size, kernel_size, expand_ratio, stride, in_channels, out_channels, padding=None, calculate_in_millions=True)¶

Calculates Mobile Inverted Bottleneck MAC operations.

Notes

https://arxiv.org/pdf/1801.04381.pdf

Parameters

spatial_size (TRepeatableInt) – Height and width of input image.
kernel_size (TRepeatableInt) – Height and width of convolution kernel.
expand_ratio (float) – Expansion ratio after the first conv1x1 in the MIB. The size after expansion equals round(expand_ratio * in_channels). If None, the MIB behaves as DWS + conv1x1.
stride (TRepeatableInt) – Stride along correspondent axes.
in_channels (int) – Number of input channels.
out_channels (int) – Number of output channels.
padding (TRepeatableInt) – Number of pixels for symmetrical padding.
calculate_in_millions (bool) – Whether to apply common practice for calculation macs in millions.

Returns

Returns macs.

Return type

float
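
As a rough illustration, the MACs of a Mobile Inverted Bottleneck can be summed over its three convolutions as described in the MobileNetV2 paper: a 1x1 expansion, a depthwise k x k convolution, and a 1x1 projection. The sketch assumes square inputs and kernels, ignores normalization, activations and the residual addition, and does not cover expand_ratio=None; mib_macs_sketch is a hypothetical name and the library may count differently.

```python
def mib_macs_sketch(
    spatial_size: int,
    kernel_size: int,
    expand_ratio: float,
    stride: int,
    in_channels: int,
    out_channels: int,
    padding: int,
    in_millions: bool = True,
) -> float:
    """MACs of expansion 1x1 + depthwise k x k + projection 1x1 (square shapes only)."""
    hidden = round(expand_ratio * in_channels)
    # 1x1 expansion keeps the spatial size and maps in_channels -> hidden.
    expand = spatial_size * spatial_size * in_channels * hidden
    # Depthwise convolution: one k x k filter per hidden channel, applies the stride.
    out_size = (spatial_size + 2 * padding - kernel_size) // stride + 1
    depthwise = out_size * out_size * kernel_size * kernel_size * hidden
    # 1x1 projection maps hidden -> out_channels on the reduced feature map.
    project = out_size * out_size * hidden * out_channels
    total = expand + depthwise + project
    return total / 1e6 if in_millions else float(total)


# 16x16 feature map, 3x3 depthwise kernel, expansion 6, 8 -> 8 channels.
print(mib_macs_sketch(16, 3, 6.0, 1, 8, 8, padding=1))  # 0.3072
```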

rn_mac_count(spatial_size, squeeze_kernel_size, expand_kernel_size, expand_ratio, stride, in_channels, out_channels, padding=None, calculate_in_millions=True)¶

Calculates MAC for ResNet-v2 block.

Notes

See https://arxiv.org/pdf/1603.05027.pdf

Parameters

spatial_size (TRepeatableInt) – Height and width of input image.
squeeze_kernel_size (TRepeatableInt) – Height and width of convolution kernel which reduces channel number.
expand_kernel_size (TRepeatableInt) – Height and width of convolution kernel which expands channel number.
expand_ratio (float) – Expansion ratio after last conv.
stride (TRepeatableInt) – Stride along correspondent axes.
in_channels (int) – Number of input channels.
out_channels (int) – Number of output channels.
padding (TRepeatableInt) – Number of pixels for symmetrical padding.
calculate_in_millions (bool) – Whether to apply common practice for calculation macs in millions.

Returns

Returns macs.

Return type

float
