Latency optimization package

The enot.latency package contains utilities for profiling a model, measuring the latency of a model or a search space, and obtaining statistics about the latency distribution in a search space.

Main tools

initialize_latency(latency_type, search_space, inputs, keyword_inputs=None, **kwargs)

Initializes latency of type latency_type in the search space. To calculate the latency of a module, either the module must inherit from LatencyMixin, or latency_type must correspond to an available SearchSpaceLatencyCalculator. To list the available latency_type values (calculators), use the available_calculators() function. If the LatencyMixin interface is implemented, it is always used for the calculation.

Parameters
  • latency_type (str) – The type of latency to be initialized in search space.

  • search_space (SearchSpaceModel) – Search space for latency calculation.

  • inputs (Tuple) – Model input.

  • keyword_inputs (Optional[Dict[str, Any]]) – Model keyword input arguments.

  • **kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.

Returns

Calculated latency of search_space as SearchSpaceLatencyContainer. This container can be analyzed with the help of statistical tools from this module.

Return type

SearchSpaceLatencyContainer

To select latency_type, see the Search space latency calculators section.

available_calculators()

Returns the available search space latency calculators as a string.

Returns

List of available calculators.

Return type

str

reset_latency(search_space)

Reset all latency parameters of search space.

Parameters

search_space (SearchSpaceModel) – Search space whose latency will be reset.

Return type

None

SearchSpaceLatencyContainer

class SearchSpaceLatencyContainer(latency_type, constant_latency, operations_latencies)

Latency storage for SearchSpaceModel.

__init__(latency_type, constant_latency, operations_latencies)
Parameters
  • latency_type (str) – Type of latency that container holds.

  • constant_latency (float) – Constant latency of SearchSpaceModel.

  • operations_latencies (List[List[float]]) – Latencies of all operations in NAS blocks.

property constant_latency: float

Returns constant latency of SearchSpaceModel.

Return type

float

property latency_type: str

Returns latency type.

Return type

str

classmethod load_from_bytes(data)

Creates SearchSpaceLatencyContainer from bytes object.

Parameters

data (bytes) – Bytes object from which container will be created.

Returns

Return type

SearchSpaceLatencyContainer

classmethod load_from_file(filename)

Creates SearchSpaceLatencyContainer from file.

Parameters

filename (Union[str, Path]) – Filename of file with dumped SearchSpaceLatencyContainer.

Returns

Return type

SearchSpaceLatencyContainer

property operations_latencies: List[List[float]]

Returns latencies of all operations in NAS blocks.

Return type

List[List[float]]

save_to_bytes()

Dumps latency container to bytes object.

Return type

bytes

save_to_file(filename)

Saves latency container to file.

Parameters

filename (Union[str, Path]) – Filename of file for dumping SearchSpaceLatencyContainer.

Return type

None

Statistical tools for SearchSpaceLatencyContainer

min_latency(arg)

Sum of minimum latencies over all containers supplemented by constant part.

Return type

float

max_latency(arg)

Sum of maximum latencies over all containers supplemented by constant part.

Return type

float

mean_latency(arg)

Sum of mean latencies over all containers supplemented by constant part.

Return type

float
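The three aggregate statistics above can be illustrated with a minimal sketch. It assumes that "containers" refers to the per-block lists in operations_latencies and that the constant part is constant_latency. The function names with the `_sketch` suffix are hypothetical stand-ins, not the enot implementation.

```python
from statistics import mean
from typing import List


def min_latency_sketch(constant_latency: float, operations_latencies: List[List[float]]) -> float:
    """Constant part plus the cheapest operation of every block (assumed semantics)."""
    return constant_latency + sum(min(block) for block in operations_latencies)


def max_latency_sketch(constant_latency: float, operations_latencies: List[List[float]]) -> float:
    """Constant part plus the most expensive operation of every block."""
    return constant_latency + sum(max(block) for block in operations_latencies)


def mean_latency_sketch(constant_latency: float, operations_latencies: List[List[float]]) -> float:
    """Constant part plus the mean operation latency of every block."""
    return constant_latency + sum(mean(block) for block in operations_latencies)


# Hypothetical data: two blocks with three and two candidate operations.
constant = 1.5
operations = [[1.0, 2.0, 3.0], [0.5, 4.5]]

print(min_latency_sketch(constant, operations))   # 3.0
print(max_latency_sketch(constant, operations))   # 9.0
print(mean_latency_sketch(constant, operations))  # 6.0
```

Under these assumptions the minimum and maximum bound the latency of any single architecture in the search space, which makes them useful for choosing a feasible latency constraint.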

median_latency(arg, n=100)

Compute median latency of n sampled architectures of SearchSpaceModel or SearchSpaceLatencyContainer.

Parameters

n (int) – Number of architectures to sample.

Return type

float

sample_latencies(arg, n=100)

Returns latencies of n sampled architectures of SearchSpaceModel or SearchSpaceLatencyContainer.

Parameters

n (int) – Number of architectures to sample.

Return type

List[float]
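As a rough illustration of the sampling-based statistics (sample_latencies and median_latency), the sketch below draws one operation per block uniformly at random. The uniform sampling, the `seed` parameter, and the `_sketch` function names are assumptions made for this example; the documentation above does not state how enot samples architectures.

```python
import random
from statistics import median
from typing import List, Optional


def sample_latencies_sketch(
    constant_latency: float,
    operations_latencies: List[List[float]],
    n: int = 100,
    seed: Optional[int] = None,
) -> List[float]:
    """Latencies of n random architectures: one operation is picked per block."""
    rng = random.Random(seed)  # local generator, so the global random state is untouched
    return [
        constant_latency + sum(rng.choice(block) for block in operations_latencies)
        for _ in range(n)
    ]


def median_latency_sketch(
    constant_latency: float,
    operations_latencies: List[List[float]],
    n: int = 100,
    seed: Optional[int] = None,
) -> float:
    """Median over the latencies of n sampled architectures."""
    return median(sample_latencies_sketch(constant_latency, operations_latencies, n, seed))


# Hypothetical data: two blocks with three and two candidate operations.
operations = [[1.0, 2.0, 3.0], [0.5, 4.5]]
samples = sample_latencies_sketch(1.5, operations, n=50, seed=0)
print(len(samples))
print(median_latency_sketch(1.5, operations, n=50, seed=0))
```

Because the result depends on random draws, a larger n gives a more stable estimate at the cost of more evaluations.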

current_latency(arg, arch=None)

Returns current latency of SearchSpaceModel or SearchSpaceLatencyContainer.

Parameters

arch (Optional[List[Union[List[int], int]]]) –

Return type

float

best_arch_latency(search_space)

Returns latency of best architecture of search space.

Parameters

search_space (SearchSpaceModel) – SearchSpaceModel for calculating latency of best architecture. Best architecture is taken from search space.

Returns

Return type

float

latency_mixin

class LatencyMixin

Base class for operations with latency.

Implement a latency calculator as a method of the form def latency_<name>(self, inputs) -> float: ... and then call it as op.forward_latency(inputs, '<name>').

forward_latency(inputs, latency_type)
Parameters
  • inputs (Tuple[Tensor, ...]) – Inputs.

  • latency_type (str) – Type of latency to compute. The user must implement the corresponding latency_<latency_type> method after inheriting from this class.

Returns

Latency of user defined op.

Return type

float

Raises

ValueError – If the user has not implemented the latency_<latency_type> method in the inheriting class.
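The naming convention described for LatencyMixin can be sketched with a small stand-in class. LatencyMixinSketch and ScaleOp below are hypothetical and only mimic the documented behaviour (dispatch to latency_<name>, ValueError when the method is missing). The real mixin works on Tensor inputs and may differ internally.

```python
from typing import Any, Tuple


class LatencyMixinSketch:
    """Stand-in for the dispatch pattern; not the enot implementation."""

    def forward_latency(self, inputs: Tuple[Any, ...], latency_type: str) -> float:
        # Look up the method named latency_<latency_type> on the subclass.
        method = getattr(self, f"latency_{latency_type}", None)
        if method is None:
            raise ValueError(
                f"{type(self).__name__} does not implement latency_{latency_type}"
            )
        return float(method(inputs))


class ScaleOp(LatencyMixinSketch):
    """Hypothetical element-wise operation: one multiply per input element."""

    def latency_mac(self, inputs: Tuple[Any, ...]) -> float:
        (values,) = inputs
        return float(len(values))


op = ScaleOp()
print(op.forward_latency(([1.0, 2.0, 3.0],), "mac"))  # 3.0

try:
    op.forward_latency(([1.0, 2.0, 3.0],), "cpu")
except ValueError as error:
    print(error)  # ScaleOp does not implement latency_cpu
```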

Search space latency calculators

We support two types of calculators. The first type counts multiply-accumulate (MAC) operations (calculators inherited from SearchSpaceCommonCalculator). The second type measures the actual latency (in ms) of operations on the target device. The number of MAC operations may correlate poorly with on-device latency (in ms) because of differences in compiler optimizations, device architectures, etc. If you know which type of device you want to optimize the neural network for and you can run the calculator on it, use the second type of calculators: SearchSpacePytorchCudaLatencyCalculator or SearchSpacePytorchCpuLatencyCalculator. Otherwise, use the MAC calculators, which roughly estimate the complexity of a neural network on an abstract device.

Use the initialize_latency() function to calculate the latency of a SearchSpaceModel. Pass the latency_type corresponding to the selected calculator (see available_calculators()).

class SearchSpaceLatencyCalculator(search_space, **kwargs)

Search space latency calculator interface.

Parameters

search_space (SearchSpaceModel) –

__init__(search_space, **kwargs)

Inits SearchSpaceLatencyCalculator with SearchSpaceModel.

Parameters
  • search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.

  • **kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.

abstract compute(inputs, keyword_inputs=None)

Computes latency of SearchSpaceModel.

Parameters
  • inputs (Tuple) – Model input.

  • keyword_inputs (Optional[Dict[str, Any]]) – Model keyword input arguments.

Returns

Return type

SearchSpaceLatencyContainer

class SearchSpaceCommonCalculator(search_space, **kwargs)

Bases: enot.latency.search_space_latency_calculator.SearchSpaceLatencyCalculator

Search space common latency calculator.

The calculator is based on the LatencyMixin and SearchVariantsContainer classes, but it can also calculate the latency of modules that the built-in calculator does not support, with the help of third-party calculators.

Parameters

search_space (SearchSpaceModel) –

__init__(search_space, **kwargs)

Inits SearchSpaceLatencyCalculator with SearchSpaceModel.

Parameters
  • search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.

  • **kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.

compute(inputs, keyword_inputs=None)

Computes latency of SearchSpaceModel.

Parameters
  • inputs (Tuple) – Model input.

  • keyword_inputs (Optional[Dict[str, Any]]) – Model keyword input arguments.

Returns

Return type

SearchSpaceLatencyContainer

class SearchSpaceMacCalculator(search_space, **kwargs)

Bases: enot.latency.search_space_latency_calculator.SearchSpaceCommonCalculator

Search space MAC calculator based on LatencyMixin only.

Parameters

search_space (SearchSpaceModel) –

__init__(search_space, **kwargs)

Inits SearchSpaceLatencyCalculator with SearchSpaceModel.

Parameters
  • search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.

  • **kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.

class SearchSpaceMacThopCalculator(search_space, **kwargs)

Bases: enot.latency.search_space_latency_calculator.SearchSpaceCommonCalculator

Search space MAC calculator based on LatencyMixin and thop third-party calculator.

Parameters

search_space (SearchSpaceModel) –

__init__(search_space, **kwargs)

Inits SearchSpaceLatencyCalculator with SearchSpaceModel.

Parameters
  • search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.

  • **kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.

class SearchSpaceMacPthflopsCalculator(search_space, **kwargs)

Bases: enot.latency.search_space_latency_calculator.SearchSpaceCommonCalculator

Search space MAC calculator based on LatencyMixin and pthflops third-party calculator.

Parameters

search_space (SearchSpaceModel) –

__init__(search_space, **kwargs)

Inits SearchSpaceLatencyCalculator with SearchSpaceModel.

Parameters
  • search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.

  • **kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.

class SearchSpaceMacFvcoreCalculator(search_space, **kwargs)

Bases: enot.latency.search_space_latency_calculator.SearchSpaceCommonCalculator

Search space MAC calculator based on LatencyMixin and fvcore third-party calculator.

Parameters

search_space (SearchSpaceModel) –

__init__(search_space, **kwargs)

Inits SearchSpaceLatencyCalculator with SearchSpaceModel.

Parameters
  • search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.

  • **kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.

class SearchSpacePytorchLatencyCalculator(search_space, **kwargs)

Bases: enot.latency.search_space_latency_calculator.SearchSpaceLatencyCalculator

PyTorch latency calculator for SearchSpaceModel. The calculator measures the latency (time in ms) of a SearchSpaceModel and supports two types of devices: cpu and cuda.

Parameters

search_space (SearchSpaceModel) –

__init__(search_space, **kwargs)
Parameters
  • search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.

  • **kwargs – Keyword arguments for SearchSpacePytorchLatencyCalculator. Pass warmup_iterations to set the number of warmup iterations before measuring (default 10). Pass run_iterations to set the number of measured iterations (default 10). Pass get_base_samples to provide a function that accepts inputs and keyword_inputs and returns the number of samples. The default implementation assumes that inputs is a Tuple of torch.Tensor and that the number of samples does not depend on keyword_inputs: inputs[0].shape[0].

compute(inputs, keyword_inputs=None)

Computes latency of SearchSpaceModel.

Parameters
  • inputs (Tuple) – Model input.

  • keyword_inputs (Optional[Dict[str, Any]]) – Model keyword input arguments.

Returns

Return type

SearchSpaceLatencyContainer
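The roles of warmup_iterations and run_iterations can be illustrated with a generic timing loop. This is a minimal sketch of the warmup-then-measure pattern under the assumption that the result is the mean time per run in ms. The `measure_latency_ms` helper is hypothetical and does not reproduce how the enot calculator measures a search space. On CUDA devices, real measurements additionally need device synchronization (for example torch.cuda.synchronize()) before reading the clock.

```python
import time
from typing import Callable, List


def measure_latency_ms(
    run_once: Callable[[], object],
    warmup_iterations: int = 10,
    run_iterations: int = 10,
) -> float:
    """Mean wall-clock time of run_once in milliseconds, excluding warmup runs."""
    # Warmup runs let caches and lazy initialization settle and are not timed.
    for _ in range(warmup_iterations):
        run_once()

    start = time.perf_counter()
    for _ in range(run_iterations):
        run_once()
    elapsed_seconds = time.perf_counter() - start

    return elapsed_seconds * 1000.0 / run_iterations


# Hypothetical workload standing in for a model forward pass.
calls: List[int] = []
latency = measure_latency_ms(
    lambda: calls.append(sum(range(1000))),
    warmup_iterations=3,
    run_iterations=5,
)
print(len(calls))  # 8 (3 warmup runs + 5 measured runs)
print(f"{latency:.4f} ms per run")
```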

class SearchSpacePytorchCpuLatencyCalculator(search_space, **kwargs)

Bases: enot.latency.search_space_latency_calculator.SearchSpacePytorchLatencyCalculator

Parameters

search_space (SearchSpaceModel) –

__init__(search_space, **kwargs)
Parameters
  • search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.

  • **kwargs – Keyword arguments for SearchSpacePytorchLatencyCalculator. Pass warmup_iterations to set the number of warmup iterations before measuring (default 10). Pass run_iterations to set the number of measured iterations (default 10). Pass get_base_samples to provide a function that accepts inputs and keyword_inputs and returns the number of samples. The default implementation assumes that inputs is a Tuple of torch.Tensor and that the number of samples does not depend on keyword_inputs: inputs[0].shape[0].

class SearchSpacePytorchCudaLatencyCalculator(search_space, **kwargs)

Bases: enot.latency.search_space_latency_calculator.SearchSpacePytorchLatencyCalculator

Parameters

search_space (SearchSpaceModel) –

__init__(search_space, **kwargs)
Parameters
  • search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.

  • **kwargs – Keyword arguments for SearchSpacePytorchLatencyCalculator. Pass warmup_iterations to set the number of warmup iterations before measuring (default 10). Pass run_iterations to set the number of measured iterations (default 10). Pass get_base_samples to provide a function that accepts inputs and keyword_inputs and returns the number of samples. The default implementation assumes that inputs is a Tuple of torch.Tensor and that the number of samples does not depend on keyword_inputs: inputs[0].shape[0].

Latency calculators

class LatencyCalculator

Base class for latency calculators.

abstract calculate(model, inputs, keyword_inputs=None, ignore_modules=None, **options)

Calculates model latency.

Parameters
  • model (torch.nn.Module) – Model for latency calculation.

  • inputs (Tuple[torch.Tensor, ...]) – Model input.

  • ignore_modules (Optional[List[Type[nn.Module]]]) – List of types of modules that will be ignored in latency calculation.

  • keyword_inputs (Optional[Dict[str, Any]]) – Model keyword input arguments.

Returns

Model latency.

Return type

float

class MacCalculator

Bases: enot.latency.latency_calculator.LatencyCalculator

Wrapper for third-party MAC calculators.

calculate(model, inputs, keyword_inputs=None, ignore_modules=None, **options)

Calculates number of Multiply-Accumulate operations in model.

Parameters
  • model (torch.nn.Module) – Model for MAC operations calculation.

  • inputs (Tuple[torch.Tensor, ...]) – Model inputs stored in tuple.

  • keyword_inputs (Optional[Dict[str, Any]]) – Model keyword inputs.

  • ignore_modules (Optional[List[Type[nn.Module]]]) – List of module types that will be ignored in MAC calculation.

Returns

Number of MAC operations in model (in millions).

Return type

float

class MacCalculatorThop

Bases: enot.latency.latency_calculator.MacCalculator

Wrapper for https://github.com/Lyken17/pytorch-OpCounter

calculate(model, inputs, keyword_inputs=None, ignore_modules=None, **options)

Calculates number of Multiply-Accumulate operations in model.

Parameters
  • model (torch.nn.Module) – Model for MAC operations calculation.

  • inputs (Tuple[torch.Tensor, ...]) – Model inputs stored in tuple.

  • keyword_inputs (Optional[Dict[str, Any]]) – Model keyword inputs.

  • ignore_modules (Optional[List[Type[nn.Module]]]) – List of module types that will be ignored in MAC calculation.

Returns

Number of MAC operations in model (in millions).

Return type

float

class MacCalculatorPthflops

Bases: enot.latency.latency_calculator.MacCalculator

Wrapper for https://github.com/1adrianb/pytorch-estimate-flops

calculate(model, inputs, keyword_inputs=None, ignore_modules=None, **options)

Calculates number of Multiply-Accumulate operations in model.

Parameters
  • model (torch.nn.Module) – Model for MAC operations calculation.

  • inputs (Tuple[torch.Tensor, ...]) – Model inputs stored in tuple.

  • keyword_inputs (Optional[Dict[str, Any]]) – Model keyword inputs.

  • ignore_modules (Optional[List[Type[nn.Module]]]) – List of module types that will be ignored in MAC calculation.

Returns

Number of MAC operations in model (in millions).

Return type

float

class MacCalculatorFvcore

Bases: enot.latency.latency_calculator.MacCalculator

Wrapper for https://github.com/facebookresearch/fvcore

calculate(model, inputs, keyword_inputs=None, ignore_modules=None, **options)

Calculates number of Multiply-Accumulate operations in model.

Parameters
  • model (torch.nn.Module) – Model for MAC operations calculation.

  • inputs (Tuple[torch.Tensor, ...]) – Model inputs stored in tuple.

  • keyword_inputs (Optional[Dict[str, Any]]) – Model keyword inputs.

  • ignore_modules (Optional[List[Type[nn.Module]]]) – List of module types that will be ignored in MAC calculation.

Returns

Number of MAC operations in model (in millions).

Return type

float

mac_calculator

conv_out_spatial_size(input_size, kernel_size, stride=(1, 1), padding=(0, 0), dilation=(1, 1))

Calculates output spatial size of convolution.

Parameters
  • input_size (TRepeatableInt) – Height and width of input image.

  • kernel_size (TRepeatableInt) – Height and width of convolution kernel.

  • stride (TRepeatableInt) – Stride along the corresponding axes.

  • padding (TRepeatableInt) – Number of pixels for symmetrical padding.

  • dilation (TRepeatableInt) – Dilation pixel numbers.

Returns

Height and width of the image after the convolution.

Return type

Tuple[int, int]

Notes

A single int passed for a parameter of type TRepeatableInt is repeated to form the pair [int, int].
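The output size of a convolution is commonly computed per axis as floor((in + 2 * padding - dilation * (kernel - 1) - 1) / stride) + 1, which is the formula PyTorch documents for Conv2d. The sketch below applies it together with the int-to-pair repetition described in the note. The names `RepeatableInt`, `to_pair` and `conv_out_spatial_size_sketch` are hypothetical, and whether enot rounds in exactly this way is an assumption.

```python
from typing import Tuple, Union

RepeatableInt = Union[int, Tuple[int, int]]  # stand-in for TRepeatableInt


def to_pair(value: RepeatableInt) -> Tuple[int, int]:
    """Repeat a single int so that it covers both height and width."""
    if isinstance(value, int):
        return (value, value)
    return (value[0], value[1])


def conv_out_spatial_size_sketch(
    input_size: RepeatableInt,
    kernel_size: RepeatableInt,
    stride: RepeatableInt = 1,
    padding: RepeatableInt = 0,
    dilation: RepeatableInt = 1,
) -> Tuple[int, int]:
    """Output (height, width) of a convolution with symmetric padding."""
    in_hw, k_hw, s_hw = to_pair(input_size), to_pair(kernel_size), to_pair(stride)
    p_hw, d_hw = to_pair(padding), to_pair(dilation)
    out = []
    for axis in range(2):
        # A dilated kernel covers d * (k - 1) + 1 input pixels along the axis.
        effective_kernel = d_hw[axis] * (k_hw[axis] - 1) + 1
        out.append((in_hw[axis] + 2 * p_hw[axis] - effective_kernel) // s_hw[axis] + 1)
    return (out[0], out[1])


print(conv_out_spatial_size_sketch(224, 3, stride=2, padding=1))  # (112, 112)
print(conv_out_spatial_size_sketch((32, 64), 3, dilation=2))      # (28, 60)
```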

conv_mac_count(spatial_size, kernel_size, stride, in_channels, padding=(0, 0), out_channels=None, groups=1, calculate_in_millions=True)

Calculates the number of multiply-accumulate operations and the output shape of a convolution.

Parameters
  • spatial_size (TRepeatableInt) – Height and width of input image.

  • kernel_size (TRepeatableInt) – Height and width of convolution kernel.

  • stride (TRepeatableInt) – Stride along the corresponding axes.

  • in_channels (int) – Number of input channels.

  • padding (TRepeatableInt) – Number of pixels for symmetrical padding.

  • out_channels (int) – Number of output channels.

  • groups (int) – Number of groups.

  • calculate_in_millions (bool) – Whether to report MACs in millions, as is common practice.

Returns

MACs and the output height and width of the image.

Return type

Tuple[float, Tuple[int, int]]
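For a grouped convolution, the usual MAC count is out_h * out_w * k_h * k_w * (in_channels / groups) * out_channels. The sketch below applies that formula to square inputs and kernels only. The function name `conv_mac_count_sketch` is hypothetical, out_channels is made a required argument here because the meaning of the library's None default is not documented above, and bias terms are ignored.

```python
from typing import Tuple


def conv_mac_count_sketch(
    spatial_size: int,
    kernel_size: int,
    stride: int,
    in_channels: int,
    out_channels: int,
    padding: int = 0,
    groups: int = 1,
    calculate_in_millions: bool = True,
) -> Tuple[float, Tuple[int, int]]:
    """MACs and output size of a square convolution (assumed formula, no bias)."""
    out_size = (spatial_size + 2 * padding - kernel_size) // stride + 1
    # Each output pixel of each output channel needs k * k * (in_channels / groups) MACs.
    macs = out_size * out_size * kernel_size * kernel_size * (in_channels // groups) * out_channels
    if calculate_in_millions:
        return macs / 1e6, (out_size, out_size)
    return float(macs), (out_size, out_size)


# Dense 3x3 convolution, 4 -> 8 channels, on an 8x8 input.
print(conv_mac_count_sketch(8, 3, 1, 4, 8, padding=1, calculate_in_millions=False))
# (18432.0, (8, 8))

# Depthwise 3x3 convolution: groups equals the channel count.
print(conv_mac_count_sketch(8, 3, 1, 4, 4, padding=1, groups=4, calculate_in_millions=False))
# (2304.0, (8, 8))
```

The depthwise case shows why grouped convolutions are cheap: each output channel only reads one input channel, so the count drops by a factor of in_channels relative to a dense layer with the same channel widths.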

mib_mac_count(spatial_size, kernel_size, expand_ratio, stride, in_channels, out_channels, padding=None, calculate_in_millions=True)

Calculates Mobile Inverted Bottleneck MAC operations.

Notes

https://arxiv.org/pdf/1801.04381.pdf

Parameters
  • spatial_size (TRepeatableInt) – Height and width of input image.

  • kernel_size (TRepeatableInt) – Height and width of convolution kernel.

  • expand_ratio (float) – Expansion ratio after first conv1x1 in MIB. Size after expansion equals round(expand_ratio * in_channels). If None, the MIB behaves as DWS + conv1x1.

  • stride (TRepeatableInt) – Stride along the corresponding axes.

  • in_channels (int) – Number of input channels.

  • out_channels (int) – Number of output channels.

  • padding (TRepeatableInt) – Number of pixels for symmetrical padding.

  • calculate_in_millions (bool) – Whether to report MACs in millions, as is common practice.

Returns

Returns macs.

Return type

float
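A Mobile Inverted Bottleneck, as described in the MobileNetV2 paper linked above, is a 1x1 expansion convolution, a depthwise convolution, and a 1x1 projection convolution. The sketch below sums the MACs of these three layers for square inputs and kernels. The names `conv_macs` and `mib_mac_count_sketch` are hypothetical, padding is taken as an explicit int because the meaning of the library's None default is not documented above, and normalization and activation layers are ignored.

```python
from typing import Optional, Tuple


def conv_macs(
    spatial_size: int, kernel_size: int, stride: int, padding: int,
    in_channels: int, out_channels: int, groups: int = 1,
) -> Tuple[int, int]:
    """MACs and output size of one square convolution (assumed formula, no bias)."""
    out_size = (spatial_size + 2 * padding - kernel_size) // stride + 1
    macs = out_size * out_size * kernel_size * kernel_size * (in_channels // groups) * out_channels
    return macs, out_size


def mib_mac_count_sketch(
    spatial_size: int, kernel_size: int, expand_ratio: Optional[float], stride: int,
    in_channels: int, out_channels: int, padding: int = 0,
    calculate_in_millions: bool = True,
) -> float:
    """Sum of MACs over the expand, depthwise and project convolutions."""
    total, size, hidden = 0, spatial_size, in_channels
    if expand_ratio is not None:
        # 1x1 expansion to round(expand_ratio * in_channels) channels.
        hidden = round(expand_ratio * in_channels)
        macs, size = conv_macs(size, 1, 1, 0, in_channels, hidden)
        total += macs
    # Depthwise convolution: one group per channel, carries the stride.
    macs, size = conv_macs(size, kernel_size, stride, padding, hidden, hidden, groups=hidden)
    total += macs
    # 1x1 projection down to out_channels.
    macs, size = conv_macs(size, 1, 1, 0, hidden, out_channels)
    total += macs
    return total / 1e6 if calculate_in_millions else float(total)


print(mib_mac_count_sketch(8, 3, 6, 1, 4, 8, padding=1, calculate_in_millions=False))     # 32256.0
print(mib_mac_count_sketch(8, 3, 6, 2, 4, 8, padding=1, calculate_in_millions=False))     # 12672.0
print(mib_mac_count_sketch(8, 3, None, 1, 4, 8, padding=1, calculate_in_millions=False))  # 4352.0
```

The last call uses expand_ratio=None, which skips the expansion layer and leaves a depthwise-separable convolution followed by conv1x1, matching the parameter description above.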

rn_mac_count(spatial_size, squeeze_kernel_size, expand_kernel_size, expand_ratio, stride, in_channels, out_channels, padding=None, calculate_in_millions=True)

Calculates MAC for ResNet-v2 block.

Notes

See https://arxiv.org/pdf/1603.05027.pdf

Parameters
  • spatial_size (TRepeatableInt) – Height and width of input image.

  • squeeze_kernel_size (TRepeatableInt) – Height and width of convolution kernel which reduces channel number.

  • expand_kernel_size (TRepeatableInt) – Height and width of convolution kernel which expands channel number.

  • expand_ratio (float) – Expansion ratio after last conv.

  • stride (TRepeatableInt) – Stride along the corresponding axes.

  • in_channels (int) – Number of input channels.

  • out_channels (int) – Number of output channels.

  • padding (TRepeatableInt) – Number of pixels for symmetrical padding.

  • calculate_in_millions (bool) – Whether to report MACs in millions, as is common practice.

Returns

Returns macs.

Return type

float