Latency optimization package

The enot.latency package contains utilities for profiling a model, measuring the latency of a model or a search space, and obtaining statistics on the latency distribution within a search space.

Main tools

initialize_latency(latency_type, search_space, inputs, keyword_inputs=None, **kwargs)

Initializes latency of type latency_type in the search space. latency_type must correspond to an available SearchSpaceLatencyCalculator. To list the available latency types (calculators), use the available_calculators() function.

Parameters:
  • latency_type (str) – The type of latency to be initialized in search space.

  • search_space (SearchSpaceModel) – Search space for latency calculation.

  • inputs (Tuple) – Model input.

  • keyword_inputs (Optional[Dict[str, Any]]) – Model keyword input arguments.

  • **kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.

Returns:

Calculated latency of search_space as SearchSpaceLatencyContainer. This container can be analyzed with the help of statistical tools from this module.

Return type:

SearchSpaceLatencyContainer

To choose a latency_type, see the Search space latency calculators section.

available_calculators()

Returns the available search space latency calculators as a string.

Returns:

String listing the available calculators.

Return type:

str

reset_latency(search_space)

Resets all latency parameters of the search space.

Parameters:

search_space (SearchSpaceModel) – Search space whose latency will be reset.

Return type:

None

SearchSpaceLatencyContainer

class SearchSpaceLatencyContainer(latency_type, constant_latency, operations_latencies)

Latency storage for SearchSpaceModel.

__init__(latency_type, constant_latency, operations_latencies)
Parameters:
  • latency_type (str) – Type of latency that container holds.

  • constant_latency (float) – Constant latency of SearchSpaceModel.

  • operations_latencies (List[List[float]]) – Latencies of all operations in NAS blocks.

property constant_latency: float

Returns constant latency of SearchSpaceModel.

property latency_type: str

Returns latency type.

classmethod load_from_bytes(data)

Creates SearchSpaceLatencyContainer from bytes object.

Parameters:

data (bytes) – Bytes object from which container will be created.

Returns:

Container created from the bytes object.

Return type:

SearchSpaceLatencyContainer

classmethod load_from_file(filename)

Creates SearchSpaceLatencyContainer from file.

Parameters:

filename (Union[str, Path]) – Path to the file containing a dumped SearchSpaceLatencyContainer.

Returns:

Container loaded from the file.

Return type:

SearchSpaceLatencyContainer

property operations_latencies: List[List[float]]

Returns latencies of all operations in NAS blocks.

save_to_bytes()

Dumps latency container to bytes object.

Return type:

bytes

save_to_file(filename)

Saves latency container to file.

Parameters:

filename (Union[str, Path]) – Path to the file in which the SearchSpaceLatencyContainer will be saved.

Return type:

None
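
The save and load methods above pair up, so a container can be stored once and reused later without recomputing latency. The sketch below calls only the four methods documented in this section. The helper names roundtrip_via_bytes and roundtrip_via_file and the file name latency_container.bin are illustrative assumptions, not part of the package.

```python
import tempfile
from pathlib import Path


def roundtrip_via_bytes(container):
    """Serialize a container to bytes and rebuild it with the same class."""
    data = container.save_to_bytes()
    # load_from_bytes is documented as a classmethod, so call it on the class.
    return type(container).load_from_bytes(data)


def roundtrip_via_file(container):
    """Save a container to a temporary file and load it back."""
    with tempfile.TemporaryDirectory() as tmp_dir:
        # Hypothetical file name; any writable path should do.
        path = Path(tmp_dir) / "latency_container.bin"
        container.save_to_file(path)
        return type(container).load_from_file(path)
```

Either helper should return a container whose latency_type, constant_latency and operations_latencies match the original, which is a quick way to check a dump before relying on it.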

Statistical tools for SearchSpaceLatencyContainer

min_latency(arg)

Returns the sum of the minimum latencies over all containers plus the constant part.

Return type:

float

max_latency(arg)

Returns the sum of the maximum latencies over all containers plus the constant part.

Return type:

float

mean_latency(arg)

Returns the sum of the mean latencies over all containers plus the constant part.

Return type:

float
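
To make the three aggregate statistics concrete, the sketch below recomputes them from the two numeric fields that a SearchSpaceLatencyContainer stores. It assumes that each inner list of operations_latencies holds the candidate operations of one container (NAS block). The function name and the numbers are made up for illustration, and this is not the package's implementation.

```python
from statistics import mean
from typing import Callable, List


def aggregate_latency(
    constant_latency: float,
    operations_latencies: List[List[float]],
    reduce_block: Callable[[List[float]], float],
) -> float:
    """Reduce every block to one number, sum the results, add the constant part."""
    return constant_latency + sum(reduce_block(block) for block in operations_latencies)


# Made-up example: constant part 2.0 and two blocks of candidate operations.
constant = 2.0
blocks = [[1.0, 3.0], [4.0, 6.0, 8.0]]

lower_bound = aggregate_latency(constant, blocks, min)  # 2 + 1 + 4 = 7.0
upper_bound = aggregate_latency(constant, blocks, max)  # 2 + 3 + 8 = 13.0
average = aggregate_latency(constant, blocks, mean)     # 2 + 2 + 6 = 10.0
```

Under these assumptions the minimum and maximum bound the latency of every architecture in the search space, which makes them convenient for choosing a feasible latency constraint.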

median_latency(arg, n=100)

Computes the median latency of n sampled architectures of a SearchSpaceModel or SearchSpaceLatencyContainer.

Parameters:

n (int) – Number of architectures to sample.

Return type:

float

sample_latencies(arg, n=100)

Returns the latencies of n sampled architectures of a SearchSpaceModel or SearchSpaceLatencyContainer.

Parameters:

n (int) – Number of architectures to sample.

Return type:

List[float]
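
median_latency and sample_latencies both operate on n sampled architectures. A minimal sketch of that idea follows. It assumes that an architecture picks one operation per block uniformly at random and that its latency is the constant part plus the latencies of the chosen operations. The actual sampling strategy is not specified in this reference, and the helper names and the seed parameter are hypothetical.

```python
import random
from statistics import median
from typing import List, Optional


def sample_latencies_sketch(
    constant_latency: float,
    operations_latencies: List[List[float]],
    n: int = 100,
    seed: Optional[int] = None,
) -> List[float]:
    """Latencies of n architectures drawn uniformly at random (an assumption)."""
    rng = random.Random(seed)
    return [
        constant_latency + sum(rng.choice(block) for block in operations_latencies)
        for _ in range(n)
    ]


def median_latency_sketch(
    constant_latency: float,
    operations_latencies: List[List[float]],
    n: int = 100,
    seed: Optional[int] = None,
) -> float:
    """Median over the sampled latencies."""
    return median(sample_latencies_sketch(constant_latency, operations_latencies, n, seed))


# Made-up data: constant part 2.0 and two blocks of candidate operations.
samples = sample_latencies_sketch(2.0, [[1.0, 3.0], [4.0, 6.0, 8.0]], n=100, seed=0)
typical = median_latency_sketch(2.0, [[1.0, 3.0], [4.0, 6.0, 8.0]], n=100, seed=0)
```

Because the result depends on random sampling, a larger n gives a more stable estimate at the cost of more work.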

current_latency(arg, arch=None)

Returns the current latency of a SearchSpaceModel or SearchSpaceLatencyContainer.

Parameters:

arch (Optional[List[Union[List[int], int]]]) –

Return type:

float
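
As an illustration of how a single architecture maps to a latency value, the sketch below assumes that arch lists one integer operation index per block and that the latency is the constant part plus the latencies of the selected operations. Entries that are themselves lists, which the type annotation also allows, are not handled here. The function name and data are hypothetical.

```python
from typing import List, Sequence


def arch_latency_sketch(
    constant_latency: float,
    operations_latencies: List[List[float]],
    arch: Sequence[int],
) -> float:
    """Latency of one architecture given as one operation index per block."""
    if len(arch) != len(operations_latencies):
        raise ValueError("arch must contain exactly one index per block")
    chosen = (block[index] for block, index in zip(operations_latencies, arch))
    return constant_latency + sum(chosen)


# Second operation in the first block, first operation in the second block.
latency = arch_latency_sketch(2.0, [[1.0, 3.0], [4.0, 6.0, 8.0]], arch=[1, 0])  # 2 + 3 + 4 = 9.0
```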

best_arch_latency(search_space)

Returns the latency of the best architecture of the search space.

Parameters:

search_space (SearchSpaceModel) – SearchSpaceModel for calculating the latency of the best architecture. The best architecture is taken from the search space.

Returns:

Latency of the best architecture.

Return type:

float

Search space latency calculators

We support two types of calculators. The first type counts multiply-accumulate (MAC) operations; these calculators inherit from SearchSpaceCommonCalculator. The second type estimates the real execution time (latency, in ms) of operations on the target device. The number of MAC operations may correlate poorly with on-device latency (in ms) because of differences in compiler optimizations, device architectures, etc. If you know which type of device you want to optimize the neural network for and you can run a calculator on it, use the second type: SearchSpacePytorchCudaLatencyCalculator or SearchSpacePytorchCpuLatencyCalculator. Otherwise, use the MAC calculators, which give a rough estimate of the neural network's complexity on an abstract device.
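
To show what a MAC count measures, and why it is independent of the device, the sketch below applies the standard formulas for a fully connected layer and a 2D convolution. The helper names and layer sizes are illustrative; the third-party calculators wrapped by this package may count some operations differently (bias terms, for example, are ignored here).

```python
def linear_macs(in_features: int, out_features: int, batch_size: int = 1) -> int:
    """MACs of a fully connected layer: one MAC per weight per sample."""
    return batch_size * in_features * out_features


def conv2d_macs(
    in_channels: int,
    out_channels: int,
    kernel_size: int,
    out_height: int,
    out_width: int,
    groups: int = 1,
    batch_size: int = 1,
) -> int:
    """MACs of a 2D convolution with a square kernel."""
    # Every output element needs one MAC per kernel weight it sees.
    macs_per_output = (in_channels // groups) * kernel_size * kernel_size
    return batch_size * out_channels * out_height * out_width * macs_per_output


# 3x3 convolution, 64 -> 128 channels, 56x56 output map, then a 512 -> 1000 classifier.
conv = conv2d_macs(64, 128, 3, 56, 56)
fc = linear_macs(512, 1000)
total_millions = (conv + fc) / 1e6
```

The count depends only on layer shapes, so two devices report the same number even when their measured latencies differ widely.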

Use the initialize_latency() function to calculate the latency of a SearchSpaceModel. Pass the latency_type corresponding to the selected calculator (see available_calculators()).
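
A typical workflow might look like the sketch below. It uses only functions and methods documented on this page, and assumes that they can be imported directly from enot.latency and that the statistical tools accept the returned container. The helper name profile_search_space and its container_path parameter are hypothetical. The latency_type value is left to the caller because the valid strings are reported by available_calculators().

```python
def profile_search_space(search_space, inputs, latency_type, container_path=None):
    """Initialize latency in a search space and summarize its range."""
    # Imported lazily so the sketch can be read without the package installed.
    # Assumption: these names are exposed at the top level of enot.latency.
    from enot.latency import initialize_latency, max_latency, mean_latency, min_latency

    container = initialize_latency(latency_type, search_space, inputs)
    summary = {
        "min": min_latency(container),
        "mean": mean_latency(container),
        "max": max_latency(container),
    }
    if container_path is not None:
        # Keep the result so that latency does not have to be recomputed later.
        container.save_to_file(container_path)
    return container, summary
```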

class SearchSpaceLatencyCalculator(search_space, **kwargs)

Search space latency calculator interface.

__init__(search_space, **kwargs)

Inits SearchSpaceLatencyCalculator with SearchSpaceModel.

Parameters:
  • search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.

  • **kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.

abstract compute(inputs, keyword_inputs=None)

Computes latency of SearchSpaceModel.

Parameters:
  • inputs (Tuple) – Model input.

  • keyword_inputs (Optional[Dict[str, Any]]) – Model keyword input arguments.

Returns:

Calculated latency of the search space.

Return type:

SearchSpaceLatencyContainer

class SearchSpaceCommonCalculator(search_space, **kwargs)

Bases: SearchSpaceLatencyCalculator

Search space common latency calculator.

The calculator is based on third-party calculators.

__init__(search_space, **kwargs)

Inits SearchSpaceLatencyCalculator with SearchSpaceModel.

Parameters:
  • search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.

  • **kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.

compute(inputs, keyword_inputs=None)

Computes latency of SearchSpaceModel.

Parameters:
  • inputs (Tuple) – Model input.

  • keyword_inputs (Optional[Dict[str, Any]]) – Model keyword input arguments.

Returns:

Calculated latency of the search space.

Return type:

SearchSpaceLatencyContainer

class SearchSpaceMacThopCalculator(search_space, **kwargs)

Bases: SearchSpaceCommonCalculator

Search space MAC calculator based on thop third-party calculator.

__init__(search_space, **kwargs)

Inits SearchSpaceLatencyCalculator with SearchSpaceModel.

Parameters:
  • search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.

  • **kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.

class SearchSpaceMacPthflopsCalculator(search_space, **kwargs)

Bases: SearchSpaceCommonCalculator

Search space MAC calculator based on pthflops third-party calculator.

__init__(search_space, **kwargs)

Inits SearchSpaceLatencyCalculator with SearchSpaceModel.

Parameters:
  • search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.

  • **kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.

class SearchSpaceMacFvcoreCalculator(search_space, **kwargs)

Bases: SearchSpaceCommonCalculator

Search space MAC calculator based on fvcore third-party calculator.

__init__(search_space, **kwargs)

Inits SearchSpaceLatencyCalculator with SearchSpaceModel.

Parameters:
  • search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.

  • **kwargs – Arbitrary keyword arguments for SearchSpaceLatencyCalculator.

class SearchSpacePytorchLatencyCalculator(search_space, **kwargs)

Bases: SearchSpaceLatencyCalculator

PyTorch latency calculator for SearchSpaceModel.

The calculator measures the latency (time in ms) of a SearchSpaceModel and supports two device types: CPU and CUDA.

__init__(search_space, **kwargs)
Parameters:
  • search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.

  • **kwargs – Keyword arguments for SearchSpacePytorchLatencyCalculator. Pass warmup_iterations to set the number of warmup iterations before measuring (default 10). Pass run_iterations to set the number of measured iterations (default 10). Pass get_base_samples to provide a function that accepts inputs and keyword_inputs and returns the number of samples. The default implementation assumes that inputs is a tuple of torch.Tensor and that the number of samples does not depend on keyword_inputs, so it returns inputs[0].shape[0].

compute(inputs, keyword_inputs=None)

Computes latency of SearchSpaceModel.

Parameters:
  • inputs (Tuple) – Model input.

  • keyword_inputs (Optional[Dict[str, Any]]) – Model keyword input arguments.

Returns:

Calculated latency of the search space.

Return type:

SearchSpaceLatencyContainer
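
The warmup_iterations, run_iterations and get_base_samples options of SearchSpacePytorchLatencyCalculator can be pictured with the generic timing loop below. This is a plain-Python sketch of the warmup-then-measure pattern, not the calculator's implementation: measure_latency_ms and default_get_base_samples are hypothetical names, and how the calculator combines the sample count with the measured time is not stated in this reference.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


def default_get_base_samples(
    inputs: Tuple[Any, ...],
    keyword_inputs: Optional[Dict[str, Any]] = None,
) -> int:
    """Mirror of the documented default: first dimension of the first input."""
    return inputs[0].shape[0]


def measure_latency_ms(
    fn: Callable[..., Any],
    inputs: Tuple[Any, ...],
    keyword_inputs: Optional[Dict[str, Any]] = None,
    warmup_iterations: int = 10,
    run_iterations: int = 10,
) -> float:
    """Mean wall-clock time of fn(*inputs, **keyword_inputs) in milliseconds."""
    kwargs = keyword_inputs or {}
    # Warmup runs are executed but not timed (caches, lazy initialization).
    for _ in range(warmup_iterations):
        fn(*inputs, **kwargs)
    start = time.perf_counter()
    for _ in range(run_iterations):
        fn(*inputs, **kwargs)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / run_iterations
```

On CUDA devices kernels run asynchronously, so a real measurement also has to synchronize the device before reading the clock; this CPU-style sketch omits that step.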

class SearchSpacePytorchCpuLatencyCalculator(search_space, **kwargs)

Bases: SearchSpacePytorchLatencyCalculator

Search space CPU-time latency calculator.

__init__(search_space, **kwargs)
Parameters:
  • search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.

  • **kwargs – Keyword arguments for SearchSpacePytorchLatencyCalculator. Pass warmup_iterations to set the number of warmup iterations before measuring (default 10). Pass run_iterations to set the number of measured iterations (default 10). Pass get_base_samples to provide a function that accepts inputs and keyword_inputs and returns the number of samples. The default implementation assumes that inputs is a tuple of torch.Tensor and that the number of samples does not depend on keyword_inputs, so it returns inputs[0].shape[0].

class SearchSpacePytorchCudaLatencyCalculator(search_space, **kwargs)

Bases: SearchSpacePytorchLatencyCalculator

Search space CUDA-time latency calculator.

__init__(search_space, **kwargs)
Parameters:
  • search_space (SearchSpaceModel) – SearchSpaceModel for latency calculation.

  • **kwargs – Keyword arguments for SearchSpacePytorchLatencyCalculator. Pass warmup_iterations to set the number of warmup iterations before measuring (default 10). Pass run_iterations to set the number of measured iterations (default 10). Pass get_base_samples to provide a function that accepts inputs and keyword_inputs and returns the number of samples. The default implementation assumes that inputs is a tuple of torch.Tensor and that the number of samples does not depend on keyword_inputs, so it returns inputs[0].shape[0].

Latency calculators

class LatencyCalculator

Base class for latency calculators.

abstract calculate(model, inputs, keyword_inputs=None, ignore_modules=None, **options)

Calculates model latency.

Parameters:
  • model (torch.nn.Module) – Model for latency calculation.

  • inputs (Tuple[torch.Tensor, ...]) – Model input.

  • ignore_modules (Optional[List[Type[nn.Module]]]) – List of types of modules that will be ignored in latency calculation.

  • keyword_inputs (Optional[Dict[str, Any]]) – Model keyword inputs.

Returns:

Model latency.

Return type:

float

class MacCalculatorFvcore

Bases: MacCalculator

Wrapper for https://github.com/facebookresearch/fvcore.

calculate(model, inputs, keyword_inputs=None, ignore_modules=None, **options)

Calculates number of Multiply-Accumulate operations in model.

Parameters:
  • model (torch.nn.Module) – Model for MAC operations calculation.

  • inputs (Tuple[torch.Tensor, ...]) – Model inputs stored in tuple.

  • keyword_inputs (Optional[Dict[str, Any]]) – Model keyword inputs.

  • ignore_modules (Optional[List[Type[nn.Module]]]) – List of types of modules that will be ignored in MAC calculation.

Returns:

Number of MAC operations in model (in millions).

Return type:

float

class MacCalculatorThop

Bases: MacCalculator

Wrapper for https://github.com/Lyken17/pytorch-OpCounter.

calculate(model, inputs, keyword_inputs=None, ignore_modules=None, **options)

Calculates number of Multiply-Accumulate operations in model.

Parameters:
  • model (torch.nn.Module) – Model for MAC operations calculation.

  • inputs (Tuple[torch.Tensor, ...]) – Model inputs stored in tuple.

  • keyword_inputs (Optional[Dict[str, Any]]) – Model keyword inputs.

  • ignore_modules (Optional[List[Type[nn.Module]]]) – List of types of modules that will be ignored in MAC calculation.

Returns:

Number of MAC operations in model (in millions).

Return type:

float

class MacCalculatorPthflops

Bases: MacCalculator

Wrapper for https://github.com/1adrianb/pytorch-estimate-flops.

calculate(model, inputs, keyword_inputs=None, ignore_modules=None, **options)

Calculates number of Multiply-Accumulate operations in model.

Parameters:
  • model (torch.nn.Module) – Model for MAC operations calculation.

  • inputs (Tuple[torch.Tensor, ...]) – Model inputs stored in tuple.

  • keyword_inputs (Optional[Dict[str, Any]]) – Model keyword inputs.

  • ignore_modules (Optional[List[Type[nn.Module]]]) – List of types of modules that will be ignored in MAC calculation.

Returns:

Number of MAC operations in model (in millions).

Return type:

float