ENOT optimization package

The enot.optimize package contains optimizer definitions for the three main neural architecture search stages.

To optimize a neural network architecture, you should do the following:

  • Train the search space (pretrain)

  • Search for the optimal architecture (search)

  • Tune the obtained model (tune)

build_enot_optimizer(phase_name, model, optimizer, **options)

Builds and returns an ENOT optimizer according to the given phase name.

Parameters
  • phase_name ({"pretrain", "search", "train"}) – Name of the phase. Must be one of "pretrain", "search", or "train".

  • model (torch.nn.Module) – PyTorch model to optimize.

  • optimizer (torch.optim.Optimizer) – PyTorch optimizer to be replaced with the ENOT optimizer.

  • options – ENOT optimizer options.

Returns

Suitable ENOT optimizer instance.

Return type

BaseEnotOptimizer

Raises
  • ValueError – If an unknown phase name or unknown optimizer options are given.

  • TypeError – If the model type is not suitable for the selected phase, or if the optimizer is not an instance of torch.optim.Optimizer.
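
A minimal usage sketch for build_enot_optimizer, shown for the "train" phase because it accepts a plain torch.nn.Module; the import path enot.optimize is an assumption based on the package name.

    import torch
    from torch import nn

    from enot.optimize import build_enot_optimizer  # assumed import path

    # A toy model; the "pretrain" and "search" phases expect a SearchSpaceModel instead.
    model = nn.Linear(16, 4)
    base_optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)

    # Returns a suitable ENOT optimizer for the chosen phase; an unknown phase name
    # or unknown optimizer options raise ValueError.
    enot_optimizer = build_enot_optimizer(
        phase_name="train",
        model=model,
        optimizer=base_optimizer,
    )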

Search space pre-training

Neural architecture search starts with creating the search space and pretraining it, which enables the subsequent selection of the best operations.

class EnotPretrainOptimizer(search_space, optimizer, check_recommended_optimizations=True, **options)

ENOT optimizer for pretrain phase.

__init__(search_space, optimizer, check_recommended_optimizations=True, **options)
Parameters
  • search_space (SearchSpaceModel) – Search space model to optimize.

  • optimizer (torch.optim.Optimizer) – PyTorch optimizer to be replaced with the ENOT optimizer.

  • check_recommended_optimizations (bool, optional) – Whether to check for recommended ENOT optimizations. Default value is True.

  • options – Other experimental options (should be ignored by the user).
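
A hedged sketch of a single pretrain epoch built around EnotPretrainOptimizer. The import path and the use of search_space.parameters() for the base optimizer are assumptions (a SearchSpaceModel may expose a more specific parameter group for this phase); constructing the search space and the dataloader is out of scope here.

    import torch

    from enot.optimize import EnotPretrainOptimizer  # assumed import path


    def pretrain_one_epoch(search_space, dataloader, criterion, lr=0.01):
        """Run one pretrain epoch over an already constructed SearchSpaceModel."""
        # search_space.parameters() is a placeholder; pass whichever parameter
        # group is recommended for the pretrain phase.
        base_optimizer = torch.optim.SGD(search_space.parameters(), lr=lr, momentum=0.9)
        enot_optimizer = EnotPretrainOptimizer(
            search_space=search_space,
            optimizer=base_optimizer,
        )

        search_space.train()
        for inputs, labels in dataloader:
            enot_optimizer.zero_grad()

            def closure():
                # Forward pass, loss computation and backward pass are all
                # encapsulated in the closure, as step()/model_step() require.
                predictions = search_space(inputs)
                loss = criterion(predictions, labels)
                loss.backward()
                return loss

            # step() with a closure computes gradients and updates parameters.
            enot_optimizer.step(closure)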

add_param_group(param_group)

Call add_param_group of the wrapped optimizer.

Parameters

param_group (dict with str keys) – Parameter group description to add to user optimizer.

Return type

None

load_state_dict(state_dict)

Call load_state_dict of the wrapped optimizer.

Parameters

state_dict (dict with str keys) – State dict to be loaded to user optimizer instance.

Return type

None

property model: torch.nn.modules.module.Module

Model passed to the constructor.

Returns

PyTorch model passed to the ENOT optimizer constructor.

Return type

torch.nn.Module

model_step(closure)

Perform gradient accumulation step.

Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.

To accumulate gradients, this method must perform a complete gradient computation cycle, which consists of a forward step followed by a backward step. To achieve this, it requires a user-defined closure which encapsulates both of these steps.

It is usually enough to calculate model predictions, compute the loss from these predictions, and then compute the gradients of the model parameters by calling loss.backward().

In more sophisticated situations, contact the ENOT team to make sure that the current API covers your use case.

This function must be used in conjunction with the step function.

Usually, you only need to call this function when you need to accumulate gradients over multiple data batches. In this case, call model_step for each data batch within your larger “ghost batch”. After accumulating gradients, call the step function without arguments.

Parameters

closure (Callable) – A closure (nested function which has access to a free variable from an enclosing function) that performs complete gradient accumulation procedure.

Returns

The result of closure execution, which should be either a loss value stored in torch.Tensor or in float, or None.

Return type

float or torch.Tensor or None
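
A sketch of the “ghost batch” pattern described above: gradients are accumulated over several batches with model_step, then a single parameter update is made by calling step() without arguments. The enot_optimizer, model, criterion, and batches names are placeholders for objects prepared elsewhere.

    def ghost_batch_step(enot_optimizer, model, criterion, batches):
        """Accumulate gradients over several batches, then update parameters once."""
        enot_optimizer.zero_grad()
        for inputs, labels in batches:  # the small batches forming one "ghost batch"

            def closure():
                loss = criterion(model(inputs), labels)
                loss.backward()  # gradients accumulate across closures
                return loss

            enot_optimizer.model_step(closure)

        # After accumulation, step() must be called without a closure.
        enot_optimizer.step()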

property search_space: enot.models.search_space_model.SearchSpaceModel

Search space model passed to the constructor.

Returns

Search space passed to the ENOT optimizer constructor.

Return type

SearchSpaceModel

state_dict()

Call state_dict of the wrapped optimizer and return the result.

Returns

User optimizer state dict.

Return type

dict with str keys
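
Because state_dict and load_state_dict delegate to the wrapped optimizer, checkpointing follows the usual PyTorch pattern. A hedged sketch, with an illustrative file name and checkpoint layout:

    import torch


    def save_checkpoint(search_space, enot_optimizer, path="pretrain_checkpoint.pth"):
        # state_dict() delegates to the wrapped (user) optimizer.
        torch.save(
            {"search_space": search_space.state_dict(), "optimizer": enot_optimizer.state_dict()},
            path,
        )


    def load_checkpoint(search_space, enot_optimizer, path="pretrain_checkpoint.pth"):
        checkpoint = torch.load(path)
        search_space.load_state_dict(checkpoint["search_space"])
        enot_optimizer.load_state_dict(checkpoint["optimizer"])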

step(closure=None)

Performs a single optimization step (parameter update).

The optimization step includes gradient computation (forward and backward passes, only when closure is not None) and a parameter update. The parameter update is performed by the base optimizer provided by the user.

Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.

Calling this function with a non-None closure argument is equivalent to calling model_step with this closure, followed by a step call without any arguments.

More detailed description of gradient computation and closure structure can be found in model_step function documentation.

Parameters

closure (Callable or None, optional) – A closure (nested function which has access to a free variable from an enclosing function) that performs complete gradient accumulation procedure. Must be None if you accumulated gradients using model_step. Default value is None.

Returns

The result of closure execution, which should be either a loss value stored in torch.Tensor or in float, or None.

Return type

float or torch.Tensor or None

zero_grad()

Call zero_grad of the wrapped optimizer.

Return type

None

Optimal model selection

The pretrained search space is used to select the best combination of operations.

class EnotSearchOptimizer(search_space, optimizer, bn_tune_dataloader=None, bn_tune_batches=10, bn_validation_tune_batches=50, sample_to_model_inputs=<function default_sample_to_model_inputs>, **options)

ENOT optimizer for search phase with batch norm tuning.

__init__(search_space, optimizer, bn_tune_dataloader=None, bn_tune_batches=10, bn_validation_tune_batches=50, sample_to_model_inputs=<function default_sample_to_model_inputs>, **options)

Warning

If you use batch norm tuning, make sure to use the same batch norm tuning dataloader as you used during regular training. This ensures that your model will not suffer an input data distribution shift, which may cause performance degradation or a misaligned search process.

Parameters
  • search_space (SearchSpaceModel) – Search space model to optimize.

  • optimizer (torch.optim.Optimizer) – PyTorch optimizer which will be wrapped by ENOT optimizer.

  • bn_tune_dataloader (torch.utils.data.DataLoader or None, optional) – Dataloader used to tune batch norms during search and validation. It is important to use the same dataloader as you used during regular training; this ensures that your model will not suffer an input data distribution shift, which may cause performance degradation. Default value is None, which disables the batch norm tuning procedure during search.

  • bn_tune_batches (int, optional) – Number of steps (batches) to tune batch norm layers at each search step. Default value is 10.

  • bn_validation_tune_batches (int, optional) – Number of steps (batches) to tune batch norm layers before each validation run. Default value is 50.

  • sample_to_model_inputs (Callable, optional) – Function to map dataloader samples to model input format. Default value is default_sample_to_model_inputs(). See more in Converting dataloader items to PyTorch model inputs.

  • options – Other experimental options (should be ignored by the user).
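
A hedged construction sketch for the search phase. The import path is assumed, and search_space.parameters() is only a placeholder: during search the base optimizer typically updates architecture-related parameters, so pass whichever parameter group SearchSpaceModel recommends for this phase.

    import torch

    from enot.optimize import EnotSearchOptimizer  # assumed import path


    def build_search_optimizer(search_space, train_dataloader, lr=0.01):
        # search_space.parameters() is a placeholder parameter group (see above).
        base_optimizer = torch.optim.Adam(search_space.parameters(), lr=lr)
        return EnotSearchOptimizer(
            search_space=search_space,
            optimizer=base_optimizer,
            # Use the SAME dataloader as during regular training to avoid an
            # input data distribution shift (see the warning above).
            bn_tune_dataloader=train_dataloader,
            bn_tune_batches=10,             # batch norm tuning batches per search step
            bn_validation_tune_batches=50,  # batch norm tuning batches before validation
        )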

add_param_group(param_group)

Call add_param_group of the wrapped optimizer.

Parameters

param_group (dict with str keys) – Parameter group description to add to user optimizer.

Return type

None

property bn_tune_batches: int

The number of batch norm tuning batches for each search step.

Returns

The number of batch norm tuning batches for each search step.

Return type

int

property bn_validation_tune_batches: int

The number of batch norm tuning batches before validation.

Returns

The number of batch norm tuning batches before validation.

Return type

int

load_state_dict(state_dict)

Call load_state_dict of the wrapped optimizer.

Parameters

state_dict (dict with str keys) – State dict to be loaded to user optimizer instance.

Return type

None

property model: torch.nn.modules.module.Module

Model passed to the constructor.

Returns

PyTorch model passed to the ENOT optimizer constructor.

Return type

torch.nn.Module

model_step(closure)

Perform gradient accumulation step.

Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.

To accumulate gradients, this method must perform a complete gradient computation cycle, which consists of a forward step followed by a backward step. To achieve this, it requires a user-defined closure which encapsulates both of these steps.

It is usually enough to calculate model predictions, compute the loss from these predictions, and then compute the gradients of the model parameters by calling loss.backward().

In more sophisticated situations, contact the ENOT team to make sure that the current API covers your use case.

This function must be used in conjunction with the step function.

Usually, you only need to call this function when you need to accumulate gradients over multiple data batches. In this case, call model_step for each data batch within your larger “ghost batch”. After accumulating gradients, call the step function without arguments.

Parameters

closure (Callable) – A closure (nested function which has access to a free variable from an enclosing function) that performs complete gradient accumulation procedure.

Returns

The result of closure execution, which should be either a loss value stored in torch.Tensor or in float, or None.

Return type

float or torch.Tensor or None

prepare_validation_model(architecture_indices=None)

Prepares search space for validation.

This function prepares search space for validation. Specifically, it does the following:

  1. Samples the current best architecture or a user-defined architecture.

  2. Optionally optimizes it.

Warning

It is your responsibility to call this function before running search space evaluation procedure.

Warning

Do not change the sampled architecture in the search space until your validation process is finished.

Parameters

architecture_indices (list of int or None, optional) – Custom architecture to use in validation. Default value is None, in which case the current best architecture is sampled for validation.

Returns

Architecture selected for validation.

Return type

list of int
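
A sketch of the validation flow implied above, assuming the search space, the search optimizer, a validation dataloader, and a loss criterion already exist. prepare_validation_model is called once before evaluation, and the sampled architecture is left untouched until validation finishes.

    import torch


    def validate(enot_optimizer, search_space, validation_dataloader, criterion):
        # Samples the current best architecture (or pass architecture_indices=[...]
        # to evaluate a specific one) and prepares the search space for evaluation.
        selected_architecture = enot_optimizer.prepare_validation_model()

        search_space.eval()
        total_loss = 0.0
        with torch.no_grad():
            for inputs, labels in validation_dataloader:
                total_loss += criterion(search_space(inputs), labels).item()

        return selected_architecture, total_loss / len(validation_dataloader)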

property search_space: enot.models.search_space_model.SearchSpaceModel

Search space model passed to the constructor.

Returns

Search space passed to the ENOT optimizer constructor.

Return type

SearchSpaceModel

state_dict()

Call state_dict of the wrapped optimizer and return the result.

Returns

User optimizer state dict.

Return type

dict with str keys

step(closure=None)

Performs a single optimization step (parameter update).

The optimization step includes gradient computation (forward and backward passes, only when closure is not None) and a parameter update. The parameter update is performed by the base optimizer provided by the user.

Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.

Calling this function with a non-None closure argument is equivalent to calling model_step with this closure, followed by a step call without any arguments.

More detailed description of gradient computation and closure structure can be found in model_step function documentation.

Parameters

closure (Callable or None, optional) – A closure (nested function which has access to a free variable from an enclosing function) that performs complete gradient accumulation procedure. Must be None if you accumulated gradients using model_step. Default value is None.

Returns

The result of closure execution, which should be either a loss value stored in torch.Tensor or in float, or None.

Return type

float or torch.Tensor or None

zero_grad()

Call zero_grad of the wrapped optimizer.

Return type

None

class EnotFixedLatencySearchOptimizer(search_space, optimizer, max_latency_value, bn_tune_dataloader=None, bn_tune_batches=10, bn_validation_tune_batches=50, sample_to_model_inputs=<function default_sample_to_model_inputs>, **options)

ENOT optimizer for the search phase with a latency upper bound.

__init__(search_space, optimizer, max_latency_value, bn_tune_dataloader=None, bn_tune_batches=10, bn_validation_tune_batches=50, sample_to_model_inputs=<function default_sample_to_model_inputs>, **options)

Warning

If you use batch norm tuning, make sure to use the same batch norm tuning dataloader as you used during regular training. This ensures that your model will not suffer an input data distribution shift, which may cause performance degradation or a misaligned search process.

Parameters
  • search_space (SearchSpaceModel) – Search space model to optimize.

  • optimizer (torch.optim.Optimizer) – PyTorch optimizer to be replaced with the ENOT optimizer.

  • max_latency_value (float) – Maximum latency for the search phase. The search process is constrained by this value, so all architectures sampled for validation will have latency less than or equal to it.

  • bn_tune_dataloader (torch.utils.data.DataLoader or None, optional) – Dataloader used to tune batch norms during search and validation. It is important to use the same dataloader as you used during regular training; this ensures that your model will not suffer an input data distribution shift, which may cause performance degradation. Default value is None, which disables the batch norm tuning procedure during search.

  • bn_tune_batches (int, optional) – Number of steps (batches) to tune batch norm layers at each search step. Default value is 10.

  • bn_validation_tune_batches (int, optional) – Number of steps (batches) to tune batch norm layers before each validation run. Default value is 50.

  • sample_to_model_inputs (Callable, optional) – Function to map dataloader samples to model input format. Default value is default_sample_to_model_inputs(). See more in Converting dataloader items to PyTorch model inputs.

  • options – Other experimental options (should be ignored by the user).
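
The same construction sketch as for EnotSearchOptimizer, extended with the latency bound; the import path and the parameter group passed to the base optimizer are again assumptions.

    import torch

    from enot.optimize import EnotFixedLatencySearchOptimizer  # assumed import path


    def build_latency_constrained_optimizer(search_space, train_dataloader, max_latency, lr=0.01):
        base_optimizer = torch.optim.Adam(search_space.parameters(), lr=lr)  # placeholder parameter group
        return EnotFixedLatencySearchOptimizer(
            search_space=search_space,
            optimizer=base_optimizer,
            # Architectures sampled for validation will have latency <= this bound.
            max_latency_value=max_latency,
            bn_tune_dataloader=train_dataloader,  # same dataloader as regular training
        )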

add_param_group(param_group)

Call add_param_group of the wrapped optimizer.

Parameters

param_group (dict with str keys) – Parameter group description to add to user optimizer.

Return type

None

property bn_tune_batches: int

The number of batch norm tuning batches for each search step.

Returns

The number of batch norm tuning batches for each search step.

Return type

int

property bn_validation_tune_batches: int

The number of batch norm tuning batches before validation.

Returns

The number of batch norm tuning batches before validation.

Return type

int

load_state_dict(state_dict)

Call load_state_dict of the wrapped optimizer.

Parameters

state_dict (dict with str keys) – State dict to be loaded to user optimizer instance.

Return type

None

property max_latency_value: float

Maximum latency value (upper bound) for the search process.

Returns

Maximum latency value (upper bound) for the search process.

Return type

float

property model: torch.nn.modules.module.Module

Model passed to the constructor.

Returns

PyTorch model passed to the ENOT optimizer constructor.

Return type

torch.nn.Module

model_step(closure)

Perform gradient accumulation step.

Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.

To accumulate gradients, this method must perform a complete gradient computation cycle, which consists of a forward step followed by a backward step. To achieve this, it requires a user-defined closure which encapsulates both of these steps.

It is usually enough to calculate model predictions, compute the loss from these predictions, and then compute the gradients of the model parameters by calling loss.backward().

In more sophisticated situations, contact the ENOT team to make sure that the current API covers your use case.

This function must be used in conjunction with the step function.

Usually, you only need to call this function when you need to accumulate gradients over multiple data batches. In this case, call model_step for each data batch within your larger “ghost batch”. After accumulating gradients, call the step function without arguments.

Parameters

closure (Callable) – A closure (nested function which has access to a free variable from an enclosing function) that performs complete gradient accumulation procedure.

Returns

The result of closure execution, which should be either a loss value stored in torch.Tensor or in float, or None.

Return type

float or torch.Tensor or None

prepare_validation_model(architecture_indices=None)

Prepares search space for validation.

This function prepares search space for validation. Specifically, it does the following:

  1. Samples the current best architecture or a user-defined architecture.

  2. Optionally optimizes it.

Warning

It is your responsibility to call this function before running search space evaluation procedure.

Warning

Do not change the sampled architecture in the search space until your validation process is finished.

Parameters

architecture_indices (list of int or None, optional) – Custom architecture to use in validation. Default value is None, in which case the current best architecture is sampled for validation.

Returns

Architecture selected for validation.

Return type

list of int

property search_space: enot.models.search_space_model.SearchSpaceModel

Search space model passed to the constructor.

Returns

Search space passed to the ENOT optimizer constructor.

Return type

SearchSpaceModel

state_dict()

Call state_dict of the wrapped optimizer and return the result.

Returns

User optimizer state dict.

Return type

dict with str keys

step(closure=None)

Performs a single optimization step (parameter update).

The optimization step includes gradient computation (forward and backward passes, only when closure is not None) and a parameter update. The parameter update is performed by the base optimizer provided by the user.

Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.

Calling this function with a non-None closure argument is equivalent to calling model_step with this closure, followed by a step call without any arguments.

More detailed description of gradient computation and closure structure can be found in model_step function documentation.

Parameters

closure (Callable or None, optional) – A closure (nested function which has access to a free variable from an enclosing function) that performs complete gradient accumulation procedure. Must be None if you accumulated gradients using model_step. Default value is None.

Returns

The result of closure execution, which should be either a loss value stored in torch.Tensor or in float, or None.

Return type

float or torch.Tensor or None

zero_grad()

Call zero_grad of the wrapped optimizer.

Return type

None

Optimal model tuning

The final model is tuned for a few epochs to match its standalone accuracy. Sometimes, however, it is necessary to re-train the obtained architecture from scratch.

class EnotTrainOptimizer(model, optimizer, **options)

ENOT optimizer for train and tune phases.

__init__(model, optimizer, **options)
Parameters
  • model (torch.nn.Module) – PyTorch model to optimize.

  • optimizer (torch.optim.Optimizer) – PyTorch optimizer which will be wrapped by ENOT optimizer.

  • options – Experimental options (should be ignored by the user).
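
A hedged sketch of a tune epoch. EnotTrainOptimizer works with a regular torch.nn.Module (for example, the model obtained after the search phase; how to extract it from the search space is outside the scope of this snippet). The import path is assumed.

    import torch
    from torch import nn

    from enot.optimize import EnotTrainOptimizer  # assumed import path


    def tune_one_epoch(model: nn.Module, dataloader, criterion, lr=0.01):
        """Run one tune epoch over the model obtained after the search phase."""
        base_optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
        enot_optimizer = EnotTrainOptimizer(model=model, optimizer=base_optimizer)

        model.train()
        for inputs, labels in dataloader:
            enot_optimizer.zero_grad()

            def closure():
                loss = criterion(model(inputs), labels)
                loss.backward()
                return loss

            enot_optimizer.step(closure)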

add_param_group(param_group)

Call add_param_group of the wrapped optimizer.

Parameters

param_group (dict with str keys) – Parameter group description to add to user optimizer.

Return type

None

load_state_dict(state_dict)

Call load_state_dict of the wrapped optimizer.

Parameters

state_dict (dict with str keys) – State dict to be loaded to user optimizer instance.

Return type

None

property model: torch.nn.modules.module.Module

Model passed to the constructor.

Returns

PyTorch model passed to the ENOT optimizer constructor.

Return type

torch.nn.Module

model_step(closure)

Perform gradient accumulation step.

Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.

To accumulate gradients, this method must perform a complete gradient computation cycle, which consists of a forward step followed by a backward step. To achieve this, it requires a user-defined closure which encapsulates both of these steps.

It is usually enough to calculate model predictions, compute the loss from these predictions, and then compute the gradients of the model parameters by calling loss.backward().

In more sophisticated situations, contact the ENOT team to make sure that the current API covers your use case.

This function must be used in conjunction with the step function.

Usually, you only need to call this function when you need to accumulate gradients over multiple data batches. In this case, call model_step for each data batch within your larger “ghost batch”. After accumulating gradients, call the step function without arguments.

Parameters

closure (Callable) – A closure (nested function which has access to a free variable from an enclosing function) that performs complete gradient accumulation procedure.

Returns

The result of closure execution, which should be either a loss value stored in torch.Tensor or in float, or None.

Return type

float or torch.Tensor or None

state_dict()

Call state_dict of the wrapped optimizer and return the result.

Returns

User optimizer state dict.

Return type

dict with str keys

step(closure=None)

Performs a single optimization step (parameter update).

The optimization step includes gradient computation (forward and backward passes, only when closure is not None) and a parameter update. The parameter update is performed by the base optimizer provided by the user.

Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.

Calling this function with a non-None closure argument is equivalent to calling model_step with this closure, followed by a step call without any arguments.

More detailed description of gradient computation and closure structure can be found in model_step function documentation.

Parameters

closure (Callable or None, optional) – A closure (nested function which has access to a free variable from an enclosing function) that performs complete gradient accumulation procedure. Must be None if you accumulated gradients using model_step. Default value is None.

Returns

The result of closure execution, which should be either a loss value stored in torch.Tensor or in float, or None.

Return type

float or torch.Tensor or None

zero_grad()

Call zero_grad of the wrapped optimizer.

Return type

None