ENOT optimization package¶
The enot.optimize package contains optimizer definitions for the three main neural architecture search stages.
To optimize a neural network architecture, follow these steps:
1. Train the search space (pretrain).
2. Search for the optimal architecture (search).
3. Tune the obtained model (tune).
- build_optimizer(phase_name, model, optimizer, **options)¶
Builds and returns an ENOT optimizer for the given phase name.
- Parameters
phase_name ({"pretrain", "search", "train"}) – Name of the phase. Must be one of "pretrain", "search", or "train".
model (torch.nn.Module) – PyTorch model to optimize.
optimizer (torch.optim.Optimizer) – PyTorch optimizer to be wrapped by the ENOT optimizer.
options – ENOT optimizer options.
- Returns
Suitable ENOT optimizer instance.
- Return type
BaseOptimizer
- Raises
ValueError – If the phase name or any of the optimizer options is unknown.
TypeError – If the model type is not suitable for the selected phase, or if the optimizer is not an instance of torch.optim.Optimizer.
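The phase-name validation described above can be sketched in plain Python. This is a toy stand-in, not the enot.optimize implementation: the function resolve_phase and the mapping PHASE_TO_OPTIMIZER are hypothetical, and which concrete class build_optimizer returns for each phase is an assumption based on the class descriptions in this document.

```python
# Toy stand-in for phase-name dispatch; NOT the enot.optimize implementation.
# The phase-to-class mapping is an assumption drawn from the class descriptions.
PHASE_TO_OPTIMIZER = {
    "pretrain": "PretrainOptimizer",
    "search": "SearchOptimizer",
    "train": "TrainOptimizer",
}


def resolve_phase(phase_name: str) -> str:
    """Return the optimizer class name assumed for the given phase."""
    try:
        return PHASE_TO_OPTIMIZER[phase_name]
    except KeyError:
        # Mirrors the documented behaviour: unknown phase names raise ValueError.
        raise ValueError(
            f"Unknown phase name {phase_name!r}; "
            f"expected one of {sorted(PHASE_TO_OPTIMIZER)}"
        ) from None
```

Note that the accepted phase name for the tuning stage is "train"; passing "tune" to this sketch raises ValueError.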
Search space pre-training¶
Neural architecture search starts with creating the search space and pretraining it, so that the best operations can be selected later.
- class PretrainOptimizer(search_space, optimizer, check_recommended_optimizations=True, **options)¶
Optimizer for the pretrain phase.
- __init__(search_space, optimizer, check_recommended_optimizations=True, **options)¶
- Parameters
search_space (SearchSpaceModel) – Search space model to optimize.
optimizer (torch.optim.Optimizer) – PyTorch optimizer to be wrapped by the pretrain optimizer.
check_recommended_optimizations (bool, optional) – Whether to use recommended optimizations. Default value is True.
options – Other experimental options (should be ignored by user).
- add_param_group(param_group)¶
Call add_param_group of the wrapped optimizer.
- Parameters
param_group (dict with str keys) – Parameter group description to add to user optimizer.
- Return type
None
- load_state_dict(state_dict)¶
Call load_state_dict of the wrapped optimizer.
- Parameters
state_dict (dict with str keys) – State dict to be loaded to user optimizer instance.
- Return type
None
- property model: torch.nn.modules.module.Module¶
Model passed to the constructor.
- Returns
PyTorch model passed to the optimizer constructor.
- Return type
torch.nn.Module
- model_step(closure)¶
Performs a gradient accumulation step.
Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.
To accumulate gradients, this method must perform a complete gradient computation cycle, which consists of a forward step followed by a backward step. To achieve this, it requires a user-defined closure that encapsulates both steps.
It is usually enough to calculate model predictions, compute the loss function from those predictions, and then run backpropagation to compute the gradients of the model parameters by calling loss.backward().
In more sophisticated situations, contact the ENOT team to make sure that the current API fits your use case.
This function must be used in conjunction with the step function.
Usually, you only need to call this function when you need gradient accumulation over multiple data batches. In this case, call model_step for each data batch within your larger “ghost batch”. After accumulating gradients, call the step function without arguments.
- Parameters
closure (Callable) – A closure (nested function which has access to a free variable from an enclosing function) that performs the complete gradient accumulation procedure.
- Returns
The result of the closure execution, which should be a loss value stored in a torch.Tensor or a float, or None.
- Return type
float or torch.Tensor or None
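The "ghost batch" calling pattern described for model_step can be illustrated with a small pure-Python stand-in. ToyAccumulatingOptimizer is hypothetical and is not ENOT code; it only mimics the contract of calling model_step once per data batch and then step without arguments. The hand-written gradient plays the role that loss.backward() plays in PyTorch.

```python
from typing import Callable, List, Optional


class ToyAccumulatingOptimizer:
    """Pure-Python stand-in mimicking the model_step/step calling contract."""

    def __init__(self, params: List[float], lr: float) -> None:
        self.params = params
        self.grads = [0.0] * len(params)
        self.lr = lr

    def model_step(self, closure: Callable[[], Optional[float]]) -> Optional[float]:
        # The closure runs forward + backward and adds into self.grads.
        return closure()

    def step(self) -> None:
        # Apply the accumulated gradients. Resetting them here is a toy
        # simplification: PyTorch optimizers need an explicit zero_grad().
        for i, grad in enumerate(self.grads):
            self.params[i] -= self.lr * grad
            self.grads[i] = 0.0


opt = ToyAccumulatingOptimizer(params=[0.0], lr=0.1)
ghost_batch = [(1.0, 2.0), (2.0, 4.0)]  # one (x, y) pair per data batch

for x, y in ghost_batch:
    def closure(x: float = x, y: float = y) -> float:
        w = opt.params[0]
        # Forward: squared error, averaged over the ghost batch.
        loss = (w * x - y) ** 2 / len(ghost_batch)
        # "Backward": d(loss)/dw, accumulated across data batches.
        opt.grads[0] += 2 * (w * x - y) * x / len(ghost_batch)
        return loss

    opt.model_step(closure)

opt.step()  # single parameter update after all gradients are accumulated
```

Dividing each loss by the number of data batches keeps the accumulated gradient equal to the gradient of the mean loss over the whole ghost batch.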
- property search_space: enot.models.search_space_model.SearchSpaceModel¶
Search space model passed to the constructor.
- Returns
Search space passed to the ENOT optimizer constructor.
- Return type
SearchSpaceModel
- state_dict()¶
Call state_dict of the wrapped optimizer and return the result.
- Returns
User optimizer state dict.
- Return type
dict with str keys
- step(closure=None)¶
Performs a single optimization step (parameter update).
An optimization step includes gradient computation (forward and backward passes, only when closure is not None) and a parameter update. The parameter update is performed by the base optimizer provided by the user.
Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.
Calling this function with a non-None closure argument is equivalent to calling model_step with this closure, followed by a step call without any argument.
A more detailed description of gradient computation and closure structure can be found in the model_step function documentation.
- Parameters
closure (Callable or None, optional) – A closure (nested function which has access to a free variable from an enclosing function) that performs the complete gradient accumulation procedure. Must be None if you accumulated gradients using model_step. Default value is None.
- Returns
The result of the closure execution, which should be a loss value stored in a torch.Tensor or a float, or None.
- Return type
float or torch.Tensor or None
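The stated equivalence between step(closure) and model_step(closure) followed by step() can be checked on a toy stand-in. ToyOptimizer and make_closure are hypothetical names used only for illustration; the sketch assumes nothing about ENOT internals beyond the calling contract described here.

```python
from typing import Callable, Optional


class ToyOptimizer:
    """Stand-in for the documented step/model_step contract (not ENOT code)."""

    def __init__(self, weight: float, lr: float) -> None:
        self.weight = weight
        self.grad = 0.0
        self.lr = lr

    def model_step(self, closure: Callable[[], float]) -> float:
        # Forward and backward both happen inside the closure.
        return closure()

    def step(self, closure: Optional[Callable[[], float]] = None) -> Optional[float]:
        # With a closure: compute gradients first, then update the weight.
        result = self.model_step(closure) if closure is not None else None
        self.weight -= self.lr * self.grad
        self.grad = 0.0  # toy simplification instead of an explicit zero_grad()
        return result


def make_closure(opt: ToyOptimizer, x: float, y: float) -> Callable[[], float]:
    def closure() -> float:
        error = opt.weight * x - y
        opt.grad += 2 * error * x  # gradient of error**2 w.r.t. weight
        return error ** 2

    return closure


# Variant A: a single call with a closure.
a = ToyOptimizer(weight=0.0, lr=0.1)
loss_a = a.step(make_closure(a, x=1.0, y=3.0))

# Variant B: explicit gradient computation, then an argument-free step.
b = ToyOptimizer(weight=0.0, lr=0.1)
loss_b = b.model_step(make_closure(b, x=1.0, y=3.0))
b.step()
```

Both variants end with the same weight and return the same loss, which is the property the closure argument is documented to have.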
Optimal model selection¶
The pretrained search space is used to select the best combination of operations.
- class SearchOptimizer(search_space, optimizer, bn_tune_dataloader=None, bn_tune_batches=10, bn_validation_tune_batches=50, sample_to_model_inputs=<function default_sample_to_model_inputs>, **options)¶
Optimizer for search phase with batch norm tuning.
- __init__(search_space, optimizer, bn_tune_dataloader=None, bn_tune_batches=10, bn_validation_tune_batches=50, sample_to_model_inputs=<function default_sample_to_model_inputs>, **options)¶
Warning
If you use batch norm tuning, make sure to use the same batch norm tuning dataloader as you used during regular training. This prevents an input data distribution shift, which may cause performance degradation or a misaligned search process.
- Parameters
search_space (SearchSpaceModel) – Search space model to optimize.
optimizer (torch.optim.Optimizer) – PyTorch optimizer which will be wrapped by our optimizer.
bn_tune_dataloader (torch.utils.data.DataLoader or None, optional) – Dataloader to tune batch norms during search and validation. It is important to use the same dataloader as you used during regular training. This prevents an input data distribution shift, which may cause performance degradation. Default value is None, which disables batch norm tuning during the search procedure.
bn_tune_batches (int, optional) – Number of steps (batches) to tune batch norm layers at each search step. Default value is 10.
bn_validation_tune_batches (int, optional) – Number of steps (batches) to tune batch norm layers before each validation run. Default value is 50.
sample_to_model_inputs (Callable, optional) – Function to map dataloader samples to model input format. Default value is default_sample_to_model_inputs().
options – Other experimental options (should be ignored by user).
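The batch norm tuning procedure itself is not described in this reference. As a rough illustration of what a batch budget such as bn_tune_batches controls, the sketch below assumes that tuning means re-estimating running mean and variance over a fixed number of batches. The function tune_running_stats and its momentum argument are hypothetical and are not part of enot.optimize.

```python
from itertools import islice
from typing import Iterable, List, Tuple


def tune_running_stats(
    batches: Iterable[List[float]],
    num_batches: int,
    momentum: float = 0.1,
) -> Tuple[float, float]:
    """Re-estimate running mean/variance from at most num_batches batches."""
    running_mean, running_var = 0.0, 1.0
    # islice stops after num_batches, like a fixed tuning budget would.
    for batch in islice(batches, num_batches):
        mean = sum(batch) / len(batch)
        # Biased (population) variance, used here for simplicity.
        var = sum((value - mean) ** 2 for value in batch) / len(batch)
        running_mean = (1 - momentum) * running_mean + momentum * mean
        running_var = (1 - momentum) * running_var + momentum * var
    return running_mean, running_var


# Five identical batches are available, but only two are consumed.
stream = iter([[1.0, 3.0]] * 5)
stats = tune_running_stats(stream, num_batches=2, momentum=0.5)
remaining = len(list(stream))
```

Under this assumption, a larger budget such as the default of 50 batches before validation gives the statistics more data to converge than the 10 batches used at each search step.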
- add_param_group(param_group)¶
Call add_param_group of the wrapped optimizer.
- Parameters
param_group (dict with str keys) – Parameter group description to add to user optimizer.
- Return type
None
- property bn_tune_batches: int¶
The number of batch norm tuning batches for each search step.
- Returns
The number of batch norm tuning batches for each search step.
- Return type
int
- property bn_validation_tune_batches: int¶
The number of batch norm tuning batches before validation.
- Returns
The number of batch norm tuning batches before validation.
- Return type
int
- load_state_dict(state_dict)¶
Call load_state_dict of the wrapped optimizer.
- Parameters
state_dict (dict with str keys) – State dict to be loaded to user optimizer instance.
- Return type
None
- property model: torch.nn.modules.module.Module¶
Model passed to the constructor.
- Returns
PyTorch model passed to the optimizer constructor.
- Return type
torch.nn.Module
- model_step(closure)¶
Performs a gradient accumulation step.
Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.
To accumulate gradients, this method must perform a complete gradient computation cycle, which consists of a forward step followed by a backward step. To achieve this, it requires a user-defined closure that encapsulates both steps.
It is usually enough to calculate model predictions, compute the loss function from those predictions, and then run backpropagation to compute the gradients of the model parameters by calling loss.backward().
In more sophisticated situations, contact the ENOT team to make sure that the current API fits your use case.
This function must be used in conjunction with the step function.
Usually, you only need to call this function when you need gradient accumulation over multiple data batches. In this case, call model_step for each data batch within your larger “ghost batch”. After accumulating gradients, call the step function without arguments.
- Parameters
closure (Callable) – A closure (nested function which has access to a free variable from an enclosing function) that performs the complete gradient accumulation procedure.
- Returns
The result of the closure execution, which should be a loss value stored in a torch.Tensor or a float, or None.
- Return type
float or torch.Tensor or None
- prepare_validation_model(architecture_indices=None)¶
Prepares the search space for validation.
Specifically, this function does the following:
1. Samples the current best architecture or a user-defined architecture.
2. Optionally optimizes it.
Warning
It is your responsibility to call this function before running the search space evaluation procedure.
Warning
Do not change the sampled architecture in the search space until your validation process is finished.
- Parameters
architecture_indices (list of int or None, optional) – Custom architecture to use in validation. Default value is None, in which case the current best architecture is sampled for validation.
- Returns
Architecture selected for validation.
- Return type
list of int
- property search_space: enot.models.search_space_model.SearchSpaceModel¶
Search space model passed to the constructor.
- Returns
Search space passed to the ENOT optimizer constructor.
- Return type
SearchSpaceModel
- state_dict()¶
Call state_dict of the wrapped optimizer and return the result.
- Returns
User optimizer state dict.
- Return type
dict with str keys
- step(closure=None)¶
Performs a single optimization step (parameter update).
An optimization step includes gradient computation (forward and backward passes, only when closure is not None) and a parameter update. The parameter update is performed by the base optimizer provided by the user.
Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.
Calling this function with a non-None closure argument is equivalent to calling model_step with this closure, followed by a step call without any argument.
A more detailed description of gradient computation and closure structure can be found in the model_step function documentation.
- Parameters
closure (Callable or None, optional) – A closure (nested function which has access to a free variable from an enclosing function) that performs the complete gradient accumulation procedure. Must be None if you accumulated gradients using model_step. Default value is None.
- Returns
The result of the closure execution, which should be a loss value stored in a torch.Tensor or a float, or None.
- Return type
float or torch.Tensor or None
- class FixedLatencySearchOptimizer(search_space, optimizer, max_latency_value, bn_tune_dataloader=None, bn_tune_batches=10, bn_validation_tune_batches=50, sample_to_model_inputs=<function default_sample_to_model_inputs>, **options)¶
Optimizer for the search phase with an upper bound on latency.
- __init__(search_space, optimizer, max_latency_value, bn_tune_dataloader=None, bn_tune_batches=10, bn_validation_tune_batches=50, sample_to_model_inputs=<function default_sample_to_model_inputs>, **options)¶
Warning
If you use batch norm tuning, make sure to use the same batch norm tuning dataloader as you used during regular training. This prevents an input data distribution shift, which may cause performance degradation or a misaligned search process.
- Parameters
search_space (SearchSpaceModel) – Search space model to optimize.
optimizer (torch.optim.Optimizer) – PyTorch optimizer which will be wrapped by our optimizer.
max_latency_value (float) – Maximum latency for the search phase. The search process is constrained by this value, so all architectures sampled for validation have latency less than or equal to it.
bn_tune_dataloader (torch.utils.data.DataLoader or None, optional) – Dataloader to tune batch norms during search and validation. It is important to use the same dataloader as you used during regular training. This prevents an input data distribution shift, which may cause performance degradation. Default value is None, which disables batch norm tuning during the search procedure.
bn_tune_batches (int, optional) – Number of steps (batches) to tune batch norm layers at each search step. Default value is 10.
bn_validation_tune_batches (int, optional) – Number of steps (batches) to tune batch norm layers before each validation run. Default value is 50.
sample_to_model_inputs (Callable, optional) – Function to map dataloader samples to model input format. Default value is default_sample_to_model_inputs().
options – Other experimental options (should be ignored by user).
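The meaning of the max_latency_value constraint (latency less than or equal to the bound) can be illustrated with a simple filter over candidate architectures. This shows only the constraint semantics: the actual search procedure is not an exhaustive filter, and the function best_within_latency, the Candidate tuple layout, and the sample numbers are all hypothetical.

```python
from typing import List, Optional, Tuple

# (architecture indices, latency, validation score); layout is hypothetical.
Candidate = Tuple[List[int], float, float]


def best_within_latency(
    candidates: List[Candidate],
    max_latency_value: float,
) -> Optional[List[int]]:
    """Return the best-scoring architecture whose latency fits the bound."""
    # Keep candidates with latency less than or equal to the bound.
    feasible = [c for c in candidates if c[1] <= max_latency_value]
    if not feasible:
        return None
    return max(feasible, key=lambda c: c[2])[0]


candidates = [
    ([0, 1, 2], 12.0, 0.71),  # best score, but too slow for a bound of 10.0
    ([1, 1, 0], 9.5, 0.69),
    ([2, 0, 1], 10.0, 0.70),  # exactly at the bound, so still feasible
]
chosen = best_within_latency(candidates, max_latency_value=10.0)
```

The bound is inclusive, so an architecture whose latency equals max_latency_value remains eligible.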
- add_param_group(param_group)¶
Call add_param_group of the wrapped optimizer.
- Parameters
param_group (dict with str keys) – Parameter group description to add to user optimizer.
- Return type
None
- property bn_tune_batches: int¶
The number of batch norm tuning batches for each search step.
- Returns
The number of batch norm tuning batches for each search step.
- Return type
int
- property bn_validation_tune_batches: int¶
The number of batch norm tuning batches before validation.
- Returns
The number of batch norm tuning batches before validation.
- Return type
int
- load_state_dict(state_dict)¶
Call load_state_dict of the wrapped optimizer.
- Parameters
state_dict (dict with str keys) – State dict to be loaded to user optimizer instance.
- Return type
None
- property max_latency_value: float¶
Maximum latency value (upper bound) for the search process.
- Returns
Maximum latency value (upper bound) for the search process.
- Return type
float
- property model: torch.nn.modules.module.Module¶
Model passed to the constructor.
- Returns
PyTorch model passed to the optimizer constructor.
- Return type
torch.nn.Module
- model_step(closure)¶
Performs a gradient accumulation step.
Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.
To accumulate gradients, this method must perform a complete gradient computation cycle, which consists of a forward step followed by a backward step. To achieve this, it requires a user-defined closure that encapsulates both steps.
It is usually enough to calculate model predictions, compute the loss function from those predictions, and then run backpropagation to compute the gradients of the model parameters by calling loss.backward().
In more sophisticated situations, contact the ENOT team to make sure that the current API fits your use case.
This function must be used in conjunction with the step function.
Usually, you only need to call this function when you need gradient accumulation over multiple data batches. In this case, call model_step for each data batch within your larger “ghost batch”. After accumulating gradients, call the step function without arguments.
- Parameters
closure (Callable) – A closure (nested function which has access to a free variable from an enclosing function) that performs the complete gradient accumulation procedure.
- Returns
The result of the closure execution, which should be a loss value stored in a torch.Tensor or a float, or None.
- Return type
float or torch.Tensor or None
- prepare_validation_model(architecture_indices=None)¶
Prepares the search space for validation.
Specifically, this function does the following:
1. Samples the current best architecture or a user-defined architecture.
2. Optionally optimizes it.
Warning
It is your responsibility to call this function before running the search space evaluation procedure.
Warning
Do not change the sampled architecture in the search space until your validation process is finished.
- Parameters
architecture_indices (list of int or None, optional) – Custom architecture to use in validation. Default value is None, in which case the current best architecture is sampled for validation.
- Returns
Architecture selected for validation.
- Return type
list of int
- property search_space: enot.models.search_space_model.SearchSpaceModel¶
Search space model passed to the constructor.
- Returns
Search space passed to the ENOT optimizer constructor.
- Return type
SearchSpaceModel
- state_dict()¶
Call state_dict of the wrapped optimizer and return the result.
- Returns
User optimizer state dict.
- Return type
dict with str keys
- step(closure=None)¶
Performs a single optimization step (parameter update).
An optimization step includes gradient computation (forward and backward passes, only when closure is not None) and a parameter update. The parameter update is performed by the base optimizer provided by the user.
Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.
Calling this function with a non-None closure argument is equivalent to calling model_step with this closure, followed by a step call without any argument.
A more detailed description of gradient computation and closure structure can be found in the model_step function documentation.
- Parameters
closure (Callable or None, optional) – A closure (nested function which has access to a free variable from an enclosing function) that performs the complete gradient accumulation procedure. Must be None if you accumulated gradients using model_step. Default value is None.
- Returns
The result of the closure execution, which should be a loss value stored in a torch.Tensor or a float, or None.
- Return type
float or torch.Tensor or None
Optimal model tuning¶
The final model is tuned for several epochs to match its standalone accuracy. Sometimes, however, it is necessary to retrain the obtained architecture from scratch.
- class TrainOptimizer(model, optimizer, **options)¶
ENOT optimizer for train and tune phases.
- __init__(model, optimizer, **options)¶
- Parameters
model (torch.nn.Module) – PyTorch model to optimize.
optimizer (torch.optim.Optimizer) – PyTorch optimizer which will be wrapped by ENOT optimizer.
options – Experimental options (should be ignored by user).
- add_param_group(param_group)¶
Call add_param_group of the wrapped optimizer.
- Parameters
param_group (dict with str keys) – Parameter group description to add to user optimizer.
- Return type
None
- load_state_dict(state_dict)¶
Call load_state_dict of the wrapped optimizer.
- Parameters
state_dict (dict with str keys) – State dict to be loaded to user optimizer instance.
- Return type
None
- property model: torch.nn.modules.module.Module¶
Model passed to the constructor.
- Returns
PyTorch model passed to the optimizer constructor.
- Return type
torch.nn.Module
- model_step(closure)¶
Performs a gradient accumulation step.
Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.
To accumulate gradients, this method must perform a complete gradient computation cycle, which consists of a forward step followed by a backward step. To achieve this, it requires a user-defined closure that encapsulates both steps.
It is usually enough to calculate model predictions, compute the loss function from those predictions, and then run backpropagation to compute the gradients of the model parameters by calling loss.backward().
In more sophisticated situations, contact the ENOT team to make sure that the current API fits your use case.
This function must be used in conjunction with the step function.
Usually, you only need to call this function when you need gradient accumulation over multiple data batches. In this case, call model_step for each data batch within your larger “ghost batch”. After accumulating gradients, call the step function without arguments.
- Parameters
closure (Callable) – A closure (nested function which has access to a free variable from an enclosing function) that performs the complete gradient accumulation procedure.
- Returns
The result of the closure execution, which should be a loss value stored in a torch.Tensor or a float, or None.
- Return type
float or torch.Tensor or None
- state_dict()¶
Call state_dict of the wrapped optimizer and return the result.
- Returns
User optimizer state dict.
- Return type
dict with str keys
- step(closure=None)¶
Performs a single optimization step (parameter update).
An optimization step includes gradient computation (forward and backward passes, only when closure is not None) and a parameter update. The parameter update is performed by the base optimizer provided by the user.
Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.
Calling this function with a non-None closure argument is equivalent to calling model_step with this closure, followed by a step call without any argument.
A more detailed description of gradient computation and closure structure can be found in the model_step function documentation.
- Parameters
closure (Callable or None, optional) – A closure (nested function which has access to a free variable from an enclosing function) that performs the complete gradient accumulation procedure. Must be None if you accumulated gradients using model_step. Default value is None.
- Returns
The result of the closure execution, which should be a loss value stored in a torch.Tensor or a float, or None.
- Return type
float or torch.Tensor or None