ENOT optimization package
The enot.optimize package contains optimizer definitions for the three main neural architecture search stages.
To optimize a neural network architecture, you should do the following:
Train the search space (pretrain)
Search for the optimal architecture (search)
Tune the obtained model (tune)
- build_optimizer(phase_name, model, optimizer, **options)
Builds and returns an ENOT optimizer for the given phase name.
- Parameters:
phase_name ({"pretrain", "search", "train"}) – Name of the phase. Must be one of "pretrain", "search", or "train".
model (torch.nn.Module) – PyTorch model to optimize.
optimizer (torch.optim.Optimizer) – PyTorch optimizer to be replaced with the ENOT optimizer.
options – ENOT optimizer options.
- Returns:
Suitable ENOT optimizer instance.
- Return type:
BaseOptimizer
- Raises:
ValueError – If the phase name or any optimizer option is unknown.
TypeError – If the model type is not suitable for the selected phase, or if the optimizer instance is not a subclass of torch.optim.Optimizer.
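As a rough illustration of how build_optimizer might be called for each phase, the sketch below wraps it in a small helper. The helper name make_phase_optimizer and the import path from enot.optimize are assumptions; only the build_optimizer signature and the three phase names come from the description above.

```python
VALID_PHASES = ("pretrain", "search", "train")


def make_phase_optimizer(phase_name, model, base_optimizer, **options):
    """Hypothetical helper: wrap ``base_optimizer`` for one ENOT phase."""
    # Fail early with the same exception type that build_optimizer documents.
    if phase_name not in VALID_PHASES:
        raise ValueError(f"Unknown phase name: {phase_name!r}")

    # Assumed import path; imported lazily so the check above works
    # even where ENOT is not installed.
    from enot.optimize import build_optimizer

    return build_optimizer(phase_name, model, base_optimizer, **options)
```

With this helper, an unknown name such as "finetune" raises ValueError before ENOT is imported, while a valid name is passed straight through to build_optimizer.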
Search space pre-training
Neural architecture search starts by creating the search space and pretraining it, so that the best operations can be selected later.
- class PretrainOptimizer(search_space, optimizer, check_recommended_optimizations=True, **options)
Optimizer for the pretrain phase.
- __init__(search_space, optimizer, check_recommended_optimizations=True, **options)
- Parameters:
search_space (SearchSpaceModel) – Search space model to optimize.
optimizer (torch.optim.Optimizer) – PyTorch optimizer to be replaced with the Pretrain optimizer.
check_recommended_optimizations (bool, optional) – Whether to use recommended optimizations. Default value is True.
options – Other experimental options (should be ignored by user).
- add_param_group(param_group)
Call add_param_group of the wrapped optimizer.
- Parameters:
param_group (dict with str keys) – Parameter group description to add to user optimizer.
- Return type:
None
- load_state_dict(state_dict)
Call load_state_dict of the wrapped optimizer.
- Parameters:
state_dict (dict with str keys) – State dict to be loaded to user optimizer instance.
- Return type:
None
- property model: Module
Model passed to the constructor.
- Returns:
PyTorch model passed to the optimizer constructor.
- Return type:
Module
- model_step(closure)
Performs a gradient accumulation step.
Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.
To accumulate gradients, this method must perform a complete gradient computation cycle, which consists of a forward step followed by a backward step. To achieve this, it requires a user-defined closure that encapsulates both steps.
It is usually enough to calculate model predictions, compute the loss function from those predictions, and then run backpropagation to compute the gradients of the model parameters by calling loss.backward().
In more sophisticated situations, contact the ENOT team to make sure that the current API supports your use case.
This function must be used in conjunction with the step function.
Usually, you only need to call this function when you need gradient accumulation over multiple data batches. In this case, call model_step for each data batch within your larger “ghost batch”. After accumulating gradients, call the step function without arguments.
- Parameters:
closure (Callable) – A closure (a nested function that has access to free variables from an enclosing function) that performs the complete gradient accumulation procedure.
- Returns:
The result of the closure execution, which should be either a loss value stored in a torch.Tensor or a float, or None.
- Return type:
float or torch.Tensor or None
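The gradient accumulation pattern described for model_step can be sketched as follows. This is a minimal illustration, not ENOT code: the helper name accumulate_and_step and its arguments (model, loss_fn, ghost_batch) are hypothetical, and the only ENOT calls it relies on are model_step(closure) and step() as documented above.

```python
def accumulate_and_step(enot_optimizer, model, loss_fn, ghost_batch):
    """Accumulate gradients over ``ghost_batch``, then update parameters once.

    ``ghost_batch`` is an iterable of ``(inputs, targets)`` micro-batches.
    """
    results = []
    for inputs, targets in ghost_batch:

        def closure():
            # Forward pass followed by backward pass, as model_step requires.
            loss = loss_fn(model(inputs), targets)
            loss.backward()
            return loss

        # One model_step call per micro-batch accumulates its gradients.
        results.append(enot_optimizer.model_step(closure))

    # A single step() without arguments applies the accumulated gradients.
    enot_optimizer.step()
    return results
```

Whether the per-batch loss should be divided by the number of micro-batches, and where gradients are reset, depends on your training setup and is not covered by this sketch.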
- property search_space: SearchSpaceModel
Search space model passed to the constructor.
- Returns:
Search space passed to the ENOT optimizer constructor.
- Return type:
SearchSpaceModel
- state_dict()
Call state_dict of the wrapped optimizer and return the result.
- Returns:
User optimizer state dict.
- Return type:
dict with str keys
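Because state_dict and load_state_dict delegate to the wrapped optimizer, they can be used for simple in-memory checkpoints. The sketch below is one possible way to do this; the helper names snapshot_optimizer and restore_optimizer are hypothetical, and the deep copies are a defensive assumption in case the returned dict shares storage with the live optimizer.

```python
import copy


def snapshot_optimizer(enot_optimizer):
    """Return an independent copy of the wrapped optimizer's state dict."""
    return copy.deepcopy(enot_optimizer.state_dict())


def restore_optimizer(enot_optimizer, snapshot):
    """Load a previously taken snapshot back into the wrapped optimizer."""
    # Copy again so later training does not mutate the stored snapshot.
    enot_optimizer.load_state_dict(copy.deepcopy(snapshot))
```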
- step(closure=None)
Performs a single optimization step (parameter update).
An optimization step includes gradient computation (forward and backward passes, only when closure is not None) and a parameter update. The parameter update is performed by the base optimizer provided by the user.
Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.
Calling this function with a non-None closure argument is equivalent to calling model_step with this closure, followed by a step call without arguments.
A more detailed description of gradient computation and closure structure can be found in the model_step function documentation.
- Parameters:
closure (Callable or None, optional) – A closure (a nested function that has access to free variables from an enclosing function) that performs the complete gradient accumulation procedure. Must be None if you accumulated gradients using model_step. Default value is None.
- Returns:
The result of the closure execution, which should be either a loss value stored in a torch.Tensor or a float, or None.
- Return type:
float or torch.Tensor or None
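When no gradient accumulation is needed, the closure can be passed directly to step. The following is a minimal sketch under that assumption; train_on_batch, model, loss_fn, inputs, and targets are hypothetical names, and only step(closure) comes from the documentation above.

```python
def train_on_batch(enot_optimizer, model, loss_fn, inputs, targets):
    """Run forward, backward, and the parameter update in one step() call."""

    def closure():
        # Complete gradient computation cycle: forward, then backward.
        loss = loss_fn(model(inputs), targets)
        loss.backward()
        return loss

    # Per the description above, this is equivalent to model_step(closure)
    # followed by step() without arguments.
    return enot_optimizer.step(closure)
```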
Optimal model selection
The pretrained search space is used to select the best combination of operations.
- class SearchOptimizer(search_space, optimizer, bn_tune_dataloader=None, bn_tune_batches=10, bn_validation_tune_batches=50, sample_to_model_inputs=<function default_sample_to_model_inputs>, **options)
Optimizer for search phase with batch norm tuning.
- __init__(search_space, optimizer, bn_tune_dataloader=None, bn_tune_batches=10, bn_validation_tune_batches=50, sample_to_model_inputs=<function default_sample_to_model_inputs>, **options)
Warning
If you use batch norm tuning, make sure the batch norm tuning dataloader is the same one you used during regular training. This ensures that the model does not see an input data distribution shift, which may cause performance degradation or a misaligned search process.
- Parameters:
search_space (SearchSpaceModel) – Search space model to optimize.
optimizer (torch.optim.Optimizer) – PyTorch optimizer which will be wrapped by our optimizer.
bn_tune_dataloader (torch.utils.data.DataLoader or None, optional) – Dataloader used to tune batch norms during search and validation. It is important to use the same dataloader as during regular training, so that the model does not see an input data distribution shift, which may cause performance degradation. Default value is None, which disables batch norm tuning during the search procedure.
bn_tune_batches (int, optional) – Number of steps (batches) to tune batch norm layers at each search step. Default value is 10.
bn_validation_tune_batches (int, optional) – Number of steps (batches) to tune batch norm layers before each validation run. Default value is 50.
sample_to_model_inputs (Callable, optional) – Function to map dataloader samples to model input format. Default value is default_sample_to_model_inputs().
options – Other experimental options (should be ignored by user).
- add_param_group(param_group)
Call add_param_group of the wrapped optimizer.
- Parameters:
param_group (dict with str keys) – Parameter group description to add to user optimizer.
- Return type:
None
- property bn_tune_batches: int
The number of batch norm tuning batches for each search step.
- Returns:
The number of batch norm tuning batches for each search step.
- Return type:
int
- property bn_validation_tune_batches: int
The number of batch norm tuning batches before validation.
- Returns:
The number of batch norm tuning batches before validation.
- Return type:
int
- load_state_dict(state_dict)
Call load_state_dict of the wrapped optimizer.
- Parameters:
state_dict (dict with str keys) – State dict to be loaded to user optimizer instance.
- Return type:
None
- property model: Module
Model passed to the constructor.
- Returns:
PyTorch model passed to the optimizer constructor.
- Return type:
Module
- model_step(closure)
Performs a gradient accumulation step.
Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.
To accumulate gradients, this method must perform a complete gradient computation cycle, which consists of a forward step followed by a backward step. To achieve this, it requires a user-defined closure that encapsulates both steps.
It is usually enough to calculate model predictions, compute the loss function from those predictions, and then run backpropagation to compute the gradients of the model parameters by calling loss.backward().
In more sophisticated situations, contact the ENOT team to make sure that the current API supports your use case.
This function must be used in conjunction with the step function.
Usually, you only need to call this function when you need gradient accumulation over multiple data batches. In this case, call model_step for each data batch within your larger “ghost batch”. After accumulating gradients, call the step function without arguments.
- Parameters:
closure (Callable) – A closure (a nested function that has access to free variables from an enclosing function) that performs the complete gradient accumulation procedure.
- Returns:
The result of the closure execution, which should be either a loss value stored in a torch.Tensor or a float, or None.
- Return type:
float or torch.Tensor or None
- prepare_validation_model(architecture_indices=None)
Prepares the search space for validation.
Specifically, this function does the following:
Samples the current best architecture or a user-defined architecture.
Optionally optimizes it.
Warning
It is your responsibility to call this function before running the search space evaluation procedure.
Warning
Do not change the sampled architecture in the search space until your validation process is finished.
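The two warnings above suggest a fixed order of operations around validation, which might look like the sketch below. The helper name validate_best_architecture and the evaluate callable are hypothetical; the sketch also assumes that evaluation runs on the object returned by the search_space property, which this page does not state explicitly.

```python
def validate_best_architecture(search_optimizer, evaluate):
    """Prepare the search space for validation, then evaluate it.

    ``evaluate`` is a user-supplied callable that takes a model and
    returns a metric; it is not part of ENOT.
    """
    # Must be called before evaluation. It is called here without
    # architecture_indices, which presumably selects the current best
    # architecture.
    search_optimizer.prepare_validation_model()
    # Evaluate right away, without changing the sampled architecture.
    return evaluate(search_optimizer.search_space)
```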
- property search_space: SearchSpaceModel
Search space model passed to the constructor.
- Returns:
Search space passed to the ENOT optimizer constructor.
- Return type:
SearchSpaceModel
- state_dict()
Call state_dict of the wrapped optimizer and return the result.
- Returns:
User optimizer state dict.
- Return type:
dict with str keys
- step(closure=None)
Performs a single optimization step (parameter update).
An optimization step includes gradient computation (forward and backward passes, only when closure is not None) and a parameter update. The parameter update is performed by the base optimizer provided by the user.
Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.
Calling this function with a non-None closure argument is equivalent to calling model_step with this closure, followed by a step call without arguments.
A more detailed description of gradient computation and closure structure can be found in the model_step function documentation.
- Parameters:
closure (Callable or None, optional) – A closure (a nested function that has access to free variables from an enclosing function) that performs the complete gradient accumulation procedure. Must be None if you accumulated gradients using model_step. Default value is None.
- Returns:
The result of the closure execution, which should be either a loss value stored in a torch.Tensor or a float, or None.
- Return type:
float or torch.Tensor or None
- class FixedLatencySearchOptimizer(search_space, optimizer, max_latency_value, bn_tune_dataloader=None, bn_tune_batches=10, bn_validation_tune_batches=50, sample_to_model_inputs=<function default_sample_to_model_inputs>, **options)
Optimizer for the search phase with an upper bound on latency.
- __init__(search_space, optimizer, max_latency_value, bn_tune_dataloader=None, bn_tune_batches=10, bn_validation_tune_batches=50, sample_to_model_inputs=<function default_sample_to_model_inputs>, **options)
Warning
If you use batch norm tuning, make sure the batch norm tuning dataloader is the same one you used during regular training. This ensures that the model does not see an input data distribution shift, which may cause performance degradation or a misaligned search process.
- Parameters:
search_space (SearchSpaceModel) – Search space model to optimize.
optimizer (torch.optim.Optimizer) – PyTorch optimizer to be replaced with our optimizer.
max_latency_value (float) – Maximum latency for the search phase. The search process is constrained by this value, so all architectures sampled for validation will have latency less than or equal to it.
bn_tune_dataloader (torch.utils.data.DataLoader or None, optional) – Dataloader used to tune batch norms during search and validation. It is important to use the same dataloader as during regular training, so that the model does not see an input data distribution shift, which may cause performance degradation. Default value is None, which disables batch norm tuning during the search procedure.
bn_tune_batches (int, optional) – Number of steps (batches) to tune batch norm layers at each search step. Default value is 10.
bn_validation_tune_batches (int, optional) – Number of steps (batches) to tune batch norm layers before each validation run. Default value is 50.
sample_to_model_inputs (Callable, optional) – Function to map dataloader samples to model input format. Default value is default_sample_to_model_inputs().
options – Other experimental options (should be ignored by user).
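To make the constructor parameters more concrete, the sketch below builds a latency-bounded search optimizer with batch norm tuning. The helper name make_latency_constrained_optimizer, the argument name train_loader, and the import path from enot.optimize are assumptions; the keyword names and default values are taken from the signature above.

```python
def make_latency_constrained_optimizer(search_space, base_optimizer,
                                       max_latency, train_loader=None):
    """Hypothetical helper: build a latency-bounded search optimizer."""
    # Assumed import path, based on the enot.optimize package name.
    from enot.optimize import FixedLatencySearchOptimizer

    return FixedLatencySearchOptimizer(
        search_space,
        base_optimizer,
        max_latency_value=max_latency,
        # Reuse the regular training dataloader, as the warning advises;
        # None disables batch norm tuning.
        bn_tune_dataloader=train_loader,
        bn_tune_batches=10,             # documented default
        bn_validation_tune_batches=50,  # documented default
    )
```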
- add_param_group(param_group)
Call add_param_group of the wrapped optimizer.
- Parameters:
param_group (dict with str keys) – Parameter group description to add to user optimizer.
- Return type:
None
- property bn_tune_batches: int
The number of batch norm tuning batches for each search step.
- Returns:
The number of batch norm tuning batches for each search step.
- Return type:
int
- property bn_validation_tune_batches: int
The number of batch norm tuning batches before validation.
- Returns:
The number of batch norm tuning batches before validation.
- Return type:
int
- load_state_dict(state_dict)
Call load_state_dict of the wrapped optimizer.
- Parameters:
state_dict (dict with str keys) – State dict to be loaded to user optimizer instance.
- Return type:
None
- property max_latency_value: float
Maximal latency value (upper bound) for search process.
- Returns:
Maximal latency value (upper bound) for search process.
- Return type:
float
- property model: Module
Model passed to the constructor.
- Returns:
PyTorch model passed to the optimizer constructor.
- Return type:
Module
- model_step(closure)
Performs a gradient accumulation step.
Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.
To accumulate gradients, this method must perform a complete gradient computation cycle, which consists of a forward step followed by a backward step. To achieve this, it requires a user-defined closure that encapsulates both steps.
It is usually enough to calculate model predictions, compute the loss function from those predictions, and then run backpropagation to compute the gradients of the model parameters by calling loss.backward().
In more sophisticated situations, contact the ENOT team to make sure that the current API supports your use case.
This function must be used in conjunction with the step function.
Usually, you only need to call this function when you need gradient accumulation over multiple data batches. In this case, call model_step for each data batch within your larger “ghost batch”. After accumulating gradients, call the step function without arguments.
- Parameters:
closure (Callable) – A closure (a nested function that has access to free variables from an enclosing function) that performs the complete gradient accumulation procedure.
- Returns:
The result of the closure execution, which should be either a loss value stored in a torch.Tensor or a float, or None.
- Return type:
float or torch.Tensor or None
- prepare_validation_model(architecture_indices=None)
Prepares the search space for validation.
Specifically, this function does the following:
Samples the current best architecture or a user-defined architecture.
Optionally optimizes it.
Warning
It is your responsibility to call this function before running the search space evaluation procedure.
Warning
Do not change the sampled architecture in the search space until your validation process is finished.
- property search_space: SearchSpaceModel
Search space model passed to the constructor.
- Returns:
Search space passed to the ENOT optimizer constructor.
- Return type:
SearchSpaceModel
- state_dict()
Call state_dict of the wrapped optimizer and return the result.
- Returns:
User optimizer state dict.
- Return type:
dict with str keys
- step(closure=None)
Performs a single optimization step (parameter update).
An optimization step includes gradient computation (forward and backward passes, only when closure is not None) and a parameter update. The parameter update is performed by the base optimizer provided by the user.
Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.
Calling this function with a non-None closure argument is equivalent to calling model_step with this closure, followed by a step call without arguments.
A more detailed description of gradient computation and closure structure can be found in the model_step function documentation.
- Parameters:
closure (Callable or None, optional) – A closure (a nested function that has access to free variables from an enclosing function) that performs the complete gradient accumulation procedure. Must be None if you accumulated gradients using model_step. Default value is None.
- Returns:
The result of the closure execution, which should be either a loss value stored in a torch.Tensor or a float, or None.
- Return type:
float or torch.Tensor or None
Optimal model tuning
The final model is tuned for a number of epochs to match its standalone accuracy. Sometimes, however, it is necessary to retrain the obtained architecture from scratch.
- class TrainOptimizer(model, optimizer, **options)
ENOT optimizer for train and tune phases.
- __init__(model, optimizer, **options)
- Parameters:
model (torch.nn.Module) – PyTorch model to optimize.
optimizer (torch.optim.Optimizer) – PyTorch optimizer which will be wrapped by ENOT optimizer.
options – Experimental options (should be ignored by user).
- add_param_group(param_group)
Call add_param_group of the wrapped optimizer.
- Parameters:
param_group (dict with str keys) – Parameter group description to add to user optimizer.
- Return type:
None
- load_state_dict(state_dict)
Call load_state_dict of the wrapped optimizer.
- Parameters:
state_dict (dict with str keys) – State dict to be loaded to user optimizer instance.
- Return type:
None
- property model: Module
Model passed to the constructor.
- Returns:
PyTorch model passed to the optimizer constructor.
- Return type:
Module
- model_step(closure)
Performs a gradient accumulation step.
Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.
To accumulate gradients, this method must perform a complete gradient computation cycle, which consists of a forward step followed by a backward step. To achieve this, it requires a user-defined closure that encapsulates both steps.
It is usually enough to calculate model predictions, compute the loss function from those predictions, and then run backpropagation to compute the gradients of the model parameters by calling loss.backward().
In more sophisticated situations, contact the ENOT team to make sure that the current API supports your use case.
This function must be used in conjunction with the step function.
Usually, you only need to call this function when you need gradient accumulation over multiple data batches. In this case, call model_step for each data batch within your larger “ghost batch”. After accumulating gradients, call the step function without arguments.
- Parameters:
closure (Callable) – A closure (a nested function that has access to free variables from an enclosing function) that performs the complete gradient accumulation procedure.
- Returns:
The result of the closure execution, which should be either a loss value stored in a torch.Tensor or a float, or None.
- Return type:
float or torch.Tensor or None
- state_dict()
Call state_dict of the wrapped optimizer and return the result.
- Returns:
User optimizer state dict.
- Return type:
dict with str keys
- step(closure=None)
Performs a single optimization step (parameter update).
An optimization step includes gradient computation (forward and backward passes, only when closure is not None) and a parameter update. The parameter update is performed by the base optimizer provided by the user.
Besides gradient computation, this method runs internal ENOT algorithms and utility configurations.
Calling this function with a non-None closure argument is equivalent to calling model_step with this closure, followed by a step call without arguments.
A more detailed description of gradient computation and closure structure can be found in the model_step function documentation.
- Parameters:
closure (Callable or None, optional) – A closure (a nested function that has access to free variables from an enclosing function) that performs the complete gradient accumulation procedure. Must be None if you accumulated gradients using model_step. Default value is None.
- Returns:
The result of the closure execution, which should be either a loss value stored in a torch.Tensor or a float, or None.
- Return type:
float or torch.Tensor or None
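A tuning run over several epochs could be organized roughly as in the sketch below. The function name tune and its arguments are hypothetical; train_optimizer is assumed to be an already constructed TrainOptimizer, and the sketch handles the three documented return types of step (tensor, float, or None) by converting non-None results with float().

```python
def tune(train_optimizer, model, loss_fn, dataloader, epochs):
    """Tune ``model`` for ``epochs`` passes over ``dataloader``.

    Returns the mean loss of each epoch (None for an epoch without losses).
    """
    epoch_losses = []
    for _ in range(epochs):
        total, count = 0.0, 0
        for inputs, targets in dataloader:

            def closure():
                # Forward pass followed by backward pass.
                loss = loss_fn(model(inputs), targets)
                loss.backward()
                return loss

            result = train_optimizer.step(closure)
            # step() may return a tensor, a float, or None.
            if result is not None:
                total += float(result)
                count += 1
        epoch_losses.append(total / count if count else None)
    return epoch_losses
```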