Enot utility package

The enot.utils package contains utility functionality.

batch_norm

This package contains batch normalization layer utilities, such as batch norm tuning, resetting all batch norm layers, and checking whether a module is an instance of PyTorch's BatchNorm class.

is_bn(module)

Checks whether a torch.nn.Module is a BatchNorm instance.

Parameters:

module (torch.nn.Module) – Module to check.

Returns:

Whether the input module is a BatchNorm instance.

Return type:

bool
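
Example (a minimal sketch; the import path enot.utils.batch_norm is assumed from this page's layout):

    import torch.nn as nn

    from enot.utils.batch_norm import is_bn  # assumed import path

    bn = nn.BatchNorm2d(16)
    conv = nn.Conv2d(3, 16, kernel_size=3)

    print(is_bn(bn))    # True
    print(is_bn(conv))  # False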

reset_bn(module)

Resets the module's running stats if the module is a batch norm layer.

Parameters:

module (torch.nn.Module) – Module whose running stats will be reset (if it is an instance of a PyTorch BatchNorm).

Return type:

None

model_reset_bn(model)

Resets running stats in all batch norm layers.

Parameters:

model (torch.nn.Module) – Model in which the running stats of all batch norm layers will be reset.

Return type:

None
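
Example (a minimal sketch; the import path and the model.apply usage pattern for reset_bn are assumptions):

    import torch.nn as nn

    from enot.utils.batch_norm import reset_bn, model_reset_bn  # assumed import path

    model = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3),
        nn.BatchNorm2d(16),
        nn.ReLU(),
    )

    # Reset a single module's statistics (no-op for non-BatchNorm modules) ...
    model.apply(reset_bn)

    # ... or reset every batch norm layer in the model at once.
    model_reset_bn(model)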

model_set_bn_momentum(model, momentum=None)

Sets momentum value in all batch norm layers.

Parameters:
  • model (torch.nn.Module) – Model to update batch norms.

  • momentum (float or None, optional) – Momentum value to set in all batch norm layers. Default value is None.

Return type:

None

tune_bn_stats(model, dataloader, reset_bns=False, set_momentums_none=False, n_steps=None, epochs=1, sample_to_model_inputs=<function default_sample_to_model_inputs>, verbose=0)

Tunes batch norm running statistics of the model.

Parameters:
  • model (Module) – Model to update batch norms.

  • dataloader (torch.utils.data.DataLoader) – Dataloader which generates data that will be used to update model’s running statistics.

  • reset_bns (bool, optional) – Whether to reset ALL of the running statistics before tuning. Default value is False.

  • set_momentums_none (bool, optional) – Whether to set all of the momentums in batch norms to None. Default value is False.

  • n_steps (int or None, optional) – Number of steps in one epoch of batch norm tuning. Defaults to None, which runs each epoch to completion.

  • epochs (int, optional) – Number of epochs to tune batch norms. Default value is 1.

  • sample_to_model_inputs (Callable, optional) – Function to map dataloader samples to the model input format. Default value is default_sample_to_model_inputs().

  • verbose (int) – Procedure verbosity level. 0 disables all messages, 1 enables tqdm progress bar logging.

Return type:

None

Notes

Typically, it is better to tune batch norm statistics on the same data you trained your model on. When tuning batch norms on a holdout set, you may experience performance degradation due to distribution shift.
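
Example (a minimal sketch with random data; the import path is assumed, and we assume the default sample_to_model_inputs handles (input, label) batches):

    import torch
    import torch.nn as nn
    from torch.utils.data import DataLoader, TensorDataset

    from enot.utils.batch_norm import tune_bn_stats  # assumed import path

    model = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3), nn.BatchNorm2d(8), nn.ReLU())

    # In practice, use the data the model was trained on (see the note above).
    dataset = TensorDataset(torch.randn(64, 3, 32, 32), torch.randint(0, 10, (64,)))
    dataloader = DataLoader(dataset, batch_size=16)

    tune_bn_stats(
        model,
        dataloader,
        reset_bns=True,           # start the running statistics from scratch
        set_momentums_none=True,  # momentum=None means cumulative moving averages in PyTorch
        epochs=1,
        verbose=1,
    )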

data

This package contains data-related functionality. You can find dataloader creation functions, dataset classes, and image transformation functions here.

csv_annotation_dataset

class CsvAnnotationDataset(csv_annotation_path, root_dir=None, transform=None)

Bases: Dataset

Creates a dataset from a CSV file.

Reads a CSV annotation with fields ‘filepath’: str and ‘label’: int.

__init__(csv_annotation_path, root_dir=None, transform=None)

Initializes a vision dataset from a CSV file.

Parameters:
  • csv_annotation_path (Union[str, Path]) – Path to a CSV file with [‘filepath’, ‘label’] columns.

  • root_dir (Union[str, Path]) – Optional absolute path to the folder with images, prepended to the ‘filepath’ field in the CSV. Useful for handling different dataset locations.

  • transform (torchvision.transforms.Compose) – Transformation to apply to each image.
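
Example (an illustrative sketch; the file name, root directory, and import path are hypothetical):

    from torchvision import transforms

    from enot.utils.data.csv_annotation_dataset import CsvAnnotationDataset  # assumed import path

    # annotation.csv (hypothetical contents):
    # filepath,label
    # images/cat_001.jpg,0
    # images/dog_042.jpg,1

    dataset = CsvAnnotationDataset(
        'annotation.csv',
        root_dir='/data/my_dataset',  # prepended to each 'filepath' value
        transform=transforms.Compose([transforms.ToTensor()]),
    )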

dataloaders

get_default_train_transform(input_size, mean, std)

Returns common training augmentations.

Augments images via RandomCrop and random flip, and converts them into normalized tensors.

Parameters:
  • input_size – Input image size.

  • mean – Per-channel mean used for normalization.

  • std – Per-channel standard deviation used for normalization.

Returns:

All composed transformations.

Return type:

transforms.Compose
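
Example (a minimal sketch; the ImageNet normalization values and the import path are assumptions):

    from enot.utils.data.dataloaders import get_default_train_transform  # assumed import path

    train_transform = get_default_train_transform(
        input_size=224,
        mean=(0.485, 0.456, 0.406),  # ImageNet channel means
        std=(0.229, 0.224, 0.225),   # ImageNet channel standard deviations
    )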

get_default_validation_transform(input_size, mean, std)

Returns common validation transformations.

No training augmentation is applied, only Resize and normalization.

Parameters:
  • input_size – Input image size.

  • mean – Per-channel mean used for normalization.

  • std – Per-channel standard deviation used for normalization.

Returns:

All composed transformations.

Return type:

transforms.Compose

create_data_loader(dataset, batch_size, *, check_sampler=True, sampler=None, collate_fn=<function fast_collate>, **kwargs)

Creates a dataloader for the given dataset.

Each sample produced by this dataloader is automatically placed on CUDA.

Parameters:
  • dataset (torch.utils.data.Dataset) – Data.

  • batch_size (int) – Number of samples processed at once.

  • check_sampler (bool) – Whether to check the sampler object. Important for multi-GPU procedures.

  • sampler (torch.utils.data.Sampler) – Strategy for sampling objects from the dataset.

  • collate_fn (TCollateFunction) – Custom collate function.

  • kwargs – Additional parameters propagated to dataloader.

Returns:

Custom dataloader.

Return type:

torch.utils.data.DataLoader
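
Example (a minimal sketch; the import path is assumed, and compatibility of FakeData samples with the default fast_collate is an assumption):

    from torchvision.datasets import FakeData

    from enot.utils.data.dataloaders import create_data_loader  # assumed import path

    dataset = FakeData(size=128, image_size=(3, 32, 32))

    # Samples from this dataloader are placed on CUDA automatically;
    # num_workers is propagated to the underlying DataLoader via **kwargs.
    dataloader = create_data_loader(dataset, batch_size=32, num_workers=4)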

create_data_loader_from_csv_annotation(csv_annotation_path, dataset_root_dir=None, dataset_transform=None, **kwargs)

Creates a dataloader from a CSV file.

Parameters:
  • csv_annotation_path (Union[str, Path]) – Path to a CSV file with [‘filepath’, ‘label’] columns.

  • dataset_root_dir (Union[str, Path]) – Optional absolute path to the folder with images, prepended to the ‘filepath’ field in the CSV. Useful for handling different dataset locations.

  • dataset_transform (torchvision.transforms.Compose) – Transformation to apply to each image.

  • kwargs – Additional parameters propagated to dataloader.

Returns:

Dataloader built from the CSV file.

Return type:

torch.utils.data.DataLoader
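
Example (a minimal sketch; the file name is hypothetical, the import path is assumed, and we assume batch_size is forwarded to the dataloader via **kwargs):

    from torchvision import transforms

    from enot.utils.data.dataloaders import create_data_loader_from_csv_annotation  # assumed import path

    dataloader = create_data_loader_from_csv_annotation(
        'annotation.csv',
        dataset_root_dir='/data/my_dataset',
        dataset_transform=transforms.Compose([transforms.ToTensor()]),
        batch_size=32,  # assumed to be forwarded to the dataloader
    )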

common

This package contains regular PyTorch utility functions that can be used outside the architecture optimization pipeline.

is_floating_tensor(tensor)

Checks whether a tensor has a floating point data type.

Parameters:

tensor (torch.Tensor) – Tensor to check.

Returns:

Whether the input tensor is a floating point tensor (one of float16, float32 or float64).

Return type:

bool
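
Example (a minimal sketch; the import path enot.utils.common is assumed from this page's layout):

    import torch

    from enot.utils.common import is_floating_tensor  # assumed import path

    print(is_floating_tensor(torch.zeros(2, dtype=torch.float32)))  # True
    print(is_floating_tensor(torch.zeros(2, dtype=torch.int64)))    # False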

profile_speed(fn, *fn_args, sort_type='cumtime', out_functions=80)

Profiles a function and prints profile statistics.

Parameters:
  • fn (callable) – Target function to profile.

  • fn_args (args) – Target function arguments.

  • sort_type (str, optional) – Type of statistics sorting. Full list of possible values is available in the python profile documentation. Default value is “cumtime”.

  • out_functions (int, optional) – Number of top-n functions that profiler will print. Default value is 80.

Returns:

Result of function call.

Return type:

Any
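
Example (a minimal sketch; the import path is assumed):

    import torch

    from enot.utils.common import profile_speed  # assumed import path

    def heavy_matmul(a, b):
        return a @ b

    a = torch.randn(512, 512)
    b = torch.randn(512, 512)

    # Prints the top 20 functions sorted by cumulative time and returns a @ b.
    result = profile_speed(heavy_matmul, a, b, sort_type='cumtime', out_functions=20)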

save(model, model_path)

Saves module state_dict.

Parameters:
  • model (torch.nn.Module) – Module with state_dict.

  • model_path (str or Path) – Path to model state_dict checkpoint.

Return type:

None

load(model, model_path, cpu=True, strict=True)

Loads the state_dict from model_path into model in place.

Parameters:
  • model (Module) – Module to update with loaded state_dict.

  • model_path (str or Path) – Path to state_dict checkpoint.

  • cpu (bool, optional) – Whether to map the checkpoint to CPU. Default: True.

  • strict (bool, optional) – Whether to raise a RuntimeError on state_dict name mismatch. Default: True.

Return type:

None
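
Example (a minimal sketch; the import path and checkpoint file name are assumptions):

    import torch.nn as nn

    from enot.utils.common import save, load  # assumed import path

    model = nn.Linear(10, 2)
    save(model, 'model.pth')

    # Later: restore the saved weights into a freshly created module, in place.
    restored = nn.Linear(10, 2)
    load(restored, 'model.pth', cpu=True, strict=True)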

init_convnet_params(model, fc_std=0.01)

Initializes convolutional neural network weights following the Kaiming He strategy.

Parameters:
  • model (torch.nn.Module) – Convolutional neural network to initialize.

  • fc_std (float, optional) – Standard deviation used to initialize Linear layer weights. Default value is 0.01.

Return type:

None
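
Example (a minimal sketch; the import path is assumed):

    import torch.nn as nn

    from enot.utils.common import init_convnet_params  # assumed import path

    model = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3),
        nn.ReLU(),
        nn.Flatten(),
        nn.Linear(16 * 30 * 30, 10),
    )

    # Kaiming initialization for convolutions; Linear weights drawn with std=0.01.
    init_convnet_params(model, fc_std=0.01)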

iterate_by_submodules(root_module, submodule_class=<class 'torch.nn.modules.module.Module'>, include_self=False)

Creates an iterator over submodules of a specific type.

Parameters:
  • root_module (torch.nn.Module) – Top-level module to select from.

  • submodule_class (type, optional) – Type of submodules to iterate over. Default value is torch.nn.Module.

  • include_self (bool, optional) – Whether to include root_module itself in the selected submodules (when its type matches submodule_class). Default value is False.

Returns:

Iterator over submodules of the specified type.

Return type:

iterator
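
Example (a minimal sketch; the import path is assumed):

    import torch.nn as nn

    from enot.utils.common import iterate_by_submodules  # assumed import path

    model = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3),
        nn.Sequential(nn.Conv2d(16, 32, kernel_size=3), nn.ReLU()),
    )

    # Iterate over every Conv2d submodule, however deeply it is nested.
    for conv in iterate_by_submodules(model, submodule_class=nn.Conv2d):
        print(conv)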

cast_to_numpy(storage, copy=True)

Converts a PyTorch tensor or numpy.ndarray to a numpy.ndarray.

Parameters:
  • storage (torch.Tensor, numpy.ndarray, float or int) – Data container that will be cast to numpy.ndarray.

  • copy (bool, optional) – Whether a copy is strictly required. If copy is set to True, this function guarantees that the resulting container is a copy of the original one. Otherwise, it is not guaranteed whether a copy will be made (for example, moving a torch tensor from CUDA to CPU forces a copy). Default value is True.

Returns:

Container converted to numpy and optionally copied.

Return type:

numpy.ndarray
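
Example (a minimal sketch; the import path is assumed):

    import numpy as np
    import torch

    from enot.utils.common import cast_to_numpy  # assumed import path

    t = torch.arange(4, dtype=torch.float32)

    arr = cast_to_numpy(t)               # guaranteed to be a copy
    view = cast_to_numpy(t, copy=False)  # may or may not share memory with t

    assert isinstance(arr, np.ndarray)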

replace_ops(model, accept_lambda, module_constructor, *args, **kwargs)

Replaces modules in the model according to accept_lambda.

Parameters:
  • model (torch.nn.Module) – Model in which modules are recursively replaced.

  • accept_lambda (Callable) – Function that returns True if a module should be replaced, and False otherwise.

  • module_constructor (Callable) – Function that constructs the new operation to replace the previous one.

Return type:

None
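
Example (a minimal sketch; the import path is assumed, and we assume module_constructor is called without arguments to build each replacement):

    import torch.nn as nn

    from enot.utils.common import replace_ops  # assumed import path

    model = nn.Sequential(
        nn.Conv2d(3, 16, kernel_size=3),
        nn.ReLU(),
        nn.Conv2d(16, 16, kernel_size=3),
        nn.ReLU(),
    )

    # Replace every ReLU with LeakyReLU(0.1).
    replace_ops(
        model,
        lambda module: isinstance(module, nn.ReLU),  # accept_lambda
        lambda: nn.LeakyReLU(0.1),                   # module_constructor (assumed no-arg call)
    )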

train

This package contains train-related utility functions.

split_into_groups(iterable, group_size, length=None, drop_last=False)

Creates an iterator over groups of samples from an iterable.

Parameters:
  • iterable (Iterable) – Source iterable object.

  • group_size (int) – Number of samples in each group.

  • length (Optional[int]) – Length of the iterable object; use when __len__ is not implemented or should not be used.

  • drop_last (bool) – Whether to drop the last group.

Returns:

Iterator over groups
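
Example (a minimal sketch; the import path is assumed, and we assume group_size is the number of samples per group):

    from enot.utils.train import split_into_groups  # assumed import path

    samples = list(range(10))

    for group in split_into_groups(samples, group_size=4, drop_last=False):
        print(group)  # groups of up to 4 samples; the exact group type may vary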