Tensor Collections¶

A Collection groups tensors together. It is identified by a string, the name of the collection, and can be used to group tensors of a particular kind such as “losses”, “weights”, “biases”, or “gradients”. Each Collection has its own list of tensors, specified by include regex patterns, along with other parameters that determine how and when these tensors are saved. Using collections lets you save different types of tensors at different frequencies and in different forms. The same collections are available during analysis, so you can query a group of tensors at once.
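To make the grouping concrete, the following is a minimal sketch of how a named collection could select tensors by include regex patterns. `ToyCollection` and the tensor names are hypothetical and exist only for illustration; this is not SageMaker Debugger's implementation, and it assumes patterns are applied with `re.search`.

```python
import re


class ToyCollection:
    """Hypothetical stand-in for a collection: a name plus include regex patterns."""

    def __init__(self, name):
        self.name = name
        self.include_regex = []

    def include(self, patterns):
        # Accept a single regex string or a list of regex strings.
        if isinstance(patterns, str):
            patterns = [patterns]
        self.include_regex.extend(patterns)

    def matches(self, tensor_name):
        # Assumption: a tensor belongs to the collection if any pattern is found in its name.
        return any(re.search(p, tensor_name) for p in self.include_regex)


# Hypothetical tensor names, for illustration only.
tensor_names = ["conv1/weight", "conv1/bias", "fc/weight", "loss"]

weights = ToyCollection("weights")
weights.include(r"weight$")

biases = ToyCollection("biases")
biases.include([r"bias$"])

selected_weights = [t for t in tensor_names if weights.matches(t)]
selected_biases = [t for t in tensor_names if biases.matches(t)]
print(selected_weights)
print(selected_biases)
```

Each toy collection picks out its own subset of the same tensor list, which is what allows different groups to be saved with different settings.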

There are a number of built-in collections that SageMaker Debugger manages by default. This means that the library takes care of identifying which tensors should be saved as part of each of those collections. You can also define custom collections, and there are a couple of ways to do so.

You can specify which of these collections to save in the hook’s include_collections parameter, or through the collection_configs parameter to the DebuggerHookConfig in the SageMaker Python SDK.

Built-in Collections¶

Below is a comprehensive list of the built-in collections that are managed by SageMaker Debugger. For each framework, the Hook identifies the tensors that belong to a collection and saves them if that collection was requested.

The names of these collections are all lower case strings.

Name	Supported by frameworks/hooks	Description
`all`	all	Matches all tensors
`default`	all	The default collection, which matches the regex patterns passed as `include_regex` to the Hook
`weights`	TensorFlow, PyTorch, MXNet	Matches all weights of the model
`biases`	TensorFlow, PyTorch, MXNet	Matches all biases of the model
`gradients`	TensorFlow, PyTorch, MXNet	Matches all gradients of the model. In TensorFlow, when not using Zero Script Change environments, you must use `hook.wrap_optimizer()`.
`losses`	TensorFlow, PyTorch, MXNet	Saves the loss for the model
`metrics`	TensorFlow’s KerasHook, XGBoost	For KerasHook, saves the metrics computed by Keras for the model. For XGBoost, saves the evaluation metrics computed by the algorithm.
`outputs`	TensorFlow’s KerasHook	Matches the outputs of the model
`layers`	TensorFlow’s KerasHook	Input and output of intermediate convolutional layers
`sm_metrics`	TensorFlow	You can add scalars that you want to show up in SageMaker Metrics to this collection. SageMaker Debugger will save these scalars both to the out_dir of the hook and to SageMaker Metrics. Note that the scalars passed here will be saved on AWS servers outside of your AWS account.
`optimizer_variables`	TensorFlow’s KerasHook	Matches all optimizer variables. Currently supported only in Keras.
`hyperparameters`	XGBoost	Booster parameters
`predictions`	XGBoost	Predictions on validation set (if provided)
`labels`	XGBoost	Labels on validation set (if provided)
`feature_importance`	XGBoost	Feature importance given by get_score()
`full_shap`	XGBoost	A matrix of shape (nsample, nfeatures + 1), with each record indicating the feature contributions (SHAP values) for that prediction. Computed on training data with predict()
`average_shap`	XGBoost	The sum of SHAP value magnitudes over all samples. Represents the impact each feature has on the model output.
`trees`	XGBoost	Boosted tree model given by trees_to_dataframe()

Default collections saved¶

The following collections are saved by default, even when the hook configuration does not request them.

Framework	Default collections saved
`TensorFlow`	METRICS, LOSSES, SM_METRICS
`PyTorch`	LOSSES
`MXNet`	LOSSES
`XGBoost`	METRICS

If you want to disable the saving of these collections, you can do so by setting end_step to 0 in the collection’s SaveConfig. When using the SageMaker Python SDK, this looks as follows:

from sagemaker.debugger import DebuggerHookConfig, CollectionConfig

hook_config = DebuggerHookConfig(
    s3_output_path='s3://smdebug-dev-demo-pdx/mnist',
    collection_configs=[
        CollectionConfig(name="metrics", parameters={"end_step": 0})
    ]
)

When configuring the Collection in your Python script, it would be as follows:

hook.get_collection("metrics").save_config.end_step = 0
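Why setting end_step to 0 turns saving off can be seen in a simplified model of the save decision. The sketch below assumes that a step is saved when it is listed in `save_steps`, or when it lies in the half-open range from `start_step` up to `end_step` and is a multiple of `save_interval`. `should_save` is a hypothetical helper rather than the library's code, and the default interval of 500 is likewise an assumption.

```python
def should_save(step, save_interval=500, start_step=0, end_step=None, save_steps=()):
    """Hypothetical model of a save decision; not SageMaker Debugger's actual code."""
    if step in save_steps:
        return True
    # Assumption: the range is half-open, so end_step itself is excluded.
    in_range = step >= start_step and (end_step is None or step < end_step)
    return in_range and step % save_interval == 0


# With no end_step, every multiple of the interval is saved.
saved_default = [s for s in range(0, 2001) if should_save(s)]

# With end_step=0 the range [0, 0) is empty, so nothing is saved.
saved_disabled = [s for s in range(0, 2001) if should_save(s, end_step=0)]

print(saved_default)
print(saved_disabled)
```

Under these assumptions, an end step of 0 leaves no step that can satisfy the range check, which is the effect the configuration above relies on.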

Creating or retrieving a Collection¶

Function	Behavior
`hook.get_collection(collection_name)`	Returns the collection with the given name. Creates the collection with default configuration if it doesn’t already exist. A newly created collection does not match any tensor by default. It is configured to save histograms and distributions along with the tensor if TensorBoard support is enabled, and it uses the reduction configuration and save configuration passed to the hook.
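The get-or-create behavior described above can be pictured with a small registry. `ToyHook` and its dictionary-based collections are hypothetical and stand in for the real hook only to show the pattern: the first call creates an empty collection, and later calls return the same object.

```python
class ToyHook:
    """Hypothetical hook holding collections by name; illustration only."""

    def __init__(self):
        self._collections = {}

    def get_collection(self, name):
        # Create an empty collection on first access, then reuse it.
        if name not in self._collections:
            self._collections[name] = {"name": name, "include_regex": []}
        return self._collections[name]


hook = ToyHook()
first = hook.get_collection("custom")
first["include_regex"].append("relu")
second = hook.get_collection("custom")

print(second is first)
print(second["include_regex"])
```

Because the same object comes back each time, changes made through one reference (such as adding a regex) are visible wherever the collection is retrieved again.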

Properties of a Collection¶

Property	Description
`tensor_names`	Get or set list of tensor names as strings
`include_regex`	Get or set list of regexes to include. Tensors whose names match these regex patterns will be included in the collection
`reduction_config`	Get or set the ReductionConfig object to be used for tensors that are part of this collection
`save_config`	Get or set the SaveConfig object to be used for tensors that are part of this collection
`save_histogram`	Get or set the boolean flag that determines whether histograms are written for tensors that are part of this collection, which enables histograms and distributions in TensorBoard. Only applicable if TensorBoard support is enabled.

Methods on a Collection¶

Method	Behavior
`coll.include(regex)`	Takes a regex string or a list of regex strings to match tensors to include in the collection.
`coll.add(tensor)`	(TensorFlow only) Takes an instance or list or set of tf.Tensor/tf.Variable/tf.MirroredVariable/tf.Operation to add to the collection.
`coll.add_keras_layer(layer, inputs=False, outputs=True)`	(tf.keras only) Takes an instance of a tf.keras layer and logs input/output tensors for that layer. By default, only outputs are saved.
`coll.add_module_tensors(module, inputs=False, outputs=True)`	(PyTorch only) Takes an instance of a PyTorch module and logs input/output tensors for that module. By default, only outputs are saved.
`coll.add_block_tensors(block, inputs=False, outputs=True)`	(MXNet only) Takes an instance of a Gluon block and logs input/output tensors for that block. By default, only outputs are saved.

Configuring Collection using SageMaker Python SDK¶

When using the SageMaker Python SDK, the parameters that configure a Collection are passed as shown below.

from sagemaker.debugger import CollectionConfig

coll_config = CollectionConfig(
    name="weights",
    parameters={"parameter": "value"},
)

The parameters can be any of those listed below. Their meaning will become clear as you review the sections of documentation that follow. Note that all parameter values must be strings, so any parameter that accepts a list (such as save_steps, reductions, or include_regex) must be given as a single string with the items separated by commas.

include_regex
save_histogram
reductions
save_raw_tensor
save_interval
save_steps
start_step
end_step
train.save_interval
train.save_steps
train.start_step
train.end_step
eval.save_interval
eval.save_steps
eval.start_step
eval.end_step
predict.save_interval
predict.save_steps
predict.start_step
predict.end_step
global.save_interval
global.save_steps
global.start_step
global.end_step
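Because every value must be a string, list-valued settings are joined with commas before they are passed in. The helper below is a minimal sketch of that conversion; `to_collection_parameters` is a hypothetical name, not part of the SageMaker Python SDK, and the example values are made up for illustration.

```python
def to_collection_parameters(settings):
    """Convert Python values to the all-string form; hypothetical helper."""
    parameters = {}
    for key, value in settings.items():
        if isinstance(value, (list, tuple)):
            # Lists become comma-separated strings, e.g. [1, 2] -> "1,2".
            parameters[key] = ",".join(str(item) for item in value)
        else:
            parameters[key] = str(value)
    return parameters


parameters = to_collection_parameters(
    {
        "include_regex": ["relu", "tanh"],
        "reductions": ["mean", "max"],
        "train.save_steps": [0, 10, 20],
        "eval.save_interval": 100,
    }
)
print(parameters)
```

The resulting dictionary is in the form expected by the `parameters` argument of `CollectionConfig`. Note that this simple join would not work for a regex pattern that itself contains a comma.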