Hook Methods

Common Hook Methods

These methods are common to all hooks, regardless of framework.

Note

The methods on this page are available after you create a hook object.

  • TensorFlow

    import smdebug.tensorflow as smd
    hook = smd.KerasHook.create_from_json_file()
    hook = smd.SessionHook.create_from_json_file()
    hook = smd.EstimatorHook.create_from_json_file()
    

    For TensorFlow, pick the appropriate hook class among KerasHook, SessionHook, and EstimatorHook, depending on how your training script is composed. For more information, see TensorFlow.

  • PyTorch

    import smdebug.pytorch as smd
    hook = smd.Hook.create_from_json_file()
    
  • MXNet

    import smdebug.mxnet as smd
    hook = smd.Hook.create_from_json_file()
    
  • XGBoost

    import smdebug.xgboost as smd
    hook = smd.Hook.create_from_json_file()
    
add_collection(collection)

Takes a Collection object and adds it to the CollectionManager that the hook holds. Note that you should only pass in a Collection object for the same framework as the hook.

Parameters:

  • collection (smd.Collection)

get_collection(name)

Returns the collection identified by the given name.

Parameters:

  • name (str)

get_collections()

Returns all collection objects held by the hook.

set_mode(mode)

Sets the mode of the job: smd.modes.TRAIN, smd.modes.EVAL, smd.modes.PREDICT, or smd.modes.GLOBAL. For more information, see Modes for Tensors.

Parameters:

  • mode: a value of the enum smd.modes
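
Setting the mode matters because, in addition to a global step count, saved tensors are indexed by a mode-specific step count. A minimal sketch of that bookkeeping, using hypothetical Mode and StepCounter stand-ins (this is not the smdebug implementation):

```python
from enum import Enum

# Illustrative stand-in for smd.modes; the names mirror the real enum.
class Mode(Enum):
    TRAIN = "TRAIN"
    EVAL = "EVAL"
    PREDICT = "PREDICT"
    GLOBAL = "GLOBAL"

class StepCounter:
    """Tracks a global step plus an independent step count per mode."""

    def __init__(self):
        self.mode = Mode.GLOBAL
        self.mode_steps = {m: 0 for m in Mode}

    def set_mode(self, mode):
        # Analogous to hook.set_mode(): subsequent steps count against `mode`.
        self.mode = mode

    def step(self):
        # Each step advances the current mode's counter and the global counter.
        if self.mode is not Mode.GLOBAL:
            self.mode_steps[self.mode] += 1
        self.mode_steps[Mode.GLOBAL] += 1
```

With this sketch, two TRAIN steps followed by one EVAL step leave TRAIN at 2, EVAL at 1, and GLOBAL at 3.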

create_from_json_file(json_file_path)

Takes the path of a file holding the JSON configuration of the hook, and creates the hook from that configuration. The json_file_path parameter is optional. If it is not passed, the path is read from the environment variable SMDEBUG_CONFIG_FILE_PATH, and defaults to /opt/ml/input/config/debughookconfig.json. When training on SageMaker you do not have to specify any path, because this is the default path that SageMaker writes the hook configuration to.

Parameters:

  • json_file_path (str)
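
The path-resolution order described above can be sketched as follows; resolve_config_path is an illustrative helper, not part of the smdebug API:

```python
import os

# Default path that SageMaker writes the hook configuration to.
DEFAULT_CONFIG_PATH = "/opt/ml/input/config/debughookconfig.json"

def resolve_config_path(json_file_path=None):
    """Sketch of the lookup order: explicit argument, then the
    SMDEBUG_CONFIG_FILE_PATH environment variable, then the default."""
    if json_file_path is not None:
        return json_file_path
    return os.environ.get("SMDEBUG_CONFIG_FILE_PATH", DEFAULT_CONFIG_PATH)
```

On SageMaker all three levels resolve to a usable path, which is why no argument is needed there.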

close()

Closes all files that the hook currently holds open.

save_scalar(name, value, sm_metric=False)

Saves a scalar value under the given name. Passing the sm_metric=True flag also makes this scalar available as a SageMaker Metric, so it shows up in SageMaker Studio. Note that when sm_metric is False, this scalar always resides only in your AWS account, but setting it to True saves the scalar also on AWS servers. The default value of sm_metric for this method is False.

Parameters:

  • name (str), value (float), sm_metric (bool)

save_tensor(tensor_name, tensor_value, collections_to_write)

Manually saves a tensor to the given collections. The record_tensor_value() API is deprecated in favor of save_tensor().

Parameters:

  • tensor_name (str), tensor_value (numpy.array or numpy.ndarray), collections_to_write (str or list[str])

TensorFlow specific Hook API

Note that there are three types of hooks in TensorFlow, depending on the TensorFlow interface being used for training: SessionHook, EstimatorHook, and KerasHook. TensorFlow shows examples of each of these.

wrap_optimizer(optimizer)

Returns the same optimizer object passed, with a couple of identifying markers added to help smdebug; use this returned optimizer for training. When not using Zero Script Change environments, calling this method on your optimizer is necessary for SageMaker Debugger to identify and save gradient tensors. Note that this method does not change your optimization logic. If the hook is of type KerasHook, you can pass in either an object of type tf.train.Optimizer or tf.keras.Optimizer. If the hook is of type SessionHook or EstimatorHook, the optimizer can only be of type tf.train.Optimizer.

Parameters:

  • optimizer (tf.train.Optimizer or tf.keras.Optimizer)

add_to_collection(collection_name, variable)

Calls the add method of the named collection object, passing it the given variable. Returns None.

Parameters:

  • collection_name (str): name of the collection to add to
  • variable: parameter to pass to the collection’s add method
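The key property of wrap_optimizer is that it returns the very same optimizer object, only marked so the debugger can recognize it later. A hypothetical sketch of that contract, with DummyOptimizer standing in for a real TensorFlow optimizer and _smdebug_wrapped as an invented marker attribute (this is not the smdebug implementation):

```python
class DummyOptimizer:
    """Stand-in for a tf.train.Optimizer; the update logic is untouched."""

    def apply_gradients(self, grads_and_vars):
        # Stand-in for the real parameter-update step.
        return list(grads_and_vars)

def wrap_optimizer(optimizer):
    # Attach a marker so the debugger can recognize the optimizer later,
    # then return the same object: the optimization logic is unchanged.
    optimizer._smdebug_wrapped = True
    return optimizer
```

Because the returned object is identical to the one passed in, swapping in the wrapped optimizer cannot change training behavior; it only gives the debugger a handle on gradients.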

The following hook API is specific to training scripts using the TF 2.x GradientTape (see TensorFlow):

wrap_tape(tape)

Returns a tape object with three identifying markers to help smdebug; use this returned tape for training. When not using Zero Script Change environments, calling this method on your tape is necessary for SageMaker Debugger to identify and save gradient tensors. Note that this method returns the same tape object passed.

Parameters:

  • tape (tensorflow.python.eager.backprop.GradientTape)

MXNet specific Hook API

register_block(block)

Calling this method applies the hook to the Gluon block representing the model, so SageMaker Debugger gets called by MXNet and can save the required tensors.

Parameters:

  • block (mx.gluon.Block)

PyTorch specific Hook API

register_module(module)

Calling this method applies the hook to the Torch Module representing the model, so SageMaker Debugger gets called by PyTorch and can save the required tensors.

Parameters:

  • module (torch.nn.Module)

register_loss(loss_module)

Calling this method applies the hook to the Torch Module representing the loss, so SageMaker Debugger can save losses.

Parameters:

  • loss_module (torch.nn.modules.loss._Loss)