# Xavier Initialization

Xavier initialization (Glorot & Bengio, 2010) is a weight-initialization heuristic that scales each layer's initial weights so that the variance of the layer's outputs matches the variance of its inputs. This keeps the variance of activations (and of backpropagated gradients) roughly constant throughout the network, avoiding signals that vanish or explode with depth.
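The heuristic sets the weight variance to 2 / (fan_in + fan_out). For the uniform variant this means sampling from U(-a, a) with a = sqrt(6 / (fan_in + fan_out)), since a uniform distribution on (-a, a) has variance a² / 3. A minimal NumPy sketch (the function name `xavier_uniform` is illustrative, not a library API):

```python
import numpy as np

def xavier_uniform(fan_in, fan_out, seed=None):
    """Sample a (fan_in, fan_out) weight matrix from U(-a, a) with
    a = sqrt(6 / (fan_in + fan_out)), giving the weights a variance
    of 2 / (fan_in + fan_out) as in Glorot & Bengio (2010)."""
    rng = np.random.default_rng(seed)
    a = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-a, a, size=(fan_in, fan_out))

w = xavier_uniform(512, 256, seed=0)
# Empirical variance should be close to the target 2 / (512 + 256).
print(w.var(), 2.0 / (512 + 256))
```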

## PyTorch Usage

PyTorch offers both uniform and normal distributed initializations for the Xavier heuristic.

```python
import torch

conv_layer = torch.nn.Conv2d(16, 16, kernel_size=3)
torch.nn.init.xavier_uniform_(conv_layer.weight, gain=1.0)
torch.nn.init.constant_(conv_layer.bias, 0.0)
```


or

```python
import torch

conv_layer = torch.nn.Conv2d(16, 16, kernel_size=3)
torch.nn.init.xavier_normal_(conv_layer.weight, gain=1.0)
torch.nn.init.constant_(conv_layer.bias, 0.0)
```


The gain value depends on the nonlinearity used after the layer and can be obtained with the `torch.nn.init.calculate_gain()` function in PyTorch. The default `gain=1` is appropriate for linear (identity) and sigmoid activations; for ReLU networks the recommended gain is `calculate_gain('relu')`, which returns √2.