Dr. Kay
PyTorch, Computational Graph
06 Dec, 2025
To build a neural network, you need some knowledge of backpropagation. Backpropagation is the most common algorithm for training a network, and it is quite clear how it works.
Here, the model weights are adjusted based on the gradient of the loss function, computed with respect to each parameter. This means that for a function y = w.x + b, where the weights (parameters) are w and b, and a loss function loss, we must compute dloss/dw and dloss/db (these are partial derivatives). Let's see how this works using a one-layer network.
import torch

x = torch.ones(5)   # input
y = torch.zeros(3)  # expected output
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)
z = torch.matmul(x, w) + b
loss = torch.nn.functional.binary_cross_entropy_with_logits(z, y)
The one-layer network we created above represents a computational graph. A computational graph defines all the computations that are performed in the network.
In this example, the graph contains the following: z = w * x + b and loss = loss_function(y, z). The parameters of the network, w and b, need to be optimized based on the gradients of the loss function with respect to those parameters. To compute these gradients, we set the requires_grad parameter to True.
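As a quick sanity check (a minimal scalar sketch with illustrative values, not part of the original example), we can verify that the gradients autograd computes for y = w*x + b match the partial derivatives worked out by hand:

```python
import torch

# Scalar version of y = w*x + b with a squared-error loss
# (the values below are chosen for illustration only)
x = torch.tensor(2.0)
target = torch.tensor(5.0)
w = torch.tensor(3.0, requires_grad=True)
b = torch.tensor(1.0, requires_grad=True)

y = w * x + b             # y = 3*2 + 1 = 7
loss = (y - target) ** 2  # loss = (7 - 5)^2 = 4
loss.backward()           # computes dloss/dw and dloss/db

# Analytically: dloss/dw = 2*(y - target)*x = 8 and dloss/db = 2*(y - target) = 4
print(w.grad)  # tensor(8.)
print(b.grad)  # tensor(4.)
```

The printed gradients agree with the hand-derived partial derivatives, which is a useful habit when checking a new loss function.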
In the forward direction, the function itself is computed. In the backward direction (during the backpropagation step), the derivative is computed.
References to these functions are stored in the grad_fn property of the tensor.
print(f"Gradient function of z = {z.grad_fn}")
print(f"Gradient function of loss = {loss.grad_fn}")
To optimize the parameters of the network, we have to compute the derivative of the loss function with respect to each parameter. These derivatives are computed automatically with a call to loss.backward().
The values of the gradients are then available in the .grad property of the parameters.
loss.backward()
print(w.grad)
print(b.grad)
By default, tensors that have the requires_grad property set to True track their computation history and therefore support repeated gradient computation. This is very useful for training the network.
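One consequence of this tracking is worth knowing (a supplementary sketch, not from the original walkthrough): gradients accumulate in .grad across backward() calls, which is why training loops reset them on every step.

```python
import torch

# Sketch: .grad accumulates across backward() calls
w = torch.tensor(2.0, requires_grad=True)

loss = (3.0 * w) ** 2  # loss = 9*w^2, so dloss/dw = 18*w = 36 at w = 2
loss.backward()
print(w.grad)          # tensor(36.)

loss = (3.0 * w) ** 2
loss.backward()        # the new gradient is ADDED to the old one
print(w.grad)          # tensor(72.)

w.grad.zero_()         # reset before the next step, as optimizer.zero_grad() does
print(w.grad)          # tensor(0.)
```

This accumulation is deliberate (it supports things like gradient accumulation over mini-batches), but forgetting the reset is a common source of training bugs.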
However, there are situations where we do not need to track this history. For instance, after training the model we may just want to run some input data through it, performing only forward computation. In situations like this, we can stop tracking the computation history in either of two ways:
Surround our computation code with a torch.no_grad() block:

# Surround our computation code with a torch.no_grad() block
z = torch.matmul(x, w) + b
print(z.requires_grad)

with torch.no_grad():
    z = torch.matmul(x, w) + b
print(z.requires_grad)

Use the detach method on the tensor:
# Use the detach method on the tensor
z = torch.matmul(x, w) + b
z_det = z.detach()
print(z_det.requires_grad)
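To compare the two approaches side by side (a minimal sketch reusing the shapes from the example above): both yield result tensors with requires_grad set to False, and neither changes the parameters themselves.

```python
import torch

x = torch.ones(5)
w = torch.randn(5, 3, requires_grad=True)
b = torch.randn(3, requires_grad=True)

with torch.no_grad():
    z_ng = torch.matmul(x, w) + b          # tracking disabled inside the block

z_det = (torch.matmul(x, w) + b).detach()  # cut an existing tensor out of the graph

print(z_ng.requires_grad)   # False
print(z_det.requires_grad)  # False
print(w.requires_grad)      # True: w itself remains trainable
```

A rough rule of thumb: use torch.no_grad() when running a whole block of inference code, and detach() when you need a graph-free copy of one particular tensor.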