Building models in PyTorch

Building Neural Networks in PyTorch

We will discuss how to build a machine learning model with PyTorch, focusing on two main classes:

  • torch.nn.Module
  • torch.nn.Parameter

These two classes provide the machinery for defining the different types of layers and for tracking the parameters of those layers. torch.nn.Parameter is a subclass of torch.Tensor that registers a tensor as a trainable parameter on an object of the Module class (a short sketch of this registration follows the list below). The two classes work together to make building a model possible. As the example below shows, models are built in two main phases:

  • defining the type and the sequence of the layers (the __init__ method)
  • defining how the output of each layer is fed forward through the network (the forward method)
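
As a minimal sketch of how torch.nn.Parameter registers a tensor on a Module (the ScaleModule name and the 3-element scale tensor here are purely illustrative):

import torch

class ScaleModule(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # Wrapping a tensor in nn.Parameter registers it on the module,
        # so it appears in .parameters() and receives gradients during training.
        self.scale = torch.nn.Parameter(torch.ones(3))

    def forward(self, x):
        return x * self.scale

print(list(ScaleModule().parameters()))  # shows the 3-element 'scale' parameter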

In the following example we will build a model by defining a class called Example_Model with two fully connected layers: the first followed by a ReLU activation function and the second by a Softmax function, all set up in the __init__ method. The forward method will also be defined to establish how the output of each layer flows through the network.

import torch

class Example_Model(torch.nn.Module):

    def __init__(self):
        super(Example_Model, self).__init__()

        self.linear1 = torch.nn.Linear(5, 10) # (5 in features, 10 out features)
        self.activation = torch.nn.ReLU()
        self.linear2 = torch.nn.Linear(10, 5) # (10 in features, 5 out features)
        self.softmax = torch.nn.Softmax(dim=1) # softmax along the feature dimension

    def forward(self, x):
        x = self.linear1(x)
        x = self.activation(x)
        x = self.linear2(x)
        x = self.softmax(x)
        return x

dummy_model = Example_Model()
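
With the model instantiated, we can already run a quick forward pass on a random input to check that the shapes line up (a minimal sketch; the batch size of 1 is arbitrary):

dummy_input = torch.rand(1, 5)     # a batch of 1 sample with 5 features
output = dummy_model(dummy_input)  # calling the model invokes forward()
print(output.shape)                # torch.Size([1, 5])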

We can print the structure of the model just by passing the instance dummy_model that we created to print():

print(dummy_model)
Example_Model(
  (linear1): Linear(in_features=5, out_features=10, bias=True)
  (activation): ReLU()
  (linear2): Linear(in_features=10, out_features=5, bias=True)
  (softmax): Softmax(dim=1)
)

We can print the details of a specific layer by accessing that layer as an attribute of the model:

print(dummy_model.linear2)
Linear(in_features=10, out_features=5, bias=True)
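
Each layer's weights and biases are also accessible directly as attributes, so we can inspect their shapes without printing the full tensors (a small sketch):

print(dummy_model.linear2.weight.shape)  # torch.Size([5, 10])
print(dummy_model.linear2.bias.shape)    # torch.Size([5])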

We can print the parameters of the whole model, and we will have, respectively:

  • a 10x5 weight matrix for the first layer (PyTorch stores weights as out_features x in_features)
  • 10 bias parameters for the first layer
  • a 5x10 weight matrix for the second layer
  • 5 bias parameters for the second layer

for param in dummy_model.parameters():
    print(param)
Parameter containing:
tensor([[ 0.2425, -0.0518,  0.2428,  0.3637,  0.1235],
        [ 0.3232,  0.0940, -0.2151,  0.1606, -0.3963],
        [ 0.1729, -0.1239, -0.1306,  0.0218, -0.3682],
        [-0.1481, -0.2419,  0.2261,  0.2547, -0.4250],
        [-0.2309,  0.2802,  0.2447, -0.0481,  0.3517],
        [-0.0960,  0.4341,  0.3808, -0.1070,  0.1520],
        [ 0.2868, -0.4020,  0.1665, -0.2042,  0.2753],
        [ 0.1949, -0.3306, -0.1487,  0.3389, -0.1585],
        [-0.1870, -0.1361,  0.1372, -0.0092,  0.1107],
        [-0.0904, -0.0931,  0.1237,  0.2062, -0.2834]], requires_grad=True)
Parameter containing:
tensor([ 0.4161,  0.3360,  0.3570, -0.1723,  0.0327, -0.0217, -0.0882, -0.0824,
         0.0516, -0.1008], requires_grad=True)
Parameter containing:
tensor([[ 0.1387, -0.1030,  0.1031, -0.1182, -0.1140,  0.2976, -0.0084,  0.0843,
          0.2238,  0.0408],
        [-0.1646,  0.0479,  0.1498, -0.0500,  0.1570,  0.0857,  0.2209, -0.0100,
          0.1949,  0.1191],
        [-0.1416, -0.2367, -0.2072, -0.0462,  0.0670,  0.0265, -0.0257, -0.2176,
          0.2269, -0.1983],
        [ 0.0040,  0.1385,  0.0914, -0.2275, -0.1346, -0.0841, -0.2890, -0.2638,
         -0.0749,  0.1757],
        [-0.0679, -0.0231, -0.1748, -0.0177,  0.0334,  0.3104, -0.3142, -0.2812,
         -0.1555, -0.1290]], requires_grad=True)
Parameter containing:
tensor([ 0.1831, -0.2066, -0.3030, -0.1717,  0.2589], requires_grad=True)
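
To see which tensor belongs to which layer, we can also iterate over named_parameters(), which yields each parameter together with its attribute name (a short sketch on the same model):

for name, param in dummy_model.named_parameters():
    print(name, tuple(param.shape))
# linear1.weight (10, 5)
# linear1.bias (10,)
# linear2.weight (5, 10)
# linear2.bias (5,)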

We can print the parameters from a specific layer of our choice:

for param in dummy_model.linear2.parameters():
    print(param)
Parameter containing:
tensor([[ 0.1387, -0.1030,  0.1031, -0.1182, -0.1140,  0.2976, -0.0084,  0.0843,
          0.2238,  0.0408],
        [-0.1646,  0.0479,  0.1498, -0.0500,  0.1570,  0.0857,  0.2209, -0.0100,
          0.1949,  0.1191],
        [-0.1416, -0.2367, -0.2072, -0.0462,  0.0670,  0.0265, -0.0257, -0.2176,
          0.2269, -0.1983],
        [ 0.0040,  0.1385,  0.0914, -0.2275, -0.1346, -0.0841, -0.2890, -0.2638,
         -0.0749,  0.1757],
        [-0.0679, -0.0231, -0.1748, -0.0177,  0.0334,  0.3104, -0.3142, -0.2812,
         -0.1555, -0.1290]], requires_grad=True)
Parameter containing:
tensor([ 0.1831, -0.2066, -0.3030, -0.1717,  0.2589], requires_grad=True)
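
Finally, because every registered parameter shows up in parameters(), we can count the model's trainable parameters with a one-line sum (a small sketch; the total follows from the layer sizes above):

n_params = sum(p.numel() for p in dummy_model.parameters() if p.requires_grad)
print(n_params)  # 5*10 + 10 + 10*5 + 5 = 115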