Building a Convolutional Neural Network using PyTorch

Last Updated : 11 Feb, 2025

Convolutional Neural Networks (CNNs) are deep learning models used for image processing tasks. They automatically learn spatial hierarchies of features from images through convolutional, pooling and fully connected layers. In this article we'll learn how to build a CNN model using PyTorch. This includes defining the network architecture, preparing the data, training the model and evaluating its performance.

Implementation of Building a Convolutional Neural Network in PyTorch

Step 1: Import necessary libraries

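This code block imports the modules used throughout the article: the core torch package, torch.nn for layers, torch.optim for optimizers, torch.nn.functional for activation functions, and torchvision, a companion package that provides datasets and image transforms.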

Python

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import torch.nn.functional as F

Step 2: Prepare the dataset

This code sets up the CIFAR-10 dataset for training and testing a neural network using PyTorch.
It first defines a sequence of image transformations: ToTensor converts each image to a PyTorch tensor with values in [0, 1], and Normalize then shifts and scales each of the three color channels using a mean of 0.5 and a standard deviation of 0.5, which maps the values to the range [-1, 1]. It then creates dataset objects for the training and test sets of CIFAR-10, specifying the root directory, whether the split is for training (train=True) or testing (train=False), and the transformation sequence.
Next, it creates a data loader for each set. The loaders deliver the data in mini-batches of 4 images and use two worker processes (num_workers=2) for faster loading. Only the training loader shuffles the data; the test loader keeps a fixed order.
Finally, it defines the class labels for CIFAR-10, which name the 10 object classes in the dataset.

Python

# Convert images to tensors in [0, 1], then normalize each channel to [-1, 1]
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

# Training split and its loader (shuffled each epoch)
trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

# Test split and its loader (fixed order)
testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

# Human-readable names for the 10 CIFAR-10 classes
classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
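To make the effect of normalization and batching concrete without downloading CIFAR-10, here is a minimal sketch on synthetic data. The tensors fake_images and fake_labels are made-up stand-ins, and the normalization is written out by hand using the formula documented for transforms.Normalize, output = (input - mean) / std, rather than calling the transform itself.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Assumption: 8 random RGB "images" of size 32x32 with values in [0, 1),
# standing in for what ToTensor() would produce from real CIFAR-10 images.
torch.manual_seed(0)
fake_images = torch.rand(8, 3, 32, 32)
fake_labels = torch.randint(0, 10, (8,))

# Per-channel normalization written by hand: (input - mean) / std.
mean = torch.tensor([0.5, 0.5, 0.5]).view(1, 3, 1, 1)
std = torch.tensor([0.5, 0.5, 0.5]).view(1, 3, 1, 1)
normalized = (fake_images - mean) / std

# With mean 0.5 and std 0.5, values in [0, 1] land in [-1, 1].
print(normalized.min().item() >= -1.0, normalized.max().item() <= 1.0)

# A DataLoader groups samples into mini-batches (batch_size=4 as in the article).
loader = DataLoader(TensorDataset(normalized, fake_labels),
                    batch_size=4, shuffle=True)
batch_images, batch_labels = next(iter(loader))
print(batch_images.shape)  # torch.Size([4, 3, 32, 32])
print(batch_labels.shape)  # torch.Size([4])
```

Each batch therefore has the layout (batch, channels, height, width), which is the input layout nn.Conv2d expects.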

Step 3: Define the CNN architecture

This code defines a neural network architecture using the nn.Module class from PyTorch. The Net class inherits from nn.Module and defines the layers of the network in its __init__ method.
It has two convolutional layers (conv1 and conv2), each followed by a ReLU activation and a 2x2 max pooling step; a single pooling module (pool) is reused for both. Three fully connected layers (fc1, fc2 and fc3) then process the resulting features, with fc3 producing one score for each of the 10 classes.
The forward method defines the forward pass of the network, where the input x is passed through each layer in turn. The view call flattens the pooled output of the second convolutional layer, 16 feature maps of size 5x5, into a vector of 16 * 5 * 5 = 400 values per image so that it can be fed to the fully connected layers. Finally, an instance of the Net class is created as net.

Python

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)    # 3 input channels, 6 output channels, 5x5 kernel
        self.pool = nn.MaxPool2d(2, 2)     # 2x2 max pooling with stride 2
        self.conv2 = nn.Conv2d(6, 16, 5)   # 6 input channels, 16 output channels, 5x5 kernel
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)       # one output score per class

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)         # flatten to (batch, 400)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()
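The input size of fc1, 16 * 5 * 5, follows from how each layer changes the spatial size of a 32x32 CIFAR-10 image. The sketch below traces those shapes with a dummy all-zero input; it recreates the convolution and pooling layers as standalone modules purely for illustration, so the variable names are not part of the Net class.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Same layer settings as the article's Net, created as standalone modules.
conv1 = nn.Conv2d(3, 6, 5)
pool = nn.MaxPool2d(2, 2)
conv2 = nn.Conv2d(6, 16, 5)

# One dummy RGB image of CIFAR-10 size: (batch, channels, height, width).
x = torch.zeros(1, 3, 32, 32)

after_conv1 = F.relu(conv1(x))            # 5x5 kernel, no padding: 32 - 5 + 1 = 28
after_pool1 = pool(after_conv1)           # 2x2 pooling halves the size: 28 / 2 = 14
after_conv2 = F.relu(conv2(after_pool1))  # 14 - 5 + 1 = 10
after_pool2 = pool(after_conv2)           # 10 / 2 = 5
flat = after_pool2.view(-1, 16 * 5 * 5)   # 16 maps of 5x5 -> 400 values per image

print(after_conv1.shape)  # torch.Size([1, 6, 28, 28])
print(after_pool1.shape)  # torch.Size([1, 6, 14, 14])
print(after_conv2.shape)  # torch.Size([1, 16, 10, 10])
print(after_pool2.shape)  # torch.Size([1, 16, 5, 5])
print(flat.shape)         # torch.Size([1, 400])
```

If the input resolution or kernel sizes change, the 16 * 5 * 5 constant has to be recomputed in the same way.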

Step 4: Define loss function and optimizer

In this code, nn.CrossEntropyLoss() is used as the loss function (criterion) for training the neural network. CrossEntropyLoss is commonly used for classification tasks. It takes the raw, unnormalized class scores (logits) produced by the network together with the actual class labels, and applies log-softmax internally, which is why the last layer (fc3) has no activation function.
The optimizer (optim.SGD) updates the weights of the neural network during training. Stochastic Gradient Descent (SGD) is the chosen optimization algorithm, with a learning rate of 0.001 and a momentum of 0.9.

Python

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)
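As a small illustration of what the loss function computes, the sketch below checks that nn.CrossEntropyLoss applied to raw scores gives the same value as log-softmax followed by the negative log-likelihood loss. The logits and targets are made-up values for a 3-class toy problem, not outputs of the article's network.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical raw scores for 2 samples and 3 classes, plus their true classes.
logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])
targets = torch.tensor([0, 2])

# CrossEntropyLoss takes the raw scores directly.
loss_ce = nn.CrossEntropyLoss()(logits, targets)

# Equivalent two-step computation: log-softmax, then negative log-likelihood.
loss_manual = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(torch.allclose(loss_ce, loss_manual))  # True
```

Because the softmax is already part of the loss, adding a softmax layer at the end of the network before this loss would apply it twice.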

Step 5: Train the network

This code trains a neural network (net) using the CIFAR-10 dataset with a specified loss function (criterion) and optimizer (optimizer) for 2 epochs, printing the average loss every 2000 mini-batches.

Python

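for epoch in range(2):  # loop over the training set twice

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        # each item from the loader is a mini-batch of (inputs, labels)
        inputs, labels = data

        # clear gradients left over from the previous step
        optimizer.zero_grad()

        # forward pass, loss, backward pass, weight update
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # print the average loss every 2000 mini-batches
        running_loss += loss.item()
        if i % 2000 == 1999:
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')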
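The four calls at the heart of the loop (zero_grad, forward pass, backward, step) can be tried in isolation. The following sketch runs a single update on one random mini-batch and checks that the weights actually change. The model here is a hypothetical one-layer stand-in, not the article's Net, and the inputs and labels are random, so the loss value itself carries no meaning.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Assumption: a tiny stand-in classifier so the example runs quickly.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# One random mini-batch shaped like a CIFAR-10 batch of 4 images.
inputs = torch.randn(4, 3, 32, 32)
labels = torch.randint(0, 10, (4,))

weights_before = model[1].weight.detach().clone()

optimizer.zero_grad()              # reset accumulated gradients
outputs = model(inputs)            # forward pass -> (4, 10) class scores
loss = criterion(outputs, labels)  # scalar loss for the batch
loss.backward()                    # compute gradients
optimizer.step()                   # update the weights

weights_changed = not torch.equal(weights_before, model[1].weight.detach())
print(f"loss: {loss.item():.3f}, weights changed: {weights_changed}")
```

The training loop in the article repeats exactly this sequence for every mini-batch in trainloader.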

Step 6: Testing the network

This code calculates the accuracy of the neural network (net) on the test dataset (testloader). It iterates over the test set in mini-batches, computes the network outputs for each batch, takes the class with the highest score as the prediction, and counts how many predictions match the actual labels. The loop runs inside torch.no_grad(), which disables gradient tracking because no weights are updated during evaluation.

Python

correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))
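To see how torch.max turns scores into predictions, here is a minimal sketch on hand-written values. The outputs tensor is a made-up batch of scores for 4 samples and 3 classes, chosen so that exactly one prediction is wrong.

```python
import torch

# Hypothetical class scores for 4 samples (rows) and 3 classes (columns).
outputs = torch.tensor([[0.1, 2.0, -1.0],
                        [1.5, 0.2, 0.3],
                        [0.0, 0.1, 0.9],
                        [2.2, 0.4, 0.1]])
labels = torch.tensor([1, 0, 1, 0])

# torch.max along dim 1 returns (max values, indices of the max values);
# the indices are the predicted classes.
max_scores, predicted = torch.max(outputs, 1)

correct = (predicted == labels).sum().item()
total = labels.size(0)
accuracy = 100 * correct / total

print(predicted)  # tensor([1, 0, 2, 0])
print(accuracy)   # 75.0
```

The third sample is predicted as class 2 while its label is 1, so 3 of 4 predictions are correct.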

Complete Code to Build CNN using PyTorch

Python

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import torch.nn.functional as F

transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=4,
                                         shuffle=False, num_workers=2)

classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

net = Net()

criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=0.001, momentum=0.9)

for epoch in range(2):

    running_loss = 0.0
    for i, data in enumerate(trainloader, 0):
        inputs, labels = data

        optimizer.zero_grad()

        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()
        if i % 2000 == 1999:
            print('[%d, %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 2000))
            running_loss = 0.0

print('Finished Training')

correct = 0
total = 0
with torch.no_grad():
    for data in testloader:
        images, labels = data
        outputs = net(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the 10000 test images: %d %%' % (
    100 * correct / total))

Output:

[1,  2000] loss: 2.279
[1,  4000] loss: 1.992
[1,  6000] loss: 1.718
[1,  8000] loss: 1.589
[1, 10000] loss: 1.513
[1, 12000] loss: 1.492
[2,  2000] loss: 1.410
[2,  4000] loss: 1.375
[2,  6000] loss: 1.366
[2,  8000] loss: 1.343
[2, 10000] loss: 1.325
[2, 12000] loss: 1.263
Finished Training
Accuracy of the network on the 10000 test images: 55 %

An accuracy of 55% is well above the 10% expected from random guessing among 10 classes, but it shows that the model still underperforms, which is expected from such a simple network architecture trained for only 2 epochs. To improve it, we can experiment with the learning rate and momentum, or try a different optimizer such as Adam. These changes can help the model achieve higher accuracy.
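As a sketch of the optimizer change suggested above, switching from SGD to Adam only requires replacing the line that creates the optimizer. The model below is a hypothetical stand-in for net, and the learning rate of 0.001 is an assumed starting point for experimentation, not a tuned value; whether accuracy actually improves has to be checked by retraining.

```python
import torch.nn as nn
import torch.optim as optim

# Assumption: a small stand-in model; in the article this would be `net`.
model = nn.Linear(10, 2)

# Adam replaces optim.SGD(...); the rest of the training loop stays the same.
optimizer = optim.Adam(model.parameters(), lr=0.001)

print(type(optimizer).__name__)         # Adam
print(optimizer.param_groups[0]['lr'])  # 0.001
```

Since the training loop only calls optimizer.zero_grad() and optimizer.step(), no other code needs to change.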


agarwalyoge6kqa
