Training Loop in TensorFlow

Last Updated : 28 Mar, 2024

Training neural networks is at the core of machine learning, and understanding how to write a training loop from scratch is fundamental for any deep learning practitioner. TensorFlow provides powerful tools for building and training neural networks efficiently. In this article, we walk through the process of constructing a training loop in TensorFlow, with a complete example of training a model.

Constructing Training Loop in TensorFlow

A training loop is a repetitive process where the model iteratively learns from the training data to minimize a predefined loss function. Constructing a training loop involves the following steps:

Step 1: Prepare the Dataset

We illustrate this step by training a neural network to classify images from the CIFAR-10 dataset. The CIFAR-10 dataset consists of 50,000 training images and 10,000 testing images, each of size 32x32 pixels with 3 color channels. The pixel values are normalized to the range [0, 1].

Python
import tensorflow as tf
from tensorflow.keras import datasets

# Load CIFAR-10 dataset
(train_images, train_labels), (test_images, test_labels) = datasets.cifar10.load_data()

# Normalize pixel values to range [0, 1]
train_images, test_images = train_images / 255.0, test_images / 255.0

# Print shape of loaded datasets
print("Shape of training images:", train_images.shape)
print("Shape of training labels:", train_labels.shape)
print("Shape of testing images:", test_images.shape)
print("Shape of testing labels:", test_labels.shape)

Output:

Shape of training images: (50000, 32, 32, 3)
Shape of training labels: (50000, 1)
Shape of testing images: (10000, 32, 32, 3)
Shape of testing labels: (10000, 1)

Step 2: Define the Model

We define a convolutional neural network (CNN) using TensorFlow's Keras API. The model consists of three convolutional layers, with max-pooling layers after the first two for downsampling, followed by two fully connected (dense) layers for classification.

Python
from tensorflow.keras import layers, models

# Define the model
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10)
])

# Print model summary
print("\nModel Summary:")
model.summary()

Output:

Model Summary:
Model: "sequential_1"
_________________________________________________________________
Layer (type)                     Output Shape              Param #
=================================================================
conv2d_3 (Conv2D)                (None, 30, 30, 32)        896

max_pooling2d_2 (MaxPooling2D)   (None, 15, 15, 32)        0

conv2d_4 (Conv2D)                (None, 13, 13, 64)        18496

max_pooling2d_3 (MaxPooling2D)   (None, 6, 6, 64)          0

conv2d_5 (Conv2D)                (None, 4, 4, 64)          36928

flatten_1 (Flatten)              (None, 1024)              0

dense_2 (Dense)                  (None, 64)                65600

dense_3 (Dense)                  (None, 10)                650

=================================================================
Total params: 122570 (478.79 KB)
Trainable params: 122570 (478.79 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________

The model's summary provides details about each layer, including the layer type, output shape, and number of parameters. It helps understand the flow of data through the network and the complexity of the model.
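
As a sanity check, the parameter counts can be computed by hand: a Conv2D layer has (kernel height * kernel width * input channels + 1 bias) * filters parameters, and a Dense layer has (inputs + 1 bias) * units. A minimal sketch verifying the counts above:

Python
# Verify the parameter counts reported by model.summary()
conv1 = (3 * 3 * 3 + 1) * 32    # 896
conv2 = (3 * 3 * 32 + 1) * 64   # 18496
conv3 = (3 * 3 * 64 + 1) * 64   # 36928
dense1 = (1024 + 1) * 64        # 65600 (4*4*64 = 1024 flattened features)
dense2 = (64 + 1) * 10          # 650
print(conv1 + conv2 + conv3 + dense1 + dense2)  # 122570, matching the summary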

Step 3: Define Loss Function and Optimizer

In this step, we define a loss function and an optimizer for training the network. We choose Sparse Categorical Crossentropy as the loss function and Adam as the optimizer, and define two metrics: train_loss to track the average training loss and train_accuracy to track the accuracy of the model's predictions during training.

Python
# Define loss function and optimizer
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam()

# Define metrics
train_loss = tf.keras.metrics.Mean(name='train_loss')
train_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='train_accuracy')


Step 4: Model Training

Finally, we implement the training loop. It consists of two parts: a training step function that processes a single batch, and an outer loop that iterates over epochs and batches. Let's explore the code in detail:

1. Training Step:

  1. We use the @tf.function decorator to convert the Python function into a TensorFlow graph, which improves performance.
  2. Inside the train_step function, a gradient tape (tf.GradientTape) is employed to record operations for automatic differentiation.
  3. Predictions are obtained by passing input images through the model in training mode (training=True).
  4. The loss is computed using the specified loss function (loss_fn) by comparing the predicted labels with the true labels.
  5. Gradients of the loss with respect to the model's trainable variables are computed using the gradient tape.
  6. The optimizer applies these gradients to update the model's trainable variables.
  7. Additionally, the train_loss and train_accuracy metrics are updated using the computed loss and predictions, respectively.

2. Training Loop:

  1. The training loop iterates over a fixed number of epochs, where each epoch involves iterating over the entire training dataset in batches.
  2. For each batch, the train_step function is called with input images and corresponding labels.
  3. Batches are sliced from the training dataset (train_images and train_labels) based on the specified batch_size.
  4. After each epoch, training metrics are printed for monitoring the training progress.
  5. Finally, the train_loss and train_accuracy metrics are reset for the next epoch using the reset_states() method.
Python
# Define training step
@tf.function
def train_step(images, labels):
    with tf.GradientTape() as tape:
        predictions = model(images, training=True)
        loss = loss_fn(labels, predictions)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    train_loss(loss)
    train_accuracy(labels, predictions)

# Training loop
epochs = 10
batch_size = 64
for epoch in range(epochs):
    for batch in range(len(train_images) // batch_size):
        start = batch * batch_size
        end = start + batch_size
        train_step(train_images[start:end], train_labels[start:end])
    # Print metrics
    print(f'Epoch {epoch + 1}, Loss: {train_loss.result()}, Accuracy: {train_accuracy.result() * 100}%')
    # Reset metrics for next epoch
    train_loss.reset_states()
    train_accuracy.reset_states()

Output:

Epoch 1, Loss: 1.6167317628860474, Accuracy: 41.04313278198242%
Epoch 2, Loss: 1.233251690864563, Accuracy: 56.099952697753906%
Epoch 3, Loss: 1.0807808637619019, Accuracy: 62.05986022949219%
Epoch 4, Loss: 0.9831880331039429, Accuracy: 65.49295806884766%
Epoch 5, Loss: 0.9078642129898071, Accuracy: 68.04977416992188%
Epoch 6, Loss: 0.8455548882484436, Accuracy: 70.3905258178711%
Epoch 7, Loss: 0.7960028648376465, Accuracy: 71.96102905273438%
Epoch 8, Loss: 0.7521368265151978, Accuracy: 73.61555480957031%
Epoch 9, Loss: 0.713749885559082, Accuracy: 74.93798065185547%
Epoch 10, Loss: 0.6778918504714966, Accuracy: 76.44245910644531%

Key Components in Model Training using TensorFlow

There are several key components in the training process:

1. Forward Pass

The forward pass refers to the process of passing input data through the neural network to obtain predictions. In the above example, inside the train_step function, the forward pass occurs when the input images are fed into the model using model(images, training=True), which computes the predictions for the given inputs.
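
To see this step in isolation, here is a minimal sketch of a standalone forward pass on a small batch, assuming the model and train_images defined in the steps above:

Python
# Forward pass on a small batch (assumes `model` and `train_images` from above)
sample_batch = train_images[:8]
logits = model(sample_batch, training=True)  # training=True enables training-mode layer behavior
print(logits.shape)  # (8, 10): one raw logit per class for each image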

2. Loss Computation

After obtaining predictions from the forward pass, the next step is to compute the loss, which quantifies how well the model's predictions match the true labels. The loss function is responsible for quantifying the difference between the predictions and the actual targets.

The loss function specified in the code (loss_fn) is used to compute the loss between the predicted labels and the true labels. In this case, SparseCategoricalCrossentropy loss computes the cross-entropy loss between the predicted probabilities and the true label indices.
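
As an illustration, here is a minimal sketch of computing the loss on a single batch, assuming model, loss_fn, train_images, and train_labels from the steps above:

Python
# Compute the loss for one batch (assumes objects defined in the steps above)
logits = model(train_images[:8], training=True)
loss = loss_fn(train_labels[:8], logits)  # integer labels vs. raw logits (from_logits=True)
print(float(loss))  # mean cross-entropy over the batch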

3. Backward Pass (Gradient Calculation)

The backward pass computes the gradients of the loss function with respect to the model parameters. These gradients indicate the direction and magnitude of the parameter updates required to minimize the loss.

Inside the train_step function, a gradient tape is used to record operations for automatic differentiation. During the forward pass, TensorFlow automatically tracks operations involving trainable variables within the gradient tape context. After the loss is computed, gradients of the loss with respect to the model's trainable variables are calculated using the tape.gradient() method. These gradients represent the sensitivity of the loss to changes in each parameter of the model.
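
The same mechanism can be observed on a toy function. A minimal, self-contained sketch of tf.GradientTape, independent of the CIFAR-10 model:

Python
import tensorflow as tf

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x ** 2              # operations on the variable are recorded on the tape
grad = tape.gradient(y, x)  # dy/dx = 2x
print(grad.numpy())         # 6.0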

4. Parameter Update

Once the gradients are computed, the optimizer updates the model's trainable parameters using an optimization algorithm (e.g., Adam, SGD). The optimizer.apply_gradients() method is used to apply the computed gradients to the model's trainable variables, thereby updating their values to minimize the loss.
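
Continuing the toy example, a minimal sketch of one manual parameter update with optimizer.apply_gradients (using SGD here for clarity; the training loop above uses Adam):

Python
import tensorflow as tf

x = tf.Variable(3.0)
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
with tf.GradientTape() as tape:
    y = x ** 2
grads = tape.gradient(y, [x])
optimizer.apply_gradients(zip(grads, [x]))  # x <- x - 0.1 * dy/dx = 3.0 - 0.6
print(x.numpy())  # 2.4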

These steps are repeated over multiple epochs to train the neural network effectively.
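
In practice, the manual slicing used above is often replaced by tf.data.Dataset, which handles shuffling and batching. A sketch of the same loop driven by tf.data, assuming train_images, train_labels, and train_step from the steps above (metric printing and resetting omitted for brevity):

Python
# Same training loop driven by tf.data (assumes objects defined above)
dataset = (tf.data.Dataset.from_tensor_slices((train_images, train_labels))
           .shuffle(10000)
           .batch(64))

for epoch in range(10):
    for images, labels in dataset:
        train_step(images, labels)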

Conclusion

In this article, we've walked through the process of constructing a training loop from scratch using TensorFlow. Understanding this process is crucial for building and training neural networks effectively. By mastering this fundamental concept, you'll have the foundation to tackle more complex deep learning tasks and experiments in the future.

