Applications of Gradient Descent in TensorFlow

Last Updated : 24 Apr, 2025

Gradient descent is an optimization procedure that machine learning practitioners frequently use to minimize a model's cost function. It works by incrementally adjusting the model's parameters in the direction in which the cost function decreases most steeply. TensorFlow, a free and open-source machine learning library, has built-in support for gradient descent optimization. In this article, we look at how gradient descent is used in TensorFlow and how to apply it with TensorFlow's built-in optimizers.

Gradient Descent:

Gradient descent is an iterative optimization process for finding the minimum of a function. It is frequently used to train machine learning models.

The approach works by incrementally changing a model's parameters in the direction in which the cost function decreases most steeply with respect to those parameters. The cost function is a mathematical function that measures the discrepancy between the model's predicted outputs and the actual outputs.
The generic update rule for gradient descent is:

θ = θ - α ∇J(θ)

where:

  1. θ is the parameter vector that has to be optimized.
  2. α is the learning rate, which determines the size of the step in each iteration.
  3. ∇J(θ) is the gradient of the cost function J with respect to θ. It points in the direction of steepest ascent, so subtracting it moves θ in the direction of steepest descent.
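
The update rule can be sketched in a few lines of plain Python. This is a minimal illustration, not TensorFlow code: the cost function J(θ) = (θ - 3)^2, the helper names `gradient_descent` and `grad_J`, and the chosen values of α and the step count are all assumptions made for the example.

```python
def gradient_descent(grad, theta, alpha, steps):
    """Repeatedly apply the update theta <- theta - alpha * grad(theta)."""
    for _ in range(steps):
        theta = theta - alpha * grad(theta)
    return theta


# Illustrative cost J(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3).
def grad_J(theta):
    return 2.0 * (theta - 3.0)


theta_min = gradient_descent(grad_J, theta=0.0, alpha=0.1, steps=100)
print(theta_min)  # very close to 3, the minimizer of J
```

Each step shrinks the distance to the minimizer by a constant factor (here 1 - 2α = 0.8), which is why a fixed number of iterations is enough for this simple cost.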

The algorithm repeats this update until it reaches a minimum of J (in practice, until the updates become negligibly small). The learning rate is an essential hyperparameter that affects both the stability and the speed of convergence. If it is too high, the method may overshoot the minimum and fail to converge. If it is too low, the method may take a very long time to converge.
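
The effect of the learning rate can be seen on a toy problem. The sketch below assumes the illustrative cost J(θ) = (θ - 3)^2 and three hand-picked learning rates; the function name `run` and all numeric values are chosen for the example only.

```python
def run(alpha, steps=20, theta=0.0):
    """Gradient descent on J(theta) = (theta - 3)^2; returns the final theta."""
    for _ in range(steps):
        theta -= alpha * 2.0 * (theta - 3.0)
    return theta


too_small = run(alpha=0.001)   # barely moves away from the start
reasonable = run(alpha=0.1)    # gets close to the minimum at 3
too_large = run(alpha=1.1)     # overshoots further on every step

for name, value in [("too small", too_small),
                    ("reasonable", reasonable),
                    ("too large", too_large)]:
    print(f"{name:>10}: theta = {value:.4f}, distance to minimum = {abs(value - 3.0):.4f}")
```

For this particular cost, any α above 1.0 makes the error grow in magnitude at every step, so the iterates move away from the minimum instead of toward it.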

Gradient descent comes in several variants, including batch gradient descent, stochastic gradient descent, and mini-batch gradient descent. They differ in how much of the training data is used to compute the gradient for each update: the whole dataset, a single example, or a small subset, respectively.
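
The three variants can be expressed as one training loop that differs only in its batch size. The following is a framework-free sketch under stated assumptions: the toy data (points on the line y = 2x + 1), the helper names `mse_gradient` and `train`, and the learning rate and epoch count are all hypothetical choices for illustration.

```python
import random


def mse_gradient(w, b, batch):
    """Gradient of the mean squared error of the line w*x + b over `batch`."""
    n = len(batch)
    dw = sum(-2.0 * x * (y - (w * x + b)) for x, y in batch) / n
    db = sum(-2.0 * (y - (w * x + b)) for x, y in batch) / n
    return dw, db


def train(data, batch_size, alpha=0.2, epochs=500, seed=0):
    """Shuffle each epoch, then update once per batch of `batch_size` examples."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(epochs):
        shuffled = data[:]
        rng.shuffle(shuffled)
        for start in range(0, len(shuffled), batch_size):
            dw, db = mse_gradient(w, b, shuffled[start:start + batch_size])
            w -= alpha * dw
            b -= alpha * db
    return w, b


# Noise-free toy data on the line y = 2x + 1, with x in [0, 1]
data = [(i / 10.0, 2.0 * (i / 10.0) + 1.0) for i in range(11)]

w_batch, b_batch = train(data, batch_size=len(data))  # batch gradient descent
w_mini, b_mini = train(data, batch_size=4)            # mini-batch gradient descent
w_sgd, b_sgd = train(data, batch_size=1)              # stochastic gradient descent
print(w_batch, b_batch, w_mini, b_mini, w_sgd, b_sgd)
```

Because the toy data contain no noise, all three variants settle on the same line here. With noisy data, smaller batches give noisier updates but more updates per pass over the data.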

Implementations

We want to use gradient descent to fit a simple linear regression model. The goal of the optimization is to find the slope and intercept that minimize the model's mean squared error on the training data. Here is how to do it with TensorFlow's built-in optimizers:

Step 1: Import the necessary libraries

Python3
# Import the necessary libraries
import tensorflow as tf
import matplotlib.pyplot as plt

Step 2: Generate a random dataset

Python3
# Generate some random data
tf.random.set_seed(23)
x = tf.random.uniform(
    shape=(100, 1),
    minval=0,
    maxval=100,
    dtype=tf.dtypes.float32,
)

y = 2 * x + tf.random.normal(
    shape=(100, 1),
    mean=50.0,
    stddev=20,
    dtype=tf.dtypes.float32,
)

plt.scatter(x, y)
plt.show()

Output:

Figure: Input Data

Step 3: Define the weight and bias for the model

Python3
# Define the weight and bias for model
W = tf.Variable(tf.random.normal([1]), name="weight")
b = tf.Variable(tf.random.normal([1]), name="bias")
print('Weight :', W)
print('Bias   :', b)

Output:

Weight : <tf.Variable 'weight:0' shape=(1,) dtype=float32, numpy=array([0.26008585], dtype=float32)>
Bias   : <tf.Variable 'bias:0' shape=(1,) dtype=float32, numpy=array([0.31952116], dtype=float32)>

Step 4: Define the linear regression model

Python3
# Define the linear regression model
def linear_regression(x):
    return W * x + b

Step 5: Define the mean squared error

Python3
# Define the cost function
def mean_squared_error(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_true - y_pred))

Step 6: Define Optimizer or gradient descent

Gradient descent reduces the discrepancy between a model's predicted and actual outputs by iteratively adjusting the model's weights and biases according to the gradient of the loss function with respect to those parameters.

At each step, it computes the gradients of the error with respect to the weight and bias and moves both in the direction of the negative gradient. The learning rate determines the size of the update. The objective is to reach the lowest point of the model's error surface.
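
For the linear model and mean squared error defined in the previous steps, these gradients can be written out explicitly. The notation below is an assumption made for illustration: n is the number of training examples, (x_i, y_i) is the i-th example, and α is the learning rate.

```latex
\[
L(W, b) = \frac{1}{n} \sum_{i=1}^{n} \bigl(y_i - (W x_i + b)\bigr)^2
\]
\[
\frac{\partial L}{\partial W} = -\frac{2}{n} \sum_{i=1}^{n} x_i \bigl(y_i - (W x_i + b)\bigr),
\qquad
\frac{\partial L}{\partial b} = -\frac{2}{n} \sum_{i=1}^{n} \bigl(y_i - (W x_i + b)\bigr)
\]
\[
W \leftarrow W - \alpha \, \frac{\partial L}{\partial W},
\qquad
b \leftarrow b - \alpha \, \frac{\partial L}{\partial b}
\]
```

The gradient with respect to W carries an extra factor of x_i in every term, so when the inputs are large the weight receives much larger updates than the bias.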

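Here we define the stochastic gradient descent (SGD) optimizer with a learning rate of 0.00001. Because every training step in this example is given the full dataset, each update is in effect a full-batch gradient descent step, despite the optimizer's name.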

Python3
# Define the optimizer
optimizer = tf.optimizers.SGD(learning_rate=0.00001)

Step 7: Define the Training Loop

tf.GradientTape() records the operations executed inside its context so that they can be differentiated automatically. tape.gradient() returns the gradients of the loss with respect to W and b, and the optimizer applies them to the weight W and bias b. The training step is written as a function that receives a batch of data, computes the gradients of the cost function with respect to the model parameters, and updates the parameters via the optimizer.

Python3
# Define the training loop
def train_step(x, y):
    with tf.GradientTape() as tape:
        y_pred = linear_regression(x)
        loss = mean_squared_error(y, y_pred)
    gradients = tape.gradient(loss, [W, b])
    optimizer.apply_gradients(zip(gradients, [W, b]))
    return loss
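
To get a feel for the quantities that tape.gradient() produces for this model, the partial derivatives of the mean squared error can be computed by hand and checked against a finite-difference approximation. The sketch below is plain Python rather than TensorFlow; the four data points and the helper names `mse`, `analytic_gradients` and `numeric_gradients` are hypothetical and exist only for this check.

```python
def mse(w, b, xs, ys):
    """Mean squared error of the line w*x + b on the points (xs, ys)."""
    return sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys)) / len(xs)


def analytic_gradients(w, b, xs, ys):
    """Closed-form partial derivatives of the MSE with respect to w and b."""
    n = len(xs)
    dw = sum(-2.0 * x * (y - (w * x + b)) for x, y in zip(xs, ys)) / n
    db = sum(-2.0 * (y - (w * x + b)) for x, y in zip(xs, ys)) / n
    return dw, db


def numeric_gradients(w, b, xs, ys, eps=1e-4):
    """Central finite-difference approximation of the same derivatives."""
    dw = (mse(w + eps, b, xs, ys) - mse(w - eps, b, xs, ys)) / (2 * eps)
    db = (mse(w, b + eps, xs, ys) - mse(w, b - eps, xs, ys)) / (2 * eps)
    return dw, db


# Hypothetical toy data lying exactly on y = 2x + 50
xs = [10.0, 40.0, 70.0, 100.0]
ys = [70.0, 130.0, 190.0, 250.0]

dw, db = analytic_gradients(0.0, 0.0, xs, ys)
dw_num, db_num = numeric_gradients(0.0, 0.0, xs, ys)
print(dw, db)  # -22100.0 -320.0
```

On these points the weight gradient is roughly 70 times larger than the bias gradient, because the inputs range up to 100. With a single small learning rate, the weight therefore moves much faster than the bias.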

Step 8: Train the model and plot weight, bias, and loss over the iterations

Iteration vs Weight: Each gradient descent iteration updates the model's weight. The weight goes up or down depending on the sign of the gradient, and the learning rate scales the size of the change.

Iteration vs Bias: Each iteration also updates the model's bias, which rises or falls in the same way.

Iteration vs Loss: The loss function measures the discrepancy between the predicted and the actual outputs. Gradient descent minimizes it, so the loss should fall as the iterations proceed.

Python3
# Train the model
fig1, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4), dpi=500)
fig2, ax = plt.subplots(1, figsize=(7, 5))
for i in range(50):
    loss = train_step(x, y)
    ax1.plot(i, W, 'b*')
    ax2.plot(i, b, 'g+')
    ax.plot(i, loss, 'ro')

ax1.set_title('Weight over iterations')
ax1.set_xlabel('iterations')
ax1.set_ylabel('Weight')

ax2.set_title('Bias over iterations')
ax2.set_xlabel('iterations')
ax2.set_ylabel('Bias')

ax.set_title('Losses over iterations')
ax.set_xlabel('iterations')
ax.set_ylabel('Losses')

plt.show()

Output:

Figure: Loss optimization

Step 9: Plot the regression line with input data

Python3
print('Weight :', W)
print('Bias :', b)

plt.scatter(x, y)
plt.plot(x, W * x + b, color='red')
plt.title('Regression Line')
plt.xlabel('Input')
plt.ylabel('Target')
plt.show()

Output:

Weight : <tf.Variable 'weight:0' shape=(1,) dtype=float32, numpy=array([2.6064723], dtype=float32)>
Bias : <tf.Variable 'bias:0' shape=(1,) dtype=float32, numpy=array([0.36663133], dtype=float32)>

Figure: Regression Line

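In this plot, the blue dots are the training data and the red line is the fitted linear regression model. The data were generated as y = 2x plus noise with mean 50, so the underlying line has slope 2 and intercept 50; it is not drawn in the plot. After 50 iterations at this small learning rate, the bias has barely moved from its initial value, and the fitted slope (about 2.6) is steeper than 2 to compensate. The red line therefore roughly follows the data over the sampled range of x, but its parameters are not yet close to those of the underlying line.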
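
One way to judge how close a gradient descent fit is to the best possible line is to compare it with the closed-form least-squares solution. The sketch below uses plain Python and its own random generator, so the data only mimic the distribution used in Step 2 (x uniform on [0, 100], y = 2x plus noise with mean 50 and standard deviation 20) and will not match the TensorFlow samples. The helper name `least_squares_line` is hypothetical.

```python
import random


def least_squares_line(xs, ys):
    """Slope and intercept that minimize the mean squared error exactly."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Assumed stand-in for the tutorial's data: same distribution, different samples
rng = random.Random(23)
xs = [rng.uniform(0.0, 100.0) for _ in range(100)]
ys = [2.0 * x + rng.gauss(50.0, 20.0) for x in xs]

slope, intercept = least_squares_line(xs, ys)
print('Least-squares slope     :', slope)
print('Least-squares intercept :', intercept)
```

The least-squares slope and intercept should land near 2 and 50. A gradient descent run whose bias is still far from the least-squares intercept has not converged yet, and would need more iterations, a different learning rate, or rescaled inputs.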

Full Code:

Python3
# Import the necessary libraries
import tensorflow as tf
import matplotlib.pyplot as plt

# Generate some random data
tf.random.set_seed(23)
x = tf.random.uniform(
    shape=(100, 1),
    minval=0,
    maxval=100,
    dtype=tf.dtypes.float32,
)

y = 2 * x + tf.random.normal(
    shape=(100, 1),
    mean=50.0,
    stddev=20,
    dtype=tf.dtypes.float32,
)

# Define the weight and bias for model
W = tf.Variable(tf.random.normal([1]), name="weight")
b = tf.Variable(tf.random.normal([1]), name="bias")


# Define the linear regression model
def linear_regression(x):
    return W * x + b


# Define the cost function
def mean_squared_error(y_true, y_pred):
    return tf.reduce_mean(tf.square(y_true - y_pred))


# Define the optimizer
optimizer = tf.optimizers.SGD(learning_rate=0.00001)


# Define the training loop
def train_step(x, y):
    with tf.GradientTape() as tape:
        y_pred = linear_regression(x)
        loss = mean_squared_error(y, y_pred)
    gradients = tape.gradient(loss, [W, b])
    optimizer.apply_gradients(zip(gradients, [W, b]))
    return loss


# Train the model
fig1, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4), dpi=500)
fig2, ax = plt.subplots(1, figsize=(7, 5))
for i in range(50):
    loss = train_step(x, y)
    ax1.plot(i, W, 'b*')
    ax2.plot(i, b, 'g+')
    ax.plot(i, loss, 'ro')

ax1.set_title('Weight over iterations')
ax1.set_xlabel('iterations')
ax1.set_ylabel('Weight')

ax2.set_title('Bias over iterations')
ax2.set_xlabel('iterations')
ax2.set_ylabel('Bias')

ax.set_title('Losses over iterations')
ax.set_xlabel('iterations')
ax.set_ylabel('Losses')

plt.show()
print('Weight :', W)
print('Bias :', b)

plt.scatter(x, y)
plt.plot(x, W * x + b, color='red')
plt.show()

Output:

Figure: Loss optimization

Weight : <tf.Variable 'weight:0' shape=(1,) dtype=float32, numpy=array([2.6111314], dtype=float32)>
Bias : <tf.Variable 'bias:0' shape=(1,) dtype=float32, numpy=array([-0.5400178], dtype=float32)>

Figure: Regression Line

Conclusion:

Gradient descent is an optimization method used in machine learning to reduce the discrepancy between a model's predicted and actual outputs. Plotting the weight, bias, and loss against the iteration number lets us see how gradient descent operates. By repeatedly adjusting the model's weights and biases according to the gradient of the loss function with respect to those parameters, we reduce the loss and improve the model's fit to the data.

TensorFlow supports a number of gradient descent optimization variants, including:

  1. Mini-batch gradient descent: At each iteration, the model parameters are updated using a random subset of the training data.
  2. Momentum: Uses a moving average of previous gradients to speed up convergence and help the optimizer move past shallow local minima.
  3. Adaptive learning rates: Adjusts the learning rate during training based on the gradients of the cost function.
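
The idea of accumulating previous gradients can be sketched in plain Python. This is an illustrative toy, not TensorFlow's implementation: the cost J(θ) = (θ - 3)^2, the helper name `momentum_descent`, and the values of α, β and the step count are assumptions for the example. In TensorFlow itself, the SGD optimizer accepts a `momentum` argument for this purpose.

```python
def momentum_descent(grad, theta, alpha, beta, steps):
    """Gradient descent with a velocity term; beta = 0 gives the plain update."""
    velocity = 0.0
    for _ in range(steps):
        # The velocity is a decaying sum of past (negative, scaled) gradients
        velocity = beta * velocity - alpha * grad(theta)
        theta = theta + velocity
    return theta


# Illustrative cost J(theta) = (theta - 3)^2, whose gradient is 2 * (theta - 3).
def grad_J(theta):
    return 2.0 * (theta - 3.0)


theta_plain = momentum_descent(grad_J, 0.0, alpha=0.01, beta=0.0, steps=100)
theta_momentum = momentum_descent(grad_J, 0.0, alpha=0.01, beta=0.9, steps=100)

print('Distance to minimum without momentum:', abs(theta_plain - 3.0))
print('Distance to minimum with momentum   :', abs(theta_momentum - 3.0))
```

With the same small learning rate and the same number of steps, the run with momentum ends much closer to the minimum, because consecutive gradients that point the same way reinforce each other.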

Moreover, TensorFlow supports more sophisticated optimization methods like Adam, Adagrad, and RMSprop. These methods integrate momentum, adaptive learning rates, and other elements to enhance the optimization process's convergence and stability.


priteshhirani