Building Artificial Neural Networks (ANN) from Scratch

Last Updated : 03 Jun, 2025

An Artificial Neural Network (ANN) is a collection of interconnected layers of neurons. It includes:

  • Input Layer: Receives input features.
  • Hidden Layers: Process information through weighted connections and activation functions.
  • Output Layer: Produces the final prediction.
  • Weights and Biases: Trainable parameters that adjust during learning.
  • Activation Functions: Introduce non-linearity, which allows the network to learn complex patterns.

Let's build an ANN from scratch using Python and NumPy, without relying on deep learning libraries such as TensorFlow or PyTorch. Writing each step by hand makes it easier to understand how a neural network works.

[Figure: Neural Network]

Step 1: Importing Necessary Libraries

We will use NumPy to handle numerical computations efficiently.

Python
import numpy as np 

Step 2: Initializing the Neural Network

  • initialize_parameters sets the initial weights and biases for a two-layer neural network (one hidden layer and one output layer).
  • It calls np.random.seed(42) so that results are reproducible.
  • Weights (W1, W2) are drawn from a standard normal distribution and scaled by 0.01 so that they start small. They are random rather than zero so that the hidden units do not all compute the same thing.
  • W1 has shape (hidden layer size, input layer size).
  • W2 has shape (output layer size, hidden layer size).
  • Biases (b1, b2) are initialized to zero column vectors matching their layer sizes.
Python
def initialize_parameters(input_size, hidden_size, output_size):
    np.random.seed(42)  # For reproducibility
    parameters = {
        # Small random weights; zero biases stored as column vectors
        "W1": np.random.randn(hidden_size, input_size) * 0.01,
        "b1": np.zeros((hidden_size, 1)),
        "W2": np.random.randn(output_size, hidden_size) * 0.01,
        "b2": np.zeros((output_size, 1))
    }
    return parameters
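As a quick sanity check on the shapes listed above, the sketch below calls the same function for a network with 2 inputs, 4 hidden units and 1 output. The sizes are only an illustrative choice, and the `shapes` dictionary is a helper name introduced here, not part of the original code.

```python
import numpy as np

def initialize_parameters(input_size, hidden_size, output_size):
    np.random.seed(42)  # For reproducibility
    return {
        "W1": np.random.randn(hidden_size, input_size) * 0.01,
        "b1": np.zeros((hidden_size, 1)),
        "W2": np.random.randn(output_size, hidden_size) * 0.01,
        "b2": np.zeros((output_size, 1)),
    }

# Illustrative sizes: 2 input features, 4 hidden units, 1 output unit
params = initialize_parameters(input_size=2, hidden_size=4, output_size=1)

# Collect the shape of every parameter so they can be compared at a glance
shapes = {name: value.shape for name, value in params.items()}
for name, shape in shapes.items():
    print(name, shape)
# W1 (4, 2)
# b1 (4, 1)
# W2 (1, 4)
# b2 (1, 1)
```

Storing the biases as column vectors of shape (n, 1) lets NumPy broadcast them across all examples when they are added to a (n, m) matrix.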

Step 3: Defining Activation Functions

Activation functions introduce non-linearity into the model, helping it learn complex patterns. Here we use:

  • ReLU for the hidden layer.
  • Sigmoid for the output layer, which maps any real number to a value between 0 and 1.
Python
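def sigmoid(Z):
    # Squashes each value into the range (0, 1)
    return 1 / (1 + np.exp(-Z))

def relu(Z):
    # Keeps positive values and replaces negative values with 0
    return np.maximum(0, Z)

def relu_derivative(Z):
    # 1 where Z > 0, otherwise 0 (the derivative at exactly 0 is taken as 0)
    return (Z > 0).astype(int)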
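To see what these functions do, here is a small sketch that applies them to a hand-picked input. The array `Z` and the result names are illustrative assumptions, not values from the article.

```python
import numpy as np

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))

def relu(Z):
    return np.maximum(0, Z)

def relu_derivative(Z):
    return (Z > 0).astype(int)

# Hand-picked input: one negative value, zero, and one positive value
Z = np.array([[-2.0, 0.0, 3.0]])

sig_out = sigmoid(Z)
relu_out = relu(Z)
relu_grad = relu_derivative(Z)

print(sig_out)    # all values lie strictly between 0 and 1; sigmoid(0) is 0.5
print(relu_out)   # [[0. 0. 3.]]
print(relu_grad)  # [[0 0 1]]
```

Note that `relu_derivative` returns 0 at exactly Z = 0. ReLU is not differentiable there, so this is a convention rather than a true derivative.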

Step 4: Forward Propagation

Forward propagation computes the output of the neural network for a given input X and set of parameters. Each column of X is one example.

  • First, it calculates the linear combination Z1 for the hidden layer by multiplying the input X with the weights W1 and adding bias b1.
  • It then applies the ReLU activation function to Z1 producing the hidden layer activations A1.
  • Next, it calculates the linear combination Z2 for the output layer by multiplying A1 with W2 and adding b2.
  • The sigmoid activation function is applied to Z2 to produce the final output A2.
  • The function returns the output A2 along with a cache containing intermediate values needed for backpropagation.
Python
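def forward_propagation(X, parameters):
    W1, b1, W2, b2 = parameters["W1"], parameters["b1"], parameters["W2"], parameters["b2"]

    # Hidden layer: linear step followed by ReLU
    Z1 = np.dot(W1, X) + b1
    A1 = relu(Z1)
    # Output layer: linear step followed by sigmoid
    Z2 = np.dot(W2, A1) + b2
    A2 = sigmoid(Z2)

    # Intermediate values are reused during backpropagation
    cache = {"Z1": Z1, "A1": A1, "Z2": Z2, "A2": A2}
    return A2, cache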
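The sketch below traces one forward pass by hand. The parameter values in `toy_parameters` are hypothetical numbers chosen so the arithmetic is easy to follow; they are not trained weights and do not come from the article.

```python
import numpy as np

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))

def relu(Z):
    return np.maximum(0, Z)

def forward_propagation(X, parameters):
    W1, b1, W2, b2 = parameters["W1"], parameters["b1"], parameters["W2"], parameters["b2"]
    Z1 = np.dot(W1, X) + b1
    A1 = relu(Z1)
    Z2 = np.dot(W2, A1) + b2
    A2 = sigmoid(Z2)
    cache = {"Z1": Z1, "A1": A1, "Z2": Z2, "A2": A2}
    return A2, cache

# Four examples with two features each; every column is one example
X = np.array([[0, 0, 1, 1],
              [0, 1, 0, 1]])

# Hypothetical, hand-picked parameters for a network with 2 hidden units
toy_parameters = {
    "W1": np.array([[1.0, 1.0], [1.0, -1.0]]),
    "b1": np.zeros((2, 1)),
    "W2": np.array([[1.0, -1.0]]),
    "b2": np.array([[-1.0]]),
}

A2, cache = forward_propagation(X, toy_parameters)

print(cache["Z1"].shape, cache["A1"].shape)  # (2, 4) (2, 4)
print(cache["Z2"])                           # pre-activations -1, 0, -1, 1
print(np.round(A2, 3))                       # roughly 0.269, 0.5, 0.269, 0.731
```

Every intermediate array has one column per example, so the whole dataset passes through the network in a single set of matrix operations.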

Step 5: Computing the Cost

The cost function calculates the binary cross-entropy loss, which measures how well the neural network’s predictions A2 match the true labels Y. The cost is lower when the predictions are closer to the labels.

  • m is the number of examples.
  • np.squeeze removes any extra dimensions, returning the cost as a scalar.
Python
def compute_cost(Y, A2):
    m = Y.shape[1]  # Number of examples
    # Binary cross-entropy, averaged over all examples
    cost = -np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m
    return np.squeeze(cost)
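Two properties of this cost are worth checking numerically. First, a network that predicts 0.5 for every example has a cost of ln 2 (about 0.693), which is a useful baseline when reading training logs. Second, `np.log` returns negative infinity if a prediction is exactly 0 or 1. The sketch below illustrates both; `compute_cost_clipped` and its `eps` parameter are hypothetical additions for illustration, not part of the original article.

```python
import numpy as np

def compute_cost(Y, A2):
    m = Y.shape[1]
    cost = -np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m
    return np.squeeze(cost)

def compute_cost_clipped(Y, A2, eps=1e-12):
    # Hypothetical variant: keep predictions inside [eps, 1 - eps]
    # so that np.log never receives exactly 0
    A2 = np.clip(A2, eps, 1 - eps)
    return compute_cost(Y, A2)

Y = np.array([[0, 0, 0, 1]])

# Baseline: predicting 0.5 everywhere gives a cost of ln(2)
uninformed = np.full((1, 4), 0.5)
baseline_cost = compute_cost(Y, uninformed)
print(baseline_cost, np.log(2))

# Saturated predictions would make the unclipped cost evaluate 0 * log(0);
# the clipped variant stays finite and close to zero
saturated = np.array([[0.0, 0.0, 0.0, 1.0]])
clipped_cost = compute_cost_clipped(Y, saturated)
print(clipped_cost)
```

For the small AND example used later the predictions are unlikely to reach exactly 0 or 1, so clipping matters mainly when the same code is reused on larger problems.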

Step 6: Backpropagation

Backpropagation computes the gradients needed to update the network parameters during training.

  • It calculates the error at the output layer (dZ2) as the difference between the predicted outputs (A2) and the true labels (Y). This simple form is the gradient of the binary cross-entropy cost with respect to Z2 when the output activation is sigmoid.
  • Using this error, it computes the gradients of the weights (dW2) and biases (db2) for the output layer.
  • It then backpropagates the error to the hidden layer by multiplying by the transpose of W2 and, element-wise, by the derivative of the ReLU activation (relu_derivative).
  • Finally, it calculates the gradients for the hidden layer weights (dW1) and biases (db1).
  • The weight and bias gradients are divided by the number of examples m because the cost is an average over the examples. This keeps the size of each update independent of the dataset size.
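The steps above follow from the chain rule. The derivation below is a sketch for a single example with label y, output pre-activation z and prediction a. The bracketed superscripts for layer numbers and the symbol g for the ReLU function are notation assumed here; the article itself only uses the code names (dZ2, dW2 and so on).

```latex
\begin{align*}
L &= -\bigl[\, y \log a + (1 - y)\log(1 - a) \,\bigr],
  \qquad a = \sigma(z) = \frac{1}{1 + e^{-z}} \\[4pt]
\frac{\partial L}{\partial a} &= -\frac{y}{a} + \frac{1 - y}{1 - a},
  \qquad \frac{\partial a}{\partial z} = a(1 - a) \\[4pt]
\frac{\partial L}{\partial z}
  &= \Bigl(-\frac{y}{a} + \frac{1 - y}{1 - a}\Bigr)\, a(1 - a)
   = -y(1 - a) + (1 - y)\,a
   = a - y
\end{align*}

% Vectorized over m examples (columns), matching the code names:
\begin{align*}
dZ^{[2]} &= A^{[2]} - Y, &
dW^{[2]} &= \tfrac{1}{m}\, dZ^{[2]} A^{[1]\top}, &
db^{[2]} &= \tfrac{1}{m} \textstyle\sum_{i=1}^{m} dZ^{[2](i)} \\
dZ^{[1]} &= \bigl(W^{[2]\top} dZ^{[2]}\bigr) \odot g'\!\bigl(Z^{[1]}\bigr), &
dW^{[1]} &= \tfrac{1}{m}\, dZ^{[1]} X^{\top}, &
db^{[1]} &= \tfrac{1}{m} \textstyle\sum_{i=1}^{m} dZ^{[1](i)}
\end{align*}
```

The cancellation in the last step of the first block is why the sigmoid output pairs so conveniently with the binary cross-entropy cost.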
Python
def backward_propagation(X, Y, parameters, cache):
    m = X.shape[1]  # Number of examples
    W2 = parameters["W2"]

    # Output layer gradients
    dZ2 = cache["A2"] - Y
    dW2 = np.dot(dZ2, cache["A1"].T) / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m

    # Hidden layer gradients
    dZ1 = np.dot(W2.T, dZ2) * relu_derivative(cache["Z1"])
    dW1 = np.dot(dZ1, X.T) / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m

    grads = {"dW1": dW1, "db1": db1, "dW2": dW2, "db2": db2}
    return grads
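A common way to gain confidence in hand-written backpropagation is a numerical gradient check: nudge each parameter slightly, measure how the cost changes, and compare that estimate with the analytic gradient. The sketch below does this for the functions in this article. `numerical_gradient`, the random test data and the step size `eps` are assumptions made for illustration and are not part of the original tutorial.

```python
import numpy as np

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))

def relu(Z):
    return np.maximum(0, Z)

def relu_derivative(Z):
    return (Z > 0).astype(int)

def forward_propagation(X, parameters):
    Z1 = np.dot(parameters["W1"], X) + parameters["b1"]
    A1 = relu(Z1)
    Z2 = np.dot(parameters["W2"], A1) + parameters["b2"]
    A2 = sigmoid(Z2)
    return A2, {"Z1": Z1, "A1": A1, "Z2": Z2, "A2": A2}

def compute_cost(Y, A2):
    m = Y.shape[1]
    return np.squeeze(-np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m)

def backward_propagation(X, Y, parameters, cache):
    m = X.shape[1]
    dZ2 = cache["A2"] - Y
    dW2 = np.dot(dZ2, cache["A1"].T) / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    dZ1 = np.dot(parameters["W2"].T, dZ2) * relu_derivative(cache["Z1"])
    dW1 = np.dot(dZ1, X.T) / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    return {"dW1": dW1, "db1": db1, "dW2": dW2, "db2": db2}

def numerical_gradient(X, Y, parameters, key, eps=1e-6):
    # Hypothetical helper: central-difference estimate of d(cost)/d(parameters[key])
    grad = np.zeros_like(parameters[key])
    for idx in np.ndindex(*parameters[key].shape):
        original = parameters[key][idx]
        parameters[key][idx] = original + eps
        cost_plus = compute_cost(Y, forward_propagation(X, parameters)[0])
        parameters[key][idx] = original - eps
        cost_minus = compute_cost(Y, forward_propagation(X, parameters)[0])
        parameters[key][idx] = original  # restore the parameter
        grad[idx] = (cost_plus - cost_minus) / (2 * eps)
    return grad

# Random test problem (illustrative): 2 features, 5 examples, 3 hidden units
rng = np.random.default_rng(0)
X = rng.normal(size=(2, 5))
Y = rng.integers(0, 2, size=(1, 5))
parameters = {
    "W1": rng.normal(size=(3, 2)),
    "b1": rng.normal(size=(3, 1)),
    "W2": rng.normal(size=(1, 3)),
    "b2": rng.normal(size=(1, 1)),
}

_, cache = forward_propagation(X, parameters)
grads = backward_propagation(X, Y, parameters, cache)

# Largest absolute gap between analytic and numerical gradients
max_diff = max(
    np.max(np.abs(grads["d" + key] - numerical_gradient(X, Y, parameters, key)))
    for key in parameters
)
print(f"Largest analytic-vs-numeric gap: {max_diff:.2e}")
```

A gap many orders of magnitude smaller than the gradients themselves suggests the analytic formulas are consistent with the cost. The check can give a misleading result if a hidden pre-activation lies almost exactly at 0, where ReLU has a kink.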

Step 7: Updating Parameters

Gradient descent updates the parameters using the computed gradients and a learning rate.

Python
def update_parameters(parameters, grads, learning_rate):
    for key in parameters.keys():
        # Step each parameter against its gradient, e.g. W1 uses dW1
        parameters[key] -= learning_rate * grads["d" + key]
    return parameters
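One detail of this function is easy to miss: the `-=` operator modifies the NumPy arrays in place, so the dictionary passed in is changed and the returned dictionary is the same object. The sketch below shows one update step on hand-picked numbers; the values and the `snapshot` copy are illustrative assumptions.

```python
import numpy as np

def update_parameters(parameters, grads, learning_rate):
    for key in parameters.keys():
        parameters[key] -= learning_rate * grads["d" + key]
    return parameters

# Hand-picked values so the arithmetic is easy to verify
parameters = {"W1": np.array([[1.0, 2.0]]), "b1": np.array([[0.5]])}
grads = {"dW1": np.array([[10.0, -10.0]]), "db1": np.array([[1.0]])}

# Copy the arrays first if the old values are needed later
snapshot = {key: value.copy() for key, value in parameters.items()}

updated = update_parameters(parameters, grads, learning_rate=0.1)

print(updated["W1"])          # [[0. 3.]]  i.e. 1 - 0.1*10 and 2 - 0.1*(-10)
print(updated is parameters)  # True: the input dictionary itself was modified
print(snapshot["W1"])         # [[1. 2.]]  the copy keeps the old values
```

Updating in place avoids allocating new arrays on every step, which is why the training loop can simply reassign the returned dictionary.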

Step 8: Training the Neural Network

We train the neural network over multiple iterations, updating parameters using backpropagation and gradient descent.

Python
def train_neural_network(X, Y, input_size, hidden_size, output_size, epochs=1000, learning_rate=0.01):
    parameters = initialize_parameters(input_size, hidden_size, output_size)

    for i in range(epochs):
        # One full pass: predict, measure the cost, compute gradients, update
        A2, cache = forward_propagation(X, parameters)
        cost = compute_cost(Y, A2)
        grads = backward_propagation(X, Y, parameters, cache)
        parameters = update_parameters(parameters, grads, learning_rate)

        # Report progress every 100 epochs
        if i % 100 == 0:
            print(f"Epoch {i}: Cost = {cost}")

    return parameters
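Printing the cost is enough for a quick look, but keeping the values in a list makes it possible to check or plot them afterwards. The sketch below is a condensed, self-contained variant of the loop above; `train_with_history` is a hypothetical name, and the choice of 500 epochs is only for illustration.

```python
import numpy as np

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))

def relu(Z):
    return np.maximum(0, Z)

def initialize_parameters(input_size, hidden_size, output_size):
    np.random.seed(42)
    return {
        "W1": np.random.randn(hidden_size, input_size) * 0.01,
        "b1": np.zeros((hidden_size, 1)),
        "W2": np.random.randn(output_size, hidden_size) * 0.01,
        "b2": np.zeros((output_size, 1)),
    }

def forward_propagation(X, parameters):
    Z1 = np.dot(parameters["W1"], X) + parameters["b1"]
    A1 = relu(Z1)
    Z2 = np.dot(parameters["W2"], A1) + parameters["b2"]
    A2 = sigmoid(Z2)
    return A2, {"Z1": Z1, "A1": A1, "Z2": Z2, "A2": A2}

def compute_cost(Y, A2):
    m = Y.shape[1]
    return np.squeeze(-np.sum(Y * np.log(A2) + (1 - Y) * np.log(1 - A2)) / m)

def backward_propagation(X, Y, parameters, cache):
    m = X.shape[1]
    dZ2 = cache["A2"] - Y
    dW2 = np.dot(dZ2, cache["A1"].T) / m
    db2 = np.sum(dZ2, axis=1, keepdims=True) / m
    dZ1 = np.dot(parameters["W2"].T, dZ2) * (cache["Z1"] > 0)
    dW1 = np.dot(dZ1, X.T) / m
    db1 = np.sum(dZ1, axis=1, keepdims=True) / m
    return {"dW1": dW1, "db1": db1, "dW2": dW2, "db2": db2}

def train_with_history(X, Y, hidden_size, epochs, learning_rate):
    # Hypothetical variant of the training loop that records every cost value
    parameters = initialize_parameters(X.shape[0], hidden_size, Y.shape[0])
    history = []
    for _ in range(epochs):
        A2, cache = forward_propagation(X, parameters)
        history.append(float(compute_cost(Y, A2)))
        grads = backward_propagation(X, Y, parameters, cache)
        for key in parameters:
            parameters[key] -= learning_rate * grads["d" + key]
    return parameters, history

# AND gate data: each column of X is one input pair
X = np.array([[0, 0, 1, 1], [0, 1, 0, 1]])
Y = np.array([[0, 0, 0, 1]])

_, history = train_with_history(X, Y, hidden_size=4, epochs=500, learning_rate=0.1)
print(f"first cost: {history[0]:.4f}, last cost: {history[-1]:.4f}")
```

Because the initial weights are tiny, the first recorded cost is very close to ln 2, the value obtained when every prediction is about 0.5.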

Step 9: Making Predictions

The trained model predicts outputs by performing forward propagation and applying a threshold of 0.5.

Python
def predict(X, parameters):
    A2, _ = forward_propagation(X, parameters)
    # Probabilities above 0.5 become class 1, the rest class 0
    return (A2 > 0.5).astype(int)
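Once predictions are thresholded, they can be compared with the labels to get an accuracy score. The sketch below uses hand-picked parameters that implement AND with a single hidden unit, so it runs without training. Both `and_parameters` and the `accuracy` helper are hypothetical and are not taken from the article.

```python
import numpy as np

def sigmoid(Z):
    return 1 / (1 + np.exp(-Z))

def relu(Z):
    return np.maximum(0, Z)

def forward_propagation(X, parameters):
    Z1 = np.dot(parameters["W1"], X) + parameters["b1"]
    A1 = relu(Z1)
    Z2 = np.dot(parameters["W2"], A1) + parameters["b2"]
    A2 = sigmoid(Z2)
    return A2, {"Z1": Z1, "A1": A1, "Z2": Z2, "A2": A2}

def predict(X, parameters):
    A2, _ = forward_propagation(X, parameters)
    return (A2 > 0.5).astype(int)

def accuracy(predictions, Y):
    # Hypothetical helper: fraction of examples predicted correctly
    return float(np.mean(predictions == Y))

# Hand-picked (not trained) parameters: the hidden unit fires only for input (1, 1)
and_parameters = {
    "W1": np.array([[1.0, 1.0]]),
    "b1": np.array([[-1.0]]),
    "W2": np.array([[10.0]]),
    "b2": np.array([[-5.0]]),
}

X = np.array([[0, 0, 1, 1], [0, 1, 0, 1]])
Y = np.array([[0, 0, 0, 1]])

predictions = predict(X, and_parameters)
accuracy_value = accuracy(predictions, Y)
print(predictions)     # [[0 0 0 1]]
print(accuracy_value)  # 1.0
```

The 0.5 threshold is a natural default for balanced binary problems; a different threshold can be used when false positives and false negatives have different costs.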

Step 10: Testing the Model

We test the model using an AND logic gate dataset.

Python
# Example data (AND logic gate): each column of X is one input pair
X = np.array([[0, 0, 1, 1], [0, 1, 0, 1]])
Y = np.array([[0, 0, 0, 1]])

trained_parameters = train_neural_network(
    X, Y, input_size=2, hidden_size=4, output_size=1,
    epochs=10000, learning_rate=0.1)

predictions = predict(X, trained_parameters)
print("Predictions:", predictions)

Output:

[Image: console output showing the cost every 100 epochs and the final predictions]

The neural network starts with small random weights, so its initial predictions are all close to 0.5 and the cost begins near ln 2 (about 0.693). Over 10,000 epochs, gradient descent adjusts the weights and biases, and the cost reported every 100 epochs decreases. The final predictions match the expected AND gate truth table. Because the model is evaluated on the same four examples it was trained on, this shows that the network has fit the AND function; it does not measure generalization to unseen data.

alka1974