ML | Getting Started With AlexNet

Last Updated : 26 Mar, 2020
This article introduces the AlexNet architecture, named after the first author of the AlexNet paper, Alex Krizhevsky. AlexNet won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2012 with a top-5 error rate of 15.3%, well ahead of the runner-up at 26.2%. The most important features of the AlexNet paper are:
  • With 60 million parameters to train, the model was prone to overfitting. According to the paper, dropout and data augmentation significantly reduced overfitting. The first and second fully connected layers use dropout with a probability of 0.5. Data augmentation artificially enlarged the dataset by generating transformed images on the fly during training, which helped the model generalize better.
  • Another distinguishing factor was the use of the ReLU activation function instead of tanh or sigmoid, which sped up training: the paper reports that a ReLU network reached a 25% training error on CIFAR-10 six times faster than an equivalent tanh network. Deep networks usually employ the ReLU non-linearity because tanh and sigmoid saturate for large-magnitude inputs, where their gradients become very small and learning slows down.
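
The saturation argument in the last point can be illustrated numerically. The following is a minimal sketch in plain Python, not code from the paper; the helper names `tanh_grad` and `relu_grad` are chosen here for illustration. It compares the derivative of tanh with that of ReLU as the input grows.

```python
import math


def tanh_grad(x):
    """Derivative of tanh(x), which is 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2


def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


# Print both derivatives for increasingly large positive inputs
for x in (0.5, 2.0, 5.0, 10.0):
    print(f"x = {x:4.1f}   tanh' = {tanh_grad(x):.6f}   relu' = {relu_grad(x):.1f}")
```

For inputs of only a few units the tanh derivative is already close to zero, while the ReLU derivative stays at 1 for every positive input, so gradients passing through active ReLU units are not shrunk.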

The Architecture

The architecture consists of 5 convolutional layers, with the 1st, 2nd and 5th followed by max-pooling layers. The max-pooling layers overlap: they use a 3x3 window with a stride of 2. Compared with non-overlapping max-pooling, this reduced the top-1 and top-5 error rates by 0.4% and 0.3% respectively. The convolutional layers are followed by 2 fully connected layers (each with dropout) and a softmax output layer for predictions. The figure below shows the architecture of AlexNet with all the layers defined.

Code: Python code to implement AlexNet for object classification
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Activation, BatchNormalization, Conv2D,
                                     Dense, Dropout, Flatten, MaxPooling2D)

# Number of output classes (assumed here: 1000, as in ImageNet);
# change this to match your dataset
num_classes = 1000

model = Sequential()

# 1st Convolutional Layer
model.add(Conv2D(filters = 96, input_shape = (224, 224, 3),
                 kernel_size = (11, 11), strides = (4, 4),
                 padding = 'valid'))
model.add(Activation('relu'))
# Overlapped Max-Pooling (3x3 window, stride 2)
model.add(MaxPooling2D(pool_size = (3, 3),
                       strides = (2, 2), padding = 'valid'))
# Batch Normalisation (used here in place of the paper's
# local response normalisation)
model.add(BatchNormalization())

# 2nd Convolutional Layer (5x5 kernels, as in AlexNet)
model.add(Conv2D(filters = 256, kernel_size = (5, 5),
                 strides = (1, 1), padding = 'same'))
model.add(Activation('relu'))
# Overlapped Max-Pooling
model.add(MaxPooling2D(pool_size = (3, 3), strides = (2, 2),
                       padding = 'valid'))
# Batch Normalisation
model.add(BatchNormalization())

# 3rd Convolutional Layer
model.add(Conv2D(filters = 384, kernel_size = (3, 3),
                 strides = (1, 1), padding = 'same'))
model.add(Activation('relu'))
# Batch Normalisation
model.add(BatchNormalization())

# 4th Convolutional Layer
model.add(Conv2D(filters = 384, kernel_size = (3, 3),
                 strides = (1, 1), padding = 'same'))
model.add(Activation('relu'))
# Batch Normalisation
model.add(BatchNormalization())

# 5th Convolutional Layer
model.add(Conv2D(filters = 256, kernel_size = (3, 3),
                 strides = (1, 1), padding = 'same'))
model.add(Activation('relu'))
# Overlapped Max-Pooling
model.add(MaxPooling2D(pool_size = (3, 3), strides = (2, 2),
                       padding = 'valid'))
# Batch Normalisation
model.add(BatchNormalization())

# Flattening
model.add(Flatten())

# 1st Dense Layer (input size is inferred from the Flatten layer)
model.add(Dense(4096))
model.add(Activation('relu'))
# Add Dropout (rate 0.5) to reduce overfitting
model.add(Dropout(0.5))
# Batch Normalisation
model.add(BatchNormalization())

# 2nd Dense Layer
model.add(Dense(4096))
model.add(Activation('relu'))
# Add Dropout
model.add(Dropout(0.5))
# Batch Normalisation
model.add(BatchNormalization())

# Output Softmax Layer
model.add(Dense(num_classes))
model.add(Activation('softmax'))
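
The spatial sizes produced by the early layers can be checked by hand. The sketch below assumes the standard output-size formula for a convolution or pooling window, floor((n + 2p - k) / s) + 1; the helper name `out_size` is hypothetical and is not part of Keras. It applies the formula to the first convolution (224 input, 11x11 kernel, stride 4, no padding) and then compares the overlapped 3x3, stride-2 pooling described in the article with a non-overlapped 2x2, stride-2 pooling.

```python
def out_size(n, kernel, stride, pad=0):
    """Output width/height of a conv or pooling layer on an n x n input."""
    return (n + 2 * pad - kernel) // stride + 1


# 1st convolution: 224x224 input, 11x11 kernel, stride 4, 'valid' padding
conv1 = out_size(224, kernel=11, stride=4)
print("after conv1:", conv1)                           # 54

# Overlapped pooling: the 3x3 window is wider than the stride of 2,
# so neighbouring windows share one row/column of activations
overlapped = out_size(conv1, kernel=3, stride=2)
print("after 3x3 / stride 2 pooling:", overlapped)     # 26

# Non-overlapped pooling: window size equals the stride
non_overlapped = out_size(conv1, kernel=2, stride=2)
print("after 2x2 / stride 2 pooling:", non_overlapped) # 27
```

Both pooling variants roughly halve the feature map; the difference is only that adjacent 3x3 windows overlap, which is the property the paper credits with the small reduction in error.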

JaideepSinghSandhu