
Flight Delay Prediction using Deep Learning

Last Updated : 09 Sep, 2024

Air travel has become an important part of our lives, and with it comes the problem of delayed flights. Deep learning models can automatically learn hierarchical representations from data, which makes them a reasonable fit for flight delay prediction. In this article, we will build a flight delay predictor using the TensorFlow framework.

How can we use deep learning to build a flight delay predictor?

  • Deep learning is a subset of artificial intelligence that can learn complex patterns from data and make predictions. Its applications include natural language processing, image recognition (computer vision), predictive modelling and many more.
  • Deep learning can learn hierarchical representations of data. This makes it suitable for datasets with a large number of features and for tasks that involve spatial data.
  • In the context of flight delay prediction, a deep learning model can take a flight's total distance and air time as input and predict by how many minutes the flight is likely to be delayed. The model can also be retrained as new data becomes available, which suits a setting where flight records keep accumulating.

Building a Flight Delay Predictor

We will use the US Domestic Flights Delay Prediction (2013-2018) dataset for both training and testing the model. It has features such as flight date, origin, destination, scheduled departure time, distance, arrival delay and many more. Now let's load the dataset into our Kaggle notebook and look at a few data points.

Python
import pandas as pd
import numpy as np
import plotly.express as px
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

# Load the dataset from the Kaggle input directory and preview the first rows
data = pd.read_csv('/kaggle/input/us-domestic-flights-delay-prediction-2013-2018/flight_delay_predict.csv')
data.head()


Output:

   is_delay  Year  Quarter  Month  DayofMonth  DayOfWeek  FlightDate  Reporting_Airline  Origin  OriginState  Dest  DestState  CRSDepTime  Cancelled  Diverted  Distance  DistanceGroup  ArrDelay  ArrDelayMinutes  AirTime
0       1.0  2014        1      1           1          3  2014-01-01                 UA     LAX           CA   ORD         IL         900        0.0       0.0    1744.0              7      43.0             43.0    218.0
1       0.0  2014        1      1           1          3  2014-01-01                 AA     IAH           TX   DFW         TX        1750        0.0       0.0     224.0              1       2.0              2.0     50.0
2       1.0  2014        1      1           1          3  2014-01-01                 AA     LAX           CA   ORD         IL        1240        0.0       0.0    1744.0              7      26.0             26.0    220.0
3       1.0  2014        1      1           1          3  2014-01-01                 AA     DFW           TX   LAX         CA        1905        0.0       0.0    1235.0              5     159.0            159.0    169.0
4       0.0  2014        1      1           1          3  2014-01-01                 AA     DFW           TX   CLT         NC        1115        0.0       0.0     936.0              4     -13.0              0.0    108.0
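
Before modelling, it can help to check the columns used later for missing values. The sketch below runs on a tiny hand-made DataFrame with the same column names, not on the real dataset; whether the Kaggle file actually contains missing values in these columns is an assumption to verify, and `drop_incomplete_rows` is a hypothetical helper name introduced only for illustration.

```python
import numpy as np
import pandas as pd

# Columns the model uses later in the article.
MODEL_COLUMNS = ["AirTime", "Distance", "ArrDelayMinutes", "is_delay"]


def drop_incomplete_rows(df, columns=MODEL_COLUMNS):
    """Return a copy of df without rows that have NaN in any of `columns`."""
    return df.dropna(subset=columns).reset_index(drop=True)


# Toy stand-in for the real dataset (the NaN values are made up for illustration).
toy = pd.DataFrame({
    "AirTime": [218.0, 50.0, np.nan],
    "Distance": [1744.0, 224.0, 936.0],
    "ArrDelayMinutes": [43.0, 2.0, np.nan],
    "is_delay": [1.0, 0.0, 0.0],
})

# Count missing values per column, then drop the incomplete rows.
missing_per_column = toy[MODEL_COLUMNS].isna().sum()
clean = drop_incomplete_rows(toy)

print(missing_per_column)
print(len(toy), "rows before,", len(clean), "rows after")
```

On the real data, the same two calls (`isna().sum()` and `dropna(subset=...)`) could be applied to `data` before the train/test split.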

Exploratory Data Analysis (EDA)

EDA is a very important step in understanding the data. It helps us understand the structure, distribution, and relationships within the dataset. One important step of EDA is visualizing the dataset. We can visualize the average arrival delays at different origin and destination airports.

Python
# Mean arrival delay for each origin airport
avg_delay_by_origin = data.groupby('Origin')['ArrDelay'].mean().reset_index()

bar_plot = px.bar(avg_delay_by_origin, x='Origin', y='ArrDelay',
                  title='Average Arrival Delay by Origin Airport')
bar_plot.update_layout(xaxis_title='Origin Airport',
                       yaxis_title='Average Arrival Delay')

bar_plot.show()

Output:

[Figure: bar chart "Average Arrival Delay by Origin Airport" (x-axis: Origin Airport, y-axis: Average Arrival Delay)]


Python
# Mean arrival delay for each destination airport
avg_delay_by_dest = data.groupby('Dest')['ArrDelay'].mean().reset_index()

bar_plot_dest = px.bar(avg_delay_by_dest, x='Dest', y='ArrDelay',
                       title='Average Arrival Delay by Destination Airport')
bar_plot_dest.update_layout(xaxis_title='Destination Airport',
                            yaxis_title='Average Arrival Delay')

bar_plot_dest.show()


Output:

[Figure: bar chart "Average Arrival Delay by Destination Airport" (x-axis: Destination Airport, y-axis: Average Arrival Delay)]
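
With many airports on the x-axis, a bar chart of every airport can be hard to read. One option is to rank airports by mean delay and keep only the top few, ignoring airports with very few flights. This is a minimal sketch on made-up rows; `top_delayed_airports` and its `min_flights` parameter are hypothetical names, not part of the original notebook.

```python
import pandas as pd


def top_delayed_airports(df, airport_col="Origin", delay_col="ArrDelay",
                         n=10, min_flights=1):
    """Rank airports by mean delay, skipping those with fewer than min_flights rows."""
    stats = (
        df.groupby(airport_col)[delay_col]
        .agg(mean_delay="mean", flights="count")
        .reset_index()
    )
    stats = stats[stats["flights"] >= min_flights]
    return stats.sort_values("mean_delay", ascending=False).head(n)


# Toy rows for illustration only.
toy = pd.DataFrame({
    "Origin": ["LAX", "LAX", "DFW", "DFW", "IAH"],
    "ArrDelay": [43.0, 26.0, 159.0, -13.0, 2.0],
})

# IAH has a single flight, so min_flights=2 filters it out.
top = top_delayed_airports(toy, n=2, min_flights=2)
print(top)
```

The resulting frame can be passed to `px.bar` in the same way as the full per-airport averages.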


Python
# Correlation heatmap over the numeric columns only
numeric_data = data.select_dtypes(include=['number'])

corr_matrix = numeric_data.corr()

plt.figure(figsize=(15, 10))
sns.heatmap(corr_matrix, annot=True)

Output:

[Figure: annotated correlation heatmap of the numeric columns]


Python
data['FlightDate'] = pd.to_datetime(data['FlightDate'])

# Mean of the 0/1 is_delay flag for each calendar month
avg_delay_month = data.groupby(data['FlightDate'].dt.month)['is_delay'].mean().reset_index()
fig = px.bar(avg_delay_month, x='FlightDate', y='is_delay',
             labels={'FlightDate': 'Month', 'is_delay': 'Average Delay'},
             title='Average Delay by Month')
fig.update_traces(marker_color='skyblue')
fig.show()

Output:

[Figure: bar chart "Average Delay by Month" (x-axis: Month, y-axis: Average Delay)]
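
Because is_delay is a 0/1 flag, its monthly mean is the fraction of flights flagged as delayed in that month, not a delay in minutes. The sketch below checks this on a small made-up frame; the dates and flags are invented for illustration.

```python
import pandas as pd

# Toy data: four flights in January, two in February.
toy = pd.DataFrame({
    "FlightDate": pd.to_datetime([
        "2014-01-01", "2014-01-02", "2014-01-03", "2014-01-04",
        "2014-02-01", "2014-02-02",
    ]),
    "is_delay": [1.0, 0.0, 1.0, 1.0, 0.0, 1.0],
})

by_month = toy.groupby(toy["FlightDate"].dt.month)["is_delay"]

delay_rate = by_month.mean()      # fraction of flights flagged as delayed
delayed_count = by_month.sum()    # number of delayed flights
total_count = by_month.count()    # number of flights

print(delay_rate)
```

Here January has 3 delayed flights out of 4 and February 1 out of 2, so the means are 0.75 and 0.5.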


Splitting the Data

Now, let's get into the main part of this blog, which is the model building. First, we assign the features and the target variables to X and y respectively. Then we split the dataset, keeping 80% of the data for training and the remaining 20% for testing. Finally, we scale the features with the StandardScaler class from sklearn, fitting it on the training set only and reusing the fitted scaler on the test set.

Python
# Splitting the data into training and testing sets
X = data[['AirTime', 'Distance']]
y = data[['ArrDelayMinutes', 'is_delay']]
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Scaling the data
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)
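
To see what the scaling step does, the sketch below repeats it on synthetic air time and distance values (generated at random, not taken from the dataset). It shows that the scaled training features have mean 0 and standard deviation 1, and that `transform` on the test set simply reuses the training mean and scale.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Synthetic [AirTime, Distance] pairs; the relationship between them is made up.
distance = rng.uniform(100, 2500, size=200)
air_time = distance / 8.0 + rng.normal(0, 5, size=200)
X = np.column_stack([air_time, distance])

X_train, X_test = train_test_split(X, test_size=0.2, random_state=42)

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # learns mean/scale from train only
X_test_scaled = scaler.transform(X_test)        # reuses the training statistics

# The same transform written out by hand.
manual = (X_test - scaler.mean_) / scaler.scale_

print("train mean:", X_train_scaled.mean(axis=0))
print("train std: ", X_train_scaled.std(axis=0))
print("manual matches transform:", np.allclose(manual, X_test_scaled))
```

Fitting the scaler on the training split only keeps information about the test set out of the preprocessing step.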

Model Building

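Now we define the architecture of our model using the Sequential model from tensorflow.keras. The network has two hidden Dense layers with ReLU activation (64 and 32 units), each followed by Dropout, and a linear output layer with two units, one for ArrDelayMinutes and one for is_delay. We compile the model with mean squared error as the loss function and the Adam optimizer, train it with the fit() function, evaluate it on the test set and save it to our working directory. Because accuracy is a classification metric, the accuracy value reported in the output says little about how close the predicted delay is to the actual delay; the loss is the number to watch.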

Python
model = Sequential()
model.add(Dense(64, input_dim=X_train.shape[1], activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(32, activation='relu'))
model.add(Dropout(0.5))
# Two linear outputs: predicted ArrDelayMinutes and predicted is_delay
model.add(Dense(2, activation='linear'))

model.compile(loss='mean_squared_error', optimizer='adam', metrics=['accuracy'])

model.fit(X_train, y_train, epochs=5, batch_size=32, verbose=1)

# Evaluate on the held-out test set and report the results
score, accuracy = model.evaluate(X_test, y_test, verbose=0)
print('Test loss:', score)
print('Test accuracy:', accuracy)

model.save('/kaggle/working/model.h5')

Output:

Epoch 1/5
40890/40890 ━━━━━━━━━━━━━━━━━━━━ 68s 2ms/step - accuracy: 0.9959 - loss: 793.4816
Epoch 2/5
40890/40890 ━━━━━━━━━━━━━━━━━━━━ 66s 2ms/step - accuracy: 1.0000 - loss: 803.0837
Epoch 3/5
40890/40890 ━━━━━━━━━━━━━━━━━━━━ 66s 2ms/step - accuracy: 1.0000 - loss: 781.1000
Epoch 4/5
40890/40890 ━━━━━━━━━━━━━━━━━━━━ 66s 2ms/step - accuracy: 1.0000 - loss: 751.3886
Epoch 5/5
40890/40890 ━━━━━━━━━━━━━━━━━━━━ 82s 2ms/step - accuracy: 1.0000 - loss: 777.7186
Test loss: 729.39306640625
Test accuracy: 1.0
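
The test loss above is a mean squared error averaged over both outputs, so its square root (about 27) gives only a rough sense of the typical error in minutes. A clearer picture comes from scoring each output separately. The sketch below does this with NumPy on made-up true and predicted values; `regression_report` is a hypothetical helper, and the 0.5 threshold mirrors the one used in the prediction step of this article.

```python
import numpy as np


def regression_report(y_true, y_pred, threshold=0.5):
    """Score the minutes column with MAE/RMSE and the delay flag with accuracy.

    Column 0 is ArrDelayMinutes, column 1 is is_delay.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    minutes_err = y_pred[:, 0] - y_true[:, 0]
    flag_match = (y_pred[:, 1] >= threshold) == (y_true[:, 1] >= 0.5)
    return {
        "mae_minutes": float(np.mean(np.abs(minutes_err))),
        "rmse_minutes": float(np.sqrt(np.mean(minutes_err ** 2))),
        "flag_accuracy": float(np.mean(flag_match)),
    }


# Made-up targets and predictions, for illustration only.
y_true = [[43.0, 1.0], [2.0, 0.0], [26.0, 1.0], [0.0, 0.0]]
y_pred = [[40.0, 0.9], [6.0, 0.2], [30.0, 0.4], [3.0, 0.1]]

report = regression_report(y_true, y_pred)
print(report)

# Square root of the test loss printed in the article's output.
overall_rmse = float(np.sqrt(729.39306640625))
print(round(overall_rmse, 2))
```

On the real model, `y_pred` would come from `model.predict(X_test)` and `y_true` from `y_test.to_numpy()`.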

Now, we will take input from the user, preprocess it and predict the output.

Python
# Real-time Prediction
air_time = float(input("Enter Air Time in minutes: "))
distance = float(input("Enter Distance in miles: "))

# Apply the same scaling that was fitted on the training data
user_input = np.array([[air_time, distance]])
user_input_scaled = scaler.transform(user_input)

# predictions[0][0] is the delay in minutes, predictions[0][1] is the delay flag
predictions = model.predict(user_input_scaled)
if predictions[0][1] >= 0.5:
    print(f"The flight is delayed by {predictions[0][0]} minutes.")
else:
    print("The flight is not delayed.")

Output:

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 56ms/step
The flight is delayed by 75.59285736083984 minutes.
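
The message printed above shows the raw model output with many decimal places. A small helper can round the value and guard against negative predictions, which a linear output layer can produce. This is a sketch only: `describe_prediction` is a hypothetical name, and the flag value 0.8 in the usage line is made up, since the original output does not show it.

```python
def describe_prediction(pred_row, threshold=0.5):
    """Turn one model output row [minutes, delay_score] into a readable message."""
    minutes, delay_score = float(pred_row[0]), float(pred_row[1])
    if delay_score < threshold:
        return "The flight is not delayed."
    # A linear output can go below zero; clamp so the message stays sensible.
    minutes = max(minutes, 0.0)
    return f"The flight is delayed by about {minutes:.1f} minutes."


# 75.59... is the value from the article's output; 0.8 is an assumed flag score.
print(describe_prediction([75.59285736083984, 0.8]))
```

With the trained model, the call would be `describe_prediction(predictions[0])`.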

Get the complete notebook link here:

Colab Link : click here.

Dataset Link : click here.

Conclusion

In this blog, you have learned about the issue of flight delays and how they affect both passengers and airlines. Through hands-on experience, we learned how to explore and preprocess the data, build a deep learning model with TensorFlow, and use the trained model to make predictions from user input.

Key Takeaways

  • Flight delay is a critical issue impacting both passengers and airlines, leading to inconvenience and financial losses.
  • A Sequential model from tensorflow.keras can be trained to predict arrival delay from air time and distance; the mean squared error on the test set shows how far the predictions are from the actual delays.
  • Data preprocessing and Exploratory Data Analysis (EDA) are important steps in understanding the structure and relationships in the dataset.
  • The saved model can later be integrated with a front end, for example through Flask, although that step is not covered in this article.

adilnaib