The LAG() function in SQL is a window function that retrieves the value of a column from a previous row in the result set (by default, the row immediately before the current one). It is commonly used for calculating differences between rows, tracking trends, and comparing data within specific partitions.
In this article, we’ll explore the SQL LAG() function in detail, covering its syntax, examples, and practical applications.
What is the SQL LAG() Function?
The LAG() function belongs to a category of functions known as window functions. From the current row, it retrieves a value from a previous row of the result set. An aggregate function returns a single value for a group of rows; LAG(), by contrast, does not collapse rows. It returns a value for every row, looking back within that row's partition.
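The difference between collapsing rows and keeping them can be seen in a small sketch. The table RegionSales and its columns are hypothetical, invented only for this illustration.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE RegionSales (Region VARCHAR(10), SaleMonth INT, Amount INT);

INSERT INTO RegionSales (Region, SaleMonth, Amount) VALUES
    ('East', 1, 100), ('East', 2, 130),
    ('West', 1, 90),  ('West', 2, 70);

-- Aggregate function: the four input rows collapse to one row per region.
SELECT Region, SUM(Amount) AS TotalAmount
FROM RegionSales
GROUP BY Region;
-- East | 230
-- West | 160

-- Window function: all four rows are kept, each with its previous amount.
SELECT Region, SaleMonth, Amount,
       LAG(Amount) OVER (PARTITION BY Region ORDER BY SaleMonth) AS PrevAmount
FROM RegionSales
ORDER BY Region, SaleMonth;
-- East | 1 | 100 | NULL
-- East | 2 | 130 | 100
-- West | 1 | 90  | NULL
-- West | 2 | 70  | 90
```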
Syntax:
LAG (scalar_expression [, offset [, default ]]) OVER ( [ partition_by_clause ] order_by_clause )
Where:
- scalar_expression – The value to be returned based on the specified offset.
- offset – The number of rows back from the current row from which to obtain a value. If not specified, the default is 1.
- default – The value to be returned if the offset goes beyond the scope of the partition. If a default value is not specified, NULL is returned.
- partition_by_clause – An optional clause that divides the result set into partitions. The LAG() function is applied to each partition separately. If it is omitted, the whole result set is treated as a single partition.
- order_by_clause – The order of the rows within each partition. This is mandatory, because "previous row" has no meaning without an ordering.
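As a minimal sketch of how the offset and default arguments behave, the query below calls LAG() three ways on the same column. The table DailySales and its columns are hypothetical names chosen for illustration.

```sql
-- Hypothetical table used only for this illustration.
CREATE TABLE DailySales (SaleDay INT, Amount INT);

INSERT INTO DailySales (SaleDay, Amount) VALUES
    (1, 100), (2, 150), (3, 120), (4, 180);

SELECT SaleDay, Amount,
       LAG(Amount)       OVER (ORDER BY SaleDay) AS PrevAmount,     -- offset defaults to 1
       LAG(Amount, 2)    OVER (ORDER BY SaleDay) AS TwoBack,        -- two rows back, NULL if none
       LAG(Amount, 2, 0) OVER (ORDER BY SaleDay) AS TwoBackOrZero   -- two rows back, 0 if none
FROM DailySales
ORDER BY SaleDay;
-- SaleDay | Amount | PrevAmount | TwoBack | TwoBackOrZero
-- 1       | 100    | NULL       | NULL    | 0
-- 2       | 150    | 100        | NULL    | 0
-- 3       | 120    | 150        | 100     | 100
-- 4       | 180    | 120        | 150     | 150
```

No PARTITION BY clause is given here, so all four rows form a single partition.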
Why Use the LAG() Function?
- Comparing Rows: Helps compare the current row with a previous row’s data.
- Trend Analysis: Useful for analyzing changes in values, like stock prices, sales figures, or other time-series data.
- Finding Differences: Calculate the difference between consecutive rows in terms of time, quantity, or any other metric.
Examples of SQL LAG() Function
Let’s look at some examples of the SQL LAG() function and see how to use it.
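The examples query a table named Org whose definition is not shown. The following setup is a sketch that reproduces it: the rows are taken from the output tables in the examples, while the column types are assumptions. The bracketed [Year] follows the SQL Server style used in the queries; other engines may need double quotes instead.

```sql
-- Assumed schema: the column types are guesses inferred from the example output.
CREATE TABLE Org (
    Organisation VARCHAR(50) NOT NULL,
    [Year]       INT         NOT NULL,
    Revenue      INT         NOT NULL
);

-- Rows reconstructed from the output tables shown in the examples.
INSERT INTO Org (Organisation, [Year], Revenue) VALUES
    ('ABCD News', 2013, 440000),
    ('ABCD News', 2014, 480000),
    ('ABCD News', 2015, 490000),
    ('ABCD News', 2016, 500000),
    ('ABCD News', 2017, 520000),
    ('ABCD News', 2018, 525000),
    ('ABCD News', 2019, 540000),
    ('ABCD News', 2020, 550000),
    ('Z News',    2016, 720000),
    ('Z News',    2017, 750000),
    ('Z News',    2018, 780000),
    ('Z News',    2019, 880000),
    ('Z News',    2020, 910000);
```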
Example 1: Basic Usage of LAG()
SELECT Organisation, [Year], Revenue,
       LAG(Revenue, 1, 0) OVER (PARTITION BY Organisation ORDER BY [Year]) AS PrevYearRevenue
FROM Org
ORDER BY Organisation, [Year];
Output:
Organisation | Year | Revenue | PrevYearRevenue |
ABCD News | 2013 | 440000 | 0 |
ABCD News | 2014 | 480000 | 440000 |
ABCD News | 2015 | 490000 | 480000 |
ABCD News | 2016 | 500000 | 490000 |
ABCD News | 2017 | 520000 | 500000 |
ABCD News | 2018 | 525000 | 520000 |
ABCD News | 2019 | 540000 | 525000 |
ABCD News | 2020 | 550000 | 540000 |
Z News | 2016 | 720000 | 0 |
Z News | 2017 | 750000 | 720000 |
Z News | 2018 | 780000 | 750000 |
Z News | 2019 | 880000 | 780000 |
Z News | 2020 | 910000 | 880000 |
Explanation:
In the above example, there are two TV news channels, and the LAG() function presents each channel's current and previous year's revenue on the same row. The first record for each channel has no previous year's revenue, so it shows the default value of 0. This is useful for BI reports that compare values across consecutive periods, e.g. year on year, quarter on quarter, or day to day.
Example 2: Calculate Year-on-Year Growth
SELECT Z.*, (Z.Revenue - Z.PrevYearRevenue) AS YearOnYearGrowth
FROM (
    SELECT Organisation, [Year], Revenue,
           LAG(Revenue, 1) OVER (PARTITION BY Organisation ORDER BY [Year]) AS PrevYearRevenue
    FROM Org
) Z
ORDER BY Organisation, [Year];
Output:
Organisation | Year | Revenue | PrevYearRevenue | YearOnYearGrowth |
ABCD News | 2013 | 440000 | NULL | NULL |
ABCD News | 2014 | 480000 | 440000 | 40000 |
ABCD News | 2015 | 490000 | 480000 | 10000 |
ABCD News | 2016 | 500000 | 490000 | 10000 |
ABCD News | 2017 | 520000 | 500000 | 20000 |
ABCD News | 2018 | 525000 | 520000 | 5000 |
ABCD News | 2019 | 540000 | 525000 | 15000 |
ABCD News | 2020 | 550000 | 540000 | 10000 |
Z News | 2016 | 720000 | NULL | NULL |
Z News | 2017 | 750000 | 720000 | 30000 |
Z News | 2018 | 780000 | 750000 | 30000 |
Z News | 2019 | 880000 | 780000 | 100000 |
Z News | 2020 | 910000 | 880000 | 30000 |
Explanation:
In the above example, we calculate year-on-year growth for each TV news channel in the same way. Note that no default argument is supplied to LAG() here, so it returns NULL when there is no previous row, and the growth for each channel's first year is therefore NULL as well. Because LAG() runs at the database level, BI reporting tools such as Power BI and Tableau can consume the result directly instead of relying on cumbersome measures at the reporting layer.
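The NULL default matters once growth is expressed as a percentage: with a default of 0, the first year would either divide by zero or report a misleading figure. The sketch below extends the growth calculation under that reasoning. The table OrgDemo is a hypothetical three-row subset of the Z News figures, and GrowthPct is an illustrative column name.

```sql
-- Hypothetical table holding three of the Z News rows from the examples.
CREATE TABLE OrgDemo (Organisation VARCHAR(50), [Year] INT, Revenue INT);

INSERT INTO OrgDemo (Organisation, [Year], Revenue) VALUES
    ('Z News', 2016, 720000),
    ('Z News', 2017, 750000),
    ('Z News', 2018, 780000);

SELECT Organisation, [Year], Revenue, PrevYearRevenue,
       Revenue - PrevYearRevenue AS YearOnYearGrowth,
       -- NULLIF guards against a previous revenue of exactly 0.
       100.0 * (Revenue - PrevYearRevenue) / NULLIF(PrevYearRevenue, 0) AS GrowthPct
FROM (
    SELECT Organisation, [Year], Revenue,
           LAG(Revenue) OVER (PARTITION BY Organisation ORDER BY [Year]) AS PrevYearRevenue
    FROM OrgDemo
) AS WithPrev
ORDER BY Organisation, [Year];
-- 2016: growth and GrowthPct are NULL (no previous year)
-- 2017: growth 30000, GrowthPct about 4.17
-- 2018: growth 30000, GrowthPct 4.0 (exact number formatting depends on the engine)
```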
Use Cases of SQL LAG() Function
- Sales Trends: Compare daily, monthly, or yearly sales figures.
- Stock Analysis: Analyze changes in stock prices over time.
- Employee Salaries: Track changes in salaries within departments.
- Data Validation: Identify missing or duplicate rows in a sequence.
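The data validation case can be sketched as follows: comparing each value with its predecessor exposes gaps in a numbered sequence. The table InvoiceLog and its column are hypothetical names for illustration.

```sql
-- Hypothetical table: invoice numbers 4, 7 and 8 are missing.
CREATE TABLE InvoiceLog (InvoiceNo INT NOT NULL);

INSERT INTO InvoiceLog (InvoiceNo) VALUES (1), (2), (3), (5), (6), (9);

-- Window functions cannot appear in WHERE, so LAG() is computed in a subquery.
SELECT PrevInvoiceNo + 1 AS GapStart,
       InvoiceNo - 1     AS GapEnd
FROM (
    SELECT InvoiceNo,
           LAG(InvoiceNo) OVER (ORDER BY InvoiceNo) AS PrevInvoiceNo
    FROM InvoiceLog
) AS Numbered
WHERE InvoiceNo - PrevInvoiceNo > 1
ORDER BY GapStart;
-- GapStart | GapEnd
-- 4        | 4
-- 7        | 8
```

The first row has no predecessor, so its comparison evaluates to NULL and the row is filtered out. Changing the condition to InvoiceNo - PrevInvoiceNo = 0 would flag duplicates instead.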
Important Points About SQL LAG() Function
- The SQL LAG() function is a window function that allows users to access data from earlier rows in a dataset.
- It enables users to compare current row values with values from previous rows, especially those related to time or specific columns.
- The LAG() function is valuable for analyzing changes over time, such as stock market data, daily trends, and alterations in multiple columns.
Conclusion
The LAG() function in SQL is a versatile and powerful tool for comparing rows, analyzing trends, and calculating differences in datasets. Its ability to access previous row data makes it ideal for tasks like year-over-year analysis, sales trends, and financial reporting.
By mastering the LAG() function, you can perform advanced analytics directly within SQL and reduce the complexity of your reporting queries. Use the examples and tips in this article to implement the function effectively in your database operations.