The Widrow-Hoff algorithm was developed by Bernard Widrow and his student Ted Hoff in the 1960s to minimize the mean square error between a desired output and the output produced by a linear predictor. The aim of this article is to explore the fundamentals of the Widrow-Hoff algorithm and its impact on the evolution of learning algorithms.
Understanding the Widrow-Hoff Algorithm
The Widrow-Hoff learning algorithm, also known as the Least Mean Squares (LMS) algorithm, is used in machine learning, deep learning, and adaptive signal processing. It is primarily used for supervised learning, where the system iteratively adjusts its parameters to approximate a desired target function. It operates by updating the weights of a linear predictor so that the predicted output converges to the actual output over time.
Weight Update Rule in the Widrow-Hoff Algorithm
The update rule governs how the weights of the Adaptive Linear Neuron (ADALINE) are adjusted based on the error between the expected and observed outputs. The weight update rule in the Widrow-Hoff algorithm is given by:
w(t+1)=w(t)+η(d(t)−y(t))x(t)
Here,
- w(t) and w(t+1) are the weight vectors before and after the update, respectively.
- η is the learning rate, a small positive constant that determines the step size of the weight update.
- d(t) is the desired output at time t.
- y(t) is the predicted output at time t.
- x(t) is the input vector at time t.
Interpretation
- Error Signal: d(t)−y(t) is the error. A positive error means the predicted output needs to increase; a negative error means it needs to decrease.
- Learning Rate: η scales the error for the weight update. A larger error (or a larger learning rate) results in a bigger adjustment to the weights.
- Direction of Update: x(t) dictates the direction of the weight update. A positive error adjusts the weights in the direction of the input; a negative error adjusts them in the opposite direction.
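To make the rule concrete, here is a minimal sketch of a single update step in Python. The weights, input, target, and learning rate below are illustrative values, not taken from any particular dataset.

import numpy as np

# One Widrow-Hoff update step (illustrative values)
w = np.array([0.0, 0.0])    # current weights w(t)
x = np.array([1.0, 2.0])    # input vector x(t)
d = 1.0                     # desired output d(t)
eta = 0.1                   # learning rate

y = np.dot(w, x)            # predicted output y(t) = 0.0
error = d - y               # error = 1.0
w = w + eta * error * x     # updated weights w(t+1) = [0.1, 0.2]
print(w)                    # [0.1 0.2]

Because the error is positive, each weight moves in the direction of its input component, exactly as described above.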
Working Principle of the Widrow-Hoff Algorithm
The key steps of the Widrow-Hoff algorithm are:
- Initialization: Initialize weights randomly.
- Iterative Update: for each training example,
  - Compute the predicted output by taking the dot product of the input features and the current weights.
  - Calculate the error between the predicted output and the true output.
  - Update the weights by adding the learning rate multiplied by the error and the input features (a stochastic gradient descent step).
- Convergence: Repeat step 2 for multiple epochs or until the weights converge; a minimal sketch of such a stopping check follows this list.
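The implementation below runs for a fixed number of epochs. If you prefer to stop once training has converged, one option is to track the mean squared error between epochs; the function name and tolerance here are illustrative assumptions, not part of the original algorithm description.

import numpy as np

# A minimal sketch of LMS with an early-stopping check on mean squared error
# (the function name and tolerance are illustrative assumptions)
def lms_until_convergence(X, y, learning_rate=0.01, max_epochs=1000, tol=1e-6):
    weights = np.zeros(X.shape[1])
    prev_mse = np.inf
    for _ in range(max_epochs):
        for i in range(X.shape[0]):
            error = y[i] - np.dot(X[i], weights)
            weights += learning_rate * error * X[i]
        mse = np.mean((y - X @ weights) ** 2)
        if abs(prev_mse - mse) < tol:  # stop once the error plateaus
            break
        prev_mse = mse
    return weights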

Implementing Widrow-Hoff Algorithm for Linear Regression Problem
We will implement the Widrow-Hoff (LMS) learning algorithm using Python and NumPy to learn the weights of a linear regression model, apply it to synthetic data, and print the true weights alongside the learned weights. We follow these steps:
Step 1: Define the Widrow-Hoff Learning Algorithm
The widrow_hoff_learning function takes the input features (X), target values (y), learning rate, and number of epochs. It initializes random weights, iterates over the dataset for the specified number of epochs, calculates predictions, computes errors, and updates the weights using the Widrow-Hoff update rule.
# Define the Widrow-Hoff (LMS) learning algorithm
def widrow_hoff_learning(X, y, learning_rate=0.01, epochs=100):
    num_features = X.shape[1]
    weights = np.random.randn(num_features)  # Initialize weights randomly
    for _ in range(epochs):
        for i in range(X.shape[0]):
            prediction = np.dot(X[i], weights)
            error = y[i] - prediction
            weights += learning_rate * error * X[i]
    return weights
Step 2: Generate Random Dataset
Random data is generated using NumPy. X is a matrix of 100 samples with 2 features, and true_weights holds the true weights used for generating y. Noise is added to y to simulate real data.
# Generate some synthetic training data
np.random.seed(0)
X = np.random.randn(100, 2)  # 100 samples with 2 features
true_weights = np.array([3, 5])  # True weights for generating y
y = np.dot(X, true_weights) + np.random.randn(100)  # Add noise to simulate real data
Step 3: Add a Bias Term
A bias term (constant feature) is added to the feature matrix X to account for the intercept term in the linear model.
# Add a bias term (constant feature) to X
X = np.concatenate([X, np.ones((X.shape[0], 1))], axis=1)
Step 4: Apply the Widrow-Hoff Algorithm to learn weights
The widrow_hoff_learning function is called with the generated data X and y to learn the weights.
# Apply the Widrow-Hoff algorithm to learn weights
learned_weights = widrow_hoff_learning(X, y)
Complete Implementation of Widrow-Hoff Algorithm for Linear Regression Problem
Python

import numpy as np

# Define the Widrow-Hoff (LMS) learning algorithm
def widrow_hoff_learning(X, y, learning_rate=0.01, epochs=100):
    num_features = X.shape[1]
    weights = np.random.randn(num_features)  # Initialize weights randomly
    for _ in range(epochs):
        for i in range(X.shape[0]):
            prediction = np.dot(X[i], weights)
            error = y[i] - prediction
            weights += learning_rate * error * X[i]
    return weights

# Generate some synthetic training data
np.random.seed(0)
X = np.random.randn(100, 2)  # 100 samples with 2 features
true_weights = np.array([3, 5])  # True weights for generating y
y = np.dot(X, true_weights) + np.random.randn(100)  # Add noise to simulate real data

# Add a bias term (constant feature) to X
X = np.concatenate([X, np.ones((X.shape[0], 1))], axis=1)

# Apply the Widrow-Hoff algorithm to learn weights
learned_weights = widrow_hoff_learning(X, y)

print("True Weights:", true_weights)
print("Learned Weights:", learned_weights)
Output:
True Weights: [3 5]
Learned Weights: [3.06124683 5.09292019 0.03074168]

The first two learned weights closely match the true weights (3 and 5), and the learned bias is close to zero, as expected since the synthetic data was generated without an intercept.
Applications of Widrow-Hoff Algorithm
- Adaptive Filtering: Widrow-Hoff is used in adaptive filters that suppress unwanted noise or interference and enhance the desired signal; a minimal noise-cancellation sketch follows this list.
- Pattern Recognition: Widrow-Hoff helps categorize input data into predefined classes based on their features; ADALINE was an early linear classifier trained this way.
- Signal Processing: Widrow-Hoff is useful where signals are distorted or channel conditions change over time, as in echo cancellation and channel equalization.
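As an illustration of the adaptive-filtering use case, here is a minimal sketch of LMS noise cancellation. The signals, the interference path h, and all parameter values are assumptions made up for this example.

import numpy as np

# A minimal sketch of LMS adaptive noise cancellation; the signals, the
# interference path h, and all parameter values are illustrative assumptions.
np.random.seed(1)
n, taps, mu = 2000, 4, 0.01
signal = np.sin(2 * np.pi * 0.01 * np.arange(n))  # clean signal to recover
noise = np.random.randn(n)                        # reference noise from a second sensor
h = np.array([0.6, -0.3, 0.2, 0.1])               # unknown path from noise source to primary sensor

# The primary sensor records the signal plus a filtered version of the noise
interference = np.array([np.dot(h, noise[t - taps + 1:t + 1][::-1]) if t >= taps - 1 else 0.0
                         for t in range(n)])
corrupted = signal + interference

w = np.zeros(taps)                                # adaptive filter weights
recovered = np.zeros(n)
for t in range(taps - 1, n):
    x = noise[t - taps + 1:t + 1][::-1]           # recent noise samples, newest first
    e = corrupted[t] - np.dot(w, x)               # error = estimate of the clean signal
    w += mu * e * x                               # Widrow-Hoff (LMS) update
    recovered[t] = e

print("Learned interference path:", np.round(w, 2))  # should approach h

After adaptation, w approximates h, so the error signal e carries the recovered sinusoid rather than the interference.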