Multivariate Data Visualization with R
Last Updated : 28 Apr, 2025
Multivariate data visualization is the practice of plotting data that has several variables so that the relationships among them become visible. R, which is widely used for data visualization, provides a number of tools for this purpose. Common plot types include:
- Scatterplots: show the relationship between two variables, with the color, shape, or size of the points encoding additional variables.
- Line Graphs: compare the trends of two variables or show the trends of several variables over time.
- Bar Charts: compare values across several categorical variables by stacking or grouping the bars.
- Heatmaps: use color to represent the values of the variables, with rows and columns representing the different variables.
- Boxplots: compare the distribution of a continuous variable across categories.
- Correlation Matrix: shows the pairwise correlations between variables, usually displayed as a heatmap or a similar graphical display.
The primary objective of multivariate data visualization is to find patterns and relationships in the data and to communicate these findings effectively.
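Boxplots and grouped bar charts from the list above can be sketched with base R graphics alone. The following is a minimal illustration using the built-in "mtcars" dataset; the choice of variables (mpg, cyl, gear) is an assumption made for the example.

```r
# Boxplot: distribution of a continuous variable (mpg) within each category (cyl)
boxplot(mpg ~ cyl, data = mtcars,
        xlab = "Number of cylinders", ylab = "Miles per gallon",
        main = "MPG by cylinder count")

# Grouped bar chart: number of cars for each gear count within each cylinder class
counts <- table(gear = mtcars$gear, cyl = mtcars$cyl)
barplot(counts, beside = TRUE,
        legend.text = paste(rownames(counts), "gears"),
        xlab = "Number of cylinders", ylab = "Number of cars",
        main = "Cars by cylinders and gears")
```

Setting beside = TRUE groups the bars side by side; leaving it at its default of FALSE stacks them instead.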
Scatterplot Matrix
A scatterplot matrix shows the pairwise relationships between several variables in a grid of scatterplots. In R, it can be generated with the "pairs()" function. Here is an example using the built-in iris dataset.
R
library(datasets)
data(iris)

# Plot the four numeric measurements; Species (column 5) is used only for color
pairs(iris[, 1:4], col = as.integer(iris$Species), pch = 18)
Output:
Scatter plot matrix for multivariate data visualizations
The scatterplots in this example are colored according to species, and every point is drawn with the same symbol.
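A scatterplot matrix shows the shape of each relationship but not its strength as a number. As a rough sketch, the correlation coefficients can be computed with cor() and written into the upper panels of the matrix. The helper name panel_cor is hypothetical and defined here only for illustration.

```r
# Hypothetical helper: writes the Pearson correlation in a panel instead of points
panel_cor <- function(x, y, ...) {
  old <- par(usr = c(0, 1, 0, 1))   # temporarily switch to unit coordinates
  on.exit(par(old))                 # restore the panel's coordinates afterwards
  text(0.5, 0.5, format(cor(x, y), digits = 2), cex = 1.5)
}

# Numeric correlation matrix of the four iris measurements
corr <- cor(iris[, 1:4])
print(round(corr, 2))

# Scatterplots below the diagonal, correlation values above it
pairs(iris[, 1:4],
      upper.panel = panel_cor,
      col = as.integer(iris$Species), pch = 18)
```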
Correlation Matrix Heatmap
A correlation matrix heatmap shows the pairwise correlations between several variables as colored cells. It can be constructed with the "ggcorrplot()" function from the "ggcorrplot" package. Here is an example using the built-in "mtcars" dataset:
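R
library(ggcorrplot)
data(mtcars)

# Pearson correlations between all pairs of columns
corr_matrix <- cor(mtcars)

# Draw the lower triangle only, with variables ordered by hierarchical clustering
ggcorrplot(corr_matrix,
           hc.order = TRUE,
           type = "lower",
           colors = c("#6D9EC1", "yellow", "#E46726"))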
Output:
Correlation heatmap for multivariate data visualizations
In this example, the correlation matrix is displayed as a heatmap, with orange denoting positive correlations, blue denoting negative correlations, and yellow marking correlations near zero. Only the lower triangle is drawn, and the variables are ordered by hierarchical clustering.
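Ordering by hierarchical clustering places strongly correlated variables next to each other. One common way to obtain such an ordering is sketched below in base R; it treats 1 minus the correlation as a distance, which is an assumption for illustration and not necessarily identical to what ggcorrplot does internally.

```r
corr_matrix <- cor(mtcars)

# Assumption: use (1 - correlation) as the dissimilarity between variables
dissimilarity <- as.dist(1 - corr_matrix)
ord <- hclust(dissimilarity)$order   # hclust() uses complete linkage by default

# Reorder rows and columns the same way so the matrix stays symmetric
corr_ordered <- corr_matrix[ord, ord]
print(round(corr_ordered, 2))
```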
Parallel Coordinate Plot
A parallel coordinate plot draws each observation as a line across a set of parallel axes, one per variable, which makes it easier to see the relationships between several variables and to spot patterns or outliers. It can be generated with the "ggparcoord()" function from the "GGally" package. Here is an example using the built-in "mtcars" dataset:
R
library(GGally)
data(mtcars)

# Axes: columns 1 to 4 (mpg, cyl, disp, hp); lines are grouped and colored by cyl
ggparcoord(mtcars,
           columns = 1:4,
           groupColumn = "cyl",
           alphaLines = 1.0)
Output:
Parallel Coordinate Plot for multivariate data visualizations
In this example, the lines are grouped according to the number of cylinders, and the same variable is used to color them.
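Because the variables in a parallel coordinate plot have different units, they are usually rescaled so that the axes are comparable. The idea can be sketched in base R with matplot(), assuming z-score standardization and a hand-picked set of columns; the color palette is likewise an arbitrary choice for illustration.

```r
vars <- c("mpg", "disp", "hp", "wt")    # columns chosen for illustration
scaled <- scale(mtcars[, vars])          # z-score each column

cyl_group <- factor(mtcars$cyl)
group_cols <- c("steelblue", "darkorange", "forestgreen")

# matplot() draws one line per matrix column, so transpose: one column per car
matplot(t(scaled), type = "l", lty = 1,
        col = group_cols[as.integer(cyl_group)],
        xaxt = "n", xlab = "", ylab = "Standardized value")
axis(1, at = seq_along(vars), labels = vars)
legend("topright", legend = levels(cyl_group),
       col = group_cols, lty = 1, title = "cyl")
```

Converting cyl to a factor gives each cylinder count its own discrete color instead of a continuous gradient.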
Bubble Chart
A bubble chart is a type of scatterplot in which the size of each point represents a third variable. This makes it possible to visualize the relationship between three variables at once, and mapping color to a further variable adds a fourth. A bubble chart can be made in R using the ggplot2 package.
R
library(ggplot2)

# x and y give the position, hp sets the bubble size, gear sets the bubble color
ggplot(data = mtcars, aes(x = mpg, y = wt, size = hp, color = gear)) +
  geom_point() +
  scale_size(range = c(1, 5)) +
  labs(title = "Bubble chart", x = "Miles per gallon", y = "Weight (1000 lbs)") +
  theme_bw()
Output:
Bubble Chart for multivariate data visualizations
In this bubble chart of the "mtcars" dataset, the "mpg" variable is plotted on the x-axis, the "wt" (weight) variable on the y-axis, the "hp" variable sets the size of the bubbles, and the "gear" variable sets their color. The "labs()" function adds a title and axis labels to the plot, while the "scale_size()" function sets the bubbles' size range. The "theme_bw()" function applies a dark-on-light theme with a white background; it does not change the colors of the data points.
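The same encoding can be sketched without ggplot2 using the base R symbols() function. In this illustration the radius is set to the square root of hp so that the circle area, not the radius, grows in proportion to hp; that scaling choice and the color palette are assumptions made for the example.

```r
gear_group <- factor(mtcars$gear)
gear_cols <- c("steelblue", "darkorange", "forestgreen")

# Radius proportional to sqrt(hp), so circle area is proportional to hp
radius <- sqrt(mtcars$hp)

symbols(mtcars$mpg, mtcars$wt, circles = radius,
        inches = 0.2,   # rescales the symbols so the largest is about this size
        bg = gear_cols[as.integer(gear_group)], fg = "grey30",
        xlab = "Miles per gallon", ylab = "Weight (1000 lbs)",
        main = "Bubble chart (base R)")
legend("topright", legend = levels(gear_group),
       pt.bg = gear_cols, pch = 21, title = "gear")
```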
Bubble Chart using Lattice
The Lattice package is an effective resource for developing multivariate visualizations in R. It is built on top of the grid graphics system, offers a high-level interface for a variety of plot types, and is widely used for multivariate data visualization. An example made with the Lattice package is shown below:
R
# Load the Lattice package
library(lattice)

# Load the mtcars dataset
data(mtcars)

# Create one scatterplot panel per cylinder count
xyplot(mpg ~ wt | factor(cyl),
       data = mtcars,
       groups = cyl,  # so that col and superpose.symbol apply per cylinder group
       main = "Bubble Diagram for Weight and MPG",
       xlab = "Weight (1000 lbs)",
       ylab = "Miles per Gallon",
       alpha = 0.8,
       col = rainbow(nlevels(factor(mtcars$cyl))),
       scales = list(cex = 1.5, x = list(alternating = 1)),
       par.settings = list(superpose.symbol = list(pch = 16),
                           layout.heights = list(bottom.padding = 4)))
Output:
Bubble Chart for multivariate data visualizations using Lattice
The code above uses the "xyplot()" function from the Lattice package to plot MPG against weight for the "mtcars" dataset. The formula "mpg ~ wt | factor(cyl)" splits the points into one panel per number of cylinders. The "alpha" argument adjusts the transparency of the points, and the "col" argument changes their colors. The layout and style of the chart can be modified using the "scales" and "par.settings" parameters. Point size is not mapped to a variable in this call, so the result is strictly a conditioned scatterplot rather than a true bubble chart.
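The "xyplot()" call above draws every point at the same size. As a sketch of how a third variable could drive the bubble size in Lattice, a custom panel function can look up a per-row size through the "subscripts" argument. The linear rescaling of hp to a range of 0.5 to 3 is an assumption chosen for illustration.

```r
library(lattice)

# Assumption: rescale hp linearly to a symbol expansion factor between 0.5 and 3
hp_range <- range(mtcars$hp)
bubble_cex <- 0.5 + 2.5 * (mtcars$hp - hp_range[1]) / diff(hp_range)

p <- xyplot(mpg ~ wt | factor(cyl), data = mtcars,
            layout = c(3, 1),
            xlab = "Weight (1000 lbs)", ylab = "Miles per Gallon",
            panel = function(x, y, subscripts, ...) {
              # subscripts holds the rows of mtcars that fall in this panel
              panel.xyplot(x, y, cex = bubble_cex[subscripts],
                           pch = 16, alpha = 0.6)
            })
print(p)
```

Indexing with subscripts matters because each panel only receives its own subset of x and y; without it the sizes would not line up with the points.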
Conclusion
Lattice is well suited to multivariate data visualization because of its trellis graphics system, its flexibility, its range of plot types, and its support for conditioned plots such as the one shown above.
Its open-source nature, documentation, active user community, compatibility with other R packages, and capacity for handling large datasets also contribute to its popularity. Overall, by mapping variables to different aesthetic features of a plot, such as position, color, and size, we can produce visualizations that clearly illustrate the relationships and patterns in our data.