Convert dataframe to data.table in R
Last Updated : 17 May, 2021
In this article, we will discuss how to convert a dataframe to a data.table in the R programming language. data.table is an R package that provides an enhanced version of the dataframe. Characteristics of data.table:
- data.table doesn’t set or use row names
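- row numbers are printed followed by a colon (:), which visually separates them from the first column
- columns of character type are never converted to factors by default in data.table. (Base data.frame() did convert them by default before R 4.0.0; since R 4.0.0 it also keeps them as character.)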
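The characteristics listed above can be checked directly. The following is a minimal sketch that assumes the data.table package is installed; the column names and values are made up for illustration.

```r
library(data.table)

# Hypothetical sample data, not taken from the article
dt <- data.table(id = 1:3, name = c("x", "y", "z"))

# Character columns stay character; they are not turned into factors
class(dt$name)   # "character"

# A data.table only has the default row identifiers 1..n
rownames(dt)     # "1" "2" "3"

# Printing shows each row number followed by a colon
print(dt)
```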
Method 1 : Using setDT() method
While dataframes are part of base R, the data.table class comes from the data.table package, which must be installed with install.packages("data.table") and loaded with library(data.table). The setDT() method coerces a dataframe or a list into a data.table by reference: the original object itself is modified in place, and no copy of the data is made.
Syntax: setDT(x)
Arguments :
- x : A named or unnamed list, data.frame or data.table.
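As a rough illustration of the by-reference behaviour, the sketch below calls setDT() on a dataframe and on a named list without assigning the result. The objects df and lst are hypothetical examples, not part of the article's data.

```r
library(data.table)

# Hypothetical inputs for illustration
df  <- data.frame(x = 1:3, y = c("a", "b", "c"))
lst <- list(p = c(1.5, 2.5), q = c("u", "v"))

class(df)            # "data.frame" before the conversion

# No assignment is needed: setDT() changes the objects in place
setDT(df)
setDT(lst)

class(df)            # "data.table" "data.frame"
is.data.table(lst)   # TRUE
```

Because nothing is copied, this form is usually preferred for large objects that are no longer needed as plain dataframes.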
Example 1:
    # load the required library
    library(data.table)

    # declare a dataframe
    data_frame <- data.frame(col1 = 1:7,
                             col2 = LETTERS[1:7],
                             col3 = letters[1:7])

    print("Original DataFrame")
    print(data_frame)

    # convert to data.table by reference
    setDT(data_frame)

    print("Resultant DataFrame")
    print(data_frame)
Output
    [1] "Original DataFrame"
      col1 col2 col3
    1    1    A    a
    2    2    B    b
    3    3    C    c
    4    4    D    d
    5    5    E    e
    6    6    F    f
    7    7    G    g
    [1] "Resultant DataFrame"
       col1 col2 col3
    1:    1    A    a
    2:    2    B    b
    3:    3    C    c
    4:    4    D    d
    5:    5    E    e
    6:    6    F    f
    7:    7    G    g
All missing (NA) values stored in a dataframe are preserved in the data.table as well. Custom row names are dropped, and rows are identified only by integer row numbers running from 1 to the number of rows in the dataframe. The data.table package also provides is.data.table() to verify whether an R object is a data.table: is.data.table(data_frame) returns TRUE if the argument is a data.table and FALSE otherwise.
Example 2:
    # load the required library
    library(data.table)

    # declare a dataframe with missing values and custom row names
    data_frame <- data.frame(col1 = c(1, NA, 4, NA, 3, NA),
                             col2 = c("a", NA, "b", "e", "f", "G"),
                             row.names = c("row1", "row2", "row3",
                                           "row4", "row5", "row6"))

    print("Original DataFrame")
    print(data_frame)

    # convert to data.table by reference
    setDT(data_frame)

    print("Resultant DataFrame")
    print(data_frame)

    # check whether the object is now a data.table
    print("Check if data table")
    print(is.data.table(data_frame))
Output
    [1] "Original DataFrame"
         col1 col2
    row1    1    a
    row2   NA <NA>
    row3    4    b
    row4   NA    e
    row5    3    f
    row6   NA    G
    [1] "Resultant DataFrame"
       col1 col2
    1:    1    a
    2:   NA <NA>
    3:    4    b
    4:   NA    e
    5:    3    f
    6:   NA    G
    [1] "Check if data table"
    [1] TRUE
Explanation: The original object is created as a data.frame with custom row names. After setDT(), the same object is a data.table: the custom row names are gone, and each row is printed with its row number followed by a colon. The missing values (NA) are kept as they are. Since the change is made to the object itself, is.data.table(data_frame) returns TRUE.
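If the row names carry information, they do not have to be lost. setDT() also accepts a keep.rownames argument; the article only documents this argument for as.data.table(), so treat the sketch below as a supplementary illustration on a shortened, hypothetical version of the example data.

```r
library(data.table)

# Shortened, hypothetical version of the example dataframe
df <- data.frame(col1 = c(1, NA, 4),
                 col2 = c("a", NA, "b"),
                 row.names = c("row1", "row2", "row3"))

# Convert in place and move the row names into a column called "rn"
setDT(df, keep.rownames = TRUE)

names(df)        # "rn" "col1" "col2"
df$rn            # "row1" "row2" "row3"
anyNA(df$col1)   # TRUE: the NA values are still there
```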
Method 2 : Using as.data.table() method
The as.data.table() method coerces a dataframe or a list into a data.table when the object is not already a data.table and the conversion is possible. It does not modify the original dataframe; instead, it returns a new data.table and leaves the original object unchanged.
Syntax: as.data.table(x, keep.rownames = FALSE)
Arguments :
- x : A named or unnamed list, data.frame or data.table.
- keep.rownames : FALSE by default. For data.frames, TRUE keeps the row names in a new first column named rn; keep.rownames = "id" names that column "id" instead.
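The three settings of keep.rownames described above can be compared side by side. This is a small sketch on a hypothetical two-row dataframe; the names df, dt_default, dt_rn and dt_id are made up for illustration.

```r
library(data.table)

# Hypothetical dataframe with meaningful row names
df <- data.frame(score = c(10, 20), row.names = c("alice", "bob"))

dt_default <- as.data.table(df)                        # row names dropped
dt_rn      <- as.data.table(df, keep.rownames = TRUE)  # kept in column "rn"
dt_id      <- as.data.table(df, keep.rownames = "id")  # kept in column "id"

names(dt_default)   # "score"
names(dt_rn)        # "rn" "score"
names(dt_id)        # "id" "score"
```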
Example:
    # load the required library
    library(data.table)

    # declare a dataframe with missing values and custom row names
    data_frame <- data.frame(col1 = c(1, NA, 4, NA, 3, NA),
                             col2 = c("a", NA, "b", "e", "f", "G"),
                             row.names = c("row1", "row2", "row3",
                                           "row4", "row5", "row6"))

    print("Original DataFrame")
    print(data_frame)

    # convert to a new data.table, keeping the row names in column "rn"
    dt <- as.data.table(data_frame, keep.rownames = TRUE)

    print("Resultant DataFrame")
    print(dt)

    # check whether the result is a data.table
    print("Check if data table")
    print(is.data.table(dt))
Output
    [1] "Original DataFrame"
         col1 col2
    row1    1    a
    row2   NA <NA>
    row3    4    b
    row4   NA    e
    row5    3    f
    row6   NA    G
    [1] "Resultant DataFrame"
         rn col1 col2
    1: row1    1    a
    2: row2   NA <NA>
    3: row3    4    b
    4: row4   NA    e
    5: row5    3    f
    6: row6   NA    G
    [1] "Check if data table"
    [1] TRUE
Explanation: The original dataframe has a row name for each of its rows. When the dataframe is converted to a data.table with keep.rownames = TRUE, the row names form a separate column "rn", and each row is led by a row number followed by a colon. However, the original dataframe is not changed. So, when we apply is.data.table() to the original dataframe, it returns FALSE. In contrast, applying it to the result of as.data.table() returns TRUE.
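The difference between the original object and the converted copy can be verified with a short check. The sketch below uses a shortened, hypothetical version of the example dataframe.

```r
library(data.table)

# Shortened, hypothetical version of the example dataframe
data_frame <- data.frame(col1 = c(1, NA, 4),
                         row.names = c("row1", "row2", "row3"))

# as.data.table() returns a new object; data_frame itself is untouched
dt <- as.data.table(data_frame, keep.rownames = TRUE)

is.data.table(data_frame)   # FALSE
is.data.table(dt)           # TRUE
rownames(data_frame)        # "row1" "row2" "row3"
```

This makes as.data.table() the safer choice when the original dataframe is still needed later in the script.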