Reading and writing in R: In R, we can read data from files, stored outside the R environment. In this chapter we will learn
- how to read from .csv file in R
- how to read from excel (.xlsx and .xls) file in R
- how to read from database file in R
- how to write into .csv file in R .
- how to write into excel (.xlsx and .xls) file in R
- how to write into database in R
Getting and Setting the Working Directory in R:
Before Reading we have to specify the location where we can find the respective file to be read.
For that we use the function setwd() which will set the specific location as working directory
# Get the current working directory. getwd() # output will be "C:/Users/username/Documents" # Set current working directory. setwd("D:/Folder_name") #now the working directory has been set to Folder_name in D Drive # Get the current working directory. getwd() # output will be "D:/Folder_name"
Reading a CSV File in R:
read.csv() is the function to read a CSV file in R.
(i) If the file is available in your current working directory
# read a csv file in R mydata = read.csv("input.csv") # reads the csv file in R object named mydata print(mydata)
(ii) If the file is available in some other location you have to specify the path along with the file name
# read a csv file in R mydata = read.csv("D:/other_folder/input.csv") #reads the file named "input.csv" from "other_folder" in "D drive" print(mydata)
In the Above examples csv Files are read and stored in the R object called “mydata”.
Writing into a CSV File in R:
In R to write data into csv file we use the function write.csv()
# Write data into a csv file in R write.csv(mydata,"output.csv") # contents in the object “mydata” are written to a csv file named “output.csv” in Current working directory Write.csv(mydata, “D:/other_folder/output.csv”) # similarly to write outside the working directory you have to provide the path along with file name
Reading From Excel(.xls and .xlsx) in R:
Microsoft Excel is the most widely used spreadsheet program which stores data in the .xls or .xlsx format. R can read directly from these files using some excel specific packages. Few such packages are – XLConnect, xlsx, gdata etc. We will be using “xlsx” package. R can also write into excel file using this package.
# read data from excel (.xls and .xlsx) file in R install.packages("xlsx") library(xlsx) # to read the data from nth sheet (say 4) mydata = read.xlsx("D:/myexcel.xlsx",4) # read the data from worksheet named mysheet1 mydata = read.xlsx ("D:/myexcel.xlsx", sheetName = "mysheet1")
Writing into excel file (.xls and .xlsx) in R:
write.xlsx() is the function used to write R object to excel file (.xls and .xlsx)
# write data into excel (.xls and .xlsx) file in R install.packages("xlsx") library(xlsx) # data in the object “mydata” is written in a file named ”dummy.xlsx” in D drive with sheet named ”Newdata” write.xlsx(mydata,"D:/dummy.xlsx",sheetName = "Newdata")
Read from Databases in R
Next lets see how to read a database file from R.For that first we have to setup connection with the function odbcConnect()
Below is an example of reading from an ODBC database. Function odbcConnect() sets up a connection to database, sqlQuery() sends an SQL query to the database, and odbcClose() closes the connection.
# read from data base in R install.packages("RODBC") library(RODBC) # establishes Connection connection = odbcConnect(dsn="servername",uid="userid",pwd="******") # Query from Database table query = "SELECT * FROM lib.table WHERE ..." myData = sqlQuery(connection, query, errors=TRUE) #Close the connection odbcClose(connection)
Read from PostGreSql in R:
lets look how to read the data from PostGreSql
install.packages("RPostgreSQL") library(RPostgreSQL) ## establish connection drv = dbDriver("PostgreSQL") con = dbConnect(drv, dbname="Databasename",host="x.xx.x.xx",port=xxxx,user="username",password="*****") ##query from postgres df_postgres = dbGetQuery(con, "SELECT * from Table_Name where …")
Write to PostGresql in R
install.packages("RPostgreSQL") library(RPostgreSQL) dbWriteTable(con, "New_Table", df_postgres)