How to read multiple excel spreadsheets from one excel workbook?

We have multiple workbooks that contain annual data that is split up into worksheets by months.

Each workbook includes the beginning year, “2011” as shown in this first example, below.

Each worksheet has an identical format, and is named “JANUARY”, “FEBRUARY”, MARCH"…

One tedious way to import each sheet is to create an R Script Data Frame for each sheet by copying and pasting this code:

exploratory::read_excel_file( "C:\\FY22_Projects\\...\\Field Observation Data 2011-2012.xls", sheet = 1, na = c('','NA'), skip=15, col_names=TRUE, trim_ws=TRUE, tzone='America/Chicago', col_types = "text") %>%
  exploratory::clean_data_frame()

and then increasing the sheet number by one each time.

But how do we import all the worksheets from all of the workbooks and merge them?

Thinking this is a similar question to one you all helped me with earlier this year, but specific to Excel files.

Lastly, I found on this Cheatsheet that there is a way to use map_dfr to read in multiple sheets but am struggling to make it work.

Thank you for taking a look at this.

I found that this stackoverflow article is helpful. Here is what I tried and it can read all the tabs in an Excel file.

filename<-"/Users/kei/Downloads/Untitled.xlsx"
sheets <- readxl::excel_sheets(filename)
x <- lapply(sheets, function(X) readxl::read_excel(filename, sheet = X, skip=0))
bind_rows(x)

1 Like

This works great for me! For one Excel workbook. How about if I want to import all the Excel workbooks in a folder?

Can we do something with “searchAndReadExcelFiles”?

This script allows me to import one sheet from all 10 workbooks:

exploratory::searchAndReadExcelFiles(folder = "C:/FY22_Projects/.../folder", pattern = "*.xls|*.xlsx|*.xlsm", sheet = "SEPTEMBER", na = c('','NA'), skip=15, col_names=TRUE, trim_ws=TRUE, tzone='America/Chicago', col_types = "text") %>%
  exploratory::clean_data_frame()

Is there a way to combine these functions? Thanks!

This article helps how to read multiple files in a folder.

https://exploratory.io/note/kanaugust/Reading-Multiple-Excel-or-CSV-Files-Together-Rna9YmN0Sr

Instead of directly calling read_excel, you can call a function that reads all the tabs in a sheet.

read_all_tabs <- function(filename) {
    sheets <- readxl::excel_sheets(filename)
    x <- lapply(sheets, function(X) readxl::read_excel(filename, sheet = X, skip=0))
    bind_rows(x)
}

files <- list.files(path = "/Users/kei/Downloads/excelFolder", pattern = "*.xlsx", full.names = T)
tbl <- sapply(files, read_all_tabs, simplify=FALSE)
bind_rows(tbl)

1 Like

Thank you again Kei. This worked perfectly.

Just a quick little note - if someone wanted to add the filepath and name of the imported file, just add %>% bind_rows(.id = "id") after the sapply function, like this

read_all_tabs <- function(filename) {
    sheets <- readxl::excel_sheets(filename)
    x <- lapply(sheets, function(X) readxl::read_excel(filename, sheet = X, skip=0))
    bind_rows(x)
}

files <- list.files(path = "/Users/kei/Downloads/excelFolder", pattern = "*.xlsx", full.names = T)
tbl <- sapply(files, read_all_tabs, simplify=FALSE) %>% bind_rows(.id = "id")
bind_rows(tbl)