Exploratory Community

There are multiple tables in the same Excel spreadsheet. How can I import one that I want?

#1

Hi all.

I downloaded an excel file and I’m trying to import a table data from one of sheets in that file. The problem is, there are a couple of other tables in the same sheet. How can I import one that I want?

I’m using “Residential and nonresidential building fire and fire loss estimates by property use and cause (2003-2014)” https://www.usfa.fema.gov/downloads/xls/statistics/residential_nonresidential_fire_loss_estimates.xlsx from US Fire Administration page https://www.usfa.fema.gov/data/statistics/order_download_data.html, and I’m trying to grab the 1st table “OVERALL Residential Building Fires - Causes” table on “Res Bldg Fire Est by Causes” tab. I can open the sheet in the import dialog (see the pic following), but I don’t know what to do next.

Please let me know if you have any good idea.
Thanks,
Riku

#2

Hi Riku.

You are on the right track! Here’s what I did.

1). At the import dialog, set skip first N rows and click ‘Get Data’ again. Then you see a better data view like this. If you scroll down, you still see other tables that you don’t want. But here just go ahead and click “Save”.

2). Once you finish importing, you will see a summary view of the imported data frame. Click “Table” tab in the middle. Then you can check the data.

3). If you look at the data, you can easily see you just need first 12 rows. You can use slice command to crop only first 12 rows. Type slice(1:12) at the command box and click ‘Run’, then you see what you want.

Hope this helps.
–Kei

2 Likes
#3

Thanks Kei!
This works on my end too!

Thanks again.
Riku

#4

As a follow-up, is there a way to automate importing the first table from multiple spreadsheets (each with multiple tables) where the number of rows is inconsistent, but there is always a blank row between tables?

#5

Here is the dataset I’m looking at:
image

Perhaps it would be possible to determine the first row that has an <NA> (161) and then slice at that position?

#6

How about creating a column like this?

mutate(the_first_na_rownum = sum(case_when(!is.na(End_date)~1, TRUE~0)))

This will create a column called the_first_na_rownum and the value is populated with the row number of the first NA value row.

1 Like
#7

You can use “Remove Empty Rows” to remove those rows.

#8

Kei, thank you and sorry the image is misleading but the next table starts up at row 166.

#9

Hide, Thank you. For some reason it’s returning “632:” instead of “161”.

#10

Is there non NA value in End_date after line number 161?

#11

Yes, there are multiple tables in the the spreadsheet, I just want to import the first one.