When you use the ‘summarize’ command, the data is collapsed to one row per group.
Original data:
After Summarize:
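As a rough illustration of that collapse, here is a minimal sketch in R with dplyr. The `flights` data frame and its values are made up for this example; only the `CARRIER` and `ARR_DELAY` column names come from the commands used in this post.

```r
library(dplyr)

# Hypothetical toy data: the values are invented for illustration.
flights <- data.frame(
  CARRIER   = c("AA", "AA", "DL", "DL", "DL"),
  ARR_DELAY = c(10, 20, 5, NA, 15)
)

# summarize() collapses each group to a single row.
summarized <- flights %>%
  group_by(CARRIER) %>%
  summarize(average_arr_delay = mean(ARR_DELAY, na.rm = TRUE))

print(summarized)  # one row per CARRIER, so 2 rows instead of 5
```

The five original rows are gone; only one row per carrier remains, carrying the group average.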
But sometimes you don’t want to ‘summarize’ the data. Instead, you want to keep all the original rows and add one new column that holds the aggregated values, like below.
You can do this in two simple steps:
1. Group the data
group_by(CARRIER)
2. Use ‘mutate’ instead of ‘summarize’.
mutate(average_arr_delay = mean(ARR_DELAY, na.rm=TRUE))
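This keeps every row of the original data and simply adds a new column holding the result of the calculation defined in the ‘mutate’ command. Because the data is grouped, the calculation runs once per group, so all rows in the same group get the same value.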
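The two steps can be chained into one pipeline in plain R with dplyr. This is only a sketch: the `flights` data frame and its values are hypothetical, and the trailing `ungroup()` is an optional addition that is not part of the two steps above.

```r
library(dplyr)

# Hypothetical toy data: the values are invented for illustration.
flights <- data.frame(
  CARRIER   = c("AA", "AA", "DL", "DL", "DL"),
  ARR_DELAY = c(10, 20, 5, NA, 15)
)

with_avg <- flights %>%
  group_by(CARRIER) %>%                                          # step 1: group the data
  mutate(average_arr_delay = mean(ARR_DELAY, na.rm = TRUE)) %>%  # step 2: mutate, not summarize
  ungroup()  # optional: drop the grouping so later steps see the whole table

print(with_avg)  # still 5 rows, with the carrier average repeated on each row
```

Having the group average next to each row makes it easy to go one step further, for example computing how far each flight's delay is from its carrier's average.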