

For this assignment, you will use the following three data sets: US_airlines.csv, US_airports.csv and US_airrecord.csv.

Using R, you will prepare and explore the datasets using data cleaning and analysis techniques, then discuss the trends and points of interest you discover.

Steps you will complete should include:

Inspect and summarise your data.

Clean and combine datasets where appropriate

· Check for and handle missing values

· Remove any unnecessary variables

· Transform any variables that you would like to use in a different form

Plot data and identify trends and/or points of interest

Perform data analysis to investigate

· The airlines which experience the most delays

· The busiest routes

· The relationship between distance between airports and flying time

Predicting flying time based on distance

Discuss your findings, comment your code and prepare explanatory visualisations.
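
The analysis steps above could be sketched in base R roughly as follows. This is only an illustration on a made-up toy data frame, not the assignment solution: the column names AIRLINE, ORIGIN_AIRPORT, DESTINATION_AIRPORT, DISTANCE and ARRIVAL_DELAY are assumptions about US_airrecord.csv, "arrived late" is taken to mean a delay above 0 minutes, and a straight-line model is only one possible choice for predicting flying time.

```r
# Toy stand-in for US_airrecord.csv. All values are invented, and every
# column name except AIR_TIME is an assumption about the real file.
flights <- data.frame(
  AIRLINE             = c("AA", "AA", "DL", "DL", "UA", "UA"),
  ORIGIN_AIRPORT      = c("JFK", "JFK", "ATL", "ATL", "JFK", "ORD"),
  DESTINATION_AIRPORT = c("LAX", "LAX", "ORD", "ORD", "LAX", "JFK"),
  DISTANCE            = c(300, 600, 900, 1200, 1500, 1800),
  AIR_TIME            = c(50, 85, 125, 160, 200, 235),
  ARRIVAL_DELAY       = c(5, 20, -3, 0, 45, 15),
  stringsAsFactors    = FALSE
)

# Airlines with the most delays: share of each airline's flights that
# arrived late (here, ARRIVAL_DELAY > 0 minutes).
delay_share <- aggregate(ARRIVAL_DELAY ~ AIRLINE, data = flights,
                         FUN = function(x) mean(x > 0))

# Busiest routes: count flights per origin-destination pair, largest first.
route_counts <- as.data.frame(
  table(route = paste(flights$ORIGIN_AIRPORT,
                      flights$DESTINATION_AIRPORT, sep = "-"))
)
route_counts <- route_counts[order(-route_counts$Freq), ]

# Distance vs flying time: fit a simple linear regression, then predict
# the flying time (minutes) for a hypothetical 1000-unit distance.
fit <- lm(AIR_TIME ~ DISTANCE, data = flights)
predicted <- predict(fit, newdata = data.frame(DISTANCE = 1000))

print(delay_share)
print(route_counts)
print(summary(fit)$coefficients)
```

On the real data, `plot(flights$DISTANCE, flights$AIR_TIME)` followed by `abline(fit)` would be a quick way to check whether a straight line is a reasonable fit before trusting the predictions.
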
Some observations I have made:

· Times are on the 24-hour clock in HHMM form, so 10 is 00:10 and 1542 is 15:42.

· Any time differences are in minutes.

· There are 19 flights that have a wheels-off time but also have a cancellation reason. What do we do with them? All of the reasons relate to the weather and the airline.

· TAIL_NUMBER: one value, 7819A, needs an N put in front of it.

· NA values in elapsed time need to be calculated as AIR_TIME + TAXI_IN + TAXI_OUT. First, however, the NA values in AIR_TIME need to be replaced with the duration between WHEELS_OFF and WHEELS_ON.

· Relevant data from US_airlines.csv and US_airports.csv needs to be added using the IATA_COD.

· For the variables below, NA means there was no delay, and 0 means there was a delay but not for that reason. Any other value is the length of the delay attributed to that reason. A delay can have more than one reason, e.g. both an air system delay and an airline delay. These variables need some transformation.

o AIR_SYSTEM_DELAY

o SECURITY_DELAY

o AIRLINE_DELAY

o LATE_AIRCRAFT_DELAY

o WEATHER_DELAY
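
One way the cleaning observations above might be turned into base R is sketched below on a toy data frame with invented values. Treat it as a starting point under stated assumptions: ELAPSED_TIME is an assumed column name, the helper `hhmm_to_minutes` and the new columns DELAYED and N_DELAY_REASONS are hypothetical names, WHEELS_OFF and WHEELS_ON are assumed to be in the same time zone, and a flight is assumed to last less than 24 hours. The sketch does not decide what to do with the 19 cancelled flights.

```r
# Toy rows with invented values. Column names follow the observations
# above, except ELAPSED_TIME, which is an assumed name.
flights <- data.frame(
  TAIL_NUMBER         = c("N123AA", "7819A", "N456DL"),
  WHEELS_OFF          = c(10, 1542, 2350),   # HHMM on the 24-hour clock
  WHEELS_ON           = c(130, 1710, 45),
  AIR_TIME            = c(80, NA, NA),
  TAXI_OUT            = c(12, 15, 10),
  TAXI_IN             = c(5, 7, 6),
  ELAPSED_TIME        = c(97, NA, NA),
  AIR_SYSTEM_DELAY    = c(NA, 0, 20),
  SECURITY_DELAY      = c(NA, 0, 0),
  AIRLINE_DELAY       = c(NA, 25, 15),
  LATE_AIRCRAFT_DELAY = c(NA, 0, 0),
  WEATHER_DELAY       = c(NA, 0, 0),
  stringsAsFactors    = FALSE
)

# Put the missing N in front of the one malformed tail number.
flights$TAIL_NUMBER[flights$TAIL_NUMBER == "7819A"] <- "N7819A"

# Hypothetical helper: convert HHMM (e.g. 1542) to minutes since midnight.
hhmm_to_minutes <- function(x) (x %/% 100) * 60 + (x %% 100)

# Fill missing AIR_TIME from the wheels-off/wheels-on times. The modulo
# handles flights that cross midnight. Assumes both times share a time zone.
off <- hhmm_to_minutes(flights$WHEELS_OFF)
on  <- hhmm_to_minutes(flights$WHEELS_ON)
missing_air <- is.na(flights$AIR_TIME)
flights$AIR_TIME[missing_air] <- ((on - off) %% 1440)[missing_air]

# Only then fill missing elapsed time as AIR_TIME + TAXI_IN + TAXI_OUT.
missing_elapsed <- is.na(flights$ELAPSED_TIME)
flights$ELAPSED_TIME[missing_elapsed] <-
  with(flights[missing_elapsed, ], AIR_TIME + TAXI_IN + TAXI_OUT)

# Delay reasons: record whether the flight was delayed at all (any non-NA
# reason column) before replacing NA with 0, then count reasons per flight.
delay_cols <- c("AIR_SYSTEM_DELAY", "SECURITY_DELAY", "AIRLINE_DELAY",
                "LATE_AIRCRAFT_DELAY", "WEATHER_DELAY")
flights$DELAYED <- unname(rowSums(!is.na(flights[delay_cols])) > 0)
flights[delay_cols] <- lapply(flights[delay_cols],
                              function(x) ifelse(is.na(x), 0, x))
flights$N_DELAY_REASONS <- unname(rowSums(flights[delay_cols] > 0))

print(flights)
```

Keeping the DELAYED flag separate preserves the distinction between "no delay" (originally NA) and "delayed, but not for this reason" (originally 0), which would otherwise be lost once the NA values become 0.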

Answered Same Day Feb 15, 2021

Solution

Rohith answered on Feb 18 2021