The dataset consists of 8 columns, described below:

Day of Month

The specified day of the month

Day of Week

The specified day of the week

Reporting Carrier

Unique Carrier Code. When the same code has been used by multiple carriers, a numeric suffix is used for earlier users, for example, PA, PA(1), PA(2). Use this field for analysis across a range of years.


Origin Airport


Destination Airport

Departure Delay Minutes

Difference in minutes between scheduled and actual departure time. Early departures are set to 0.

Arrival Delay Minutes

Difference in minutes between scheduled and actual arrival time. Early arrivals are set to 0.

Weather Delay

Weather delay in minutes
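
The eight columns above could be exposed to Hive as an external table over files already sitting in HDFS. The following is only a sketch under stated assumptions: the table name `flights`, every column name, the column types, the comma delimiter, the header row, and the HDFS path are all hypothetical and are not given by the dataset description.

```sql
-- Hypothetical external table over the flight-delay files in HDFS.
-- Table name, column names, types, delimiter, and LOCATION are assumptions.
CREATE EXTERNAL TABLE IF NOT EXISTS flights (
  day_of_month       INT,     -- Day of Month
  day_of_week        INT,     -- Day of Week
  reporting_carrier  STRING,  -- Unique carrier code, e.g. PA, PA(1)
  origin             STRING,  -- Origin Airport
  dest               STRING,  -- Destination Airport
  dep_delay_minutes  DOUBLE,  -- Departure delay, early departures = 0
  arr_delay_minutes  DOUBLE,  -- Arrival delay, early arrivals = 0
  weather_delay      DOUBLE   -- Weather delay in minutes
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS TEXTFILE
LOCATION '/user/hive/flights'                  -- hypothetical HDFS directory
TBLPROPERTIES ('skip.header.line.count'='1');  -- assumes a header row; drop if there is none
```

Because the table is external, dropping it removes only the Hive metadata and leaves the underlying HDFS files in place.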

Querying data stored in HDFS with Hive

Problem statement:

Write an HQL statement that lists all flights whose departure delay is greater than the average departure delay, and shows by how many minutes each of those delays exceeds the average.
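
One possible way to approach this, as a sketch rather than a definitive answer: compute the average once in a subquery, join it to every row, and keep only the rows above it. The table name `flights` and the column names used here (`dep_delay_minutes` and the rest) are assumptions for illustration; the dataset description does not specify them.

```sql
-- Assumes a Hive table `flights` with a numeric column `dep_delay_minutes`
-- (both names are hypothetical).
SELECT
  f.day_of_month,
  f.day_of_week,
  f.reporting_carrier,
  f.origin,
  f.dest,
  f.dep_delay_minutes,
  a.avg_dep_delay,
  -- How far this flight's delay exceeds the overall average
  f.dep_delay_minutes - a.avg_dep_delay AS minutes_above_average
FROM flights f
CROSS JOIN (
  -- Single-row subquery holding the overall average departure delay
  SELECT AVG(dep_delay_minutes) AS avg_dep_delay
  FROM flights
) a
WHERE f.dep_delay_minutes > a.avg_dep_delay;
```

The subquery returns exactly one row, so the cross join simply attaches the average to every flight without multiplying rows. Some Hive configurations restrict cartesian joins; in that case a window function, `AVG(dep_delay_minutes) OVER ()` computed in an inner query and filtered in an outer one, is an alternative. Note that `AVG` ignores NULL values, so flights with a missing departure delay neither affect the average nor appear in the result.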
