Kaggle 2015 Flight Delays and Cancellations dataset (with this fix for some airport codes). It has information about flights for major US airlines in 2015 (arrival/departure times, delay, flight ID, distance, ...) and information about airport cities/coordinates, airlines.

The data was retrieved via SQL queries in PostgresSQL database, using ROW_TO_JSON for export.

The most/least busiest months of the year

Flights per month

SQL
SELECT month, count(*)
FROM flights
GROUP BY month

Flights by date

Hover to see data at the selected point.
SQL
SELECT make_date(year, month, day) as date, count(*)
FROM flights
GROUP BY date

As we can see, the most busiest months are in the summer (especially July-August), probably because of holidays, also there are some increases in the beginning of Spring (Easter, Spring break?), and the least active period is from middle of January to middle of February.

Also it depends on day of the week: there is a huge drop of activity on Saturdays, probably related to business trips.

The most active airports

Line width depends on number of flights.

Select airport

SQL
SELECT origin_airport, destination_airport, count(*)
FROM flights
GROUP BY origin_airport, destination_airport

Most of the biggest airports (and flight destinations) are located near the east coast and in the south part of the west coast (LA, SF), the same as US population density.

Airports with the most/least delays

Circle size represents airport size (number of flights) and opacity shows its' delay rate (counting only delays that were longer than 10 minutes).

This map uses bigger scale to reduce circle sizes and overlapping, however it's still present in some cases.

SQL
SELECT origin_airport,
       ROUND(count(*) / (SELECT count(*) FROM flights WHERE origin_airport=f.origin_airport)::decimal, 4) as delay_rate,
       (SELECT count(*) FROM flights WHERE origin_airport=f.origin_airport) as flights_count
FROM flights as f
WHERE departure_delay > 10
GROUP BY origin_airport

From this map we can see that most airports have at least 10% of delayed flights.

Also it looks like delays are more likely in smaller airports, such as 2 airports on Alaska with 45+%. All of the bigger airports do not have such high delay rate.

Airlines with the most/least delays

Circle size represents airline size (number of flights) and Y axis shows its' delay rate (counting only delays that were longer than 10 minutes).

SQL
SELECT airline,
       ROUND(count(*) / (SELECT count(*) FROM flights WHERE f.airline=f.airline)::decimal, 4) as delay_rate,
       (SELECT count(*) FROM flights WHERE airline=f.airline) as flights_count
FROM flights as f
WHERE departure_delay > 10
GROUP BY airline

The chart shows that most of the delays occur on flights from big airlines, smaller airlines seem to have less delays.