Karla-Flores/Sqlalchemy


Sqlalchemy--Challenge


Background


This assignment involved analysing and exploring Hawaii's climate data in 2 steps:

  1. Climate Analysis and Exploration
  2. Climate App

Climate Analysis and Exploration


Python and SQLAlchemy were used for basic climate analysis and data exploration of the provided climate database. All of the analysis was completed using SQLAlchemy ORM queries, Pandas, and Matplotlib.

Precipitation Analysis

  • The datetime library was used to identify the date 12 months before the last date in the dataset. After dropping null values, the precipitation values for that final year of data were plotted in the following graph:

  • Screen Shot 2021-06-29 at 9 49 04 PM
  • A statistics summary using .describe() revealed the following:

Screen Shot 2021-06-29 at 9 53 48 PM
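
The date window and null filtering described above can be sketched in plain Python. This is a minimal illustration and not the project's notebook code: the rows are made-up values, and "12 months" is approximated as 365 days, which is an assumption.

```python
from datetime import date, timedelta

# Illustrative (date, precipitation) rows; the values are made up.
rows = [
    ("2016-08-01", 0.10),
    ("2016-12-24", None),  # missing reading
    ("2017-03-15", 0.45),
    ("2017-08-01", 0.00),
]

# ISO-formatted date strings sort chronologically, so max() finds the last one.
last_date = date.fromisoformat(max(d for d, _ in rows))

# Assumption: "12 months prior" is taken as 365 days before the last date.
cutoff = last_date - timedelta(days=365)

# Keep rows on or after the cutoff and drop null precipitation values.
last_year = [
    (d, prcp)
    for d, prcp in rows
    if date.fromisoformat(d) >= cutoff and prcp is not None
]

print(cutoff.isoformat())
print(last_year)
```

A fixed 365-day offset lands one day off when the window spans February 29, so a real analysis may need to handle leap years explicitly.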

Station Analysis

This section determined the number of stations in the dataset (nine) and the most active station (USC00519281).

Temperature observations at this station for the last 12 months were plotted as a histogram with the following results:


Screen Shot 2021-06-29 at 10 02 42 PM
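
The two station queries can be illustrated with plain SQL through the standard-library sqlite3 module. The project itself used SQLAlchemy ORM queries, so this is a stand-in and not the original code. The table name, column names, and station IDs below are hypothetical, and "most active" is assumed to mean the station with the most observation rows.

```python
import sqlite3

# Hypothetical in-memory table standing in for the climate database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE measurement (id INTEGER PRIMARY KEY, station TEXT, tobs REAL)"
)
conn.executemany(
    "INSERT INTO measurement (station, tobs) VALUES (?, ?)",
    [
        ("STATION_A", 70.0),
        ("STATION_B", 72.0),
        ("STATION_B", 75.0),
        ("STATION_C", 68.0),
        ("STATION_B", 71.0),
    ],
)

# Number of distinct stations.
station_count = conn.execute(
    "SELECT COUNT(DISTINCT station) FROM measurement"
).fetchone()[0]

# Station with the most rows (assumed definition of "most active").
most_active, n_obs = conn.execute(
    """
    SELECT station, COUNT(*) AS n
    FROM measurement
    GROUP BY station
    ORDER BY n DESC
    LIMIT 1
    """
).fetchone()

print(station_count, most_active, n_obs)
```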


Climate App



The web app, defined in app.py, was built with SQLAlchemy and Flask. The following routes query the climate database and return results in JSON format:

  • Precipitation (/api/v1.0/precipitation): This route displays every date and precipitation observation across all weather stations in Hawaii.
  • Stations (/api/v1.0/stations): This route displays a list of all 9 stations (ID, Station and Name).
  • Temperature Observations (/api/v1.0/tobs): This route displays every date and temperature observation for the most active station in Hawaii (USC00519281) in the last 12 months of data available.
  • Daily Normals from start date (/api/v1.0/start_date): This route allows you to enter a start date in the format 'YYYY-MM-DD' to retrieve daily normals (TMIN, TAVG, TMAX) from that date onward until the end of data available.
  • Daily Normals between start and end date (/api/v1.0/start_date/end_date): This route allows you to enter a start date AND an end date in the format 'YYYY-MM-DD' to retrieve daily normals (TMIN, TAVG, TMAX) for the date range.
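
The logic behind the last two routes can be sketched without Flask. The contents of app.py are not shown here, so the function name `temperature_summary` and the sample observations are hypothetical. The sketch only illustrates parsing 'YYYY-MM-DD' dates and computing TMIN, TAVG, and TMAX with an optional end date.

```python
from datetime import datetime
from statistics import mean


def parse_day(text):
    """Parse a 'YYYY-MM-DD' string; raises ValueError if it cannot be parsed."""
    return datetime.strptime(text, "%Y-%m-%d").date()


def temperature_summary(observations, start, end=None):
    """Return TMIN, TAVG, TMAX for observations from start to end (inclusive).

    observations: iterable of (date_string, temperature) pairs.
    When end is None, everything from start onward is included.
    """
    start_day = parse_day(start)
    end_day = parse_day(end) if end is not None else None
    temps = [
        temp
        for day, temp in observations
        if parse_day(day) >= start_day
        and (end_day is None or parse_day(day) <= end_day)
    ]
    if not temps:
        return None  # no data in the requested range
    return {"TMIN": min(temps), "TAVG": mean(temps), "TMAX": max(temps)}


# Made-up observations for illustration only.
obs = [
    ("2017-07-01", 74.0),
    ("2017-07-02", 78.0),
    ("2017-07-03", 76.0),
    ("2017-07-10", 80.0),
]
open_ended = temperature_summary(obs, "2017-07-02")
bounded = temperature_summary(obs, "2017-07-02", "2017-07-03")
print(open_ended)
print(bounded)
```

In a Flask route, a dictionary like this could be passed to `jsonify` to produce the JSON response.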

Bonus Challenge


Temperature Analysis I

June and December temperature observations were retrieved by converting string dates to DateTime objects to filter queries by month.

The average June temperature across all stations and all available years in the dataset is 74.94 °F, and the average December temperature is 71.04 °F. The mean difference between June and December is therefore only 3.9 °F.

Because the June and December samples share no observations, an unpaired (independent) t-test was used. The high t-value indicates that the two sample means differ substantially, and the low p-value indicates that a difference this large would be unlikely to arise by chance alone.

Screen Shot 2021-06-29 at 11 04 16 PM
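
For two samples that share no observations, the usual statistic is an independent two-sample t. The sketch below computes Welch's form of that statistic with the standard library only. It is an illustration and not the project's code: the temperatures are made up, and `welch_t` is a hypothetical helper.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples."""
    # Standard error of the difference in means, using sample variances.
    standard_error = sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Made-up temperatures for illustration only.
june = [76, 74, 75, 77, 73]
december = [71, 70, 72, 69, 73]

t_stat = welch_t(june, december)
print(t_stat)
```

The statistic alone does not give a p-value. With SciPy available, `scipy.stats.ttest_ind(june, december, equal_var=False)` returns both the statistic and the p-value.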

Temperature Analysis II

This challenge used a predefined function that calculates daily normals for a given date range (2018-06-01 to 2018-06-15). The timedelta class from the datetime library was also used to determine the matching start and end dates in the previous year.

With the daily normals, the following graph was plotted using tavg, tmin and tmax values:

Screen Shot 2021-06-29 at 11 10 42 PM

Daily Rainfall Average

In this challenge, the first task was to calculate the precipitation for each weather station and display the results alongside the station information. After querying both tables in the database and checking for null values, the query results were saved in Pandas data frames so that they could be combined using groupby and merge. The resulting data frame:

Screen Shot 2021-06-29 at 11 11 12 PM
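
The groupby-and-merge step might look like the sketch below. The column names (`station`, `prcp`, `name`, `elevation`), the station IDs, and all values are assumptions made for illustration; the project's actual data frames may differ.

```python
import pandas as pd

# Made-up measurement rows; one reading is missing.
measurements = pd.DataFrame(
    {
        "station": ["STATION_A", "STATION_A", "STATION_B", "STATION_B"],
        "prcp": [0.10, 0.30, None, 1.20],
    }
)

# Made-up station details.
stations = pd.DataFrame(
    {
        "station": ["STATION_A", "STATION_B"],
        "name": ["Example Station A", "Example Station B"],
        "elevation": [3.0, 14.6],
    }
)

# Drop null readings, then total the precipitation per station.
totals = (
    measurements.dropna(subset=["prcp"])
    .groupby("station", as_index=False)["prcp"]
    .sum()
)

# Attach station details and sort from wettest to driest.
result = totals.merge(stations, on="station", how="left").sort_values(
    "prcp", ascending=False
)
print(result)
```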

The second part of this challenge involved finding daily normals for each date of our defined trip from 2017-07-01 to 2017-07-14 (using only the month and day to identify historic data with the same dates) and plotting an area plot as below:

Screen Shot 2021-06-29 at 11 11 50 PM

Trip daily normals dataframe:

Screen Shot 2021-06-29 at 11 18 53 PM
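
Matching historic data by month and day only, as described for the trip dates, can be sketched as follows. The helper name `daily_normals` and the sample history are hypothetical; the project used its own predefined function.

```python
from collections import defaultdict
from statistics import mean


def daily_normals(observations, month_days):
    """Return {"MM-DD": (tmin, tavg, tmax)} across all years in observations.

    observations: iterable of ("YYYY-MM-DD", temperature) pairs.
    month_days: the "MM-DD" keys to report, e.g. the days of a trip.
    """
    by_day = defaultdict(list)
    for iso_date, temp in observations:
        # "YYYY-MM-DD"[5:] is "MM-DD", so the year is ignored when grouping.
        by_day[iso_date[5:]].append(temp)
    return {
        md: (min(by_day[md]), mean(by_day[md]), max(by_day[md]))
        for md in month_days
        if md in by_day
    }


# Made-up history for illustration only.
history = [
    ("2015-07-01", 72.0),
    ("2016-07-01", 76.0),
    ("2014-07-02", 70.0),
    ("2016-07-02", 78.0),
    ("2016-12-25", 65.0),
]
normals = daily_normals(history, ["07-01", "07-02"])
print(normals)
```

A dictionary of this shape loads directly into a Pandas data frame for an area plot.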

About

This project uses Python and SQLAlchemy to perform a fundamental climate analysis and data exploration of a climate database. Using this analysis, a Flask API was also created.
