Show What You Know
unemployment rate of the same counties in the year 2010. I chose this data because I feel that both population and unemployment are on the rise, and I wanted to use that to compare counties in Georgia. I found my data on the website www.bestplaces.net, where I collected the population and the unemployment rate of each chosen county. The unemployment rate is the percentage of the county's labor force that is unemployed. For the population data, I found the mean to be 139,243, the standard deviation to be 267,118.7, the median to be 41,491.5, and the IQR to be 80,472. For the unemployment rate data, I found the mean to be 10.89, the standard deviation to be 2.6469, the median to be 10.5, and the IQR to be 3.85.
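The unemployment rate summaries can be reproduced with a short Python sketch using the 20 rates from the data table. It assumes the quartiles are the medians of the lower and upper halves of the sorted data, which is how many graphing calculators compute them; the function and variable names are only for illustration.

```python
from statistics import mean, median, stdev

# Unemployment rates (%) for the 20 counties, in the order of the data table.
rates = [9.4, 9.1, 12.8, 8.0, 11.0, 16.8, 10.5, 7.4, 16.1, 11.4, 8.5,
         10.7, 12.5, 8.4, 10.2, 8.2, 9.1, 10.5, 13.0, 14.2]


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: the middle value is left out of both halves when the
    number of values is odd.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    return median(ordered[:half]), median(ordered[-half:])


q1, q3 = quartiles(rates)
rate_mean = mean(rates)
rate_sd = stdev(rates)        # sample standard deviation (divides by n - 1)
rate_median = median(rates)
rate_iqr = q3 - q1

print(f"mean={rate_mean:.2f}, sd={rate_sd:.4f}, "
      f"median={rate_median:.1f}, IQR={rate_iqr:.2f}")
```

The same function could be applied to the population column once all 20 populations are entered.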
Histogram of Population
I then looked at the center, shape, and spread for each set of data. For the population data, I found the shape to be right skewed with outliers at 252,938, 750,167, and 1,031,472, and gaps from 421,060.2 to 624,530.8 and from 828,001.4 to 1,031,472. Since the data is skewed, I used the median, 41,491.5, as the center. Because I used the median as the center, I used the IQR, 80,472, to describe the spread, and the data has a high spread. For the unemployment rate data, I also found the shape to be right skewed, but with no gaps or outliers. I again used the median, 10.5, as the center and the IQR, 3.85, as the spread, meaning that the data is fairly spread out. I chose population as the explanatory variable and the unemployment rate as the response variable. I felt that the population would have a slight effect on the unemployment rate because if the population is higher, there may be fewer jobs for employers to offer. Neither set of data is approximately Normal according to the empirical rule. I then constructed a scatterplot for the two sets of data:
There is no obvious direction in the scatterplot, but if I had to guess, I would say that it is slightly negative. There is no clear form in the scatterplot either, but it contains at least two obvious outliers. This leads me to believe that there is little to no correlation in the data. I then used a calculator to get my least-squares regression (LSR) line: predicted unemployment rate = 11.085 - 0.0000014005(population). Interpreting the slope, I found that for every one-person increase in population, the predicted unemployment rate decreases by 0.0000014005 percentage points. I also interpreted the y-intercept and found that if the population were zero, the predicted unemployment rate would be 11.085%, which makes no sense for a county with no people. While calculating the LSR line, I also found r (-.1413) and r squared (.0199), which means that about 1.99% of the variation in the unemployment rate can be explained by the LSR line.
I then constructed a residual plot:
Residual Plot of Population Versus Unemployment Rate
There are no obvious patterns in the residual plot; however, this still does not lead me to believe that the LSR line is a good predictor, because many of the residual points are bunched together in the same areas. I then used the LSR line to make a prediction for a population of 9,000: predicted unemployment rate = 11.085 - 0.0000014005(9000) = about 11.07% unemployment rate. Throughout this project, I have proven myself wrong. I felt that there would be a correlation between the population of a county and the county's unemployment rate, but I
could not seem to find a correlation. There is very little relationship between these two variables.
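One way to double-check a regression line is to rebuild it from the summary statistics, since the slope equals r times (standard deviation of y / standard deviation of x) and the line passes through the point (mean of x, mean of y). The sketch below does that with the values reported in this write-up; the variable names and the predict helper are only for illustration. Because population is measured in single persons, the slope comes out on the order of one millionth of a percentage point per person.

```python
# Summary statistics as reported in the write-up.
x_bar, s_x = 139_243, 267_118.7   # population: mean, standard deviation
y_bar, s_y = 10.89, 2.6469        # unemployment rate (%): mean, standard deviation
r = -0.1413                       # correlation coefficient

# Least-squares slope and intercept from the summary statistics.
slope = r * s_y / s_x             # percentage points per additional person
intercept = y_bar - slope * x_bar
r_squared = r ** 2                # share of variation explained by the line


def predict(population):
    """Predicted unemployment rate (%) for a county of the given population."""
    return intercept + slope * population


print(f"slope     = {slope:.4e}")
print(f"intercept = {intercept:.3f}")
print(f"r squared = {r_squared:.4f}")
print(f"predicted rate at 9,000 people = {predict(9_000):.2f}%")
```

Small differences in the last digits compared with calculator output are expected, because r and the standard deviations are rounded here.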
County       Population   Unemployment Rate
Colquitt     45,579       9.4%
Thomas       45,954       9.1%
Cook         16,613       12.8%
Brooks       16,417       8.0%
Tift         42,366       11.0%
Coffee       40,617       16.8%
Fulton                    10.5%
Houston                   7.4%
Ben Hill                  16.1%
Dougherty                 11.4%
Lowndes                   8.5%
County       Population   Unemployment Rate
Worth        21,141       10.7%
Decatur      28,770       12.5%
Chatham      252,938      8.4%
Mitchell     24,366       10.2%
Lee          34,579       8.2%
Camden       50,141       9.1%
DeKalb       750,167      10.5%
Berrien      16,836       13.0%
Jasper       14,119       14.2%