Homework for METR 5803, Spring 2013

Assignment 1

Assigned 17 January, due 24 January

Use the monthly counts of F1+ tornadoes in the United States for 1954-2012. (The December 2012 value is an estimate.) The data are available as CSV files here. Note that the first column holds the years.

Calculate the following quantities for each month:

Comment on any differences between the parametric and non-parametric approaches to examining these datasets.
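As a starting point for that comparison, here is a minimal sketch that puts a parametric summary (mean and standard deviation) next to a non-parametric one (median and interquartile range) for a single month's counts. The choice of these four quantities is an assumption, since the list of required quantities is not reproduced above, and the sample counts are made up rather than taken from the course data.

```python
import statistics

def summarize(counts):
    """Return parametric and non-parametric summaries of one month's counts."""
    # statistics.quantiles with n=4 returns the three quartile cut points
    q1, _, q3 = statistics.quantiles(counts, n=4)
    return {
        "mean": statistics.mean(counts),      # parametric location
        "stdev": statistics.stdev(counts),    # parametric spread (sample std. dev.)
        "median": statistics.median(counts),  # non-parametric location
        "iqr": q3 - q1,                       # non-parametric spread
    }

# Made-up counts for illustration only, NOT the tornado data
sample = [3, 5, 4, 6, 30, 5, 4]
summary = summarize(sample)
print(summary)
```

With a single large outlier in the sample, the mean and standard deviation move much more than the median and interquartile range, which is the kind of difference worth commenting on.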

Assignment 2

Assigned 5 February, due 14 February

Evaluate the SPC Day 1 (12 Z) convective outlooks from 1973-2010. The data are available as CSV files here. Note that the first column holds the years, the next 4 columns hold the slight risk values, and the final 4 hold the moderate risk values. Also, the slight and moderate risk values are the same for 1973 and 1974. The definitions were changed in 1975 to make them different.

Calculate the following quantities for each year for each outlook:

You don't need to show me the table, although I won't be upset if you do. What you need to do is plot time series of the 6 quantities. You can put more than one on a single graph as long as you retain enough range in the vertical scale to see what's going on. That is, if one of them varies between 0.9 and 1 and another varies between 0 and 0.1, putting them on the same scale makes the graph hard to read. Also, plot the time series on a performance diagram (POD on the y-axis, FOH on the x-axis).
You can also create additional, informative graphics that will allow you to address the basic questions of:
  1. How has SPC performance changed over time?
  2. How are the slight risk and moderate risk related to each other (e.g., are there different apparent goals for them)?
  3. What information that's not included in the tables might impact your assessment of the performance?
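For the performance diagram, a minimal sketch of the two axis quantities may help. It assumes the four columns for each outlook can be reduced to the usual 2x2 contingency-table counts (hits, false alarms, misses); that mapping is an assumption, not something stated above. POD and FOH are the quantities named in the assignment; CSI and bias are added here only because they are the contours usually drawn on a performance diagram. The function name and the example counts are hypothetical.

```python
def outlook_scores(hits, false_alarms, misses):
    """Scores from 2x2 contingency-table counts (correct nulls are not needed)."""
    pod = hits / (hits + misses)                    # probability of detection
    foh = hits / (hits + false_alarms)              # frequency of hits = 1 - FAR
    csi = hits / (hits + false_alarms + misses)     # critical success index
    bias = (hits + false_alarms) / (hits + misses)  # frequency bias = POD / FOH
    return {"POD": pod, "FOH": foh, "CSI": csi, "bias": bias}

# Made-up counts for illustration only, NOT the SPC data
scores = outlook_scores(hits=30, false_alarms=70, misses=20)
print(scores)
```

Each year then gives one (FOH, POD) point per outlook type, and connecting the points in order of year gives the time series on the diagram.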

Assignment 3

Assigned 19 February, due 26 February

The following table gives three sets of probability of precipitation forecasts from the NWS office in Norman for Oklahoma City. The forecasts are all valid for the same periods, but with different lead times. The first pair of columns is for 0-12 hours, the second pair for 24-36 hours, and the third pair for 48-60 hours. The first column of each pair is the number of times the forecast value was used and the second column is the number of times precipitation occurred when that forecast was made.

                   0-12 h                 24-36 h                48-60 h
Probability (%)    Forecast  Precip Obs.  Forecast  Precip Obs.  Forecast  Precip Obs.
0                       175            3       149            3       235           13
5                         1            0         4            0         0            0
10                       48            4        72            9         0            0
20                       37            5        45            8        43            9
30                       20            4        22            4        41           16
40                       10            4        18            5         9            3
50                       10            5         5            1         6            2
60                       13            5        14            8         1            1
70                       13            7         6            5         3            0
80                        3            3         3            1         0            0
90                        2            1         0            0         0            0
100                       6            3         0            0         0            0

a) Construct attributes diagrams for each.

b) Calculate the reliability, resolution, uncertainty, the Brier Score, and the Brier skill score for each.

c) Discuss differences, particularly focusing on the question of change in performance with lead time.

d) For the three datasets, construct Relative Operating Characteristics curves and calculate the area under the curve (dot-to-dot). You may want to look at the notes for ROC curves. As always, discuss what you learn.
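Parts b) and d) can be sketched as below, using the 0-12 hour columns as the example. This is a minimal sketch, not the required method: it assumes the standard Murphy partition of the Brier score (BS = reliability - resolution + uncertainty), a skill score measured against the sample climatology, and ROC points formed by treating each forecast value in turn as a yes/no threshold. The function names are hypothetical.

```python
def brier_decomposition(probs, n_fcst, n_obs):
    """Reliability, resolution, uncertainty, Brier score and skill score."""
    total = sum(n_fcst)
    base_rate = sum(n_obs) / total  # sample climatology
    rel = res = bs = 0.0
    for p, n, o in zip(probs, n_fcst, n_obs):
        if n == 0:
            continue  # a forecast value that was never used contributes nothing
        obs_freq = o / n  # observed relative frequency for this forecast value
        rel += n * (p - obs_freq) ** 2
        res += n * (obs_freq - base_rate) ** 2
        bs += o * (p - 1.0) ** 2 + (n - o) * p ** 2  # direct Brier score sum
    rel, res, bs = rel / total, res / total, bs / total
    unc = base_rate * (1.0 - base_rate)
    return {"REL": rel, "RES": res, "UNC": unc, "BS": bs, "BSS": 1.0 - bs / unc}

def roc_area(n_fcst, n_obs):
    """Dot-to-dot (trapezoidal) ROC area; bins must be in ascending probability."""
    events = sum(n_obs)
    non_events = sum(n_fcst) - events
    points = [(0.0, 0.0)]  # threshold above the highest forecast value
    hits = false_alarms = 0
    # Lower the threshold one forecast value at a time
    for n, o in zip(reversed(n_fcst), reversed(n_obs)):
        hits += o
        false_alarms += n - o
        points.append((false_alarms / non_events, hits / events))  # (POFD, POD)
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += 0.5 * (x1 - x0) * (y0 + y1)
    return area

# 0-12 hour columns from the table above, probabilities converted to fractions
probs = [0.0, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
n_fcst = [175, 1, 48, 37, 20, 10, 10, 13, 13, 3, 2, 6]
n_obs = [3, 0, 4, 5, 4, 4, 5, 5, 7, 3, 1, 3]

parts = brier_decomposition(probs, n_fcst, n_obs)
area = roc_area(n_fcst, n_obs)
print(parts, area)
```

A useful check is that the directly computed Brier score matches reliability minus resolution plus uncertainty. The same two functions can be reused for the 24-36 and 48-60 hour columns by swapping in those counts.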

Assignment 4

Assigned 28 March, due 4 March

Calculate the break-even decision point for the stolen base decision for all possible cases of 1 man on base and different out situations. Use the Nichols table. Calculate the decision point for
  1. Maximizing runs scored
  2. Scoring at least 1 run
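The break-even point can be sketched as follows. The idea is that the runner should go when the success probability p satisfies p * V(safe) + (1 - p) * V(out) > V(stay), so the break-even value is where the two sides are equal. Here V is the expected runs for the rest of the inning in case 1 and the probability of scoring at least one run in case 2. The function name is hypothetical, and the numbers below are placeholders, not values from the Nichols table.

```python
def break_even(value_stay, value_safe, value_out):
    """Success probability at which attempting the steal equals staying put.

    Solves p * value_safe + (1 - p) * value_out = value_stay for p.
    """
    return (value_stay - value_out) / (value_safe - value_out)

# Hypothetical placeholder values, NOT taken from the Nichols table:
# value_stay: runner stays on his current base, same number of outs
# value_safe: runner on the next base, same number of outs
# value_out:  bases empty, one more out
p_star = break_even(value_stay=0.9, value_safe=1.1, value_out=0.3)
print(round(p_star, 3))
```

With two outs, being caught ends the inning, so value_out is 0 for both criteria. Repeating the calculation for each base and out combination, once with expected runs and once with the probability of scoring, gives the full set of decision points.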