References:
Doswell, C. A. III, R. Davies-Jones, and D. L. Keller, 1990: On summary measures of skill in rare event forecasting based on contingency tables. Wea. Forecasting, 5, 576-586.
Marzban, C., 1998: Scalar measures of performance in rare-event situations. Wea. Forecasting, 13, 753-763.
Murphy, A. H., 1996: The Finley affair: a signal event in forecast verification. Wea. Forecasting, 11, 3-20.
Roebber, P. J., 2009: Visualizing multiple measures of forecast quality. Wea. Forecasting, 24, 601-608.
The 2x2 verification problem is the most studied and most widely known problem in the discipline. Partly as a result, notation is not consistent from paper to paper, and scores based on the table have been rediscovered repeatedly over the years. As a general framework, we write the problem in a variety of ways. In terms of the joint distribution of forecasts and observations, it is:
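
                       Observed: event    Observed: no event    Total
Forecast: event        n_{11}             n_{12}                n_{1.}
Forecast: no event     n_{21}             n_{22}                n_{2.}
Total                  n_{.1}             n_{.2}                n_{..}

Here n_{ij} is the number of cases with forecast category i and observed category j (1 = event, 2 = no event), and a dot denotes summation over that index. Dividing each entry by n_{..} gives the corresponding joint relative frequencies.
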
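As a minimal sketch of how such a table can be tallied from paired yes/no forecasts and observations, consider the following. The function name `contingency_table`, the dictionary keys, and the 1/0 encoding of event/no event are illustrative assumptions, as is the sample data. The index order (forecast first, observation second) is the one implied by the false alarm rate n_{12}/n_{.2}.

```python
from collections import Counter


def contingency_table(forecasts, observations):
    """Tally a 2x2 table from paired yes/no forecasts and observations.

    Assumed encoding (illustrative): 1/True = event, 0/False = no event.
    In n_ij the first index is the forecast, the second the observation.
    """
    tally = Counter((bool(f), bool(x)) for f, x in zip(forecasts, observations))
    n11 = tally[(True, True)]    # event forecast, event observed
    n12 = tally[(True, False)]   # event forecast, no event observed
    n21 = tally[(False, True)]   # no event forecast, event observed
    n22 = tally[(False, False)]  # no event forecast, no event observed
    return {
        "n11": n11, "n12": n12, "n21": n21, "n22": n22,
        "n1dot": n11 + n12,  # n_{1.}: all event forecasts
        "n2dot": n21 + n22,  # n_{2.}: all non-event forecasts
        "ndot1": n11 + n21,  # n_{.1}: all observed events
        "ndot2": n12 + n22,  # n_{.2}: all observed non-events
        "n": n11 + n12 + n21 + n22,  # n_{..}: sample size
    }


# Made-up sample of eight forecast/observation pairs.
forecasts = [1, 1, 0, 0, 1, 0, 0, 0]
observations = [1, 0, 0, 1, 1, 0, 0, 0]
table = contingency_table(forecasts, observations)
print(table)
```

The marginal totals are stored alongside the four cells because nearly every score derived from the table uses at least one of them.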
A variety of scores can be derived from this table, and the references above give them. We'll concentrate on just a small number, assuming that x=1 is the primary event for which we are forecasting. Brackets give the worst and best possible scores, in that order.
The Hit Rate or Fraction Correct is given by H = (n_{11} + n_{22})/n_{..} [0,1]
The probability of detection (POD) is given by POD = n_{11}/n_{.1} [0,1]
The false alarm ratio (FAR) is given by FAR = n_{12}/n_{1.} [1,0]
Note that a related quantity, usually called the false alarm rate, is given by n_{12}/n_{.2}. Confusion arises because the name false alarm rate is sometimes also applied to the FAR. The false alarm rate is usually used in the context of relative operating characteristic (ROC) curves.
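The difference between the two quantities is easiest to see with numbers. The sketch below assumes the usual definition of the false alarm ratio, n_{12}/n_{1.}, and uses the commonly quoted counts from Finley's 1884 tornado forecasts (the subject of Murphy 1996) purely as sample data. The variable names are illustrative.

```python
# Commonly quoted counts for Finley's tornado forecasts (sample data only).
n11, n12, n21, n22 = 28, 72, 23, 2680

n1dot = n11 + n12  # n_{1.}: all forecasts of the event
ndot2 = n12 + n22  # n_{.2}: all observed non-events

# Ratio: fraction of event forecasts that were false alarms.
false_alarm_ratio = n12 / n1dot
# Rate: fraction of observed non-events that were forecast as events.
false_alarm_rate = n12 / ndot2

print(f"false alarm ratio (FAR): {false_alarm_ratio:.3f}")
print(f"false alarm rate:        {false_alarm_rate:.3f}")
```

For a rare event the two differ by more than an order of magnitude (0.72 against about 0.026 here), because the ratio is conditioned on the forecasts and the rate on the observations.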
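Bias can be defined as B = n_{1.}/n_{.1}, where values in (0,1) indicate underforecasting of the event, values in (1,inf.) indicate overforecasting, and the best possible value is 1.
A score introduced by Gilbert in 1884 and later called the Threat Score or the Critical Success Index (CSI) is given by CSI = n_{11}/(n_{11} + n_{12} + n_{21}) [0,1]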
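To illustrate why the choice of score matters for rare events, the sketch below computes the Hit Rate, POD, bias and CSI under their usual definitions (assumed here: H = (n_{11}+n_{22})/n_{..}, POD = n_{11}/n_{.1}, bias = n_{1.}/n_{.1}, CSI = n_{11}/(n_{11}+n_{12}+n_{21})). It compares the commonly quoted Finley tornado counts with a hypothetical forecaster who never forecasts the event. The function name `basic_scores` is illustrative.

```python
def basic_scores(n11, n12, n21, n22):
    """Return H, POD, bias and CSI for a 2x2 table (first index = forecast).

    Assumes at least one observed event, so that n_{.1} > 0.
    """
    n = n11 + n12 + n21 + n22
    n1dot = n11 + n12  # n_{1.}: event forecasts
    ndot1 = n11 + n21  # n_{.1}: observed events
    return {
        "H": (n11 + n22) / n,
        "POD": n11 / ndot1,
        "bias": n1dot / ndot1,
        "CSI": n11 / (n11 + n12 + n21),
    }


# Commonly quoted Finley counts versus "never forecast the event"
# applied to the same 51 events and 2752 non-events.
finley = basic_scores(28, 72, 23, 2680)
never = basic_scores(0, 0, 51, 2752)

for name, scores in (("Finley", finley), ("never", never)):
    print(name, {k: round(v, 3) for k, v in scores.items()})
```

The never-forecast strategy has the higher Hit Rate because the correct non-event forecasts dominate the sample, yet its POD and CSI are both zero. This is the core of the Finley affair and the reason the correct-null cell is left out of the CSI.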
Scores that try to account for some version of random chance of being correct:
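Equitable Threat Score ETS = (n_{11} - C)/(n_{11} + n_{12} + n_{21} - C) [-1/3,1]
where C = (n_{1.}n_{.1})/n_{..} is the number of correct event forecasts expected by chance.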
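A short sketch of the chance correction, assuming the usual form of the score, (n_{11} - C)/(n_{11} + n_{12} + n_{21} - C), with C as defined above. The function name and the sample counts (the commonly quoted Finley tornado table, and a made-up table in which every forecast is wrong) are illustrative.

```python
def equitable_threat_score(n11, n12, n21, n22):
    """Threat score with the chance-expected hits C removed."""
    n = n11 + n12 + n21 + n22
    n1dot = n11 + n12        # n_{1.}: event forecasts
    ndot1 = n11 + n21        # n_{.1}: observed events
    c = n1dot * ndot1 / n    # hits expected by chance
    return (n11 - c) / (n11 + n12 + n21 - c)


# Commonly quoted Finley counts: the correction lowers the threat score slightly.
ets = equitable_threat_score(28, 72, 23, 2680)
csi = 28 / (28 + 72 + 23)

# Made-up table: every forecast wrong, events and non-events equally common.
worst = equitable_threat_score(0, 50, 50, 0)

print(f"CSI = {csi:.3f}, ETS = {ets:.3f}, all-wrong ETS = {worst:.3f}")
```

Because C is small when the event is rare, the correction changes the threat score only modestly for the Finley counts. The all-wrong table reaches the lower bound of -1/3.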
Heidke skill score (originally from Doolittle) HSS = 2(n_{11}n_{22} - n_{12}n_{21})/(n_{1.}n_{.2} + n_{2.}n_{.1}) [-1,1]
This is a measure of correct forecasts, with random correct forecasts subtracted out. The reference forecast is random forecasting, subject to the constraint that marginal distributions of forecasts are the same as the marginal distributions of observations.
Hanssen-Kuipers skill score (originally from Peirce, also called the true skill statistic) HK = (n_{11}n_{22} - n_{12}n_{21})/(n_{.1}n_{.2}) [-1,1]
Similar to Heidke, except that the reference forecasts are constrained to be unbiased.
The latter two scores can also be written as HSS = (H - H_{ran})/(1 - H_{ran}) and HK = (H - H_{ran})/(1 - H_{u,ran}),
where H is the Hit Rate, defined above, H_{ran} = (n_{1.}n_{.1} + n_{2.}n_{.2})/n_{..}^{2} and H_{u,ran} = (n_{.1}^{2} + n_{.2}^{2})/n_{..}^{2}.
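As a numerical check, the sketch below evaluates both skill scores in two ways. The skill-score forms are assumed to be HSS = (H - H_{ran})/(1 - H_{ran}) and HK = (H - H_{ran})/(1 - H_{u,ran}). The count forms are assumed to be the usual ones, HSS = 2(n_{11}n_{22} - n_{12}n_{21})/(n_{1.}n_{.2} + n_{2.}n_{.1}) and HK = (n_{11}n_{22} - n_{12}n_{21})/(n_{.1}n_{.2}). The function name is illustrative, and the commonly quoted Finley tornado counts serve as sample data.

```python
def skill_scores(n11, n12, n21, n22):
    """Return (HSS, HK) from the skill-score forms and from the count forms."""
    n = n11 + n12 + n21 + n22
    n1dot, n2dot = n11 + n12, n21 + n22  # forecast marginals
    ndot1, ndot2 = n11 + n21, n12 + n22  # observation marginals

    h = (n11 + n22) / n                                # Hit Rate
    h_ran = (n1dot * ndot1 + n2dot * ndot2) / n**2     # random reference
    h_u_ran = (ndot1**2 + ndot2**2) / n**2             # unbiased random reference

    hss = (h - h_ran) / (1 - h_ran)
    hk = (h - h_ran) / (1 - h_u_ran)

    # Equivalent expressions written directly in the counts.
    det = n11 * n22 - n12 * n21
    hss_counts = 2 * det / (n1dot * ndot2 + n2dot * ndot1)
    hk_counts = det / (ndot1 * ndot2)
    return hss, hk, hss_counts, hk_counts


hss, hk, hss_counts, hk_counts = skill_scores(28, 72, 23, 2680)
print(f"HSS = {hss:.4f} (count form {hss_counts:.4f})")
print(f"HK  = {hk:.4f} (count form {hk_counts:.4f})")
```

The two scores share a numerator and differ only in the denominator, which is where the different constraint on the reference forecasts enters. The count form of HK also equals the POD minus the false alarm rate n_{12}/n_{.2}.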