\documentclass[xcolor = pdftex,dvipsnames,table]{beamer}
\usetheme{Berkeley}
\title{Prediction}
\author{Lawrence Hubert}
\begin{document}
\begin{frame}
\titlepage
\end{frame}
\begin{frame}
\frametitle{Starting Quotes}
The race is not always to the swift nor the battle to the strong --- but that's the way to lay your bets.
-- Damon Runyon
\bigskip
\noindent The only function of economic forecasting is to make astrology look good.
-- John Kenneth Galbraith
\bigskip
\noindent If all else fails, immortality can always be assured by spectacular error.
-- John Kenneth Galbraith
\end{frame}
\begin{frame}
\frametitle{}
\noindent I like also the men who study the Great Pyramid, with a view to deciphering its mystical lore. Many great books have been written on this subject, some of which have been presented to me by their authors. It is a singular fact that the Great Pyramid always predicts the history of the world accurately up to the date of publication of the book in question, but after that date it becomes less reliable.
-- Bertrand Russell
\bigskip
\noindent The list of studies in which the regression factor has been neglected grows monotonous, as well as distressing.
-- Philip Rulon (1941)
\end{frame}
\begin{frame}
\frametitle{What you already know}
The attempt to predict the values on some (dependent) variable by a function of (independent) variables is typically approached by simple or multiple regression, for one and more than one predictor, respectively.
\bigskip
The most common combination rule is a linear function of the independent variables obtained by least-squares, i.e., the linear combination that minimizes the sum of the squared residuals between the actual values on the dependent variable and those predicted from the linear combination.
\end{frame}
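\begin{frame}
\frametitle{}
As a reminder of the simple-regression case, here is a sketch in notation introduced only for illustration: $X$ is the predictor, $Y$ the dependent variable, $r_{XY}$ their correlation, and $s_{X}$ and $s_{Y}$ their standard deviations.
\bigskip
The least-squares line $\hat{Y} = a + bX$ minimizes
\[ \sum_{i=1}^{n} (Y_{i} - a - bX_{i})^{2} \ , \]
\noindent and the minimizing values are
\[ b = r_{XY}\,\frac{s_{Y}}{s_{X}} \ , \qquad a = \bar{Y} - b\bar{X} \ . \]
\end{frame}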
\begin{frame}
\frametitle{}
In the case of simple regression, scatterplots again play a major role in assessing linearity of the relationship, the possible effects of outliers on the slope of the least-squares line, and the influence of individual objects in its calculation.
\bigskip
Regression slopes, in contrast to the correlation, are neither scale invariant nor symmetric in the dependent and independent variables.
\bigskip
One usually interprets the least-squares line as follows: for each unit change in the independent variable, we expect a change in the dependent variable equal to the regression slope.
\end{frame}
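\begin{frame}
\frametitle{}
A short sketch of why slopes are neither symmetric nor scale invariant, using notation that is assumed here for illustration ($b_{Y \cdot X}$ for the slope when predicting $Y$ from $X$, $b_{X \cdot Y}$ for the reverse, $r_{XY}$ for the correlation, and $s_{X}$, $s_{Y}$ for the standard deviations):
\[ b_{Y \cdot X} = r_{XY}\,\frac{s_{Y}}{s_{X}} \ , \qquad b_{X \cdot Y} = r_{XY}\,\frac{s_{X}}{s_{Y}} \ , \qquad b_{Y \cdot X}\, b_{X \cdot Y} = r_{XY}^{2} \ . \]
\noindent Rescaling $X$ to $cX$ (with $c > 0$) leaves $r_{XY}$ unchanged but divides $b_{Y \cdot X}$ by $c$. For example, recording $X$ in centimeters rather than meters ($c = 100$) makes the slope 100 times smaller.
\end{frame}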
\begin{frame}
\frametitle{}
There are several topics involving prediction that do not (necessarily) concern linear regression, and because of this, no extended discussion of these is given. One area important for the legal system is sex offender risk assessment, that is, the prediction of recidivism (the commission of another offense).
\bigskip
One instrument used for this purpose is the Rapid Risk Assessment for Sexual Offender Recidivism (more commonly known by its acronym, RRASOR, pronounced ``razor''). It is based on four items: (1) Prior Sex Offense Convictions (0, 1, 2, or 3 points for 0, 1, 2, or 3+ prior convictions, respectively); (2) Victim Gender (only female victims, 0 points; only male victims, 1 point); (3) Relationship to Victim (only related victims, 0 points; any unrelated victim, 1 point); (4) Age at Release (25 or more, 0 points; 18 up to 25, 1 point).
\end{frame}
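\begin{frame}
\frametitle{}
As a purely hypothetical illustration of the scoring rule just described (the offender profile below is invented for the arithmetic and is not taken from any case or from the instrument's documentation):
\bigskip
two prior sex offense convictions (2 points); only male victims (1 point); at least one unrelated victim (1 point); age 30 at release (0 points).
\[ \mbox{RRASOR score} = 2 + 1 + 1 + 0 = 4 \ . \]
\noindent Under the point values listed, total scores can range from 0 to 6.
\end{frame}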
\begin{frame}
\frametitle{}
Another approach to prediction that we do not develop is the theory behind chaotic systems, such as the weather. A hallmark of such dynamic prediction problems is an extreme sensitivity to initial conditions, and a general inaccuracy in prediction even over a relatively short time frame.
\bigskip
The person best known for chaos theory is Edward Lorenz, with his ``butterfly effect'': very small differences in the initial conditions for a dynamical system (e.g., a butterfly flapping its wings somewhere in Latin America) may produce large variations in the long-term behavior of the system.
\end{frame}
\begin{frame}
\frametitle{Topics you may not know as well}
regression toward the mean;
\bigskip
methods involved in using regression for prediction that incorporate corrections for unreliability;
\bigskip
differential prediction effects in selection based on tests;
\bigskip
interpreting and making inferences from regression weights;
\bigskip
the distinction between actuarial (statistical) and clinical prediction.
\end{frame}
\begin{frame}
\frametitle{Regression toward the mean}
Regression toward the mean is a phenomenon that will occur whenever we deal with (fallible) measures that have a less-than-perfect correlation.
\bigskip
The word ``regression'' was first used by Sir Francis Galton in his 1886 paper, \emph{Regression Towards Mediocrity in Hereditary Stature}, where he showed that heights of children from very tall or short parents would regress toward mediocrity (i.e., toward the mean): exceptional scores on one variable (parental height) would not be matched with such exceptionality on the second (child height).
\end{frame}
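\begin{frame}
\frametitle{}
A sketch of the phenomenon in standardized form, with notation assumed here for illustration ($z_{X}$ and $z_{Y}$ are standard scores on the two measures, and $r_{XY}$ is their correlation). The least-squares prediction is
\[ \hat{z}_{Y} = r_{XY}\, z_{X} \ . \]
\noindent Because $|r_{XY}| < 1$ for fallible measures, $|\hat{z}_{Y}| < |z_{X}|$ whenever $z_{X} \neq 0$: the prediction is always closer to the mean than the score it is based on.
\bigskip
With a hypothetical correlation of $r_{XY} = .50$, a parent two standard deviations above the mean ($z_{X} = 2$) leads to a predicted child score of $\hat{z}_{Y} = .50 \times 2 = 1$, only one standard deviation above the mean.
\end{frame}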
\begin{frame}
\frametitle{}
Regression toward the mean is a ubiquitous phenomenon, and is given the name ``regressive fallacy'' whenever cause is ascribed where none exists.
\bigskip
Generally, interventions are undertaken if processes are at an extreme, e.g., a crackdown on speeding or drunk driving as fatalities spike; treatment groups formed from individuals who are seriously depressed; individuals selected because of extreme behaviors, whether good or bad; and so on.
\end{frame}
\begin{frame}
\frametitle{}
There are many common instances where regression may lead to invalid reasoning: I went to my doctor and my pain has now lessened; I instituted corporal punishment and behavior has improved; he was jinxed by a \emph{Sports Illustrated} cover because subsequent performance was poorer (i.e., the ``sophomore jinx''); although he hadn't had a hit in some time, he was ``due,'' and the coach played him; and on and on.
\bigskip
More generally, any time one optimizes with respect to a given sample of data by constructing prediction functions of some kind, there is an implicit use of, and reliance on, data extremities. In other words, the various measures of goodness-of-fit or prediction we might calculate need to be cross-validated, either on new data or by a clever sample reuse strategy such as the well-known jackknife or bootstrap procedures.
\end{frame}
\begin{frame}
\frametitle{}
The degree of ``shrinkage'' we see in our measures based on this cross-validation is an indication of the fallibility of our measures and the (in)adequacy of the given sample sizes.
\bigskip
We have the ``winner's curse,'' where someone chosen from a large pool (e.g., of job candidates) then doesn't live up to expectation; or we attribute some observed change to the operation of ``spontaneous remission.''
\bigskip
As Campbell and Kenny note: ``many a quack has made a good living from regression toward the mean.''
\end{frame}
\begin{frame}
\frametitle{Incorporating reliability corrections in prediction}
The model assumes that any observed score, $X$, is constructed additively from a true score, $T_{X}$, and an error score, $E_{X}$, where $E_{X}$ is typically assumed uncorrelated with $T_{X}$: $X = T_{X} + E_{X}$.
\bigskip
When we consider the distribution of an observed variable over, say, a population of individuals, two sources of variability are present: the true scores and the error scores.
\bigskip
If we are interested primarily in structural models among true scores, then some correction must be made because the common regression models implicitly assume that variables are measured without error.
\end{frame}
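\begin{frame}
\frametitle{}
A sketch of what the additive model implies for variances, in notation assumed here for illustration ($\sigma^{2}$ denotes a population variance and $\rho$ the reliability of $X$). Because $T_{X}$ and $E_{X}$ are uncorrelated,
\[ \sigma^{2}_{X} = \sigma^{2}_{T_{X}} + \sigma^{2}_{E_{X}} \ , \qquad \rho = \frac{\sigma^{2}_{T_{X}}}{\sigma^{2}_{X}} \ . \]
\noindent The covariance between $X$ and $T_{X}$ is $\sigma^{2}_{T_{X}}$, so their correlation is $\sigma_{T_{X}}/\sigma_{X}$, and the squared correlation between observed and true score equals the reliability:
\[ \rho^{2}_{X T_{X}} = \frac{\sigma^{2}_{T_{X}}}{\sigma^{2}_{X}} = \rho \ . \]
\end{frame}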
\begin{frame}
\frametitle{}
The estimation, $\hat{T}_{X}$, of a true score from an observed score, $X$, was derived using the regression model by Kelley in the 1920s, with a reliance on the algebraic equivalence that the squared correlation between observed and true score is the reliability.
\bigskip
If we let $\hat{\rho}$ be the estimated reliability, Kelley's equation can be written as \[ \hat{T}_{X} = \hat{\rho} X + (1 - \hat{\rho})\bar{X} \ , \] \noindent where $\bar{X}$ is the mean of the group to which the individual belongs.
\bigskip
In other words, depending on the size of $\hat{\rho}$, a person's estimate is determined partly by where they stand in relation to the group: it is adjusted upwards if the observed score is below the group mean, and downwards if above.
\end{frame}
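\begin{frame}
\frametitle{}
A worked illustration of Kelley's equation, using hypothetical values chosen only for the arithmetic: an estimated reliability of $\hat{\rho} = .80$, a group mean of $\bar{X} = 100$, and an observed score of $X = 130$.
\[ \hat{T}_{X} = (.80)(130) + (1 - .80)(100) = 104 + 20 = 124 \ . \]
\noindent The same equation can be rewritten as $\hat{T}_{X} = \bar{X} + \hat{\rho}(X - \bar{X})$, which shows the deviation from the group mean being shrunk by the factor $\hat{\rho}$: here, $100 + (.80)(30) = 124$.
\bigskip
For an observed score of $X = 70$ in the same group, the estimate moves upwards: $(.80)(70) + (.20)(100) = 76$.
\end{frame}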
\begin{frame}
\frametitle{}
The application of this statistical tautology in the examination of group differences provides such a surprising result to the statistically naive that this equation has been called ``Kelley's Paradox.''
\bigskip
We might note that this notion of being somewhat punitive toward performances better than those of the group to which one supposedly belongs was not original with Kelley, but was known at least 300 years earlier. In the words of Miguel de Cervantes (1547--1616): ``Tell me what company you keep and I'll tell you what you are.''
\end{frame}
\begin{frame}
\frametitle{}
In the topic of errors-in-variables regression, we try to compensate for the tacit assumption in regression that all variables are measured without error.
\bigskip
Measurement error in a response variable does not bias the regression coefficients per se, but it does increase standard errors and thereby reduces power. This is a general effect: unreliability attenuates correlations and reduces power even in standard ANOVA paradigms.
\end{frame}
\begin{frame}
\frametitle{}
Measurement error in the predictor variables biases the regression coefficients. For example, for a single predictor, the observed regression coefficient is the ``true'' value multiplied by the reliability coefficient.
\bigskip
Thus, unless measurement error in the predictors is taken into account, regression coefficients will generally be underestimated, which biases the estimated structural relationship among the true variables.
\end{frame}
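\begin{frame}
\frametitle{}
A sketch of the single-predictor result, under the assumption (ours, for illustration) that the error $E_{X}$ is uncorrelated with both $T_{X}$ and the response $Y$. Let $\beta$ be the slope for predicting $Y$ from the true score $T_{X}$, $b$ the slope for predicting $Y$ from the observed $X$, and $\rho$ the reliability of $X$. Then
\[ b = \frac{\mathrm{Cov}(X,Y)}{\sigma^{2}_{X}} = \frac{\mathrm{Cov}(T_{X},Y)}{\sigma^{2}_{T_{X}} + \sigma^{2}_{E_{X}}} = \beta \, \frac{\sigma^{2}_{T_{X}}}{\sigma^{2}_{X}} = \beta \rho \ . \]
\noindent With hypothetical values $\beta = 2.0$ and $\rho = .70$, the observed slope would be $b = (2.0)(.70) = 1.4$; dividing the observed slope by the reliability recovers $1.4/.70 = 2.0$.
\end{frame}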
\begin{frame}
\frametitle{Differential prediction effects in selection}
One area in which prediction is socially relevant is in selection based on test scores, whether for accreditation, certification, job placement, licensure, educational admission, or other high-stakes endeavors.
\bigskip
Most discussions about fairness of selection need to be phrased in terms of regression models relating a performance measure to a selection test, and of whether the regressions are the same over all identified groups of relevance (e.g., those defined by ethnicity, gender, or age).
\bigskip
Specifically, are slopes and intercepts the same? Whether or not they are, how does this affect the selection mechanism being implemented, and can that mechanism be considered fair?
\end{frame}
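\begin{frame}
\frametitle{}
One common way to frame the question about slopes and intercepts, offered as a sketch rather than as the specific approach of the reading, is a single regression with a group indicator. The symbols are assumptions made for illustration: $Y$ is the performance measure, $X$ the selection test, and $G$ equals 0 or 1 according to group membership.
\[ Y = \beta_{0} + \beta_{1} X + \beta_{2} G + \beta_{3} (X \times G) + e \ . \]
\noindent For $G = 0$ the line has intercept $\beta_{0}$ and slope $\beta_{1}$; for $G = 1$ it has intercept $\beta_{0} + \beta_{2}$ and slope $\beta_{1} + \beta_{3}$.
\bigskip
The intercepts are the same when $\beta_{2} = 0$, the slopes are the same when $\beta_{3} = 0$, and the two regressions coincide only when both hold.
\end{frame}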
\begin{frame}
\frametitle{Interpreting and making inferences from regression weights}
\noindent Mathematics has given economics rigor, but alas, also mortis.
-- Robert Heilbroner
\bigskip
\noindent Statistics are the triumph of the quantitative method, and the quantitative method is the victory of sterility
and death.
-- Hilaire Belloc (\emph{The Silence of the Sea})
\bigskip
\noindent Years ago a statistician might have claimed that statistics deals with the processing of data ... today's
statistician will be more likely to say that statistics is concerned with decision making in the face of
uncertainty.
-- H. Chernoff and L. E. Moses, \emph{Elementary Decision Theory} (1959)
\end{frame}
\begin{frame}
\frametitle{}
\noindent Let us remember the unfortunate econometrician who, in one of the major functions of his system, had to use a proxy for risk and a dummy for sex.
-- Fritz Machlup
\bigskip
\noindent When a true genius appears in this world, you may know him by this sign, that the dunces are all in confederacy against him.
-- Jonathan Swift
\end{frame}
\begin{frame}
\frametitle{}
Although multiple regression can be an invaluable tool in many arenas, the interpretive difficulties that result from the interrelated nature of the independent variables must always be kept in mind.
\bigskip
As in the World War II example in the reading, depending on what variables are (or are not) included, the structural relationship among the variables can change dramatically. At times, this malleability can be put to either good or ill usage.
\end{frame}
\begin{frame}
\frametitle{}
For example, in applying regression models to argue for employment discrimination (e.g., in pay, promotion, hiring, and so on), the multivariable system present could be problematic in arriving at a ``correct'' analysis.
\bigskip
Depending on what variables are included, some variables may ``act'' for others (as ``proxies''), or be used to hide (or at least, to mitigate) various effects. If a case for discrimination rests on the size of a coefficient for some ``dummy'' variable that indicates group membership (according to race, sex, age, and so on), it may be possible to change its size depending on what variables are included or excluded from the model, and their relationship to the dummy variable.
\end{frame}
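\begin{frame}
\frametitle{}
A sketch of how the coefficient on a dummy variable can change with the variables included, in notation assumed here for illustration: $D$ is the group-membership dummy, $Z$ is another predictor, and $b_{D}$ and $b_{Z}$ are the least-squares weights when both are in the model,
\[ \hat{Y} = b_{0} + b_{D} D + b_{Z} Z \ . \]
\noindent If $Z$ is dropped, the least-squares weight on $D$ becomes
\[ b^{*}_{D} = b_{D} + b_{Z}\, d \ , \]
\noindent where $d$ is the slope from regressing $Z$ on $D$ (for a 0/1 dummy, the difference between the two group means on $Z$).
\bigskip
With hypothetical values $b_{D} = 0$, $b_{Z} = 2$, and $d = 1.5$, omitting $Z$ gives $b^{*}_{D} = 0 + (2)(1.5) = 3$: a group ``effect'' appears that is absent once $Z$ is included.
\end{frame}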
\begin{frame}
\frametitle{The (Un)reliability of Clinical Predictions (of Violence)}
\noindent If your mother says she loves you, check it out.
-- Adage from the Chicago City News Bureau
\medskip
\noindent Prosecutors in Dallas have said for years -- any prosecutor can convict a guilty man. It takes a great prosecutor to convict an innocent man.
-- Melvyn Bruder (\emph{The Thin Blue Line})
\end{frame}
\begin{frame}
\frametitle{}
The last section in this chapter concerns the (un)reliability of clinical (behavioral) prediction, particularly for violence, and will include two extensive redactions at the end of the section:
\bigskip
one is the majority opinion in the Supreme Court case of Barefoot v.\ Estelle (decided July 6, 1983), together with an eloquent dissent by Justice Blackmun; the second is an Amicus Curiae brief in this same case from the American Psychiatric Association on the accuracy of clinical prediction of future violence.
\end{frame}
\begin{frame}
\frametitle{}
The psychiatrist James Grigson, featured so prominently in the opinions for Barefoot v.\ Estelle and the corresponding American Psychiatric Association Amicus brief, played the same role repeatedly in the Texas legal system.
\bigskip
For over three decades before his retirement in 2003, he would testify at death sentence hearings, when requested, and with a high degree of certainty, as to ``whether there is a probability that the defendant would commit criminal acts of violence that would constitute a continuing threat to society.''
\bigskip
An affirmative answer by the sentencing jury imposed the death penalty automatically, as it was on Thomas Barefoot; he was executed on October 30, 1984.
\end{frame}
\end{document}