\documentclass[12pt]{article}



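% Set left margin - The default is 1 inch, so the following
% command sets a 1.25-inch left margin.
\setlength{\oddsidemargin}{0.25in}

% Set width of the text - What is left will be the right margin.
% In this case, right margin is 8.5in - 1.25in - 6in = 1.25in.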
\setlength{\textwidth}{6in}

% Set top margin - The default is 1 inch, so the following
% command sets a 0.75-inch top margin.
\setlength{\topmargin}{-0.25in}

% Set height of the text - What is left will be the bottom margin.
% In this case, bottom margin is about 11in - 0.75in - 8in = 2.25in
\setlength{\textheight}{8in}

% Set the beginning of a LaTeX document
\begin{document}

\title{Psychology 594}         % Enter your title between curly braces
\author{Final Take Home Exercise; Fall 2011}        % Enter your name between curly braces

\maketitle

\large

Questions I and II are to be done with your chosen proximity
matrix (from Michael Lee's web site).

\bigskip

\verb+http://cda.psych.uiuc.edu/multivariate_class_final_2011.zip+

\bigskip



I. (a) Using routines available in the MATLAB Statistics Toolbox,
carry out a complete-link hierarchical clustering and interpret your results.  Use at least the
following three M-files:

\begin{verbatim}
squareform.m
linkage.m
dendrogram.m
\end{verbatim}
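
\bigskip

As a hint for (a), here is a minimal sketch of how the three M-files fit together. The $4 \times 4$ matrix \verb+D+ is a hypothetical toy example, not one of the class proximity matrices; substitute your own symmetric, zero-diagonal dissimilarity matrix.

\begin{verbatim}
% Hypothetical toy dissimilarities (not exam data)
D = [ 0  2  6 10;
      2  0  5  9;
      6  5  0  4;
     10  9  4  0];

% squareform turns the square matrix into the vector of
% pairwise values (1,2),(1,3),...,(3,4) that linkage expects
y = squareform(D);

% Complete-link clustering; each row of Z lists the two
% clusters merged and the level of the merge
Z = linkage(y,'complete');

Z

% Draw the tree; leaf labels are the row numbers of D
figure
dendrogram(Z)
\end{verbatim}

For this toy matrix, objects 1 and 2 merge at level 2, objects 3 and 4 at level 4, and the two pairs at level 10, the largest dissimilarity between the pairs.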

(b) From the Cluster Analysis Toolbox M-files:

\bigskip

(i) use \verb+order.m+; \verb+ultrafit.m+ (with a complete-link
target); \verb+ultrafnd.m+ (with \verb+randperm+ several times).

\bigskip

(ii) use \verb+ultrafnd_confit.m+ (with the order found in (i)),
and \verb+ultrafnd_confnd.m+.

\bigskip

(iii) use \verb+partitionfnd_averages.m+;
\verb+partitionfnd_diameters.m+; and then \verb+partitionfit.m+
after each.

\bigskip

(iv) use \verb+cent_ultrafnd_confit.m+ (with the order found in
(i)), and \verb+cent_ultrafnd_confnd.m+.

\bigskip

(v) use \verb+atreefit.m+ (with a complete-link target);
\verb+atreefnd.m+ (with \verb+randperm+ several times);
\verb+atreedec.m+, and \verb+ultraorder.m+.

\bigskip

(vi) use \verb+consec_subsetfit.m+ and
\verb+consec_subsetfit_alter.m+.

\bigskip

Again, interpret the results obtained from the various analyses.
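
\bigskip

When interpreting the fitted ultrametrics, it may help to recall what makes a matrix an ultrametric. This sketch uses only Statistics Toolbox routines on a hypothetical toy matrix \verb+D+ (not exam data). It builds the complete-link ultrametric with \verb+cophenet+, checks the three-point condition, and computes the squared correlation that the script for II calls \verb+vaf+. The Cluster Analysis Toolbox M-files report their own \verb+vaf+, which may be defined differently.

\begin{verbatim}
% Hypothetical toy dissimilarities (not exam data)
D = [ 0  2  6 10;
      2  0  5  9;
      6  5  0  4;
     10  9  4  0];

y = squareform(D);
Z = linkage(y,'complete');

% c: cophenetic correlation between y and the tree;
% u: the complete-link ultrametric in vector form
[c,u] = cophenet(Z,y);
U = squareform(u);

% Three-point condition:
% U(i,j) <= max(U(i,k),U(k,j)) for all i, j, k
n = size(U,1);
is_ultrametric = true;
for i = 1:n
    for j = 1:n
        for k = 1:n
            if U(i,j) > max(U(i,k),U(k,j))
                is_ultrametric = false;
            end
        end
    end
end

is_ultrametric

% Squared correlation between the data and the ultrametric
vaf = c^2
\end{verbatim}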

\bigskip

II. Carry out a multiple-restart Monte Carlo multidimensional scaling of your data using an
M-script parallel to \verb+ms_script_yourdata_mds.m+.  Interpret the results obtained in relation to the cluster analyses in I.

\begin{verbatim}
load decathlon.dat

decathlon_dissimilarities = 1 - decathlon;

decathlon_dissimilarities

n = 10;


tic;

opts = statset('Maxiter',1000);

best_vaf = 0.0;

store_vaf = zeros(100,1);

for k = 1:100

[coords,stress] = ...
mdscale(decathlon_dissimilarities,2,'Criterion',...
'metricsstress','Start','random',...
 'Replicates',1,'Options',opts);

n = size(coords,1);

distance_matrix = zeros(n,n);

for i = 1:n
    for j = 1:n

        distance_matrix(i,j) = ...
         sqrt(((coords(i,1) - coords(j,1))^2) + ...
            ((coords(i,2) - coords(j,2))^2));
    end
end


decathlon_vec = squareform(decathlon_dissimilarities);

distance_vec = squareform(distance_matrix);

r = corrcoef(decathlon_vec',distance_vec');

vaf = r(1,2)^2;

store_vaf(k) = vaf;

if(vaf > best_vaf)

    best_vaf = vaf;
    best_coords = coords;
    best_distance_vec = distance_vec;

end
end


sorted_vafs = sort(store_vaf');

sorted_vafs
best_vaf
best_coords
best_distance_vec

figure(1)

plot(best_coords(:,1),best_coords(:,2),'ko')

% axis equal has to follow plot: with hold off, plot resets the axes
axis equal

hold on

for i = 1:n

    objectlabels{i,1} = int2str(i);

end

text(best_coords(:,1),best_coords(:,2),objectlabels,...
'fontsize',10,'verticalalignment','bottom')

toc;

euclidean_coordinates = [best_coords(:,1),best_coords(:,2)];

figure(2)

axis equal

plot(decathlon_vec,best_distance_vec,'bo')

hold on

xlabel('Dissimilarities')
ylabel('Distances')

tic;

best_vaf = 0.0;

 store_vaf = zeros(100,1);

 best_disparities = zeros(n,n);

for k = 1:100

[coords,stress,disparities] = ...
mdscale(decathlon_dissimilarities,2,'Criterion',...
'sstress','Start',...
    'random','Replicates',1,'Options',opts);

n = size(coords,1);

distance_matrix = zeros(n,n);

for i = 1:n
    for j = 1:n

        distance_matrix(i,j) = ...
         sqrt(((coords(i,1) - coords(j,1))^2) + ...
            ((coords(i,2) - coords(j,2))^2));
    end
end


decathlon_vec = squareform(decathlon_dissimilarities);

distance_vec = squareform(distance_matrix);

r = corrcoef(decathlon_vec',distance_vec');

vaf = r(1,2)^2;

store_vaf(k) = vaf;

if(vaf > best_vaf)

    best_vaf = vaf;
    best_coords = coords;
    best_disparities = disparities;
    best_distance_vec = distance_vec;

end
end

store_vaf;

sorted_vafs = sort(store_vaf');

sorted_vafs
best_vaf
best_coords

figure(3)

plot(best_coords(:,1),best_coords(:,2),'ko')

% axis equal has to follow plot: with hold off, plot resets the axes
axis equal

hold on

for i = 1:n

    objectlabels{i,1} = int2str(i);

end

text(best_coords(:,1),best_coords(:,2),objectlabels,...
'fontsize',10,'verticalalignment','bottom')

toc;

euclidean_coordinates_nonmetric = ...
[best_coords(:,1),best_coords(:,2)];

best_disparities_vec = squareform(best_disparities);

best_distance_vec

best_disparities_vec

figure(4)

axis equal

[dum,ord] = sortrows([best_disparities_vec(:) decathlon_vec(:)]);

plot(decathlon_vec,best_distance_vec,'bo',...
decathlon_vec(ord),best_disparities_vec(ord),'r.-')

hold on

xlabel('Dissimilarities')
ylabel('Distance/Disparities')

legend({'Distances' 'Disparities'}, 'Location', 'NW')




[d,z,transform] = ...
procrustes(euclidean_coordinates_nonmetric,euclidean_coordinates);

figure(5)

plot(euclidean_coordinates_nonmetric(:,1),...
euclidean_coordinates_nonmetric(:,2),'rx',...
    euclidean_coordinates(:,1),...
    euclidean_coordinates(:,2),'b.',...
    z(:,1),z(:,2),'ko')

% axis equal has to follow plot: with hold off, plot resets the axes
axis equal

hold on

text(euclidean_coordinates_nonmetric(:,1),...
euclidean_coordinates_nonmetric(:,2),objectlabels,...
'fontsize',8,'verticalalignment','bottom')

text(z(:,1),z(:,2),objectlabels,'fontsize',8,...
'verticalalignment','bottom')

transform(1).b

transform(1).T

transform(1).c
\end{verbatim}
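
\bigskip

The double loop that fills \verb+distance_matrix+ in the script above can be replaced by a single call to \verb+pdist+ from the Statistics Toolbox, which returns the Euclidean distances in the same pair order that \verb+squareform+ produces. The sketch below checks this on hypothetical coordinates (not output from any MDS run).

\begin{verbatim}
% Hypothetical coordinates for three objects
coords = [0 0; 3 4; 6 8];

% Double loop, as in the script above
n = size(coords,1);
distance_matrix = zeros(n,n);
for i = 1:n
    for j = 1:n
        distance_matrix(i,j) = ...
            sqrt(sum((coords(i,:) - coords(j,:)).^2));
    end
end
loop_vec = squareform(distance_matrix);

% Vectorized alternative: pairs (1,2),(1,3),(2,3)
pdist_vec = pdist(coords);

% Largest discrepancy between the two versions
max(abs(loop_vec - pdist_vec))
\end{verbatim}

Note also that \verb+mdscale+ can carry out random restarts itself through its \verb+'Replicates'+ option, but it then keeps the solution with the best criterion value. The explicit loop in the script is what allows the solution with the best \verb+vaf+ to be kept, and all 100 values to be stored and inspected.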


\newpage

III.  The data matrix \verb+supreme_court_08_09.dat+ gives the number of (non-unanimous) cases (out of 53) that a given pair of Supreme Court justices \emph{dis}agreed on during the 08/09 court term.  Thus, the numbers can be treated as dissimilarities.  The order of the rows and columns is as follows:

\smallskip

1: Ginsburg

2: Souter

3: Breyer

4: Stevens

5: Kennedy

6: Roberts

7: Alito

8: Scalia

9: Thomas

\smallskip

\noindent Using the M-files \verb+order.m+, \verb+linfitac.m+, and \verb+ultrafnd.m+, evaluate whether a unidimensional scaling (i.e., a ``continuous'' model) or an ultrametric (a ``categorical'' model) gives a better fit.  Interpret the results of your analyses in terms of the political composition of the court in the 08/09 term.  For background reading, see Adam Liptak, \emph{Roberts Court Shifts Right, Tipped by Kennedy} (\emph{New York Times}, July 1, 2009). The contents of the file \verb+supreme_court_08_09.dat+:

\smallskip

\begin{verbatim}
  0 11 15 11 26 37 37 35 37
 11  0 18 10 27 34 39 32 34
 15 18  0 15 18 27 27 31 33
 11 10 15  0 32 39 41 41 43
 26 27 18 32  0 11 10 15 16
 37 34 27 39 11  0  6 10 14
 37 39 27 41 10  6  0 10 12
 35 32 31 41 15 10 10  0 10
 37 34 33 43 16 14 12 10  0
\end{verbatim}
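
\smallskip

Before fitting either model, it is worth confirming that the matrix was read in correctly. This sketch types the matrix in directly so that it is self-contained; the variable names \verb+D+ and \verb+d_vec+ are only illustrative. In practice, \verb+load supreme_court_08_09.dat+ should create a variable named after the file, as \verb+load decathlon.dat+ does in the script for II.

\begin{verbatim}
% Disagreement counts, rows/columns in the order listed above
D = [ 0 11 15 11 26 37 37 35 37;
     11  0 18 10 27 34 39 32 34;
     15 18  0 15 18 27 27 31 33;
     11 10 15  0 32 39 41 41 43;
     26 27 18 32  0 11 10 15 16;
     37 34 27 39 11  0  6 10 14;
     37 39 27 41 10  6  0 10 12;
     35 32 31 41 15 10 10  0 10;
     37 34 33 43 16 14 12 10  0];

% A proper dissimilarity matrix is symmetric with a zero diagonal
assert(isequal(D,D'))
assert(all(diag(D) == 0))

% No pair can disagree on more than the 53 cases considered
assert(max(D(:)) <= 53)

% Vector of the 9*8/2 = 36 distinct pairs
d_vec = squareform(D);
numel(d_vec)
\end{verbatim}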




\end{document}
