Shape analysis with MATLAB
The data set and functions used on this page are in this file: shape.zip.
All one needs to do is unzip the file into a new directory, open MATLAB, change directory in MATLAB to the new directory, then the following should work.
Notes
- I have documented some of these functions within the m-files, and this
can be accessed in MATLAB with the
help
command:help OPA
,help GPA
, etc. - A configuration of k landmarks in m-D space is represented by an m-by-k matrix - the alternative convention to the one Ian Dryden used in his R shapes package - and a sample of n configurations is represented by an m-by-k-by-n matrix.
- The example dataset was kindly provided by Charlie Laughton.
Loading and plotting data
First to load some data:
load moleculeData.mat;
This puts a 3-by-22-by-250 array called
moleculeData
into the workspace. Data can be
visualised with plotShape
(this function works for arrays of
n ≥ 1 and
with m=2 or m=3). For example, to visualise the first configuration:
plotShape(moleculeData(:,:,1));
There is a rotate button (a box with an arrow around it) on the figure toolbar. Selecting this then left-clicking somewhere in the figure and and dragging allows the plot to be rotated. To superimpose the second configuration, this time plotted with circles:
hold on; plotShape(moleculeData(:,:,2),'o');
The shape of the two configurations is similar, but they are orientated and centred differently. Plotting all the configurations together using plotShape(moleculeData)
reveals a big mess!
Registering configurations
Registering one configuration to another can be done with Ordinary Procrustes analysis (function OPA
). Registering the second to the first:
secondRegToFirst = OPA(moleculeData(:,:,1),moleculeData(:,:,2));
and plotting again
figure;
plotShape(moleculeData(:,:,1));
hold on;
plotShape(secondRegToFirst,'o');
shows a better comparison. Note that OPA
by default
performs partial Procrustes analysis (i.e. no scaling); full Procrustes analysis
is performed if true
is entered as a third argument of
OPA
.
Registering a whole sample of configurations as closely as possible to each other
can be done with generalised Procrustes analysis (function GPA
):
moleculeData_GPA = GPA(moleculeData);
figure; plotShape(moleculeData_GPA);
Again, GPA
defaults to partial Procrustes analysis; full
Procrustes analysis is performed with the call
GPA(moleculeData,true)
.
Mean shapes
One way to define mean shape is as the arithmetic mean of the GPA-registered configurations
moleculeData_GPAmean = mean(GPA(moleculeData),3);
An alternative is the multidimensional-scaling mean shape. When k is large the MDS mean shape is much faster to calculate than the GPA mean and approximates it well for large, non-highly-dispersed samples. The MDS mean is:
moleculeData_MDSmean = MDSmean(moleculeData);
Comparing the two means for our moleculeData
(including
registration with OPA
),
figure; plotShape(moleculeData_GPAmean); hold on;
plotShape(OPA(moleculeData_GPAmean,moleculeData_MDSmean),'o');
shows with these data that the means are nearly identical.
(For determining confidence regions for the mean shape, and for testing equality of mean shape for two samples, I will add here soon a description of how to use our bootstrap code)
Shape variation: principal component analysis
shapePCA
shows how shape varies along principal
components; e.g.
shapePCA(moleculeData);
shows shape variation at +/- 3 standard deviations long the first 4 PCs. The figure titles show information such as how much of the variation is explained in the respective PC. The axes properties are locked together so that rotating one figure rotates all of them.
The second argument can be used to specify the PCs to be considered, and the third argument to specify the number of SDs for which to show variation; e.g.
shapePCA(moleculeData,[3 5],4);
shows the 3rd and 5th PCs plotted at +/- 4 SDs.
With a fourth argument equal to true
one can also plot an
animation of the variation along a chosen PC:
shapePCA(moleculeData,1,3,true);
Currently this can be done only one PC at a time.
Other simple stuff - removing location and scale
A configuration (or sample of configurations) can be centred with
centre
and Helmertised with helm
. Helmertised
configurations can be de-Helmertised with dehelm
. They can be
normalised to have unit Frobenius norm with unitFromNorm
.
Hence
unitFrobNorm(helm(moleculeData));
are the Helmertised pre-shapes and
unitFrobNorm(centre(moleculeData));
are the centred pre-shapes.
Note that dehelm(helm(X))
gives the same result as
centre(X)
.