som_normalize
Purpose
Add/apply/redo normalization on data structs/sets.
Syntax
sS = som_normalize(sS)
sS = som_normalize(sS,method)
D = som_normalize(D,sNorm)
sS = som_normalize(sS,csNorm)
sS = som_normalize(...,comps)
Description
This function is used to (initialize and) add, redo and apply
normalizations on data/map structs/sets. If a data/map struct is given,
the specified normalizations are added to the '.comp_norm' field of the
struct after ensuring that all normalizations already specified there
have status 'done'. SOM_NORMALIZE delegates the actual normalization
operations to SOM_NORM_VARIABLE and itself handles only the bookkeeping
that is specific to data structs and data sets.
The different normalization methods are listed below. For more
detailed descriptions, see SOM_NORM_VARIABLE.
  method      description
  'var'       Variance is normalized to one (linear operation).
  'range'     Values are scaled to the range [0,1] (linear operation).
  'log'       Natural logarithm is applied to the values:
                xnew = log(x-m+1)
              where m = min(x).
  'logistic'  Logistic or softmax transformation which scales all
              possible values to the range [0,1].
  'histD'     Histogram equalization, values scaled to the range [0,1].
  'histC'     Approximate histogram equalization with partially
              linear operations. Values scaled to the range [0,1].
  'eval'      Freeform operations.
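The simpler methods in the table can be sketched in plain MATLAB for a single variable. This is only an illustration of the formulas, not the toolbox code. The mean shift in 'var' and the exact form of 'logistic' are assumptions here; SOM_NORM_VARIABLE is the authoritative description.

```matlab
% Illustrative only: plain MATLAB, not calls into the SOM Toolbox.
x = [2; 4; 4; 6; 9];                 % one variable (column vector)

% 'var': scale to unit variance (the shift to zero mean is an assumption)
x_var = (x - mean(x)) / std(x);

% 'range': map [min(x), max(x)] linearly onto [0,1]
x_range = (x - min(x)) / (max(x) - min(x));

% 'log': xnew = log(x-m+1) with m = min(x), so the smallest value maps to 0
m = min(x);
x_log = log(x - m + 1);

% 'logistic' (assumed form): logistic function of the 'var'-scaled values
x_logistic = 1 ./ (1 + exp(-x_var));
```

The 'var' and 'range' results depend linearly on x, while 'log' and 'logistic' do not, which is why only the first two are marked as linear operations in the table.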
To make it possible to undo a normalization and to apply exactly the
same normalization to other data sets, the normalization information is
saved in a normalization struct, which has the fields:
  .type    ; struct type, ='som_norm'
  .method  ; normalization method, a string
  .params  ; normalization parameters
  .status  ; string: 'uninit', 'undone' or 'done'
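As a rough sketch of why the parameters are stored, the following plain MATLAB code builds a normalization struct by hand for the 'range' method, applies it to new data and undoes it again. The field names come from the list above, but the layout of '.params' (offset and scale) is an assumption made for this illustration, not the toolbox's documented format.

```matlab
% Hypothetical, hand-built normalization struct; the layout of .params
% ([offset, scale]) is an assumption made for this sketch.
x_train = [10; 20; 30; 50];
sN = struct('type','som_norm', 'method','range', ...
            'params',[min(x_train), max(x_train) - min(x_train)], ...
            'status','done');

% Apply the stored normalization to a new data set
x_new    = [10; 30; 70];
x_scaled = (x_new - sN.params(1)) / sN.params(2);

% Undo it using the same stored parameters
x_back = x_scaled * sN.params(2) + sN.params(1);
```

Because the parameters come from the original data, values in the new data set may fall outside [0,1] (here 70 maps to 1.5).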
Normalizations are always one-variable operations. In the data and map
structs the normalization information for each component is saved in the
'.comp_norm' field, which is a cell array of length dim. Each cell
contains the normalizations for one vector component as a struct array
of normalization structs. Each component may have a different number of
normalizations, of different kinds. Typically, all normalizations are
either 'undone' or 'done', but in special situations this may not be the
case. The easiest way to check the status of the normalizations is to
use function SOM_INFO, e.g. som_info(sS,3).
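The layout of the '.comp_norm' field can be pictured with a small hand-built example. The sketch below uses plain MATLAB; the helper mk and the particular contents are hypothetical and serve only to show a cell array in which each cell holds a struct array of normalization structs.

```matlab
% Hypothetical '.comp_norm' contents for a 3-dimensional data set.
% mk is an illustrative helper, not a toolbox function.
mk = @(method, status) struct('type','som_norm', 'method',method, ...
                              'params',[], 'status',status);
comp_norm = cell(3,1);
comp_norm{1} = mk('var','done');                          % one normalization
comp_norm{2} = [mk('log','done'), mk('range','undone')];  % two, in order
comp_norm{3} = [];                                        % none

% Number of normalizations per component
n_norms = cellfun(@numel, comp_norm);

% True for components whose normalizations all have status 'done'
all_done = cellfun(@(c) all(arrayfun(@(s) strcmp(s.status,'done'), c)), ...
                   comp_norm);
```

Here the second component is in the mixed state mentioned above: one of its normalizations is 'done' and the other is 'undone'.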
Required input arguments
  sS                The data to which the normalization is applied.
           (struct) Data or map struct. Before adding any new
                    normalizations, it is ensured that the
                    normalizations for the specified components in the
                    '.comp_norm' field have status 'done'.
           (matrix) Data matrix.
Optional input arguments
  method            The normalization(s) to add/use. If missing,
                    or if an empty variable ('' or []) is given, the
                    normalizations in the data struct are used.
           (string) Identifier for a normalization method to be added:
                    'var', 'range', 'log', 'logistic', 'histD' or 'histC'.
                    The same method is applied to all specified
                    components (given in comps). The normalizations are
                    first initialized (separately for each component)
                    and then applied.
           (struct) Normalization struct, or an array of structs, which
                    is applied to all specified components. If the
                    '.status' field of the struct(s) is 'uninit',
                    the normalization(s) is initialized first.
                    Alternatively, the struct may be a map or data
                    struct, in which case its '.comp_norm' field is used
                    (see the cell array option below).
       (cell array) In practice, the '.comp_norm' field of
                    a data/map struct. The length of the array
                    must be equal to the dimension of the given
                    data set (sS). Each cell contains the
                    normalization(s) for one component. Only the
                    normalizations of the components listed in the
                    comps argument are applied, though.
    (cellstr array) Norm and denorm operations given as strings, which
                    are evaluated with the EVAL command. The variable
                    name 'x' is reserved for the variable being
                    normalized.
  comps    (vector) The components to which the normalization(s) is
                    applied. Default is to apply to all components.
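The interplay of the comps argument and the freeform EVAL operations can be illustrated in plain MATLAB. This is a sketch of the mechanism only: the operation strings and their {norm, denorm} ordering are hypothetical, and the exact format expected by SOM_NORM_VARIABLE is not specified here.

```matlab
% Illustration of the idea only. The operation strings and their format
% are hypothetical, not the documented input of SOM_NORM_VARIABLE.
D     = [1 4 100; 2 9 200; 3 16 300];   % 3 samples, 3 components
comps = [2 3];                          % touch only these columns
ops   = {'sqrt(x)', 'x.^2'};            % {norm, denorm} expressions in x

% Normalize the selected components; the others are left as they are
Dn = D;
for c = comps
    x = D(:,c);                         % 'x' is the name the expressions use
    Dn(:,c) = eval(ops{1});
end

% Denormalize the same components with the inverse expression
Dback = Dn;
for c = comps
    x = Dn(:,c);
    Dback(:,c) = eval(ops{2});
end
```

Component 1 is not listed in comps, so its column is identical in D, Dn and Dback.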
Output arguments
  sS                Modified and/or updated data.
           (struct) If a struct was given as input argument, the
                    same struct is returned with normalized data and
                    updated '.comp_norm' fields.
           (matrix) If a matrix was given as input argument, the
                    normalized data matrix is returned.
Examples
To add (initialize and apply) a normalization to a data struct:
  sS = som_normalize(sS,'var');
This applies the 'var' method to all components. To add a method only to
a few selected components, use the comps argument:
  sS = som_normalize(sS,'log',[1 3:5]);
To ensure that all normalization operations have indeed been done:
  sS = som_normalize(sS);
The same for only a few components:
  sS = som_normalize(sS,'',[1 3:5]);
To apply the normalizations of a data struct sS to a new data set D:
  D = som_normalize(D,sS);
or
  D = som_normalize(D,sS.comp_norm);
To normalize a data set given as a plain matrix:
  D = som_normalize(D,'histD');
Note that in this case the normalization information is lost, because
a matrix has no '.comp_norm' field in which to store it.
To check the status of the normalizations in a struct, use SOM_INFO:
  som_info(sS,3)
See also