# scatter_matrix

Produces a scatter matrix with optional trend fitting,
histograms, and Gaussian kernel density estimation.

The x-limits and y-limits are set to be 120% of the range of the data.
The subplots are also configured to be square. The plot resolution increases
with the number of variables plotted.
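
As a rough illustration of the 120% rule, the sketch below pads a variable's data range so that the axis spans 1.2 times the range. The helper name `padded_limits` is hypothetical, and splitting the extra 20% evenly between the two ends is an assumption, not something confirmed from the library's source.

```python
def padded_limits(values, scale=1.2):
    """Return (low, high) axis limits spanning `scale` times the data range.

    Hypothetical helper: assumes the extra span is split evenly
    between the lower and upper ends of the axis.
    """
    lo, hi = min(values), max(values)
    pad = (hi - lo) * (scale - 1.0) / 2.0
    return lo - pad, hi + pad


# Made-up values: the data range is 10, so the axis spans about 12.
limits = padded_limits([10.0, 12.5, 20.0])
```

Applying the same helper to both the x and y variable of a subplot keeps every point away from the axes frame.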

## Example

The simplest example is to load the data into a `DataFrame` and
tell `scatter_matrix()` what factors you would like in the matrix.

The defaults will give you factor labels down the diagonal and perform
linear trend fitting for the subplots above the diagonal.

>>> df=DataFrame()
>>> df.read_tbl('data/iqbrainsize.txt', delimiter='\t')
>>> df.scatter_matrix(['CCSA','FIQ','TOTSA','TOTVOL'])

produces `scatter_matrix(CCSA_X_FIQ_X_TOTSA_X_TOTVOL).png`
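
For readers unfamiliar with linear trend fitting, here is a minimal sketch of an ordinary least-squares line through paired observations. Whether `scatter_matrix()` uses exactly this computation is an assumption; the function name `linear_trend` and the sample values are made up for illustration.

```python
def linear_trend(xs, ys):
    """Least-squares fit of y = slope * x + intercept (illustrative helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up data for two variables (not taken from iqbrainsize.txt).
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = linear_trend(xs, ys)

# End points of the trend line drawn across the observed x range.
trend_segment = [(x, slope * x + intercept) for x in (min(xs), max(xs))]
```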

## Example with diagonal='hist'

Specifying diagonal='hist' produces 20-bin histograms along the diagonal.
The y-axis labels do not relate to the frequency counts.

>>> df.scatter_matrix(['CCSA','FIQ','TOTSA','TOTVOL'],
...                   diagonal='hist')

produces `scatter_matrix(CCSA_X_FIQ_X_TOTSA_X_TOTVOL,diagonal=hist).png`
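
To make the 20-bin binning concrete, the sketch below counts values into 20 equal-width bins spanning the data's minimum and maximum. The helper `histogram_counts` is hypothetical, and the choice of equal-width bins over the observed range is an assumption about how the diagonal histograms are built.

```python
def histogram_counts(values, bins=20):
    """Count values into `bins` equal-width bins over [min, max] (illustrative)."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    width = (hi - lo) / bins
    if width == 0:
        # All values are identical: put everything in the first bin.
        counts[0] = len(values)
        return counts
    for v in values:
        # The maximum value falls in the last bin rather than a 21st one.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


# Made-up sample: the integers 0..99.
counts = histogram_counts(list(range(100)))
```

These counts give the bar heights only; as noted above, the y-axis tick labels on the diagonal do not correspond to them.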

## Example with diagonal='kde' and alternate_labels=False

Specifying diagonal='kde' produces kernel density estimation plots along
the diagonal. The y-axis labels do not relate to the density estimates.

Special care was taken to make sure the appropriate labels and ticks are
plotted regardless of the number of variables or number of plots specified.
Matrices with 5 or more variables can become hard to read with the
alternating ticks and labels. If you would like all the ticks on the
left and bottom and all the variable labels on the top and right, you
just need to specify alternate_labels=False.

>>> df.scatter_matrix(['CCSA','HC','FIQ','TOTSA','TOTVOL'],
...                   diagonal='kde', alternate_labels=False)

produces `scatter_matrix(CCSA_X_HC_X_FIQ_X_TOTSA_X_TOTVOL,diagonal=kde,alternate_labels=False).png`
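
For reference, a Gaussian kernel density estimate can be sketched in a few lines of plain Python. This is not the library's implementation: the helper name `gaussian_kde_fn`, the sample values, and the normal-reference rule of thumb used for the default bandwidth are all assumptions made for illustration.

```python
import math


def gaussian_kde_fn(samples, bandwidth=None):
    """Return a function estimating the density of `samples` (illustrative)."""
    n = len(samples)
    if bandwidth is None:
        mean = sum(samples) / n
        std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
        # Assumed default: normal-reference rule of thumb, h = 1.06 * std * n^(-1/5).
        bandwidth = 1.06 * std * n ** (-1.0 / 5.0)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        # Sum of Gaussian bumps centered on each sample.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Made-up sample values.
density = gaussian_kde_fn([1.0, 2.0, 2.5, 3.0, 4.5])

# Evaluate on a grid, as one would before drawing the curve on a diagonal cell.
grid = [i * 0.1 for i in range(0, 61)]
curve = [density(x) for x in grid]
```

The resulting curve integrates to approximately 1 over a wide enough interval, which is why its heights are not comparable to histogram frequency counts.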