R Biplot Example
R/ggplot_pca.R
Produces a ggplot2 variant of a so-called biplot for PCA (principal component analysis) that is more flexible and more appealing than the base R biplot() function.
- R offers two functions for doing PCA, princomp() and prcomp(), and the results can be visualised with the biplot() function. However, the plots produced by biplot() are often hard to read, and the function lacks many of the options commonly available for customising plots.
- Gabriel (1971), in "The biplot graphic display of matrices with application to principal component analysis" (The Hebrew University, Jerusalem), showed that any matrix of rank two can be displayed as a biplot, which consists of a vector for each row and a vector for each column.
- As an example, take the iris dataset in R, specifically its first three variables (columns): Sepal.Length, Sepal.Width and Petal.Length. A biplot combines a loading plot (the unstandardized eigenvectors, specifically the first two loadings) with a score plot (the rotated and dilated data points plotted with respect to the principal components).
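As a rough illustration of the loading and score components described above, the following sketch uses only base R. The object names (iris_sub, pca_fit, loadings, scores, centred) are chosen here for illustration, and running prcomp() without scaling the variables is an assumption made to match the "unstandardized" wording.

```r
# PCA on the first three iris variables (base R only)
iris_sub <- iris[, c("Sepal.Length", "Sepal.Width", "Petal.Length")]
pca_fit <- prcomp(iris_sub, center = TRUE, scale. = FALSE)

# Loadings: the first two eigenvectors (one row per variable)
loadings <- pca_fit$rotation[, 1:2]

# Scores: the observations expressed in the first two principal components
scores <- pca_fit$x[, 1:2]

# The scores are the centred data rotated by the loadings
centred <- scale(iris_sub, center = TRUE, scale = FALSE)
print(all.equal(centred %*% loadings, scores, check.attributes = FALSE))

# Base R biplot: scores drawn as points/labels, loadings drawn as arrows
biplot(pca_fit, choices = 1:2)
```

The base biplot() call at the end produces the kind of plot that ggplot_pca() is meant to improve on.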
Arguments
Argument | Description |
---|---|
x | an object returned by |
choices | length 2 vector specifying the components to plot. Only the default is a biplot in the strict sense. |
scale | The variables are scaled by |
pc.biplot | If true, use what Gabriel (1971) refers to as a 'principal component biplot', with |
labels | an optional vector of labels for the observations. If set, the labels will be placed below their respective points. When using the |
labels_textsize | the size of the text used for the labels |
labels_text_placement | adjustment factor for the placement of the variable names ( |
groups | an optional vector of groups for the labels, with the same length as |
ellipse | a logical to indicate whether a normal data ellipse should be drawn for each group (set with |
ellipse_prob | statistical size of the ellipse in normal probability |
ellipse_size | the size of the ellipse line |
ellipse_alpha | the alpha (transparency) of the ellipse line |
points_size | the size of the points |
points_alpha | the alpha (transparency) of the points |
arrows | a logical to indicate whether arrows should be drawn |
arrows_colour | the colour of the arrows and their text |
arrows_size | the size (thickness) of the arrow lines |
arrows_textsize | the size of the text at the end of the arrows |
arrows_textangled | a logical to indicate whether the text at the end of the arrows should be angled |
arrows_alpha | the alpha (transparency) of the arrows and their text |
base_textsize | the text size for all plot elements except the labels and arrows |
... | Arguments passed on to functions |
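A minimal usage sketch of several of the arguments listed above. It rests on two assumptions that this page does not state: that ggplot_pca() is exported by the AMR package, and that x may be a prcomp object. The argument values are arbitrary, and the call is guarded so the snippet does nothing if the packages are not installed.

```r
# PCA on the four numeric iris columns, standardised
pca_fit <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)
groups <- iris$Species  # one group per observation

# Assumption: ggplot_pca() lives in the AMR package and accepts a prcomp object
if (requireNamespace("AMR", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  p <- AMR::ggplot_pca(
    pca_fit,
    choices = 1:2,            # plot the first two components
    groups = groups,          # colour points by species
    ellipse = TRUE,           # draw a normal data ellipse per group
    ellipse_prob = 0.68,      # statistical size of the ellipse
    points_alpha = 0.5,       # semi-transparent points
    arrows_colour = "darkblue",
    arrows_textangled = TRUE
  )
  print(p)
}
```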
Source
The ggplot_pca() function is based on the ggbiplot() function from the ggbiplot package by Vince Vu, as found on GitHub: https://github.com/vqv/ggbiplot (retrieved: 2 March 2020, their latest commit: 7325e88; 12 February 2015).
As per their GPL-2 licence that demands documentation of code changes, the changes made based on the source code were:
- Rewritten code to remove the dependency on packages plyr, scales and grid
- Parametrised more options, like arrow and ellipse settings
- Hardened all input possibilities by defining the exact type of user input for every argument
- Added total amount of explained variance as a caption in the plot
- Cleaned all syntax based on the lintr package, fixed grammatical errors and added integrity checks
- Updated documentation
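The "total amount of explained variance" mentioned in the list above can be derived from any prcomp object. The sketch below shows one way to compute it in base R. It is not the package's own code, and the caption wording and variable names are hypothetical.

```r
# PCA on the four numeric iris columns, standardised
pca_fit <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# Proportion of variance explained by each principal component
explained <- pca_fit$sdev^2 / sum(pca_fit$sdev^2)

# Total for the two components shown in a biplot
choices <- 1:2
total <- sum(explained[choices])

# Hypothetical caption text; the actual wording used by ggplot_pca() may differ
caption <- sprintf("Total explained variance: %.1f%%", 100 * total)
print(caption)
```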
Details
The colours for labels and points can be changed by adding another scale layer for colour, like scale_colour_viridis_d() or scale_colour_brewer().
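A short sketch of adding such a colour scale layer. As assumptions not stated on this page, ggplot_pca() is taken from the AMR package and is given a prcomp object. The "Set1" palette is an arbitrary choice, and the code is guarded so it is skipped if the packages are not installed.

```r
pca_fit <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

# Assumption: ggplot_pca() lives in the AMR package and returns a ggplot object
if (requireNamespace("AMR", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  p <- AMR::ggplot_pca(pca_fit, groups = iris$Species)

  # Swap the colours by adding a discrete colour scale layer
  p_viridis <- p + ggplot2::scale_colour_viridis_d()
  p_brewer <- p + ggplot2::scale_colour_brewer(palette = "Set1")

  print(p_viridis)
}
```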
Maturing Lifecycle
The lifecycle of this function is maturing. The underlying code of a maturing function has been roughed out, but finer details might still change. Since this function needs wider usage and more extensive testing, you are very welcome to suggest changes at our repository or write us an email (see section 'Contact Us').