Multivariate analysis
Last edited by Maya 3 years, 2 months ago
Overview table

| Analysis | Data | Principle | Functions (package ade4) | Comments | What for? |
| --- | --- | --- | --- | --- | --- |
| PCA (ACP): Principal Component Analysis (Analyse en Composantes Principales) | 1 dataframe (n*p); quantitative variables | Diagonalization; maximizing variance; on centered and scaled variables | dudi.pca() | n > p | Reduce the number of dimensions. Find uncorrelated variables. |
| Hill-Smith: Principal Component Analysis | 1 dataframe; mixed quantitative variables and factors | Diagonalization; maximizing variance? | dudi.hillsmith() | n > p | Mixture of PCA and MCA, to use both qualitative and quantitative data. |
| COA (AFC): Correspondence Analysis (Analyse Factorielle des Correspondances) | 1 dataframe (site*species); homogeneous quantitative data (contingency table) | Comparing profiles; works on relative abundances; « qualitative variations » | dudi.coa(); dudi.nsc() (non-symmetric correspondence analysis); dudi.dec() (decentered correspondence analysis) | n > p; expected counts must not be too low! | Relationship between two qualitative variables (contingency table); symmetric |
| MCA (ACM): Multiple Correspondence Analysis (Analyse des Correspondances Multiples) | 1 dataframe -> full disjunctive table; factors with several levels | How are the different factors « correlated »? | dudi.acm() | A factor weighs more when it has more levels! Not too many levels (expected counts must not be too low!) | Same as PCA, on qualitative data |
| Between & Within: PCA between and within groups | 2 dataframes: 1 quantitative table (X) and 1 qualitative vector (A) | Within: eliminate the effect of A on X (PCA on X minus the group means of A). Between: search for the effect of A on X (PCA on the group means of A); diagonalization; maximizing between-group variance | between() (bca() in recent ade4 releases) | | |
| LDA: Linear Discriminant Analysis | 2 dataframes: 1 quantitative table (X) and 1 qualitative vector (A) | Search for the effect of A; diagonalization; maximizing between-group variance / total variance | discrimin() (ade4); lda() (MASS) | lda can be projected | |
| RDA: Redundancy Analysis (Analyse sur variables instrumentales) | 2 tables: 1 site*species and 1 site*env | Multiple regression of each species on env -> PCA on the prediction table | pcaiv() | Corresponds to: Y <- fitted(lm(S ~ Env)), then dudi.pca(Y). Needs more sites than species? Hypothesis on the species response curve: linear | Correspondence between 2 tables: relation between 2 variables (e.g. species and env) using a third variable (e.g. site) |
| CCA: Canonical Correspondence Analysis (Analyse Canonique des Correspondances) | 2 tables: 1 site*species and 1 site*env | Same idea as COA (AFC) | cca() (built on dudi.coa()) | Hypothesis on the species response curve: unimodal | Correspondence between 2 tables: relation between 2 variables (e.g. species and env) using a third variable (e.g. site) |
| Co-inertia | 2 tables: 1 site*species and 1 site*env | Jointly maximizes the inertia of both tables and the correlation between them | | | |
| Niche / OMI: Outlying Mean Index | 2 tables: 1 site*species and 1 site*env | Maximizing the differences between species | | Hypothesis on the species response curve: no hypothesis | Niche analysis: gives tolerance and marginality. |
| Statis: k-tables | | | | | |
| PCoA: multidimensional scaling (positionnement multidimensionnel) | | | | | |
Basic graph functions
1 dataframe:
scatter()
s.corcircle()
s.label()
2 dataframes:
s.class()
plot()
Multi-dataframes:
plot(statis)
kplot()
Interface: ade4TkGUI & tkrplot
Tip
To keep labels from overlapping, use pointLabel from the maptools library (other similar functions exist):
s.label(pca$li, clab=0) ## a multivariate analysis plot
library(maptools)
pointLabel(pca$li, rownames(pca$li))
R code
### 1. Hill-Smith, PCA and (M)CA ###===============================
library(ade4)
## data
data(pap)
dat <- data.frame(famille = pap$taxo$famille, pap$tab)
head(dat)
## graphical exploration
example(pairs)  # running the examples of ?pairs defines panel.hist, used below
pairs(dat[,-1], diag.panel = panel.hist)
dat[which.max(dat$Group.Size),]  # species with the largest group size
## drop the atypical species Crocuta_crocuta and plot again
dat <- dat[-which(rownames(dat)=="Crocuta_crocuta"),]
pairs(dat[,-1], diag.panel = panel.hist)
### PCA on the quantitative variables
acp <- dudi.pca(dat[,2:5], scannf = FALSE, nf = 4)
plot(acp$eig, type="b")  # scree plot of the eigenvalues
inertia.dudi(acp, col=TRUE)  # contributions of the variables
inertia.dudi(acp, row=TRUE)  # contributions of the individuals
s.arrow(acp$co)
s.label(acp$li, clab=0.8, add.plot=TRUE)
scatter(acp)
s.class(acp$li, dat$famille)
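The "diagonalization, maximizing variance" principle behind a PCA on centered and scaled variables can be checked with base R alone. The sketch below is only an illustration: it uses the built-in iris data rather than the pap data, and the object names (Z, eig, scores) are hypothetical. As far as I understand, dudi.pca() with its default centering and scaling should return the same eigenvalues.

```r
# Normed PCA "by hand": diagonalize the correlation matrix of the variables.
X <- as.matrix(iris[, 1:4])          # built-in data, used only for illustration
Z <- scale(X)                        # center and scale each column
eig <- eigen(cor(X), symmetric = TRUE)

eigenvalues <- eig$values            # variance carried by each principal axis
scores <- Z %*% eig$vectors          # row coordinates on the principal axes

# The axes are uncorrelated and their variances are the eigenvalues.
round(cor(scores), 10)
round(apply(scores, 2, var), 10)
round(eigenvalues / sum(eigenvalues), 3)  # share of total variance per axis
```

The eigenvalues sum to the number of variables, which is why the scree plot of a normed PCA is usually read as a share of total variance.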
### Hill and Smith: on mixed variables (quantitative variables and factors)
hm <- dudi.mix(dat, scannf=FALSE, nf=ncol(dat))
plot(hm$eig, type="b")
inertia.dudi(hm, col=TRUE)
inertia.dudi(hm, row=TRUE)
s.arrow(hm$co)
s.label(hm$li, clab=0.8, add.plot=TRUE)
scatter(hm)
s.class(hm$li, dat$famille)
### COA (AFC) on contingency data (presence/absence, abundances)
## Compares the profiles of the different sites
## Relationship between two qualitative variables / symmetric
data(doubs)
table.value(doubs$fish, csize=0.5)
par(mfrow=c(2,2))
COA <- dudi.coa(doubs$fish, scannf = FALSE, nf = 2)  # scannf = FALSE avoids the interactive prompt
score(COA)
#sco.distri(COA$l1[,1], doubs$fish)
scatter(COA)
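Correspondence analysis compares row and column profiles, and its total inertia equals the chi-square statistic of the table divided by the grand total. The following base R sketch illustrates this on the built-in HairEyeColor counts (not the doubs data); all object names are hypothetical and the decomposition shown is the textbook one, not necessarily the internals of dudi.coa().

```r
# Correspondence analysis "by hand" on a small contingency table.
N <- apply(HairEyeColor, c(1, 2), sum)   # hair x eye counts (built-in data)
P <- N / sum(N)                          # relative frequencies
r <- rowSums(P)                          # row weights
cw <- colSums(P)                         # column weights

# Standardized deviations from independence, then singular value decomposition.
S <- (P - outer(r, cw)) / sqrt(outer(r, cw))
dec <- svd(S)
eigenvalues <- dec$d^2

# Principal coordinates of the rows.
row_scores <- sweep(dec$u, 1, sqrt(r), "/") %*% diag(dec$d)

# Total inertia equals the chi-square statistic divided by the grand total.
chi2 <- unname(chisq.test(N, correct = FALSE)$statistic)
total_inertia <- sum(eigenvalues)
c(total_inertia = total_inertia, chi2_over_n = chi2 / sum(N))
```

This link with the chi-square test is one way to read the "expected counts must not be too low" warning: cells with very small expected counts can dominate the inertia.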
### MCA (ACM): like a PCA but on qualitative data; find associations between factors, reduce the data set
data(ours)
## The ours (bears) data frame has 38 rows, areas of the "Inventaire National Forestier", and 10 columns.
summary(ours)
ACM <- dudi.acm(ours, scan = FALSE)
score(ACM)
scatter(ACM)
s.label(ACM$cr)
# boxplot(dudi.acm(ours, scan = FALSE))
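MCA operates on the full disjunctive (indicator) table, in which each factor is expanded into one 0/1 column per level. The sketch below builds such a table in base R from a small made-up data frame (the habitat and cover factors are hypothetical); ade4 provides acm.disjonctif() for the same purpose.

```r
# Build the full disjunctive (indicator) table from a data frame of factors.
df <- data.frame(
  habitat = factor(c("forest", "meadow", "forest", "river")),
  cover   = factor(c("low", "high", "high", "low"))
)

disjunctive <- do.call(cbind, lapply(names(df), function(v) {
  f <- df[[v]]
  # one 0/1 column per level of the factor
  ind <- sapply(levels(f), function(l) as.integer(f == l))
  colnames(ind) <- paste(v, levels(f), sep = ".")
  ind
}))

disjunctive
rowSums(disjunctive)   # every row sums to the number of factors
colSums(disjunctive)   # number of observations per level
```

A factor with many levels contributes many columns, which is one way to see why it weighs more in the analysis.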
### 2. Discriminant analysis ###===============================
# Discriminant analysis looks for a combination of the variables that separates the species.
library(ade4)
## data
data(iris)
expl <- iris[,1:4]
classes <- iris[,5]
## exploratory plot
example(pairs)
pairs(expl, col = c("red", "green3", "blue")[as.numeric(classes)])
# PCA on the explanatory variables
acpExpl <- dudi.pca(expl, scannf = FALSE, nf = ncol(expl))
plot(acpExpl$eig, type="b")
inertia.dudi(acpExpl, col=TRUE)
s.arrow(acpExpl$co)
s.label(acpExpl$li, clab=0.5, add.plot=TRUE)
scatter(acpExpl)  # biplot of individuals and variables
### discriminant analysis (or lda in the MASS library)
AD <- discrimin(acpExpl, classes, scannf = FALSE, nf = nlevels(classes)-1)
AD
plot(AD)
s.class(acpExpl$li, classes)
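The overview table also lists lda() from MASS and notes that an lda "can be projected". A minimal sketch of that alternative, on the same iris data, could look as follows; the new_flower measurements are made up for illustration.

```r
library(MASS)  # recommended package, shipped with standard R distributions

fit <- lda(Species ~ ., data = iris)
fit$scaling                       # coefficients of the discriminant axes
pred <- predict(fit, iris)        # scores (pred$x) and predicted classes (pred$class)

# Resubstitution accuracy (optimistic: same data used to fit and to predict).
accuracy <- mean(pred$class == iris$Species)
accuracy

# New observations can be projected on the discriminant axes.
new_flower <- data.frame(Sepal.Length = 5.9, Sepal.Width = 3.0,
                         Petal.Length = 4.2, Petal.Width = 1.3)
predict(fit, new_flower)$class
```

With three classes there are at most two discriminant axes, which matches nf = nlevels(classes)-1 in the discrimin() call.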
### Between- and within-group PCA:
# between: find the effect of a factor on a quantitative table
# Amounts to a PCA on the group means
# Written with the names used by recent ade4 releases; older releases use between()
# and meaudret$mil, meaudret$fau, meaudret$plan$sta
data(meaudret)
pca1 <- dudi.pca(meaudret$env, scan = FALSE, nf = 4)
pca2 <- dudi.pca(meaudret$spe, scal = FALSE, scan = FALSE, nf = 4)
par(mfrow = c(2,2))
s.class(pca1$li, meaudret$design$site, sub = "Principal Component Analysis (env)", csub = 1.75)
s.class(pca2$li, meaudret$design$site, sub = "Principal Component Analysis (spe)", csub = 1.75)
bet1 <- bca(pca1, meaudret$design$site, scan = FALSE, nf = 2)
bet2 <- bca(pca2, meaudret$design$site, scan = FALSE, nf = 2)
s.class(bet1$ls, meaudret$design$site, sub = "Between sites PCA (env)", csub = 1.75)
s.class(bet2$ls, meaudret$design$site, sub = "Between sites PCA (spe)", csub = 1.75)
coib <- coinertia(bet1, bet2, scann = FALSE)
par(mfrow = c(1,1))
plot(coib)
# within: find the structure of the table once the effect of the factor is removed
# amounts to a PCA on the data centered, within each group, on the group mean
# Written with the names used by recent ade4 releases; older releases use within()
# and meaudret$mil, meaudret$plan$sta
data(meaudret)
par(mfrow = c(2,2))
pca1 <- dudi.pca(meaudret$env, scan = FALSE, nf = 4)
s.traject(pca1$li, meaudret$design$site, sub = "Principal Component Analysis", csub = 1.5)
wit1 <- wca(pca1, meaudret$design$site, scan = FALSE, nf = 2)
s.traject(wit1$li, meaudret$design$site, sub = "Within site Principal Component Analysis", csub = 1.5)
s.corcircle(wit1$as)
par(mfrow = c(1,1))
plot(wit1)
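Both analyses rest on splitting the centered table into a table of group means (between) and a table of deviations from those means (within); the total inertia is the sum of the two parts. The base R sketch below illustrates the decomposition on the built-in iris data rather than meaudret, with hypothetical object names and uniform row weights.

```r
# Between/within decomposition of a quantitative table X by a factor A.
X <- as.matrix(iris[, 1:4])   # built-in data, used only for illustration
A <- iris$Species

Xc <- scale(X, center = TRUE, scale = FALSE)             # centered table
group_means <- apply(Xc, 2, function(col) ave(col, A))   # each row replaced by its group mean
within_part <- Xc - group_means                          # deviations from the group means

inertia <- function(M) sum(M^2) / nrow(M)                # inertia with uniform row weights
total_inertia <- inertia(Xc)
between_inertia <- inertia(group_means)
within_inertia <- inertia(within_part)

c(total = total_inertia, between = between_inertia, within = within_inertia,
  ratio = between_inertia / total_inertia)   # share of inertia explained by the factor

# A between-group PCA diagonalizes the table of group means:
# with g groups there are at most g - 1 non-zero eigenvalues.
between_eig <- eigen(crossprod(group_means) / nrow(X), symmetric = TRUE)$values
round(between_eig, 6)
```

The ratio of between to total inertia is the quantity a between-group analysis tries to summarize on a few axes.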
### 3. Coupling two tables ###===============================
library(ade4)
## data
data(doubs)
str(doubs)
?doubs
## PCA on the environment
env <- doubs$env   # site x environment table (doubs$mil in older ade4 releases)
com <- doubs$fish  # site x species table, used as `com` in the coupling analyses
head(env)
# Variables (codes as given in these notes; recent ade4 releases may use different column names):
# das - distance to the source (km * 10),
# alt - altitude (m),
# pen - log(x + 1), where x is the slope (per mil * 100),
# deb - minimum average flow (m3/s * 100),
# pH (* 10),
# dur - total hardness of water (mg/l of calcium),
# pho - phosphates (mg/l * 100),
# nit - nitrates (mg/l * 100),
# amm - ammonia nitrogen (mg/l * 100),
# oxy - dissolved oxygen (mg/l * 10),
# dbo - biological oxygen demand (mg/l * 10).
## Data exploration, looking for atypical points
example(pairs)  # defines panel.hist
pairs(env, diag.panel=panel.hist)
boxplot(scale(env))
env[which.max(env$pho),]
env1 <- env[-which.max(env$pho),]
pairs(env1, diag.panel=panel.hist)
boxplot(scale(env1))
## PCA
acpEnv <- dudi.pca(env, scannf=FALSE, nf=ncol(env))
plot(acpEnv$eig, type="b")
inertia.dudi(acpEnv, col=TRUE)
inertia.dudi(acpEnv, row=TRUE)
s.arrow(3* acpEnv$co)
s.label(acpEnv$li, clab=0.5, add.plot=TRUE)
## Coupling two tables
## CCA (assumes a unimodal response of species to the environment)
CCA <- cca(com, env, scannf=FALSE, nf=ncol(env))
plot(CCA$eig, type="b")
inertia.dudi(CCA, col=TRUE)
inertia.dudi(CCA, row=TRUE)
s.arrow(3* CCA$co)
s.label(CCA$li, clab=0.5, add.plot=TRUE)
plot(CCA)
## niches of all species along one axis
axe <- 1
sco.distri(CCA$ls[,axe], com)
## niche of one species on the first factorial plane
esp <- colnames(com)[1]  # first species column ("CHA" in these notes)
s.distri(CCA$ls[,1:2], com[,esp], sub=esp)
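## RDA (assumes a linear response of species to the environment)
## pcaiv(dudi, df): dudi is the analysis of the table being explained, df holds the explanatory variables.
## As written, the environment (acpEnv) is explained by the species table; to explain species
## by the environment, pass a dudi of com first and env second.
RDA <- pcaiv(acpEnv, com, scannf=FALSE, nf=ncol(com))
plot(RDA$eig, type="b")
inertia.dudi(RDA, col=TRUE)
inertia.dudi(RDA, row=TRUE)
s.arrow(3* RDA$co)
s.label(RDA$li, clab=0.5, add.plot=TRUE)
plot(RDA)
## niches of all species along one axis
axe <- 1
sco.distri(RDA$ls[,axe], com)
## niche of one species on the first factorial plane
esp <- colnames(com)[1]  # first species column ("CHA" in these notes)
s.distri(RDA$ls[,1:2], com[,esp], sub=esp)
s.arrow(3* RDA$co, clab=0.8, add.plot=TRUE)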
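The overview table describes RDA as a multiple regression of each species on the environment followed by a PCA of the prediction table. The base R sketch below walks through those two steps on synthetic data (the env and spe matrices, their column names and the coefficients in beta are all made up); it illustrates the principle and is not a substitute for pcaiv().

```r
set.seed(1)
n <- 30
env <- cbind(temp = rnorm(n), depth = rnorm(n))          # synthetic site x env table
beta <- matrix(c(1, -1,  0.5, 0, 2,
                 0,  1, -0.5, 1, 0), nrow = 2, byrow = TRUE)
spe <- env %*% beta + matrix(rnorm(n * 5, sd = 0.3), n)  # synthetic site x species table
colnames(spe) <- paste0("sp", 1:5)

fit <- lm(spe ~ env)     # one multiple regression per species
Y <- fitted(fit)         # prediction table

pca_fit <- prcomp(Y, center = TRUE, scale. = FALSE)      # PCA of the predictions

constrained <- sum(pca_fit$sdev^2)     # variance explained by the environment
total <- sum(apply(spe, 2, var))       # total variance of the species table
constrained / total                    # constrained share of the variance
round(pca_fit$sdev^2, 6)               # at most ncol(env) non-zero axes
```

Because the predictions are linear combinations of the environmental variables, the number of constrained axes cannot exceed the number of explanatory variables.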
## OMI (co-inertia analysis) (no assumption on the shape of the species response to the environment)
OMI <- niche(acpEnv, com, scannf=FALSE, nf=ncol(com))
plot(OMI$eig, type="b")
inertia.dudi(OMI, col=TRUE)
inertia.dudi(OMI, row=TRUE)
s.arrow(3* OMI$co)
s.label(OMI$li, clab=0.5, add.plot=TRUE)
plot(OMI)
## niches of all species along one axis
axe <- 1
sco.distri(OMI$ls[,axe], com)
## niche of one species on the first factorial plane
esp <- colnames(com)[1]  # first species column ("CHA" in these notes)
s.distri(OMI$ls[,1:2], com[,esp], sub=esp)
s.arrow(3* OMI$co, clab=0.8, add.plot=TRUE)