Metadata V6

Revision as of 07:44, 2 December 2016 by Calypso


When using the V6 format, please set the format to "V6" when uploading your metadata file.

The meta data file must have at least the following 5 columns in exactly this order:

Format of the meta data file
Column | Name | Description
1 | Sample id | The identifier of each sample. Sample ids must match the sample ids of the uploaded data matrix.
2 | Label | A unique sample label. Labels are shown in generated figures instead of the sample id.
3 | Pair | The individual or animal the sample was taken from. This information is used for paired analysis if several samples were taken from the same individual, e.g. at different time points during a longitudinal study or at different locations from the same individual. Set Pair to a different id for every sample (e.g. P1, P2, P3, …) if each sample was taken from a different individual.
4 | Include flag | Takes the values 0 and 1 and indicates whether a sample is included in (1) or excluded from (0) the analysis. Use it to exclude samples from subsequent analysis without modifying the data matrix. For example, problematic samples can be excluded by setting their value to 0.
5, 6, 7, 8, 9, … | One or more explanatory variables | Treatment groups (e.g. case/control) or any other variable of interest (e.g. age, gender, pH, BMI, temperature).
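
The column rules above can be checked before uploading. The following is a minimal sketch, not part of the tool itself. It assumes a comma-separated file with one header row; the page does not state the delimiter, so adjust it if your file differs. The function name check_v6_metadata and the example values are hypothetical.

```python
import csv
import io

def check_v6_metadata(text, matrix_sample_ids):
    """Return a list of problems found in a V6 meta data table (empty if none).

    Assumptions (not stated on this page): comma-separated text, first row is a header.
    """
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]
    header, records = rows[0], rows[1:]
    if len(header) < 5:
        return ["fewer than 5 columns"]

    problems = []
    labels_seen = set()
    for line_no, rec in enumerate(records, start=2):
        if len(rec) != len(header):
            problems.append(f"line {line_no}: expected {len(header)} fields, got {len(rec)}")
            continue
        # Columns 1-4 are fixed: sample id, label, pair, include flag.
        sample_id, label, pair, include = rec[:4]
        if sample_id not in matrix_sample_ids:
            problems.append(f"line {line_no}: sample id {sample_id!r} not in data matrix")
        if label in labels_seen:
            problems.append(f"line {line_no}: duplicate label {label!r}")
        labels_seen.add(label)
        if include not in ("0", "1"):
            problems.append(f"line {line_no}: include flag must be 0 or 1, got {include!r}")
    return problems

# Hypothetical example: the last row breaks three rules at once.
example = """SampleID,Label,Pair,Include,Group,Age
S1,ctrl_a,P1,1,control,34
S2,case_a,P2,1,case,41
S3,case_a,P3,2,case,NA
"""
problems = check_v6_metadata(example, {"S1", "S2"})
for p in problems:
    print(p)
```

Here matrix_sample_ids stands for the set of sample ids taken from the uploaded data matrix; how that set is obtained depends on the matrix format and is left out of the sketch.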

Explanatory variables

Explanatory variables are used to examine complex environment-microbiome (or host-microbiome) associations.

Explanatory variables can be numeric or categorical. Example variables are sample groups (e.g. case/control), BMI, gender, age, tissue, batch, blood sugar level, temperature, time of sampling, sample location, or pH. Categorical variables must contain non-numeric characters (e.g. T1, T2, …) and must not be encoded numerically (e.g. 1, 2, 3, …). Numeric variables must not contain non-numeric values (e.g. "unknown"). Missing values must be given as NA.
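
As an illustration of these rules, the sketch below sorts the values of one explanatory variable into "numeric", "categorical" or "mixed", where "mixed" marks a column that breaks the rules. It is an assumption about how one might pre-check a file, not a description of how the tool reads it; the function names are hypothetical.

```python
def is_number(value):
    """True if the string parses as a number (note: float() also accepts 'inf' and 'nan')."""
    try:
        float(value)
        return True
    except ValueError:
        return False

def classify_variable(values):
    """Classify one explanatory variable column given as a list of strings."""
    present = [v for v in values if v != "NA"]  # NA marks a missing value
    if not present:
        return "empty"
    numeric = [is_number(v) for v in present]
    if all(numeric):
        return "numeric"
    if not any(numeric):
        return "categorical"
    # Some values are numbers and some are not, e.g. an age column containing "unknown".
    return "mixed"

print(classify_variable(["34", "41", "NA"]))     # numeric
print(classify_variable(["T1", "T2", "T1"]))     # categorical
print(classify_variable(["34", "unknown"]))      # mixed
print(classify_variable(["1", "2", "3"]))        # numeric
```

The last call shows why groups should not be coded as 1, 2, 3: such a column is indistinguishable from a numeric variable.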

Example: Assume a case/control study in which gender and age are potential confounding factors. To explore whether case/control status, controlled for gender and age, explains variation in microbial community composition, define case/control status, gender and age as explanatory variables in the meta data file. Then run a multivariate analysis, such as RDA++, CCA++ or multivariate regression. To identify correlations between taxa and case/control status, gender and age, run Network on the same data files.

Example meta data file in V6 format

[Figure: MetaDataV6.PNG, an example meta data file in V6 format]

Download an example meta data file in V6 format.