MsCompare: Comparative Analysis for BioMarker Discovery and Metabonomics

Metabolic profiling of humans, animals and plants is increasingly important for understanding disease and for monitoring the effects of drug treatment, nutritional regimes and toxicity. Metabolomics is the study of plant metabolites; it is useful for varietal studies and for assessing the effects of growing conditions or genetic modification. Metabonomics is the study of animal or human metabolism in body fluids such as urine. Identifying and quantifying changes in proteins related to disease or disease-modifying agents is part of Proteomics.

In all of the above research fields, Mass Spectrometry (LC/MS) plays an increasingly important role because of its high sensitivity, rapid analysis and ease of identification using accurate mass. The MsCompare® software offers a large collection of tools, algorithms and visualization techniques for what can broadly be called Comparative Analysis.

MsCompare offers all of the operations needed for processing LC/MS datasets and finding statistically significant differences between groups of samples:

MsCompare Features

Supported Vendor Formats

  • Currently supported LC/MS and GC/MS data formats: Thermo Xcalibur, Waters MassLynx, Bruker, Agilent, AB Sciex, mzXML, mzData and NetCDF.

Data Pre-Processing Tools

  • If needed, data can be pre-processed in different ways: Smoothing of Extracted Ion Currents, Baseline Correction, De-Isotoping, De-Spiking, etc.
  • Normalization: MsCompare offers visualization tools to help decide whether normalization of your data is necessary. The available normalization types are Area TIC, Range Scaling, Unit Vector Scaling, normalization based on selected Reference Peak(s), and import of user-defined normalization constants from external methods.
  • Creation of Groups and Classes: easily define your experiments and create multiple groups. All visualization tools in MsCompare offer coloring of individual samples based on group assignment.
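
To make the normalization options more concrete, here is a minimal Python sketch of three of them: Area TIC, Unit Vector Scaling and Reference Peak normalization. It is an illustration only and does not reproduce MsCompare's own code; the function names and the representation of a sample as a plain list of peak areas are assumptions made for this example.

```python
import math

# Hypothetical helpers for illustration; not part of MsCompare.

def normalize_tic(areas):
    """Area TIC: divide every peak area by the sample's summed area."""
    total = sum(areas)
    if total == 0:
        raise ValueError("cannot normalize a sample whose total area is zero")
    return [a / total for a in areas]

def normalize_unit_vector(areas):
    """Unit Vector Scaling: divide by the Euclidean length of the sample."""
    length = math.sqrt(sum(a * a for a in areas))
    if length == 0:
        raise ValueError("cannot normalize an all-zero sample")
    return [a / length for a in areas]

def normalize_reference_peak(areas, ref_index):
    """Reference Peak: express every area relative to one chosen peak."""
    ref = areas[ref_index]
    if ref == 0:
        raise ValueError("reference peak has zero area in this sample")
    return [a / ref for a in areas]
```

After Area TIC normalization the areas of each sample sum to 1, so samples injected at different concentrations become comparable. Reference Peak normalization instead assumes that the chosen peak, for example an internal standard, is constant across samples.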


Alignment of chromatograms is necessary when the shifts between peaks in different samples are moderate or severe. Depending on the complexity of the samples, alignment can be very difficult and often gives less than optimal results. MsCompare offers four different alignment algorithms, which can be used individually or sequentially, as well as a manual procedure.

  • Fixed Off-Set Correction: To be used in simple situations or as a first step for the other alignment algorithms.
  • Cross Correlation: Linear shift correction based on automatic Cross Correlation.
  • COW: Correlation Optimized Warping. In many cases COW produces satisfactory results for chromatograms that contain a small to moderate number of peaks (5-100).
  • Reference Peak Warping: In many cases the chromatograms (TIC/BPC) contain too many peaks to get good results using COW. In these cases use manual or automatic Reference Peak Warping. When used in “Auto Mode” the algorithm finds suitable reference peaks that are present in all samples. Alignment is subsequently applied to the selected reference peaks.
  • Manual Alignment: If you have between 2 and about 75 samples, you can use a graphical procedure to perform alignment.
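
The idea behind a linear shift correction based on cross correlation can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not MsCompare's implementation: chromatograms are plain lists of intensities on a common scan axis, the shift is a whole number of scans, and the names `best_shift` and `apply_shift` are hypothetical.

```python
# Hypothetical helpers for illustration; not part of MsCompare.

def best_shift(reference, sample, max_shift):
    """Return the lag (in scans) at which sample best overlaps reference.

    A positive lag means the sample's peaks elute later than the
    reference's peaks by that many scans.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_shift, max_shift + 1):
        score = 0.0
        for i, ref_value in enumerate(reference):
            j = i + lag
            if 0 <= j < len(sample):
                score += ref_value * sample[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def apply_shift(sample, lag, fill=0.0):
    """Move the sample by -lag scans so it lines up with the reference."""
    n = len(sample)
    return [sample[i + lag] if 0 <= i + lag < n else fill for i in range(n)]

reference = [0, 0, 1, 5, 1, 0, 0, 0]
sample = [0, 0, 0, 0, 1, 5, 1, 0]      # same peak, two scans later
lag = best_shift(reference, sample, max_shift=3)
aligned = apply_shift(sample, lag)
```

A single lag corrects only a constant offset. When the shift varies along the retention-time axis, warping methods such as COW or Reference Peak Warping are the appropriate tools.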

Comparative Analysis

  • BioMarker Surface Maps: Find unique, up-regulated or down-regulated features by using the BioMarker Surface Map. The map is easily converted to a table in which all differential peaks are displayed.
  • Peak Matching: Run Optimized Peak Picking and filtering at any sensitivity and accuracy on one selected sample, e.g. a reference or pooled sample. Subsequently find the results for all other samples based on the selected Peak Picking list.
  • Peak Picking: Run Peak Picking on all samples individually. Cluster peaks into groups and create a final result table to be used for further statistical or multivariate analysis.
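
As an illustration of how individually picked peaks might be clustered into groups and turned into a result table, consider the Python sketch below. It uses a simple greedy grouping by m/z and retention-time tolerance. The tolerances, the tuple layout `(sample, mz, rt, area)` and the function names are assumptions for this example, and the clustering method MsCompare actually uses may differ.

```python
# Hypothetical helpers for illustration; not part of MsCompare.

def cluster_peaks(peaks, mz_tol=0.01, rt_tol=0.2):
    """Greedily group peaks whose m/z and retention time agree.

    peaks: iterable of (sample, mz, rt, area) tuples.
    Returns a list of clusters, each with a running mean m/z and rt.
    """
    clusters = []
    for peak in sorted(peaks, key=lambda p: (p[1], p[2])):
        _sample, mz, rt, _area = peak
        for c in clusters:
            if abs(c["mz"] - mz) <= mz_tol and abs(c["rt"] - rt) <= rt_tol:
                c["members"].append(peak)
                n = len(c["members"])
                c["mz"] += (mz - c["mz"]) / n   # update running means
                c["rt"] += (rt - c["rt"]) / n
                break
        else:
            clusters.append({"mz": mz, "rt": rt, "members": [peak]})
    return clusters

def result_table(clusters, samples):
    """One row per cluster: (mz, rt, [area per sample]); 0.0 if missing."""
    table = []
    for c in clusters:
        areas = {s: 0.0 for s in samples}
        for sample, _mz, _rt, area in c["members"]:
            areas[sample] += area
        table.append((round(c["mz"], 4), round(c["rt"], 2),
                      [areas[s] for s in samples]))
    return table

peaks = [
    ("A", 180.063, 5.10, 100.0),
    ("B", 180.064, 5.15, 120.0),
    ("A", 255.232, 9.80, 40.0),
]
table = result_table(cluster_peaks(peaks), samples=["A", "B"])
```

In this made-up example the first two peaks fall into one cluster, while the third forms a cluster that is missing in sample B and therefore gets an area of 0.0 there.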

Statistical and Multivariate Analysis

Once you have created a result table, you can apply univariate or multivariate methods to find significant and relevant peaks that differ between samples from different groups.

Univariate Tools:

  • P- and t-statistics, Ratio Group Statistics, Uniqueness Values (find peaks completely absent in one of the groups), Fisher Z-statistics, Intensity-weighted ratios. View all statistics in one table for all groups.
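
For a single peak, some of these univariate measures can be sketched in Python as follows. The example assumes Welch's unequal-variance form of the t-statistic and a simple "absent in one group" rule for uniqueness. These choices and the function names are assumptions for illustration, and the exact definitions used by MsCompare may differ.

```python
import math
from statistics import mean, variance

# Hypothetical helpers for illustration; not part of MsCompare.

def welch_t(group_a, group_b):
    """Welch's t-statistic for one peak measured in two sample groups."""
    se = math.sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    if se == 0:
        raise ValueError("both groups have zero variance")
    return (mean(group_a) - mean(group_b)) / se

def group_ratio(group_a, group_b):
    """Ratio of group means; infinite if the peak is absent from group B."""
    mean_b = mean(group_b)
    return math.inf if mean_b == 0 else mean(group_a) / mean_b

def is_unique(group_a, group_b, threshold=0.0):
    """True if the peak is absent from exactly one of the two groups."""
    present_a = any(x > threshold for x in group_a)
    present_b = any(x > threshold for x in group_b)
    return present_a != present_b

treated = [10.0, 12.0, 14.0]   # made-up peak areas
control = [4.0, 5.0, 6.0]
t_value = welch_t(treated, control)
ratio = group_ratio(treated, control)
```

Turning the t-statistic into a P-value additionally requires the cumulative t-distribution, which the Python standard library does not provide. A statistics package would normally supply it.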

Multivariate Tools:

  • Principal Component Analysis (PCA) on the result table, TIC/BPC, Mass Spectra etc. Produces interactive Scores and Loading plots. Selected samples or variables are visualized in different plots (overlay of EIC, Concentration Profiles etc.)
  • Partial Least Squares (PLS) and PLS-DA using interactive visualization tools.
  • Storing and loading of PCA and PLS models.
  • Extended Canonical Variates Analysis (ECVA)
  • Support Vector Machines (SVM).
  • Hierarchical Clustering of samples and variables.
  • Correlation Maps.
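
To show what the Scores and Loadings of a PCA on a result table represent, the sketch below computes the first principal component with power iteration in plain Python. It is a teaching example under stated assumptions, not MsCompare's algorithm: rows are samples, columns are peaks, only the first component is extracted, and the function name is hypothetical.

```python
import math

# Hypothetical helper for illustration; not part of MsCompare.

def first_principal_component(rows, iterations=200):
    """Return (loadings, scores) of the first principal component.

    rows: list of samples, each a list of peak intensities.
    Uses power iteration on the covariance matrix of the
    mean-centered data.
    """
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    centered = [[r[j] - means[j] for j in range(p)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1)
            for j in range(p)] for i in range(p)]
    # Caveat: a start vector exactly orthogonal to the leading
    # direction would not converge to it.
    v = [1.0] * p
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    scores = [sum(c[j] * v[j] for j in range(p)) for c in centered]
    return v, scores

table = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]   # made-up result table
loadings, scores = first_principal_component(table)
```

The loadings indicate how strongly each peak contributes to the component, and the scores place each sample along it. Samples from different groups that separate in the scores point to the peaks with large loadings as candidate markers.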