Statistical Analysis

Robust statistical techniques are often required to reveal the information contained in a multi-dimensional data set. They are also used to confirm the statistical significance of observed trends.

Data summarisation

All factors and responses can be summarised to give basic statistics such as the minimum, maximum, standard deviation and confidence interval. These statistics can also be displayed graphically as a box plot.

Understanding variability

It is crucial that you understand the extent and sources of variability within your data, as this will affect the level of confidence in any model built upon it. We use statistical tools that allow the nested analysis and quantification of batch-to-batch variation. This understanding will allow you to focus more experimental time and resources on understanding and minimising the major sources of your variability.

Sampling strategies

To make the most efficient use of experiments, you should create a sampling strategy based on the experimental variability.

Numerical models and trends (MLR, PLS)

For a given block of data, numerical models can be built and trends analysed. A numerical model will allow you to find the optimum conditions and response. It will also predict results for areas where there is no experimental data. These numerical models are analysed statistically against the known variability to provide confidence and avoid over-fitting. Techniques used include:
- Multiple Linear Regression (MLR)
- Partial Least Squares (PLS)
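
As an illustration of the first technique in the list, the sketch below fits a multiple linear regression by ordinary least squares using only the Python standard library, then estimates predictive error by leave-one-out cross-validation. The cross-validation step is an assumption chosen here as one common guard against over-fitting; it is not necessarily the check described above. The function names and the small noise-free data set are hypothetical and exist only for the example.

```python
from typing import List, Sequence


def solve_linear_system(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular (collinear factors?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def fit_mlr(factors: Sequence[Sequence[float]], response: Sequence[float]) -> List[float]:
    """Return [intercept, b1, b2, ...] from the normal equations X'X b = X'y."""
    design = [[1.0] + [float(v) for v in row] for row in factors]
    p = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(p)] for i in range(p)]
    xty = [sum(row[i] * y for row, y in zip(design, response)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict(coeffs: Sequence[float], row: Sequence[float]) -> float:
    """Evaluate the fitted linear model at one set of factor values."""
    return coeffs[0] + sum(b * x for b, x in zip(coeffs[1:], row))


def loo_rmse(factors: List[List[float]], response: List[float]) -> float:
    """Leave-one-out RMSE: refit without each point, then predict that point."""
    squared_errors = []
    for i in range(len(response)):
        train_x = factors[:i] + factors[i + 1:]
        train_y = response[:i] + response[i + 1:]
        coeffs_i = fit_mlr(train_x, train_y)
        squared_errors.append((response[i] - predict(coeffs_i, factors[i])) ** 2)
    return (sum(squared_errors) / len(squared_errors)) ** 0.5


# Hypothetical, noise-free data generated from y = 2 + 3*x1 - 1*x2.
factors = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 5.0], [0.0, 1.0]]
response = [3.0, 7.0, 7.0, 11.0, 12.0, 1.0]

coeffs = fit_mlr(factors, response)
print("coefficients:", [round(c, 6) for c in coeffs])
print("leave-one-out RMSE:", round(loo_rmse(factors, response), 6))
```

With real data, a leave-one-out error much larger than the fitting error would suggest that the model is following noise rather than a genuine trend. Comparing that error with the known experimental variability gives a rough sense of how far the predictions can be trusted.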

Diversity and Domain of Applicability

It is important to understand the domain of applicability for any model that is built. Similarly, it is important to take account of the diversity of your data. Outliers do not build good numerical models.

Data clustering (PCA, Hierarchical, K-Means, Self-Organising Maps)

One route to simplifying data analysis is to cluster data. This clustering can also reveal patterns and trends in the data.
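
To make the clustering idea concrete, here is a minimal sketch of K-Means, one of the clustering methods named above, written with only the Python standard library. The data points, the choice of two clusters and the fixed starting centroids are all hypothetical assumptions made for the example. Practical analyses would normally use a tested library implementation with several random starts.

```python
from math import dist
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def k_means(
    points: Sequence[Point],
    initial_centroids: Sequence[Point],
    max_iter: int = 100,
) -> Tuple[List[int], List[Point]]:
    """Alternate assignment and centroid updates until the centroids stop moving."""
    centroids = [tuple(c) for c in initial_centroids]
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[k])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical two-factor data with two visually separate groups.
points = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8), (5.0, 5.2), (5.1, 4.9), (4.8, 5.0)]

# Deterministic start (an assumption): seed with one point from each group.
labels, centroids = k_means(points, [points[0], points[3]])
print("labels:", labels)
print("centroids:", centroids)
```

Each label indicates the cluster a data point was assigned to. Points that sit far from every centroid are worth examining as possible outliers before a numerical model is built.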