MSc in OR and MSc in Management Sciences

Graduate Training Programme

————————————————————————————

Computer Intensive Analysis of Data and Models

The course comprises three main components:

(i) Working Notes: These contain the material covered in the lectures, which has a strong practical emphasis.

(ii) Worked Examples: These are spreadsheet examples. They are embedded in the Working Notes.

(iii) Technical Notes: These cover the theoretical foundations of bootstrapping.

————————————————————————————

The Working Notes cover the following:

I Introduction

1. Introduction

2. Statistical Metamodels

II Classical Methods

3. Random Variables

4. Fitting Parametric Distributions to Random Samples; Input Modelling

5. Maximum Likelihood Estimation

6. Accuracy of MLEs

III Computer Intensive Methods

7. Empirical Distribution Functions

8. Basic Bootstrap Method

9. Evaluating the Distribution of MLEs by Bootstrapping

10. Comparing Samples Using the Basic Bootstrap

11. The Parametric Bootstrap

12. Goodness of Fit Testing

12.1 Classical Goodness of Fit

12.2 Bootstrapping a GOF Statistic

13. Comparison of Different Models; Model Selection

14. Final Comments

You can access the Working Notes by clicking on the links given below. They are meant to be worked through.

They contain Examples and Exercises. These illustrate the topic or method being discussed. They are an essential part of the text and need to be studied.

Many of the Examples and Exercises come with their own link. Some of these links lead to additional notes and more detailed formulas; the others lead to spreadsheets containing data and the worked details using the data.

Some of the initial spreadsheets contain elementary exercises connected with generating random variables and simple sampling experiments. The point of these exercises is to remind you of the basic formulas and functions that you will need for the more complicated later examples. You should already be familiar with this material. However, you may wish to spend a short time checking that you know it well.

The other spreadsheets contain more substantial problems.

These are solved using VBA macros, which carry out the heavier calculations and more extensive analyses. The macros are fairly generic, in that they need only minor adjustment to solve other similar problems.

The main reason for using such macros is to demonstrate that the structure of many problems follows a similar pattern, depending on the solution of a limited number of standard problems.

You are expected to follow the working of the macros in sufficient detail to appreciate this, and to be able to make the minor changes needed to solve similar problems.

I have tried to make the macros transparent and relatively easy to modify.
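As an illustration of what "generic" means here, the following is a minimal sketch in Python rather than VBA. It is not one of the course macros: the function names bootstrap and percentile_interval, the simple rank-based interval, and the data values are all assumptions made up for this illustration. The point is only that the resampling loop stays the same, and switching from one problem to a similar one amounts to passing in a different statistic.

```python
import random
import statistics


def bootstrap(sample, statistic, n_boot=1000, seed=0):
    """Return n_boot bootstrap replicates of statistic(sample).

    Hypothetical helper for illustration; not taken from the course macros.
    """
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    n = len(sample)
    replicates = []
    for _ in range(n_boot):
        # Resample n observations from the original sample, with replacement
        resample = [sample[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(resample))
    return replicates


def percentile_interval(replicates, alpha=0.10):
    """Crude rank-based percentile interval from the sorted replicates."""
    ordered = sorted(replicates)
    lower = ordered[int((alpha / 2) * len(ordered))]
    upper = ordered[int((1 - alpha / 2) * len(ordered)) - 1]
    return lower, upper


# Made-up data, purely for illustration
data = [2.1, 3.4, 1.9, 5.6, 4.2, 3.3, 2.8, 6.1, 3.9, 4.4]

# Same routine, two different problems: only the statistic changes
median_reps = bootstrap(data, statistics.median)
mean_reps = bootstrap(data, statistics.mean)

print("90% interval for the median:", percentile_interval(median_reps))
print("90% interval for the mean:  ", percentile_interval(mean_reps))
```

Adapting the sketch to another statistic (a trimmed mean, say) would mean writing one small function and passing it to bootstrap, which is the kind of minor change referred to above.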

In the spreadsheets, the following convention for cells is used:

Cells with a Yellow background - Headings, Incidental Information

Cells with a Green background - Input Information used in calculations on that Sheet

Intermediate Results and Calculations are not usually coloured.

————————————————————————————

The Technical Notes cover the following:

1. The Bootstrap

The Bootstrap Concept

Basic Method

The Double Bootstrap and Bias Correction

Parametric Bootstrap

2. Percentiles and Confidence Intervals

Percentiles

Confidence Intervals by Direct Bootstrapping

Studentization

Percentile Methods

3. Theory

Convergence Rates

Asymptotic Accuracy of EDFs

Asymptotic Accuracy of Confidence Intervals

Failure of Bootstrapping

4. Monte-Carlo/Simulation Models

Direct Models

Metamodels

Linear Metamodels

Nonlinear Metamodels

Uses of Metamodels

Metamodel Comparison and Selection

5. Bootstrap Comparisons

Goodness-of-Fit and Validation

Comparison of Different Systems

6. Bayesian Models

7. Time Series Output

Residual Sampling

Block Sampling

Spectral Resampling

8. Final Comment

————————————————————————————

Links

• Working Notes: Part I

• Working Notes: Part II

• Working Notes: Part III

• Technical Notes

References are at the end of the Technical Notes (some references are also given at the end of Part III of the Working Notes).

————————————————————————————

Synopsis of Lectures

Lecture #1

W1. Introduction

W2. Statistical Metamodels

Traffic Queue Length EG

Moroccan TB Data

Vaso Constriction Data

W3. Random Variables

W4. Fitting Parametric Distributions to Random Samples; Input Modelling

Normal Var Generator

Gamma Var Generator

Lecture #2

W5. Maximum Likelihood Estimation

W6. Accuracy of MLEs

Gamma MLE

Regression Fit Morocco Data

Vaso Constriction Data

Lecture #3

W7. Empirical Distribution Functions

W8. Basic Bootstrap Method

Bootstrap Median

W9. Evaluating the Distribution of MLEs by Bootstrapping

Gamma Bootstrap

Vaso Constriction Data

Lab #1

Examine examples of Lectures 1, 2 and 3

Lecture #4

T1 The Bootstrap

T1.1 The Bootstrap Concept

T1.2 Basic Method

T1.3 The Double Bootstrap and Bias Correction

T2 Percentiles

Lecture #5

W11. The Parametric Bootstrap

ParametricBS-GammaEG

T4.2 Metamodels

T4.3 Linear Metamodels

T4.4 Nonlinear Metamodels

Lab #2

Examine examples of Lectures 3 and 5

Fit a suitable model to the Traffic Queue and Cortisol Assay Data

Lecture #6

W12. Goodness of Fit Testing

Gamma Fit Toll Booth Data

Normal Fit Toll Booth Data

T5.1 Goodness-of-Fit and Validation

Lecture #7

W10. Comparing Samples Using the Basic Bootstrap

Law and Kelton EG

W13. Comparison of Different Models; Model Selection

Cement Data

T4.6 Metamodel Comparison and Selection

Lab #3

Examine examples of Lectures 6 and 7

Particular data sets you may wish to consider are:

(i) ANOVA analysis of Tyre Wear Data

(ii) An analysis of Component Lifetimes. For this problem there are some accompanying notes that you should read first: Component lifetime Notes

Lecture #8

T3 Theory

T3.1 Convergence Rates

T3.2 Asymptotic Accuracy of EDFs

T3.3 Asymptotic Accuracy of Confidence Intervals

T3.4 Failure of Bootstrapping

W14. Final Comments