# Machine Learning Workflow, from Andrew Ng's Lecture at Deep Learning Summer School 2016
This document attempts to summarize Andrew Ng's recommended machine learning workflow from his "Nuts and Bolts of Applying Deep Learning" talk at Deep Learning Summer School 2016. Any errors or misinterpretations are my own.
The real goal of measuring human-level performance is to estimate the Bayes error rate. Knowing your Bayes error rate helps you figure out whether your model is underfitting or overfitting your training data. More specifically, it lets you measure 'bias' (as Ng defines it), which is used later in the workflow.
Ng recommends a Train / Dev / Test split of approximately 70% / 15% / 15%.
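A minimal sketch of such a split, assuming the dataset fits in a Python list (`train_dev_test_split` is a hypothetical helper, not from the talk):

```python
import random

def train_dev_test_split(examples, train_frac=0.70, dev_frac=0.15, seed=0):
    """Shuffle and split examples into roughly 70% / 15% / 15%."""
    items = list(examples)
    random.Random(seed).shuffle(items)  # deterministic shuffle for reproducibility
    n_train = int(len(items) * train_frac)
    n_dev = int(len(items) * dev_frac)
    return (items[:n_train],
            items[n_train:n_train + n_dev],
            items[n_train + n_dev:])

train, dev, test = train_dev_test_split(range(1000))  # 700 / 150 / 150 examples
```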
Calculate your bias and variance as:

* Bias = (Training Set Error) - (Human Error)
* Variance = (Dev Set Error) - (Training Set Error)
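These two formulas are straightforward subtractions; a minimal sketch (error rates in percent, function name is mine, not from the talk):

```python
def bias_and_variance(human_err, train_err, dev_err):
    """Bias and variance as Ng defines them, from error rates in percent."""
    bias = train_err - human_err    # gap between training error and (estimated) Bayes error
    variance = dev_err - train_err  # generalization gap from training set to dev set
    return bias, variance

# For the high-bias example below: bias_and_variance(1, 5, 6) gives bias 4, variance 1
```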
An example of high bias:
Error Type | Error Rate |
---|---|
Human Error | 1% |
Training Set Error | 5% |
Dev Set Error | 6% |
Fix high bias before going on to the next step.
An example of high variance:
Error Type | Error Rate |
---|---|
Human Error | 1% |
Training Set Error | 2% |
Dev Set Error | 6% |
Once you fix your high variance, you're done!
If your train and test data come from different distributions, make sure at least your dev and test sets are from the same distribution. You can do this by taking your test set and using half as dev and half as test.
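A minimal sketch of that split (hypothetical helper, assuming the test examples are in a list):

```python
import random

def split_dev_test(test_examples, seed=0):
    """Split the original test set in half: one half becomes dev, one half test.

    Both halves then come from the same (test) distribution.
    """
    items = list(test_examples)
    random.Random(seed).shuffle(items)
    mid = len(items) // 2
    return items[:mid], items[mid:]

dev, test = split_dev_test(range(200))  # 100 dev examples, 100 test examples
```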
Carve out a small portion of your training set (call this Train-Dev) and split your test data into Dev and Test:

```
|---------------------------------|-----------------------|
|      Train (Distribution 1)     | Test (Distribution 2) |
|---------------------------------|-----------------------|
|        Train        | Train-Dev |    Dev    |   Test    |
|---------------------------------|-----------------------|
```
#### 2. Measure Your Errors, and Calculate the Relevant Metrics

Calculate these metrics to help you know where to focus your efforts:
Error Type | Formula |
---|---|
Bias | (Training Error) - (Human Error) |
Variance | (Train-Dev Error) - (Training Error) |
Train/Test Mismatch | (Dev Error) - (Train-Dev Error) |
Overfitting of Dev | (Test Error) - (Dev Error) |
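All four metrics are differences of the five measured error rates; a quick sketch (error rates in percent, the dictionary keys are mine, not from the talk):

```python
def error_metrics(human, train, train_dev, dev, test):
    """Compute Ng's four diagnostic metrics from error rates (in percent)."""
    return {
        "bias": train - human,
        "variance": train_dev - train,
        "train/test mismatch": dev - train_dev,
        "dev overfitting": test - dev,
    }

# e.g. for the train/test-mismatch example below: error_metrics(1, 2, 2.1, 10, 10.1)
```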
An example of high bias:
Error Type | Error Rate |
---|---|
Human Error | 1% |
Training Set Error | 10% |
Train-Dev Set Error | 10.1% |
Dev Set Error | 10.2% |
Fix high bias before going on to the next step.
An example of high variance:
Error Type | Error Rate |
---|---|
Human Error | 1% |
Training Set Error | 2% |
Train-Dev Set Error | 10.1% |
Dev Set Error | 10.2% |
Fix your high variance before going on to the next step.
An example of train/test mismatch:
Error Type | Error Rate |
---|---|
Human Error | 1% |
Training Set Error | 2% |
Train-Dev Set Error | 2.1% |
Dev Set Error | 10% |
Fix your train/test mismatch before going on to the next step.
An example of overfitting your dev set:
Error Type | Error Rate |
---|---|
Human Error | 1% |
Training Set Error | 2% |
Train-Dev Set Error | 2.1% |
Dev Set Error | 2.2% |
Test Error | 10% |
Once you fix your dev set overfitting, you're done!
Ng suggests these ways of fixing a model with high bias:

* Train a bigger model
* Train longer
* Try a new model architecture
Ng suggests these ways of fixing a model with high variance:

* Add more data
* Add regularization
* Try a new model architecture
Ng suggests these ways of fixing a model with high train/test mismatch:

* Make your training data more similar to your test data (presumably this would include data synthesis and data augmentation as well)
* Try a new model architecture

Ng suggests only one way of fixing dev set overfitting:

* Get more dev set data
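Putting the workflow together: a sketch that walks the metrics in the order above and reports the first problem to fix (the threshold and remedy strings are my own illustration, not from the talk):

```python
def next_action(human, train, train_dev, dev, test, threshold=1.0):
    """Return the first workflow problem whose metric exceeds the threshold."""
    checks = [
        ("high bias", train - human,
         "bigger model / train longer / new architecture"),
        ("high variance", train_dev - train,
         "more data / regularization / new architecture"),
        ("train/test mismatch", dev - train_dev,
         "make training data more similar to test data"),
        ("dev overfitting", test - dev,
         "get more dev set data"),
    ]
    for name, metric, remedy in checks:
        if metric > threshold:
            return name, remedy
    return "done", "no major problem detected"

# With the train/test-mismatch example's error rates: next_action(1, 2, 2.1, 10, 10.2)
```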