Why ReLU networks yield high-confidence predictions far away from the training data and how to mitig...

Available items

The developer of this repository has not created any items for sale yet. Need a bug fixed? Help with integration? A different license? **Create a request here**:

Readme

**CVPR 2019 (accepted with an oral)**

**Matthias Hein, Maksym Andriushchenko, Julian Bitterwolf**

**University of Tübingen and Saarland University**

https://arxiv.org/abs/1812.05720

ReLU-networks yield overconfident predictions on unrelated tasks. Here a ResNet-18 was trained on CIFAR-10 and evaluated on SVHN perceives digits as dogs, birds or airplanes with 100% confidence. Clearly, this questions the usage of neural networks for safety-critical tasks.

We can clearly see this on the two moons dataset: the classifier outputs 100% confidence almost everywhere, no matter how far away the points are from the training data.

We give a theoretical argument of why ReLU activation function can lead to models with overconfident predictions far away from the training data. We leverage the fact that a ReLU network partitions the input space on the finite set of polytopes.

The intuition is made formal in the main theorem.

In order to mitigate this problem, we propose a training scheme **CEDA** that enforces uniform confidences on
out-of-distribution noise shown below.

We can see that CEDA already improves on the problem by preventing overconfident predictions.

Next we propose a robust optimization training scheme **ACET** which minimizes the worst-case noise in
a neighbourhood of noise points. This works even better.

Although, as predicted by Theorem 1, overconfident predictions still exist since this is an inherent property of ReLU-networks regardless of the training procedure. If we zoom out from 0, 1 to [-10, 10], we can clearly see that overconfident predictions are still there.

We provide a systematic comparison of **ACET** over plain and **CEDA** models on various benchmarks: from evaluation on
noise to out-of-distribution detection on other image datasets.

We also illustrate the advantage of **ACET** on per-image basis. For example, if we train a plain model on MNIST and evaluate
it on CIFAR-10, it is also prone to overconfident predictions (up to 99.6%) on images that do not have anything in common
with digits.

However, **ACET** mitigates this problem to a large extent.

Finally, we illustrate Theorem 3.1 experimentally. We observe that simple alpha-scaling (i.e. brightness
increase) of, e.g., uniform noise can easily lead to overconfident predictions for plain models
even in the image domain [0, 1]^d. At the same time, **ACET** models helps to mitigate this problem as well.

toy2d.ipynbis a very short notebook that shows how to train an ACET model on two moons dataset. For full-fledged code on how to train CNNs with ACET on multiple GPUs with tensorboard summaries, please take a look onto

train.py.

Models are created and trained by using the

train.pyscript. Get the list of possible arguments via

python3 train.py --help.

The value of

at_fracdetermines the number of noise examples (on which the maximal confidence should be small) per clean training example. Thus, if

--at_frac=0, a

--at_frac=1, which means 50% clean and 50% (adversarial) noise images.

The value

pgd_niterdetermines how many steps of pgd will be applied to the noise. If its value is zero, the first random step will not be applied and just the noise will be used -- this is the correct setting for

For **ACET**, the parameters

at_frac,

pgd_niterand

pgd_epsshould be positive. In the paper, we used

pgd_niter=40and

pgd_eps=0.3. While for plain and

pgd_epsis not used, you can set it to compare their TensorBoard evaluations. As an example, for training the

python3 train.py --exp_name='acet_svhn' --gpus=0 --n_epochs=100 --dataset=svhn --model=resnet_small --loss='max_conf' --opt=momentum --lr=0.1 --lmbd=0.0005 --pgd_eps=0.3 --at_frac=1 --pgd_niter=40

You can find all the models presented in Table 1 in the folder

exps_paper. The file names of the models contain the hyperparameters used for their training. For example:

2019-04-04 16:27:49_dataset=cifar100 model=resnet_small p_norm=inf lmbd=0.0005 at_frac=1.0 pgd_eps=0.3 pgd_niter=40 frac_perm=0.5 loss=max_conf

means that the model was trained on CIFAR-100, the architecture was

resnet_small(see

models.ResNetSmallfor its definition),

at_frac=1.0) wrt the Linf-norm

eps=0.3, where the 40 steps of PGD were used in the robust optimization procedure. The loss function used in PGD was the maximum log probability taken over all classes.

frac_perm=0.5means that the noise is generated with 50% uniform and 50% permutation noise.

Evaluation is split into two parts: Collecting the model outputs on several datasets, including adversarial noise and adversarial samples, and analyzing the output numbers.

First, run

evaluation.pywith the chosen hyperparameter settings for the evaluation as well as the model folder and name as arguments. This will save the predictions in numpy files within the model folder. For calculating the resulting statistics and plotting histograms and ROC curves, call

analysis.pyon that same folder. The resulting numbers are appended to a specified

.csvfile in the main folder, while you can find the plots as

.pngfiles within the model folder.

python3 evaluation.py --model_folder "{model_folder}" --model_name "{model_name}" --model "{model_type}" --dataset "{model_set} --gpus=1 --at_frac=1 --pgd_niter=80 --pgd_eps "{eps}" --pgd_step "{step}" --loss='max_conf'

python3 analysis.py --model_folder "{model_folder}" --save_all_evals='results.csv'

During training, several metrics are recorded as TensorBoard logs, which can be reviewed with

tensorboard --logdir=exps.

Our code runs on

Python3.5. For the computations we use

tensorflow:1.10.1and dataloading is running with PyTorch version

0.4.1and

torchvision.

With Python3.5 installed, you can create a docker container that has all necessary dependencies installed by using the provided Dockerfile.

Please contact Maksym Andriushchenko or Julian Bitterwolf regarding this code.

@article{hein2019relu, title={Why ReLU networks yield high-confidence predictions far away from the training data and how to mitigate the problem}, author={Hein, Matthias and Andriushchenko, Maksym and Bitterwolf, Julian}, conference={CVPR 2019}, year={2019} }