# pytorch-hessian-eigenthings

The `hessian-eigenthings` module provides an efficient (and scalable!) way to compute the eigendecomposition of the Hessian for an arbitrary PyTorch model. It uses PyTorch's Hessian-vector product and your choice of (a) the Lanczos method or (b) stochastic power iteration with deflation to compute the top eigenvalues and eigenvectors of the Hessian.

## Why use this?

The eigenvalues and eigenvectors of the Hessian have been implicated in many generalization properties of neural networks. For example, many people hypothesize that "flat minima" with lower eigenvalues generalize better, that the Hessians of large models are very low-rank, and that certain optimization algorithms may lead to flatter or sharper minima. However, computing and storing the full Hessian requires memory that is quadratic in the number of parameters, which is infeasible for anything but toy problems.

Iterative methods like Lanczos and power iteration can be used to find the eigendecomposition of arbitrary linear operators given access to a matrix-vector multiplication function. The Hessian-vector product (HVP) is the matrix-vector multiplication between the Hessian and an arbitrary vector v. It can be computed with linear memory usage by taking the derivative of the inner product between the gradient and v. So this library combines the Hessian-vector product computation with these iterative methods to compute the eigendecomposition without the quadratic memory bottleneck.
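To make the memory argument concrete, here is a minimal sketch of a Hessian-vector product via double backpropagation. The tiny linear model and random data below are illustrative stand-ins, not the library's API; the point is that `Hv` is obtained by differentiating the scalar inner product between the gradient and `v`, so the full Hessian is never materialized.

```python
import torch

# Stand-in model and data for illustration.
model = torch.nn.Linear(3, 1)
x = torch.randn(8, 3)
y = torch.randn(8, 1)
loss = torch.nn.functional.mse_loss(model(x), y)

params = [p for p in model.parameters() if p.requires_grad]
# First backward pass: keep the graph so we can differentiate the gradient.
grads = torch.autograd.grad(loss, params, create_graph=True)

# Arbitrary vector v, stored with the same shapes as the parameters.
v = [torch.randn_like(p) for p in params]

# d/dp <grad, v> = H v, computed in linear memory.
grad_dot_v = sum((g * vi).sum() for g, vi in zip(grads, v))
hvp = torch.autograd.grad(grad_dot_v, params)
```

Each entry of `hvp` has the same shape as the corresponding parameter tensor, so the cost is one extra backward pass rather than quadratic storage.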

You can use this library for Hessian-vector product computation, the more general eigendecomposition routines for linear operators, or the conjunction of the two for Hessian spectrum analysis.

## Installation

For now, you have to install from this repo. It's a tiny thing, so why put it on PyPI?

`pip install --upgrade git+https://github.com/noahgolmant/[email protected]#egg=hessian-eigenthings`

## Usage

The main function you're probably interested in is `compute_hessian_eigenthings`. Sample usage looks like this:
```python
import torch
from hessian_eigenthings import compute_hessian_eigenthings

model = ResNet18()
dataloader = ...
loss = torch.nn.functional.cross_entropy
num_eigenthings = 20  # compute top 20 eigenvalues/eigenvectors

eigenvals, eigenvecs = compute_hessian_eigenthings(
    model, dataloader, loss, num_eigenthings
)
```

This also includes a more general power-iteration-with-deflation implementation in `power_iter.py`. `lanczos.py` calls a `scipy` hook to a battle-tested ARPACK implementation.
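To illustrate the deflation idea, here is a toy sketch of power iteration with deflation on a small explicit symmetric matrix. This is not the library's implementation: `power_iter.py` applies the same scheme with stochastic Hessian-vector products in place of the dense `A @ v` below, and the matrix and helper names here are made up for the example.

```python
import torch

torch.manual_seed(0)
# Toy symmetric matrix with known eigenvalues 5, 2, 1.
A = torch.diag(torch.tensor([5.0, 2.0, 1.0]))

def top_eig(matvec, dim, steps=100):
    """Plain power iteration on an operator given only as a matvec."""
    v = torch.randn(dim)
    v /= v.norm()
    for _ in range(steps):
        v = matvec(v)
        v /= v.norm()
    # Rayleigh quotient gives the eigenvalue estimate.
    return (v @ matvec(v)).item(), v

eigvals, eigvecs = [], []
for _ in range(2):
    # Deflation: subtract the components of previously found eigenpairs,
    # so power iteration converges to the next-largest eigenvalue.
    def matvec(v, found=list(zip(eigvals, eigvecs))):
        out = A @ v
        for lam, u in found:
            out = out - lam * (u @ v) * u
        return out
    lam, u = top_eig(matvec, 3)
    eigvals.append(lam)
    eigvecs.append(u)

# eigvals ≈ [5.0, 2.0]
```

Replacing `A @ v` with a Hessian-vector product turns this same loop into a top-k Hessian eigensolver without ever forming the Hessian.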

## Example file

The example file in `example/main.py` uses `skeletor` version `0.1.4` for experiment orchestration, which can be installed via `pip install skeletor-ml`; the rest of this library does not depend on it. You can run the example via a command like `python example/main.py <name> --mode=power_iter`, where `<name>` is a useful experiment name like `resnet18_cifar10`. But it may be easier to use a simpler codebase to instantiate PyTorch models and dataloaders (such as `pytorch-cifar`).

## Citing this work

If you find this repo useful and would like to cite it in a publication (as others have done, thank you!), here is a BibTeX entry:

```
@misc{hessian-eigenthings,
  author  = {Noah Golmant and Zhewei Yao and Amir Gholami and Michael Mahoney and Joseph Gonzalez},
  title   = {pytorch-hessian-eigenthings: efficient PyTorch Hessian eigendecomposition},
  month   = oct,
  year    = 2018,
  version = {1.0},
  url     = {https://github.com/noahgolmant/pytorch-hessian-eigenthings}
}
```

## Acknowledgements

This code was written in collaboration with Zhewei Yao, Amir Gholami, Michael Mahoney, and Joseph Gonzalez in UC Berkeley's RISELab.

The deflated power iteration routine is based on code in the HessianFlow repository recently described in the following paper: Z. Yao, A. Gholami, Q. Lei, K. Keutzer, M. Mahoney. "Hessian-based Analysis of Large Batch Training and Robustness to Adversaries", NIPS'18 (arXiv:1802.08241)

Stochastic power iteration with acceleration is based on the following paper: C. De Sa, B. He, I. Mitliagkas, C. Ré, P. Xu. "Accelerated Stochastic Power Iteration", PMLR-21 (arXiv:1707.02670)
