fb-caffe-exts

by facebookarchive

Some handy utility libraries and tools for the Caffe deep learning framework.

* =fb-caffe-exts=
=fb-caffe-exts= is a collection of extensions developed at FB while using Caffe in (mainly) production scenarios.

** =predictor/=
A simple C++ library that wraps the common pattern of running a =caffe::Net= in multiple threads while sharing weights. It also provides a slightly more convenient usage API for the inference case.

#+BEGIN_SRC c++
  #include "caffe/predictor/Predictor.h"

  // In your setup phase
  predictor_ = Predictor::paths(FLAGS_prototxt_path,
                                FLAGS_weights_path,
                                FLAGS_optimization);

  // When calling in a worker thread
  static thread_local caffe::Blob<float> input_blob;
  input_blob.set_cpu_data(input_data); // avoid the copy.
  const auto& output_blobs = predictor_->forward({&input_blob});
  return output_blobs[FLAGS_output_layer_name];
#+END_SRC

Of note is the =predictor/Optimize.{h,cpp}=, which optimizes memory usage by automatically reusing the intermediate activations when this is safe. This reduces the amount of memory required for intermediate activations by around 50% for AlexNet-style models, and around 75% for GoogLeNet-style models.

We can plot each set of activations in the topological ordering of the network, with a unique color for each reused activation buffer, with the height of the blob proportional to the size of the buffer.

For example, in an AlexNet-like model, the allocation looks like

#+ATTR_HTML: :height 300px

[[./doc/caffenet.png]]

A corresponding allocation for GoogLeNet looks like

#+ATTR_HTML: :height 300px

[[./doc/googlenet.png]]

The idea is essentially linear scan register allocation. We

  • compute a set of "live ranges" for each =caffe::SyncedMemory= (due to sharing, we can't do this at the =caffe::Blob= level),
  • compute a set of live intervals, and schedule each =caffe::SyncedMemory= in a non-overlapping fashion onto each live interval,
  • allocate a canonical =caffe::SyncedMemory= buffer for each live interval,
  • update the blobs' internal pointers to point to the canonical buffer.
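
The steps above can be sketched in a few lines of Python. This is a minimal illustration of linear-scan-style buffer assignment, not the code in =predictor/Optimize.cpp=: the function name =assign_buffers=, the =(first_use, last_use)= representation of a live range, and the toy layer sizes are all assumptions made for the example.

```python
def assign_buffers(live_ranges, sizes):
    """Greedy linear-scan assignment of activations to shared buffers.

    live_ranges: dict name -> (first_use, last_use), inclusive layer indices
                 in topological order (hypothetical representation).
    sizes:       dict name -> size in bytes.
    Returns (assignment, buffer_sizes): name -> buffer index, and the size
    each canonical buffer must have.
    """
    assignment = {}
    buffer_free_at = []   # buffer index -> last layer index still using it
    buffer_sizes = []     # buffer index -> largest activation placed in it

    # Visit activations in order of first use, as linear scan does.
    for name in sorted(live_ranges, key=lambda n: live_ranges[n]):
        start, end = live_ranges[name]
        for idx, busy_until in enumerate(buffer_free_at):
            if busy_until < start:          # no overlap: safe to reuse
                buffer_free_at[idx] = end
                buffer_sizes[idx] = max(buffer_sizes[idx], sizes[name])
                assignment[name] = idx
                break
        else:                               # every buffer is still live
            buffer_free_at.append(end)
            buffer_sizes.append(sizes[name])
            assignment[name] = len(buffer_free_at) - 1
    return assignment, buffer_sizes


# Toy chain: each layer's output is read only by the next layer.
live_ranges = {"conv1": (0, 1), "relu1": (1, 2), "conv2": (2, 3), "fc": (3, 4)}
sizes = {"conv1": 400, "relu1": 400, "conv2": 200, "fc": 40}
assignment, buffer_sizes = assign_buffers(live_ranges, sizes)
naive_bytes = sum(sizes.values())    # one buffer per activation
shared_bytes = sum(buffer_sizes)     # after reuse
```

In this toy chain the four activations end up alternating between two buffers, so the total drops from 1040 to 800 bytes. Real networks with branches have longer live ranges, and the savings depend on the model.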

Depending on the model, the buffer reuse can also lead to some non-trivial performance improvements at inference time.

To enable this, pass =Predictor::Optimization::MEMORY= to the =Predictor= constructor.

=predictor/PooledPredictor{.h,cpp}= maintains a thread-pool with thread-local instances of =caffe::Net=. Calls to =PooledPredictor::forward()= are enqueued on a =folly::MPMCQueue= and then dequeued by the thread-pool for processing. Calls to =forward()= are non-blocking and return a =folly::Future= that is satisfied when the forward pass job finishes. =PooledPredictor= also supports running multiple models over the same thread-pool: if you load two models, each thread in the thread-pool maintains two instances of =caffe::Net= (one for each model), and the =netId= param in =forward()= specifies the model to run. =PinnedPooledPredictor= is an abstraction over =PooledPredictor= that, when multiple models are loaded, pins the =forward()= calls to a specific model.

#+BEGIN_SRC c++
  #include "caffe/predictor/PooledPredictor.h"

  // In your setup phase
  caffe::fb::PooledPredictor::Config config;
  config.numThreads_ = 10;
  config.optimization_ = caffe::fb::Predictor::Optimization::MEMORY;
  config.protoWeightPaths_.emplace_back(FLAGS_prototxt_path,
                                        FLAGS_weights_path);
  pooledPredictor_ = caffe::fb::PooledPredictor::makePredictor(config);

  // When calling predictor
  caffe::fb::PooledPredictor::OutputLayers output_blobs;
  pooledPredictor_->forward({&input_blob}, &output_blobs)
    .then([&] {
      const auto& output_blob = output_blobs[FLAGS_output_layer_name];
      // Do something with output_blob
    });
#+END_SRC
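
The pooling pattern itself is independent of Caffe and can be sketched with the Python standard library. This is only an illustration under stated assumptions: =PooledPredictorSketch= and its methods are hypothetical names, =queue.Queue= stands in for =folly::MPMCQueue=, =concurrent.futures.Future= stands in for =folly::Future=, and plain functions stand in for =caffe::Net= instances. Weight sharing between threads is not modelled.

```python
import queue
import threading
from concurrent.futures import Future


class PooledPredictorSketch:
    """Thread pool where every worker owns one model instance per net id."""

    def __init__(self, model_factories, num_threads):
        # model_factories: list of zero-argument callables; index == net id.
        self._factories = model_factories
        self._jobs = queue.Queue()
        self._threads = [
            threading.Thread(target=self._worker, daemon=True)
            for _ in range(num_threads)
        ]
        for t in self._threads:
            t.start()

    def _worker(self):
        # Each worker builds its own instances, so a forward pass needs
        # no locking around the model.
        nets = [make() for make in self._factories]
        while True:
            job = self._jobs.get()
            if job is None:              # shutdown sentinel
                return
            net_id, inputs, future = job
            try:
                future.set_result(nets[net_id](inputs))
            except Exception as exc:     # surface failures to the caller
                future.set_exception(exc)

    def forward(self, inputs, net_id=0):
        """Enqueue a job and return immediately with a Future."""
        future = Future()
        self._jobs.put((net_id, inputs, future))
        return future

    def shutdown(self):
        for _ in self._threads:
            self._jobs.put(None)
        for t in self._threads:
            t.join()


# Two stand-in "models": plain functions instead of real networks.
pool = PooledPredictorSketch(
    [lambda: (lambda xs: [2 * x for x in xs]),
     lambda: (lambda xs: [x + 1 for x in xs])],
    num_threads=4,
)
doubled = pool.forward([1, 2, 3], net_id=0)
incremented = pool.forward([1, 2, 3], net_id=1)
results = (doubled.result(timeout=5), incremented.result(timeout=5))
pool.shutdown()
```

Because every worker holds an instance of each model, any idle thread can serve a request for any =net_id=, which is what lets several models share one pool.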

** =torch2caffe/=
A library for converting pre-trained Torch models to the equivalent Caffe models.

=torch_layers.lua= describes the set of layers that we can automatically convert, and =test.lua= shows some examples of more complex models being converted end to end.

The examples include complex CNNs ([[http://arxiv.org/abs/1409.4842][GoogLeNet]], etc.), deep LSTMs (created in [[https://github.com/torch/nngraph][nngraph]]), and models with tricky parallel/split connectivity structures ([[http://arxiv.org/abs/1103.0398][Natural Language Processing (almost) from Scratch]]).

This can be invoked as

#+BEGIN_EXAMPLE
  ∴ th torch2caffe/torch2caffe.lua --help
    --input (default "") Input model file
    --preprocessing (default "") Preprocess the model
    --prototxt (default "") Output prototxt model file
    --caffemodel (default "") Output model weights file
    --format (default "lua") Format: lua | luathrift
    --input-tensor (default "") (Optional) Predefined input tensor
    --verify (default "") (Optional) Verify existing
    <input_dims...> (number) Input dimensions (e.g. 10N x 3C x 227H x 227W)
#+END_EXAMPLE

This works by

  • (optionally) preprocessing the model provided in =--input= (folding BatchNormalization layers into the preceding layer, etc.),
  • walking the Torch module graph of the model provided in =--input=,
  • converting it to the equivalent Caffe module graph,
  • copying the weights into the Caffe model,
  • running some test inputs (of size =input_dims...=) through both models and verifying the outputs are identical.

** =conversions/=
A simple CLI tool for running some simple Caffe network transformations.

#+BEGIN_EXAMPLE
  ∴ python conversions.py vision --help
  Usage: conversions.py vision [OPTIONS]

  Options:
    --prototxt TEXT           [required]
    --caffemodel TEXT         [required]
    --output-prototxt TEXT    [required]
    --output-caffemodel TEXT  [required]
    --help                    Show this message and exit.
#+END_EXAMPLE

The main usage at the moment is automating the [[https://github.com/BVLC/caffe/blob/master/examples/net_surgery.ipynb][Net Surgery]] notebook.
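
The core step in the Net Surgery notebook is casting an inner-product (fully connected) layer into a convolution by reinterpreting its flat weight rows as kernels. The pure-Python sketch below shows why that works on a tiny example. It is not taken from =conversions.py=: the helper names are hypothetical, and whether the =vision= command performs exactly this transformation is an assumption.

```python
def inner_product(weights, x_flat):
    """Fully connected layer: one dot product per weight row."""
    return [sum(w * x for w, x in zip(row, x_flat)) for row in weights]


def reshape_to_kernels(weights, channels, height, width):
    """View each flat FC row as a (channels, height, width) kernel."""
    kernels = []
    for row in weights:
        it = iter(row)
        kernels.append([[[next(it) for _ in range(width)]
                         for _ in range(height)]
                        for _ in range(channels)])
    return kernels


def conv_valid(kernels, image):
    """Stride-1 convolution without padding (cross-correlation)."""
    channels = len(kernels[0])
    k_h, k_w = len(kernels[0][0]), len(kernels[0][0][0])
    out_h = len(image[0]) - k_h + 1
    out_w = len(image[0][0]) - k_w + 1
    return [[[sum(kernel[c][i][j] * image[c][y + i][x + j]
                  for c in range(channels)
                  for i in range(k_h)
                  for j in range(k_w))
              for x in range(out_w)]
             for y in range(out_h)]
            for kernel in kernels]


# An FC layer with two outputs over a flattened 1 x 2 x 2 input.
fc_weights = [[1, 2, 3, 4], [0, 1, 0, -1]]
image = [[[1, 0], [2, 1]]]                      # channels x height x width
flat = [v for ch in image for row in ch for v in row]

fc_out = inner_product(fc_weights, flat)
kernels = reshape_to_kernels(fc_weights, channels=1, height=2, width=2)
conv_out = conv_valid(kernels, image)           # one spatial position

# On a wider input the same kernels slide and produce an output map.
wide_image = [[[1, 0, 5], [2, 1, 0]]]
wide_out = conv_valid(kernels, wide_image)
```

On an input of the original size the convolution reproduces the fully connected outputs exactly. On a larger input it yields one set of outputs per window position, which is what makes the converted network usable as a dense, sliding-window classifier.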

** Building and Installing
As you might expect, this library depends on an up-to-date [[http://caffe.berkeleyvision.org/][BVLC Caffe]] installation.

The additional dependencies are

  • The C++ libraries require [[https://github.com/facebook/folly][folly]].
  • The Python =conversions= library requires [[http://click.pocoo.org/5/][click]].

You can drop the C++ components into an existing Caffe installation. We'll update the repo with an example modification to an existing =Makefile.config= and a =CMake=-based solution.

** Contact
Feel free to open issues on this repo for requests/bugs, or contact [[mailto:[email protected]][Andrew Tulloch]] directly.
