HorizonNet
This is the implementation of our CVPR'19 paper "HorizonNet: Learning Room Layout with 1D Representation and Pano Stretch Data Augmentation" (project page).

News, June 15, 2019 - Critical bug fix for general layout.
News, Aug 19, 2019 - Report results on Structured3D dataset. (See the report :clipboard: on ST3D.)

This repo is a pure Python implementation with which you can:
- Inference on your images to get cuboid or general shaped room layout
- 3D layout viewer
- Correct pose for your panorama images
- Pano Stretch Augmentation to copy and paste into your own task (a sketch of the idea follows this list)
- Quantitative evaluation (3D IoU, Corner Error, Pixel Error)
  - cuboid shape
  - general shape
- Your own dataset preparation and training
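
Pano Stretch Augmentation warps an equirectangular panorama as if the scene were stretched along its two horizontal axes. Because an anisotropic scaling maps rays through the camera onto rays through the camera, the warp is a depth-independent pixel remap. Below is a minimal sketch of that idea, not the repo's implementation; the function name and coordinate conventions are illustrative, and corner labels would need the corresponding forward transform:

```python
# Minimal pano-stretch sketch: resample the panorama as if the scene were
# scaled by (kx, 1, kz). Illustrative only, not the repo's implementation.
import numpy as np
from scipy.ndimage import map_coordinates

def pano_stretch(img, kx, kz):
    H, W = img.shape[:2]
    # longitude/latitude of every output pixel
    theta = (np.arange(W) + 0.5) / W * 2 * np.pi - np.pi
    phi = (np.arange(H) + 0.5) / H * np.pi - np.pi / 2
    theta, phi = np.meshgrid(theta, phi)
    # unit ray of each output pixel (y points up)
    x = np.cos(phi) * np.sin(theta)
    y = np.sin(phi)
    z = np.cos(phi) * np.cos(theta)
    # undo the scaling to find where the ray came from in the source pano
    xs, ys, zs = x / kx, y, z / kz
    theta_s = np.arctan2(xs, zs)
    phi_s = np.arctan2(ys, np.hypot(xs, zs))
    # spherical -> source pixel coordinates, then bilinear resampling
    uu = ((theta_s + np.pi) / (2 * np.pi) * W - 0.5) % W
    vv = np.clip((phi_s + np.pi / 2) / np.pi * H - 0.5, 0, H - 1)
    return np.stack([map_coordinates(img[..., c], [vv, uu], order=1, mode='nearest')
                     for c in range(img.shape[2])], axis=-1)
```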

Method pipeline overview (figure):
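
The core idea is to regress the whole layout as per-column 1D signals: a CNN collapses the feature map's height, a bidirectional RNN runs along the panorama's width, and every column predicts the ceiling-wall and floor-wall boundary positions plus a corner probability. Here is a toy sketch of that shape, illustrative only and far simpler than the repo's actual model:

```python
import torch
import torch.nn as nn

class Toy1DHorizon(nn.Module):
    """Toy sketch of the 1D formulation (not the repo's architecture):
    collapse height, run a bi-RNN across the panorama width, and emit
    3 values per column (ceiling y, floor y, corner probability logit)."""
    def __init__(self, c_in=3, c_feat=64, c_hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(c_in, c_feat, 3, stride=(2, 1), padding=1), nn.ReLU(),
            nn.Conv2d(c_feat, c_feat, 3, stride=(2, 1), padding=1), nn.ReLU(),
        )
        self.rnn = nn.LSTM(c_feat, c_hidden, bidirectional=True, batch_first=True)
        self.head = nn.Linear(2 * c_hidden, 3)

    def forward(self, x):           # x: (B, 3, H, W)
        f = self.cnn(x)             # (B, C, H', W) -- width is preserved
        f = f.mean(dim=2)           # collapse the height dimension
        f = f.permute(0, 2, 1)      # (B, W, C): one feature vector per column
        h, _ = self.rnn(f)          # propagate context around the panorama
        return self.head(h)         # (B, W, 3): per-column 1D outputs

out = Toy1DHorizon()(torch.randn(1, 3, 512, 1024))
print(out.shape)  # torch.Size([1, 1024, 3])
```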


Requirements

- Python 3
- pytorch>=1.0.0
- numpy
- scipy
- sklearn
- Pillow
- tqdm
- tensorboardX
- opencv-python>=3.1 (for pre-processing; also can't be too new, since recent OpenCV releases removed a key algorithm over patent concerns, so an older release is needed)
- open3d>=0.7 (for layout 3D viewer)
- shapely
- torchvision
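
To quickly confirm the environment provides everything listed above, the imports below should all succeed (import names are the usual ones for these packages; this check is a convenience, not part of the repo):

```python
# Convenience check: import every dependency listed above, print key versions.
import torch, torchvision, numpy, scipy, sklearn, PIL, tqdm, tensorboardX, cv2, open3d, shapely
print('torch ', torch.__version__)
print('opencv', cv2.__version__)   # beware very new releases (see the note above)
print('open3d', open3d.__version__)
```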



Dataset preparation

- PanoContext/Stanford2D3D Dataset
  - Download preprocessed pano/s2d3d for training/validation/testing.
  - Put all of them under the `data` directory so you should get:

    ```
    data
    └── layoutnet_dataset
        ├── finetune_general
        ├── test
        ├── train
        └── valid
    ```
  - `train`, `valid` and `test` are processed from LayoutNet's cuboid dataset.
  - `finetune_general` is re-annotated by us from `train` and `valid`. It contains 65 general shaped rooms.
- Structured3D Dataset
  - See the tutorial to prepare training/validation/testing for HorizonNet.

Pretrained Models

- `resnet50_rnn__panos2d3d.pth`
  - Trained on PanoContext/Stanford2d3d 817 pano images.
  - Trained for 300 epochs.
- `resnet50_rnn__st3d.pth`
  - Trained on Structured3D 18362 pano images with the setting of original furniture and lighting.
  - Trained for 50 epochs.
  - The 50th epoch is selected according to the loss on the validation set.
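
To peek inside a downloaded checkpoint before wiring it into inference, a generic PyTorch inspection works; whether the file is a bare state_dict or a dict with extra metadata is an assumption handled below:

```python
import torch

# Inspect the checkpoint on CPU; handle both a bare state_dict and a wrapper dict.
ckpt = torch.load('ckpt/resnet50_rnn__panos2d3d.pth', map_location='cpu')
state = ckpt.get('state_dict', ckpt) if isinstance(ckpt, dict) else ckpt
for name, value in list(state.items())[:5]:
    shape = tuple(value.shape) if hasattr(value, 'shape') else type(value).__name__
    print(name, shape)
```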

Inference on your images

In the explanation below, I will use `assets/demo.png` as the example (modified from the PanoContext dataset).

1. Pre-processing (Align camera rotation pose)

- Execution: Pre-process the above `assets/demo.png` by firing the below command.

  ```
  python preprocess.py --img_glob assets/demo.png --output_dir assets/preprocessed/
  ```
  - `--img_glob` tells the path to your 360 room image(s).
    - Supports shell-style wildcards when quoted (e.g. `"my_img_dir/*.png"`).
  - `--output_dir` tells the path to the directory for dumping the results.
  - See `python preprocess.py -h` for more detailed script usage help.
- Outputs: Under the given `--output_dir`, you will get results like below, prefixed with the source image basename.
  - The aligned rgb image `[SOURCE BASENAME]_aligned_rgb.png` and line segments image `[SOURCE BASENAME]_aligned_line.png` (for the demo: `demo_aligned_rgb.png`, `demo_aligned_line.png`).
  - The detected vanishing points `[SOURCE BASENAME]_VP.txt`:

    ```
    -0.002278 -0.500449 0.865763
    0.000895 0.865764 0.500452
    0.999999 -0.001137 0.000178
    ```
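
Each row of that file is one vanishing point as a unit 3D direction vector, so it loads as a 3x3 matrix (the `_VP.txt` suffix is the naming assumed above):

```python
import numpy as np

# Each row is one detected vanishing point as a unit 3D direction.
vp = np.loadtxt('assets/preprocessed/demo_VP.txt')
print(vp.shape)                      # (3, 3)
print(np.linalg.norm(vp, axis=1))    # every row should be ~1.0
```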

2. Estimating layout with HorizonNet

- Execution: Predict the layout from the above aligned image and line segments by firing the below command.

  ```
  python inference.py --pth ckpt/resnet50_rnn__mp3d.pth --img_glob assets/preprocessed/demo_aligned_rgb.png --output_dir assets/inferenced --visualize
  ```
  - `--pth` path to the trained model.
  - `--img_glob` path to the preprocessed image.
  - `--output_dir` path to the directory to dump results.
  - `--visualize` optional, for visualizing the model's raw outputs.
  - `--force_cuboid` add this option if you want to estimate cuboid layout (4 walls).
- Outputs: You will get results like below, prefixed with the source image basename.
  - The 1D representation is visualized under the file name `[SOURCE BASENAME].raw.png`.
  - The extracted corners of the layout `[SOURCE BASENAME].json`:

    ```
    {"z0": 50.0, "z1": -59.03114700317383, "uv": [[0.029913906008005142, 0.2996523082256317], [0.029913906008005142, 0.7240479588508606], [0.015625, 0.3819984495639801], [0.015625, 0.6348703503608704], [0.056027885526418686, 0.3881891965866089], [0.056027885526418686, 0.6278984546661377], [0.4480381906032562, 0.3970482349395752], [0.4480381906032562, 0.6178648471832275], [0.5995567440986633, 0.41122356057167053], [0.5995567440986633, 0.601679801940918], [0.8094607591629028, 0.36505699157714844], [0.8094607591629028, 0.6537724137306213], [0.8815288543701172, 0.2661873996257782], [0.8815288543701172, 0.7582473754882812], [0.9189453125, 0.31678876280784607], [0.9189453125, 0.7060701847076416]]}
    ```

3. Layout 3D Viewer

- Execution: Visualize the predicted layout in 3D using a point cloud.

  ```
  python layout_viewer.py --img assets/preprocessed/demo_aligned_rgb.png --layout assets/inferenced/demo_aligned_rgb.json --ignore_ceiling
  ```
  - `--img` path to the preprocessed image.
  - `--layout` path to the json output from `inference.py`.
  - `--ignore_ceiling` prevents showing the ceiling.
  - See `python layout_viewer.py -h` for usage help.
- Outputs: In the viewer window, you can use the mouse and scroll wheel to change the viewport. (The sketch below shows the geometry behind such a viewer.)
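
The lifting from the json to 3D is simple: each corner's (u, v) defines a ray from the camera, and intersecting ceiling rays with the z0 plane and floor rays with the z1 plane recovers 3D corner points. A minimal sketch with open3d, illustrative only and not the repo's layout_viewer.py:

```python
import json
import numpy as np
import open3d as o3d

layout = json.load(open('assets/inferenced/demo_aligned_rgb.json'))
uv = np.array(layout['uv'])
theta = (uv[:, 0] - 0.5) * 2 * np.pi      # longitude of each corner
phi = (0.5 - uv[:, 1]) * np.pi            # latitude (v grows downward)
# unit ray per corner; camera at the origin, y points up
d = np.stack([np.cos(phi) * np.sin(theta),
              np.sin(phi),
              np.cos(phi) * np.cos(theta)], axis=1)
# ceiling corners (phi > 0) hit the z0 plane, floor corners the z1 plane
height = np.where(phi > 0, layout['z0'], layout['z1'])
pts = d * (height / d[:, 1])[:, None]

pcd = o3d.geometry.PointCloud()
pcd.points = o3d.utility.Vector3dVector(pts)
o3d.visualization.draw_geometries([pcd])   # the corner points of the room
```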

Your own dataset

See the tutorial on how to prepare it. (For a quick sanity check of a ground-truth file, see the sketch below.)
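
The ground-truth `label_cor` files used by the evaluation commands below store layout corners; assuming, as in LayoutNet's format, one `x y` pixel coordinate per line (the filenames here are hypothetical), you can overlay them on the pano:

```python
import numpy as np
from PIL import Image

# Assumed format: one "x y" corner per line, in image pixel coordinates.
cor = np.loadtxt('data/layoutnet_dataset/test/label_cor/some_name.txt')   # hypothetical file
img = np.array(Image.open('data/layoutnet_dataset/test/img/some_name.png').convert('RGB'))
for x, y in cor.round().astype(int):
    img[max(0, y - 2):y + 3, max(0, x - 2):x + 3] = [255, 0, 0]           # red dot per corner
Image.fromarray(img).save('gt_check.png')
```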


Training

To train on a dataset, see `python train.py -h` for detailed options explanation.

Example:

```
python train.py --id resnet50_rnn
```

Important arguments:
- `--id` required. experiment id to name checkpoints and logs.
- `--ckpt` folder to output checkpoints (default: ./ckpt).
- `--logs` folder for logging (default: ./logs).
- `--pth` finetune mode if given; path to load a saved checkpoint.
- `--backbone` backbone of the network (default: resnet50); see `-h` for the other options.
- `--no_rnn` whether to remove the rnn (default: False).
- `--train_root_dir` root directory of the training dataset (default: `data/layoutnet_dataset/train`).
- `--valid_root_dir` root directory of the validation dataset (default: `data/layoutnet_dataset/valid`); if given, the epoch with the best 3D IoU on the validation set will be saved as `{ckpt}/{id}/best_valid.pth`.
- `--batch_size_train` training mini-batch size (default: 4).
- `--epochs` epochs to train (default: 300).
- `--lr` learning rate (default: 0.0001).

Quantitative Evaluation - Cuboid Layout

To evaluate on the PanoContext/Stanford2d3d dataset, first run the cuboid trained model on all testing images:

```
python inference.py --pth ckpt/resnet50_rnn__panos2d3d.pth --img_glob "data/layoutnet_dataset/test/img/*" --output_dir output/panos2d3d/resnet50_rnn/ --force_cuboid
```
- `--img_glob` shell-style wildcards for all testing images.
- `--output_dir` path to the directory to dump results.
- `--force_cuboid` enforce cuboid layout output (4 walls), or the PE and CE can't be evaluated.

To get the quantitative result:

```
python eval_cuboid.py --dt_glob "output/panos2d3d/resnet50_rnn/*json" --gt_glob "data/layoutnet_dataset/test/label_cor/*txt"
```
- `--dt_glob` shell-style wildcards for all the model estimations.
- `--gt_glob` shell-style wildcards for all the ground truth.

If you want to evaluate one dataset only:
- PanoContext only:

  ```
  python eval_cuboid.py --dt_glob "output/panos2d3d/resnet50_rnn/*json" --gt_glob "data/layoutnet_dataset/test/label_cor/pano*txt"
  ```
- Stanford2d3d only:

  ```
  python eval_cuboid.py --dt_glob "output/panos2d3d/resnet50_rnn/*json" --gt_glob "data/layoutnet_dataset/test/label_cor/camera*txt"
  ```

:clipboard: The quantitative results for the released `resnet50_rnn__panos2d3d.pth` are shown below:

| Testing Dataset | 3D IoU(%) | Corner error(%) | Pixel error(%) |
| :-------------: | :-------: | :-------------: | :------------: |
| PanoContext     |           |                 |                |
| Stanford2D3D    |           |                 |                |
| All             |           |                 |                |

Quantitative Evaluation - General Layout

See the report :clipboard: on ST3D mentioned in the news above.

TODO

- Faster pre-processing script (top-front alignment) (maybe cython implementation or fernandez2018layouts)


Acknowledgement

- Credit for this repo is shared with ChiWeiHsiao.
- Thanks to limchaos for the suggestion about the potential speed boost from fixing the unexpected behaviour of the PyTorch dataloader. (See Issue#4.)


Citation

Please cite our paper for any purpose of usage.

```
@inproceedings{SunHSC19,
  author    = {Cheng Sun and
               Chi{-}Wei Hsiao and
               Min Sun and
               Hwann{-}Tzong Chen},
  title     = {HorizonNet: Learning Room Layout With 1D Representation and Pano Stretch
               Data Augmentation},
  booktitle = {{IEEE} Conference on Computer Vision and Pattern Recognition, {CVPR}
               2019, Long Beach, CA, USA, June 16-20, 2019},
  pages     = {1047--1056},
  year      = {2019},
}
```
