by jindongwang

A tutorial for using deep learning for activity recognition (PyTorch and TensorFlow)



Deep Learning for Human Activity Recognition

Deep learning is perhaps the near future of human activity recognition. While there are many existing non-deep methods, we still want to unleash the full power of deep learning. This repo provides a demo of using deep learning to perform human activity recognition.

We support both TensorFlow and PyTorch.


  • Python 3.x
  • Numpy
  • TensorFlow or PyTorch 1.0+


There are many public datasets for human activity recognition. You can refer to the survey article "Deep learning for sensor-based activity recognition: a survey" to find more.

In this demo, we will use the UCI HAR dataset as an example. This dataset can be found here.

Of course, this dataset needs further preprocessing before being fed into the network. I've also provided a preprocessed version of the dataset as a file so you can focus on the network (download HERE). It is also highly recommended that you download the original dataset so that you can go through the whole process on your own.

| #subject | #activity | Frequency |
| --- | --- | --- |
| 30 | 6 | 50 Hz |


  • For PyTorch (recommended), go to its folder, configure the folder of your data, and then run the script.
  • For TensorFlow, run the corresponding file. The TensorFlow version is no longer updated, since I personally prefer PyTorch.

Network structure

What is the most influential deep structure? CNN it is. So we'll use CNN in our demo.

CNN structure

Convolution + pooling + convolution + pooling + dense + dense + dense + output

That is: 2 convolutions, 2 poolings, and 3 fully connected layers.
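
The layer sequence above can be sketched in PyTorch roughly as follows. This is only a minimal illustration, not the repo's actual model: the class name `HarCnn`, the use of 1-D convolutions, the kernel sizes, and the layer widths are all assumptions chosen for the example. Only the layer order, the 9 input channels, the window length of 128, and the 6 activity classes come from this README.

```python
import torch
from torch import nn


class HarCnn(nn.Module):
    """Conv + pool + conv + pool, followed by three fully connected layers."""

    def __init__(self, in_channels=9, window=128, n_classes=6):
        super().__init__()
        # Kernel sizes and channel counts are illustrative assumptions.
        self.features = nn.Sequential(
            nn.Conv1d(in_channels, 32, kernel_size=9, padding=4),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(32, 64, kernel_size=9, padding=4),
            nn.ReLU(),
            nn.MaxPool1d(2),
        )
        # The padded convolutions keep the length; each pooling halves it.
        flat = 64 * (window // 4)
        self.classifier = nn.Sequential(
            nn.Linear(flat, 256),
            nn.ReLU(),
            nn.Linear(256, 64),
            nn.ReLU(),
            nn.Linear(64, n_classes),  # third dense layer outputs class scores
        )

    def forward(self, x):
        x = self.features(x)   # (batch, 64, window // 4)
        x = x.flatten(1)       # (batch, 64 * window // 4)
        return self.classifier(x)


model = HarCnn()
batch = torch.randn(4, 9, 128)  # 4 random windows, 9 channels, length 128
logits = model(batch)
print(logits.shape)  # torch.Size([4, 6])
```

The final layer returns raw scores for the 6 activities, so it would typically be paired with `nn.CrossEntropyLoss` during training.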

About the inputs

The dataset contains 9 input channels: (acc_body, acc_total and acc_gyro) on x-y-z. So the number of input channels is 9.

The dataset providers have already segmented the data using a sliding window, so every 128 readings can be considered as one input. In real life, you first need to segment the input using a sliding window yourself.
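
For raw recordings that have not been segmented yet, a sliding window could be applied along these lines. This is a rough NumPy sketch: the helper name `sliding_windows`, the `(n_samples, n_channels)` input layout, and the step of 64 (50% overlap) are assumptions made for illustration. At 50 Hz, a window of 128 readings covers 2.56 seconds.

```python
import numpy as np


def sliding_windows(signal, window=128, step=64):
    """Cut a (n_samples, n_channels) recording into overlapping windows.

    Returns an array of shape (n_windows, window, n_channels).
    Trailing samples that do not fill a whole window are dropped.
    """
    signal = np.asarray(signal)
    n_windows = (len(signal) - window) // step + 1
    if n_windows < 1:
        # Recording is shorter than one window.
        return np.empty((0, window) + signal.shape[1:], dtype=signal.dtype)
    starts = np.arange(n_windows) * step
    return np.stack([signal[s:s + window] for s in starts])


# Hypothetical recording: 1000 samples (20 s at 50 Hz) with 9 channels.
recording = np.random.randn(1000, 9)
windows = sliding_windows(recording)
print(windows.shape)  # (14, 128, 9)
```

A smaller `step` gives more (and more strongly overlapping) training windows from the same recording, at the cost of more redundancy between them.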

So in the end, we reformatted the inputs from 9 input files into 1 file. In that file, every window has 9 channels, and each channel has length 128. When feeding it to TensorFlow, it has to be reshaped, as we expect 128 × 1 signals for every channel.
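
As a rough illustration of this merging and reshaping, the sketch below stacks 9 per-channel arrays into one array and then adds a singleton axis. The variable names, the random stand-in data, and both axis orders are assumptions for the example. The exact layout expected by the network depends on the framework and its data format, so it should be checked against the actual code.

```python
import numpy as np

n_windows, n_channels, window = 5, 9, 128

# Hypothetical stand-ins for the 9 per-channel files, each (n_windows, 128).
channel_arrays = [np.random.randn(n_windows, window) for _ in range(n_channels)]

# Stack along a new channel axis: (n_windows, 9, 128). Assumed layout.
x = np.stack(channel_arrays, axis=1)

# Add a singleton axis so every channel holds one 128-long signal. Assumed layout.
x4d = x.reshape(n_windows, n_channels, 1, window)

print(x.shape, x4d.shape)  # (5, 9, 128) (5, 9, 1, 128)
```

The reshape only adds an axis of size 1, so no values are reordered: `x4d[:, :, 0, :]` is identical to `x`.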
