A best practice for tensorflow project template architecture.
A simple and well-designed structure is essential for any Deep Learning project, so after a lot of practice and contributing to tensorflow projects, here's a tensorflow project template that combines simplicity, best practices for folder structure, and good OOP design. The main idea is that there's a lot of stuff you do every time you start a tensorflow project, so wrapping all this shared stuff lets you change just the core idea every time you start a new one.
So, here's a simple tensorflow template that helps you get into your main project faster and focus on your core (Model, Training, ...etc.).
In a nutshell, here's how to use this template. For example, assume you want to implement a VGG model; you should do the following:

- In the models folder, create a VGG class that inherits from the "base_model" class:
```python
class VGGModel(BaseModel):
    def __init__(self, config):
        super(VGGModel, self).__init__(config)
        # call the build_model and init_saver functions.
        self.build_model()
        self.init_saver()
```
- Override these two functions: "build_model", where you implement your model, and "init_saver", where you define the tensorflow saver, then call them in the initializer.

```python
def build_model(self):
    # here you build the tensorflow graph of any model you want and also define the loss.
    pass

def init_saver(self):
    # here you initialize the tensorflow saver that will be used in saving the checkpoints.
    self.saver = tf.train.Saver(max_to_keep=self.config.max_to_keep)
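```

For concreteness, here's a sketch of what a filled-in "build_model" could look like. This is not actual VGG: the single dense layer stands in for the real network, and the placeholder shapes, the `learning_rate` config field, and the `global_step_tensor` attribute (assumed to be created by the base model) are illustrative assumptions.

```python
import tensorflow as tf

def build_model(self):
    # placeholders for a toy classification setup (shapes are illustrative)
    self.is_training = tf.placeholder(tf.bool)
    self.x = tf.placeholder(tf.float32, shape=[None, 784])
    self.y = tf.placeholder(tf.float32, shape=[None, 10])

    # stand-in for the real VGG stack: a single dense layer
    logits = tf.layers.dense(self.x, 10)

    # define the loss and the training op
    self.loss = tf.reduce_mean(
        tf.nn.softmax_cross_entropy_with_logits_v2(labels=self.y, logits=logits))
    self.train_op = tf.train.AdamOptimizer(self.config.learning_rate).minimize(
        self.loss, global_step=self.global_step_tensor)
```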
- In the trainers folder, create a VGG trainer that inherits from the "base_train" class:

```python
class VGGTrainer(BaseTrain):
    def __init__(self, sess, model, data, config, logger):
        super(VGGTrainer, self).__init__(sess, model, data, config, logger)
```
- Override these two functions, "train_step" and "train_epoch", where you write the logic of the training process:

```python
def train_epoch(self):
    """
    implement the logic of an epoch:
    - loop over the number of iterations in the config and call the train step
    - add any summaries you want using the summary
    """
    pass

def train_step(self):
    """
    implement the logic of the train step:
    - run the tensorflow session
    - return any metrics you need to summarize
    """
    pass
```
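Again for concreteness, here's a sketch of a filled-in trainer. The names it relies on (`model.train_op`, `model.loss`, the placeholders from the "build_model" sketch above, `data.next_batch`, `logger.summarize`, and the `num_iter_per_epoch` and `batch_size` config fields) are assumptions carried over from the other sketches in this README, not a fixed contract:

```python
import numpy as np

def train_epoch(self):
    # run all training steps of one epoch and collect the losses
    losses = []
    for _ in range(self.config.num_iter_per_epoch):
        losses.append(self.train_step())
    # summarize the mean loss of the epoch and save a checkpoint
    cur_it = self.sess.run(self.model.global_step_tensor)
    self.logger.summarize(cur_it, summaries_dict={'loss': np.mean(losses)})
    self.model.save(self.sess)

def train_step(self):
    # fetch a batch, run one optimization step, and return the loss
    batch_x, batch_y = next(self.data.next_batch(self.config.batch_size))
    feed_dict = {self.model.x: batch_x,
                 self.model.y: batch_y,
                 self.model.is_training: True}
    _, loss = self.sess.run([self.model.train_op, self.model.loss],
                            feed_dict=feed_dict)
    return loss
```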
- In the main file, you create the session and instances of the following objects: "Model", "Logger", "Data_Generator", "Trainer", and config:

```python
sess = tf.Session()
# create an instance of the model you want
model = VGGModel(config)
# create your data generator
data = DataGenerator(config)
# create a tensorboard logger
logger = Logger(sess, config)
```
- Pass all these objects to the trainer object, and start your training by calling "trainer.train()":

```python
trainer = VGGTrainer(sess, model, data, config, logger)
trainer.train()
```
**You will find a template file and a simple example in the model and trainer folders that show you how to try out your first model.**
The folder structure looks like the following:

```
├── base
│   ├── base_model.py   - this file contains the abstract class of the model.
│   └── base_train.py   - this file contains the abstract class of the trainer.
├── model               - this folder contains any model of your project.
│   └── example_model.py
├── trainer             - this folder contains trainers of your project.
│   └── example_trainer.py
├── mains               - here's the main(s) of your project (you may need more than one main).
│   └── example_main.py - here's an example of a main that is responsible for the whole pipeline.
├── data_loader
│   └── data_generator.py - here's the data generator that is responsible for all data handling.
└── utils
    ├── logger.py
    └── any_other_utils_you_need
```
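As a reference for the data_loader folder, a minimal data generator might look like the following sketch; the toy random data and the `next_batch` generator are illustrative assumptions:

```python
import numpy as np

class DataGenerator:
    def __init__(self, config):
        self.config = config
        # load your real data here; toy random data is used as a stand-in
        self.input = np.random.rand(500, 784)
        self.y = np.random.rand(500, 10)

    def next_batch(self, batch_size):
        # yield a randomly sampled batch of inputs and labels
        idx = np.random.choice(500, batch_size)
        yield self.input[idx], self.y[idx]
```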
Base model is an abstract class that must be inherited by any model you create; the idea behind this is that there's much shared stuff between all models. The base model contains (see the sketch below):

- functions to save and load model checkpoints
- counters that keep track of the current epoch and the global step
- "init_saver", an abstract function in which you initialize the saver used for the checkpoints
- "build_model", an abstract function in which you define the graph
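A minimal sketch of such a base model; the counter names and the `checkpoint_dir` config field are illustrative assumptions:

```python
import tensorflow as tf

class BaseModel:
    def __init__(self, config):
        self.config = config
        # counters shared by all models: the global step and the current epoch
        self.global_step_tensor = tf.Variable(0, trainable=False, name='global_step')
        self.cur_epoch_tensor = tf.Variable(0, trainable=False, name='cur_epoch')

    def save(self, sess):
        # save a checkpoint to the directory given in the config
        self.saver.save(sess, self.config.checkpoint_dir, self.global_step_tensor)

    def load(self, sess):
        # restore the latest checkpoint from that directory, if one exists
        latest_checkpoint = tf.train.latest_checkpoint(self.config.checkpoint_dir)
        if latest_checkpoint:
            self.saver.restore(sess, latest_checkpoint)

    def build_model(self):
        raise NotImplementedError

    def init_saver(self):
        raise NotImplementedError
```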
Here's where you implement your model. So you should:

- create your model class and inherit the "base_model" class
- override "build_model", where you write the tensorflow graph you want and define the loss
- override "init_saver", where you create the tensorflow saver used to save and restore checkpoints
- call "build_model" and "init_saver" in the initializer
Base trainer is an abstract class that just wraps the training process, as sketched below.
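A sketch of such a base trainer; the `num_epochs` config field and the use of the base model's `cur_epoch_tensor` are illustrative assumptions:

```python
import tensorflow as tf

class BaseTrain:
    def __init__(self, sess, model, data, config, logger):
        self.sess = sess
        self.model = model
        self.data = data
        self.config = config
        self.logger = logger
        # initialize the variables of the graph before training starts
        self.sess.run(tf.global_variables_initializer())

    def train(self):
        # the shared training loop; subclasses only fill in the two methods below
        for _ in range(self.config.num_epochs):
            self.train_epoch()
            self.sess.run(self.model.cur_epoch_tensor.assign_add(1))

    def train_epoch(self):
        raise NotImplementedError

    def train_step(self):
        raise NotImplementedError
```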
Here's what you should implement in your trainer:

- create your trainer class and inherit the "base_train" class
- override "train_step" and "train_epoch", where you implement the training process of each step and each epoch
This template also supports reporting to Comet.ml, which allows you to see all your hyper-params, metrics, graphs, dependencies and more, including real-time metrics.
Add your API key in the configuration file:
"comet_api_key": "your key here"
Here's how it looks after you start training:
You can also link your GitHub repository to your Comet.ml project for full version control. Here's a live page showing the example from this repo.
I use JSON as the configuration method: write all the configs you want, parse the file using "utils/config/process_config", and pass the resulting configuration object to all the other objects (an example follows below).
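For example, a config file might look like the following; apart from `max_to_keep` and `comet_api_key`, which appear elsewhere in this README, the field names are illustrative assumptions:

```json
{
  "exp_name": "vgg_example",
  "num_epochs": 10,
  "num_iter_per_epoch": 100,
  "learning_rate": 0.001,
  "batch_size": 16,
  "max_to_keep": 5,
  "checkpoint_dir": "./experiments/vgg_example/checkpoint/",
  "comet_api_key": "your key here"
}
```

After parsing, the fields are accessed as attributes, e.g. `config.max_to_keep` as used in "init_saver" above.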
Here's where you combine all the previous parts; a sketch follows below.
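For reference, a sketch of an example main. The import paths follow the folder structure above, and the assumption that "process_config" takes the path of the JSON file is illustrative:

```python
import tensorflow as tf

from data_loader.data_generator import DataGenerator
from model.example_model import ExampleModel
from trainer.example_trainer import ExampleTrainer
from utils.config import process_config
from utils.logger import Logger

def main():
    # parse the JSON config file into a config object
    config = process_config('configs/example.json')

    # create the session and all the components of the pipeline
    sess = tf.Session()
    data = DataGenerator(config)
    model = ExampleModel(config)
    logger = Logger(sess, config)
    trainer = ExampleTrainer(sess, model, data, config, logger)

    # start training
    trainer.train()

if __name__ == '__main__':
    main()
```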
Any kind of enhancement or contribution is welcome.