Text Renderer

Generate text line images for training deep learning OCR models (e.g. CRNN).

  • [x] Modular design. You can easily add different components: Corpus, Effect, Layout.
  • [x] Integrates with imgaug; see imgaug_example for usage.
  • [x] Supports rendering multiple corpora on one image with different effects. Layout is responsible for arranging the corpora.
  • [x] Supports applying effects at different stages of the rendering process: corpus_effects, layout_effects, render_effects.
  • [x] Generates vertical text.
  • [x] Supports generating an lmdb dataset that is compatible with PaddleOCR; see Dataset.
  • [x] A web font viewer.
  • [ ] Corpus sampler: helps with character balancing.

Documentation

Run Example

Run the following commands to generate images using the example data:

git clone https://github.com/oh-my-ocr/text_renderer
cd text_renderer
python3 setup.py develop
pip3 install -r docker/requirements.txt
python3 main.py \
    --config example_data/example.py \
    --dataset img \
    --num_processes 2 \
    --log_period 10

The data is generated in the example_data/output directory. A labels.json file contains all annotations in the following format:

{
  "labels": {
    "000000000": "test",
    "000000001": "text2"
  },
  "sizes": {
    "000000000": [
      120,
      32 
    ],
    "000000001": [
      128,
      32 
    ]
  },
  "num-samples": 2
}
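
To consume these annotations in a training pipeline, the file can be parsed with the standard library. The following is a minimal sketch, not part of text_renderer: the helper name load_labels is hypothetical, and it assumes each entry in sizes is [width, height], which the example values above suggest but the text does not state.

```python
import json
from pathlib import Path


def load_labels(labels_path):
    """Return a list of (sample_id, text, (width, height)) tuples."""
    data = json.loads(Path(labels_path).read_text(encoding="utf-8"))
    labels = data["labels"]
    sizes = data["sizes"]
    # "num-samples" is expected to match the number of label entries.
    if data["num-samples"] != len(labels):
        raise ValueError("num-samples does not match the number of labels")
    samples = []
    for sample_id in sorted(labels):
        width, height = sizes[sample_id]  # assumed order: [width, height]
        samples.append((sample_id, labels[sample_id], (width, height)))
    return samples
```

Calling load_labels("example_data/output/labels.json") would then give one tuple per generated image, sorted by sample id.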

You can also use --dataset lmdb to store the images in an lmdb file. The lmdb file contains the following keys:

  • num-samples
  • image-000000000
  • label-000000000
  • size-000000000
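
The key layout above can be read back with the third-party lmdb Python package. This is a rough sketch under assumptions: the helper names are hypothetical, sample indices are taken to be zero-based with nine digits (as in image-000000000), and values are returned as raw bytes because the text does not specify how they are encoded.

```python
def sample_keys(index):
    """Build the three per-sample keys for a zero-based sample index."""
    suffix = f"{index:09d}"  # nine digits, as in image-000000000
    return {
        "image": f"image-{suffix}".encode("utf-8"),
        "label": f"label-{suffix}".encode("utf-8"),
        "size": f"size-{suffix}".encode("utf-8"),
    }


def read_sample(txn, index):
    """Fetch the raw bytes stored for one sample.

    txn is any object with a get(key) method, e.g. an lmdb read transaction.
    """
    return {name: txn.get(key) for name, key in sample_keys(index).items()}


def read_first_sample(lmdb_path):
    """Open the dataset read-only and return num-samples and sample 0 (raw)."""
    import lmdb  # third-party package: pip install lmdb

    env = lmdb.open(str(lmdb_path), readonly=True, lock=False)
    try:
        with env.begin() as txn:
            return txn.get(b"num-samples"), read_sample(txn, 0)
    finally:
        env.close()
```

Decoding the returned bytes (for example, treating the label as UTF-8 text) is left out on purpose, since the encoding is not documented here.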

You can check the config file example_data/example.py to learn how to use text_renderer, or follow the Quick Start to learn how to set up a configuration.

Quick Start

Prepare file resources

  • Font files: .ttf, .otf, .ttc
  • Background images of any size, either from your business scenario or from publicly available datasets (COCO, VOC)
  • Corpus: text_renderer offers a wide variety of text sampling methods. To use them, prepare the corpus with two considerations in mind:
    • The corpus must be in the target language for which you want to perform OCR
    • The corpus should meet your actual business needs, such as the education field, the medical field, etc.
  • Charset file [optional but recommended]: OCR models in real-world scenarios (e.g. CRNN) usually support only a limited character set, so it is better to filter out characters outside that set during data generation. You can do this by setting the chars_file parameter
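
To illustrate what filtering by character set means, here is a small sketch. It is not text_renderer's implementation of chars_file; the function names are hypothetical, and the charset file format (one character per line, UTF-8) is an assumption.

```python
def load_charset(chars_file):
    """Read a charset file, assuming one character per line (assumed format)."""
    with open(chars_file, encoding="utf-8") as f:
        # Strip only the newline so that a line holding a space is kept.
        return {line.rstrip("\n") for line in f if line.rstrip("\n")}


def filter_text(text, charset):
    """Drop every character that is not in the charset."""
    return "".join(ch for ch in text if ch in charset)
```

With a charset of a, b, c and the space character, filter_text("a cab!d", charset) keeps "a cab" and drops the two unsupported characters.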

You can download pre-prepared file resources for this Quick Start from here:

Save these resource files in the same directory:

workspace
├── bg
│ └── background.png
├── corpus
│ └── eng_text.txt
└── font
    └── simsun.ttf
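
Before writing the config, it can help to confirm that this layout is in place. The following sketch uses only the standard library; the function name missing_resources is hypothetical, and the checks are limited to what the tree above shows (three directories and at least one font file with a supported extension).

```python
from pathlib import Path

FONT_SUFFIXES = {".ttf", ".otf", ".ttc"}


def missing_resources(workspace):
    """Return a list of human-readable problems found in the workspace."""
    root = Path(workspace)
    problems = []
    for name in ("bg", "corpus", "font"):
        if not (root / name).is_dir():
            problems.append(f"missing directory: {name}")
    font_dir = root / "font"
    # Only look for font files when the font directory itself exists.
    if font_dir.is_dir() and not any(
        p.suffix.lower() in FONT_SUFFIXES for p in font_dir.iterdir()
    ):
        problems.append("no .ttf/.otf/.ttc file in font")
    return problems
```

An empty list means the workspace matches the expected structure.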

Create config file

Create a config.py file in the workspace directory. A configuration file must define a configs variable, which is a list of GeneratorCfg.

The complete configuration file is as follows:

```python
import os
from pathlib import Path

from text_renderer.effect import *
from text_renderer.corpus import *
from text_renderer.config import (
    RenderCfg,
    NormPerspectiveTransformCfg,
    GeneratorCfg,
    SimpleTextColorCfg,
)

# Directory that contains this config file (the workspace directory)
CURRENT_DIR = Path(os.path.abspath(os.path.dirname(__file__)))


def story_data():
    return GeneratorCfg(
        num_image=10,
        save_dir=CURRENT_DIR / "output",
        render_cfg=RenderCfg(
            bg_dir=CURRENT_DIR / "bg",
            height=32,
            perspective_transform=NormPerspectiveTransformCfg(20, 20, 1.5),
            corpus=WordCorpus(
                WordCorpusCfg(
                    text_paths=[CURRENT_DIR / "corpus" / "eng_text.txt"],
                    font_dir=CURRENT_DIR / "font",
                    font_size=(20, 30),
                    num_word=(2, 3),
                ),
            ),
            corpus_effects=Effects(Line(0.9, thickness=(2, 5))),
            gray=False,
            text_color_cfg=SimpleTextColorCfg(),
        ),
    )


# main.py reads this list of GeneratorCfg objects
configs = [story_data()]
```

In the above configuration we have done the following:

  1. Specified the location of the resource files
  2. Specified the text sampling method: 2 or 3 words are randomly selected from the corpus
  3. Configured some effects for generation
  4. Specified font-related parameters: font_size, font_dir

Run

Run main.py. It has only 4 arguments:

  • config: Python config file path
  • dataset: Dataset format, img or lmdb
  • num_processes: Number of processes used
  • log_period: Period of log printing, in the range (0, 100)
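
If you launch generation from another Python script, the four arguments can be assembled programmatically. This is only a convenience sketch: build_command and run are hypothetical helpers, the default values mirror the example command earlier in this README, and main.py is assumed to be in the current working directory.

```python
import subprocess
import sys


def build_command(config, dataset="img", num_processes=2, log_period=10):
    """Assemble the main.py command line from the four documented arguments."""
    if dataset not in ("img", "lmdb"):
        raise ValueError("dataset must be 'img' or 'lmdb'")
    return [
        sys.executable, "main.py",
        "--config", str(config),
        "--dataset", dataset,
        "--num_processes", str(num_processes),
        "--log_period", str(log_period),
    ]


def run(config, **kwargs):
    """Run main.py from the repository root and raise if it fails."""
    subprocess.run(build_command(config, **kwargs), check=True)
```

For the Quick Start workspace, run("workspace/config.py") would be the equivalent of invoking main.py by hand with the default options.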

All Effect/Layout Examples

Find all effect/layout config examples at link

  • bg_and_text_mask: Three images of the same width are merged together horizontally. It can be used to train GAN models like EraseNet
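
Since the three images share the same width, the merged output can be cut back into its parts by dividing the width by three. The sketch below only computes the crop boxes; split_boxes is a hypothetical helper, and the order and meaning of the three parts is not stated here, so they are left unnamed.

```python
def split_boxes(width, height, parts=3):
    """Return (left, upper, right, lower) crop boxes for equal-width parts."""
    if width % parts != 0:
        raise ValueError("image width is not divisible by the number of parts")
    part_width = width // parts
    return [
        (i * part_width, 0, (i + 1) * part_width, height)
        for i in range(parts)
    ]
```

The boxes follow the (left, upper, right, lower) convention used by Pillow's Image.crop, so with an opened image img, [img.crop(box) for box in split_boxes(*img.size)] would yield the three sub-images.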

|    | Name | Example |
|---:|:-----|:--------|
| 0 | bg_and_text_mask | bg_and_text_mask.jpg |
| 1 | char_spacing_compact | char_spacing_compact.jpg |
| 2 | char_spacing_large | char_spacing_large.jpg |
| 3 | color_image | color_image.jpg |
| 4 | curve | curve.jpg |
| 5 | dropout_horizontal | dropout_horizontal.jpg |
| 6 | dropout_rand | dropout_rand.jpg |
| 7 | dropout_vertical | dropout_vertical.jpg |
| 8 | emboss | emboss.jpg |
| 9 | extra_text_line_layout | extra_text_line_layout.jpg |
| 10 | line_bottom | line_bottom.jpg |
| 11 | line_bottom_left | line_bottom_left.jpg |
| 12 | line_bottom_right | line_bottom_right.jpg |
| 13 | line_horizontal_middle | line_horizontal_middle.jpg |
| 14 | line_left | line_left.jpg |
| 15 | line_right | line_right.jpg |
| 16 | line_top | line_top.jpg |
| 17 | line_top_left | line_top_left.jpg |
| 18 | line_top_right | line_top_right.jpg |
| 19 | line_vertical_middle | line_vertical_middle.jpg |
| 20 | padding | padding.jpg |
| 21 | perspective_transform | perspective_transform.jpg |
| 22 | same_line_layout_different_font_size | same_line_layout_different_font_size.jpg |
| 23 | vertical_text | vertical_text.jpg |

Contribution

  • Corpus: Feel free to contribute more corpus generators to the project. They do not need to be generic; business-specific generators, such as one that generates ID numbers, are also welcome

Run in Docker

Build image

docker build -f docker/Dockerfile -t text_renderer .

The config file is provided through the CONFIG environment variable. In the example.py file, data is generated in the example_data/output directory, so we map this directory to the host.

docker run --rm \
-v `pwd`/example_data/docker_output/:/app/example_data/output \
--env CONFIG=/app/example_data/example.py \
--env DATASET=img \
--env NUM_PROCESSES=2 \
--env LOG_PERIOD=10 \
text_renderer

Font Viewer

Start font viewer

streamlit run tools/font_viewer.py -- web /path/to/fonts_dir

Build docs

cd docs
make html
open _build/html/index.html

Citing text_renderer

If you use text_renderer in your research, please consider using the following BibTeX entry.

@misc{text_renderer,
  author =       {oh-my-ocr},
  title =        {text_renderer},
  howpublished = {\url{https://github.com/oh-my-ocr/text_renderer}},
  year =         {2021}
}
