BERT for TensorFlow v2

|Build Status| |Coverage Status| |Version Status| |Python Versions| |Downloads|

This repo contains a `TensorFlow 2.0`_ `Keras`_ implementation of `google-research/bert`_, with support for loading the original `pre-trained weights`_ and producing activations numerically identical to those calculated by the original model.

`ALBERT`_ and `adapter-BERT`_ are also supported by setting the corresponding configuration parameters (``shared_layer=True`` and ``embedding_size`` for `ALBERT`_, ``adapter_size`` for `adapter-BERT`_). Setting both results in an adapter-ALBERT: the BERT parameters are shared across all layers, while every layer is adapted with a layer-specific adapter.
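
For example, an adapter-ALBERT layer could be configured like this (an illustrative sketch using the ``BertModelLayer.Params`` names from the Usage section below; the concrete sizes are made up):

.. code:: python

  from bert import BertModelLayer

  # share the encoder weights across layers (ALBERT) and add per-layer adapters (adapter-BERT)
  l_bert = BertModelLayer(**BertModelLayer.Params(
    vocab_size     = 30000,  # illustrative value
    shared_layer   = True,   # ALBERT: share the transformer encoder weights across all layers
    embedding_size = 128,    # ALBERT: factorized word-piece embedding size
    adapter_size   = 64,     # adapter-BERT: bottleneck size of the per-layer adapters
    name           = "albert"
  ))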

The implementation is built from scratch using only basic TensorFlow operations, following the code in `google-research/bert/modeling.py`_ (but skipping dead code and applying some simplifications). It also utilizes `kpe/params-flow`_ to reduce common Keras boilerplate code (related to passing model and layer configuration arguments).

`bert-for-tf2`_ should work with both `TensorFlow 2.0`_ and `TensorFlow 1.14`_ or newer.

NEWS

  • 30.Jul.2020 - ``VERBOSE=0`` env variable for suppressing stdout output.
  • 06.Apr.2020 - using latest ``py-params`` introducing ``WithParams`` base for ``Layer`` and ``Model``. See news in `kpe/py-params`_ for how to update (the ``_construct()`` signature has changed and requires calling ``super().__construct()``).
  • 06.Jan.2020 - support for loading the tar format weights from `google-research/ALBERT`_.
  • 18.Nov.2019 - ALBERT tokenization added (make sure to import as ``from bert import albert_tokenization`` or ``from bert import bert_tokenization``).
  • 08.Nov.2019 - using v2 by default when loading the `TFHub/albert`_ weights of `google-research/ALBERT`_.
  • 05.Nov.2019 - minor ALBERT word embeddings refactoring (``word_embeddings_2`` -> ``word_embeddings_projector``) and related parameter freezing fixes.
  • 04.Nov.2019 - support for extra (task specific) token embeddings using negative token ids.
  • 29.Oct.2019 - support for loading of the pre-trained ALBERT weights released by `google-research/ALBERT`_ at `TFHub/albert`_.
  • 11.Oct.2019 - support for loading of the pre-trained ALBERT weights released by `brightmart/albert_zh ALBERT for Chinese`_.
  • 10.Oct.2019 - support for `ALBERT`_ through the ``shared_layer=True`` and ``embedding_size=128`` params.
  • 03.Sep.2019 - walkthrough on fine-tuning with adapter-BERT and storing the fine-tuned fraction of the weights in a separate checkpoint (see ``tests/test_adapter_finetune.py``).
  • 02.Sep.2019 - support for extending the token type embeddings of a pre-trained model by returning the mismatched weights in ``load_stock_weights()`` (see ``tests/test_extend_segments.py``).
  • 25.Jul.2019 - there are now two colab notebooks under ``examples/`` showing how to fine-tune an IMDB Movie Reviews sentiment classifier from pre-trained BERT weights using an `adapter-BERT`_ model architecture on a GPU or TPU in Google Colab.
  • 28.Jun.2019 - v.0.3.0 supports `adapter-BERT`_ (`google-research/adapter-bert`_) for "Parameter-Efficient Transfer Learning for NLP", i.e. fine-tuning small overlay adapter layers over BERT's transformer encoders without changing the frozen BERT weights.

LICENSE

MIT. See the license file in the repository.

Install

``bert-for-tf2`` is on the Python Package Index (PyPI):

::

  pip install bert-for-tf2

Usage

BERT in ``bert-for-tf2`` is implemented as a Keras layer. You could instantiate it like this:

.. code:: python

  from bert import BertModelLayer

  l_bert = BertModelLayer(**BertModelLayer.Params(
    vocab_size               = 16000,        # embedding params
    use_token_type           = True,
    use_position_embeddings  = True,
    token_type_vocab_size    = 2,

    num_layers               = 12,           # transformer encoder params
    hidden_size              = 768,
    hidden_dropout           = 0.1,
    intermediate_size        = 4*768,
    intermediate_activation  = "gelu",

    adapter_size             = None,         # see arXiv:1902.00751 (adapter-BERT)

    shared_layer             = False,        # True for ALBERT (arXiv:1909.11942)
    embedding_size           = None,         # None for BERT, wordpiece embedding size for ALBERT

    name                     = "bert"        # any other Keras layer params
  ))

or by using the ``bert_config.json`` from a `pre-trained google model`_:

.. code:: python

  import bert

  model_dir = ".models/uncased_L-12_H-768_A-12"

  bert_params = bert.params_from_pretrained_ckpt(model_dir)
  l_bert = bert.BertModelLayer.from_params(bert_params, name="bert")

Now you can use the BERT layer in your Keras model like this:

.. code:: python

  from tensorflow import keras

  max_seq_len = 128
  l_input_ids      = keras.layers.Input(shape=(max_seq_len,), dtype='int32')
  l_token_type_ids = keras.layers.Input(shape=(max_seq_len,), dtype='int32')

  # using the default token_type/segment id 0
  output = l_bert(l_input_ids)                              # output: [batch_size, max_seq_len, hidden_size]
  model = keras.Model(inputs=l_input_ids, outputs=output)
  model.build(input_shape=(None, max_seq_len))

  # provide a custom token_type/segment id as a layer input
  output = l_bert([l_input_ids, l_token_type_ids])          # [batch_size, max_seq_len, hidden_size]
  model = keras.Model(inputs=[l_input_ids, l_token_type_ids], outputs=output)
  model.build(input_shape=[(None, max_seq_len), (None, max_seq_len)])
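
For a quick shape sanity check (an illustrative sketch, not part of the original docs), the last model above can be run on a batch of dummy ids even before any pre-trained weights are loaded:

.. code:: python

  import numpy as np

  # dummy batch of 2 sequences: all-zero token ids and segment ids
  fake_input_ids      = np.zeros((2, max_seq_len), dtype=np.int32)
  fake_token_type_ids = np.zeros((2, max_seq_len), dtype=np.int32)

  output = model.predict([fake_input_ids, fake_token_type_ids])
  print(output.shape)   # (2, max_seq_len, hidden_size), i.e. (2, 128, 768) here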

If you choose to use `adapter-BERT`_ by setting the ``adapter_size`` parameter, you will most likely also want to freeze all the original BERT layers by calling:

.. code:: python

  l_bert.apply_adapter_freeze()

and once the model has been built or compiled, the original pre-trained weights can be loaded into the BERT layer:

.. code:: python

  import os
  import bert

  bert_ckpt_file = os.path.join(model_dir, "bert_model.ckpt")
  bert.load_stock_weights(l_bert, bert_ckpt_file)

N.B. see `tests/test_bert_activations.py`_ for a complete example.
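
Pulling the snippets above together, a minimal end-to-end sketch might look like this (the model directory is a placeholder; adjust it to wherever the pre-trained checkpoint was unpacked):

.. code:: python

  import os
  import bert
  from tensorflow import keras

  model_dir = ".models/uncased_L-12_H-768_A-12"   # placeholder path to a pre-trained google model
  max_seq_len = 128

  # build the BERT layer from the checkpoint's bert_config.json
  bert_params = bert.params_from_pretrained_ckpt(model_dir)
  l_bert = bert.BertModelLayer.from_params(bert_params, name="bert")

  # wrap it in a Keras model
  l_input_ids = keras.layers.Input(shape=(max_seq_len,), dtype='int32')
  output = l_bert(l_input_ids)
  model = keras.Model(inputs=l_input_ids, outputs=output)
  model.build(input_shape=(None, max_seq_len))

  # load the pre-trained weights (after model.build()/compile())
  bert_ckpt_file = os.path.join(model_dir, "bert_model.ckpt")
  bert.load_stock_weights(l_bert, bert_ckpt_file)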

FAQ

  1. In all the examples below, please note the line:

.. code:: python

  # use in a Keras Model here, and call model.build()

for a quick test, you can replace it with something like:

.. code:: python

  model = keras.models.Sequential([
    keras.layers.InputLayer(input_shape=(128,)),
    l_bert,
    keras.layers.Lambda(lambda x: x[:, 0, :]),
    keras.layers.Dense(2)
  ])
  model.build(input_shape=(None, 128))
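
If you want to try that quick-test model end to end, it can be compiled and inspected like any other Keras model (the optimizer/loss choice below is purely illustrative):

.. code:: python

  model.compile(
      optimizer=keras.optimizers.Adam(learning_rate=1e-5),
      loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
      metrics=["accuracy"],
  )
  model.summary()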

  2. How to use BERT with the `google-research/bert`_ pre-trained weights?

.. code:: python

modelname = "uncasedL-12H-768A-12" modeldir = bert.fetchgooglebertmodel(modelname, ".models") modelckpt = os.path.join(modeldir, "bertmodel.ckpt")

bertparams = bert.paramsfrompretrainedckpt(modeldir) lbert = bert.BertModelLayer.fromparams(bertparams, name="bert")

# use in a Keras Model here, and call model.build()

bert.loadbertweights(lbert, modelckpt) # should be called after model.build()

  3. How to use ALBERT with the `google-research/ALBERT`_ pre-trained weights (fetching from TFHub)?

see ``tests/nonci/test_load_pretrained_weights.py``:

.. code:: python

modelname = "albertbase" modeldir = bert.fetchtfhubalbertmodel(modelname, ".models") modelparams = bert.albertparams(modelname) lbert = bert.BertModelLayer.fromparams(model_params, name="albert")

# use in a Keras Model here, and call model.build()

bert.loadalbertweights(lbert, albertdir) # should be called after model.build()

  4. How to use ALBERT with the `google-research/ALBERT`_ pre-trained weights (non TFHub)?

see ``tests/nonci/test_load_pretrained_weights.py``:

.. code:: python

modelname = "albertbasev2" modeldir = bert.fetchgooglealbertmodel(modelname, ".models") modelckpt = os.path.join(albertdir, "model.ckpt-best")

modelparams = bert.albertparams(modeldir) lbert = bert.BertModelLayer.fromparams(modelparams, name="albert")

# use in a Keras Model here, and call model.build()

bert.loadalbertweights(lbert, modelckpt) # should be called after model.build()

  5. How to use ALBERT with the `brightmart/albert_zh`_ pre-trained weights?

see ``tests/nonci/test_albert.py``:

.. code:: python

modelname = "albertbase" modeldir = bert.fetchbrightmartalbertmodel(modelname, ".models") modelckpt = os.path.join(modeldir, "albertmodel.ckpt")

bertparams = bert.paramsfrompretrainedckpt(modeldir) lbert = bert.BertModelLayer.fromparams(bertparams, name="bert")

# use in a Keras Model here, and call model.build()

bert.loadalbertweights(lbert, modelckpt) # should be called after model.build()

  6. How to tokenize the input for the `google-research/bert`_ models?

.. code:: python

  do_lower_case = not (model_name.find("cased") == 0 or model_name.find("multi_cased") == 0)
  bert.bert_tokenization.validate_case_matches_checkpoint(do_lower_case, model_ckpt)
  vocab_file = os.path.join(model_dir, "vocab.txt")
  tokenizer = bert.bert_tokenization.FullTokenizer(vocab_file, do_lower_case)
  tokens = tokenizer.tokenize("Hello, BERT-World!")
  token_ids = tokenizer.convert_tokens_to_ids(tokens)
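
The resulting ``token_ids`` can then be padded (or truncated) to the model's fixed sequence length and fed to a Keras model built as in the Usage section above; a minimal illustrative sketch (not from the original docs, and assuming padding token id 0):

.. code:: python

  import numpy as np

  max_seq_len = 128
  # pad/truncate the token ids to the fixed sequence length expected by the model
  token_ids = token_ids[:max_seq_len]
  token_ids = token_ids + [0] * (max_seq_len - len(token_ids))

  output = model.predict(np.array([token_ids], dtype=np.int32))  # shape: (1, max_seq_len, hidden_size)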

  7. How to tokenize the input for `brightmart/albert_zh`_?

.. code:: python

  import params_flow as pf

  # fetch the vocab file
  albert_zh_vocab_url = "https://raw.githubusercontent.com/brightmart/albert_zh/master/albert_config/vocab.txt"
  vocab_file = pf.utils.fetch_url(albert_zh_vocab_url, model_dir)

  tokenizer = bert.albert_tokenization.FullTokenizer(vocab_file)
  tokens = tokenizer.tokenize("你好世界")
  token_ids = tokenizer.convert_tokens_to_ids(tokens)

  8. How to tokenize the input for the `google-research/ALBERT`_ models?

.. code:: python

  import sentencepiece as spm

  spm_model = os.path.join(model_dir, "assets", "30k-clean.model")
  sp = spm.SentencePieceProcessor()
  sp.load(spm_model)
  do_lower_case = True

  processed_text = bert.albert_tokenization.preprocess_text("Hello, World!", lower=do_lower_case)
  token_ids = bert.albert_tokenization.encode_ids(sp, processed_text)

  9. How to tokenize the input for the Chinese `google-research/ALBERT`_ models?

.. code:: python

  import bert

  vocab_file = os.path.join(model_dir, "vocab.txt")
  tokenizer = bert.albert_tokenization.FullTokenizer(vocab_file=vocab_file)
  tokens = tokenizer.tokenize(u"你好世界")
  token_ids = tokenizer.convert_tokens_to_ids(tokens)

Resources

  • `BERT`_ - BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  • `adapter-BERT`_ - adapter-BERT: Parameter-Efficient Transfer Learning for NLP
  • `ALBERT`_ - ALBERT: A Lite BERT for Self-Supervised Learning of Language Representations
  • `google-research/bert`_ - the original `BERT`_ implementation
  • `google-research/ALBERT`_ - the original `ALBERT`_ implementation by Google
  • `google-research/albert(old)`_ - the old location of the original `ALBERT`_ implementation by Google
  • `brightmart/albert_zh`_ - pre-trained `ALBERT`_ weights for Chinese
  • `kpe/params-flow`_ - a Keras coding style for reducing `Keras`_ boilerplate code in custom layers by utilizing `kpe/py-params`_

.. _`kpe/params-flow`: https://github.com/kpe/params-flow
.. _`kpe/py-params`: https://github.com/kpe/py-params
.. _`bert-for-tf2`: https://github.com/kpe/bert-for-tf2

.. _`Keras`: https://keras.io
.. _`pre-trained weights`: https://github.com/google-research/bert#pre-trained-models
.. _`google-research/bert`: https://github.com/google-research/bert
.. _`google-research/bert/modeling.py`: https://github.com/google-research/bert/blob/master/modeling.py
.. _`BERT`: https://arxiv.org/abs/1810.04805
.. _`pre-trained google model`: https://github.com/google-research/bert
.. _`tests/test_bert_activations.py`: https://github.com/kpe/bert-for-tf2/blob/master/tests/test_compare_activations.py
.. _`TensorFlow 2.0`: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf
.. _`TensorFlow 1.14`: https://www.tensorflow.org/versions/r1.14/api_docs/python/tf

.. _`google-research/adapter-bert`: https://github.com/google-research/adapter-bert/
.. _`adapter-BERT`: https://arxiv.org/abs/1902.00751
.. _`ALBERT`: https://arxiv.org/abs/1909.11942
.. _`brightmart/albert_zh ALBERT for Chinese`: https://github.com/brightmart/albert_zh
.. _`brightmart/albert_zh`: https://github.com/brightmart/albert_zh
.. _`google ALBERT weights`: https://github.com/google-research/google-research/tree/master/albert
.. _`google-research/albert(old)`: https://github.com/google-research/google-research/tree/master/albert
.. _`google-research/ALBERT`: https://github.com/google-research/ALBERT
.. _`TFHub/albert`: https://tfhub.dev/google/albert_base/2

.. |Build Status| image:: https://travis-ci.org/kpe/bert-for-tf2.svg?branch=master
   :target: https://travis-ci.org/kpe/bert-for-tf2
.. |Coverage Status| image:: https://coveralls.io/repos/kpe/bert-for-tf2/badge.svg?branch=master
   :target: https://coveralls.io/r/kpe/bert-for-tf2?branch=master
.. |Version Status| image:: https://badge.fury.io/py/bert-for-tf2.svg
   :target: https://badge.fury.io/py/bert-for-tf2
.. |Python Versions| image:: https://img.shields.io/pypi/pyversions/bert-for-tf2.svg
.. |Downloads| image:: https://img.shields.io/pypi/dm/bert-for-tf2.svg
.. |Twitter| image:: https://img.shields.io/twitter/follow/siddhadev?logo=twitter&label=&style=
   :target: https://twitter.com/intent/user?screen_name=siddhadev
