A toolkit for evaluating the linguistic knowledge and transferability of contextual representations. Code for "Linguistic Knowledge and Transferability of Contextual Representations" (NAACL 2019).


A toolkit for evaluating the linguistic knowledge and transferability of contextual word representations. Code for Linguistic Knowledge and Transferability of Contextual Representations, to appear at NAACL 2019.

For a description of the included tasks, see


This project is being developed in Python 3.6, and CI runs the tests in Python 3.6 as well (via TravisCI).

Conda will set up a virtual environment with the exact version of Python used for development along with all the dependencies needed to run the code.

  1. Download and install Conda.

  2. Change your directory to your clone of this repo.

    cd contextual-repr-analysis
  3. Create a Conda environment with Python 3.6.

    conda create -n contextual_repr_analysis python=3.6
  4. Now activate the Conda environment. You will need to activate the Conda environment in each terminal in which you want to run code from this repo.

    source activate contextual_repr_analysis
  5. Install the required dependencies.

    pip install -r requirements.txt

You should now be able to test your installation with `py.test -v`. Congratulations!

Getting Started: Evaluating Representations

This section walks through an example of evaluating ELMo on the English Web Treebank (EWT) POS tagging task.

Step 1. Precomputing the Word Representations

The easiest way to get started with evaluating your representations is to precompute representations for each word in each sentence in the evaluation dataset. In each of the

directories, there exists a text file of sentences (newline-delimited, with space-delimited tokens). These are the sentences used during training and evaluation, so getting representations for these should be enough. If you write a new
and want to generate these sentences, use the script at
(see python ./scripts/ -h
for more information on usage).
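
As a rough illustration of the sentence file format described above (one sentence per line, tokens separated by single spaces), the following sketch reads such a file into lists of tokens. The function name `read_sentences` is hypothetical and not part of this repo.

```python
from typing import List


def read_sentences(path: str) -> List[List[str]]:
    """Read a newline-delimited sentence file with space-delimited tokens."""
    with open(path, encoding="utf-8") as sentence_file:
        # Position i in the returned list corresponds to line i of the file
        # (counting from 0), so no lines are skipped or reordered.
        return [line.rstrip("\n").split(" ") for line in sentence_file]
```

Keeping the list aligned with the file's line order matters because the representations are later looked up by line number.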

The format of the HDF5 file should be as follows:

  1. The keys should be numbers (represented as strings), corresponding to line numbers.

  2. The value associated with each key is expected to be a numpy array of word representations. Acceptable shapes are:

    (sequence_length, representation_dim)
    (num_layers, sequence_length, representation_dim)
  3. Another key, a fixed string value, should store a string-serialized JSON dictionary mapping from sentences (the sentences that the representations in the values are calculated from) to string numbers (the other keys of the HDF5 file).
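
The layout above can be sketched in plain Python. This is only an illustration: the helper names `build_sentence_to_index` and `is_acceptable_shape` are hypothetical, and numbering the keys from 0 is an assumption, since the text does not say whether line numbers start at 0 or 1.

```python
import json
from typing import Dict, List, Tuple


def build_sentence_to_index(sentences: List[str]) -> Dict[str, str]:
    """Map each sentence to its line number, stored as a string."""
    # Assumption: line numbers start at 0.
    return {sentence: str(i) for i, sentence in enumerate(sentences)}


def is_acceptable_shape(shape: Tuple[int, ...], sequence_length: int) -> bool:
    """Check a representation's shape against the two accepted layouts."""
    if len(shape) == 2:
        # (sequence_length, representation_dim)
        return shape[0] == sequence_length
    if len(shape) == 3:
        # (num_layers, sequence_length, representation_dim)
        return shape[1] == sequence_length
    return False


sentences = ["The cat sat .", "Hello world"]
sentence_to_index = build_sentence_to_index(sentences)
# The mapping is stored in the file as a single JSON string.
serialized = json.dumps(sentence_to_index)
```

Note that if the same sentence appears on several lines, a dictionary like this keeps only the last line number for it.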

If you have a sentence_to_index Dict[str, str] and a vectors dictionary containing a mapping from numbers to vectors (a dictionary with consecutive numbers as keys and numpy arrays as values), you can pass the dictionaries into the following function to produce an HDF5 file with the proper format.

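import json
import h5py

def make_hdf5_file(sentence_to_index, vectors, output_file_path):
    with h5py.File(output_file_path, 'w') as fout:
        for key, embeddings in vectors.items():
            fout.create_dataset(
                str(key), embeddings.shape, dtype='float32', data=embeddings)
        # The key name below is an assumption; the original snippet omits it.
        sentence_index_dataset = fout.create_dataset(
            "sentence_to_index", (1,), dtype=h5py.special_dtype(vlen=str))
        sentence_index_dataset[0] = json.dumps(sentence_to_index)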
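
To check a file written in this format, one could read a sentence's representation back as in the sketch below. It assumes the JSON mapping is stored under the key "sentence_to_index" (an assumed key name), and the helper name `load_representation` is hypothetical. Depending on the h5py version, the stored string may come back as `str` or `bytes`; `json.loads` accepts both.

```python
import json

import h5py
import numpy as np

# Assumed name of the key holding the sentence-to-index JSON mapping.
SENTENCE_INDEX_KEY = "sentence_to_index"


def load_representation(hdf5_path: str, sentence: str) -> np.ndarray:
    """Return the stored array for `sentence` from an HDF5 file in the format above."""
    with h5py.File(hdf5_path, "r") as fin:
        # The mapping is a one-element dataset holding a JSON string.
        sentence_to_index = json.loads(fin[SENTENCE_INDEX_KEY][0])
        key = sentence_to_index[sentence]
        # [()] reads the whole dataset into memory as a numpy array.
        return fin[key][()]
```

The returned array has one of the two shapes listed earlier, so callers that need a single layer still have to index into the first axis when three dimensions are present.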

Step 2. Creating the experiment configuration

The experiment_configs/ directory contains all the experiment configurations used in this project. TODO (nfliu): Write more about writing your own experiment config.

Step 3. Training the probing model

To train a probing model on top of the precomputed word representations, we use the allennlp train command.

Given a single configuration file, we can train with:

allennlp train CONFIG_PATH -s SERIALIZATION_DIR --include-package contexteval

For example, to train a probing model on the topmost ELMo layer (the default) for POS tagging:

allennlp train experiment_configs/elmo_original/ewt_pos_tagging.json \
    -s ewt_pos_tagging_topmost_layer \
    --include-package contexteval

Note that the precomputed contextualizers in the experiment config do not have a layer specified. This causes models to default to using the topmost layer. To train on, say, the first layer (index 0), you can run this command:

allennlp train experiment_configs/elmo_original/ewt_pos_tagging.json \
    -s models/elmo_original/ewt_pos_tagging_layer_0 --include-package contexteval \
    --overrides '{"dataset_reader": {"contextualizer": {"layer_num": 0}}, "validation_dataset_reader": {"contextualizer": {"layer_num": 0}}}'
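
Hand-writing the nested JSON for --overrides is easy to get wrong. As a small sketch, the same string can be generated with `json.dumps`; the helper name `layer_overrides` is hypothetical, and the keys simply mirror the command above.

```python
import json
import shlex


def layer_overrides(layer_num: int) -> str:
    """Build the --overrides JSON that selects one contextualizer layer."""
    contextualizer = {"contextualizer": {"layer_num": layer_num}}
    return json.dumps({
        "dataset_reader": contextualizer,
        "validation_dataset_reader": contextualizer,
    })


# Quote the JSON so it can be pasted after --overrides in a shell command.
print(shlex.quote(layer_overrides(0)))
```

Both the training and validation dataset readers receive the same layer number, matching the command above, so the probing model is trained and validated on the same layer.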

To train on all layers, one-by-one, you can wrap the above in a bash for-loop.

for i in 0 1 2; do allennlp train experiment_configs/elmo_original/ewt_pos_tagging.json \
    -s models/elmo_original/ewt_pos_tagging_layer_${i} --include-package contexteval \
    --overrides '{"dataset_reader": {"contextualizer": {"layer_num": '${i}'}}, "validation_dataset_reader": {"contextualizer": {"layer_num": '${i}'}}}'; done

Step 4. Evaluating the probing model on test data

To evaluate a trained probing model on test data, use the allennlp evaluate command.

To evaluate the three models we trained above and log the output to a file, we can run:

for i in 0 1 2; do allennlp evaluate models/elmo_original/ewt_pos_tagging_layer_${i}/model.tar.gz \
    --evaluation-data-file ./data/pos/en_ewt-ud-test.conllu --cuda-device 0 \
    --include-package contexteval 2>&1 | tee models/elmo_original/ewt_pos_tagging_layer_${i}/evaluation.log; done


  author    = {Liu, Nelson F.  and  Gardner, Matt  and  Belinkov, Yonatan  and  Peters, Matthew E.  and  Smith, Noah A.},
  title     = {Linguistic Knowledge and Transferability of Contextual Representations},
  booktitle = {Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
  year      = {2019}
