Source code/webpage/demos for the What-If Tool
The What-If Tool (WIT) provides an easy-to-use interface for expanding understanding of a black-box classification or regression ML model. With the plugin, you can perform inference on a large set of examples and immediately visualize the results in a variety of ways. Additionally, examples can be edited manually or programmatically and re-run through the model in order to see the results of the changes. It contains tooling for investigating model performance and fairness over subsets of a dataset.
The purpose of the tool is to give people a simple, intuitive, and powerful way to play with a trained ML model on a set of data through a visual interface with absolutely no code required.
The tool can be accessed through TensorBoard or as an extension in a Jupyter or Colab notebook.
Check out the large set of web and Colab demos in the demo section of the What-If Tool website.
To build the web demos yourself:

* Binary classifier for UCI Census dataset salary prediction
  * Dataset: UCI Census
  * Task: Predict whether a person earns more or less than $50k based on their census information
  * To build and run the demo from code: `bazel run wit_dashboard/demo:demoserver` then navigate to `http://localhost:6006/wit-dashboard/demo.html`
* Binary classifier for smile detection in images
  * Dataset: CelebA
  * Task: Predict whether the person in an image is smiling
  * To build and run the demo from code: `bazel run wit_dashboard/demo:imagedemoserver` then navigate to `http://localhost:6006/wit-dashboard/image_demo.html`
* Multiclass classifier for Iris dataset
  * Dataset: UCI Iris
  * Task: Predict which of three classes of iris flowers a flower falls into, based on 4 measurements of the flower
  * To build and run the demo from code: `bazel run wit_dashboard/demo:irisdemoserver` then navigate to `http://localhost:6006/wit-dashboard/iris_demo.html`
* Regression model for UCI Census dataset age prediction
  * Dataset: UCI Census
  * Task: Predict the age of a person based on their census information
  * To build and run the demo from code: `bazel run wit_dashboard/demo:agedemoserver` then navigate to `http://localhost:6006/wit-dashboard/age_demo.html`
  * This demo model returns attribution values in addition to predictions (through the use of vanilla gradients) in order to demonstrate how the tool can display attribution values from predictions.
You can use the What-If Tool to analyze a classification or regression TensorFlow Estimator that takes TensorFlow Example or SequenceExample protos (data points) as inputs directly in a Jupyter or Colab notebook.
Additionally, the What-If Tool can analyze AI Platform Prediction-hosted classification or regression models that take TensorFlow Example protos, SequenceExample protos, or raw JSON objects as inputs.
You can also use the What-If Tool with a custom prediction function that takes TensorFlow examples and produces predictions. In this mode, you can load any model (including non-TensorFlow models that don't use Example protos as inputs) as long as your custom function's input and output specifications are correct.
With either AI Platform models or a custom prediction function, the What-If Tool can display and make use of attribution values for each input feature in relation to each prediction. See the section on attribution values below for more information.
If you want to train an ML model from a dataset and explore the dataset and model, check out the WhatIfToolNotebookUsage.ipynb notebook in Colab, which starts from a CSV file, converts the data to tf.Example protos, trains a classifier, and then uses the What-If Tool to show the classifier performance on the data.
A walkthrough of using the tool in TensorBoard, including a pretrained model and test dataset, can be found on the What-If Tool page on the TensorBoard website.
To use the tool in TensorBoard, only the following information needs to be provided:
Information on creating a saved model with the `Estimator` API that will use the appropriate TensorFlow Serving Classification or Regression APIs can be found in the saved model documentation and in this helpful tutorial. Models that use these APIs are the simplest to use with the What-If Tool as they require no set-up in the tool beyond setting the model type.
Alternatively, the What-If Tool can be used to explore a dataset directly from a CSV file. See the next section for details.
The information can be provided in the settings dialog screen, which pops up automatically upon opening this tool and is accessible through the settings icon button in the top-right of the tool. The information can also be provided directly through URL parameters. Changing the settings through the controls automatically updates the URL so that it can be shared with others for them to view the same data in the What-If Tool.
If you only want to explore the information in a CSV file using the What-If Tool in TensorBoard, set the path to the examples to the file (with a ".csv" extension) and leave the inference address and model name fields blank. The first line of the CSV file must contain column names. Each line after that contains one example from the dataset, with values for each of the columns defined on the first line. The pipe character ("|") delimits separate feature values in a list of feature values for a given feature.
In order to make use of the model understanding features of the tool, you can have columns in your dataset that contain the output from an ML model. If your file has a column named "predictions__probabilities" with a pipe-delimited ("|") list of probability scores (between 0 and 1), then the tool will treat those as the output scores of a classification model. If your file has a numeric column named "predictions" then the tool will treat those as the output of a regression model. In this way, the tool can be used to analyze any dataset and the results of any model run offline against the dataset. Note that in this mode, the examples aren't editable as there is no way to get new inference results when an example changes.
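As a rough sketch of the CSV layout described above, the following writes a small file body with a header row, a pipe-delimited list feature, and a `predictions__probabilities` column. The feature names (`age`, `occupation`, `languages`) and the score values are made up for illustration; only the header-row requirement, the pipe delimiter, and the `predictions__probabilities` column name come from the text above.

```python
import csv
import io

# Hypothetical rows: feature values plus class scores from an offline model run.
rows = [
    {"age": 39, "occupation": "Sales", "languages": ["en", "es"], "scores": [0.8, 0.2]},
    {"age": 52, "occupation": "Tech-support", "languages": ["en"], "scores": [0.35, 0.65]},
]


def to_wit_csv(rows):
    """Serialize rows to CSV text with pipe-delimited list columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # The first line must contain the column names.
    writer.writerow(["age", "occupation", "languages", "predictions__probabilities"])
    for row in rows:
        writer.writerow([
            row["age"],
            row["occupation"],
            "|".join(row["languages"]),                       # list-valued feature
            "|".join(str(score) for score in row["scores"]),  # one score per class
        ])
    return buffer.getvalue()


csv_text = to_wit_csv(rows)
print(csv_text)
```

For a regression model, the same approach would apply with a single numeric `predictions` column in place of the pipe-delimited scores.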
Details on the capabilities of the tool, including a guided walkthrough, can be found on the What-If Tool website. Here is a basic rundown of what it can do:
Visualize a dataset of TensorFlow Example protos.
Visualize the results of the inference
Explore counterfactual examples
Edit a selected example in the browser and re-run inference and visualize the difference in the inference results.
Compare the results of two models on the same input data.
If using a binary classification model and your examples include a feature that describes the true label, you can do the following:
If using a multi-class classification model and your examples include a feature that describes the true label, you can do the following:
If using a regression model and your examples include a feature that describes the true label, you can do the following:
We imagine WIT to be useful for a wide variety of users.

* ML researchers and model developers - Investigate your datasets and models and explore inference results. Poke at the data and model to gain insights, for tasks such as debugging strange results and looking into ML fairness.
* Non-technical stakeholders - Gain an understanding of the performance of a model on a dataset. Try it out with your own data.
* Lay users - Learn about machine learning by interactively playing with datasets and models.
As seen in the example notebook, creating the `WitWidget` object is what causes the What-If Tool to be displayed in an output cell. The `WitWidget` object takes a `WitConfigBuilder` object as a constructor argument. The `WitConfigBuilder` object specifies the data and model information that the What-If Tool will use.
The WitConfigBuilder object takes a list of tf.Example or tf.SequenceExample protos as a constructor argument. These protos will be shown in the tool and inferred in the specified model.
The model to be used for inference by the tool can be specified in several ways:

- As a TensorFlow Estimator object that is provided through the `set_estimator_and_feature_spec` method. In this case the inference will be done inside the notebook using the provided estimator.
- As a model hosted by AI Platform Prediction through the `set_ai_platform_model` method.
- As a custom prediction function provided through the `set_custom_predict_fn` method. In this case WIT will directly call the function for inference.
- As an endpoint for a model being served by TensorFlow Serving, through the `set_inference_address` and `set_model_name` methods. In this case the inference will be done on the model server specified. To query a model served on host "localhost" on port 8888, named "mymodel", you would set on your builder `builder.set_inference_address('localhost:8888').set_model_name('mymodel')`.
See the documentation of WitConfigBuilder for all options you can provide, including how to specify other model types (defaults to binary classification) and how to specify an optional second model to compare to the first model.
Feature attribution values are numeric values for each input feature to an ML model that indicate how much impact that feature value had on the model's prediction. There are a variety of approaches to get feature attribution values for a prediction from an ML model, including SHAP, Integrated Gradients, SmoothGrad, and more.
They can be a powerful way of analyzing how a model reacts to certain input values, beyond simply studying the effect that changing individual feature values has on model predictions, as is done with partial dependence plots. Some attribution techniques, such as the gradient-based methods, require access to a model's internals, whereas others can be performed on black-box models. Regardless, the What-If Tool can visualize the results of attribution methods in addition to the standard model prediction results.
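To make the idea of an attribution value concrete, here is a minimal sketch for a toy linear model, where each feature's attribution is its weight times the difference between the input and a baseline value. This is not how SHAP or SmoothGrad work in general, and it is not code from the What-If Tool; it is only the simplest case (one that gradient-based methods such as Integrated Gradients reduce to for a linear model). The weights, feature names, and baseline are invented for illustration.

```python
def predict(weights, bias, inputs):
    """Toy linear model: bias plus the weighted sum of the feature values."""
    return bias + sum(weights[name] * inputs[name] for name in weights)


def linear_attributions(weights, inputs, baseline):
    """Per-feature attribution relative to a baseline input."""
    return {
        name: weights[name] * (inputs[name] - baseline[name])
        for name in weights
    }


# Hypothetical model parameters and data point.
weights = {"age": 0.5, "hours_per_week": 2.0}
bias = 1.0
example = {"age": 40, "hours_per_week": 10}
baseline = {"age": 30, "hours_per_week": 8}

attributions = linear_attributions(weights, example, baseline)
print(attributions)

# For this model the attributions add up to the change in the prediction.
delta = predict(weights, bias, example) - predict(weights, bias, baseline)
print(sum(attributions.values()), delta)
```

Each value says how much that feature moved the prediction away from the baseline prediction, which is the kind of per-feature number the tool displays.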
There are two ways to use the What-If Tool to visualize attribution values. If you have deployed a model to Cloud AI Platform with the explainability feature enabled, and provide this model to the tool through the standard `set_ai_platform_model` method, then attribution values will automatically be generated and visualized by the tool with no additional setup needed. If you wish to view attribution values for a different model setup, this can be accomplished through use of the custom prediction function.
As described in the `set_custom_predict_fn` documentation in WitConfigBuilder, this method must return a list of the same size as the number of examples provided to it, with each list entry representing the prediction-time information for that example. In the case of a standard model with no attribution information, the list entry is just a number (in the case of a regression model), or a list of class probabilities (in the case of a classification model).
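The plain return format described above can be sketched as follows. The `toy_score` function and the dictionary-based examples are hypothetical stand-ins for a real model and real example protos; the only point is the shape of what each prediction function returns.

```python
import math


def toy_score(example):
    """Hypothetical stand-in for a real model: maps an example to a raw score."""
    return example["age"] / 50.0 - 1.0


def predict_classification(examples):
    """Return one list of class probabilities per example."""
    results = []
    for example in examples:
        p_positive = 1.0 / (1.0 + math.exp(-toy_score(example)))
        # Index 0 holds the score for class 0, index 1 the score for class 1.
        results.append([1.0 - p_positive, p_positive])
    return results


def predict_regression(examples):
    """Return one number per example."""
    return [toy_score(example) for example in examples]


examples = [{"age": 25}, {"age": 60}]
print(predict_classification(examples))
print(predict_regression(examples))
```

In both cases the returned list has exactly one entry per input example, in the same order as the inputs.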
However, if there is attribution or other prediction-time information, then the list entry can instead be a dictionary, with the standard model prediction output under the `predictions` key. Attribution information can be returned under the `attributions` key and any other supplemental information under its own descriptive key. The exact format of the attributions and other supplemental information can be found in the code documentation linked above.
If attribution values are provided to the What-If Tool, they can be used in a number of ways. First, when selecting a datapoint in the Datapoint Editor tab, the attribution values are displayed next to each feature value and the features can be ordered by their attribution strength instead of alphabetically. Additionally, the feature values are colored by their attribution values for quick interpretation of attribution strengths.
Beyond displaying the attribution values for the selected datapoint, the attribution values for each feature can be used in the tool in the same ways as any other feature of the datapoints. They can be selected in the datapoints visualization controls to create custom scatter plots and histograms. For example, you can create a scatter plot showing the relationship between the attribution values of two different features, with the datapoints colored by the predicted result from the model. They can also be used in the Performance tab as a way to slice a dataset for comparing performance statistics of different slices. For example, you can quickly compare the aggregate performance of a model on datapoints with low attribution of a specified feature against the datapoints with high attribution of that feature.
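The low-versus-high attribution comparison can be illustrated outside the tool with a few lines of Python. This is only a sketch of the idea, not the tool's implementation: the record layout, the `age` feature, and the choice of the median as the split point are all assumptions made for the example.

```python
from statistics import median

# Hypothetical datapoints: attribution per feature, true label, predicted label.
rows = [
    {"attributions": {"age": 0.05}, "label": 1, "predicted": 1},
    {"attributions": {"age": 0.10}, "label": 0, "predicted": 1},
    {"attributions": {"age": 0.60}, "label": 1, "predicted": 1},
    {"attributions": {"age": 0.80}, "label": 0, "predicted": 0},
]


def accuracy(rows):
    """Fraction of rows whose prediction matches the label (NaN if empty)."""
    if not rows:
        return float("nan")
    return sum(r["label"] == r["predicted"] for r in rows) / len(rows)


def slice_by_attribution(rows, feature):
    """Split rows into low and high attribution slices at the median."""
    threshold = median(r["attributions"][feature] for r in rows)
    low = [r for r in rows if r["attributions"][feature] <= threshold]
    high = [r for r in rows if r["attributions"][feature] > threshold]
    return low, high


low, high = slice_by_attribution(rows, "age")
print("low attribution accuracy:", accuracy(low))    # 0.5
print("high attribution accuracy:", accuracy(high))  # 1.0
```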
Any other supplemental information returned from a custom prediction function will appear in the tool as a feature named after its key in the dictionary. It can be used in the same way, driving custom visualizations and serving as a dimension to slice on when analyzing aggregate model performance.
When a datapoint is edited and then re-inferred through the model with the "Run inference" button, the attributions and other supplemental information are recalculated and updated in the tool.
For an example of returning attribution values from a custom prediction function (in this case using the SHAP library to get attributions), see the WIT COMPAS with SHAP notebook.
First, install and enable WIT for Jupyter through the following commands:
    pip install witwidget
    jupyter nbextension install --py --symlink --sys-prefix witwidget
    jupyter nbextension enable --py --sys-prefix witwidget
Then, use it as seen at the bottom of the WhatIfToolNotebookUsage.ipynb notebook.
Install the widget into the runtime of the notebook kernel by running a cell containing:
!pip install witwidget
Then, use it as seen at the bottom of the WhatIfToolNotebookUsage.ipynb notebook.
WIT has been tested in JupyterLab versions 1.x, 2.x, and 3.x.
Install and enable WIT for JupyterLab 3.x by running a cell containing:
    !pip install witwidget
    !jupyter labextension install wit-widget
    !jupyter labextension install @jupyter-widgets/jupyterlab-manager

Note that you may need to specify the correct version of jupyterlab-manager for your JupyterLab version as per https://www.npmjs.com/package/@jupyter-widgets/jupyterlab-manager.

Note that you may need to run `!sudo jupyter labextension ...` commands depending on your notebook setup.
Use of WIT after installation is the same as with the other notebook installations.
Yes. You can do this by defining a Python function named `custom_predict_fn` which takes two arguments: a list of examples to perform inference on, and the serving bundle object which contains information about the model to query. The function should return a list of results, one entry per example provided. For regression models, the result is just a number. For classification models, the result is a list of numbers, representing the class scores for each possible class. Here is a minimal example that just returns random results:
    import random

    # The function name "custom_predict_fn" must be exact.
    def custom_predict_fn(examples, serving_bundle):
      # Examples are a list of TFRecord objects, each object contains the features of each point.
      # serving_bundle is a dictionary that contains the setup information provided to the tool,
      # such as server address, model name, model version, etc.

      number_of_examples = len(examples)
      results = []
      for _ in range(number_of_examples):
        score = random.random()
        results.append([score, 1 - score]) # For binary classification
        # results.append(score) # For regression
      return results
Define this function in a file you save to disk. For this example, let's assume the file is saved as `/tmp/my_custom_predict_function.py`. Then start the TensorBoard server with `tensorboard --whatif-use-unsafe-custom-prediction /tmp/my_custom_predict_function.py` and the function should be invoked once you have set up your data and model in the What-If Tool setup dialog. The `unsafe` in the flag name means that the function is not sandboxed, so make sure that your function doesn't do anything destructive, such as accidentally deleting your experiment data.
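Before pointing TensorBoard at the file, it can help to check that the function's output has the expected shape. The helper below is a hypothetical convenience, not part of the What-If Tool: it only checks the contract described above (one entry per example, a number for regression, a list of numbers for classification).

```python
from numbers import Real


def check_predictions(results, num_examples, regression=False):
    """Raise ValueError if results do not match the expected return shape."""
    if len(results) != num_examples:
        raise ValueError(
            "expected %d results, got %d" % (num_examples, len(results)))
    for index, entry in enumerate(results):
        if regression:
            ok = isinstance(entry, Real)
        else:
            ok = (isinstance(entry, (list, tuple)) and len(entry) > 0
                  and all(isinstance(score, Real) for score in entry))
        if not ok:
            raise ValueError("bad result at index %d: %r" % (index, entry))
    return True


# Stand-in values; a real check would call custom_predict_fn on real examples.
fake_examples = [object(), object(), object()]
fake_results = [[0.9, 0.1], [0.4, 0.6], [0.5, 0.5]]
print(check_predictions(fake_results, len(fake_examples)))
```

The helper deliberately checks only the shape of the output; it says nothing about whether the scores themselves are sensible.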
Check out the development guide.
Check out the release notes.