The model performance page is used to drill down into the performance of each individual model and slice the data to determine whether the model performs better/worse on different subsets of the data.
We want this page to show a single version of a single model, so we’ll prompt the user to select a model and version number if they haven’t already. To do this, we create two more measures with “New measure” in DAX:
selected_model = IF(
DISTINCTCOUNT(model_metric_tracking[model_name]) > 1,
"Select A Model",
selected_version = IF(
DISTINCTCOUNT(model_metric_tracking[model_version]) > 1,
"Select A Version",
Now that we’ve got these, we can insert our cards and slicers.
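Each of the two measures above needs a final argument for the case where exactly one value is in scope. A minimal sketch of how complete versions might look, assuming SELECTEDVALUE is used to return the one remaining value and that model_version is a whole number (both are assumptions, not necessarily what the original measures used):

```dax
// Sketch only: the final argument of each IF is an assumption.
// In Power BI each measure is entered separately via "New measure".
selected_model = IF(
    DISTINCTCOUNT(model_metric_tracking[model_name]) > 1,
    "Select A Model",
    // Assumed else-branch: the single model name left in the filter context
    SELECTEDVALUE(model_metric_tracking[model_name])
)

selected_version = IF(
    DISTINCTCOUNT(model_metric_tracking[model_version]) > 1,
    "Select A Version",
    // Assumed else-branch: the single version, formatted as text
    // so that both branches of the IF return the same data type
    FORMAT(SELECTEDVALUE(model_metric_tracking[model_version]), "0")
)
```

With a single model and version selected in the slicers, the cards would show those values; otherwise they show the prompt text.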
We’ll start with the slicers:
The first slicer on the left is our model name slicer, so we insert a slicer and provide model_name as the field. Under “Selection controls” we’ll want to ensure “Single select” is on. We can also make this slicer a “Dropdown” rather than a “List”. Then we can select a single model as shown:
The second slicer on the left is our model version slicer. Here we let the user select the “Latest” version, which ensures this page always shows the latest model, even after re-training.
This is created as a “New column”, as below: if the model version equals the maximum model version for that model name we return “Latest”, otherwise we return the version number as a string:
latest_version_number = IF(
) = model_metric_tracking[model_version],
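The logic described above (compare each row’s version with the maximum version for its model name) can be sketched as a complete calculated column. This is one possible formulation, assuming CALCULATE with ALLEXCEPT is used to find the per-model maximum and FORMAT is used for the string conversion; the variable name max_version_for_model is hypothetical:

```dax
// Sketch only: the functions chosen here are assumptions.
latest_version_number =
VAR max_version_for_model =
    // Highest model_version among all rows that share this row's model_name
    CALCULATE(
        MAX(model_metric_tracking[model_version]),
        ALLEXCEPT(model_metric_tracking, model_metric_tracking[model_name])
    )
RETURN
    IF(
        max_version_for_model = model_metric_tracking[model_version],
        "Latest",
        // Convert the number to text so the column has a single data type
        FORMAT(model_metric_tracking[model_version], "0")
    )
```

Both branches return text because a calculated column cannot mix text and numeric results.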
As this is now a string, sorting gives odd results such as “1”, “10”, “2”… instead of “1”, “2”… “10”, so we’ll want to sort this column by our model_version column. We can do this by clicking on the field and then, in “Column tools”, selecting the “Sort by column” button and clicking on “model_version”.
We then need to select “Sort descending” so that “Latest” shows up first, followed by the most recently trained versions.
% Lower Status Population
The 2 most important features for predicting average district house prices were “LSTAT” and “RM” so we’ve created slicers for these to see how our model performs on different subsets of data here.
“LSTAT” is “% Lower Status Population” and we want to bin this for our slicer. We do this in DAX using the SWITCH function to create a “New column”. Note that there are no percentage values above 40:
LSTAT_BINNED = SWITCH(
model_metric_tracking[LSTAT] < 10,
model_metric_tracking[LSTAT] < 20,
model_metric_tracking[LSTAT] < 30,
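A minimal sketch of how a complete binned column might look, assuming the common SWITCH(TRUE(), ...) pattern, in which the conditions are tested in order and the first one that holds determines the result. The bin labels are hypothetical placeholders:

```dax
// Sketch only: the bin labels are placeholders, not the original ones.
LSTAT_BINNED = SWITCH(
    TRUE(),
    model_metric_tracking[LSTAT] < 10, "0-10%",
    model_metric_tracking[LSTAT] < 20, "10-20%",
    model_metric_tracking[LSTAT] < 30, "20-30%",
    // Fallback for everything from 30 upwards (no values exceed 40)
    "30-40%"
)
```

Because the conditions are checked top to bottom, each one only needs an upper bound.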
We use this calculated column for the slicer. For the dropdown here we provide the option to “Select All”:
Avg. Room Count
As mentioned above, the other important feature for predicting average district house prices was “RM”, which is the average number of rooms. Again we bin this data using DAX in a “New column”:
RM_BINNED = SWITCH(
model_metric_tracking[RM] < 5,
model_metric_tracking[RM] < 6,
model_metric_tracking[RM] < 7,
model_metric_tracking[RM] < 8,
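The same pattern can be sketched for the room count. Since text labels sort alphabetically, this sketch also adds a numeric helper column that could be used to sort the bins in their natural order. The labels and the RM_BIN_ORDER column are hypothetical:

```dax
// Sketch only: the labels and the helper column are assumptions.
RM_BINNED = SWITCH(
    TRUE(),
    model_metric_tracking[RM] < 5, "< 5",
    model_metric_tracking[RM] < 6, "5-6",
    model_metric_tracking[RM] < 7, "6-7",
    model_metric_tracking[RM] < 8, "7-8",
    "8+"
)

// Hypothetical helper: one number per bin, for sorting RM_BINNED
RM_BIN_ORDER = SWITCH(
    TRUE(),
    model_metric_tracking[RM] < 5, 1,
    model_metric_tracking[RM] < 6, 2,
    model_metric_tracking[RM] < 7, 3,
    model_metric_tracking[RM] < 8, 4,
    5
)
```

Each label maps to exactly one order value, which is what sorting one column by another requires.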
Again, we use this calculated column for the slicer.
Next to the slicers we have our cards, which update based on the choices in the slicers.
For each card, we’ve turned on the title, updated its formatting, and turned the category off (except for the last 2, where we use it to indicate $'000s for RMSE).
This is copied for all 6 cards shown:
- Model Name (selected_model)
- Model Version (selected_version)
- Training Date (date – select “Latest” for aggregation)
- Training Values (is_train – select “Sum” for aggregation)
- Combined RMSE (rmse)
- Test RMSE (test_rmse)
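Cards that show aggregations are re-evaluated whenever a slicer changes the filter context. As an illustration only, here is a sketch of how an RMSE measure could be computed directly from the PRICE and PRICE_PREDICTION columns, so that it reflects whichever subset is currently selected. The measure names rmse_filtered and test_rmse_filtered are hypothetical, and this is not necessarily how the rmse and test_rmse fields above were built. It also assumes is_train is 1 for training rows and 0 for test rows:

```dax
// Sketch only: hypothetical measures, not the original rmse / test_rmse.
rmse_filtered =
SQRT(
    // Mean of the squared errors over the rows in the current filter context
    AVERAGEX(
        model_metric_tracking,
        POWER(
            model_metric_tracking[PRICE] - model_metric_tracking[PRICE_PREDICTION],
            2
        )
    )
)

// Same calculation restricted to test rows (assumes is_train = 0 marks test data)
test_rmse_filtered =
CALCULATE(
    [rmse_filtered],
    model_metric_tracking[is_train] = 0
)
```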
We create a scatter chart to show how the model performed on each data point vs the ground truth, so start by selecting a Scatter Chart:
Then select the appropriate fields ("PRICE" for X Axis and "PRICE_PREDICTION" for Y Axis):
Some changes we’ll need to make to this scatter chart:
- Click on the dropdown menu for Y Axis and click “Don’t summarize”
- If you have a lot of data to show, increase the data point limit
- I chose not to fill the points: the default transparency is quite low, which makes it difficult to judge the density of points
Data to Show selection
Below the scatter chart there is a slicer for choosing which data to show in the scatter chart:
This is done by creating a new calculated column in DAX:
training_label = IF(model_metric_tracking[is_train] = 1, "Training", "Test")
We then create a slicer from this new calculated column. However, as we only want this slicer to filter the scatter chart and not our cards, we need to turn off those interactions. To do that, select the slicer, then in the “Format” ribbon select “Edit interactions” and turn off the interactions with every visualization other than the scatter chart: