How to Avoid Disaster


Important: Your model is only as good as the data it was trained on.

Two problems to watch out for:

  1. out-of-domain data: "data that our model sees in production that is very different to what it saw during training."
  2. domain shift: "whereby the type of data that our model sees changes over time."
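
A rough way to watch for both problems on a single numeric feature is sketched below. This is my own minimal illustration, not something from the book: the function names, the sample values, and the choice of statistics (fraction of production values outside the training range, and how far the production mean has moved in units of the training standard deviation) are all assumptions.

```python
import statistics


def out_of_range_fraction(train_values, prod_values):
    """Fraction of production values outside the [min, max] seen in training.

    A crude signal for out-of-domain data (hypothetical helper).
    """
    lo, hi = min(train_values), max(train_values)
    outside = sum(1 for v in prod_values if v < lo or v > hi)
    return outside / len(prod_values)


def mean_shift_in_stds(train_values, prod_values):
    """Distance between production and training means, in training std devs.

    Tracked over time, a growing value hints at domain shift (hypothetical helper).
    """
    mu = statistics.mean(train_values)
    sigma = statistics.stdev(train_values)
    return abs(statistics.mean(prod_values) - mu) / sigma


# Made-up feature values, for illustration only.
train = [10.0, 12.0, 11.0, 13.0, 9.0, 10.5, 11.5]
prod = [11.0, 18.0, 19.5, 12.0, 20.0]

oor = out_of_range_fraction(train, prod)
shift = mean_shift_in_stds(train, prod)
print(oor)    # 0.6: three of the five production values fall outside 9.0..13.0
print(shift)
```

Any alert thresholds on these numbers would have to be chosen per feature; the sketch only shows what could be measured.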

Mitigation steps:

"Where possible, the first step is to use an entirely manual process with your model running in parallel and not being used to directly drive any actions."

"The second step is to try and limit the scop of the model."

"The third step is to gradually increase the scope of your rollout."

Tip: "Try to think about all the ways in which your system could go wrong, and then think about what measure or report or picture could reflect that problem, and ensure that your regular reporting includes that information."
Note: Defining good validation and test sets is part of the solution. See my "How to create good validation and test sets" for more details.

1. "Chapter 2: From Model to Production" in The Fastbook, pp. 86-90, provides even more detail and is worth a thorough read.