{"id":2578818,"date":"2023-04-12T13:00:13","date_gmt":"2023-04-12T17:00:13","guid":{"rendered":"https:\/\/wordpress-1016567-4521551.cloudwaysapps.com\/plato-data\/beyond-accuracy-evaluating-improving-a-model-with-the-nlp-test-library\/"},"modified":"2023-04-12T13:00:13","modified_gmt":"2023-04-12T17:00:13","slug":"beyond-accuracy-evaluating-improving-a-model-with-the-nlp-test-library","status":"publish","type":"station","link":"https:\/\/platodata.io\/plato-data\/beyond-accuracy-evaluating-improving-a-model-with-the-nlp-test-library\/","title":{"rendered":"Beyond Accuracy: Evaluating & Improving a Model with the NLP Test Library"},"content":{"rendered":"

Sponsored Post

\"Beyond
NLP Test: Deliver Safe & Effective Models<\/span>
 <\/p>\n

The need to test Natural Language Processing models

 
A few short years ago, one of our customers notified us about a bug. Our medical data de-identification model had near-perfect accuracy in identifying most patient names (as in "Mike Jones is diabetic") but was only around 90% accurate when encountering Asian names (as in "Wei Wu is diabetic"). This was a big deal, since it meant that the model made 4 to 5 times more mistakes for one ethnic group. It was also easy to fix, by augmenting the training dataset with more examples of this (and other) groups.
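As a minimal sketch of how such a check can be automated with the nlptest library, the snippet below builds a test harness for a named entity recognition model; the specific model ("dslim/bert-base-NER") and hub are illustrative assumptions for the example, not the de-identification model from the incident above.

```python
# A minimal sketch, assuming a Hugging Face NER model; the model name
# below is an illustrative choice, not the model described above.
from nlptest import Harness

# Create a test harness for a named entity recognition task.
harness = Harness(task="ner", model="dslim/bert-base-NER", hub="huggingface")

# Generate test cases (including bias tests that swap in names from
# different groups), run them against the model, and report pass rates.
harness.generate().run().report()
```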

Most importantly, it got us thinking: