data-and-data

Fellas, just wanted to share a story that’s been bugging me for days. Last week, I had the opportunity to attend a presentation by a vendor (unfortunately, for certain reasons, I can’t mention the name). Basically, they’ve been developing a web-based app specifically for tagging purposes (I can’t go into more detail). I was hooked by the concise yet appealing presentation until... they started showcasing a machine learning model they had built to perform the tagging task. The tagging, simply put, is a multi-label text classification task (a text can be assigned more than one label at the same time). They claimed the model had been performing consistently well, showing the usual metrics: accuracy, precision, recall, and F1-score. All the scores were above 94%, which left some of my colleagues in awe. This is where I started to smell something fishy. I began asking a series of questions to confirm my suspicion. Starting with the dataset they used, then moving on to h...

Posts

Don't Get Fooled by Numbers!