Louis’s Learning: Data Validation


As I write this, much of the world is hunkered down at home, trying to avoid infection and slow the spread of the COVID-19 virus. I thank all those who are still helping others and providing essential services.

I realize that optimizing a supply chain is not a priority on anyone’s mind at the moment. Writing this post is more of a self-indulgent exercise to bring a sense of normalcy back into my life. I hope others can find a smidgen of solace in reading it.

Using Data to Inform Decisions

My work as a supply chain practitioner and consultant has mostly revolved around providing decision-makers with the information and data they need to make the right choice. I like to consider myself a “numbers guy”: it is much easier to influence a decision when you have sound, quantified data to back an assertion.

Looking at the data on COVID-19 infections, it would be easy to draw false conclusions. For instance, as of this writing, the infection rate in the US stands at 99 cases per million population. Taken on its own, that figure is a concern, and a very difficult time for those involved, but quite manageable for our health care system. Look instead at the growth rate of infections, however, and the situation is far more alarming, warranting a completely different approach to tackle the issue.
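To make the point concrete, here is a minimal sketch of why the growth rate matters more than the current level. The doubling time below is an assumption chosen purely for illustration, not an epidemiological estimate:

```python
def project(initial, doubling_time, horizon_days):
    """Project a quantity forward assuming steady exponential doubling.

    Illustrative only: real epidemic curves do not double forever.
    """
    return [
        (day, initial * 2 ** (day / doubling_time))
        for day in range(0, horizon_days + 1, doubling_time)
    ]

# 99 cases per million today, assumed to double every 3 days
for day, cases in project(initial=99, doubling_time=3, horizon_days=30):
    print(f"day {day:2d}: {cases:,.0f} cases per million")
```

A month of steady doubling turns a modest per-capita rate into one a thousand times larger, which is why the same data point reads so differently once you look at its trajectory.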

Question Everything

I bring up this example to highlight how easy it is to make errors when looking only narrowly at the numbers. When conducting an analysis, it is always important to take a step back and validate your data. Survey the numbers and question everything that seems out of the ordinary.

Here are a few examples where a detailed review yielded useful outcomes:

  • $0-cost shipments: these frequently represent customer pick-ups, but upon digging a little further it was found that the data had been corrupted: columns had become misaligned in a spreadsheet at some point in the analysis. A quick re-pull of the data resolved the issue.
  • Low order quantities: while the overall yearly quantity and value of orders had passed a “sniff” test, a review of the monthly numbers revealed that two months of orders were missing from the dataset.
  • Extra-large shipments: out of several hundred thousand shipments, fewer than 20 had weights in excess of truckload (TL) capacity (>44,000 lbs). Yet upon digging into the data, it was found that many shipments had been consolidated into single records, an issue that had to be resolved before the analysis could proceed.
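Checks like the three above are easy to automate before an analysis begins. The sketch below illustrates the idea; the field names (`cost`, `ship_month`, `weight_lbs`) are hypothetical, chosen for the example rather than taken from any real dataset:

```python
TL_CAPACITY_LBS = 44_000  # typical truckload capacity, per the example above

def validate_shipments(shipments, expected_months):
    """Flag records and gaps worth questioning before any analysis.

    `shipments` is a list of dicts with hypothetical keys
    "cost", "ship_month", and "weight_lbs".
    """
    issues = []

    # $0-cost shipments: customer pick-ups, or corrupted/misaligned rows?
    zero_cost = [s for s in shipments if s["cost"] == 0]
    if zero_cost:
        issues.append(f"{len(zero_cost)} shipments with $0 cost")

    # Missing months: the yearly total can pass a sniff test while
    # whole months are absent from the dataset.
    missing = expected_months - {s["ship_month"] for s in shipments}
    if missing:
        issues.append(f"months missing from dataset: {sorted(missing)}")

    # Overweight shipments: possibly consolidated records.
    overweight = [s for s in shipments if s["weight_lbs"] > TL_CAPACITY_LBS]
    if overweight:
        issues.append(f"{len(overweight)} shipments exceed TL capacity")

    return issues

sample = [
    {"cost": 0, "ship_month": 1, "weight_lbs": 12_000},
    {"cost": 850, "ship_month": 3, "weight_lbs": 51_000},
]
print(validate_shipments(sample, expected_months=set(range(1, 13))))
```

None of these flags is necessarily an error; each is a prompt to go back to the people who understand the data and ask why.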

To conclude: before delving into an analysis, whether a simple linear regression or a more complex AI algorithm, it pays to review and validate the data with those who understand it.

Louis Bourassa is the Analytics & Optimization Practice Head at JBF Consulting. He provides analytical and optimization support to JBF clients. Louis has a diverse background with a mix of industry, consulting and software roles that allowed him to develop a strong business acumen and expert knowledge of supply chain analysis and design.

Founded in 2003, JBF Consulting is a supply chain execution strategy and systems integrator to logistics-intensive companies of every size and any industry. Our background and deep experience in the field of packaged logistics technology implementation position us as industry leaders whose craftsmanship exceeds our clients’ expectations. We expedite the transformation of supply chains through logistics & technology strategy, packaged & bespoke software implementation, and analytics & optimization. For more information, visit us at www.jbf-consulting.com

vector: www.freepik.com