Is data science fundamentally about storytelling? Plenty of people think so.
I like this idea better: Approach your analytics projects as myth busting. This puts you in the right frame of mind to make sense of patterns in your data and then take effective action based on what you find.
Analyzing a natural experiment
In September, we wrote about a natural experiment we identified in our data set. One of our clients lowered, via a configuration setting, the number of candidates each supplier could submit for a particular position. The client, along with its MSP, expected that this change would decrease the resume load on hiring managers without affecting time to fill or candidate quality.
Our client’s managed service provider (MSP) had received regular complaints from hiring managers about getting too many resumes. Sometimes, at the request of hiring managers inundated with too many candidates at once, MSP staffers had to manually shut off submissions to certain requisitions.
Approaching this from a myth-busting position requires stating up front what hypothesis you’re testing. This is a fundamental step in statistical inference, and indeed, in science in general. The MSP thought that reducing the submittal limit setting would reduce the total number of resumes that hiring managers received, without increasing time to fill or impacting candidate quality.
Is this, in fact, what happened?
No – it’s not. We busted that myth. We found that the average total number of resumes submitted per position actually went up after the setting was changed. You can see it in the chart below, which shows boxplots of submission counts totaled up per position. In a boxplot, the line across the middle of the box indicates the median, while the lower and upper boundaries of the boxes indicate the first and third quartiles.
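For readers who want to see what a boxplot actually summarizes, here is a minimal sketch in Python. The submission counts are invented for illustration and are not our client's data, and the function name boxplot_summary is our own. Charting tools differ slightly in how they interpolate quartiles, so treat the "inclusive" method used here as one reasonable choice rather than the one behind our chart.

```python
import statistics


def boxplot_summary(counts):
    """Return the first quartile, median, and third quartile of a list of counts."""
    # quantiles(..., n=4) returns the three cut points that split the data into quarters.
    q1, median, q3 = statistics.quantiles(counts, n=4, method="inclusive")
    return {"q1": q1, "median": median, "q3": q3}


# Hypothetical total submissions per position (invented numbers, not real data).
before_change = [4, 6, 7, 8, 9, 10, 12]
after_change = [6, 8, 9, 11, 12, 14, 15]

print("before:", boxplot_summary(before_change))
print("after: ", boxplot_summary(after_change))
```

Comparing the two summaries side by side is the numeric equivalent of comparing two boxes in the chart.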
Across all job categories, the median total submissions per requisition increased. This was a statistically significant change, suggesting it wasn’t due just to chance variation. Because nothing else substantive changed other than the supplier submittal limit setting, we can pretty comfortably attribute this difference to the settings change.
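One way to check whether a shift in medians could be chance variation is a permutation test. The sketch below is an illustration of that general technique, not a description of the test we ran; the function name, the default number of permutations, and the fixed random seed are all assumptions made for the example. The idea is to shuffle the "before" and "after" labels many times and count how often a shuffled split produces a median difference at least as large as the one observed.

```python
import random
import statistics


def permutation_test_median(before, after, n_permutations=10_000, seed=0):
    """Two-sided permutation test for a difference in medians.

    Returns (observed_difference, estimated_p_value).
    """
    rng = random.Random(seed)  # fixed seed so the example is repeatable
    observed = statistics.median(after) - statistics.median(before)
    pooled = list(before) + list(after)
    n_before = len(before)

    extreme = 0
    for _ in range(n_permutations):
        # Randomly reassign positions to the "before" and "after" groups.
        rng.shuffle(pooled)
        diff = (statistics.median(pooled[n_before:])
                - statistics.median(pooled[:n_before]))
        if abs(diff) >= abs(observed):
            extreme += 1

    # The add-one correction keeps the estimate strictly above zero.
    p_value = (extreme + 1) / (n_permutations + 1)
    return observed, p_value
```

A small p-value says the observed difference would be rare if the setting had no effect. It does not, by itself, say why the difference occurred.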
We investigated other aspects of the situation, including how submissions accumulated over time under the two settings and how the change affected time to fill. If you’re interested in reading a report of our analysis results, sign up to receive IQN Labs research.
A natural urge when you see counterintuitive results like this is to start telling stories about why they occurred. Every time we share these results with procurement specialists, we hear interesting and informed speculation about why total average submissions per position increased when the limit was lowered. While there’s nothing wrong with story time, we need to treat each of these stories skeptically, especially if we are going to change processes or decisions based on them.
Moving from the actual evidence from the data into storytelling is fraught with potential complications. Most often the problem we encounter is one of inferring causation when we only see correlation. Statistician Kaiser Fung calls the turn from evidence-based reporting of data analytic results to storytelling “story time” and finds it happening with regularity in journalists’ reporting of scientific research. Of course, it occurs in research papers too.
In this case, we don’t have any good information about why suppliers submitted more resumes after the change—we just know that they did.
Towards dynamic supplier submissions management
At IQN Labs, we start by investigating business questions like this, but our goal is eventually to develop machine intelligence supporting effective temporary labor hiring practices. Our intention is not so much to understand what happened in the past as to improve what happens in the future. So the next step in our standard process is not to investigate why we saw the counterintuitive result, but rather to consider what these results imply for how best to use data to drive better outcomes.
You can see our process diagrammed below. We undertake individual analyses in order to generate analytic ideas and assets that will eventually form the basis of data-intensive innovations in our application.
The takeaway for us from this analysis is the potential opportunity that lies in more dynamically managing supplier submission limits. MSP representatives shouldn’t have to field calls from hiring managers saying “enough!” The hiring “machine” should be able to tell when there are plenty of resumes for a given position (and should be able to tell when there are too few as well).
We think that there’s an opportunity for dynamic submission management that responds to the features of a particular job, such as its category, title, rate potential, and location. We also envision that a dynamic supplier submissions capability could note how a particular hiring manager is acting and adjust submission limits based on that.
If a particular hiring manager is very picky (say, she has declined 90% of resumes that have come through for a position), we might request additional resumes from suppliers. If it looks like a hiring manager has enough, we can shut down additional submissions automatically, with no work required on the part of procurement or MSP staff.
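To make the idea concrete, here is a deliberately simple rule-based sketch in Python. It is a stand-in for illustration only, not our software or the predictive models we are building. The class, the function, the action names, and both thresholds (a 90% decline rate, echoing the example above, and a backlog of five unreviewed resumes) are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RequisitionState:
    submitted: int  # resumes received so far for the position
    declined: int   # resumes the hiring manager has declined
    pending: int    # resumes still waiting for review


def submission_action(state, decline_threshold=0.9, max_pending=5):
    """Suggest what to do with supplier submissions for one requisition.

    Both thresholds are illustrative assumptions, not tuned values.
    """
    # A large unreviewed backlog means the manager has enough for now.
    if state.pending >= max_pending:
        return "pause"
    # A very picky manager with little left to review needs more candidates.
    if state.submitted > 0 and state.declined / state.submitted >= decline_threshold:
        return "request_more"
    return "keep_open"


print(submission_action(RequisitionState(submitted=20, declined=18, pending=1)))
print(submission_action(RequisitionState(submitted=10, declined=2, pending=6)))
```

Checking the backlog before the decline rate is a design choice: a manager who still has plenty to review is not sent more, even if she has been picky so far. A predictive model would replace these fixed thresholds with estimates that depend on the job and the hiring manager.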
The scientists at IQN Labs are hard at work building predictive models that will allow our software to gauge when a particular requisition needs more submissions or, conversely, when it has received enough for the time being.
If you want to read detailed results of our supplier submittal limits analysis, sign up to receive IQN Labs research updates.