This rant is a reaction to a LinkedIn post I was tagged in because I promote citizen data science.
There’s a lot of bullshit being sold about data science.
And for every person who thinks they’re the data science unicorn and all the rest aren’t skilled enough, there’s a company proving otherwise.
I’ve talked with a lot of people about this topic and presented on why I believe Microsoft is supporting this with their tooling.
Read on for some ranting and reasoning.
It doesn’t only boil down to tooling but also to making knowledge accessible for everyone.
We’ve already seen this with everything else in history.
From reading and writing to maths to cars to electronics and now even computers.
All these fields used to be considered super hard and special and only a few had access to the education to apply/practice them.
Data science isn’t a special field compared to those things, so people should really get off their high horse.
Will an amateur mathematician or amateur programmer be as good as someone who’s been trained for it?
Most likely not, but sometimes magic happens.
Amateurs invent cool stuff in their garage that has a lasting impact (Apple).
Teenagers write websites that change the world (Google, Facebook).
And I’m sure you can imagine even more examples.
Keeping enthusiastic people away from data science (or any field) because they don’t fit the profile in your head is a disaster to data literacy. It’s the same as keeping people away from schools because they don’t fit your view of what a student should know or look like.
These enthusiastic people, these power users, these citizen data scientists, they are what drives adoption and they actually help promote a “real” data scientist. This is even more true if you’re one of those academics that believe they are better than others due to the level of your education or the things you know.
Just stop fearing “regular people”, because that fear is the real force driving these data scientists on their high horse who tell you nothing good will come from citizen data science.
A recent and good example is the BI industry.
Self-service BI drew the same comments.
“Regular people are too dumb for BI”, “They will create bad data models”, “They don’t know the tooling”, “Think of the load on our database when everyone starts querying ad-hoc”, “This will never work because you lack governance”, etc.
Self-service BI in its literal form, where you throw a tool at people not trained in business intelligence, is doomed to fail.
But when you start cooperating and looking for ways to make it work, magic happens.
Suddenly there’s an army of people running experiments and even if only 1 out of 100 is useful, they will still accomplish a lot more than your small data science team will be able to do in the same period.
But you need to build a layer and tooling that serves people.
There are already companies which enable non-data-scientists to run predictive experiments thanks to internal tooling. Example: Airbnb (https://medium.com/airbnb-engineering/how-airbnb-democratizes-data-science-with-data-university-3eccc71e073a)
Again, tooling is what is driving this and another nice example is the Department of Defense, just watch this conference talk: (https://www.youtube.com/watch?v=glrBe2zwihc).
There are even data science community leaders who believe in citizen data science.
David Robinson is my favourite example, mainly because he’s really well known (https://www.datacamp.com/community/podcast/citizen-data-science).
The question is not “will it happen” or “why should it happen”.
History has already taught us that it will happen if people are interested (and they are).
The only real question is “How can we help make it happen sooner and better?”