The hardest part of data science: politics

The Data Science Illusion

When I was waking up at 6 AM to study Support Vector Machines I thought: “This is really tough! But, hey, at least I will become very valuable to my future employer!” If I could get the DeLorean, I would go back in time and call “Bulls**t!” on myself. The truth is that reality is much more nuanced, and the fact that the field is still far from mature isn’t helping at all.

The classical story goes something like this: “data scientists spend 80% of their time getting, cleaning and managing data; only the rest is spent on analysis and machine learning”. Wrong. Well, actually true if we consider only the time spent on productive work. The reality is that a lot of time is spent convincing people, dodging attacks and chasing executives to head off stupid mistakes.

Then there’s corporate politics

How it all starts

One day someone at SuperMegaCompany™ read somewhere that you must be data driven to be successful, that companies such as Airbnb, Netflix and Uber are eating the world thanks to data science, and so on. This generates a lot of hype and buzz, which we can clearly see in the number of job postings requiring some kind of data science-y approach.

This behavior leads to a plethora of problems, well summarized by Monica Rogati, that boil down to “chances are you’re not ready to get a data scientist on board”. For those of you who never had to start up such an area, I can sum it up in a few words: there will be blood.

Most of the time SuperMegaCompany™ has never dealt with the processes required by healthy data science, and please let’s not even talk about a data driven culture. Even if there is some sort of reporting in place, you’re not data driven. So, I hear you check Tableau every day? It doesn’t make you data driven.

I get it, everything you read online on the topic eventually degenerates either into useless, very abstract corporate jargon or into a presentation of the tools at your disposal. I won’t comment on the first option, but the second one is much more interesting because it is more subtle. These people are basically claiming that you can get better at math by getting a fancier calculator, but as is clear to anyone, this is far from the truth.

Tools don’t make anything by themselves. They are called tools for a reason: you use them as a means to an end, but they won’t use themselves, and if you use them wrong you are not going to reach any end. Culture comes from people, not tools, and it is deeply ingrained in our conscious and unconscious minds. People have biases and tend to be creatures of habit: routine saves us, and you are coming to disrupt all of this.

What you will find

If you’re lucky you will find people who know they’re not data driven and hired you to get some help. But chances are that you will find people who think they are data driven and data literate: “I check the financial statement and a summarized report of every area every month, so you know, we are really data driven here…”. Ok. Sure.

This is the worst that can happen. These people think that you will be the icing on the cake and that in a couple of months you’ll have found the Holy Grail of all business answers and destroyed the competition. In reality you’ll discover that most of the data is stored in spreadsheets, nobody ever deployed a log parser, and to get access to databases you literally have to beg the IT department.

Building from scratch can be fun if you get enough support and you don’t meet too many issues, but this is almost never the case.

Politics is (in) the way

Let’s say you have to collect data from an area led by an executive at SuperMegaCompany™. The guy has worked there for 8 years and everything is going smoothly. One day, a geeky guy knocks on the executive’s door asking if it’s possible to easily collect all his data. How dare you? So he asks why you need all the data and starts looking at you with suspicion. You say you need it to start analyzing what’s going on in the company. At this point there are usually three possible answers:

  • “Easy! We publish a very thorough report every <insert period>. You can get the data from there”
  • “It’s impossible to get all the data we generate and process every day, you see, our work is very <add phony adjective> and we can ensure quality only in this way”
  • “Sure! You can talk with <person who doesn’t even know what the company does> he/she can help you with that!”

There you go, welcome to the magic world of corporate politics!

Now months, if not years, of fighting await you, and there is nothing you can do about it. Some people will resist: they’ll think you want to control them and report to their bosses how inefficient they truly are. In the end, who can completely blame them? It will happen. Sooner or later someone will be fired because of the data you collected, the same data you cleaned and munged so meticulously, and yes, they’ll know it was you who analyzed it and presented it.

Let me be clear: it’s not your fault. They would probably have been fired anyway a few months later, when the inefficiencies became clear to everyone. But from their point of view, accepting that would mean admitting they’re not perfect, and who is willing to do that?

Fighting politics with politics

While I was writing this post, I found this about the same exact topic: a guide to navigate corporate politics for data scientists. I agree with most of the things written in the post, but I think it is a bit simplistic. Yes, the concepts are mostly ok, but they’re talking about policies, not politics. I know, I know, for some of you these might seem more or less the same thing, but they are not.

Politics indicates the practice of governance, or the actions aimed at getting a governing position. Policy is a plan that you put in practice after you have obtained governing power. I know this seems a useless rant coming from a political scientist (actual political sciences graduate here…aehm…), but the difference matters. A lot.

The correct policies are needed in order to become a functioning data science team, but how are you going to put them in place if you have no power? And how do you obtain power? With politics, and then you maintain it with more politics and policies, not the other way round.

The policy-first approach is typical of the American way of thinking: if you do things right people will like them and you’ll be good to go. In Europe, and especially in Italy (does Machiavelli ring a bell?), the opposite is true: politics is a fight for power, not a way to put policies in place. Of course the best approach lies in the middle: taken alone, each extreme has clear issues and won’t work in the long run.

The fact that data science is young makes scenarios like these more likely to happen to you as well:

  • Management or part of it doesn’t buy in
  • Resistance to data collection and analysis
  • Rejection of forecasts, predictions and insights
  • Treating you as a “number/Excel monkey”
  • Thinking that any unsolved problem is your problem even if it doesn’t require any of your role defining skills

Some tips to navigate corporate politics

The first thing you should do is find the friendliest managers and start working with them right away. Do simple projects that can be finished in a few weeks and, if possible, report the results and the value added to as many people as possible. At the beginning focus on top management and try to “be there” as much as possible: meetings, conference calls, etc.

Be there, and try to be as proactive as you can: launch new ideas and don’t be afraid to speak up if someone says something that doesn’t make sense or is even utterly stupid. While you’re doing this work, start making yourself available to regular coworkers: help them get the data they need, automate or speed up boring simple tasks, and so on.

All of this will be mostly informal work, it’s going to be tough for the first few months, but you need buy-in at almost every company level as soon as possible. In fact, if you miss the train it’s going to take much more time to catch up.

I know that to some of you these might seem like trivial, boring tasks, but I assure you that the majority of people will attach a high value to them and will start coming to you even for slightly unrelated issues.

When this moment arrives you have to start casting a wide net, this time starting from the bottom. Try to be as helpful as possible, offer your skills to make your coworkers’ jobs easier and be nice. For real. Be so nice that people almost get nauseous: explain carefully and very clearly everything you have to do, why you’re doing it and how you can make other people’s work easier.

Being able to communicate is the most important skill you can have in this job: if you’re not able to communicate effectively at every company level, you won’t be able to make a career out of it. If you can’t communicate well, people are going to see you as Anton Chigurh (the guy in the pic): a weird guy approaching with an obscure tool trying to exterminate them.

I know, it seems harsh, but it’s true and it’s real. People are scared by new things and what they don’t understand, and they are even more afraid of new things they don’t understand. You must find the way to limit the number of people that will see you as a pain in the neck, I say limit because it’s impossible that everyone is going to understand.

The only way to deal with these people is to ignore them and if you really have to work with them be even nicer than with others. If they’re not true jerks you’ll be able to win even their support eventually, but if they truly are jerks you’ll have to find ways to slow them down while not impeding the company.

You have to find someone willing to be your lightning rod attracting most of the attacks and absorbing them in your stead. This is essential, when you’re new you won’t be able to be your own lightning rod, seniority matters and different people can absorb different levels of damage. After some time you can start showing what you’re worth, but always remember that you aren’t the principal of most of the people you’ll be working with.

So to sum it up:

  • Find friendly managers and start working with them
  • Start producing value as soon as possible
  • Communicate everything you do to as many people as possible at every level
  • Be nice, explain everything clearly and help everyone
  • Avoid jerks, for real
  • Find a lightning rod absorbing damage in your stead

Down with policies

Now that you have reached a position of power (some power is enough, absolute power is a utopia) you can start focusing more on policies. You’ll have to keep helping people, but now that you are probably working on larger and more important projects, you won’t have time for everything and everyone.

Make it clear to everybody that you want to keep helping them, but that requests are becoming too hard to follow up on right away. Keep being nice and find a way to push most of the work on requests to one day per week: make a requests Friday, for instance, and give people a way to submit deferred requests. A Google form is enough to collect them along with some details, and you can look at it just once per week.

I made a big deal about communicating and doing it well, but it’s very difficult to do it effectively and continuously. At the very beginning do presentations and get personal and face to face with your coworkers. This way you get to know people better and can identify weak points, potential jerks, supportive people and so on.

The only problem is that you can’t keep doing this indefinitely. Presenting stuff takes time and effort, a lot of time and effort, and you don’t want to spend all of your time on slides and charts, right? So you have to find the right media to share stuff within your company.

If the SuperMegaCompany™ already has a widely used sharing system, stick with it. Otherwise there are dozens of possibilities you can exploit. If you think that an internal blog would help, you might want to take a look at the Knowledge Repo from Airbnb: it’s a complete CMS with a web server and a nice templating system that automatically converts Jupyter notebooks and R Markdown files to posts.

What I can tell you is that just a blog won’t cut it: people are not going to start reading it spontaneously and engaging by themselves. The simplest way to deal with this is to email posts to the people you know might be interested, and to publish reports and studies on the blog so you can direct people there.

What would really be great is to start a discussion channel where everyone can publish and discuss content. This way you’ll be just one of many posters, and nobody will look at you funny for emailing them new posts and being a “smartass”. I personally love Reddit, and you can deploy your own internal Reddit version if you like.

Always keep a record of what you do and how much time it takes, then try to attach a value to every finished project. If the value is directly measurable, record it over a span of time; if it is not measurable, even an order of magnitude and/or a range is going to be ok. This will become useful when someone asks you what you do for them or for the company, trying to challenge you.

It’s also going to be helpful for estimating the time required for future projects. Businesses are not used to our job’s cycles and time spans, so people will expect you to deliver in a couple of weeks or less. But we know that if an A/B test requires a month and a half, it is going to take that long to get an answer, maybe even more, almost never less. So be clear in advance about the effort, resources and time required for your projects, and remember that you can buy everything: computing power, people (not really buying, but you get the point), materials, and so on.
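As a rough illustration of the record keeping described above, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `ProjectRecord` name, its fields, and the idea of storing value as a low/high range (so a measurable project simply has low equal to high). A spreadsheet with the same columns would do the same job.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class ProjectRecord:
    """One finished project (hypothetical structure, for illustration only)."""
    name: str
    hours_spent: float
    value_low: float   # lower bound of the estimated value, e.g. per year
    value_high: float  # upper bound; equal to value_low when directly measurable

    def value_per_hour(self) -> Tuple[float, float]:
        # Range of value produced per hour of work on this project.
        return (self.value_low / self.hours_spent,
                self.value_high / self.hours_spent)


def summarize(records: List[ProjectRecord]) -> Dict[str, float]:
    """Totals you can quote when someone asks what you do for the company."""
    return {
        "projects": len(records),
        "hours": sum(r.hours_spent for r in records),
        "value_low": sum(r.value_low for r in records),
        "value_high": sum(r.value_high for r in records),
    }


# Made-up example entries.
log = [
    ProjectRecord("churn report", hours_spent=40, value_low=10_000, value_high=50_000),
    ProjectRecord("weekly export automation", hours_spent=10, value_low=2_000, value_high=2_000),
]
print(summarize(log))
```

The range fields are the point of the sketch: an honest "somewhere between 10k and 50k" is still an answer, while an empty cell is not.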
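To show where a time span like that comes from, here is a hedged sketch of a textbook sample-size calculation for comparing two conversion rates (a two-sided two-proportion z-test). The function names, the 5% significance level, the 80% power and all the traffic numbers are assumptions chosen for the example, not figures from this post.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p_control, p_variant, alpha=0.05, power=0.80):
    """Visitors needed in EACH group for a two-sided two-proportion z-test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    p_pooled = (p_control + p_variant) / 2
    numerator = (
        z_alpha * sqrt(2 * p_pooled * (1 - p_pooled))
        + z_power * sqrt(p_control * (1 - p_control) + p_variant * (1 - p_variant))
    ) ** 2
    return ceil(numerator / (p_variant - p_control) ** 2)


def days_needed(p_control, p_variant, daily_visitors, **kwargs):
    """Days to fill both groups, assuming all daily traffic enters the test."""
    n = sample_size_per_group(p_control, p_variant, **kwargs)
    return ceil(2 * n / daily_visitors)


# Hypothetical numbers: 10% baseline conversion, hoping for 12%, 200 visitors per day.
n = sample_size_per_group(0.10, 0.12)
days = days_needed(0.10, 0.12, daily_visitors=200)
print(f"{n} visitors per group, about {days} days")
```

With these made-up inputs the test needs a bit under 4,000 visitors per group, which at 200 visitors a day means roughly five to six weeks. No amount of pressure from above shortens that; only more traffic or a bigger expected effect does.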

The only thing you can’t buy is time. Nobody can give you back the minutes you spent reading this post, not even if you are the richest man on Earth. So treat your time as the most precious thing you have, and remember that for others your time is always less valuable than theirs.

To spend your time wisely and not waste it, you have to set up a process you can follow most of the time. Try to split it according to the kind of task at hand: different things will have different requirements and steps. I don’t really believe in Agile, Kanban, etc., and I especially don’t believe in tracking software. A spreadsheet and an agenda are more than enough for a small team, and would often be good enough even for large teams.

Summarizing:

  • Remember that time management is the key
  • Establish a weekly day to screen and process requests
  • Give presentations, but find a way to spread knowledge more deeply and continuously
  • Draw your processes and find a way to attach value to every project
  • Try to stay lean and simple as long as you can

Concluding (with politics)

If you’re thinking: “All this stuff seems like a lot of work!” you’re right. Most of you have probably noticed that data analysis, machine learning and so on were only briefly mentioned and were never the focus. Unfortunately the side work is very important: if you don’t do it you won’t have the chance to prove your worth and everything will become pointless.

Even if you’re successful in getting some recognition, remember that with power comes responsibility: meetings, briefings, calls, work trips, and so on. If you’re the only one doing your job at your company, good luck with that. Soon everything will become unbearable and you’ll have to ask for someone to work with you as soon as possible. This is where power and influence come in very handy: if you have none, nobody will listen to you and you’ll have to keep doing everything by yourself.

So remember: policies are important, but you need power to be able to put them in place. One without the other is meaningless, and can even be counterproductive.

Source

Copyright © 2016-2021 Data Scientist. All rights reserved.