How to make sense of statisticians
Updated: Nov 7, 2020
It can be surprisingly hard to talk about stats and models. Communication in general is not easy, but when the goal is to translate ideas and concepts into numbers, there’s no room for misunderstandings.
One of the things I enjoy most in research is thinking about problems and finding ways to solve them. The datasets I work with are almost always messy, which makes it hard to know exactly which method to use to analyze them, but it also creates a fun challenge: how can I extract the information I need, which methods can be used, how can we deal with missing data? There’s always an opportunity to learn, be creative, and sometimes even develop new methods.
But what can be even harder than working with new datasets is talking to people about analyzing them. When analyzing data, one needs to be very specific about what exactly the question is. It is not enough to say ‘hey, I have a nice dataset on bee visits to flowers and which flowers become infected with a fungus, let’s make a model about the spread of this fungus!’. Even though this makes intuitive sense in that the motivation is to understand what’s going on and how this fungus is spreading, this is too vague for creating a useful model. The dataset determines how detailed a model can be. The question determines how detailed a model needs to be.
Here's a specific anecdote:
Some time ago during my PhD in Antwerp, Prof. David Costantini got in touch with me about something he was working on and asked whether it would be possible to create a model to test the concept. David was (and probably still is) clearly a clever scientist and knew this topic inside and out, but had never ventured into the modelling world before, while I knew nothing at all about what he was working on (something called hormesis that sounded to me like a bad disease but turned out to be a cool eco-evo mechanism). After some background reading we got together to chat about what to do.
After our first chat it was clear that there were some exciting questions to address, and I went back feeling motivated and eager to start building a model. But almost as soon as I started working on it, it became obvious that there were still a bunch of important details that I hadn’t even considered because I was lacking the background, and that David also hadn’t considered because he lacked experience with what a model requires. I believe it took three or four more meetings at different stages of the model before we both really understood what the other needed, even though after almost every meeting I went back to my computer thinking I would write the code for the model and get the results we were interested in. And this wasn’t just about data details. The hardest part was getting the questions right: what exactly were the questions that we should test?
This struggle to get on the same page, understand each other, and pin down the details of the question is in fact a process filled with opportunities to learn and to improve the science. The requirements of a useful model forced both of us to home in on the small but important gaps in knowledge, and on the most interesting unknown aspects of the system and the research field.
I always enjoy this communication dance when working on new projects and with new people. It’s a necessary back-and-forth in which everyone learns, and in my experience it has always made the science better.
Have you experienced a similar situation where misunderstandings resulted from communicating with people with different backgrounds? Do you also find that these situations can be great teaching moments?