Making the master thesis great again: It’s about tea and water. And the replication crisis.

Make the master thesis great again: Students need to do empirical work, and we need empirical work to be replicated to solve the reproducibility crisis.

Next week, I travel back to Milan for our July graduation session. My richly talented master student will defend her insightful master thesis on unexpectedly low funding goals on Kickstarter. Think of it like this: sometimes you find campaigns that set their funding goal so low that there’s no way they won’t get the funding they ask for. Does this have an influence on the overall funding that they accumulate? It’s an interesting topic.

But that’s not the norm. Maybe also because the interests of students and supervisors differ on that.

Interestingly, not all universities require their students to write a thesis. In Europe, you will almost always have to write one. At many American schools, it’s an optional endeavor on the master level; bachelor theses typically do not exist. Students with a personal affection for a topic pick the master thesis. So do students who are interested in doing a PhD later on. In the European Union, both the bachelor thesis and the master thesis are a basic component of any program that I’m aware of.

The problem: The (master) thesis is a pain in the a…cademic path of students

As a consequence, year by year tens of thousands of students write a thesis. Their thesis should do something novel, because they are supposed to demonstrate that they can work academically and independently. Unfortunately, at many schools this ends up in a mismatch with all the teaching that came before. Teaching is now often very applied, industry-oriented, not necessarily academic. At the bachelor level, I have not met a single student in the past five years who had ever worked with an academic paper prior to the thesis process.

But if that’s the reality of academic teaching and learning: what’s the point?

Why should we then pretend that we still live in a time of the past? It only leads to lots of pressure and one of two problems: Either students receive appropriate last-minute coaching at the expense of a lot of time spent by their supervisors, time that is lost for research. Or students end their academic path with a frustrating experience of delays, not getting around the thesis thing, and superficial to non-existent supervision, while supervisors have given up on putting effort into yet another student who does not even understand the basics of academic writing.

Don’t get me wrong: I’ve supervised some very insightful theses both on the bachelor level and on the master level. I’ve been on the committee for excellent thesis projects under the supervision of colleagues. But I do very well know that it’s not the experience of the majority.

And for that same majority, we are bad supervisors.

Also, the fact that students individually can determine the focus of their thesis to a large extent, for some students the master’s thesis is a mythical entity that can lead to feelings of insecurity or even inferiority (Ylijoki, 2001). This is indicated by Nena, who at some point had no faith that she would ever finish her thesis (see quote at the start of this chapter). In addition, concerning the teacher–student interaction, supervisors are often not trained or supported in supervising students in a one-on-one situation and therefore base their supervision on their own experiences as a supervisee (Delamont, Parry, & Atkinson, 1998; Lee, 2008; Marsh, Row, & Martin, 2002; Pearson & Brew, 2002; Philips & Pugh, 2005; Todd, Smith, & Bannister, 2006). Lastly, regarding the complex goals of both supporting and assessing learning, it can be unclear what the task/responsibility of the supervisor is and what the task/responsibility of the student is, as reflected in Marlene’s comment.

(Kleijn, 2013, p. 3)

Strangely, thesis supervision is the teaching task with the most impact that is assigned to me as a professor, but also the only one that students do not evaluate. The following table (from Kleijn’s dissertation) displays the results from a multi-cohort study into student satisfaction at Utrecht University; it covers responses from about 1,000 students between 2009 and 2013. All responses are measured on 7-point Likert scales. I’m sorry, but… these results are abysmal.


From my personal experience, the problem is that in most cases we as supervisors just don’t really care. We might focus much of our attention on the best students. To those, we might give projects that explore some research interests of ours for which we don’t have the time. That’s pretty much the case with my student graduating this July. On the flip side, they need the least supervision. The worst students require the most supervision effort. At least as long as we care about them learning and improving. But even if we care: often we don’t get any reward for that effort.

For me personally, the worst part of a thesis supervision trajectory is the very beginning, when we need to agree on a topic. Be it bachelor level or master level. We expect them to come up with something interesting and academic. They come to us with something way too fluffy (“I want to do research on crowdsourcing” – “In which sense?” – “I want to study crowdsourcing in the Netherlands” – yeah, but what exactly?), or with something way too simplistic and way too business-oriented: A marketing plan for the company they work for does not meet thesis expectations. And to be fair: coming up with a viable and academically interesting thesis topic is difficult. Especially if they have no clue about what could be academically interesting.

So far, so good. Two days ago, a curious Chinese friend (currently in the process of writing her own master thesis) asked me why every student has to write a thesis. And to be fair: while there are lots of good reasons that people can bring forward, I think it boils down to: because it always has been like that. At Bocconi, we’ve recently revised the thesis process and now make a distinction between applied theses and research theses. Applied theses can’t get full points, but supposedly stress students less. We think they are a nice offer for students who do not want to pursue an academic career. But at Bocconi, the weight of a thesis is small. A bachelor thesis counts up to 4 points out of 110, and a master thesis counts up to 8 points out of 110. So… why don’t we just let the majority of students go with another elective?

In short: for the majority of students, I think we should abolish the thesis altogether. Certainly on the bachelor level. For the master level, I have another idea.

Some context: tea and water

The same Chinese friend is currently on a business trip through Asia. At one point, we spoke about the enjoyment of drinking tea vs. hot water vs. cold drinks. (Here’s the gist: I can drink tea, but I don’t fancy hot beverages, so I seldom drink tea. She mentioned she’d even drink hot unflavored water. I appreciate that, but I don’t appreciate that for myself.) Then, yesterday, I stumbled upon this paper.

Huang, Y., Choe, Y., Lee, S., Wang, E., Wu, Y., and Wang, L. (2018). Drinking tea improves the performance of divergent creativity, Food Quality and Preference, 66: pp. 29-35.

Let me share with you also the abstract.

Previous research has found that tea improves performance on convergent creativity tasks, such as the Remote Associates Test, by inducing a positive mood. However, there is no empirical evidence regarding the effect of tea drinking on performance in divergent creativity tasks. Using two experiments, the current research investigates the relationship between tea consumption and divergent creativity. In both experiments, participants were randomly assigned to two groups and implicitly manipulated to drink tea or water. In experiment 1 (N = 50), we used a block-building task as a measure of divergent creativity in spatial cognition. The results showed that the participants who drank tea performed better in the spatial creativity task assigned in the 10 min immediately following tea consumption than did those who drank water. In experiment 2 (N = 40), we adopted the restaurant naming task as a measure of divergent creativity in semantic cognition. The results showed that the participants who drank tea received higher scores in the semantic creativity task compared to those who drank water. The current research demonstrates that drinking tea can improve creative performance with divergent thinking. This work contributes to understanding the function of tea on creativity and offers a new way to investigate the relationship between food and beverage consumption and the improvement of human cognition.

(Part of) her response to me read like this:

Only 50 people took part in the experiment, and they published the study?? What kind of nuts will take the study serious then?

She has a fair point (or maybe not, but more on that later). Aside from that, the experimental design is flawed (and I’m not the first to point it out). They compare drinking tea vs. water at a temperature of 42°C for either beverage. They have two studies to show their main effect, but nothing to prove the causality of the proposed mechanism. It could be tea vs. water, but it could also be anything flavored vs. water, and it could also be that cold water trumps both tea and hot water. There’s no control group either (everyone is drinking something).

I should point out that I drank neither tea nor water during the writing of this post. I had some milk at breakfast and a bit of juice later. You should try warm orange juice in winter. It tastes better than it sounds.

Anyways, here is what the news made of that tea vs. water study:

Half of the participants were given a cup of black tea to drink while the other half were administered a glass of water before their cognitive and creative skills were put to the test. (The Independent)

As the students gave their name, age and other details to researchers, half were given a cup of black tea to drink and the other half a glass of water, before immediately going into one of two different tests. (The Telegraph)

Half of the students in each test drank up to one cup of tea, three minutes before the tasks began. The other half drank water. (The New York Post)

The research, by psychologists at Peking University in China, involved 50 students with an average age of 23. Half were given a cup of black tea to drink while the others drank a glass of warm water. The two groups then completed standard tests to assess their creative skills. (World Tea News)

One single news outlet pointed out this vital part of the experimental design. This is not cherry picking. I’ve read through more sources that reported on that study, and World Tea News is the only one that mentioned the temperature of the water. There’s a lot of debate about bad research, but there’s way too little debate about bad use of research.

Still, there is bad research. A lot of it.

Some more context: the replication (or: reproducibility) crisis

At Bocconi, I teach marketing research. There’s some content related to that on my blog already. Instagram (or Tinder) is a great teaching tool. And you don’t really need a textbook. Though students read – if you care about it. One session each year I devote to “Exploring the limits of science”. It’s a great chance to share with them the best syllabus ever written on this planet.

In a nutshell, this session covers the issue of the replication crisis. We have too many studies that suffer from too small sample sizes. People engage in p-hacking. They also don’t understand p-values correctly and tend to misinterpret them. There’s a file drawer problem (people only report studies with significant results and they only continue promising research) and a publication bias (studies with null-findings, i.e. that don’t confirm their stated hypotheses, rarely get to publication). All of that happens intentionally and unintentionally, but it happens. And people who will use research results should be just as aware of this problem as people who produce them.
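To make the file-drawer and publication-bias argument a bit more tangible, here is a minimal simulation sketch in Python. Everything in it is an assumption chosen for illustration and not taken from any real study: a true standardized effect of 0.2, 25 participants per group, a simple z-test with a known standard deviation of 1, and the crude rule that only significant studies get "published".

```python
import random
from statistics import NormalDist, mean


def simulate_published_effects(true_d=0.2, n_per_group=25, n_studies=2000,
                               alpha=0.05, seed=1):
    """Run many small two-group studies; return (all_effects, published_effects)."""
    rng = random.Random(seed)
    # Assumption: SD is known to be 1 in both groups, so a z-test is enough here.
    se = (2 / n_per_group) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    all_effects, published = [], []
    for _ in range(n_studies):
        treat = [rng.gauss(true_d, 1) for _ in range(n_per_group)]
        ctrl = [rng.gauss(0, 1) for _ in range(n_per_group)]
        diff = mean(treat) - mean(ctrl)
        all_effects.append(diff)
        # Hypothetical "file drawer": only significant results are kept.
        if abs(diff) / se > z_crit:
            published.append(diff)
    return all_effects, published


all_effects, published = simulate_published_effects()
print(f"mean effect, all studies:      {mean(all_effects):.2f}")
print(f"mean effect, 'published' only: {mean(published):.2f}")
print(f"share of studies 'published':  {len(published) / len(all_effects):.0%}")
```

Under these assumptions, the average over all simulated studies stays close to the true effect, while the "published" subset overstates it: with 25 participants per group, a difference only reaches significance if it is larger than roughly 0.55, far above the true 0.2.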

As part of that session, I also show the following graph to students.


So, this is about finding meaningful (i.e. substantially large) effects rather than just finding significant effects.

Back to the tea vs. water paper. They used just 50 participants for their first experiment and 40 for their second. They compared two experimental groups (no control). They still found an effect. Cohen’s d is ~ .55, so it’s not a small effect either. We would typically still recommend that they run a study with more than twice as many participants if they had expected that big an effect and wanted to make sure not to overlook it. Nonetheless, they’ve been lucky and found it. Hence, the small sample size itself is not necessarily a problem in their study. Actually, it still is a problem. Small samples are prone to misrepresenting population means. Many small samples from the same population might have very different means.
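The "more than twice as many participants" remark can be checked with a back-of-the-envelope power calculation. The sketch below uses the common normal-approximation formula for comparing two group means, n per group ≈ 2 · (z_(1−α/2) + z_power)² / d². The significance level of 0.05 and the target power of 0.80 are conventional defaults that I assume here; only d = 0.55 comes from the discussion above. An exact t-test-based calculation would give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate participants per group for a two-sided two-group comparison."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value of the two-sided test
    z_power = z.inv_cdf(power)          # quantile for the desired power
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


n = n_per_group(0.55)
print(n, "per group,", 2 * n, "in total")
```

Under these assumptions, the approximation asks for roughly 52 participants per group, so a bit over 100 in total, compared with the 50 participants of the first experiment.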

But for the sake of my argument, it doesn’t matter. As I pointed out earlier, other things in this experimental design are flawed.

Making a master thesis great again: contribute to the need for replication

Professional researchers usually do not run replications. They can’t publish them, so there’s no reward for it. But replications are necessary to make sure that effects are not just coincidental. It’s actually a great task for people who should get familiar with running empirical academic research. As is: critically reviewing a paper and identifying some fundamental flaws.

From my perspective, herein lies the potential to give back value and meaning to a master thesis. That doesn’t mean we should not allow for excellent students with an interesting idea to conduct their own innovative study. But for the average student… I’d rather assign them a paper (or have them select a paper) and (1) replicate the findings, while (2) improving on some fundamental flaw of the original modeling framework or experimental design.

Students are the key to solving the replication crisis. Yes, we need to teach them the right way of doing things. And all of a sudden, we could have a myriad of people producing replications. They can write a much more focused literature review, as they would need to concentrate on literature that helps them to identify the flaws. They add to the scientific process, because we need these replications urgently. They learn how to run an empirical test with rather explicit guidelines, as they can largely follow the method section of the paper that they replicate.

The big issue is money. Behavioral marketing experiments cost money, too, and students don’t typically get money to run their studies. In countries where student fees are the norm, there seems to be an easy solution: just increase the tuition fees by an amount that would allow students to run a simple experimental replication. Or actually by less than that, because some students might use secondary data for their replication of a modeling framework; they’d cross-subsidize experiments by others.

On the side of supervisors, supervision gets more focused, too. We don’t need to guide them through a lengthy topic selection process. We only need to check whether (part of) a paper is suitable for replication by a master student. It is easier for us to assess their methodology even if they do something outside our domain, because we, too, have a template to follow. We can assess the creative academic contribution rather straightforwardly: did they pick a challenging target or low-hanging fruit for their replication, and did their replication successfully address the flaw that they found or not? And in a thesis defense, we can ask more meaningful questions.

Some might argue for the novelty of master theses, but let’s be honest: at the very end of it, they are rarely new in any way. Very likely, many master theses unknowingly are already replications of research projects that have been done by some other master student somewhere else, or at the same place a few years earlier. Or, take Bocconi again: I have sat in the thesis defenses of so many master theses that investigate how consumers think about a certain industrial topic, and then they do a factor analysis to identify the key drivers, a cluster analysis to segment their participants based on these key drivers, and maybe (but not always) a regression to see how these clusters influence an outcome variable. There’s hardly anything novel in any of that; it’s what we now call an applied thesis as opposed to a research thesis.

So, why not make it salient – and then do it right? At the very end, this could feed one big database, in which we can look up, for any published empirical paper, what the replications by master students look like. It might help professional researchers to identify those papers which deserve a second look in a peer-reviewed replication format.

Standard academic practices are not laws of nature. We don’t need textbooks for all classes. We don’t need theses for all students. We don’t need all theses to look like they do now. There might be better, more appropriate, more contemporary formats. This is one. Make the master thesis great again.
