Image created by Midjourney
Things are looking kinda bleak for our beloved programming forum. I looked at superuser activity on Stack Overflow and found that a significant number of superusers are leaving the site.
I've been using Stack Overflow for more than a decade at this point. Almost everyone who writes code for a living frequents this website, or at least they used to. I even started answering questions on the site at one point to get better at programming (but also to add another bullet point to my resume). However, it seems like things are looking pretty bad for our beloved programmer forum. Stack Overflow publishes its user data online, so I decided to play around with it for my social computing systems class. The data is massive and all in XML, so getting it ready for analysis in BigQuery took a while and could be a separate blog post.
If you were on Stack Overflow in the early days, you have probably encountered users with ungodly levels of reputation. These are the users who helped bootstrap the site to where it is today. I call them "superusers".
It's hard to define what it really means for a user to churn. Someone could visit the site less but still make edits, answer questions, etc. Regardless, the user activity I focused on was the number of questions answered, so I looked at the rolling six-month average of the number of questions each superuser answered.
Plotting these averages as a line plot showed that the trajectories tend to follow an upward trend and then either plateau or decay.
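To make the rolling average concrete, here is a minimal Python sketch. The function names are hypothetical, and the choices to bucket answers by calendar month and to report only months with a full six-month window are assumptions made for illustration. The original analysis was done in BigQuery and may have handled these details differently.

```python
from collections import Counter
from datetime import date


def monthly_counts(answer_dates):
    """Count answers per calendar month, filling quiet months with zero."""
    counts = Counter((d.year, d.month) for d in answer_dates)
    if not counts:
        return []
    first, last = min(counts), max(counts)
    series = []
    year, month = first
    while (year, month) <= last:
        series.append(counts.get((year, month), 0))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return series


def rolling_mean(values, window=6):
    """Trailing mean over `window` months.

    Assumption: only full windows are reported, so the result has
    len(values) - window + 1 entries (or none if the series is too short).
    """
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical answer dates for one user
dates = [date(2020, 1, 5), date(2020, 1, 20), date(2020, 3, 1)]
print(monthly_counts(dates))
print(rolling_mean([6, 6, 6, 6, 6, 6, 0]))
```

Filling quiet months with zero matters here: without it, a user who stops answering would simply drop out of the series instead of pulling the average down.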
I used an exponential decay model to quantify churn.
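\[y = a \cdot e^{-kt}\]
where \(y\) is the activity level (rolling six-month average of the number of answers posted) since the peak, \(a\) is the peak activity level, \(k\) is the decay rate, and \(t\) is the time since peak activity. In this context, \(k\) is the rate of churn: the larger \(k\) is, the faster activity falls off, and a negative \(k\) means activity is still growing. I used the interquartile range (IQR) to categorize superusers into the following categories based on their churn rates, where q1 and q3 are the first and third quartiles of the churn rates and IQR = q3 - q1: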
Churn rate | Churn category |
---|---|
k < 0 | no churn |
0 < k < q1 - 1.5 * IQR | slow churn |
q1 - 1.5 * IQR ≤ k < q3 + 1.5 * IQR | fast churn |
k ≥ q3 + 1.5 * IQR | churned |
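The model and the table above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the exact procedure behind the numbers in this post. It takes the sign convention that a positive \(k\) means decaying activity (that is, \(y = a \cdot e^{-kt}\)), which is the convention the table implies. It also fixes \(a\) to the first observation, fits \(\ln(y/a) = -kt\) by least squares through the origin, skips zero-activity months, and computes quartiles over all fitted rates with Python's `inclusive` method. All function names are hypothetical.

```python
import math
import statistics


def fit_decay_rate(activity):
    """Estimate k in y = a * exp(-k * t) from activity levels since the peak.

    activity[0] is taken as the peak level a; t counts periods since the peak.
    Assumption: zero-activity periods are skipped because ln(0) is undefined.
    """
    a = activity[0]
    points = [(t, math.log(y / a)) for t, y in enumerate(activity) if t > 0 and y > 0]
    if not points:
        raise ValueError("need at least one positive post-peak observation")
    # Least squares through the origin for ln(y / a) = -k * t
    slope = sum(t * ly for t, ly in points) / sum(t * t for t, _ in points)
    return -slope


def churn_thresholds(rates):
    """Return (q1 - 1.5 * IQR, q3 + 1.5 * IQR) over the fitted rates."""
    q1, _, q3 = statistics.quantiles(rates, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr


def churn_category(k, lower, upper):
    """Map a churn rate to the categories in the table.

    The table does not say where k == 0 belongs; here it falls through
    to the fence checks like any other non-negative rate.
    """
    if k < 0:
        return "no churn"
    if k >= upper:
        return "churned"
    if k >= lower:
        return "fast churn"
    return "slow churn"


# Synthetic user whose activity decays at exactly k = 0.2 per period
activity = [10 * math.exp(-0.2 * t) for t in range(6)]
print(round(fit_decay_rate(activity), 3))
```

One thing this makes visible: when the rates are spread out, the lower fence `q1 - 1.5 * IQR` can land at or below zero, which leaves the slow churn bucket empty.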
I found that around 5% of superusers have already churned, and around 95% of them are in the fast churn category. Superusers who show no churn make up less than 1% of the total. These findings show that Stack Overflow's most influential contributors are at a very high risk of leaving the site, which could severely affect its longevity and sustainability.
You can see that folks in the three churn categories follow distinct patterns. One could speculate that ChatGPT is behind all this, but if you look at the trend line for the fast-churn category, things started going downhill way before ChatGPT came out. Regardless, it calls into question the future of Stack Overflow and of programming in general. Are we heading towards a world where we learn to code from robots instead of other humans? Who knows.
If you found this interesting, you can read the full report here.