tl;dr
We are interested in understanding the differences between C# and F#. If you work on a project that has both C# and F# and you want to help science, we would love to monitor your IDE usage and ask you a few questions too. Just download Kave, program like no one is watching for three months, and then send us your data. Also share your contact info, so we can ask you a few follow-up questions. All details on what we can and cannot see from you are here. If you have questions, comment below or send me an email at ‘mail’ at this domain.com.
What is the best programming language?
I can hardly remember a time in my life when I have not pondered this question. I remember debating with a friend in high school, when we were going from QBasic to Turbo Pascal, whether it is really better to declare your variables before you use them, and why. I guess everyone in programming has had discussions like this a gazillion times.
Let’s do some real science!
I have grown up a little bit since my high school Pascal days and I am now a scientist. So, instead of debating (as enjoyable as I find that), why not do some science? As a .NET developer and lover of F#, I am super interested to know how C# and F# compare. There have been some comparisons between languages (not including a functional language, unfortunately), but running such a study properly is not easy.
Like a real doctor
Ideally, you want to do an experiment like they do in medicine: they give a new pill to one group of people, a control group gets a placebo, and assignment into groups is random. If enough people in the first group get better, the pill works.
In computer science, such a controlled experiment looks like this: you bring in some developers, give them a problem and assign each of them a random language. If sufficiently many people do better in one language, you have evidence that this language is better. Very much like my recent Scratch paper or the work by Stefan Hanenberg.
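The random assignment step described above can be sketched in a few lines of Python. This is only a minimal illustration, not the setup of any actual study: the function name `assign_languages`, the participant labels and the optional `seed` parameter are all made up for the example.

```python
import random


def assign_languages(participants, languages=("C#", "F#"), seed=None):
    """Randomly assign each participant one language, keeping groups balanced.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)      # a seed makes the assignment reproducible
    shuffled = list(participants)  # copy, so the caller's list is not modified
    rng.shuffle(shuffled)
    # Deal the shuffled participants out round-robin, so group sizes
    # differ by at most one.
    return {
        name: languages[i % len(languages)]
        for i, name in enumerate(shuffled)
    }


if __name__ == "__main__":
    # Participant labels are placeholders.
    groups = assign_languages(["dev1", "dev2", "dev3", "dev4", "dev5", "dev6"])
    for name, language in sorted(groups.items()):
        print(name, "->", language)
```

Shuffling first and then dealing round-robin keeps the two groups the same size, which a coin flip per developer would not guarantee.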
What are we really measuring?
If you want to do a controlled experiment, there are confounding factors: problems that could interfere with the study. Take the choice of the problem in the experiment. It has to be easy enough to be doable in a short timeframe (it is hard to get people to participate in a study that lasts over, let’s say, 2 hours) but difficult enough to be representative of something real. The domain matters too. If my problem concerns, for example, accounting, I need developers with at least a basic understanding of a balance sheet, or it will be unreasonably hard for them to build something, independent of the programming language.
Do you know Haskell and JavaScript?
These are problems that all controlled experiments in CS have, but the “one language versus another” experiment has one special risk: people have to know both programming languages. Say you want to measure JavaScript versus Haskell.
It is crazy to do such an experiment with people new to either of them. So you need people who know both JavaScript and Haskell in order to assign them truly at random: people who know Haskell could very well be different from people who don’t, precisely because they chose to learn Haskell. Getting a group of significant size that knows both languages you want to measure is not an easy task.
Do you know vim and emacs?
One more thing: languages live in IDEs, so we are not only measuring what you know about a language, but also what you know about an IDE. In our randomized experiment, are we letting people use their own IDE? In that case, we are again introducing bias, as the people using vim might be different because they chose vim over emacs. If we assign the IDEs, we need people who know both vim and emacs, and yeah, that might be tricky 🙂
It is better to have one dev in the field than 10 in the lab
Frequent readers of this blog know that in the Netherlands when you defend your PhD dissertation, you also defend 10 statements. One of mine was “For user studies in software engineering holds: it is better to have one user in the field, than ten in the lab.”
I still believe this. While it is great to do controlled experiments if you do not have the confounding factors, doing a real controlled experiment on Haskell vs JavaScript is too impractical for the above reasons, and you will likely measure other things. So, what if we could study real people in their normal habitat?
A stroke of luck
Well, sometimes, life throws you nice things. When I was visiting SANER, I ran into Sebastian Proksch, who wrote a plugin for ReSharper that measures what people do in Visual Studio. He wrote a nice paper about it too.
This is my shot, I thought! I can use that plugin to measure how people use F# and C#. A lot of the confounding factors are gone, because both languages have the same IDE and if they are used within one project, we are not comparing two programs against each other but one project with two languages.
Are you curious too? Register for this study, download Kave and help us figure this out!
I installed the plug-in and used it for a while, but since it needs ReSharper, which I really have no use for, and my trial expired, I stopped using it. I did send a few reports before that though.
There already is a free VS plugin that measures various metrics, called WakaTime: https://visualstudiogallery.msdn.microsoft.com/ca0ea1f3-e824-4586-a73e-c8e4a65323d8?SRC=VSIDE Maybe they would be interested in collaborating on a research project like yours.
Thanks for contributing!! All data we get is welcome!
We reached out to WakaTime and asked them to share data, but they did not get back to us unfortunately.
Interesting points about trying to find scientific backing for holy wars 🙂
Like you mentioned, it doesn’t make sense to evaluate a language in isolation; you need to evaluate an ecosystem. That’s not only language features and IDEs, but more importantly APIs and libraries, their communities and how it all fits together with other existing languages and technologies in a bigger picture. In addition, don’t forget the corporate interests of the big software vendors.