Django … an outlier

While analyzing the development activity and code metrics for over 240 of the most actively developed FLOSS projects, guess which project popped out?

Yes, Django! It's an outlier in terms of its activity. According to the Cook's distance diagnostic, it influences the results of my statistical analysis more than any other project. Let me bring your attention to the lonely dot close to the value 1 at the top right corner of the graph. I missed it at first, but noticed it when I looked at the sorted values.

This tells us that, at least within my sample of actively developed FLOSS projects written in Python, C and C++, Django (including its community) is quite unique.

I leave you with the graph of sorted Cook’s D values from my analysis:

[Figure: sorted Cook's D values]
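For readers who have not met the diagnostic before, here is a minimal sketch of how Cook's distance can be computed and sorted to spot an influential point. It assumes a simple one-predictor least-squares fit, which is much simpler than the multi-variable analysis described in this post. The function name `cooks_distance` and the data are made up for illustration and are not taken from the actual study.

```python
def cooks_distance(x, y):
    """Cook's distance of each point in a least-squares fit y ~ a + b*x."""
    n = len(x)
    p = 2  # number of fitted parameters: intercept and slope
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)

    # Ordinary least-squares estimates for the line.
    slope = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sxx
    intercept = y_mean - slope * x_mean

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    mse = sum(e ** 2 for e in residuals) / (n - p)

    # Leverage: how far each x value sits from the bulk of the data.
    leverage = [1 / n + (xi - x_mean) ** 2 / sxx for xi in x]

    # D_i = e_i^2 / (p * mse) * h_i / (1 - h_i)^2
    return [
        e ** 2 / (p * mse) * h / (1 - h) ** 2
        for e, h in zip(residuals, leverage)
    ]


# Made-up data: nine points on a rough line plus one that breaks the pattern.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2, 13.9, 16.1, 18.0, 25.0]

distances = cooks_distance(x, y)

# Sorting the values is what makes the lonely dot easy to see.
ranked = sorted(enumerate(distances), key=lambda pair: pair[1], reverse=True)
most_influential_index = ranked[0][0]
print(most_influential_index, round(ranked[0][1], 2))
```

A point gets a large Cook's D when it combines a large residual with high leverage, so removing it would noticeably shift the fitted line. In this toy data that is the last point.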


Comments

You teaser — you’ve told us that Django is an outlier, but you haven’t told us why — or whether that is a good or bad thing from the point of view of your analysis! :-)

I’m intrigued to know what sort of metrics you are including in your statistical analysis. According to your research, what is it that we are doing right (or wrong, or differently)?

Short answer: I don't know.

I am as intrigued as you are. To be honest I was expecting this to happen, given that Django showed unusually high values for many of the variables I am observing (most notable was the number of external contributors). The “why” is something a research career is built on and is certainly worth exploring.

What I was interested in with my analysis was understanding the characteristics of projects that attract more new contributors per month than other projects. Some of the variables I included in my analysis:

lines of code, lines of comments, number of self-reported users from ohloh.net, modularity, number of modules, number of files, age, number of committers, number of contributors per quarter, number of new contributors per quarter, license, and programming language.

So far, the results point to a positive correlation between the number of new contributors per quarter and each of good design, good documentation (with diminishing returns), and size (with diminishing returns). I might have more results soon about the factors that relate to the development performance of a project.

As to whether this is a good or a bad thing, again, I don't know. But Django had a significantly higher number of external contributors and a significantly higher ratio of code comments to code. It also showed an above-average modularity reading that is not declining over time. So obviously the community is doing something right, and something different from other FLOSS projects.
