Identifying contributors in FLOSS projects

As part of my graduate work, I need to analyze FLOSS repositories to identify the number of external contributors. By an external contributor I mean any individual who contributed a patch without having commit access to the source code repository, and who is also a first-time contributor.

To identify contributors in general, I usually parse the commit logs for any attribution to individuals who are not committers. Take, for example, the following log message:

Fixed #9859 – Added another missing force_unicode needed in admin when running on Python 2.3. Many thanks for report & patch to nfg. - (Django Revision 9656)

I wrote some regex-based scripts to identify names or pseudonyms, such as “nfg” in the previous example.
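A minimal sketch of what such a script might look like, in Python 3 with only the standard library. The two patterns and the function name `find_attributions` are illustrative assumptions, not the original scripts; real commit logs phrase credit in many more ways and would need project-specific tuning.

```python
import re

# Hypothetical patterns for two common ways of crediting a contributor.
# A name may contain word characters, "@", "+", "-" and inner dots,
# but not a trailing dot, so the sentence-ending period is left out.
NAME = r"([\w@+-]+(?:\.[\w@+-]+)*)"
ATTRIBUTION_PATTERNS = [
    # "thanks for report & patch to nfg", "thanks to nfg"
    re.compile(r"thanks\b[^.]*?\bto\s+" + NAME, re.IGNORECASE),
    # "patch from nfg", "patch by nfg"
    re.compile(r"patch (?:from|by)\s+" + NAME, re.IGNORECASE),
]


def find_attributions(message):
    """Return the credited names found in one log message, without duplicates."""
    names = []
    for pattern in ATTRIBUTION_PATTERNS:
        for match in pattern.finditer(message):
            name = match.group(1)
            if name not in names:
                names.append(name)
    return names


# Tail of the log message quoted above.
message = "Many thanks for report & patch to nfg. - (Django Revision 9656)"
print(find_attributions(message))  # ['nfg']
```

Names found this way still have to be checked against the list of committers, since a log message can also thank someone who already has commit access.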

Things, however, are not always clear-cut for FLOSS projects, as not all projects attribute contributors in the log message. For example, I noticed that the MapServer project, which seems to be actively developed, had no attributions in its log messages. After I inquired on IRC, it turned out that the attributions are available in the project tracker (thanks danmo!). What the commit log message does include is a reference to the ticket number.

So I rolled up my sleeves and wrote a quick parser to identify all ticket numbers in the log messages. I then used httplib2 and BeautifulSoup to connect to the project tracker and parse the patch name and contributor for each ticket. The following is the code I used to perform that task:

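import httplib2
from BeautifulSoup import BeautifulSoup as BS  # BeautifulSoup 3


def get_mapserver_author(ticket):
    """Return a list of (contributor, patch name) tuples for one tracker ticket."""
    url = 'http://trac.osgeo.org/mapserver/ticket/%s' % ticket

    # Cache responses on disk so repeated runs do not re-fetch the same ticket.
    h = httplib2.Http(".cache")
    resp, content = h.request(url, "GET")

    bs = BS(content)
    div = bs.find('div', id='attachments')
    patches = []
    if div is None:
        # The ticket page has no attachments section, so there is nothing to parse.
        return patches
    for y in div.findAll('dt'):
        try:
            patches.append((y.em.string, y.a.string))
        except AttributeError:
            # This <dt> entry lacks the expected <em> or <a> child.
            print('Problem parsing ticket %s' % ticket)
    return patches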
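
The function above takes one ticket number at a time; the parser that collects those numbers from the log messages is not shown. A minimal sketch of that step, in Python 3 with only the standard library, assuming tickets are referenced Trac-style as `#1234`. The name `find_ticket_numbers` and the sample messages (including their ticket numbers) are made up for illustration.

```python
import re

# Assumption: a ticket reference is a "#" followed by digits, as in Trac.
TICKET_RE = re.compile(r"#(\d+)\b")


def find_ticket_numbers(log_messages):
    """Return ticket numbers in first-seen order, without duplicates."""
    seen = []
    for message in log_messages:
        for number in TICKET_RE.findall(message):
            if number not in seen:
                seen.append(number)
    return seen


# Hypothetical log messages, not taken from the MapServer repository.
logs = [
    "Fixed crash in WMS layer (#2871)",
    "Follow-up for #2871 and #2903",
    "Updated the changelog",
]
print(find_ticket_numbers(logs))  # ['2871', '2903']
```

Each number returned here could then be passed to `get_mapserver_author` to look up the attachments on that ticket. Deduplicating first keeps the number of tracker requests down when several commits reference the same ticket.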

I managed to identify 68 unique names for the period between Jan. 1st, 2007 and June 1st, 2009. These are, of course, the names of contributors who are not necessarily first-time contributors. Further analysis is needed before one can determine which of these contributors are “external.”

It goes without saying that the number of contributors is just an estimate. There might be other contributions made through the mailing list (thanks FrankW for pointing this out), not to mention the likelihood that an individual might have two different pseudonyms. As jmckenna (IRC: #mapserver) simply put it: “It’s difficult to identify FLOSS contributors.”

Just in case you are wondering, here are the names:

warmerdam
aalbarello
unicoletti
tamas
jimk
rouault
dmorissette
tomkralidis
aboudreault
brage
armin
bartvde
diletant
pramsey
nmandery
nharding
eshabtai@gmail.com
assefa
bartvde@osgis.nl
dfuhry
hschoenhammer
project10
hopfgartner
ujunge@pmcentral.com
sdlime
richf
dionw
nnikolov
abajolet
laurent
tbonfort
BobBruce
nsavard
woodbri
flavio
scott.e@goisc.com
dstrevinas
ivanopicco
jlacroix
cplist
kfaschoway
szigeti
zjames
elzouavo
mcoladas@telefonica.net
nfarrell@bom.gov.au
jparapar
vulukut@tescilturk.com
novorado
russellmcormond
msmitherdc
crschmidt
hjaekel
peter.hopfgartner@r3-gis.com
hulst
mturk@apache.org
thomas.bonfort@gmail.com
ivano.picco@aqupi.tk
jmckenna
drewsimpson
bartw
djay
sholl
dirk@advtechme.com
cph
jratike80
hobu
hpbrantley
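
Some entries in the list above may belong to the same person under two spellings, for example a tracker login and an e-mail address. A rough way to flag such candidates, sketched in Python 3 with only the standard library: reduce each name to a comparable key and group names that share one. The functions `normalize` and `group_aliases` are hypothetical helpers, and a shared key is only a hint that needs manual confirmation.

```python
import re
from collections import defaultdict


def normalize(name):
    """Reduce a tracker name or e-mail address to a comparable key."""
    local_part = name.split("@", 1)[0]  # drop the e-mail domain, if any
    # Lowercase and drop everything except letters and digits (dots, dashes).
    return re.sub(r"[^a-z0-9]", "", local_part.lower())


def group_aliases(names):
    """Return {key: [names]} for every key shared by more than one name."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize(name)].append(name)
    return {key: aliases for key, aliases in groups.items() if len(aliases) > 1}


# A few names taken from the list above.
sample = [
    "bartvde", "bartvde@osgis.nl",
    "ivanopicco", "ivano.picco@aqupi.tk",
    "hopfgartner", "peter.hopfgartner@r3-gis.com",
    "warmerdam",
]
print(group_aliases(sample))
```

On this sample the heuristic pairs up the two `bartvde` entries and the two `ivanopicco` entries, but it does not connect `hopfgartner` with `peter.hopfgartner@r3-gis.com`, because the keys differ. Cases like that still have to be reviewed by hand.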