
The “why” of open source software research

October 19, 2008

I’ve created a domain map in which I’ve tried to record where I think the major research areas are in open source software (OSS) research.


By: Chris Malek


Category: Articles


I said earlier that I’m trying to get a handle on the topic of open source software (OSS) and what research people have done on it so far. I want to do that so I can see how what I want to do fits in with what has already been done, and also see whether open source research is a new field, is on the rise, is on the decline, or is done.

I started from the 337 papers published between 1998 and 2008 that Web of Science returned when I searched for “open source” in a relevant subset of journals, and performed a card sort on them to construct a domain map of the research that has been done. I’ve depicted my results above.

I found that I could group papers into five major areas:

  • about open source: these are papers which look at some specific aspect of open source software development or open source software development communities.  The kinds of questions that papers in this area try to answer include: Why would someone choose to participate in an open source software development project?  How do open source projects organize themselves?  How does governance arise in open source projects?   How do open source projects achieve and maintain high quality code?
  • OSS and business: these papers look at how open source communities and the open source development model compare with, contrast with, or fit in with business. Questions asked by papers in this area include: What are the economics of open source? How can a company incorporate open source development methodology into its business model? How does open source software compare with commercially produced software?
  • using OSS data to do something else: papers in this area use data from open source projects (code, revision history, mailing list data, etc.) in testing a specific tool, method or idea. They typically use OSS data because it is readily available when such information from commercial software teams is not. The kinds of things that papers in this area deal with are: software metrics (maintainability, bugginess); replaying development history; helping developers write better code; and looking at characteristics of software development in general (not specifically OSS development).
  • open source in a special context: papers in this area look at open source software usage in particular contexts: in libraries, in government agencies, in medicine, or for particular uses, such as web portals.
  • particular open source packages: this area is home to papers which describe software that the authors have written and which they have released as open source. Titles from this area include: “How Octave can replace Matlab in chemometrics,” “MUDABlue: An automatic categorization system for Open Source repositories,” “R: An overview and some current directions.”

Only the first three are relevant to my work: “about open source”, “OSS and business” and “using OSS data to do something else.” The first and third matter most, since I’ll be trying to make a tool which works within an open source context, and since I’ll be using OSS data to test the tool.

I also wanted to get an idea of how much work had been done in each area. Below is the same map as above, but with each box colored according to how many papers I found in that area.

It looks like people have done the most work in examining open source development and open source development communities. As you would expect, they’ve spent a lot of time looking at how the communities work (22 papers) and how the development process works (31). Close behind are overviews of what open source is (18) and papers about the individuals who work in projects (14).

In “using OSS data to do something else”, the papers are fairly evenly distributed among the four areas, with a slight lean toward “tools which improve software development” (19) and “software development in general” (16).
