Challenge #1: Scale
In this category you can demonstrate the usefulness of your mining
tools. The main task will be to find interesting insights by analyzing
the software repositories of Eclipse and Firefox. Both systems are
large in size, several years mature, and provide lots of input for mining tools.
Since the submissions will be evaluated together with experts and
developers of Eclipse and Firefox, this will be a unique
opportunity to get candid feedback on your tools.
Participation is straightforward:
- Select your mining area (one of bug analysis, change analysis, architecture and design, process analysis, team structure).
- Get project data of Eclipse and/or Firefox.
- Formulate your mining questions.
- Use your mining tool(s) to answer them.
- Write up and submit your challenge report.
The challenge report should describe the results of your work
and cover the following aspects: questions addressed, input data,
approach and tools used, derived results and interpretation of them,
and conclusions. Keep in mind that the report will be evaluated by
both developers and researchers. Reports must be at most 4 pages long
and in ICSE format.
Submission is via EasyChair (http://www.easychair.org/MSRChallenge2007/).
Each report will undergo a thorough review and accepted challenge
reports will be published as part of the MSR 2007 proceedings. Authors
of selected papers will be invited to give a presentation at the MSR
workshop in the MSR Challenge track.
Challenge #2: Predict
This year, the MSR Mining Challenge will have a special task:
Predict for Eclipse the number of bugs/changes that will happen
between February 1 and April 30, 2007 (both days included).
Participation is as follows:
- Pick a team name, e.g., SCHNITZL.
- Come up with predictions for changes and/or bugs based on some criteria or prediction model. A very simple model is for instance the number of past changes/bugs.
- Annotate the corresponding files with your predictions. Here is a sample annotation: plugins-annotated-schnitzl.txt
- Write a paragraph (max 200 words) that describes how you computed your predictions.
- Submit everything before February 7 (Apia time; extended from January 31) by email to email@example.com.
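The "very simple model" mentioned in the steps above can be sketched as follows. This is only an illustration under stated assumptions: the function name, the input shape (pairs of key and date), and the choice of the same months one year earlier as the reference window are all hypothetical, and the sample history is made up rather than taken from Eclipse.

```python
from collections import Counter
from datetime import date


def baseline_predictions(events, window_start, window_end):
    """Predict future counts as the counts observed in a past window.

    `events` is an iterable of (key, day) pairs, where `key` names a
    component or plug-in and `day` is a datetime.date. Both bounds are
    inclusive, mirroring the "both days included" rule of the challenge.
    """
    counts = Counter()
    for key, day in events:
        if window_start <= day <= window_end:
            counts[key] += 1
    return dict(counts)


# Hypothetical history of (plug-in, change date) pairs; not real data.
history = [
    ("org.eclipse.jdt", date(2006, 2, 1)),
    ("org.eclipse.jdt", date(2006, 4, 30)),
    ("org.eclipse.pde", date(2006, 3, 15)),
    ("org.eclipse.jdt", date(2006, 5, 1)),  # outside the window, ignored
]

# Assumption: reuse the same three months of the previous year.
predictions = baseline_predictions(history, date(2006, 2, 1), date(2006, 4, 30))
print(predictions)
```

A plug-in or component that never appears in the reference window gets no entry at all, so a real submission would probably want to fill in an explicit prediction of 0 for those.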
Obviously, the team with the best predictions will win. However, to
increase the competition, we will organize a set of "benchmark" predictions.
The predictions for bugs should be on the component level. A
component is specified directly in bug reports. For instance, bug report
42233 was reported for the component "UI" of the product
"JDT". For the challenge, we will consider the core products of
Eclipse: Equinox, JDT, PDE, Platform. A complete list of relevant
products and components is in the file components.txt. Note that we will not
remove duplicates from the final counts.
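As a rough sketch of component-level counting, the snippet below tallies bug reports per (product, component) pair and deliberately keeps duplicates. The record layout (dicts with "product" and "component" keys) is an assumption, the sample reports other than the product and component of bug 42233 are invented, and filtering by product alone is a simplification: the authoritative list is the one in components.txt.

```python
from collections import Counter

# Core products named in the challenge description.
CORE_PRODUCTS = {"Equinox", "JDT", "PDE", "Platform"}


def count_bugs_by_component(reports):
    """Count bug reports per (product, component) without removing duplicates.

    `reports` is an iterable of dicts with hypothetical keys "product"
    and "component". Reports for non-core products are skipped.
    """
    counts = Counter()
    for report in reports:
        if report["product"] in CORE_PRODUCTS:
            counts[(report["product"], report["component"])] += 1
    return counts


# Made-up sample data, apart from the product/component of bug 42233.
reports = [
    {"id": 42233, "product": "JDT", "component": "UI"},
    {"id": 2, "product": "JDT", "component": "UI", "resolution": "DUPLICATE"},
    {"id": 3, "product": "Platform", "component": "SWT"},
    {"id": 4, "product": "CDT", "component": "Core"},  # not a core product
]

counts = count_bugs_by_component(reports)
for (product, component), n in sorted(counts.items()):
    print(product, component, n)
```

The report marked as a duplicate still counts toward JDT/UI, in line with the rule that duplicates are not removed from the final counts.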
The predictions for changes should be on the plug-in level. We will
count the number of 'Exp' changes (deleting a file is not a change) in
"org.eclipse.*" modules. We will consider all branches. To
keep things simple, we map modules to plug-ins by taking the first
three parts of their namespace. For example,
"org.eclipse.jdt.ui" maps to
"org.eclipse.jdt". You can find more examples in the file mapping.txt.
More precisely, to determine the actual number of changes we will use the following script:
grep -r -E "date\W+2007\.0.+\W+author.+state Exp;" * | grep "^org.eclipse" | sed 's/org.eclipse.//' | sed 's/\..*//' | sed 's/\/.*//' | sed 's/-feature//' | uniq -c
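For readers who prefer to experiment outside the shell, here is a hedged Python sketch of the mapping rule and of the pipeline above. It is not the organizers' code. It assumes the input is the text that `grep -r` prints (lines of the form `path:matching line`), and the sample file names and authors are invented. Note that the script reports only the third namespace segment (for example `jdt` for the plug-in "org.eclipse.jdt"), and that `uniq -c` merges adjacent lines only, which `itertools.groupby` imitates.

```python
import re
from itertools import groupby

# Same pattern as the first grep: an RCS revision line from 2007 with state Exp.
REVISION = re.compile(r"date\W+2007\.0.+\W+author.+state Exp;")


def module_to_plugin(module):
    """Keep the first three dot-separated parts of a module name."""
    return ".".join(module.split(".")[:3])


def script_key(grep_line):
    """Apply the four sed substitutions of the script, in order."""
    key = re.sub(r"org.eclipse.", "", grep_line, count=1)
    key = re.sub(r"\..*", "", key, count=1)
    key = re.sub(r"/.*", "", key, count=1)
    return re.sub(r"-feature", "", key, count=1)


def count_changes(grep_lines):
    """Return (count, key) pairs like `uniq -c`, merging adjacent keys only."""
    keys = [
        script_key(line)
        for line in grep_lines
        if REVISION.search(line) and re.match(r"org.eclipse", line)
    ]
    return [(len(list(group)), key) for key, group in groupby(keys)]


# Invented sample of grep output; tabs separate the RCS fields.
sample = [
    "org.eclipse.jdt.ui/Foo.java,v:date\t2007.02.03.10.00.00;\tauthor alice;\tstate Exp;",
    "org.eclipse.jdt-feature/feature.xml,v:date\t2007.03.01.09.00.00;\tauthor bob;\tstate Exp;",
    "org.eclipse.pde.core/Bar.java,v:date\t2007.04.30.12.00.00;\tauthor carol;\tstate dead;",
    "org.eclipse.pde.core/Bar.java,v:date\t2007.04.02.12.00.00;\tauthor carol;\tstate Exp;",
    "com.example.tools/Baz.java,v:date\t2007.02.02.08.00.00;\tauthor dave;\tstate Exp;",
]

print(module_to_plugin("org.eclipse.jdt.ui"))  # org.eclipse.jdt
print(count_changes(sample))
```

In the sample, the revision with state `dead` (a deleted file) and the module outside `org.eclipse` are both skipped, and the `-feature` module is folded into the same key as the regular `jdt` module.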