IEEE International Conference on Data Mining

ICDM 2008 Data Mining Contest:

Radioxenon monitoring for verification of the Comprehensive Nuclear-Test-Ban Treaty

Instructions for Participants

December 17th 2008 - And the winners are...

Winner of the Contest Crown

The team of
Wei Fan (1), ErHeng Zhong (2), Sihong Xie (2), Yuzhao Huang (2), Kun Zhang (3), Jing Peng (4), and Jiangtao Ren (1)
1) IBM T. J. Watson Research Center
2) Sun Yat-Sen University
3) Xavier University of Louisiana
4) Montclair State University

Winner of the Most Muscular Prize

Zhongfeng Zhang
Institute of Automation, Chinese Academy of Sciences

The Kangaroo Prize was postponed/cancelled due to an anomaly in the data set.

The ICDM Data Mining Contest 2008 is now officially over. However, we do encourage you to participate in the tasks, or otherwise explore and mine the data. If you do plan on publishing any results, please let us know.
 

1. General Description of the Problem:

 

Compliance verification of the Comprehensive Nuclear-Test-Ban Treaty (CTBT) will employ the remote detection and measurement of radioactive forms of a noble gas, xenon, called radioxenon that is potentially emitted from the site of a nuclear explosion.  Specifically, four radioxenon isotopes, Xe-131m, Xe-133m, Xe-133, and Xe-135, are measured in a procedure called radionuclide monitoring. Different relative combinations of these isotopes correspond to signatures that can be associated with distinct sources such as nuclear power plants, medical isotope production facilities, or various types of weapons.

 

In the first few weeks after an explosion, the four isotopes are expected to be released in “fingerprint” relative concentrations quite distinct from those of background sources of radioxenon. The problem of attributing a specific observation of airborne concentrations of radioxenon to an explosion is twofold.  Firstly, since the CTBT stations are not located at the source of the explosion, the radioxenon is detected at a location which can be well over a thousand kilometres away. This atmospheric transport process can take weeks, thus degrading the distinctness of the signature through radioactive decay and lessening the likelihood of detecting one or more of the radioxenon isotopes at all.  Secondly, one can never observe radioxenon emitted purely from an explosion source, only admixtures of this gas with the radioxenon released from all other background sources.

 

The problem set to the contestants is to devise the means to distinguish those radioxenon measurements that are due purely to normal environmental emissions or background (B) from those measurements that contain the signature of an explosion combined with background (B+E).

 

 

 

2.1  Contestant’s Package

 

            In addition to these instructions, the contestant’s package includes:

 

  1. A training data set file including class labels B and B+E
  2. A blind testing data set without the labels B and B+E
  3. A program to assist self-evaluation and evaluation of the contestant’s methods by calculating figures of merit, including the Area Under the Curve (AUC) of Receiver Operating Characteristic (ROC) curves. A description of how to use the program is in Appendix 1.
  4. A reporting template to be used for each submission by the contestants to ensure intercomparability among the contestants and to facilitate panel analysis of the submissions. Submissions not employing this template will not be considered. A description of its use is in Appendix 2.

 


2.2 Description of the data sets provided:

 

Two files containing the contest data are provided.  One file contains the training data, in which the class of each datum, B or B+E, is labelled. A second file contains a test data set containing instances of both classes but without B or B+E labels.  In both files, an alphanumeric index is provided for each case. The numeric portion traces the background measurement, or background measurement combined with a synthesized explosion observation, used to create the datum. The alpha portion of the code refers to one of 5 real-world measurement sites that have been collecting measurements of radioxenon concentrations daily for an extended period of time, and to the qualitative degree of complexity of the background radioxenon observed at these sites. The alpha codes qualitatively rank the sites’ radioxenon background in order of increasing complexity from V to W to X to Y, with location Z being particularly complex relative to the 4 other sites.

 

The 6 contest data column headers are explained as follows:

 

a)      The first column contains an index comprised of a station code and the scenario numeric index. (see above).

b)      A second column identifying the type of datum - background (B) or a combined background and simulated explosion signal (B+E). This column is blank in the test data set. 

c)      A final 4 columns, one for each of the activity concentrations associated with the index, namely, Xe-133, Xe-133m, Xe-135, Xe-131m.
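The six-column layout described above could be read along the following lines. This is only a sketch under stated assumptions: the instructions do not specify the file delimiter or the exact shape of the index, so the comma delimiter, the "letters then digits" index form (such as "V12"), the sample values, and the names `Datum` and `parse_row` are all hypothetical.

```python
import csv
import io
import re
from typing import NamedTuple, Optional

# Column order as listed in item c) above.
ISOTOPES = ("Xe-133", "Xe-133m", "Xe-135", "Xe-131m")

class Datum(NamedTuple):
    station: str            # alpha part of the index (assumed to come first)
    scenario: int           # numeric part of the index
    label: Optional[str]    # "B", "B+E", or None when blank (test set)
    activity: tuple         # four concentrations, in ISOTOPES order

def parse_row(row):
    """Split one six-column record into its parts."""
    index, label, *values = [cell.strip() for cell in row]
    match = re.fullmatch(r"([A-Za-z]+)(\d+)", index)
    if match is None or len(values) != 4:
        raise ValueError(f"unexpected record: {row!r}")
    return Datum(match.group(1), int(match.group(2)), label or None,
                 tuple(float(v) for v in values))

# Made-up sample records; the real files may use another delimiter.
sample = io.StringIO("V12,B,0.51,0.00,0.02,0.10\nZ7,,1.90,0.03,0.00,0.25\n")
data = [parse_row(r) for r in csv.reader(sample)]
```

A blank second column, as in the test data set, is mapped to `None` so that labelled and unlabelled files can share one reader.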

 

 

 

 

 

3. Contest Reporting:

 

A reporting template is provided; it and its use are described in Appendix 2.  For each final result of a task and task test, it is essential that contestants provide a completed electronic template to allow the contest evaluators to consider their work fully.

 

3.1  Point of Contact for Completed Templates:

 

Please send all completed templates (maximum 5 MB email) to jing_yi@hc-sc.gc.ca (underscore “_” between Jing and Yi).

 

 

 

4. Contest Tasks:

 

The primary goal of this contest is to produce methods that are broadly applicable over different station background measurement distributions and explosion source hypotheses.  The best methods will also have a very efficient learning curve in terms of the amount of data required to successfully tune the classifier.  Recognition will also be given to methods more proficient in properly categorizing data arising from specific classes of explosion release hypotheses or station background types, because these methods add a forensic or diagnostic dimension to the classifier that may not be evident in the overall best classifier.

 

Software will be provided (see Appendix 1) to the contestants to calculate their relative degree of success at each task in terms of a number of characteristics (for example, numbers of false positives) and figures of merit (for example, % accuracy or Area Under the Curve for the Receiver Operating Characteristic curve).  Area Under the Curve (AUC) will be used as the primary figure of merit by the evaluators to judge success, in conjunction with consideration of ease of tuning the methods and balance of performance over a range of sub-cases of explosion radioxenon emission scenarios.
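For readers who want a quick check of AUC outside the provided tool, the sketch below uses the standard pairwise definition: the fraction of (B+E, B) pairs in which the B+E case receives the higher score, with ties counted as one half. It is not taken from the contest software, which may compute the value differently; the function name `auc` and the sample scores are illustrative only.

```python
def auc(labels, scores):
    """AUC as the probability that a random B+E case outscores a random B case."""
    pos = [s for l, s in zip(labels, scores) if l == "B+E"]
    neg = [s for l, s in zip(labels, scores) if l == "B"]
    if not pos or not neg:
        raise ValueError("need at least one case of each class")
    # Count a win as 1 and a tie as 0.5 over every (positive, negative) pair.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

labels = ["B", "B", "B+E", "B+E"]
scores = [0.1, 0.4, 0.35, 0.8]   # made-up probabilities of B+E
print(auc(labels, scores))       # 0.75
```

The pairwise loop is quadratic in the number of cases, which is fine for spot checks but slow for very large files.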

 

Contestants may opt to do one, some or all of the tasks but their full participation is encouraged.

 

 

The tasks are as follows:

 

Task 1: The first task is to classify, as accurately as possible, the results as B or B+E over the entire set of stations (V, W, X, Y and Z) with one classifier. Contestants may combine data as they see fit.  They may separately tune classifier parameters for each station but they may not have separate classifier parameter types for each station nor separate classifiers.  Contestants can report on more than one classifier for this task if they so choose.

 

Task 2: The second task is to classify, as accurately as possible, the results as B or B+E with an algorithm optimized for each given station (V, W, X, Y and Z).

 

Task 3: The third task is to apply the classifiers developed in Tasks 1 and 2 to help the panel of evaluators assess the contestant’s methods. The contestants will apply their methods to a second unlabelled data set. Furthermore, they will reprocess the first data set under the prescribed conditions described below to allow consideration of balance of performance.

 

4.1 Task 3, Detailed Tests:

 

Contestants are requested to run the classifiers developed in Tasks 1 and 2 in a prescribed manner to help the panel evaluators assess their submissions in detail. For each test trial, the attached template must be completed as a full report. Instructions on use of the template are included in Appendix 2.

 

 

4.1.1 Test 1

 

The contestants will classify a second unlabelled data set comprising data similar to those used for the development of their classifiers.  This test checks the performance of the methods for similar concentrations of radioxenon sampled from the same distribution as the training set but not necessarily with the same proportions of B and B+E cases.  A report template must be provided for each classification run, including the methods provided for individual station types.

 

4.1.2 Test 2

 

The labelled training data sets employed by the contestants may be the complete labelled data set and station subsets provided by the contest sponsors, or training sets created by the contestants themselves using combinations and sub-sampling. For the actual data used to develop their classifiers, the contestants are requested to calculate the AUC for their classifiers when tuned with 20% of their employed training data set, then similarly for 40%, 60%, 80%, and 100%.  This test examines the efficiency in use of data required to tune the classifiers.  Of course, recognition will be given to contestants who employ relatively small subsets of the training data provided in the first instance to develop their classifiers. A report template must be provided for each classification run, including the methods provided for individual station types.
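One way to draw the 20% to 100% training subsets is sketched below. It assumes that each smaller subset should be contained in the next larger one and that a plain random shuffle is acceptable; the instructions require neither, and a contestant may prefer to stratify by class or by station. The function name `nested_subsets` is hypothetical.

```python
import random

def nested_subsets(cases, fractions=(0.2, 0.4, 0.6, 0.8, 1.0), seed=0):
    """Return {fraction: subset}, each subset a prefix of one fixed shuffle."""
    shuffled = list(cases)
    random.Random(seed).shuffle(shuffled)   # fixed seed keeps runs repeatable
    return {f: shuffled[: round(f * len(shuffled))] for f in fractions}

subsets = nested_subsets(range(50))         # 50 stand-in training cases
for fraction, subset in subsets.items():
    # Tune the classifier on `subset` here, then record its AUC.
    print(f"{fraction:.0%}: {len(subset)} training cases")
```

Using prefixes of a single shuffle means any change in AUC between fractions reflects added data rather than a different random draw.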

 

 

4.1.3 Test 3

 

For the classifier developed under Task 1 only, the contestants are asked to provide results for a factorial analysis of the sensitivity of the effectiveness of their method to small changes in their parameter values.  Hence, the contestants are requested to provide results for 2^n trials where “n” is the number of numerical parameters used by their methods.  Parameter jumps on the order of 10% are requested where, as appropriate and as judged by the contestant, it is:

 

 

For example, if the classifier has 3 numerical parameters, 2^3 = 8 trials are needed for all possible combinations of high and low parameters. The trials to be conducted are, therefore:

 

 

             Parameter 1    Parameter 2    Parameter 3

Trial 1      +10%           +10%           +10%
Trial 2      +10%           +10%           -10%
Trial 3      +10%           -10%           +10%
Trial 4      -10%           +10%           +10%
Trial 5      -10%           -10%           +10%
Trial 6      -10%           +10%           -10%
Trial 7      +10%           -10%           -10%
Trial 8      -10%           -10%           -10%

 

For Task 1 classifiers employing station-specific tuning, it is requested that this analysis be carried out for at least two station types.
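The 2^n high/low combinations described in this test can be enumerated mechanically. The sketch below assumes multiplicative jumps of exactly 10% around each tuned value; the parameter names and baseline values are hypothetical, and integer-valued parameters would need rounding that is not shown here.

```python
from itertools import product

def factorial_trials(params, jump=0.10):
    """Yield one dict of perturbed parameter values per trial (2**n in total)."""
    names = list(params)
    for signs in product((+1, -1), repeat=len(names)):
        yield {name: params[name] * (1 + sign * jump)
               for name, sign in zip(names, signs)}

baseline = {"p1": 2.0, "p2": 50.0, "p3": 0.5}   # hypothetical tuned values
trials = list(factorial_trials(baseline))
print(len(trials))   # 8
```

`itertools.product` emits the combinations in a different order from the trial numbering used in this test, but the set of eight combinations is the same.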

 

 

 

5. Contest Papers

 

All contestants are asked to write a short paper (approximately 4 pages) describing their method and results. The paper can be as short as 2 pages or can be up to a maximum of 6 pages. Please email your paper to Trevor_Stocki@hc-sc.gc.ca by Nov 20th.

 

Please use the same format as the conference. Please use the "camera ready format". See here for more details.
Do not use PDF eXpress for this. Please send the LaTeX, MS-Word document or PDF file.

Please also note that this is not a double-blind submission, so please put the authors' names and institutions on the paper. Also include the corresponding author's contact information.

 

All the papers will be collected and distributed at the conference as well as made available through our Website.

 

 

 

6. Contest Prizes

 

6.1 Contest Crown:

 

            For the classifier judged to have the best overall performance.

 

6.2 Most Muscular:

 

            For classifiers with the highest AUC scores for the full labelled data set and by station.

 

6.3 Kangaroo Prizes:

 

            For classifiers that are talented in unusual or unexpected respects.


 

 

 

 

Appendix 1:  Installing and using the provided evaluation software. 

 

Installation

 

This software is known to work with Windows XP; it should work with other versions of Windows, but this has not been tested.  The software is provided in the file named evalualtor.zip. Please note that the ActiveX control must be installed before running Setup.exe, because the tool uses it for charting.  The ActiveX control is included in the package.

 

Installation Instructions: 

 

1. Unzip

            Unzip the compressed package, which contains all necessary executable files and sample data files. If needed, carry out step 2 below before running setup.

 

2. Install the ActiveX control (MSCHRT20.OCX)

 

            a) Click the start button.

            b) Click Run.

            c) The run window will pop up.  Click browse.

            d) Go to the directory in which unzip deposited all the files.

            e) Type the following in the “open” field, thereby running the ActiveX installation:

 

regsvr32 MSCHRT20.OCX

 

3. Setup

            a) Run Setup.exe in the unzipped package to install the tool.

            b) Follow the usual setup menus.

 


Using the Software

 

A few example files are included with this software.  In the following instructions these example files are used to show how to calculate an ROC curve and AUC.  Below is an image of what your window should look like after running the example.

 

 

 

Steps:

 

1.         You will need to generate two files from your classification software: one with the actual values of the training data and one with the predicted values from the model. How to generate these files is explained in the next section below.

2.         Start the program by double clicking on evaluation.application.

3.         Input the filename which contains the actual values of the test data by typing it into the ‘Actual Values field’ or by clicking the ‘...’ button to browse (true.txt, for our example, in the directory).

4.         Input the filename for the predicted results by typing it into the field for predicted results or by clicking the ‘...’ button to browse (predicted1-DTJ48.txt, for our example, in the directory).

5.         Select the predicted results in the list box for evaluation

6.         Click Evaluation

 

The related ROC curves (2dXY) will then be displayed in a window. To work around a known bug in the curve colours, evaluate a single result first and only then make multiple selections. The results are displayed in a text box.  You can cut and paste these results into the template. The results are also written to the related output files, which are explained below.

 

 

Formats for two inputs

 

The predicted results can be in one of two formats, and the delimiter between columns can be a tab, comma or space. The tool can also automatically sort the probabilities for true positive cases.

 

 

#case       TPProbability

xxxx        0.####

xxxx        0.####

...

 

or by leaving the TPProbability column empty

 

#case                   predClass

xxxx                    xx

xxxx                    xx

...

 

 

The file that contains the actual values only needs to be in the second format; in that case, the second column holds the actual class, not the predicted class.
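A prediction file in the first format could be produced as sketched below. Whether the tool expects the `#case TPProbability` header line literally is an assumption drawn from the layout shown above, as are the tab delimiter (one of the three allowed), the four-decimal formatting, the made-up case identifiers, and the helper name `write_predictions`.

```python
import io

def write_predictions(stream, case_ids, probabilities):
    """Write a '#case<TAB>TPProbability' file in the first input format."""
    stream.write("#case\tTPProbability\n")   # header line (assumed literal)
    for case_id, p in zip(case_ids, probabilities):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range for {case_id}: {p}")
        stream.write(f"{case_id}\t{p:.4f}\n")

buffer = io.StringIO()                        # stands in for an open text file
write_predictions(buffer, ["V12", "Z7"], [0.03, 0.9712])
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with `open(path, "w")` in place of the `StringIO` buffer.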

 

 

Output Files

 

ResultsEvals.txt

This file contains the evaluation results according to the result files from all contestants;

 

xxxx_roc_curve_points.txt

This file contains the ROC curve points for each contestant's predicted result, for general-purpose use;

 

For example,

 

x            y           prediction (not used)

 

xxx         xxxx       0.#####

xxx         xxxx       0.#####

 

 

xxxx_roc_curve.gnu contains the definition of the roc curve points, which is used by gnuplot.

 

This software will give you the AUC for a data set. You will need to segment your data sets properly in order to fill in the template with the appropriate values of AUC.  You will also need to make listings of the data for the template as well. This software will not do this.
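Since the tool reports one AUC per input file, per-station values require splitting the case listings first. The sketch below groups rows by station and assumes the station code is the first character of each case index; that position, the sample rows, and the function name `split_by_station` are assumptions, not something the instructions specify.

```python
from collections import defaultdict

def split_by_station(rows):
    """Group (case_id, value) rows by the leading station letter of case_id."""
    groups = defaultdict(list)
    for case_id, value in rows:
        groups[case_id[0]].append((case_id, value))
    return dict(groups)

rows = [("V12", 0.10), ("Z7", 0.90), ("V3", 0.20)]   # made-up predictions
by_station = split_by_station(rows)
print(sorted(by_station))   # ['V', 'Z']
```

Each group can then be written to its own pair of actual and predicted files and evaluated separately.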

 

If you have any questions about this software please contact Trevor Stocki at  Trevor_Stocki@hc-sc.gc.ca .

 

 

 

Appendix 2:  Data Entry Template Instructions ICDM 2008

 

Only solutions entered in the data entry template will be evaluated and considered by the panel evaluators.  Each data classification attempt will require submission of the data entry template.  The template has 4 sections labelled in blue:  Biographical Information, Model Development Data, Analysis Software Results, and Raw Classifier Results.  A brief description of each major section follows:

 

Biographical Information Section

This section is used to report the following information:  team members, contact information, task number, and data set used.  Additionally, there are fields to enter the algorithm name and description.  Information on the algorithm should be of sufficient detail that it is possible for the panel to completely reconstruct the team’s results.

 

 

 


Model Development Data Section

This section contains the indices and associated xenon concentrations that were used in the development of the algorithm.  An example of how a single datum is entered is shown below.

 

 


Analysis Software Results

By using the software tool provided (see Appendix 1), algorithm performance will be measured and assessed.  All performance information from the software tool is entered into this section.

 

 

Raw Classifier Results

The results of your team’s algorithm should be the classification of the data points provided into background and background + explosion classes.  This section should contain the indices and associated xenon concentrations as classified by your algorithm under the appropriate heading.  You will have to insert rows to accommodate all data points.

 

 

Template Instructions:

 

  1. Enter team members, organization and the details of a single point of contact.  The single point of contact should be prepared, if necessary, to answer questions should the evaluation panel require more information. 
  2. Choose from the drop down options the task number, station name (task 2 only), and data set.
  3. Enter the algorithm name and describe the algorithm in a detailed enough manner for the evaluation panel to duplicate your results.
  4. In the Model Development section, enter the indices and associated xenon concentrations for all datum points used to develop the model.
  5. For the Analysis Software Results section:
    1. Choose the percentage of data used to develop the model (Task 3 and training data set ONLY),
    2. Enter the number of parameters in the algorithm, the values, and their type (real or integer), remembering to update the values each time for Task 3, Test 3.
    3. Enter all results from the software analysis tool in the appropriate field.
  6. In the Raw Classifier Results section, enter each classified datum under the appropriate heading.
  7. Save the file with a name following the examples below:

      Organization ACME, with John Smith as the contact person, is participating in ICDM. His team, using the training data set and working on Task 2, station W, would use the following filename:

John_Smith_ACME_Task2_StnW_Training.xls

 

      The same organization, now working on Task 3, Test 3, using the training data set with a 3-parameter model (Parameter 1 +10% (I for increased), Parameter 2 -10% (D for decreased), Parameter 3 normal (N for normal)), would save the template as:

John_Smith_ACME_Task3_IDN_Training.xls

 

Or more explicitly,

Contact Name_Organisation_Task#_Parameter States_Data Set Name.xls

 

 

The same organization, now working on Task 3, Test 2, using the training data set and 40% of the data, would save the template as:

John_Smith_ACME_Task3_Training_40.xls

 

Or more explicitly,

Contact Name_Organisation_Task#_Data Set Name_% of Data used.xls
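The naming patterns above can be assembled programmatically to avoid typing mistakes across many template files. The helper below is a sketch that reproduces the three worked examples; the function name `template_filename` and its keyword arguments are hypothetical, and the ordering of optional parts simply follows those examples.

```python
def template_filename(contact, organisation, task, dataset,
                      station=None, states=None, percent=None):
    """Join the name parts with underscores, following the worked examples."""
    parts = [contact.replace(" ", "_"), organisation, f"Task{task}"]
    if station is not None:      # Task 2: per-station classifier
        parts.append(f"Stn{station}")
    if states is not None:       # Task 3, Test 3: one of I/D/N per parameter
        parts.append(states)
    parts.append(dataset)
    if percent is not None:      # Task 3, Test 2: share of data used
        parts.append(str(percent))
    return "_".join(parts) + ".xls"

print(template_filename("John Smith", "ACME", 2, "Training", station="W"))
print(template_filename("John Smith", "ACME", 3, "Training", states="IDN"))
print(template_filename("John Smith", "ACME", 3, "Training", percent=40))
```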


Specific details on reporting on task 3.

1) For test 1 of task 3, please report the probabilities and we will calculate the AUCs.

2) For test 2 of task 3,
a) please start the tuning of your classifiers from scratch
b) please use X (where X is 20%, 40%, 60%, 80%, and 100%) of the labelled data set to tune your classifier,
c) then please run your classifier on the entire unlabelled data and
d) report the results from the unlabelled data set in the reporting template.

Note that you should generate a reporting template for each of the percentages (i.e. 5 templates in this test).

3) For test 3 under task 3 (which actually concerns the Task 1 classifier), please do all the work with the labelled data set and report your AUCs as for tasks 1 and 2.
