Training on the testset

Questions, answers, discussions related to RSCTC'2010 Discovery Challenge

Post by OneMillionMonkeys » Thu Dec 24, 2009 12:23 am

For the basic track, we have access to the testset when we train the models. Not the labels, of course, but all the field values for each record in the testset. Even with the labels missing, I believe having the rest of the testset available will improve my model.
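One simple way unlabeled testset records can help is in preprocessing. The sketch below is only an illustration under assumptions: the feature values are made up, and `fit_scaler` and `scale` are hypothetical helpers, not part of any contest tooling. It computes standardization statistics over the training and testset features together, without ever touching labels.

```python
from statistics import mean, pstdev

def fit_scaler(rows):
    """Return a (mean, std) pair for each feature column of the given rows."""
    columns = list(zip(*rows))
    # Fall back to 1.0 for constant columns to avoid dividing by zero.
    return [(mean(c), pstdev(c) or 1.0) for c in columns]

def scale(rows, stats):
    """Standardize every row using the per-column (mean, std) pairs."""
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]

# Made-up feature values; labels are not needed for this step.
train_x = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
test_x = [[4.0, 40.0], [5.0, 50.0]]

# Statistics come from train and test features combined (labels unused).
stats = fit_scaler(train_x + test_x)
train_scaled = scale(train_x, stats)
test_scaled = scale(test_x, stats)
```

Whether this kind of step actually improves a model depends on the data; it is shown here only to make the idea concrete.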

I haven't tried submitting to the advanced track yet, but from browsing the documentation it seems you do not have equivalent access there, no matter which interface (Debellor, etc.) you use. Instead, you are given the whole training set and then presented with testset samples one at a time, each of which you must classify right away. You cannot see the whole testset in advance, use that information to help train your model, and only then begin classification.

That said, nothing seems to prevent you from adjusting your model incrementally as testset samples are presented. So, in theory, your model could be more accurate on later testset samples than on earlier ones.
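To make the idea concrete, here is a minimal sketch of a model that adapts while testset samples arrive one at a time. It is an assumption-laden illustration: the class `AdaptiveCentroidClassifier`, its methods, and the data are hypothetical and have nothing to do with the Debellor interface or any other contest API. It is a nearest-centroid classifier that, after labeling each sample, folds that sample into the centroid of the predicted class (a simple form of self-training).

```python
class AdaptiveCentroidClassifier:
    """Nearest-centroid classifier that keeps updating on its own predictions."""

    def __init__(self):
        self.sums = {}    # label -> per-feature running sums
        self.counts = {}  # label -> number of samples absorbed

    def update(self, x, label):
        """Fold one sample into the running centroid of the given label."""
        if label not in self.sums:
            self.sums[label] = [0.0] * len(x)
            self.counts[label] = 0
        self.sums[label] = [s + v for s, v in zip(self.sums[label], x)]
        self.counts[label] += 1

    def centroid(self, label):
        n = self.counts[label]
        return [s / n for s in self.sums[label]]

    def predict(self, x):
        """Return the label whose centroid is closest in squared distance."""
        def sq_dist(label):
            return sum((a - b) ** 2 for a, b in zip(x, self.centroid(label)))
        return min(self.counts, key=sq_dist)

    def classify_and_adapt(self, x):
        """Classify right away, then treat the prediction as a pseudo-label."""
        label = self.predict(x)
        self.update(x, label)
        return label

# Train on the labeled training set (made-up data).
model = AdaptiveCentroidClassifier()
for x, y in [([0.0, 0.0], "a"), ([1.0, 1.0], "a"),
             ([9.0, 9.0], "b"), ([10.0, 10.0], "b")]:
    model.update(x, y)

# Testset samples arrive one at a time; the model adapts after each answer.
test_stream = [[2.0, 2.0], [8.0, 8.0], [3.0, 3.0]]
predictions = [model.classify_and_adapt(x) for x in test_stream]
```

Note that adapting on pseudo-labels can also reinforce early mistakes, so later samples are not guaranteed to be classified more accurately.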

Is my understanding correct and is this how things are intended? Is it permissible for a model to adapt during presentation of the testset samples?

Re: Training on the testset

Post by Marcin » Mon Dec 28, 2009 7:46 pm

Yes, we'll accept algorithms that adapt the model during presentation of test samples. In many real-world applications this kind of adaptation would be highly desirable, so it wouldn't be reasonable to exclude such algorithms from the contest.

Regards
Marcin Wojnarski
Organizing Committee

