
How did you like the challenge?

Posted: Thu Mar 04, 2010 4:20 pm
by TunedIT
Hello,

Thanks, everyone, for participating. We hope you had fun and gained valuable experience. We enjoyed it very much. :-)

Please tell us what you thought of the challenge: the organization, the problem formulation, the TunedIT website, etc. All your thoughts and comments, good or bad, are welcome. They will help us greatly if we organize further competitions in the future.

Best regards,
Organizers

Re: How did you like the challenge?

Posted: Thu Mar 04, 2010 10:04 pm
by robinhood
This was a well-organised challenge, many thanks for all the work.
As other users have suggested before, it would be nice if post-challenge submissions were allowed or the test data revealed. Moreover, as a researcher from the Biosciences I would also be interested in knowing the real gene names corresponding to the features in the microarray datasets. Maybe some of the genes that were selected in many base classifiers of ensemble methods have a biological function associated with the outcome classes, and the prediction models are human-interpretable to a certain degree. In a future competition it might also be interesting to reveal the gene names right from the start, so that participants could try to integrate biological information from other data sources in their models (it would be very useful to see whether further performance gains are possible).

Best regards,
robin

Re: How did you like the challenge?

Posted: Thu Mar 04, 2010 11:21 pm
by tsymbalo
For me, the main problem was that I simply wasted more than three days trying to submit a solution based on the weka.meta classifiers, and afterwards, after a fairly long discussion with the organizers (which took another two days), it turned out that weka.meta is simply not yet supported! I could not find any information about that before I started preparing code based on weka.meta, so at least five days were wasted for nothing. I suspect I was not the only one who ran into this problem.

Second, the time and memory restrictions appeared to be too strict and not very clearly specified, and evaluation time clearly depended on the number of solutions being tested: while at the beginning testing a solution took 2-3 hours, towards the end I twice waited more than 24 hours! That is not a problem by itself, but it made any kind of planning simply impossible. Perhaps an output showing what percentage of the evaluation is already done could help people and save them precious time? Moreover, the time restrictions were rather unrealistic, and I believe much better and more meaningful results could be achieved if the time allowance were about twice as large. Twelve minutes for doing meaningful feature selection on a set of genetic data is not realistic today (or at least I believe so, even if seemingly good accuracies were achieved).

Third, having the gene names, as was already mentioned, might indeed be useful, not only for generating useful knowledge but also for the ability to draw on external knowledge.

Fourth, as I already said in my previous comment, publishing deeper history/overfitting statistics is more than desirable and might be much more interesting and useful (for everyone) than the contest itself.

Last, there are many other statistics besides the simple mean true positive rate. It would be more than desirable to see and analyse them as well; otherwise all this is of rather limited use. Are the generated models meaningful or useful at all for clinicians?
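
To illustrate what I mean, here is a minimal sketch in Python with scikit-learn (my own choice for brevity; the contest itself runs Java/Weka code, and the label vectors below are invented) showing the kind of per-class breakdown that could be reported alongside the mean true positive rate:

Code:
# Hypothetical example: per-class statistics beyond the mean true positive rate.
# The label vectors are invented; real challenge predictions would be used instead.
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report

y_true = np.array([0, 0, 0, 1, 1, 2, 2, 2, 2, 1])
y_pred = np.array([0, 0, 1, 1, 1, 2, 2, 0, 2, 1])

cm = confusion_matrix(y_true, y_pred)
per_class_tpr = cm.diagonal() / cm.sum(axis=1)   # sensitivity (recall) per class
mean_tpr = per_class_tpr.mean()                  # the challenge's headline metric

print("Confusion matrix:\n", cm)
print("Per-class TPR:", per_class_tpr)
print("Mean TPR (balanced accuracy): %.3f" % mean_tpr)
# Precision, recall, F1 and support per class - the kind of breakdown a clinician could inspect:
print(classification_report(y_true, y_pred))

Per-class sensitivities and precisions, or the full confusion matrices, would say much more about clinical usefulness than a single averaged number.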

Anyway, thanks a lot for organizing this contest and for tunedit, this was clearly a very useful experience, and tunedit is clearly an extremely useful thing!

Re: How did you like the challenge?

Posted: Fri Mar 05, 2010 2:42 pm
by Guest
Hi Guys,

Thanks for the well-run challenge.

I suspect the lessons learned from this (which is what it is all about) will be more about how to win competitions (for the basic track at least) than about how to develop a machine learning algorithm for the task at hand.

The evaluation method used means that the eventual difference between first and 20th place could have been decided by just 1 or 2 cases being predicted differently. For this reason it would be good to see the scores from the 6 individual data sets - this shouldn't be difficult to compute and publicise.
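
To put a rough number on that (every count below is an assumption for illustration only, not the actual challenge test-set sizes), here is a tiny calculation of how much a single flipped prediction can move a score that is averaged over per-class TPRs and over the 6 data sets:

Code:
# Toy sensitivity calculation; every count here is assumed, not taken from the real test sets.
n_datasets = 6        # the final score is (presumably) a mean over the 6 data sets
n_classes = 5         # assumed number of classes in the affected data set
class_size = 10       # assumed number of test cases in the affected class

# Flipping one prediction changes that class's TPR by 1/class_size,
# the data set's mean TPR by 1/(class_size * n_classes),
# and the overall average by 1/(class_size * n_classes * n_datasets).
delta = 1.0 / (class_size * n_classes * n_datasets)
print("One flipped case shifts the final score by about %.4f" % delta)   # ~0.0033

Differences of that order could easily separate neighbouring ranks, which is exactly why the per-data-set scores would be informative.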

Did you randomly select the leaderboard and final evaluation sets, or did you force each class to have the same counts in each? In the latter case it would be more of a 'game' of working around the evaluation criteria than a machine learning exercise.

Another point to note is that the target classes were human judgements, which could be wrong. Not being a subject matter expert myself: would it be possible to re-evaluate the human classifications based on the winners' predictions? This seems to me the most useful outcome of this type of analysis.

Looking forward to seeing the ensembling results.

Re: How did you like the challenge?

Posted: Tue Mar 09, 2010 7:42 pm
by Marcin
Thanks for your opinions.
robinhood wrote: it would be nice if post-challenge submissions were allowed or the test data revealed. Moreover, as a researcher from the Biosciences I would also be interested in knowing the real gene names corresponding to the features in the microarray datasets

The test data are now revealed and can be found in the Repository: http://tunedit.org/repo/RSCTC/2010/A and http://tunedit.org/repo/RSCTC/2010/B
Detailed information about the genes and the data sources will be published soon.

robinhood wrote: In a future competition it might also be interesting to reveal the gene names right from the start, so that participants could try to integrate biological information from other data sources in their models (it would be very useful to see whether further performance gains are possible).

Certainly, this information could help. We didn't reveal it because we had to keep the identity of the datasets secret during the competition.

Re: How did you like the challenge?

Posted: Tue Mar 09, 2010 11:43 pm
by Guest
Hi,

For the basic track, would it be possible for the released targets to flag whether they were in the leaderboard or the final evaluation set?
Also, for the ensembles you calculated: what actually were the results? When you say 'best x', do you mean best on the leaderboard or on the final results?

If you could do this, it would assist us non-Java users.

Cheers.

Re: How did you like the challenge?

Posted: Fri Apr 09, 2010 11:32 am
by Marcin
Hello,

On the main page of the challenge, there's now a complete summary with links to all resources and results. In particular, take a look at:


We hope that the publication of all these resources will let you continue the research work you started by participating in the contest.