Our team has been working on the Basic Track of the competition for about two months, and during that work we suspected something was wrong with dataset 6. Today we found the reason.
The problem is most likely in the test labels of dataset 6 itself, or in the way balanced
accuracy is computed for this dataset.
To measure our model's accuracy on each individual dataset, we set the predictions for every
dataset other than the one of interest to zero; the leaderboard score then reflects only the
dataset of interest.
With this probe, if someone submits a constant prediction (all 1s, all 2s, ..., all Cs) for a dataset X
that has C classes, and zeros for the other datasets, the expected leaderboard score is
[(0% + 0% + ... + 0% + 100%) / C] / 6. For a 5-class dataset, submitting all 1s should therefore
give 20% / 6 ≈ 3.33%, which the leaderboard reports as 3.00% (scores appear to be rounded to whole
percents). The story is the same when submitting all 2s, 3s, 4s, or 5s.
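To make the expectation concrete, here is a minimal sketch, assuming the per-dataset metric is standard balanced accuracy (the mean of per-class recall) and that the six per-dataset scores are averaged with equal weight; the helper function and test-set shape are our own illustration, not the organizers' scorer:

```python
from collections import Counter

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    hits = Counter()    # correct predictions per true class
    totals = Counter()  # test points per true class
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        if t == p:
            hits[t] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)

# A balanced 5-class test set and an all-1s constant submission.
y_true = [c for c in range(1, 6) for _ in range(10)]
y_pred = [1] * len(y_true)

per_dataset = balanced_accuracy(y_true, y_pred)  # recall is 100% on class 1, 0% elsewhere -> 1/C = 0.20
leaderboard = per_dataset / 6                    # the other 5 datasets contribute 0%
print(per_dataset, round(leaderboard, 4))        # 0.2 0.0333
```

So the probe predicts roughly 3.33% before any rounding, and the same value no matter which constant class is submitted.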
This holds for every dataset except dataset 6, and anyone can verify it:
submit zero as the prediction for datasets 1, 2, 3, 4, and 5, and a constant class for all
test points in dataset 6. The results are:
if you submit all 1s for dataset 6, you get 0.00%
if you submit all 2s for dataset 6, you get 3.00%
if you submit all 3s for dataset 6, you get 2.00%
if you submit all 4s for dataset 6, you get 1.00%
if you submit all 5s for dataset 6, you get 1.00%
This indicates a problem specific to dataset 6.
I tested dataset 3 the same way, and every constant class gives 3.00%, exactly as expected.
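The property being tested here is that balanced accuracy scores every constant submission identically at 1/C, regardless of which class is chosen and even when the test set is imbalanced, because each class's recall is weighted equally. A quick sketch of that check (our own code, not the competition's scorer; the class counts are made up):

```python
from collections import Counter

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    hits, totals = Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        totals[t] += 1
        hits[t] += t == p
    return sum(hits[c] / totals[c] for c in totals) / len(totals)

# Even on an *imbalanced* 5-class test set, a constant predictor scores 1/C:
# it gets 100% recall on its one class and 0% on the other four.
y_true = [1] * 50 + [2] * 10 + [3] * 20 + [4] * 5 + [5] * 15
for cls in range(1, 6):
    score = balanced_accuracy(y_true, [cls] * len(y_true))
    assert abs(score - 0.2) < 1e-9  # 1/C for C = 5, identical for every class
print("every constant class scores 1/C")
```

The asymmetric dataset 6 results above (0%, 3%, 2%, 1%, 1%) are incompatible with this property, which is why we suspect the labels or the metric computation for that dataset.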
Please fix this problem, or point out where we are making a mistake; either way, something
strange is going on with the scoring of dataset 6.