Right now nobody wants to work on this because 1) the data are incredibly noisy and 2) the target accuracy is unrealistic.
jake:
That is true. But my main reservations come from the low credibility of this contest.
1) Look at what the contestants are supplied with. Some reports that are confusing (different data, far from clear what they do and why), and claims whose validity is unclear ("coherences/consistency in the data"). No reasonable, credible clues are given.
2) Leaderboard evaluation. I think it is a well-known and well-established practice for such contests to evaluate the contestants on blind data. What is usually done is that people are asked to classify unknown data, of which a portion (30%) is used for computing the leaderboard results, while the classification result on the entire ensemble is undisclosed. In contrast to this, the organisers use **known data** for the leaderboard, so everyone can overfit (knowingly or not), and the leaderboard results are then a complete mess.
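To make the usual blind-evaluation scheme concrete, here is a rough sketch of how I would expect it to work. This is not the organisers' code. The function names, the accuracy metric, and the fixed seed are all my own assumptions; only the 30% public portion comes from the description above.

```python
import random


def split_public_private(n_items, public_fraction=0.3, seed=0):
    """Pick which hidden test items feed the public leaderboard.

    Hypothetical helper: the 30% default mirrors the usual practice
    described above; the seed is an arbitrary assumption.
    """
    rng = random.Random(seed)
    indices = list(range(n_items))
    rng.shuffle(indices)
    n_public = int(round(n_items * public_fraction))
    # Public and private index sets are disjoint by construction.
    return sorted(indices[:n_public]), sorted(indices[n_public:])


def accuracy(predictions, labels, indices):
    """Fraction of the given items that were classified correctly."""
    correct = sum(1 for i in indices if predictions[i] == labels[i])
    return correct / len(indices)


def score_submission(predictions, hidden_labels, public_idx):
    """Score one submission against labels the contestants never see.

    'public' is what the leaderboard would show during the contest;
    'final' is computed on the entire set and stays undisclosed.
    """
    all_idx = list(range(len(hidden_labels)))
    return {
        "public": accuracy(predictions, hidden_labels, public_idx),
        "final": accuracy(predictions, hidden_labels, all_idx),
    }
```

The point is that the labels stay with the organisers and only the public score is ever reported back, so tuning against the leaderboard can overfit at most the 30% subset, and the final ranking still rests on data nobody has tuned against.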

