Yep, there were duplicates. This wasn't intended; initially it was just a bug in the data generation process. We discovered it shortly before the contest launch and wanted to fix the data, but later decided to leave it as it was, because in a real-world setting data are never clean. Quite the opposite: they contain all kinds of impurities, and without laborious investigation and cleansing one has no chance of getting anywhere close to optimal accuracy. So this was a kind of exercise.
Do you know whether the duplicates had any influence on model building and results?
As for other surprises, we'd be surprised ourselves if there were any. Or maybe you found something?