IEEE ICDM Contest: TomTom Traffic Prediction for Intelligent GPS Navigation, Task 1: Traffic

Status: Closed
Type: Scientific
Start: 2010-06-22 18:00:00 CET
End: 2010-09-07 23:59:59 CET
Prize: $5,000


Many municipalities use devices called Automatic Traffic Recorders (ATRs) to collect traffic data from a number of selected road segments in the city. ATRs are magnetic loops embedded in the pavement that detect the presence of metal and convert these detections into traffic volume data. ATR data are then used to make short-term and long-term predictions for driver navigation and urban planning (roadworks, new road construction, etc.), as for example in the Autobahn Traffic project in North Rhine-Westphalia, Germany.

In the first task, "Traffic", you have to devise an algorithm for predicting ATR recordings. You are given a time series of simulated congestion measurements from 10 selected road segments of Warsaw. At each time point, two values are recorded for a given segment, one for each of the two opposite directions of traffic flow. Congestion, defined here as the number of cars that passed a given segment, is measured in consecutive 1-minute periods. The TSF simulator ran in 10-hour simulation cycles. During a cycle, the distributions of start and destination points of new vehicles were replaced at random every 60 minutes. After 10 hours, the simulation was restarted from scratch.


The training dataset consists of a stream of data collected over 1000 hours of simulation, divided into one hundred independent 10-hour cycles. Each line contains the measurements from one minute of simulation, and lines appear in chronological order. Distinct simulation cycles are separated by an empty line. Every line contains 20 values: congestion for the two opposite directions of each of the 10 selected road segments. See the map and IDs of selected segments.
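
The file layout described above (one line per minute, empty lines between cycles) can be read with a few lines of code. The following is a minimal sketch, not an official loader: the function name parse_blocks is hypothetical, and it assumes that the values on a line are separated by whitespace, which the task description does not state explicitly for the input files.

```python
from typing import List

Row = List[float]


def parse_blocks(text: str, n_values: int = 20) -> List[List[Row]]:
    """Split file contents into blocks of rows separated by empty lines.

    Assumption: values on a line are whitespace-separated.
    Each non-empty line must hold exactly n_values numbers.
    """
    blocks: List[List[Row]] = []
    current: List[Row] = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            # An empty line closes the current cycle (or test window).
            if current:
                blocks.append(current)
                current = []
            continue
        row = [float(v) for v in line.split()]
        if len(row) != n_values:
            raise ValueError(
                f"line {line_no}: expected {n_values} values, got {len(row)}"
            )
        current.append(row)
    if current:
        # The last block may not be followed by an empty line.
        blocks.append(current)
    return blocks
```

If the description holds, applying this to the training file should give 100 blocks of 600 rows each. The same function would also fit any other file that uses the same line and separator layout.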

The test dataset covers a different 1000 hours of simulation, split into 60-minute windows. Only the first 30 minutes of each window are revealed; the remaining 30 minutes are kept secret and used to evaluate predictions. There are therefore 30 lines of recordings for every window. Windows are separated by an empty line.

Your task is to predict congestion (the total number of cars) for the 10-minute period starting 10 minutes after the end of the revealed data, i.e., for the period from the 41st to the 50th minute of every window. Windows in the test dataset are randomly permuted.
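
Because the training cycles are complete, they can be cut into 60-minute windows that mimic the test setup: the first 30 rows serve as input and the per-column total over minutes 41 to 50 serves as the target. The sketch below illustrates one way to do this. The names window_target and make_examples are hypothetical, and the choice of window offsets (the step parameter) is an assumption of this example; the task description does not say how test windows are aligned within a cycle.

```python
from typing import List, Tuple

Row = List[float]


def window_target(window: List[Row]) -> Row:
    """Per-column total over minutes 41..50 (1-based) of a 60-row window."""
    if len(window) < 50:
        raise ValueError("window must contain at least 50 rows")
    rows = window[40:50]  # 0-based indices 40..49 are minutes 41..50
    return [sum(col) for col in zip(*rows)]


def make_examples(cycle: List[Row], step: int = 60) -> List[Tuple[List[Row], Row]]:
    """Cut one simulation cycle into (first 30 rows, target) pairs.

    step=60 gives non-overlapping windows; a smaller step gives
    overlapping windows (an assumption, not part of the task statement).
    """
    examples = []
    for start in range(0, len(cycle) - 60 + 1, step):
        window = cycle[start:start + 60]
        examples.append((window[:30], window_target(window)))
    return examples
```

With step=60, a 600-row cycle yields 10 examples. Minutes 31 to 40 are never used, matching the 10-minute gap between the revealed data and the predicted period.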


Predictions must be submitted as a text file where lines correspond to test windows and every line contains 20 space-separated values of predictions. Values may be non-integers.

The baseline solution was calculated as the total number of cars in the last ten minutes of the known part of each window (the 21st through 30th minutes).
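
The baseline and the submission format can be sketched together as follows. This is an illustration under assumptions rather than the organizers' code: the function names are hypothetical, each window is assumed to be already parsed into 30 rows of 20 numbers, and the four-decimal output format is an arbitrary choice (the task only requires space-separated values, which may be non-integers).

```python
from typing import List

Row = List[float]


def baseline_prediction(known: List[Row]) -> Row:
    """Per-column total over the last 10 known minutes (minutes 21..30)."""
    if len(known) < 10:
        raise ValueError("need at least 10 known rows")
    return [sum(col) for col in zip(*known[-10:])]


def format_submission(predictions: List[Row]) -> str:
    """One line per test window, values separated by single spaces."""
    lines = [" ".join(f"{v:.4f}" for v in row) for row in predictions]
    return "\n".join(lines) + "\n"
```

A submission would then be produced by applying baseline_prediction to every test window, in the order in which the windows appear in the test file, and writing the result of format_submission to a text file.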

Solutions are evaluated by the Root Mean Squared Error (RMSE) of predictions.
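
For local validation, RMSE can be computed as in the sketch below. It assumes that the error is pooled over all predicted values, i.e., over every window and all 20 columns; the description does not spell out the exact aggregation used by the evaluation server. The function name rmse is hypothetical.

```python
import math
from typing import List

Row = List[float]


def rmse(predicted: List[Row], actual: List[Row]) -> float:
    """Root mean squared error pooled over all windows and columns."""
    if len(predicted) != len(actual):
        raise ValueError("number of windows differs")
    total, count = 0.0, 0
    for p_row, a_row in zip(predicted, actual):
        if len(p_row) != len(a_row):
            raise ValueError("number of values in a window differs")
        for p, a in zip(p_row, a_row):
            total += (p - a) ** 2
            count += 1
    if count == 0:
        raise ValueError("no values to evaluate")
    return math.sqrt(total / count)
```

Scoring the baseline on windows cut from the training cycles gives a reference value that any candidate model can be compared against before submitting.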


