Task 3 - GPS Training Data Outliers

Questions, answers, discussions related to IEEE ICDM Contest

Postby amrkabardy » Sun Jul 11, 2010 4:35 pm

The training data for Task 3 ("GPS") contains unrealistic values. There are about 57,000 outliers whose longitude lies far outside the Warsaw map (some values jump to Russia!!).

As an example, have a look at these rows:
170 129 0 52.1983 38.0398
230 386 0 52.1682 36.1611
250 195 0 52.3358 40.5478
260 62 23.5 52.2963 32.6303
900 334 0 52.2469 37.7419
910 251 0 52.1867 39.4539
930 325 11.2 52.2472 35.2635
950 637 0 52.2503 39.5417
990 526 0 52.1458 40.4882
1030 12 0 52.2347 38.8464
1040 225 0 52.1799 18.7817
1060 562 0 52.2449 36.0609

I will just throw away such points and work with the remaining dataset, but I'd like to know whether there is anything else I should be aware of.
Please advise if there are any special considerations regarding these data.

Thanks
amrkabardy
 
Posts: 4
Joined: Wed Jun 30, 2010 6:05 pm

Re: Task 3 - GPS Training Data Outliers

Postby pawelg » Wed Jul 14, 2010 12:02 pm

Hello,

Thank you for this observation. Yes, there are indeed some outliers with wrong longitude values (the other attributes should be fine). They are caused by a bug in the data generation, but the case that triggers the bug is very rare (and we simply missed it).

Best regards,
Pawel Gora
pawelg
 
Posts: 13
Joined: Fri Jun 25, 2010 9:12 pm

Re: Task 3 - GPS Training Data Outliers

Postby pawelg » Wed Jul 14, 2010 12:37 pm

One more thing:
the correct latitude lies in the range:
(52.106505190756316, 52.375599176659101)
and the correct longitude lies in the range:
(20.830078125, 21.26953125)

Just omit the points whose values lie outside these ranges.
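
For anyone who wants to apply this filter in code, here is a minimal Python sketch. It assumes that latitude and longitude are the last two whitespace-separated columns of each row (as the sample rows earlier in the thread suggest) and that the bounds are inclusive; neither assumption is confirmed by the organizers. The names `in_warsaw_bounds` and `filter_rows` are made up for illustration.

```python
# Bounds quoted from the post above; treating them as inclusive is an assumption.
LAT_MIN, LAT_MAX = 52.106505190756316, 52.375599176659101
LON_MIN, LON_MAX = 20.830078125, 21.26953125


def in_warsaw_bounds(lat, lon):
    """Return True when both coordinates fall inside the stated ranges."""
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX


def filter_rows(lines):
    """Keep only rows whose last two fields are valid coordinates.

    Assumes latitude and longitude are the final two columns,
    which is a guess based on the sample rows in this thread.
    """
    kept = []
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        try:
            lat, lon = float(fields[-2]), float(fields[-1])
        except ValueError:
            continue  # skip rows with non-numeric coordinates
        if in_warsaw_bounds(lat, lon):
            kept.append(line)
    return kept


sample = [
    "170 129 0 52.1983 38.0398",   # from the thread: longitude far east of Warsaw
    "1040 225 0 52.1799 18.7817",  # from the thread: longitude west of the range
    "10 5 0 52.2300 21.0100",      # made-up in-range row, for illustration only
]
print(filter_rows(sample))  # only the made-up in-range row survives
```

Dropping whole rows rather than repairing the longitude matches the advice above; if the other attributes of an outlier row are needed, they would have to be handled separately.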

Pawel Gora
pawelg
 
Posts: 13
Joined: Fri Jun 25, 2010 9:12 pm


Return to IEEE ICDM Contest: Road Traffic Prediction for Intelligent GPS Navigation
