What language is the best for data mining algorithms?

Postby Marcin » Tue Dec 08, 2009 4:24 pm

Hello,

Automated evaluation of algorithms with TunedTester and sharing of their implementations lie at the heart of TunedIT. Currently, TunedTester supports only one programming language: Java. This is a significant constraint for many researchers and programmers who don't implement in Java. For this reason, we'd like to extend TunedTester with support for other languages, but we need your help and advice to decide which languages are worth considering: which are the most popular, mature, flexible, efficient, interoperable, secure, easy to use, and easy to deploy in production, from the perspective of data-mining / machine-learning research and applications? We invite you to share your experiences on this forum.

Tell us about the language or software environment that you're using now or would like to use in the future. More importantly, tell us about the API (Application Programming Interface) that your algorithms implement. For example: is the algorithm implemented as a class, a function, or a bunch of functions? If it is a class, what is the base class and what methods must be overridden? What are the arguments and return values of these methods/functions? Is this your custom API or a common standard?

The choice of API is very important, because it determines interoperability of the algorithm with other pieces of software, including TunedTester.
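
To illustrate the kind of API description we have in mind, here is a minimal sketch in Python. It is purely hypothetical: the names Classifier, fit, predict, MajorityClassifier and accuracy are made up for this example and are not TunedTester's actual interface. It only shows one possible answer to the questions above: a base class, the methods that must be overridden, and their arguments and return values.

```python
from abc import ABC, abstractmethod
from collections import Counter


class Classifier(ABC):
    """Hypothetical base class for illustration; not TunedTester's API."""

    @abstractmethod
    def fit(self, samples, labels):
        """Learn from training samples (list of feature lists) and labels."""

    @abstractmethod
    def predict(self, sample):
        """Return the predicted label for a single sample."""


class MajorityClassifier(Classifier):
    """Baseline that always predicts the most frequent training label."""

    def __init__(self):
        self.majority_label = None

    def fit(self, samples, labels):
        if not labels:
            raise ValueError("labels must not be empty")
        # most_common(1) returns a list with one (label, count) pair.
        self.majority_label = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, sample):
        if self.majority_label is None:
            raise RuntimeError("call fit() before predict()")
        return self.majority_label


def accuracy(classifier, samples, labels):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(classifier.predict(s) == y for s, y in zip(samples, labels))
    return hits / len(labels)
```

With an interface like this, an evaluation tool needs to know only the base class: it can call fit on training data and predict on test data without any knowledge of the concrete algorithm.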

Everyone is welcome to post about their experiences. It doesn't matter if you're a TunedIT veteran or just came across this forum by accident. You may add a one-line vote or a detailed description of your approach. At some point we'll review your opinions and decide how to proceed with TunedTester development. However, we don't expect this thread to end anytime soon, because languages and software environments keep changing very fast and TunedTester development must follow.

Thanks
Marcin Wojnarski, TunedIT
Marcin
 
Posts: 115
Joined: Fri Oct 09, 2009 6:45 pm

Re: What language is the best for data mining algorithms?

Postby Guest » Tue Dec 08, 2009 8:38 pm

C#
Guest
 

Re: What language is the best for data mining algorithms?

Postby a passer-by... » Wed Dec 09, 2009 12:18 am

I'm sure many people will say C or C++ is a must, because there are so many libraries available, and also because of concerns about execution speed when working with very large datasets.

Personally, however, I like Python. To keep the speed up, you can use SWIG to create a Python wrapper around any underlying C or C++ library, which should make everyone happy.

In terms of API, I just use whatever the documentation for the library I'm using requires. Ideally, my algorithms are just "glue code" to patch together a bunch of library calls, so most of the total code is in the libraries.
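
A minimal sketch of what I mean by glue code, using only Python's standard library (csv, io, statistics) as a stand-in for a wrapped C or C++ library. The function name column_summary and the sample data are made up for illustration.

```python
import csv
import io
import statistics


def column_summary(csv_text, column):
    """Parse CSV text and summarise one numeric column using library calls."""
    rows = csv.DictReader(io.StringIO(csv_text))
    values = [float(row[column]) for row in rows]
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


if __name__ == "__main__":
    sample = "x,y\n1,2\n3,4\n5,9\n"
    print(column_summary(sample, "y"))
```

Nearly all the real work (parsing, averaging) happens inside the libraries; the function itself only connects them.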
a passer-by...
 

Re: What language is the best for data mining algorithms?

Postby another passerby » Wed Dec 09, 2009 2:05 am

Depends on the person & application...

For the Netflix competition I mostly used VB2008 for quick development, although some of the stuff I tried was so slow I almost went over to C/C++ to pick up some speed. It's strange to have programs that take more time to run than to write!

'R' is popular in data mining (especially biomedical) because of the statistical support libraries - and the price is right :)

SAS & SPSS are popular among the corporate data mining crowd.

I suspect your list will get longer the more people you ask... everyone has their own favorites.
another passerby
 

Re: What language is the best for data mining algorithms?

Postby Guest » Thu Dec 10, 2009 11:14 am

R
Guest
 

Re: What language is the best for data mining algorithms?

Postby Louis Kleiman » Mon Dec 28, 2009 5:07 pm

My choice is Delphi. I know it very well, it compiles to native code and is indistinguishable from C performance-wise, and it has component libraries that make database programming easy and fast. Basically, the ease of VB and the speed of C.
Louis Kleiman
 

Re: What language is the best for data mining algorithms?

Postby seyhan » Wed Jul 14, 2010 3:01 pm

I use C++ for text analysis, such as reading and shaping data, and R for statistical analysis. Also, KNIME and RapidMiner are very handy for data mining modelling (training and testing on the shaped data).
seyhan
 
Posts: 2
Joined: Fri Jul 09, 2010 2:18 pm

Re: What language is the best for data mining algorithms?

Postby Gabrielwer » Wed Sep 07, 2011 10:14 am

I also use C++, since translating from C to C++ reduces preprocessor usage.
Last edited by Gabrielwer on Thu Oct 06, 2011 11:38 am, edited 1 time in total.
Gabrielwer
 
Posts: 1
Joined: Wed Sep 07, 2011 10:10 am

Re: What language is the best for data mining algorithms?

Postby xander771 » Wed Sep 14, 2011 12:16 pm

Louis Kleiman wrote: My choice is Delphi. I know it very well, it compiles to native code and is indistinguishable from C performance-wise, and it has component libraries that make database programming easy and fast. Basically, the ease of VB and the speed of C.

Still, I prefer C++ to Delphi.
xander771
 
Posts: 1
Joined: Wed Sep 14, 2011 12:15 pm

Re: What language is the best for data mining algorithms?

Postby mbq » Sat Sep 17, 2011 4:07 pm

R -- it was created for exactly this job. And it already has a huge library, is free and open, and is easy to deploy...
mbq
 
Posts: 1
Joined: Fri Jan 14, 2011 3:26 pm

