The programming language chosen for predictive trading systems depends on where you are in the life cycle. If you want to produce code to validate ideas, then the fewest lines of code to get a working system tends to favor languages like F#, Python, and Julia. Even for early ideas you sometimes have to deal with considerable data volumes, so you still need reasonable speed. If you are coding for live operations where speed is essential, then C/C++ is the best choice, especially when unplanned pauses from garbage collectors are unacceptable. Julia is a newer language designed for speed in critical areas. It isn't perfect, but overall I am impressed.
I find myself looking for a hybrid: a language fast enough to use for everything except HFT, with better programmer productivity, where I can choose to port only the most critical pieces to C. I recently tried Julia in this middle ground to see how I liked the language when testing new ideas for predictive algorithms. Julia seems faster than Python in tight statistics loops where you have to code your own indicators. Julia also seems far faster than Node.js or Python when indexing into large vector arrays from inside the core language. I think NumPy is still faster when you can use matrix operations, but a lot of my code needs to iterate across sliding windows, and there Julia was much faster than Node.js or Python. I think I isolated the primary speed difference to Julia having a lighter-weight mechanism for computing pointers to array elements than Python or Node.
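To make the sliding-window point concrete, here is a hypothetical example in Python of the kind of indicator loop I mean: an exponential moving average. The recurrence depends on the previous output value, so it cannot be collapsed into a single NumPy matrix operation, and the interpreter pays per-iteration overhead on every index; this is exactly the shape of loop where a compiled language like Julia pulls ahead. (The function name and alpha value are illustrative, not from any particular library.)

```python
import numpy as np

def ema(prices, alpha):
    """Exponential moving average via an explicit loop.

    out[i] = alpha * prices[i] + (1 - alpha) * out[i - 1]
    The dependence on out[i - 1] makes this a stateful scan that
    cannot be expressed as one vectorized NumPy operation.
    """
    out = np.empty(len(prices))
    out[0] = prices[0]
    for i in range(1, len(prices)):
        out[i] = alpha * prices[i] + (1.0 - alpha) * out[i - 1]
    return out

prices = np.array([100.0, 101.0, 99.5, 102.0, 103.0])
print(ema(prices, 0.5))  # [100.  100.5 100.  101.  102. ]
```

In pure Python this loop runs at interpreter speed; the equivalent Julia loop compiles down to machine code, which is consistent with the difference I observed.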
I would still choose to move to C/C++ for important production code, but Julia seems to offer a pretty sweet balance between performance and programmer productivity. Julia is still immature and the error messages need improvement. They really need to add default initializers for composite structures, but overall I like the language. Julia includes a DataFrame type that is similar to R's data frames, and it includes many of the same matrix operations. Julia shines when you pull out vectors to write functions that cannot be accomplished with matrix operations and have to iterate across the rows. That kind of operation requires fast, tight loops indexing into arrays, and Julia ran a lot faster than R for that task.
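As a sketch of the "pull out the vector and iterate" pattern, here is a maximum-drawdown scan written against pandas (standing in for the DataFrame workflow; the column name and data are made up for illustration). The scan carries a running peak, so each row depends on state from earlier rows, and extracting the column as a plain array before looping keeps the loop tight instead of paying per-row DataFrame indexing overhead:

```python
import numpy as np
import pandas as pd

def max_drawdown(equity):
    """Largest peak-to-trough decline, as a fraction of the peak.

    A stateful row-by-row scan: the running peak depends on all
    earlier rows, so this is loop-shaped rather than matrix-shaped.
    """
    peak = equity[0]
    worst = 0.0
    for x in equity:
        if x > peak:
            peak = x
        dd = (peak - x) / peak
        if dd > worst:
            worst = dd
    return worst

df = pd.DataFrame({"equity": [100.0, 120.0, 90.0, 110.0, 105.0]})
col = df["equity"].to_numpy()   # pull the column out as a raw vector
print(max_drawdown(col))        # 0.25  (peak 120 -> trough 90)
```

The same shape of code in Julia, iterating over a vector pulled from a DataFrame, is where I saw the large speed gap over R.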
I have always loved the flexibility of R's data.table objects and its graphics integration, but hated the speed whenever I had to iterate over large sets. Julia seems to do a better job on the performance-critical pieces of code while preserving many of the niceties of the R data.table.
The R Hadoop integration is nice, but sometimes having a language that can run 10X faster on the same algorithm can move you from needing a 20-node Hadoop cluster down to something that can be accomplished on a single 8-core CPU. This can significantly reduce operating costs. Julia does have distributed computing support, but I have not tested it so I cannot vouch for how well it works.
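The back-of-envelope arithmetic behind that claim looks roughly like this; the 10X figure and node counts are from the paragraph above, while the per-node core count is an assumption I added purely to make the numbers work out:

```python
# Cluster side: 20 nodes of the slow language.
cluster_nodes = 20
cores_per_node = 4                                # assumed, for illustration
cluster_cores = cluster_nodes * cores_per_node    # 80 slow cores

# Single-machine side: 8 cores running a 10X-faster language.
speedup = 10
single_box_cores = 8
effective_cores = speedup * single_box_cores      # 80 slow-core equivalents

print(cluster_cores, effective_cores)  # 80 80
```

In practice the cluster also pays coordination and shuffle overhead that the single box does not, so the comparison, if anything, understates the single-machine side.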
I would still hesitate to recommend Julia to my commercial clients, but for early-life-cycle prototypes and proof-of-concept predictive engines it seems like a good choice.
Thanks, Joe Ellsworth