By Anna Nicolaou – The big data frenzy continues. It’s permeating nearly every industry, flooding companies with ever more information, and making software dinosaurs such as Excel look increasingly inept.
So what are the best tools to sift through gigantic data sets?
R has been kicking around since 1997 as a free alternative to pricey statistical software, such as MATLAB or SAS.
Python is intuitive and easier to learn than R.
Julia is a high-level, insanely fast and expressive language.
“If you look inside Twitter, LinkedIn, or Facebook, you will find that Java is the foundational language for all of their data engineering infrastructures.”
- Hadoop and Hive
Hadoop has exploded as the go-to Java-based framework for batch processing. It pairs nicely with Hive.
“Java is like building in steel. Scala is like working with clay.”
- Kafka and Storm
What about when you need rapid, real-time analytics? Kafka is your friend. Storm, written mainly in Clojure and Java, is another framework for real-time processing.
MATLAB has been around for an eternity.
Octave is very similar to MATLAB, except it’s free.
Go, developed by Google, loosely derives from C.