Glenn Engstrand

Functional Programming and Big Data for the Impatient

Learn about the future of Functional Programming and Big Data by reading this evaluation of three relevant open source technologies: PigPen, Cascalog, and Apache Spark. The same small report, a per-minute count of post actions from a two-hour test run of a web service that generated 4,077,809 events, was written in all three languages and environments. Based on the findings of that development work, the three technologies are evaluated and compared.
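To make the report concrete, here is a minimal sketch of a per-minute count of post actions, written in plain Python rather than in any of the three technologies under evaluation. The event layout (timestamp, entity, action), the sample records, and the function name posts_per_minute are all assumptions made for illustration; the actual event format from the test run is not shown in this summary.

```python
from collections import Counter
from datetime import datetime

# Hypothetical event records: (ISO-8601 timestamp, entity, action).
# These values are made up for illustration; they are not data from the test run.
SAMPLE_EVENTS = [
    ("2014-06-01T10:00:05", "friend", "post"),
    ("2014-06-01T10:00:40", "inbound", "post"),
    ("2014-06-01T10:00:59", "friend", "get"),
    ("2014-06-01T10:01:10", "outbound", "post"),
]


def posts_per_minute(events):
    """Return a dict mapping 'YYYY-MM-DDTHH:MM' to the number of post actions."""
    # Filter: keep only the post actions.
    posts = filter(lambda event: event[2] == "post", events)
    # Map: truncate each timestamp to minute resolution.
    minutes = map(
        lambda event: datetime.fromisoformat(event[0]).strftime("%Y-%m-%dT%H:%M"),
        posts,
    )
    # Reduce: count how many posts fall into each minute.
    return dict(Counter(minutes))


if __name__ == "__main__":
    for minute, count in sorted(posts_per_minute(SAMPLE_EVENTS).items()):
        print(minute, count)
```

The filter, map, and count steps mirror the general shape of the job; each of the three technologies expresses the grouping and counting in its own way.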

| measure | PigPen | Cascalog | Apache Spark |
|---|---|---|---|
| lines of code | 27 | 28 | 5 |
| run time in seconds | 123 | 85 | 13 |
| the good | easy to understand | implicit grouping | small and fast |
| the bad | broken and cumbersome | funky punctuation syntax | limited reduce side functionality |
| the ugly | evil Pig UDF hack | not truly functional | not always using Hadoop |

If you want more detail but are too impatient to read the five-page document below, check out this slide deck summary.

Functional Programming and Big Data