I’m sorry Cron, I’ve met AirBnB’s Airflow

I used to handle some data workflows, especially ETL jobs, with cron. At first it felt easy and natural, but it didn’t scale well and eventually got me into trouble.

Service Provider WiFi analysis with Spark DataFrames

Imagine a huge WiFi network with hundreds of access points (APs) and thousands of users. The network is managed by a controller, which has full visibility of every event, such as a user connecting or disconnecting. In my case, the controller writes all of these events to a Hadoop Distributed File System (HDFS) in raw format.