One of the most crucial steps of the analytics journey is the last one: communication. The way you share your results will inevitably affect the value your audience perceives.
In the past I handled some data workflows with cron, especially ETL jobs. At first it was easy and natural, but it didn’t scale well and eventually caused trouble.
Knowledge is power – Littlefinger (Game of Thrones)
If in this business the handsets are the Knights and the brands are the Lords, who is winning the battle? Joking aside, awareness of the mobile handset market is crucial for a mobile service provider that wants to increase sales, improve customer satisfaction and reduce churn. The days when customers obtained their handsets only from the provider are gone; now there are many other Lords in the battle (retail, Amazon, eBay, etc.). There is only one way to succeed: detect where the market is going, learn, and decide accordingly.
Imagine a huge WiFi network with hundreds of access points (APs) and thousands of users. The network is managed by a controller, which has full visibility of all events, such as a user connecting or disconnecting. In my case, the controller reports all the events to a Hadoop Distributed File System (HDFS) in raw format.
Continue reading “Service Provider WiFi analysis with Spark DataFrames”
One of the projects I’m working on is a platform to store all the accounting information generated by a mobile network with ~140K users. Although the system has several collections, the one I need to focus on is named
dayly, which contains ~57 million documents with the following simplified schema: Continue reading “Example of MongoDB’s Map Reduce”
In this post I’d like to show a minimal example of using Python for an unsupervised learning task: clustering. My goal is to segment customers based on their past mobile data usage. Continue reading “Unsupervised learning use case: mobile users segmentation”
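To give a feel for what segmenting users by data usage can look like, here is a minimal sketch of k-means clustering in plain Python. It is not necessarily the method used in the full post: the function name `kmeans_1d`, the choice of three segments, and the synthetic monthly usage figures are all assumptions made for illustration only.

```python
import random


def kmeans_1d(values, k, iterations=50):
    """Plain k-means on a list of numbers; returns (centroids, labels)."""
    # Deterministic start: pick k values evenly spread through the sorted data.
    ordered = sorted(values)
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each user goes to the nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Synthetic monthly usage in MB (made-up data, NOT real measurements):
# light, medium and heavy users, 50 of each.
random.seed(0)
usage_mb = ([random.gauss(200, 30) for _ in range(50)]
            + [random.gauss(2000, 200) for _ in range(50)]
            + [random.gauss(8000, 500) for _ in range(50)])

centroids, labels = kmeans_1d(usage_mb, k=3)
for segment, centre in enumerate(centroids):
    size = labels.count(segment)
    print(f"segment {segment}: {size} users, centre around {centre:.0f} MB")
```

In practice you would probably use several features per user (for example volume, number of sessions, time of day) and a library implementation rather than a hand-rolled loop; the one-dimensional version just keeps the assignment and update steps easy to follow.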