Elixir Dataflows

In support of a colleague’s data analysis, I built a live online data flow application in Elixir to ingest large quantities (hundreds of gigabytes) of social media data, process and filter it, and load it into a separate database for direct querying. We used an iterative, agile methodology to develop the data processing and filtering techniques. I chose Elixir for its easy parallelism and functional nature.

Flows the Wrong Way, Part 2: The Right Way

2018/10/13

Note in 2022: I’m in a very different state of mind compared to when I wrote this, and neither part represents my modern voice or style particularly well. I think it’s still a good story.

In my last post, I covered my first attempt to implement TCP streaming in Flow, a data flow library for Elixir. That attempt involved a bunch of failed Unix sockets and a GenStage implementation that failed for reasons I didn’t understand. I eventually settled on this:

Flows the Wrong Way: Streaming into Elixir

2018/10/10

Note in 2022: I’m in a very different state of mind compared to when I wrote this, and neither part represents my modern voice or style particularly well. I think it’s still a good story.

As part of a new and exciting project, I was faced with the task of ingesting a large amount of more or less homogeneous JSON data into a SQL database for an associate of mine to do some rudimentary business intelligence analysis on it.