This week’s roundup of Big Data frameworks, tools, and examples!
A MapReduce job to explore blunders in chess games.
In this demo we take the crime dataset from the City of Chicago, turn it into a streaming data source and process the data in two paths.
Oryx 2 is a realization of the lambda architecture built on Apache Spark and Apache Kafka, but specialized for real-time, large-scale machine learning.
Pachyderm is a complete data analytics solution that lets you efficiently store and analyze your data using containers. We offer the scalability and broad functionality of Hadoop, with the ease of use of Docker.
Serving system for Hadoop-generated data.
Presto is a distributed SQL query engine for big data.
Pinball is a scalable workflow manager.
Searchkit is a suite of UI components built in React. The aim is to rapidly create beautiful search applications using declarative components, without being an Elasticsearch expert.
Apache Kafka C/C++ client library.
Fenzo is a scheduler Java library for Apache Mesos frameworks that supports plugins for scheduling optimizations and facilitates cluster autoscaling.
Large-scale image and time series analysis with Spark.
Scala extensions for the Storm distributed computation system. Tormenta adds a type-safe wrapper over Storm’s Kafka and Kestrel spouts.
Storehaus is a library that makes it easy to work with asynchronous key-value stores. Storehaus is built on top of Twitter’s Future.