Weekly roundup – Apr. 15, 2016

Examples

Pipeline
End-to-End, Real-time, Advanced Analytics Big Data Reference Pipeline.


Spark Streaming Blueprint apps
An sbt blueprint for building your own Spark apps.

Toolsets

Streaming SQL for Apache Spark
Manipulate Spark Streaming data with SQL.

WhereHows
Data Discovery and Lineage for Big Data Ecosystem.

Confluent REST Utils
Utilities and a small framework for building REST services with Jersey, Jackson, and Jetty.

Kafka Offset Monitor
A small app to monitor the progress of Kafka consumers and their lag with respect to the queue.

zk-shell
A powerful & scriptable shell for Apache ZooKeeper.

HQL (Apache Hive) query language support in Atom
Brings HQL (Apache Hive query language) support to the Atom text editor. Works for SQL-like languages and Pig Latin.

dejaVu
A modern Elasticsearch data browser.

spark-gce
A tool for running Spark on Google Compute Engine.

Libraries

The Intel® Deep Learning Framework
IDLF is an SDK library for training and executing deep neural networks.

aerosolve
A machine learning package built for humans.

Apache ORC
ORC is a self-describing, type-aware columnar file format designed for Hadoop workloads.

Hiveunit-MR2
A library to test Hive scripts with YARN and MR2.

Ducktape
A distributed system integration and performance testing library.

 


You can find many more tools, frameworks, and libraries at the PocketCluster Index. Go check it out! Want to add your repo? Tweet to @stkim1!
