Weekly Roundup - Feb. 28, 2016

Yahoo kicks off this week’s roundup by releasing CaffeOnSpark!


PulsarIO Realtime-Analytics
Real-time analytics. This repository includes the core components of the Pulsar pipeline.

Apache Curator
Curator is a set of Java libraries that make using Apache ZooKeeper much easier.

Jetstream is a stream processing framework.

Jetstream Esper Processor implementation


Caffe On Spark
CaffeOnSpark brings deep learning to Hadoop and Spark clusters.

Druid Kafka extension to avoid Kafka high level API rebalance issue

Tools for reading data from Solr as a Spark RDD and indexing objects from Spark into Solr using SolrJ.

Code to index HDFS to Solr using MapReduce

Shared code for our Hadoop ecosystem connectors


Flyway by Boxfuse • Database Migrations Made Easy.

Apache Sentry
Apache Sentry is a highly modular system for providing fine-grained, role-based authorization to both data and metadata stored on an Apache Hadoop cluster.

Hari Sekhon Tools
Utils for Hadoop, Hive, Solr, Linux, SQL, Ambari, Datameer, Web and various Linux CLI Tools.

Solr Scale Toolkit
Fabric-based framework for deploying and managing SolrCloud clusters in the cloud.

Banana for Solr – A Port of Kibana

Code for using Pig scripts to index content to Solr
Lucidworks Pig Functions for Solr


You can find many more tools, frameworks, and libraries at the PocketCluster Index. Go check it out! Looking to add your repo? Tweet to @stkim1!

