This year’s first weekly roundup! Check out all the additions from the PocketCluster Index!
Stock inference engine using Spring XD, Apache Geode / GemFire, and Spark MLlib.
Stock Prediction demo on Spark.
This project shows a simple example of how to integrate Drools into an Apache Spark job.
Sandbox for Apache NiFi.
Source, data and tutorials of the Hue video series, the Web UI for Apache Hadoop.
An example of running Drools with Spark.
This code implements a spectral (third-order tensor decomposition) method for learning an LDA topic model on Spark.
Sparkling Water integrates H2O’s fast scalable machine learning engine with Spark.
The Spring for Apache Hadoop project provides extensions to Spring, Spring Batch, and Spring Integration to build manageable and robust pipeline solutions around Hadoop.
The Spark-Druid package enables logical plans written against a raw event dataset to be rewritten to take advantage of a Druid index of the event data.
The main goal of the Spark Kernel is to provide the foundation for interactive applications to connect to and use Apache Spark.
Spear is a SparkListener that maintains info about Spark jobs, stages, tasks, executors, and RDDs in MongoDB.
Scripts for generating Grafana dashboards for monitoring Spark jobs.
Scripts for parsing and making sense of YARN logs.
Guacamole is a framework for variant calling, i.e. identifying DNA mutations from Next Generation Sequencing data. It currently includes a toy germline (non-cancer) variant caller as well as a somatic variant caller for finding cancer mutations. Most development effort has gone into the somatic caller so far.
Spark ML transformers, estimators, Spark SQL aggregations, etc. that are missing in Apache Spark.