Prescriptive Applications over Kite and Hadoop.
Hadoop Mini Clusters
Collection of Hadoop Mini Clusters.
Spark Application Templates
This repository contains basic templates for a simple Spark application, a simple Spark Streaming application (network word count), and a simple Spark MLlib application (KMeans example).
Giraph Tree Rooter
A simple example of using Giraph to root nodes in a tree.
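To illustrate what rooting a tree means, here is a minimal sketch in plain Python rather than Giraph. It is not the repository's code. The function name `root_tree` and the edge-list input are assumptions made for illustration. The loop imitates vertex-centric supersteps: the root messages its neighbors, and each node adopts the first sender it hears from as its parent before forwarding to its other neighbors.

```python
from collections import defaultdict


def root_tree(edges, root):
    """Return {node: parent} for an undirected tree; the root maps to None.

    Hypothetical helper for illustration only, not part of Giraph.
    """
    neighbors = defaultdict(list)
    for a, b in edges:
        neighbors[a].append(b)
        neighbors[b].append(a)

    parent = {root: None}
    # Each message is (sender, receiver); the root starts by messaging its neighbors.
    messages = [(root, n) for n in neighbors[root]]
    while messages:  # one loop iteration plays the role of one superstep
        next_messages = []
        for sender, receiver in messages:
            if receiver in parent:
                continue  # already rooted, so ignore the message
            parent[receiver] = sender
            # Forward to every neighbor except the one we just heard from.
            next_messages.extend(
                (receiver, n) for n in neighbors[receiver] if n != sender
            )
        messages = next_messages
    return parent


edges = [("a", "b"), ("a", "c"), ("c", "d"), ("c", "e")]
parents = root_tree(edges, "a")
```

In a real Giraph job the same idea would be expressed as a per-vertex compute step with message passing between supersteps; the sketch only shows the logic.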
NLP4L is a natural language processing tool for Apache Lucene written in Scala.
Schedoscope is a scheduling framework for pain-free agile development, testing, (re)loading, and monitoring of your datahub, lake, or whatever you choose to call your Hadoop data warehouse these days.
Distributed RDFS & OWL Semantic Reasoning System with Spark
Search Using Brute Force Scans
Hadoop tools for manipulating ClueWeb collections and performing document retrieval using brute force scan techniques.
Bouquet is an open-source analytics toolbox to explore, share, and connect your data to applications and visualizations.
Copybook Input Format
Uses JRecord to build mapred and mapreduce InputFormats for HDFS, MapReduce, Pig, Hive, and Spark.
Adds the ability to rebalance clusters that run HBase by selecting which folders to rebalance.
Hadoop File Ingestor
A simple program that puts files from a directory into HDFS, with added functionality for defining how that action happens.
Fixed Length Input Format
A FixedLengthInputFormat for Hadoop MapReduce.
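As a rough illustration of what a fixed-length input format does, the sketch below splits a byte stream into records of equal size. It is plain Python, not Hadoop code, and not this repository's implementation. The function name `read_fixed_length_records` and the 8-byte record size are assumptions chosen for the example.

```python
import io


def read_fixed_length_records(stream, record_length):
    """Yield successive records of exactly record_length bytes.

    Hypothetical helper for illustration only; a trailing partial record
    is treated as an error.
    """
    if record_length <= 0:
        raise ValueError("record_length must be positive")
    while True:
        record = stream.read(record_length)
        if not record:
            return  # clean end of input
        if len(record) != record_length:
            raise ValueError(
                f"partial record of {len(record)} bytes at end of input"
            )
        yield record


# Three 8-byte records: a 4-byte name followed by a 4-byte counter.
data = io.BytesIO(b"AAAA0001BBBB0002CCCC0003")
records = list(read_fixed_length_records(data, 8))
```

A distributed input format additionally has to align input splits to record boundaries, which this single-stream sketch does not attempt.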