CoreNLP wrapper for Spark
CoreNLP wraps the Stanford CoreNLP annotation pipeline as an Apache Spark ML Transformer.
Distributed Machine Learning Common Codebase
A common bricks library for building scalable and portable distributed machine learning.
Lightweight CPU/GPU Matrix and Tensor Template Library in C++/CUDA for (Deep) Machine Learning.
Smile (Statistical Machine Intelligence and Learning Engine)
Smile is a set of pure Java libraries implementing various state-of-the-art machine learning algorithms.
Avro RPC Quick Start
Apache Avro RPC Quick Start. Avro is a subproject of Apache Hadoop.
Fair job scheduler on Mesos for batch workloads and Spark.
A repo for benchmarking distributed implementations of the singular value decomposition.
A streaming anomaly detection system built with Oryx.
Choosing a fantasy football team using Spark, Hive, Python, and really just about anything.
LSA of Legal Documents
Latent Semantic Analysis of Legal Documents.
Scikit-Learn Score Example
Example of applying a fitted scikit-learn model to a distributed dataset using PySpark.
Sparkling Pandas Example
Examples of using SparklingPandas and Pandas with PySpark.