Weekly BigData & ML Roundup – Oct. 19, 2017


Rapid Draw
A simple artificial intelligence experiment to find out if mobile neural networks can recognize human-made doodles

Chancey NN
Predict college admissions outcome using AI

Instagram Influencers
Identifying a Large Number of Fake Followers on Instagram

NLP Tasks
Natural Language Processing Tasks and Selected References

The Python Graph Gallery
A website displaying hundreds of charts made with Python


a language for image processing and computational photography

Test Tube
Python library to easily log, organize and optimize Deep Learning experiments

Artemis aims to get rid of all the boring, bureaucratic coding involved in machine learning projects, so you can get to the good stuff quickly.

Scikit Plot
An intuitive library to add plotting functionality to scikit-learn objects.

Sacred is a tool to help you configure, organize, log and reproduce experiments developed at IDSIA.


Tensorflow Implementation of Generative Adversarial Imitation Learning

PyTorch SepConv
An implementation of Video Frame Interpolation via Adaptive Separable Convolution using PyTorch


C++ native client for Impala and Hive, with Python / pandas bindings

Guided LDA
Semi-supervised guided topic model with custom guidedLDA

BLAS-like Library Instantiation Software Framework


A Hyper-Relational Database for Knowledge-Oriented System

Distributed training framework for TensorFlow.


Looking to add your repo? Tweet to @stkim1!

BigData and ML Toolset & Library Weekly Roundup – Oct. 12, 2017

PocketCluster Index now has “search” activated in the top navigation bar.


PyTorch Zero To All
Simple PyTorch Tutorials Zero to ALL!

Pandas Cookbook
Recipes for using Python’s pandas library

Simple LSTM
Minimal, clean example of LSTM neural network training in Python, for learning purposes.

Awesome GAN Applications
Curated list of awesome GAN applications and demos

Word-level language modeling RNN
word-language-model imported and modified from pytorch-examples

Nvidia OpenSeq2Seq
Multi-GPU sequence to sequence learning

Knowledge Browser
Query Spark in real time and visualise the results as a graph.

OpenSim Reinforcement Learning
Reinforcement learning environments with musculoskeletal models


Deep Learning toolkit for Computer Vision

Fast, flexible and easy to use probabilistic modelling in Python.

SQL-based streaming analytics platform at scale

Image Monkey Core
ImageMonkey is a free, public open source image validation service.

Sequence Semantic Embedding
An encoder framework toolkit for NLP-related tasks, implemented in TensorFlow by leveraging TF’s convenient DNN, CNN, RNN, LSTM, etc.


Poincare Embeddings
NumPy implementation of Poincaré Embeddings for Learning Hierarchical Representations (Facebook Research)

PyTorch QRNN
PyTorch implementation of the Quasi-Recurrent Neural Network – up to 16 times faster than NVIDIA’s cuDNN LSTM

This repository provides code for machine learning algorithms for edge devices developed at Microsoft Research India.


Deep Learning Library (DLL) for C++

Neural networks in JavaScript


A lightweight, modular, and scalable deep learning framework.

Weekly BigData & ML Roundup – Oct. 5, 2017


Pytorch Exercises
Familiarize yourself with PyTorch through simple quizzes

Fast Neural Style Transfer
Demo of in-browser Fast Neural Style Transfer with Deeplearn.JS library

2x Image Resolution
Rescale images to two times the original size using Decision Tree models. Matches and improves on traditional rescaling methods such as bilinear resampling. Noticeable improvements in the perceived sharpness of the image.

The AI can paint on a sketch according to a given color style.


Tensorflow UE4
Unreal Engine plugin for TensorFlow. Enables training and implementing state-of-the-art machine learning algorithms for your Unreal projects.

Validation of local and remote data tables

Apache RocketMQ
A distributed messaging and streaming platform with low latency, high performance and reliability, trillion-level capacity and flexible scalability.

Apple Core ML Tools
Core ML community tools contains all supporting tools for CoreML model conversion and validation. This includes Scikit Learn, LIBSVM, Caffe, Keras and XGBoost.

Facebook ELF
An Extensive, End-To-End, Lightweight and Flexible Platform for Game Research

Visualization of NBA games from raw SportVU data logs


Semantic Segmentation
Semantic Segmentation using Fully Convolutional Neural Network.

Auto Sleep Scorer
An open-source sleep stage classification Python package

TensorFlow implementation of ENet, trained on the Cityscapes dataset.

CycleGAN Tensorlayer
Re-implementation of CycleGAN in TensorLayer

State-of-the-art deep learning model for analyzing sentiment, emotion, sarcasm etc.

A PyTorch implementation of the DeepMoji model

PyTorch NTM
A PyTorch implementation of an NTM (Neural Turing Machine)

ShuffleNet implementation in TensorFlow


The goal of this library is to give the user the ability to efficiently train Deep Learning models in a homomorphically encrypted state without needing to be an expert in either

Fully asynchronous, pure JavaScript implementation of the Parquet file format

Mocked Streams
Scala Library for Unit-Testing Processing Topologies in Apache Kafka / Kafka Streams

A Golang library for text processing, including tokenization, part-of-speech tagging, and named-entity extraction.

Reinforcement Learning framework to facilitate development and use of scalable RL algorithms and applications

Go scientific library for scientific computations involving linear algebra, special functions, Bessel, fast Fourier transforms, geometry calculations, NURBS, numerical quadrature, polyhedra, 3D transfinite interpolation, random numbers, Mersenne twister, probability distributions, optimisation, graph, plotting, visualisation, tensors, eigenvalues, differential equations, and much more.

Apache CarbonData
Apache CarbonData is an indexed columnar data format for fast analytics on big data platforms, e.g. Apache Hadoop, Apache Spark, etc.


1000+ tools, frameworks and libraries indexed at PocketCluster Index!
Looking to add your repo? Tweet to @stkim1!

E-mail Subscription
Subscribe for upcoming posts!
Join Slack
Join the channel!


Weekly BigData & ML Roundup – Sep. 28, 2017

Yahoo has recently open-sourced Vespa, a Big Data Processing and Serving Engine, praised by CNBC for its battle-hardened content recommendation, ad targeting, and search execution capabilities.


Deep Learning tutorials in Jupyter notebooks.

Archive the Twitter sample firehose and daily trends

Benchmark Databases
A minimal benchmark of various tools (statistical software, databases etc.) for working with tabular data of moderately large sizes (interactive data analysis).

Baidu – Mobile Deep Learning
This research aims at simply deploying CNN (Convolutional Neural Network) on mobile devices, with low complexity and high speed.


A Game Agent Framework helping you create AIs / Bots to play any game you own

Unity Machine Learning Agents

This C++ toolbox is aimed at representing and solving common AI problems, implementing an easy-to-use interface with Python bindings, which should hopefully be extensible to many problems while keeping the code readable.

A dataset for RGB-D machine learning tasks captured throughout 90 properties with a Matterport Pro Camera


PyTorch Generative Model Collections
Collection of generative models in PyTorch

Splitting GAN
Code for Class-Splitting Generative Adversarial Networks


A Library for Bayesian Deep Learning, Generative Models, Based on TensorFlow

A small Julia library and package wrapper for ML/PR/AI

Computational graph library for Machine Learning. The main point is to combine mathematical operations to form a workflow of choice. The graph takes care of evaluating the gradient with respect to all the inputs, which eases setting up the minimizer.
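
To illustrate the idea in the description above, here is a minimal sketch of a computational graph that evaluates gradients for all of its inputs. It is not this library's API: the `Node` class and its methods are hypothetical names made up for this example, and only scalar addition and multiplication are supported.

```python
class Node:
    """A scalar value in a toy computational graph (hypothetical, for illustration)."""

    def __init__(self, value, parents=()):
        self.value = value
        self.grad = 0.0
        # Each entry is (parent_node, local derivative of this node w.r.t. that parent).
        self.parents = parents

    def __add__(self, other):
        return Node(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        return Node(self.value * other.value,
                    ((self, other.value), (other, self.value)))

    def backward(self):
        # Order nodes so that every node comes after its parents,
        # then propagate gradients from the output back to the inputs.
        order, seen = [], set()

        def visit(node):
            if id(node) not in seen:
                seen.add(id(node))
                for parent, _ in node.parents:
                    visit(parent)
                order.append(node)

        visit(self)
        self.grad = 1.0
        for node in reversed(order):
            for parent, local in node.parents:
                parent.grad += local * node.grad


x, y = Node(3.0), Node(4.0)
z = x * y + x      # build the graph for z = x*y + x
z.backward()       # fills in x.grad and y.grad
print(z.value, x.grad, y.grad)
```

A minimizer would then read the `grad` fields of the inputs to take a descent step.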

Face Alignment
2D and 3D Face alignment library built using PyTorch


Yahoo – Vespa
An engine for low-latency computation over large data sets. It stores and indexes your data such that queries, selection and processing over the data can be performed at serving time.



Weekly BigData & ML Roundup – Sep. 21, 2017


The fast.ai deep learning library, lessons, and tutorials

TensorFlow Tutorials
TensorFlow Tutorials with YouTube Videos

Evolving Snakes
Snakes from the classical game are controlled by neural networks and evolve using a genetic algorithm.

TagSpace TensorFlow
TensorFlow implementation of Facebook TagSpace

DeepLearning Benchmark
Playing with various deep learning tools and network architectures

Interactive Real-Time Visualization for Streaming Data


A minimalist tree plotting library using toyplot graphs

The new architecture of co-computation for data processing and machine learning.

Cat Classifier
An experiment to visualize a trained logistic regression model as graph plots

FAIR Sequence-to-Sequence Toolkit
Facebook AI Research Sequence-to-Sequence Toolkit written in PyTorch

A platform to build deep learning models online


Torch implementation of various types of GAN (e.g. DCGAN, ALI, Context-encoder, DiscoGAN, CycleGAN)

A general-purpose neural model for efficient learning of entity embeddings for solving a wide variety of problems

Volumetric Regression Network
Torch7/MATLAB code for “Large Pose 3D Face Reconstruction from a Single Image via Direct Volumetric CNN Regression”


A flexible neural network library for Node.js and the browser

TensorFlow API for .NET languages



Weekly BigData & ML Roundup – Sep. 14, 2017


Awesome Pytorch List
A comprehensive list of PyTorch-related content on GitHub, such as different models, implementations, helper libraries, tutorials, etc.

Awesome AI Security
A curated list of AI security resources.

Technical Book on Deep Learning
This note presents, in a technical though pedagogical way, the three most common forms of neural network architectures: Feedforward, Convolutional and Recurrent.

A toolkit for controlling Euro Truck Simulator 2 with Python to develop self-driving algorithms.


Deep Learning Model Convertors
Converters for moving deep learning models between different deep learning frameworks and software.

Open Neural Network Exchange (ONNX) is the first step toward an open ecosystem that empowers AI developers to choose the right tools as their project evolves

An open source web application that helps researchers, students and data scientists to create, collaborate and participate in various AI challenges organized around the globe

A JavaScript WebGL Framework for Data Visualization

AutoML Service
Deploy AutoML as a service using Flask

Lexicon Rainbow
A minimal data visualization module between a single ordinal scale and a single linear scale with a built-in GUI


TensorFlow GANs Comparison
Implementations of (theoretical) generative adversarial networks and comparison without cherry-picking

PyTorch ActorCriticRL
PyTorch implementation of the DDPG algorithm for continuous-action reinforcement learning problems.

A recurrent unit that can run over 10 times faster than cuDNN LSTM without loss of accuracy (Training RNNs as Fast as CNNs)

Deep Recommender
Deep learning for recommender systems


TensorFlow Agents
Efficient Batched Reinforcement Learning in TensorFlow

An open-source NLP research library, built on PyTorch.

A modern, lightweight, performant and tunable OpenCL BLAS library written in C++11, designed to leverage the full performance potential of a wide variety of OpenCL devices from different vendors.



Weekly BigData & ML Roundup – Sep. 7, 2017


Piano AI
Realtime piano learning and accompaniment from a Raspberry-Pi-powered AI

Word Ordering
Can neural networks order a scramble of words correctly?

Deep-Learning BootCamp
A nonprofit, community-run, 5-day Deep Learning Bootcamp


Process authoring tool for Apache Flink


PyTorch implementation of the method described in “Voice Synthesis for in-the-Wild Speakers via a Phonological Loop”.

TensorFlow TransX
Holographic Embeddings implementation in Tensorflow


Bidirectional LSTM-CRF for Sequence Labeling. Easy-to-use and state-of-the-art performance.

