Weekly BigData & ML Roundup – Mar. 23, 2017

LLVM for Data Processing
When Apache Spark launched Tungsten, there were hints that LLVM might be incorporated into the data processing pipeline, both to make use of modern CPU features and to benefit from LLVM's highly optimized machine code for a performance boost.

LLVM appears again with Weld, a new code generation project for data analytics. The project claims that TensorFlow, Spark, and NumPy can be accelerated by up to 30x with just a few Weld operations!

Interestingly, Matei Zaharia is on its contributor list.


Algorithmic Trading Pipeline
Algorithmic Trading Pipeline for Online Betting Markets

Blackbird Bitcoin Arbitrage: a long/short market-neutral strategy


Weld is a runtime and language for accelerating data analytics frameworks

Utilities for BigQuery, such as downloading a table or query result to CSV, NDJSON, Excel, Google Sheets, or a new table, using iterators for a low memory footprint.
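
To illustrate why iterators keep memory usage low, here is a minimal sketch using only the Python standard library. It does not use the linked project's API or the BigQuery client: `fake_query_rows`, `write_ndjson`, and `write_csv` are hypothetical names invented for this example, and the row generator merely stands in for a paged query result.

```python
import csv
import io
import json
from typing import Dict, Iterable, Iterator


def fake_query_rows(n: int) -> Iterator[Dict[str, object]]:
    """Hypothetical stand-in for a paged query result: yields one row at a time."""
    for i in range(n):
        yield {"id": i, "name": f"row-{i}"}


def write_ndjson(rows: Iterable[Dict[str, object]], out) -> int:
    """Write one JSON object per line; only the current row is held in memory."""
    count = 0
    for row in rows:
        out.write(json.dumps(row) + "\n")
        count += 1
    return count


def write_csv(rows: Iterable[Dict[str, object]], out) -> int:
    """Write rows as CSV, taking the header from the first row seen."""
    writer = None
    count = 0
    for row in rows:
        if writer is None:
            # The header is only known once the first row arrives.
            writer = csv.DictWriter(out, fieldnames=list(row.keys()))
            writer.writeheader()
        writer.writerow(row)
        count += 1
    return count


if __name__ == "__main__":
    buffer = io.StringIO()
    written = write_ndjson(fake_query_rows(3), buffer)
    print(written, "rows written")
    print(buffer.getvalue(), end="")
```

Because both writers consume the rows lazily, the full result set is never materialized as a list; swapping the in-memory buffer for an open file gives the same behavior on disk.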

Generating Vectors for DBpedia Entities via Word2Vec and Wikipedia Dumps

ETL Starter Kit
Extract, Transform, Load (ETL) refers to a process in database usage and especially in data warehousing.
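
For readers new to the term, the three stages can be sketched in a few lines of standard-library Python. This is an illustrative assumption, not code from the linked starter kit: the sample data, the `payments` table, and the function names `extract`, `transform`, `load`, and `run_etl` are all made up for this example.

```python
import csv
import io
import sqlite3

# Hypothetical raw input: one row has stray whitespace, one has a missing amount.
RAW_CSV = "name,amount\nalice, 10\nbob,\ncarol,7\n"


def extract(text):
    """Extract: read raw rows from a CSV source."""
    yield from csv.DictReader(io.StringIO(text))


def transform(rows):
    """Transform: drop rows without an amount, normalize names, cast types."""
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        yield (row["name"].strip().title(), int(amount))


def load(records, conn):
    """Load: write the cleaned records into a warehouse-style table."""
    conn.execute("CREATE TABLE IF NOT EXISTS payments (name TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO payments VALUES (?, ?)", records)
    conn.commit()


def run_etl(text, conn):
    """Chain the three stages; rows stream through without intermediate lists."""
    load(transform(extract(text)), conn)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    run_etl(RAW_CSV, connection)
    for record in connection.execute("SELECT name, amount FROM payments ORDER BY name"):
        print(record)
```

Keeping each stage as a separate function makes it straightforward to replace the CSV source or the SQLite target without touching the cleaning logic.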

Facebook Visdom
A flexible tool for creating, organizing, and sharing visualizations of live, rich data. Supports Torch and NumPy.

Vivitics Node (VNode)
A workbench for Data Science powered by Jupyter and Docker


Mozilla DeepSpeech
A TensorFlow implementation of Baidu’s DeepSpeech architecture

Official implementation of “Learning to Discover Cross-Domain Relations with Generative Adversarial Networks”


Machine Learning Model Server

UC Berkeley Ray
An experimental distributed execution engine

