Amédée d'Aboville

PCA On Large Matrices: You don't need Spark.

October 11, 2016 · 7 minute read · Tags: data science, pca, spark

The other day I found this post on the Domino Data Science blog that covers calculating a PCA of a matrix with 1 million rows and 13,000 columns. That is pretty big as far as PCA usually goes. They used a Spark cluster on a 16-core machine with 30GB of RAM, and it took them 27 hours. I read up a bit on PCA and realized you can do PCA on large (several-billion-element) matrices much faster, and without any Big Data tech like Spark, by using better algorithms and more RAM.

Some Data Processing

March 26, 2015 · 8 minute read · Tags: curl, etl, football, ruby, unix

This post is from my old blog. It’s about a weekend project where I downloaded a bunch of football match data and did some light analysis of it. I had further plans to use it for “Machine Learning” and try my hand at a prediction engine, but I didn’t get that far. Sadly, I couldn’t get the pictures back. It’s a great example of where I was 4 years ago and reminds me of the progress I’ve made.

Indistinguishable From Magic Has to Die

November 10, 2014 · 4 minute read · Tags: Emerson, magic, society

It’s inaccurate and immoral to think of our hacks as magical.

Why Hackers Should Study Communications

October 22, 2014 · 4 minute read · Tags: McLuhan, hackers, media, systems

We have advanced models of tech; we need advanced models of society to match.

© 2017