24 Feb
Hive is a data warehouse system built on top of Hadoop. I’ve been experimenting with it for the past few days. Using the Thrift service, I’ve been able to drive it from PHP. Here’s what I’ve done to get it going:
Launching a Cluster
Using the EC2 scripts, I launched a cluster of Hadoop servers on EC2. It’s [...]
Posted in Uncategorized by: gary.richardson
1 Comment
18 Feb
I’ve been playing around with Hadoop recently. It’s pretty slick. The EC2 scripts make it incredibly easy to set up and work with. HDFS is pretty neat too. I’ve been working with 10 GB data sets, and moving the data around and working with it isn’t painful.
I was curious as to how much free space was [...]
Posted in Uncategorized by: gary.richardson
No Comments
13 Feb
Darren Barefoot has a post about the Yellow Pages (phone books, not NIS).
A few years back, my wife and I moved from a basement suite we were renting to our current apartment. Since we had never gotten our own phone book, I remember the first time the Yellow Pages showed up.
One day, I was packing in [...]
Posted in Uncategorized by: gary.richardson
1 Comment