mindstorms: NoSQL: Pig Latin and JSON on Amazon Elastic Map Reduce

Pig Latin and JSON on Amazon Elastic Map Reduce

I wanted to leverage the power of Hadoop's distributed data processing framework without having to learn everything about setting up Hadoop, without having to learn how to write map reduce jobs, and … (this could go on for a while, so I'll just stop here). For all these reasons, I chose to use Amazon's Elastic Map Reduce infrastructure and Pig. I will talk you through how I was able to do all this (take my log data stored on S3, which is in compressed JSON format, and run queries against it) with a little help from the Pig community and a lot of late nights. I will also provide an example Pig script detailing a little about how I deal with my logs (which are admittedly slightly abnormal).
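To make the idea of querying compressed JSON logs a bit more concrete, here is a minimal local sketch in Python. It is not the Pig script from the original post and it does not touch S3 or Elastic Map Reduce. It assumes the logs are gzip files with one JSON object per line, and the field name `action` used in the usage note is purely hypothetical. The query it runs is a plain group-and-count, roughly the kind of aggregation a GROUP followed by COUNT expresses in Pig.

```python
import gzip
import json
from collections import Counter


def count_by_field(lines, field):
    """Count records per value of `field`.

    Assumes one JSON object per line. Blank lines, malformed lines,
    and records without the field are skipped.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # malformed log line, skip it
        if not isinstance(record, dict) or field not in record:
            continue
        value = record[field]
        # Nested values are not hashable, so use their JSON text as the key.
        if isinstance(value, (dict, list)):
            value = json.dumps(value, sort_keys=True)
        counts[value] += 1
    return counts


def count_by_field_in_gzip(path, field):
    """Run the same count over a gzip-compressed log file on local disk."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return count_by_field(handle, field)
```

For example, `count_by_field_in_gzip("logs.json.gz", "action")` would return a `Counter` mapping each `action` value to the number of log records carrying it. The file name and field are placeholders, and a real run over S3-sized data is exactly where Pig on Elastic Map Reduce takes over from a single-machine loop like this.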

tags: MapReduce

via NoSQL databases
