The naive approach that consists in getting all the required data at the Client in order to apply locally some processing should be limited in a distributed setting to trivial tasks operating on a tiny subset. There are two fundamentals reasons for that. First, this generates a lot of network exchanges, consuming without necessity a lot of resources and sometimes leading to unacceptable response time. Second, centralizing all the information then processing it, simply misses all the advantages brought by a powerful cluster of hundreds or even thousands machines. The lesson is simply: When you deal with BigData, the data center is your computer.
via NoSQL databases