Do Androids Browse (For Electric Sheep)?

The movie Blade Runner is based on a Philip K. Dick novel entitled “Do Androids Dream of Electric Sheep?”

Perhaps some new questions should be added to this classic…

In an interesting example of science fiction becoming reality, a group of researchers is now creating a sort of world wide web for the robots of the world. Whether or not androids dream, they may soon be able to use social networks for robots, and use public, internet-accessible resources to get their day-to-day work done. The initiative is called “RoboEarth”.

I believe this sort of technological evolution is the wave of the future. It represents a promising confluence of cloud computing, distributed architecture, big data, Hadoop-style MapReduce, supercomputing, ubiquitous internet connectivity, and the every-device-has-an-IP-address promise of IPv6. It would be nice if my next Roomba didn’t have to relearn the floorplan of my house, but could simply download the knowledge that the older model laboriously developed. I’ll bet that over the next decade, the market will discover hundreds of variations on that theme.

I just hope we’re smart enough to stop before robots start frittering away their time clicking cows on Facebook… :-)

Big Data In Motion

I’ve been at Cloud Expo this week, listening to lots of industry hoopla about building cloud-centric apps, managing clouds, purchasing hardware for clouds, buying private clouds from public cloud providers, and so forth.

Photo credit: aquababe (Flickr)

One interesting decision made by the organizers of the conference was to bring “big data” under the same conference umbrella. There’s a whole track here about big data, and it gets mentioned in almost every presentation.

And I’ve sensed a shift in the wind.

A few years ago, and even a few months ago, “big data” was all about mining assets in a data warehouse. You accumulated your big data over time. It sat in a big archive, and you planned to analyze it. You spun up Hadoop or used some other MapReduce-style tool to crunch for days or weeks until you achieved some analytical goal.

What I’m hearing now is an acknowledgement that an important use case for big data–perhaps the most important use case–has little to do with data at rest. Instead, it recognizes that you’ll never have time to go back and sift through a vast archive; you have to notice trends by analyzing data as it streams past and disappears into the bit bucket. The data is still big, but the bigness has more to do with volume/throughput, and less to do with cumulative size.
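To make "analyzing data as it streams past" a bit more concrete, here is a minimal sketch in Python. It keeps a running mean and standard deviation using Welford's one-pass method and flags values that stray far from what it has seen so far, without ever storing the stream. The names `RunningStats` and `flag_outliers`, the 3-sigma threshold, and the warm-up count are all illustrative assumptions on my part, not anything presented at the conference.

```python
import math
from typing import Iterable, Iterator


class RunningStats:
    """One-pass mean and standard deviation (Welford's method).

    Memory use is constant no matter how many values stream past.
    """

    def __init__(self) -> None:
        self.n = 0        # number of values seen so far
        self.mean = 0.0   # running mean
        self.m2 = 0.0     # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    @property
    def stddev(self) -> float:
        # Sample standard deviation; defined as 0.0 until two values exist.
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0


def flag_outliers(stream: Iterable[float],
                  threshold: float = 3.0,
                  warmup: int = 10) -> Iterator[float]:
    """Yield values more than `threshold` standard deviations from the
    running mean. Each value is examined once and then forgotten.

    `threshold` and `warmup` are arbitrary illustrative defaults.
    """
    stats = RunningStats()
    for x in stream:
        if (stats.n >= warmup and stats.stddev > 0
                and abs(x - stats.mean) > threshold * stats.stddev):
            yield x
        stats.update(x)


if __name__ == "__main__":
    readings = [10.0, 11.0] * 10 + [100.0]
    print(list(flag_outliers(readings)))
```

The point of the sketch is the shape of the computation: each reading is looked at exactly once, the summary is updated, and the reading is discarded.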

This has interesting implications. Algorithms that were written on the assumption that you can corral the data set under analysis need to be replaced by ones based on statistical sampling; exactness needs to give way to fuzziness.
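One classic example of an algorithm built on statistical sampling is reservoir sampling, which keeps a fixed-size uniform random sample of a stream whose total length is unknown. The sketch below is my own illustration (the function name `reservoir_sample` and its parameters are assumptions, not from any particular library); it shows how you can answer approximate questions about a stream while holding only `k` items in memory.

```python
import random
from typing import Iterable, List, Optional, TypeVar

T = TypeVar("T")


def reservoir_sample(stream: Iterable[T], k: int,
                     rng: Optional[random.Random] = None) -> List[T]:
    """Return up to k items chosen uniformly at random from `stream`.

    The stream is read once, and only k items are held at any time.
    """
    rng = rng or random.Random()
    reservoir: List[T] = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) replaces a random slot with probability k/(i+1).
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


if __name__ == "__main__":
    # Estimate the mean of a million-value stream from a 1,000-item sample.
    sample = reservoir_sample(range(1_000_000), k=1000)
    print(sum(sample) / len(sample))
```

The estimate printed at the end will not match the exact mean of the stream, and it will differ from run to run. That is the trade the paragraph above describes: you give up exactness in exchange for never having to hold the whole data set.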

Interestingly, I think this will make computer-driven data analysis much more similar to the way humans process information. As I’ve said elsewhere, when faced with a difficult design problem, a smart question to ask is: how does Mother Nature solve it?