The case of Big Data and Artificial Intelligence

Buzzwords like Big Data, Artificial Intelligence and Machine Learning are everywhere. Why do we see so much interest in them right now?

Why are they like the London Tube?

The changes in data analytics in recent years are a bit like personal transport in London at the end of the 19th century. A couple of years earlier, everybody could walk across the city within 30 minutes, but as the city grew and distances got longer, new modes of transportation were needed. With enough resources, one could travel through the city by carriage, but this solution didn't work for everyone who wanted to get across. One great solution was the first underground trains, but the lines were built and operated by individual companies, and one could not easily change between them. While this improved travel times immensely, people still faced problems when switching lines. Eventually, all lines were merged into a single company, with just one great visual overview of how to get where, as shown in the well-known image above. And, of course, there was just a single, low fare for the whole journey.

Data science a couple of years back was exactly like walking through London on foot. You could do it at first, but the more data and data sources you had to manage, the harder it got. We had tools and programming languages, but we were missing an ecosystem to manage all of it at a reasonable price. While we are not yet at the stage where we have a lovely tube map of easy-to-manage solutions, the main building blocks are in place, and some (proprietary) solutions already offer everything along the data science process. Compared to the JavaScript ecosystem, with tools like npm or React and dozens of related frameworks for testing and so on, we lack the ability to switch between frameworks or ideas while staying in the same programming language. Both R and Python offer most of the capabilities needed, but either might lack the one library the current project requires, and switching between the two proves not to be that easy. Cloud solutions like MS Azure Machine Learning Studio, on the other hand, usually lock you into a single programming language or workflow, but they offer the whole process in a single tool with affordable pricing.