Let’s dive into Data Science

We are happy to announce that we have opened a new line of business: Data Science! We would like to give you a little insight into the much "hyped" world of Artificial Intelligence (AI) and Data Science. Data scientists, whose profession has been called the sexiest job of the 21st century, are the ones extracting (business) value from data.

You can find many different definitions of Data Science and AI, and there is no agreement on them yet. Here we try to give our view on the subject.

We would like to describe Data Science with the picture below (borrowed from Rafal Lukawiecki).
Data Science

Data Science is the intersection of Big Data, Data Wrangling, Machine Learning, and Statistics. It is a collection of methods for extracting knowledge and insights from structured and unstructured data.

The notion of AI is much harder to explain or visualize with a picture. When speaking of AI, we must mention machine learning. In practice, when we talk about AI, we often actually mean machine learning. Machine learning refers to systems that can automatically learn and improve from experience without being explicitly programmed for the task. This means learning from data examples, and the approaches can be classified into supervised learning, unsupervised learning, and reinforcement learning. Machine learning is always predictive: it gives you a result with some probability.

In supervised learning, machine learning algorithms learn from labeled data. Data is labeled when each example comes with the value we are trying to predict. For example, when we want to train a machine learning algorithm to identify cats, we feed it a lot of pictures of cats with the label "cat". The algorithm crunches through all the pixels of the pictures and starts to understand which combinations of pixels may indicate a cat. When shown a new picture, the algorithm analyses its pixels and makes a prediction based on previous experience, for example that the probability of a cat being in the picture is 82%. As you can imagine, in traditional software development it would be impossible to preprogram all the pixel combinations needed to recognize cats in pictures.
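To make the idea a bit more concrete, here is a minimal sketch in Python. It is not a real image recognizer: we assume each picture has already been reduced to two made-up scores between 0 and 1 (a hypothetical "ear score" and "whisker score"), and we train a tiny logistic regression on a handful of labeled examples. The feature names, sample values, and training settings are our own assumptions for illustration only.

```python
import math

def predict_proba(weights, bias, features):
    """Return the model's estimated probability that the picture shows a cat."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(samples, labels, epochs=500, learning_rate=0.5):
    """Fit a logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(samples, labels):
            error = predict_proba(weights, bias, features) - label
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias

# Hypothetical labeled data: [ear_score, whisker_score], label 1 = "cat".
samples = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.95],   # cats
           [0.1, 0.2], [0.2, 0.1], [0.3, 0.15]]   # not cats
labels = [1, 1, 1, 0, 0, 0]

weights, bias = train_logistic(samples, labels)

# A new, unseen picture: the model answers with a probability, not a certainty.
cat_probability = predict_proba(weights, bias, [0.85, 0.85])
print(f"Probability of a cat: {cat_probability:.0%}")
```

A real system would learn from the raw pixels of many thousands of pictures, typically with a neural network, but the principle is the same: labeled examples go in, and a probability comes out.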

Unsupervised learning algorithms are used on data that is not labeled. Unsupervised learning is used to find hidden patterns or outliers in data. For example, in clustering, you feed the algorithm a lot of data related to your customers and the algorithm divides customers into relevant groups based on the data. In fraud detection, unsupervised learning can crunch through huge volumes of data and find irregularities that may point to fraud.
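As a rough illustration of clustering, the sketch below groups a few customers with a bare-bones k-means loop in plain Python. The customer figures (yearly spend and visits per month) are invented, and picking evenly spaced data points as starting centroids is a simplification we chose so that the example is repeatable.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iterations=10):
    """Group points into k clusters; returns (labels, centroids)."""
    # Simplified, deterministic start: take evenly spaced points as centroids.
    step = len(points) // k
    centroids = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        labels = [min(range(k), key=lambda c: squared_distance(p, centroids[c]))
                  for p in points]
        # Move each centroid to the mean of the points assigned to it.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids

# Hypothetical customers: (yearly spend in EUR, visits per month). No labels given.
customers = [(200, 2), (220, 3), (180, 1), (950, 12), (1000, 15), (900, 11)]

labels, centroids = kmeans(customers, k=2)
print(labels)  # -> [0, 0, 0, 1, 1, 1]
```

Nobody told the algorithm which customers belong together; it found the two groups from the data alone. In a real project you would normally scale the features first and rely on an established library rather than a hand-written loop.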

In reinforcement learning, algorithms interact with an environment by taking actions and receiving feedback on them, either an error or a reward. For example, algorithms are trained to play games with reinforcement learning. They start by trying all the actions the game allows and learn through feedback which actions help to reach the goal.
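The feedback loop can be sketched with a toy "game" of our own invention: a corridor of five squares where the player starts on the left and is rewarded only for reaching the rightmost square. The sketch uses tabular Q-learning, one common reinforcement learning technique; the corridor, the reward, and all parameter values are assumptions made for illustration.

```python
import random

N_STATES = 5            # squares 0..4 in a hypothetical corridor
GOAL = N_STATES - 1     # reaching the rightmost square ends the game
MOVES = (-1, +1)        # action 0 = step left, action 1 = step right

def step(state, action):
    """Apply an action; return the next state and the reward received."""
    next_state = min(max(state + MOVES[action], 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward

def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Learn action values (a Q-table) by trial, error, and reward."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            # Explore at random sometimes, and whenever both actions look equal.
            explore = rng.random() < epsilon or q[state][0] == q[state][1]
            action = rng.randrange(2) if explore else q[state].index(max(q[state]))
            next_state, reward = step(state, action)
            # Nudge the estimate toward reward plus discounted future value.
            target = reward + gamma * max(q[next_state])
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q

q_table = train()
policy = ["left" if row[0] > row[1] else "right" for row in q_table[:GOAL]]
print(policy)
```

At the start the player wanders randomly. After enough games, the learned values favor stepping right on every square, even though nobody programmed that rule in.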

AI and machine learning are not new: these concepts have been around since the 1960s. As can be seen from the previous description, machine learning needs a lot of data and a lot of computing power to process it. The picture below helps to explain why AI and machine learning are getting so much attention lately.
Data growth

As the costs of storage and processing power have plummeted, we have seen exponential data growth. Machine learning now makes it possible and cost-effective to tackle problems once deemed too complex to solve effectively with traditional programming.

How Data Science and AI are changing our world deserves another blog post (or two), so we will sum it up only very briefly. Advances in:

  1. Machine vision and hearing
  2. Natural language processing
  3. Reasoning, problem-solving
  4. Predicting

are fundamentally changing how we approach our problems and needs in almost every domain. Every day there is news of AI outperforming humans at something. Already 37% of organizations involved in the Gartner 2019 CIO Survey have implemented AI. 80% of all new technologies are founded on AI.

We at Net Group believe that data is the most valuable resource for your business. Contact us by e-mail at hannes.hansalu@netgroup.com and let’s make your data shine like gold!