Let us agree on one thing: we’ve all heard of Machine Learning, whether in the news, at school or at the company we work for. It is widely believed to be driving the fourth industrial revolution, after the steam engine, science-based supply chains and digitalization.
However, perhaps you have never felt that you fully understood its potential or the domains it can be applied to.
In this post, I’m going to define machine learning, present its three categories and finally walk through its domains of application with some concrete examples.
By the end of the article, you’ll be able to talk about artificial intelligence with confidence.
What exactly is machine learning?
Machine Learning is a field of artificial intelligence (AI) that studies systems capable of learning from past experience, that is, from data, in order to adapt to a changing environment.
This means that deterministic rules are no longer programmed by hand as was traditionally done, but rather learned from the given input, often called training data. These rules need to capture the important information in the data while generalizing well, as the goal is not to predict the past but the unknown future.
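To make the contrast concrete, here is a toy sketch (all numbers invented) where a simple decision rule, a threshold, is learned from training data instead of being hard-coded by a programmer:

```python
# Toy training data (invented): transaction amounts with fraud labels.
amounts = [10, 25, 40, 900, 1200, 1500]
labels  = [0,  0,  0,  1,   1,    1]   # 0 = legitimate, 1 = fraud

def learn_threshold(xs, ys):
    """Pick the candidate threshold that best separates the two classes."""
    best_t, best_acc = None, -1.0
    for t in xs:
        acc = sum((x >= t) == bool(y) for x, y in zip(xs, ys)) / len(xs)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

threshold = learn_threshold(amounts, labels)
print(threshold)  # 900: the rule is learned from the data, not hand-coded
```

The same program would learn a different threshold from different data, which is exactly the point: the logic adapts to the input rather than being fixed in advance.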
We have to distinguish three different categories of machine learning: supervised learning, unsupervised learning and reinforcement learning.
Supervised learning is used when the response variable we want to predict is contained in the training data. The algorithm learns a function that links each sample’s characteristics (often called features or predictors) to the response variable (often called the label or target).
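As a minimal sketch of this idea (the two-feature data below is invented), a one-nearest-neighbour classifier predicts the label of whichever training sample is closest to the new point:

```python
# Invented training set: each sample is (feature_1, feature_2) with a label.
X_train = [(1.0, 1.0), (1.2, 0.8), (4.0, 4.2), (4.5, 3.9)]
y_train = ["A", "A", "B", "B"]

def predict(x):
    """Label a new point with the label of its nearest training sample."""
    dists = [((x[0] - a) ** 2 + (x[1] - b) ** 2, i)
             for i, (a, b) in enumerate(X_train)]
    _, nearest = min(dists)          # smallest squared distance wins
    return y_train[nearest]

print(predict((1.1, 0.9)))  # A: closest to the first cluster
print(predict((4.2, 4.0)))  # B
```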
Unsupervised learning is used when the purpose is to find previously unknown patterns without pre-existing labels. It is mostly applied when labeling is expensive or impossible, or when the goal is to reduce the data’s complexity, a task known as dimensionality reduction. The most famous techniques are principal component analysis and cluster analysis.
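As a toy sketch of cluster analysis (one-dimensional data invented for illustration), here is the core loop of k-means with two clusters, no labels involved:

```python
# Minimal k-means sketch on invented 1-D data, k = 2 clusters.
points = [1.0, 1.5, 1.2, 8.0, 8.3, 7.9]

def kmeans(xs, centers, steps=10):
    for _ in range(steps):
        # Assignment step: attach each point to its closest center.
        clusters = [[], []]
        for x in xs:
            clusters[min(range(2), key=lambda j: abs(x - centers[j]))].append(x)
        # Update step: move each center to the mean of its cluster.
        centers = [sum(c) / len(c) if c else centers[j]
                   for j, c in enumerate(clusters)]
    return centers

print(kmeans(points, [0.0, 10.0]))  # roughly [1.23, 8.07]
```

The algorithm discovers the two groups on its own; no one ever told it which point belongs where.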
Reinforcement learning follows a different philosophy. It is based on four main actors: the environment, the agent, the action and the reward. The agent has to learn which actions to take in the environment to collect the highest reward.
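This loop between agent, environment, action and reward can be sketched with tabular Q-learning on a toy three-cell corridor (the environment, rewards and hyperparameters below are all invented for illustration): the agent starts in the middle cell and learns that moving right reaches the rewarded cell.

```python
import random

# Toy corridor: states 0, 1, 2; reaching state 2 yields reward 1.
ACTIONS = [-1, +1]          # move left / move right
Q = {(s, a): 0.0 for s in range(3) for a in ACTIONS}
alpha, gamma, eps = 0.5, 0.9, 0.1

random.seed(0)
for _ in range(200):        # episodes
    s = 1                   # start in the middle cell
    while s != 2:
        # epsilon-greedy: mostly exploit the best known action, sometimes explore
        a = (random.choice(ACTIONS) if random.random() < eps
             else max(ACTIONS, key=lambda a: Q[(s, a)]))
        s2 = max(0, min(2, s + a))
        r = 1.0 if s2 == 2 else 0.0
        # Q-learning update: move Q towards reward + discounted best next value
        best_next = 0.0 if s2 == 2 else max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
        s = s2

# After training, the greedy action in the middle cell is "move right".
print(max(ACTIONS, key=lambda a: Q[(1, a)]))  # 1
```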
Domains of application
Fraud detection has been an important task in the banking and insurance industries as fraud causes unnecessary and costly operations. It encompasses activities that prevent money or assets from being obtained illegally.
Fraud comes in different forms: stealing credit cards, forging an identity to sell someone else’s property, or claiming insurance payouts under false pretenses (a faked death, exaggerated losses, etc.).
Detection was traditionally done by human analysts; however, with the growing number of people committing such illegal activities in the digital era, the problem can no longer be tackled without automation. This is where machine learning comes to the rescue.
Let’s illustrate this with a concrete example. Lending Club is the largest US peer-to-peer lending company, based in San Francisco. It enables borrowers to request personal loans of $1,000 to $40,000, with a standard period of three years. Investors then earn money from the interest generated by the loans they choose from the listings. In this process, it is important that investors are confident the loans will be repaid, and the platform should investigate more closely the people who may commit fraud.
In this context, we have two types of people: payers and delinquent debtors. This is a machine learning classification problem, as we have two classes (0 for payers and 1 for delinquent debtors) and the goal is to use the characteristics of each borrower to predict their class.
Information about borrowers matters because it is the only thing the model sees. A model is at most as good as the data it was trained on, so one shouldn’t expect much if the collected data is scarce or ambiguous.
Good characteristics (or predictors) to consider here include the loan amount, the interest rate on the loan, the issue date, employment title, employment length, homeownership, annual income, the purpose of the loan, the number of delinquencies in the past two years, and the months since the last delinquency.
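To make the classification concrete, here is a minimal sketch of a logistic regression trained by gradient descent on just two of the predictors above (loan amount and annual income). The borrower data and numbers are entirely invented for illustration; a real project would use a library and many more features.

```python
import math

# Invented toy data: [loan_amount_k, annual_income_k] per borrower.
X = [[5, 90], [8, 120], [35, 30], [40, 25], [10, 80], [38, 28]]
y = [0, 0, 1, 1, 0, 1]   # 1 = delinquent debtor, 0 = payer

# Scale features to comparable ranges before training.
X = [[a / 40.0, b / 120.0] for a, b in X]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w, b, lr = [0.0, 0.0], 0.0, 0.5
for _ in range(2000):                  # gradient descent on the log-loss
    for xi, yi in zip(X, y):
        p = sigmoid(w[0] * xi[0] + w[1] * xi[1] + b)
        err = p - yi                   # gradient of log-loss w.r.t. the logit
        w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
        b -= lr * err

# A high loan amount with low income should score as likely delinquent.
print(sigmoid(w[0] * (39 / 40.0) + w[1] * (27 / 120.0) + b) > 0.5)  # True
```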
Forecasting is what everyone would like to be able to do: predict the future. Let us agree on one thing: it would be amazing if we could predict events’ outcomes as we do with the weather. Imagine you are a supply chain manager with the capacity to predict customer demand: there would be neither shortages nor excess inventory.
Let’s illustrate this with a concrete example, as we did above. Intermarché is a large chain of supermarkets in France with 2,328 stores. It offers a “pick & collect” service that it wants to make as efficient as possible. Forecasting customers’ orders is crucial, as it helps store managers improve the store layout and anticipate demand.
This task may be dealt with using two different philosophies: consider it as a time series problem or as a regression problem.
As a time series problem, one can use statistical methods like ARIMA, which stands for Auto-Regressive Integrated Moving Average. It is a forecasting technique that projects the future values of a series based entirely on its own inertia. The auto-regressive and moving-average parts assume the data is stationary, meaning the series remains at a fairly constant level over time; the “integrated” part differences the series to remove a trend. If a strong trend remains after differencing, ARIMA won’t perform well, so it is advised to remove it beforehand.
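To give a feel for the mechanics, here is a toy sketch (invented data) of the “integrated” and “auto-regressive” pieces: we difference a trending series to make it stationary, then fit an AR(1) coefficient by least squares. The moving-average part is omitted for brevity, and a real application would use a library such as statsmodels rather than this hand-rolled fit.

```python
# Invented series: linear trend plus an alternating wobble around it.
series = [10 + 2 * t + (0.5 if t % 2 == 0 else -0.5) for t in range(20)]

# "I" step: first-order differencing removes the linear trend,
# leaving a (roughly) stationary series.
diff = [b - a for a, b in zip(series, series[1:])]

# "AR(1)" step: fit x_t ≈ m + phi * (x_{t-1} - m) by least squares.
m = sum(diff) / len(diff)
c = [x - m for x in diff]
phi = sum(a * b for a, b in zip(c, c[1:])) / sum(a * a for a in c[:-1])

# One-step forecast on the differenced scale, mapped back to the level.
forecast = series[-1] + m + phi * c[-1]
print(round(phi, 2), round(forecast, 2))  # -0.99 50.39 (true next value: 50.5)
```

The negative phi captures the alternating wobble, and adding the predicted difference back onto the last level recovers a forecast close to the true continuation of the series.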
As a regression problem, one can use as predictors the characteristics of the stores and statistical indicators about the past (e.g. moving average, standard deviation).
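For the regression framing, the past of the series can be summarised into such statistical predictors. A small sketch with invented daily order counts and a three-day window:

```python
# Invented daily order counts for one store.
orders = [12, 15, 11, 18, 20, 17, 22, 25]

def make_samples(series, window=3):
    """Turn a series into (features, target) rows for a regressor."""
    samples = []
    for i in range(window, len(series)):
        w = series[i - window:i]
        mean = sum(w) / window
        std = (sum((x - mean) ** 2 for x in w) / window) ** 0.5
        samples.append({"mean": mean, "std": std, "last": w[-1],
                        "target": series[i]})
    return samples

first = make_samples(orders)[0]
print(round(first["mean"], 2), first["target"])  # 12.67 18
```

Each row's features (rolling mean, rolling standard deviation, last value) could then be combined with store characteristics and fed to any regression model.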
Facebook has also developed its own forecasting tool, called Prophet. You can check it out!
In today’s society, people of different nationalities speaking different languages have to interact in complex environments, and good communication is key to those collaborations. Traditionally, translation was done by human translators who generally offer their services for high fees. However, with the growing number of multilingual organizations, automated translation is a must.
There are two categories of machine translations: SMT (Statistical Machine Translation) and NMT (Neural Machine Translation).
SMT is based on statistical models derived from the analysis of bilingual text corpora.
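As a toy illustration of the statistical idea (the probabilities below are invented, not estimated from a real corpus), a word-by-word decoder simply picks the most probable target word for each source word:

```python
# Toy phrase table with invented probabilities P(target word | source word),
# of the kind an SMT system would estimate from a bilingual corpus.
phrase_table = {
    "chat": {"cat": 0.9, "chat": 0.1},
    "noir": {"black": 0.7, "dark": 0.3},
    "le":   {"the": 1.0},
}

def translate(sentence):
    """Word-by-word decoding: pick the most probable target word."""
    return " ".join(max(phrase_table[w], key=phrase_table[w].get)
                    for w in sentence.lower().split())

print(translate("le chat noir"))  # the cat black
```

A real SMT system also scores candidates with a target-language model and a reordering model, which is what would turn “the cat black” into the fluent “the black cat”.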
NMT uses neural network architectures to learn the mechanisms present in the training data, which is composed of paired documents (a document in the source language and the same document in the target language). In general, the architecture follows an encoder/decoder paradigm, where the encoder learns a hidden representation of the source sentence and the decoder uses it to predict the sentence in the target language.
Please check out our PyTorch tutorial on Neural Machine Translation!