Why machine learning matters?

by Frank Bitan — Posted on May 01, 2018

Machine learning, and related advances like deep learning, have enabled computers to acquire tacit knowledge by training on large numbers of sample inputs, learning from data instead of being explicitly programmed. Machine learning methods are now being applied to vision, speech recognition, language translation, and other capabilities that not long ago seemed out of reach for computers but are now approaching or surpassing human levels of performance in a number of domains.

As its domain of applications continues to expand, machine learning (ML) is raising serious concerns about its impact on automation and the future of work. In “What can machine learning do? Workforce implications,” an article recently published in Science, MIT professor Erik Brynjolfsson and CMU professor Tom Mitchell explore this question by analyzing which tasks are particularly suitable for ML, as well as its expected impacts on the workforce and the economy.

Which tasks are most suitable for machine learning?

Machine learning systems are not equally suitable for all tasks. They have been most successful when applied with supervised learning and deep learning algorithms, which require very large amounts of carefully labelled training data (e.g., cat, not-cat). While ML systems are very effective in such domains, the authors remind us that they are significantly narrower and more specialized than humans. There are many tasks for which they’re completely ineffective given the current state of the art.
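To make the idea of learning from labelled examples concrete, here is a minimal sketch of supervised learning in plain Python. It is not taken from the Science article: the nearest-centroid method, the function names, and the two toy features are all assumptions chosen for illustration, and real systems use far larger data sets and far richer models.

```python
from collections import defaultdict


def train_centroids(examples):
    """Learn one mean feature vector (centroid) per label from (features, label) pairs."""
    sums = {}
    counts = defaultdict(int)
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
        for i, value in enumerate(features):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def squared_distance(centroid):
        return sum((a - b) ** 2 for a, b in zip(centroid, features))

    return min(centroids, key=lambda label: squared_distance(centroids[label]))


# Hypothetical, made-up features: (ear pointiness, whisker length).
training_data = [
    ((0.9, 0.8), "cat"),
    ((0.8, 0.9), "cat"),
    ((0.2, 0.1), "not-cat"),
    ((0.1, 0.3), "not-cat"),
]

centroids = train_centroids(training_data)
print(predict(centroids, (0.85, 0.7)))  # cat
```

Nothing in the code states what a cat is. The decision rule comes entirely from the labelled pairs, which is why the quantity and quality of the labels matter so much.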

Brynjolfsson and Mitchell identify eight key criteria that help distinguish tasks that are suitable for ML from those where ML is less likely to be successful.

Learning a function that maps well-defined inputs to well-defined outputs. Such functions include classification (e.g., labelling images of specific animals or estimating the probability of cancer from medical records) and prediction (e.g., the likelihood of defaulting on a loan application). The learned functions capture statistical correlations without necessarily capturing causal effects.

Large (digital) data sets exist or can be created containing input-output pairs. In general, the bigger the training data sets, the more accurate the learning. One of the key features of deep learning algorithms is that, unlike classic analytic methods, their performance in many domains does not seem to level off beyond a certain data size.

The task provides clear feedback with clearly definable goals and metrics. “ML works well when we can clearly describe the goals, even if we cannot necessarily define the best process for achieving those goals.” ML is particularly powerful when there are specific, system-wide performance metrics (e.g., get the most points in a video game, optimize the overall traffic flow of a city) and such metrics can be incorporated in the training data.

No long chains of logic or reasoning that depend on diverse background knowledge or common sense. “ML systems are very strong at learning empirical associations in data but are less effective when the task requires long chains of reasoning or complex planning that rely on common sense or background knowledge unknown to the computer.” ML does well in situations that require quick reactions and provide quick feedback, such as video games. It does less well in situations where the right action depends on the context established by multiple previous events.

No need for detailed explanation of how the decision was made. Explaining to a human the reasoning behind a particular decision or recommendation made by a machine learning algorithm is quite difficult, because its methods (subtle adjustments to the numerical weights that interconnect its huge number of artificial neurons) are so different from those used by humans.

A tolerance for error and no need for provably correct or optimal solutions. ML algorithms derive their solutions from statistics, assigning probabilities to the different options they evaluate. It’s rarely possible to train them to 100% accuracy. Even the best ML systems make errors, as do the best humans, so it’s important to be aware that they’re not perfect.

The phenomenon or function being learned should not change rapidly over time. “In general, ML algorithms work well only when the distribution of future test examples is similar to the distribution of training examples.” If the function changes rapidly over time, retraining is typically required, which in turn means acquiring new training data.

No specialized dexterity, physical skills, or mobility required. ML systems have already surpassed human levels of performance in a number of tasks. However, while the digital AI brains of robots are doing quite well, their physical capabilities are still quite clumsy compared to humans, especially when dealing with unstructured tasks and environments.
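The seventh criterion above, that the function being learned should not change rapidly, can be illustrated with a small sketch. This is an assumption-laden toy, not something from the Science article: the one-dimensional threshold model, the helper names, and the cutoff values 5 and 8 are all hypothetical. A model is fitted while the true rule is "positive above 5", then scored after the rule has shifted to "positive above 8".

```python
def label_with(rule_cutoff, xs):
    """Label each input 1 if it exceeds the (hypothetical) true cutoff, else 0."""
    return [(x, 1 if x > rule_cutoff else 0) for x in xs]


def fit_threshold(examples):
    """Learn a cutoff halfway between the largest negative and smallest positive input."""
    positives = [x for x, y in examples if y == 1]
    negatives = [x for x, y in examples if y == 0]
    return (max(negatives) + min(positives)) / 2


def accuracy(threshold, examples):
    """Fraction of examples where the thresholded prediction matches the label."""
    correct = sum(1 for x, y in examples if (1 if x > threshold else 0) == y)
    return correct / len(examples)


inputs = list(range(11))            # the integers 0 through 10

train = label_with(5, inputs)       # world at training time: positive above 5
threshold = fit_threshold(train)    # learned cutoff: 5.5

drifted = label_with(8, inputs)     # world later on: positive above 8

acc_before = accuracy(threshold, train)    # 1.0 on data like the training data
acc_after = accuracy(threshold, drifted)   # 8 of 11 correct after the change

# Retraining on freshly labelled data restores performance.
retrained = fit_threshold(drifted)         # 8.5
acc_retrained = accuracy(retrained, drifted)

print(acc_before, round(acc_after, 2), acc_retrained)
```

The old model is not wrong about the data it saw; the world moved. Recovering the lost accuracy required newly labelled examples, which is the cost the criterion points to.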

The Science article includes fairly elaborate supplementary materials to help evaluate what the current generation of ML systems can and cannot do.