In a recently published paper, you question the ‘bigger is better’ paradigm in artificial intelligence. Can you explain why this approach is problematic?
It has several negative consequences. First, it compromises the economic viability of AI: costs are rising, so only applications that generate a lot of revenue remain viable. The ecological impact also grows, for example through the energy consumed.
In the long term, this implies a choice between developing the technology at the expense of our living environment (some are talking about significantly increasing electricity production) and restricting it to a minority.
Furthermore, the race for size brings an incessant need for more data. It is becoming impossible to control what goes into our models: pornographic images, copyrighted material, personal data.
Finally, this arms race is leading to a concentration of power. Economically, very few players control the value chain, and that control rests above all on the astronomical computing resources required: chip manufacturing and the leasing of cloud computing power.
Read: Hype, Sustainability, and the Price of the Bigger-is-Better Paradigm in AI (Gaël Varoquaux, Alexandra Sasha Luccioni, Meredith Whittaker)
In particular, you point out that the focus on large-scale AI models comes at the expense of important applications such as health and education. Can you explain why?
There are many reasons. First, in many of these application domains there is no magic wand that simply predicts better: you have to take the specificities of the field into account.
Then there are questions of profitability: in these fields the revenue margins for IT are currently low, unlike in online advertising, which funds AI. Finally, there is an opportunity cost: the effort spent improving the ‘natural’ applications of large models is effort not spent on these domains.
At the event organised by Scaleway, you will be talking about sustainable AI. Can you explain how these two concepts are linked and why the ‘bigger-is-better’ model poses a problem for the sustainability of AI?
AI is consuming more and more resources, energy in particular. Today it accounts for a small share of our overall consumption, but since the use of AI keeps growing, this is very problematic.
It's a classic example of a technology where, no matter how efficient we make it, total consumption keeps rising because usage grows faster than efficiency improves. This is Jevons' paradox, which explains why world coal consumption has kept increasing over the last 200 years: more efficient steam engines made coal cheaper to use, so more of it was burned.
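A back-of-the-envelope sketch of this dynamic in Python; the efficiency gain and usage growth figures are purely illustrative assumptions:

```python
# Illustrative arithmetic for Jevons' paradox (all numbers are assumptions).
# Total consumption = energy per query * number of queries.
energy_per_query = 1.0   # arbitrary baseline units
queries = 1.0            # arbitrary baseline volume

efficiency_gain = 3.0    # suppose each query becomes 3x more efficient...
usage_growth = 10.0      # ...while cheaper, more capable AI drives 10x more usage

before = energy_per_query * queries
after = (energy_per_query / efficiency_gain) * (queries * usage_growth)

print(f"total consumption before: {before:.2f}")  # 1.00
print(f"total consumption after:  {after:.2f}")   # 3.33, higher despite efficiency
```

Efficiency improved threefold, yet total consumption more than tripled, because usage grew faster than efficiency.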
How can smaller, specialised models be an alternative to larger models in terms of performance and environmental impact?
The argument in favour of the large model is that it is a generalist that can be specialised to many different tasks. But often this isn't necessary: what you want is a model that is good at a narrow task, and a specialised model does that better. And in many cases the biggest model isn't the best one, because a model that is too big can amplify the noise in the data instead of capturing the signal.
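A minimal sketch of that last point, using polynomial degree as a stand-in for model size (the data here are synthetic and the setup is illustrative, not taken from the paper):

```python
# Oversized models amplifying noise: polynomial regression on noisy data,
# with the polynomial degree standing in for model capacity.
# Requires numpy and scikit-learn.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(60, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.3, size=60)  # signal + noise

for degree in (1, 3, 9, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    mse = -cross_val_score(model, X, y, cv=5,
                           scoring="neg_mean_squared_error").mean()
    print(f"degree {degree:2d}: cross-validated MSE = {mse:.3f}")
```

Typically the mid-sized model wins: beyond the capacity the task actually needs, the extra parameters start fitting the noise and held-out error grows.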
Do you think we could achieve a balance between computing power and energy efficiency in the future?
Things will improve, of course, and we'll have increasing computing power at a limited energy cost.
The problem is that AI is intended to be everywhere and for everyone. It may well become a basic technology, indispensable to economic and social life, like the internet or mobile phones. On the trajectory of the large models, I don't see how efficiency gains can offset the growth in usage: this is Jevons' paradox again. That's why we need to stop glorifying big models and put more effort into specialised ones.
On 7 November, Gaël Varoquaux will take part in the ‘AI sustainability and specialised models: balancing power and efficiency’ round table organised by Scaleway. Find out more about the event.