An old machine learning truism states that the more complex and larger a model, the more accurate its predictions – up to a point.
If you're working in ML disciplines like natural language processing, it's the massive BERT and GPT models whose accuracy makes practitioners green with envy.
However, the excitement fades when it comes to making these models work in production, as their sheer size makes deployments a real struggle. Not to mention the cost of setting up and maintaining the infrastructure necessary to move from research to production.
Reading this, avid followers of computing trends may recall the emergence of serverless computing a few years ago.
The approach promised great computational capabilities that could automatically scale up and down to meet changing demand while keeping costs low. It also freed teams from the burden of looking after their own infrastructure, since it was mainly a managed offering.
Well, serverless hasn't gone away since then, and at first glance it seems like an almost ideal solution. Dig deeper, however, and limitations on things like memory and deployment package size prevent it from being a straightforward option. Still, interest in the combination of serverless computing and machine learning is growing – and with it, the number of people working on ways to tailor BERT models and co to vendor specifications and so facilitate serverless deployments.
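To get a feel for why package size is the sticking point, here's a minimal sketch. The 250 MB figure is AWS Lambda's documented unzipped deployment-package limit; the model sizes are rough, illustrative on-disk figures rather than exact measurements:

```python
# Sketch: do common transformer artifacts fit a serverless deployment limit?
# 250 MB is AWS Lambda's unzipped deployment-package limit; the model sizes
# below are rough, illustrative on-disk figures, not exact measurements.

LAMBDA_UNZIPPED_LIMIT_MB = 250

def fits_serverless(size_mb: float, limit_mb: float = LAMBDA_UNZIPPED_LIMIT_MB) -> bool:
    """True if an unzipped deployment package of size_mb fits within the limit."""
    return size_mb <= limit_mb

models_mb = {
    "bert-base (fp32)": 440,                  # vanilla BERT checkpoint
    "distilbert-base (fp32)": 265,            # distilled, still over the limit
    "distilbert-base (int8-quantized)": 70,   # distilled and quantized
}

for name, size in models_mb.items():
    verdict = "fits" if fits_serverless(size) else "too large"
    print(f"{name}: ~{size} MB -> {verdict}")
```

As the sketch suggests, a vanilla BERT checkpoint blows past the limit on its own, before any dependencies are packaged – which is why techniques such as distillation and quantization come up whenever serverless deployment is discussed.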
To learn more about these developments, we'll be welcoming Marek Šuppa in Episode Four of our MCubed Web Lecture Series for Machine Learning Practitioners on December 2. Šuppa set out last year to explore ways of modifying sentiment analysis and classification models so that they can be used in serverless environments – without the feared degradation in performance.
In his talk, Šuppa will cover his team's use case, the considerations that led them to look at serverless, the issues they encountered along the way, and the approaches they found most promising for getting their deployments to production-appropriate latency levels.
As usual, the webcast on December 2 will start at 1100 GMT (1200 CET) with a roundup of machine learning news related to software development, giving you a few minutes to settle in before we delve into the topic of deploying models in serverless environments.
We would love to see you there; we'll even send you a quick reminder on the day – just sign up here.