


Do AI Models and Systems Have to Come in All Shapes and Sizes? If so, Why?

Justin Potuznik

Wed, 24 Apr 2024 13:21:25 -0000


I was recently in a meeting with some corporate strategists. They noted that the AI market had become too fragmented post-ChatGPT and that they needed help defining AI. The strategists said there was too much confusion in the market, and we needed to help our customers understand and simplify this new field of technology. This led to an excellent discussion about general vs. generative AI, their different use cases and infrastructure needs, and why they need to be looked at separately. Then, reinforcing that this is top of mind for many, not two hours later I was talking with a colleague and almost the same question came up: why the need for different approaches to different types of AI workloads? Why are there no “silver bullets” for AI?

“Traditional” vs. LLM AI

There is a division in AI models. The market has settled on the terms ‘General’ vs. ‘Generative’ for these models, and the two types can be distinguished by their size as measured in parameters. Parameters are the learned weights a model uses to turn an input into probabilities over possible outputs. The models we used in past years have ranged in size from tens of millions of parameters (ResNet) to at most hundreds of millions (BERT). These models remain effective and still make up the majority of models deployed in production.
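
To make “parameters” a little more concrete, here is a minimal sketch, assuming PyTorch and torchvision are available (ResNet-50 is used only as a familiar example), that counts a model’s learnable weights:

    # Minimal sketch: counting the learnable parameters in a model.
    # Assumes the PyTorch and torchvision packages are installed.
    import torch
    from torchvision import models

    def count_parameters(model: torch.nn.Module) -> int:
        # Each parameter tensor holds many individual weights; numel() counts them.
        return sum(p.numel() for p in model.parameters())

    resnet = models.resnet50()  # a classic vision model in the tens of millions of parameters
    print(f"ResNet-50 parameters: {count_parameters(resnet):,}")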

The new wave of models, publicly highlighted by OpenAI’s GPT-3 and ChatGPT, shows a huge shift. ChatGPT clocks in at five billion to 20 billion parameters; GPT-3 is 175 billion parameters. GPT-4 is even more colossal, reportedly somewhere in the range of 1.5 to 170 trillion parameters, depending on the version. This is at the core of why we must treat various AI systems differently: in what we ask of them, in their infrastructure requirements, and in how we deploy them. To determine the final size and performance requirements of an AI model, you should factor in the token count as well. Tokens, in the context of LLMs, are the units of text that models use for input and output. Token count can vary from a few hundred for an LLM inference job to hundreds of billions for LLM training.
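
Tokens are easier to picture with a quick example. The following is a minimal sketch, assuming the Hugging Face transformers library and its bert-base-uncased tokenizer (chosen purely for illustration), that shows how a sentence becomes the token units a model consumes:

    # Minimal sketch: what "tokens" look like to a model.
    # Assumes the Hugging Face transformers package is installed.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

    text = "Why are there no silver bullets for AI?"
    tokens = tokenizer.tokenize(text)  # the sub-word pieces the model actually sees
    ids = tokenizer.encode(text)       # the integer IDs fed to the model

    print(tokens)
    print(len(ids))  # token count for this prompt (includes special tokens)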

Why the jump?

So, what happened? Why did model size suddenly jump by two or more orders of magnitude? Determinism. Previously, AI scientists were trying to solve very specific questions.

Let’s look at the example of an ADAS or self-driving car system. There is an image recognition model deployed in the car, and it is looking for specific things, such as stop signs. The deployed model determines when it sees a stop sign and follows a defined, limited set of rules for how to react. While smart in its ability to recognize stop signs in a variety of conditions (faded, snow covered, bent, and so on), it has a set pattern of behavior. The input and output of the model always match (stop sign = stop).
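
That fixed mapping from recognized input to action can be sketched in a few lines. The detect_objects() function and the label names below are hypothetical placeholders standing in for a real perception model, not an actual ADAS API:

    # Minimal sketch of a deterministic perception-to-action loop.
    # detect_objects() and the label names are hypothetical stand-ins.
    ACTIONS = {
        "stop_sign": "apply_brakes",
        "speed_limit_50": "cap_speed_at_50",
        "green_light": "proceed",
    }

    def detect_objects(frame):
        # Placeholder: a real system would run an image recognition model here.
        return ["stop_sign"]

    def react(frame) -> str:
        for label in detect_objects(frame):
            if label in ACTIONS:
                return ACTIONS[label]  # same recognized class -> same action, every time
        return "maintain_course"

    print(react(frame=None))  # always "apply_brakes" when a stop sign is detected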

LLM or generative systems must deal with both the problem of understanding the question (prompt) and the problem of generating the most appropriate response. This is why ChatGPT can give you different answers to the same input: it reruns the entire process, sampling each output token from a probability distribution, and even the smallest changes to the particulars of the input or the model itself will cause different outcomes. The outcomes of ChatGPT are not predetermined. This necessitates a much higher level of complexity, and that has led to the explosive growth of model size. The size explosion has also led to another oddity: nothing is a set size. Most generative models come in a range of sizes, with different versions optimized for specific focus areas.
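
To see that non-determinism in miniature, here is a sketch assuming the Hugging Face transformers library, with the small GPT-2 checkpoint standing in for a much larger generative model; because the output is sampled, two runs of the same prompt will usually differ:

    # Minimal sketch: a generative model samples its output, so reruns can differ.
    # Assumes the Hugging Face transformers package; GPT-2 stands in for a larger LLM.
    from transformers import pipeline

    generator = pipeline("text-generation", model="gpt2")

    prompt = "The biggest challenge in deploying AI infrastructure is"
    for run in range(2):
        out = generator(prompt, do_sample=True, max_new_tokens=20)[0]["generated_text"]
        print(f"Run {run + 1}: {out}")  # the two completions will usually differ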

So, what do we do?

As AI practitioners, we must recognize when to use different forms of AI models and systems. We must continue to monitor for further changes in the AI model landscape. We must endeavor to optimize models and use only the parts we need (one example of this is sketched below), because doing so will significantly reduce model size and improve the ease, speed, and cost effectiveness of AI system deployments. If your AI project team or company would like to discuss this, reach out to your Dell Technologies contact to start a conversation on how Dell Technologies can help grow your AI at any scale.
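
As one illustration of “using only the parts we need,” here is a minimal sketch assuming PyTorch; dynamic quantization is just one optimization technique among several (pruning and distillation are others), and the tiny model is purely illustrative:

    # Minimal sketch: dynamic quantization as one way to shrink a model.
    # Assumes PyTorch; the two-layer model is purely illustrative.
    import torch

    model = torch.nn.Sequential(
        torch.nn.Linear(1024, 1024),
        torch.nn.ReLU(),
        torch.nn.Linear(1024, 10),
    )

    # Store Linear weights as 8-bit integers instead of 32-bit floats for inference.
    quantized = torch.quantization.quantize_dynamic(
        model, {torch.nn.Linear}, dtype=torch.qint8
    )

    print(quantized)  # the Linear layers are now dynamically quantized versions

Shrinking weights from 32-bit floats to 8-bit integers cuts the memory footprint of the affected layers by roughly a factor of four, which is exactly the kind of reduction that makes deployments easier, faster, and cheaper.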

Author: Justin Potuznik

Engineering Technologist – High Performance Computing & Artificial Intelligence

Dell Technologies | ISG Chief Technology & Innovation Office

https://www.linkedin.com/in/jpotuz