The use cases and requirements introduced in this chapter serve as guidelines for developing the digital assistant architecture. This section describes the key tenets of that architecture and how they influenced the design decisions made to assemble this solution from Dell Technologies products, open-source implementations, and products from the Independent Software Vendors (ISVs) Pryon and UneeQ.
To facilitate integration into existing customer environments, the solution combines leading-edge commercial components with widely adopted open-source software. This approach applies to all layers of the system: from operating systems, middleware, and databases up to control planes, observability tools, and dashboards. Including open-source components eases integration into existing environments and lowers the learning curve when the solution is handed over to the operations team.
Where possible, the hardware infrastructure is chosen to yield the most cost-effective server/CPU/GPU/storage combination. The key goal of the performance benchmarking and sizing efforts is to enable an implementation team to make the right tradeoffs between alternatives. For instance, a GPU accelerator that offers a better ratio of performance to cost is favored over an alternative setup.
Every component of the architecture must be automatically deployable and manageable through a cloud-native, declarative approach. This is achieved with Kubernetes, an open-source technology for orchestrating and managing containerized deployments. Ensuring that each component can be deployed (using Helm charts) and managed by Kubernetes from a single point of control provides one common way of operating the entire solution.
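As a sketch of this declarative model, each component can be described by a Helm values file that is kept under version control and applied from the central control point with `helm upgrade --install`. All names, the registry, and the chart below are illustrative placeholders, not part of the documented solution:

```yaml
# values/rag-gateway.yaml -- illustrative Helm values for one solution component.
# Applied declaratively, e.g.:
#   helm upgrade --install rag-gateway <chart> -n assistant -f values/rag-gateway.yaml
replicaCount: 3                                   # multiple replicas behind a Service
image:
  repository: registry.example.com/rag-gateway    # assumed internal image repository
  tag: "1.4.2"
resources:
  limits:
    nvidia.com/gpu: 1                             # request a GPU for inference workloads
```

Because the desired state lives in files like this rather than in manual steps, the same command re-creates or updates the component identically in any environment.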
The overall deployment must be designed to provide High Availability (HA) or Disaster Recovery (DR). This requirement includes, but is not limited to, the Kubernetes control plane, log aggregation, telemetry, alerting, the image repository, Kafka, Redis, and other shared components.
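For the shared components listed above, Kubernetes expresses part of this HA requirement declaratively. A minimal sketch, assuming a Redis deployment labeled `app: redis` (an illustrative name), uses a PodDisruptionBudget to keep a minimum number of replicas running through voluntary disruptions such as node maintenance:

```yaml
# Illustrative PodDisruptionBudget: during voluntary disruptions (for example,
# draining a node for an upgrade), at least two Redis pods must stay available.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: redis-pdb
  namespace: assistant
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: redis
```

DR for stateful components additionally requires data-level measures (replication and backups) beyond what such scheduling constraints can provide.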
The solution uses standard observability frameworks such as Prometheus (for metric collection) and Grafana (for dashboards). Together, these frameworks provide a single point of monitoring for all hardware and software components in the overall solution.
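In a Kubernetes environment, Prometheus can discover both infrastructure and application targets through its built-in Kubernetes service discovery. The fragment below is a minimal sketch of a `prometheus.yml` scrape configuration; the job names are illustrative:

```yaml
# Sketch of Prometheus scrape configuration using Kubernetes service discovery.
scrape_configs:
  - job_name: "kubernetes-nodes"       # node-level hardware metrics (e.g. node-exporter)
    kubernetes_sd_configs:
      - role: node
  - job_name: "assistant-services"     # application metrics exposed by solution pods
    kubernetes_sd_configs:
      - role: pod
```

Grafana then uses this Prometheus instance as a data source, so dashboards for hardware and software components draw from one consistent metrics backend.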
Deployments of generative AI solutions can suffer from a lack of skills to run and evolve them as part of a continuous improvement process. To address this challenge, enterprises frequently procure such solutions as managed services, whether delivered on-premises or hosted by a managed service provider (MSP). The present solution enables this option by delivering a single control point, for which pre-existing integration into a managed services environment, such as the Dell Managed Services Platform (DMSP), can be used.