In this section, we delineate the validated scope of the GenAI solution. Any element not explicitly mentioned in this description is out of scope and was not validated as part of the solution.
GenAI use cases have been validated using the Hugging Face and MosaicML open-source libraries on the same solution architecture.
Hugging Face
Hugging Face provides the Transformers library, which we use to perform the general process of end-to-end LLM fine-tuning.
End-to-end fine-tuning requires several major components to produce a result that is of value:
- Compilation of a fine-tuning dataset
- As the first step in fine-tuning a pretrained LLM, users compile their own data in a prompt-response format, mirroring the example provided, so that the dataset meets the requirements of the model.
Note: Compiling the fine-tuning dataset is out of the scope of this solution; the dataset used is prepared in advance and placed in Dell APEX File Storage.
- Selection of pretrained model
- We chose foundation models from the Hugging Face Transformers library, namely OpenAI GPT, Google BERT, and Falcon-1B, for this LLM fine-tuning.
- Transfer of data and model from Dell APEX File Storage to the CPUs/GPUs of the Databricks compute cluster
- Data is retrieved from Dell APEX File Storage using the S3 protocol. The Databricks Spark service efficiently manages I/O activity, enabling advanced parallel operations (see the first sketch following this list). Alternatively, users can stream datasets for fine-tuning.
Note: The utilization of Streaming Dataset libraries is beyond the scope of this paper, as it primarily falls under the purview of Data Engineers or Data Scientists.
- Model fine-tuning runs on CPUs/GPUs
- The Hugging Face Transformers fine-tuning code automatically identifies the system's CPU and GPU configuration, ensuring proper distribution of data and model resources (a fine-tuning sketch follows this list).
Note: As this solution is a functional validation, we have evaluated only on CPUs.
- Evaluation of a fine-tuned model on unseen data.
- You can either set aside a portion of the initial dataset for validation during fine-tuning or run a separate evaluation task on fresh data, such as one of the many in-context learning (ICL) tasks, to assess the performance of the newly fine-tuned model.
- Model inference to generate useful output.
- Finally, you can submit fresh prompts to the fine-tuned model and review the responses to confirm their accuracy and usefulness (see the inference sketch following this list).
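To illustrate the data-transfer step, the following sketch shows one way to read the fine-tuning dataset from Dell APEX File Storage over the S3 protocol in a Databricks notebook, where spark and dbutils are the notebook's built-in objects. The endpoint, secret scope, bucket name, and file layout are assumptions for illustration only, not the validated configuration.

```python
# Illustrative only: the S3A endpoint, credential scope, bucket, and paths are assumed.
spark.conf.set("fs.s3a.endpoint", "https://apex-file-storage.example.com")           # assumed endpoint
spark.conf.set("fs.s3a.access.key", dbutils.secrets.get(scope="apex", key="access-key"))
spark.conf.set("fs.s3a.secret.key", dbutils.secrets.get(scope="apex", key="secret-key"))
spark.conf.set("fs.s3a.path.style.access", "true")

# Spark parallelizes the read across the cluster; collect to pandas for the
# single-node Hugging Face fine-tuning workflow.
df = spark.read.json("s3a://genai-datasets/finetune/train.jsonl")                    # assumed bucket/path
pandas_df = df.select("prompt", "response").toPandas()
print(pandas_df.head())
```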
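The next sketch outlines the fine-tuning step with the Hugging Face Transformers Trainer, which places the model on whatever CPUs or GPUs the cluster exposes. The base checkpoint (tiiuae/falcon-rw-1b), dataset path, prompt/response field names, and hyperparameters are illustrative assumptions rather than the exact configuration validated in this solution.

```python
import torch
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

base_model = "tiiuae/falcon-rw-1b"                 # assumed checkpoint; substitute the validated model
tokenizer = AutoTokenizer.from_pretrained(base_model)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(base_model)

# Prompt-response pairs compiled as described above (path is illustrative),
# with 10 percent held back for evaluation during fine-tuning.
raw = load_dataset("json", data_files="/dbfs/mnt/apex/finetune.jsonl", split="train")
splits = raw.train_test_split(test_size=0.1)

def tokenize(example):
    # Join each prompt and response into a single causal-LM training sequence.
    return tokenizer(example["prompt"] + "\n" + example["response"],
                     truncation=True, max_length=512)

tokenized = splits.map(tokenize, remove_columns=splits["train"].column_names)

args = TrainingArguments(
    output_dir="/dbfs/mnt/apex/finetuned-model",   # checkpoints written back to shared storage
    per_device_train_batch_size=2,
    num_train_epochs=1,
)

# The Trainer automatically places the model on available CPUs or GPUs.
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["test"],
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    tokenizer=tokenizer,
)
trainer.train()
trainer.evaluate()                                 # score the held-out split
trainer.save_model()                               # persist the fine-tuned weights and tokenizer
```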
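For evaluation and inference, the held-out split is scored by trainer.evaluate() in the sketch above, and fresh prompts can then be submitted to the saved model. The checkpoint path and prompt below are again assumptions for illustration.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

ckpt = "/dbfs/mnt/apex/finetuned-model"            # assumed output path from the fine-tuning run above
tokenizer = AutoTokenizer.from_pretrained(ckpt)
model = AutoModelForCausalLM.from_pretrained(ckpt)
model.eval()

# Submit a fresh prompt and inspect the response for accuracy and usefulness.
prompt = "Summarize the benefits of shared file storage for model fine-tuning:"
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```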
MosaicML
MosaicML also provides the Composer library, with which we perform the general process of end-to-end LLM training.
LLM training requires several major components to produce a result that is of value:
- Compilation of a fine-tuning dataset
- As the first step in training, we set up the fine-tuning data as synthetically generated train and test datasets using the datasets library.
- Selection of pretrained model
- The MosaicML training runtime provides a deep learning library built for NLP workloads and datasets. For this validation, we run only the resnet_56 training benchmark from the MosaicML Composer library (see the sketch following this list).
- Transfer of data and model from Dell APEX File Storage to the CPUs/GPUs of the Databricks compute cluster.
- Data is retrieved from Dell APEX File Storage using the S3 protocol. The Databricks Spark service efficiently manages I/O activity, enabling advanced parallel operations. Alternatively, users can stream datasets for fine-tuning.
Note: The utilization of Streaming Dataset libraries is beyond the scope of this paper, as it primarily falls under the purview of Data Engineers or Data Scientists.
- Model training on CPUs/GPUs
- The MosaicML Composer code automatically identifies the system's CPU and GPU configuration, ensuring proper distribution of data and model resources.
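As a minimal sketch of the Composer training flow, the example below trains a small residual network on synthetically generated train and test batches and lets Composer place the work on CPUs or GPUs. The torchvision resnet18 stands in for the resnet_56 benchmark where that benchmark model is not available in your Composer version, and the batch size, duration, and data shapes are assumptions for illustration only.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torchvision.models import resnet18
from composer import Trainer
from composer.models import ComposerClassifier

# Synthetic CIFAR-like train and test splits, standing in for the synthetically
# generated dataset described above (shapes and sizes are assumed).
def synthetic_split(num_samples):
    images = torch.randn(num_samples, 3, 32, 32)
    labels = torch.randint(0, 10, (num_samples,))
    return DataLoader(TensorDataset(images, labels), batch_size=64)

train_loader = synthetic_split(2048)
eval_loader = synthetic_split(256)

# Wrap a small residual network for classification; substitute the Composer
# resnet_56 benchmark model where it is available.
model = ComposerClassifier(resnet18(num_classes=10), num_classes=10)

trainer = Trainer(
    model=model,
    train_dataloader=train_loader,
    eval_dataloader=eval_loader,
    max_duration="1ep",
    device="gpu" if torch.cuda.is_available() else "cpu",  # run on GPUs when present, else CPUs
)
trainer.fit()
```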