Brief evaluation and findings

Conclusions

The Dell SOAS team’s evaluation corroborates academic findings that agentic workflows can outperform single-model, single-prompt approaches. In some instances, the workflow achieved results comparable to those of high-end models. However, reaching parity with larger models required substantial prompt engineering and workflow experimentation with SLMs, which underscores the challenge of applying these techniques to use cases that demand extensive business and domain knowledge.
Future work should focus on incorporating additional few-shot examples and using more advanced models to address the identified challenges. Further, implementing strict validation procedures, such as label-format checks and feedback loops, could mitigate issues with label consistency and adherence to formatting guidelines.
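
As an illustration of the validation idea above, the following is a minimal sketch of a label-format check combined with a feedback loop. It is not the team's implementation: the label set, the function names, and the `generate` callable (a stand-in for whatever model call the workflow uses) are all assumptions made for the example.

```python
from typing import Callable, Optional

# Hypothetical label set; the labels used in the evaluation are not given here.
ALLOWED_LABELS = {"billing", "hardware", "software"}


def validate_label(raw: str) -> Optional[str]:
    """Return the normalized label if it is in the allowed set, else None."""
    label = raw.strip().lower()
    return label if label in ALLOWED_LABELS else None


def classify_with_feedback(
    generate: Callable[[str], str],  # assumed stand-in for the model call
    prompt: str,
    max_attempts: int = 3,
) -> Optional[str]:
    """Ask for a label, re-prompting with corrective feedback on invalid output."""
    current_prompt = prompt
    for _ in range(max_attempts):
        raw = generate(current_prompt)
        label = validate_label(raw)
        if label is not None:
            return label
        # Feedback loop: tell the model what was wrong and restate the format.
        current_prompt = (
            f"{prompt}\n\nYour previous answer {raw!r} was not valid. "
            f"Reply with exactly one of: {', '.join(sorted(ALLOWED_LABELS))}."
        )
    # Give up after max_attempts so the caller can route the item elsewhere.
    return None
```

Returning `None` after a bounded number of attempts, rather than accepting a malformed label, keeps inconsistent labels out of downstream results and leaves the fallback decision (for example, manual review) to the caller.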
