In this validation use case, we performed image classification. The dataset consists of 1,310 images, each belonging to one of five vehicle type categories (sedan, hatchback, pick-up, SUV, and van). After training on these images, the model predicts the vehicle type shown in a new image.
Data ingestion—We used the publicly available VTID1 dataset. We uploaded the dataset to the H2O Driverless AI local repository.
Data preparation—The dataset is organized in directories whose names indicate the label of the images they contain. We split the dataset into training, validation, and test sets and loaded them into H2O Driverless AI. During experiment creation, we selected the label (the directory name) as the target column. Based on the data, H2O Driverless AI uses ImageVectorizer features for the model.
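The directory-per-label layout and the three-way split described above can be sketched as follows. This is a minimal illustration, not the procedure used in the validation: the 70/15/15 percentages, the fixed seed, the accepted file extensions, and the function name `split_by_label` are all assumptions, because the text does not state how the split was made.

```python
import random
from pathlib import Path

# Assumed image extensions; adjust to match the files in the dataset.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def split_by_label(root, percents=(70, 15, 15), seed=42):
    """Split a directory-per-label image dataset into three sets.

    Each subdirectory name under `root` is treated as the label of the
    images it contains. The split is done per label so that every class
    appears in every set. `percents` is (train, validation, test) and is
    an assumed ratio, not one taken from the text.
    """
    if sum(percents) != 100:
        raise ValueError("percents must sum to 100")
    rng = random.Random(seed)
    splits = {"train": [], "validation": [], "test": []}
    label_dirs = sorted(p for p in Path(root).iterdir() if p.is_dir())
    for label_dir in label_dirs:
        images = sorted(
            f for f in label_dir.iterdir()
            if f.suffix.lower() in IMAGE_SUFFIXES
        )
        rng.shuffle(images)
        pairs = [(str(f), label_dir.name) for f in images]
        # Integer arithmetic avoids floating-point rounding surprises.
        n_train = len(pairs) * percents[0] // 100
        n_val = len(pairs) * percents[1] // 100
        splits["train"] += pairs[:n_train]
        splits["validation"] += pairs[n_train:n_train + n_val]
        splits["test"] += pairs[n_train + n_val:]  # remainder goes to test
    return splits
```

Calling `split_by_label("path/to/dataset")` returns three lists of `(image_path, label)` pairs, which can then be written out in whatever form the import step expects.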
Model building—The following figure shows the model configuration. Because the objective is to obtain the best available model, we set accuracy and time to 10 and interpretability to 1 in the training settings. Because this use case is an image classification problem, for which neural networks generally perform better, we used the expert settings of the experiment to restrict the candidate models to TensorFlow- and PyTorch-based models.
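For readers who script experiments rather than use the web interface, the three settings above can be captured as follows. This is a sketch under assumptions: it presumes a client object shaped like the `driverlessai` Python client, whose `experiments.create` call is assumed here to accept `accuracy`, `time`, and `interpretability` as keyword arguments. Verify this against the client version in use. The helper names are hypothetical, and the expert settings that restrict the model types are omitted because the text does not list them.

```python
# Values taken from the text: accuracy 10, time 10, interpretability 1.
EXPERIMENT_SETTINGS = {
    "task": "classification",
    "accuracy": 10,
    "time": 10,
    "interpretability": 1,
}


def validate_dials(settings):
    """Check that each dial is an integer in the 1 to 10 range."""
    for dial in ("accuracy", "time", "interpretability"):
        value = settings[dial]
        if not isinstance(value, int) or not 1 <= value <= 10:
            raise ValueError(f"{dial} must be an integer from 1 to 10, got {value!r}")
    return settings


def launch_experiment(client, train_dataset, target_column, **overrides):
    """Start an experiment with the settings above.

    `client` is assumed to expose `experiments.create(...)` in the style
    of the driverlessai Python client; this is an assumption, not a
    confirmed signature.
    """
    settings = validate_dials({**EXPERIMENT_SETTINGS, **overrides})
    return client.experiments.create(
        train_dataset=train_dataset,
        target_column=target_column,
        **settings,
    )
```

Keeping the dial values in one dictionary makes it easy to rerun the same configuration or to lower `time` for a quick trial run.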
Figure 8. Model building for image classification use case
Model deployment – When the experiment is complete, you can download the scoring pipeline. A container is built in the same way as for the sentiment analysis use case. To predict the vehicle type for new data, send an API request containing the image array to the REST endpoint.
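A request of this kind might look like the following sketch, which uses only the Python standard library. The endpoint URL, the `"image"` key, and the overall payload shape are placeholders rather than the schema of the deployed scorer; replace them with whatever the container built from the scoring pipeline actually documents.

```python
import json
import urllib.request


def build_request(url, image_array):
    """Build a JSON POST request carrying the image array.

    The {"image": ...} payload shape is a hypothetical placeholder.
    """
    body = json.dumps({"image": image_array}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict_vehicle_type(url, image_array, timeout=30):
    """Send the request and return the decoded JSON response."""
    request = build_request(url, image_array)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `predict_vehicle_type("http://localhost:8080/score", pixels)` would post a nested list of pixel values to a hypothetical local endpoint and return the parsed response.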