To classify the appearance quality level of Fritillaria thunbergii, an F. thunbergii image dataset was constructed using the DigiEye system and labeled with an image annotation tool. Several statistical learning and object detection algorithms were trained and tested on this dataset. The results showed that the model trained with YOLO-X, a member of the YOLO (you only look once) series, performed comparatively better than the others. To optimize YOLO-X further for the characteristic features of the F. thunbergii dataset, a dilated convolution structure was embedded at the end of its backbone feature extraction network, since this could improve the model's sensitivity to dimensional features. The mean average precision (mAP) of the improved model rose to 99.01%; the average precision (AP) for the superfine, level one, level two, moth-eaten, mildewed, and broken F. thunbergii classes rose to 99.97%, 98.33%, 98.47%, 98.71%, 99.73%, and 98.85%, respectively; and the corresponding F1 scores (the harmonic mean of precision and recall) rose to 0.99, 0.92, 0.94, 0.97, 0.99, and 0.97, respectively. This modification enhanced the detection performance of the model without increasing the number of parameters or the computational complexity, and without major changes to the original model. This study provides a scientific basis for the subsequent construction of an F. thunbergii detection platform.
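
The statement that a dilated convolution can widen the receptive field without adding parameters can be illustrated with a minimal sketch. The code below is not the authors' implementation or the actual YOLO-X layer. It is a one-dimensional, pure-Python illustration, and the function names `effective_kernel_size` and `dilated_conv1d`, the kernel, and the input signal are all hypothetical choices made for this example. It assumes the usual definition in which a kernel of size k with dilation rate d spans k + (k - 1)(d - 1) input positions.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Input span covered by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1-D cross-correlation with a dilated kernel.

    Illustrative helper only: the kernel taps are applied to inputs that
    are `dilation` positions apart, so the weights are reused unchanged.
    """
    span = effective_kernel_size(len(kernel), dilation)
    output = []
    for start in range(len(signal) - span + 1):
        output.append(
            sum(w * signal[start + i * dilation] for i, w in enumerate(kernel))
        )
    return output


# Hypothetical example data: a 3-tap kernel and a short quadratic signal.
kernel = [1.0, 0.0, -1.0]
signal = [float(x * x) for x in range(8)]

dense = dilated_conv1d(signal, kernel, dilation=1)    # spans 3 inputs
dilated = dilated_conv1d(signal, kernel, dilation=2)  # spans 5 inputs

# The number of learnable weights is len(kernel) in both cases.
print(len(kernel), effective_kernel_size(3, 1), effective_kernel_size(3, 2))
print(dense)
print(dilated)
```

In this sketch both calls use the same three weights, while the dilated call covers five input positions instead of three. This is the sense in which dilation enlarges the receptive field at a fixed parameter count.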