Traditional evaluation metrics like accuracy can be misleading in the presence of class imbalance. For example, on a dataset where 95% of instances belong to the majority class, a model that always predicts that class achieves 95% accuracy while failing to identify a single minority class instance. Therefore, metrics such as precision, recall, F1-score, and AUC-ROC are more appropriate, because they measure how well the model identifies each class rather than rewarding agreement with the dominant one.
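As a minimal sketch of this effect (assuming scikit-learn is available; the synthetic 95:5 dataset, logistic regression model, and split parameters are illustrative choices, not taken from the text), the following compares accuracy against the imbalance-aware metrics:

```python
# Compare accuracy with precision, recall, F1, and AUC-ROC on an imbalanced dataset.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

# Synthetic binary problem with roughly a 95:5 class imbalance.
X, y = make_classification(n_samples=5000, n_features=20,
                           weights=[0.95, 0.05], random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=42)

model = LogisticRegression(max_iter=1000).fit(X_train, y_train)
y_pred = model.predict(X_test)
y_prob = model.predict_proba(X_test)[:, 1]  # scores for the positive (minority) class

# Trivial baseline that always predicts the majority class (label 0).
y_majority = [0] * len(y_test)

print("Majority-baseline accuracy:", accuracy_score(y_test, y_majority))
print("Model accuracy:   ", accuracy_score(y_test, y_pred))
print("Model precision:  ", precision_score(y_test, y_pred))
print("Model recall:     ", recall_score(y_test, y_pred))
print("Model F1-score:   ", f1_score(y_test, y_pred))
print("Model AUC-ROC:    ", roc_auc_score(y_test, y_prob))
```

The baseline's accuracy lands near the class prior (about 95%) even though it never flags a minority instance, whereas its minority-class recall would be zero; precision, recall, F1, and AUC-ROC for the trained model expose how much of that apparent performance is real.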