Data imbalance occurs when the instances of one class significantly outnumber those of another. In epidemiology, this often happens in studies of rare diseases or conditions, where the number of cases (positive instances) is much smaller than the number of non-cases (negative instances).
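To make the definition concrete, the following minimal sketch quantifies the imbalance in a binary-labelled dataset. The function name `imbalance_summary`, the 1/0 label coding (1 = case, 0 = non-case), and the counts of 20 cases and 980 non-cases are all hypothetical, chosen only for illustration and not taken from any real study.

```python
from collections import Counter


def imbalance_summary(labels):
    """Summarise class imbalance for binary labels (1 = case, 0 = non-case).

    Hypothetical helper for illustration; the label coding is an assumption.
    """
    counts = Counter(labels)
    n_cases = counts.get(1, 0)
    n_non_cases = counts.get(0, 0)
    total = n_cases + n_non_cases
    if total == 0:
        raise ValueError("labels must contain at least one 0 or 1")

    # Share of positive instances in the sample.
    case_fraction = n_cases / total
    # Number of non-cases per case; infinite if there are no cases at all.
    imbalance_ratio = n_non_cases / n_cases if n_cases else float("inf")

    return {
        "n_cases": n_cases,
        "n_non_cases": n_non_cases,
        "case_fraction": case_fraction,
        "imbalance_ratio": imbalance_ratio,
    }


# Made-up sample: 20 cases among 1,000 individuals.
labels = [1] * 20 + [0] * 980
summary = imbalance_summary(labels)
print(summary)
```

In this made-up sample, cases make up 2% of the data and each case is matched by 49 non-cases, which is the kind of skew the definition describes.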