The context is a situation where I am interested in predicting test results from class size. I have individual students' test results and their class sizes. I was warned not to calculate the average test result for each class (i.e. create a new variable class_test_average, and then use class_size to predict class_test_average). I was told that if I did, I could have a problem with “aggregation bias” and the “ecological fallacy.” But these concepts were explained to me in a rather vague way. I understood that the ecological fallacy concerns the inference that relationships at the macro level translate into the same relationships at the micro level. However, I did not understand aggregation bias at all. James, L. R. (1982).
Aggregation bias in estimates of perceptual agreement. Journal of Applied Psychology, 67(2), 219. To me, the term “bias” indicates that, by aggregating, I would systematically push results toward overestimating or underestimating the size of relationships. But I don't understand why that would happen.

The main drawback of using aggregated data is probably the inherent difficulty of drawing valid multilevel inferences from a single level of analysis. Alker has identified three types of erroneous inference that can occur when a researcher tries to generalize from one level of analysis to another. The individualistic fallacy is the attempt to infer macro-level (aggregate) relationships from micro-level (individual) relationships. This is the classic aggregation problem first studied by economists; according to Hannan [15, p. 5], it is an attempt to group observations on “behavioural units” in order to study the economic relationships that hold for sectors or economies as a whole. Cross-level fallacies can occur when, at the same level of analysis, conclusions are drawn from one subpopulation to another. The ecological fallacy, made famous by Robinson's work, is the opposite of the individualistic fallacy and involves inferences from higher levels of analysis to lower ones. Robinson showed that individual and ecological correlations are not necessarily equivalent, and that the latter would generally be larger than the former. Although the ecological fallacy has been widely debated and publicized, it remains a common error in studies that draw causal inferences.
It has long been known that the use of aggregated data can yield correlation coefficients that are considerably biased relative to their values at the individual level [10, 21], and Blalock has shown that regression coefficients can be biased as well. It is clearly wrong to assume that relationships that exist at one level of analysis necessarily have the same strength at another level. Estimates derived from aggregated data are valid only for the system of observational units used. The consequences of using potentially biased estimates of correlation and regression coefficients in place of the “true” micro-level estimates are most serious for the causal inferences drawn from statistical analyses.

The work refers to aggregation bias and the ecological fallacy explicitly only a little later. In fact, I plan to do multilevel modeling anyway, which should avoid both aggregation bias and the ecological fallacy.
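
To make the point about ecological versus individual correlations concrete for my class-size case, here is a minimal simulation sketch. Everything in it is an assumption for illustration and not taken from any of the cited works: the `simulate` and `pearson` helpers, the 200 classes, the class sizes of 15 to 35, the slope of -0.5 points per extra student, and the student-level noise with standard deviation 10. It compares the correlation between class_size and individual scores with the correlation between class_size and class_test_average.

```python
import random
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def simulate(n_classes=200, slope=-0.5, noise_sd=10.0, seed=0):
    """Simulate students nested in classes (all parameter values are hypothetical).

    Returns (student-level r, class-level r) for class size vs. test score.
    """
    rng = random.Random(seed)
    student_size, student_score = [], []      # one entry per student
    class_size, class_test_average = [], []   # one entry per class
    for _ in range(n_classes):
        size = rng.randint(15, 35)
        # Each score = baseline + class-size effect + individual noise.
        scores = [80.0 + slope * size + rng.gauss(0.0, noise_sd)
                  for _ in range(size)]
        student_size.extend([size] * size)
        student_score.extend(scores)
        class_size.append(size)
        class_test_average.append(sum(scores) / size)
    r_individual = pearson(student_size, student_score)
    r_class = pearson(class_size, class_test_average)
    return r_individual, r_class


r_individual, r_class = simulate()
print(f"student-level r: {r_individual:.2f}")
print(f"class-level r:   {r_class:.2f}")
```

Under these assumptions the class-level correlation comes out noticeably stronger than the student-level one, because averaging within a class removes most of the student-to-student noise while leaving the class-size signal intact. The aggregated correlation describes how well class size predicts class averages, not how well it predicts any individual student's score, which is the gap the ecological fallacy warns about.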