How PBRF Averages Penalise Departments with Research Grant Success


    Note added 30 May, 2007: The 2006 PBRF round used the same reporting methodology, with precisely the same flaws I described in my 2004 article below. Although new categories "C(NE)" and "R(NE)" were invented for "new entrants", these have been aggregated with ordinary "C" and "R" scores in deriving league tables by "subject" and "nominated academic unit". Thus on the basis of the 2006 round, the most successful physics department in New Zealand - the one that brings in the most actual PBRF dollars (as determined from the PBRF 2006 report, Appendix A, Tables A-70, A-71 and A-72), namely the Department of Physics and Astronomy at the University of Canterbury - scores "4th" in the "Physics subject area results" (Table A-34), precisely because its success with research grants means it hires more postdocs than its sister departments in other universities.


    An abridged version of this article (minus the abstract and footnotes to Table 1) was published in New Zealand Education Review, 19 May 2004.


    Abstract: It is demonstrated that the use of simple averages to compare PBRF scores of departments has led to erroneous representations of the data in the popular press and university publications.

    As an example, an analysis of four physics departments/sub-schools at Auckland, Canterbury, Otago and Victoria universities is undertaken. Using information from university calendars, PBRF scores are divided between continuing academic EFTs and other researcher EFTs, using very reasonable assumptions. The average score of the continuing academics in each of the four departments surveyed, at 5.6-5.7, is indistinguishable, while the averages for other researchers, at 0.9-1.3, are also in a similar ballpark.

    Where differences exist in the rankings given in the TEC tables, they are due to the proportion of "other researchers" in each department. Departments which win more research grants, and hire more junior research staff (postdoctoral fellows etc.) on these grants, are likely to score lower averages. Thus in this sample of four at least, a lower PBRF total average should be seen as a measure of distinction.

    While the problems indicated should not affect PBRF funding, which is based on total staff EFTs, it is argued that the PBRF reporting methodology, in terms of public perceptions generated by the TEC league tables, is fundamentally flawed.


    Imagine two departments, each with 10 full-time continuing academic staff who all score "B" in the PBRF exercise. However, there is one difference: department 1 has very little in the way of external research grants, whereas department 2 is very successful at attracting research funds and on these grants employs 5 very good postdoctoral fellows who, being promising early-career researchers with some publications, each score "C". In this case - leaving aside the important statistic of PhD thesis completions, not addressed in the measure - one department is clearly more successful than the other: it is department 2, with an average PBRF rating of 4.7, as opposed to department 1 with an average of 6.0. Due to research grant success, the conventional interpretation of the results is reversed!
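    For the record, the arithmetic behind these two averages, using the quality-score weights implied by the figures quoted throughout this article (A = 10, B = 6, C = 2, R = 0), is:

    \[
    \text{Dept 1: } \frac{10 \times 6}{10} = 6.0, \qquad
    \text{Dept 2: } \frac{10 \times 6 + 5 \times 2}{10 + 5} = \frac{70}{15} \approx 4.7 .
    \]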

    Although this example is hypothetical, effects of precisely this sort do distort the "subject" league tables which have been drawn up in the TEC report, and which have been widely quoted as defining who is "better" than whom. Such crude measures, which totally ignore the career profile of those averaged, are potentially damaging, as they can lead to quite false impressions.

    To those who might argue that a department which is successful at getting research grants would surely get more "A"s, I should point out two things. Firstly, the average score of an "A" academic employing one "C" postdoc is 6.0 (the same as two "B" researchers employing no one), while the average for an "A" academic who brings in more money to employ two "C" postdocs is lower, at 4.7, and so on. If one employs a very new PhD who ends up rated "R", it is simply bad luck. Thus research grant success clearly does lower the average, even if there are more "A"s to show for it.
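    With the same weights, the first point is explicit: one "A" plus one "C" averages the same as two "B"s, and each additional "C" postdoc drags the average further down:

    \[
    \frac{10 + 2}{2} = 6.0, \qquad
    \frac{10 + 2 + 2}{3} \approx 4.7, \qquad
    \frac{10 + 3 \times 2}{4} = 4.0, \;\ldots
    \]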

    The second point is that judging by the known results of many colleagues, a significant number of researchers who have been successful in winning Marsden grants as Principal Investigator, and who do employ postdocs, have been rated "B". In the Physical Sciences panel at least, it is clear that anyone not already a professor stood little chance of scoring an "A": the number of professors in the physics subject sample was actually rather greater than the number of A's awarded. Many mid-career researchers with international reputations were not awarded "A". One suspects there is a significant correlation between "A" and age.

    If one is to compare departments, then the most honest way to compare them is on the basis of the continuing academic staff only. It is the continuing academic staff who are able to apply for research grants, who lead the research programmes and who define the reputation of a department. In the table below I have therefore reanalysed the scores of my own subject area. In particular, I have taken the "top" four physics departments and accounted for continuing academic EFTs, using the 2003/2004 calendars of the institutions in question. Without access to individual scores I have made a simplifying but reasonably accurate assumption: that continuing academics, Lecturer and above, were responsible for the highest scores in each department, whereas postdocs, tutors and technicians (where these were included) were responsible for the lower scores. This will not be a one-to-one match in every case, but it is very close. [A sketch of the calculation follows; see Table 1 for the results.]
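    To make the assumption concrete, here is a minimal sketch of the reallocation in Python. The grade distribution and EFT split below are illustrative placeholders, not the actual TEC figures; the weights are those of the PBRF quality score as used above (A = 10, B = 6, C = 2, R = 0).

        WEIGHTS = {"A": 10, "B": 6, "C": 2, "R": 0}
        GRADE_ORDER = ["A", "B", "C", "R"]  # highest grade first

        def split_averages(efts_by_grade, continuing_efts):
            """Assign the highest-graded EFTs to continuing academics, the
            remainder to 'other researchers', and return each group's average."""
            score = {"continuing": 0.0, "others": 0.0}  # weighted score sums
            efts = {"continuing": 0.0, "others": 0.0}   # EFT totals
            remaining = continuing_efts
            for grade in GRADE_ORDER:
                e = efts_by_grade.get(grade, 0.0)
                c = min(e, remaining)   # EFTs assigned to continuing academics
                remaining -= c
                score["continuing"] += WEIGHTS[grade] * c
                efts["continuing"] += c
                score["others"] += WEIGHTS[grade] * (e - c)
                efts["others"] += e - c
            return {g: score[g] / efts[g] if efts[g] else 0.0 for g in score}

        # Illustrative department: 2 A's, 10 B's, 6 C's and 4 R's by EFT,
        # of which 14 EFTs are continuing academics.
        print(split_averages({"A": 2, "B": 10, "C": 6, "R": 4}, continuing_efts=14))
        # -> {'continuing': 6.0, 'others': 1.0}

    The greedy top-down assignment is exactly the "highest scores to continuing academics" assumption described above.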

    TABLE 1: PBRF averages in physics at four New Zealand universities

                      ---------- Declared EFTs ----------   ------- Average PBRF score -------
    University        Subject    Dept    Continuing  Others  Subject    Dept    Continuing  Others
                       total     total   academics            total     total   academics
    U Auckland          29.5     30.3      21.3        9.0     4.2       4.3       5.7       0.9
    U Canterbury        30.2     29.2      17.8       11.4     3.8       3.8       5.6       1.1
    U Otago             17.8     21.8      12.8        9.0     3.6       3.8       5.6       0.9
    Victoria U           9.5     12.5       9.5        3.0     5.1       4.6       5.7       1.3

      Notes: [1] Continuing academic staff EFTs, at Lecturer or above, have been gleaned from the 2003 and 2004 calendars. Where numbers increased from 2003 to 2004 (by one each at Auckland and Otago), in the absence of knowledge as to whether this was before or after the census date, 1 July 2003, I have averaged the extra academic at 0.5 EFT. Furthermore, information concerning fractional appointments is not given in university calendars, but can often be worked out from the TEC tables by comparing the grade distribution in the staff-number and EFT-weighted columns. As an example, where twelve individuals contribute 11.8 EFTs in category B in a unit, one such individual has to be at 0.8. In general I have made a best guess combining calendars and the subject and TEO nominated-unit tables in the TEC report. Where fractional EFTs of academics were unclear (e.g., for 1 person at Auckland and 2 at Otago, all in category C), full-time status of the relevant academic staff was assumed. Overall a combined error of up to 1.0 EFT might be expected, giving a typical error of 0.1-0.4 in each assigned PBRF average.
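      The fractional-appointment inference in the example just given amounts to:

      \[
      12 \times 1.0 - 11.8 = 0.2 \quad\Rightarrow\quad \text{eleven full-timers and one individual at } 0.8 \text{ EFT}.
      \]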

      [2] In listing "top" physics departments I do not include units with total EFTs less than 3 in the physics subject table (AUT, Lincoln, Waikato). One peculiarity of the TEC tables is that a university with no physics department can end up "at the top" because a single physicist is employed there. Massey University did return 11.5 EFTs in physics, but as its Institute of Fundamental Sciences lumps together chemists, physicists and mathematicians, and there is no separate heading for each discipline in its Calendar, it is impossible to reliably estimate the number of continuing academic staff EFTs for the physics sub-school.

      [3] Apparent discrepancies arise because "subject" areas do not correspond exactly to academic units. At Auckland one B researcher and one R researcher with a fractional appointment in the Physics Department did not nominate physics as their subject area, while a C researcher outside the Physics Department with a half-time appointment did nominate physics. At Canterbury there is likewise one C researcher outside the Department of Physics and Astronomy who nominated physics as subject. At Otago a total of 5 members of the Department of Physics (one A, one B, two C's, one R, and including two half-timers) did not identify with the subject physics. At Victoria, where physicists are part of the School of Chemical and Physical Sciences, there were 3 researchers in the School, one with a B and two with a C, who did not choose either physics or chemistry as their subject area. Engineering and Technology and Applied Mathematics seem to be likely subject areas for these 3 individuals, given VUW's tallies. I have assumed that those listed under "Physics" in the VUW Calendar include these 3 individuals. This would make the total declared EFTs of those listed under "Physics" in the VUW calendar 12.5 rather than 9.5, which seems likely as they do employ postdocs, and the chemistry subject figures tally with the chemists listed in the VUW calendar.

      [4] The "continuing academic" and "other researcher" EFTs are split by Department (or Physics sub-school in the case of VUW) rather than "subject" as these are the units listed in calendars, and also in most cases the financial units to which PBRF money will ultimately flow.

      Any small discrepancies that result from the assumptions used to derive the table are likely to be of minor significance compared to the huge variance in individual scores which will arise from the non-linear rounding method that converted individual numerical scores to letter grades in the first place.

      The figures in the table indicate that in all four universities the average score for continuing academic staff comes out essentially the same, at 5.6-5.7, the 0.1 difference being well within the margin of error of these "best estimates". Although this may at first appear surprising, what it actually says is that the majority of continuing academics rate "B", but that among continuing academics there are a few more C's than A's. Furthermore, it is interesting that the average scores for researchers other than continuing academics all come out in the same ballpark too, with Victoria very slightly ahead.
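      As a consistency check, recombining the estimated sub-averages, weighted by EFTs, reproduces the department totals of Table 1; for the two largest departments:

      \[
      \text{Auckland: } \frac{21.3 \times 5.7 + 9.0 \times 0.9}{30.3} \approx 4.3, \qquad
      \text{Canterbury: } \frac{17.8 \times 5.6 + 11.4 \times 1.1}{29.2} \approx 3.8 .
      \]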

      Given the similarity of scores, perhaps we should call it a four-way draw. But there are some clear distinctions between the departments: in terms of continuing academics there is a great difference in the size of the departments, and thus presumably also in the number of sub-fields in which each has excellence: Auckland is the largest, Victoria the smallest. These differences in research breadth are exacerbated by the fact that respectively 24% (Victoria) and 18% (Otago) of those in the two smallest departments do not do research in physics but in another subject area. Given that Otago and Victoria have no separate Engineering School, this is perhaps not surprising. The sample naturally splits into two groups of two: large physics departments at the two universities with engineering schools, and small departments at those without.

      Furthermore, if one looks at the ratio of "other researcher" to "continuing academic" EFTs, then Canterbury and Otago, at 64% and 70% respectively, clearly come out ahead, as compared with 42% for Auckland and 32% for Victoria. Is this an indication that the Canterbury and Otago physics departments are bringing in more research grants per academic staff member, and hiring proportionately more postdocs?
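      These ratios follow directly from the EFT columns of Table 1:

      \[
      \text{Otago: } \frac{9.0}{12.8} \approx 70\%, \quad
      \text{Canterbury: } \frac{11.4}{17.8} \approx 64\%, \quad
      \text{Auckland: } \frac{9.0}{21.3} \approx 42\%, \quad
      \text{Victoria: } \frac{3.0}{9.5} \approx 32\%.
      \]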

      Before rushing to conclusions we should note that there may be differences in the non-academic staff who were included in the survey at different institutions. Experimental research is generally carried out by large teams: technical staff are vital to these efforts and often contribute so much to the design of experiments that their names end up on research publications, even if they never lead the research and would not end up with an "A" or "B". There is a grey area as to whether such staff are PBRF-eligible; at Canterbury we included them, at other institutions probably not. At Canterbury the inclusion of these staff appears to have increased the number of R's: having a couple of research papers was not enough to be a "C", it turns out; some "peer esteem" was required as well.

      Despite the differences in counting technical staff, I still believe our Department is among the top when it comes to research grants per academic staff member. If that is true, then such research success ought to be reflected in numbers of graduate students. Unfortunately the TEC aggregated MSc and PhD thesis completions by TEO rather than by department. Independent surveys, such as the Australasia-wide one in the Physicist, do not distinguish between thesis and postgraduate coursework degrees. However, I do know from anecdotal evidence that Canterbury Physics and Astronomy has more thesis students than Auckland Physics. How we compare proportionately with the two smaller departments, at Victoria and Otago, I do not know. But in terms of a race between the two large physics departments at least, the position with regard to thesis students is clear. And if we really are out in front, is such excellence reflected in our undergraduate teaching? Well, last December at the Inaugural Australian Physics Competition at the ANU in Canberra, which was open to teams of third-year physics undergraduates from New Zealand universities as well as Australian ones, the winning team in all of Australasia was from Canterbury.

      The object of this exercise has not been to produce a statistic by which my Department comes out on top - no single statistic is accurate enough to encompass what it means to be "best". Given the discussion above, the two largest physics departments in the country, Auckland and Canterbury, may still have room to quibble about who is better. However, I do wish to express alarm at the poor methodology that has been used in analysing the PBRF results. Physics will not be the only subject affected. If comparisons are to be made with other countries, where "whole department" methodologies are used, the results of naive averaging could be especially misleading. In any case, "whole department" methodologies, which do not mix continuing academics with postdocs, are the only way one can really compare departments. A simple average over the scores of late-career professors and fresh postdocs is misleading, given that the most successful academics also hire many postdocs.

      Since actual funding is based on total staff EFTs, the results of the averaging described here should be financially neutral to departments, provided sensible models of fund distribution are used internally within universities. But the matter of public and peer perceptions is quite different, as many popular press reports have been based on fundamentally flawed interpretations of the TEC league tables.

      In 1979 in my last year of school in Palmerston North I asked a family friend, a physics lecturer at the local university, which university was the best for me if I wanted to study theoretical physics. That lecturer, Paul Callaghan, did not suggest his own university (then Massey) but Canterbury. Since all departments have research strengths in particular sub-fields, and since the nature of those strengths changes over time, there is no substitute for such personal advice. However, knowing that not all young New Zealanders can call on expert advice among their circle of acquaintances, I am concerned that prospective students might use the league tables to draw completely wrong conclusions about individual departments. Before the next PBRF round a better reporting methodology needs to be found.

      I am not against the PBRF exercise, and I am not against department rankings in principle; quite the contrary. However, as a scientist I deplore shonky analysis of statistics. We cannot expect the popular media to be as numerate as scientists, and TEC therefore has a social responsibility not to present its findings in ways that draw the statistically naive to incorrect conclusions. Lamentably, as with the whole of research funding in this country, I suspect that not enough funds were allocated to the exercise.

      David L Wiltshire
      (Dept of Physics and Astronomy, University of Canterbury)
      11 May, 2004
