Note added 30 May, 2007: The 2006 PBRF round used the same reporting methodology, with precisely the same flaws I described in my 2004 article below. Although new categories "C(NE)" and "R(NE)" were invented for "new entrants", these have been aggregated with ordinary "C" and "R" scores in deriving league tables by "subject" and "nominated academic unit". Thus on the basis of the 2006 round, the most successful physics department in New Zealand - the one that brings in the most actual PBRF dollars (as determined from the PBRF 2006 report, Appendix A, Tables A-70, A-71 and A-72), namely the Department of Physics and Astronomy at the University of Canterbury - ranks "4th" in the "Physics subject area results" (Table A-34), precisely because its success with research grants leads it to hire more postdocs than its sister departments in other universities.
An abridged version of this article (minus the abstract and footnotes to Table 1) was published in New Zealand Education Review, 19 May 2004.
Abstract: It is demonstrated that the use of simple averages to compare PBRF scores of departments has led to erroneous representations of the data in the popular press and university publications.
As an example, an analysis of four physics departments/sub-schools, at Auckland, Canterbury, Otago and Victoria universities, is undertaken. Using information from university calendars, PBRF scores are divided between continuing academic EFTs and other researcher EFTs, under reasonable assumptions. The average score of the continuing academics in each of the four departments surveyed, at 5.6-5.7, is indistinguishable, while those of other researchers, at 0.9-1.3, are also in a similar ballpark.
Where differences exist in the rankings given in the TEC tables, they are due to the proportion of "other researchers" in each department. Departments which win more research grants, and hire more junior research staff (postdoctoral fellows, etc.) on these grants, are likely to score lower averages. Thus in this sample of four at least, a lower total PBRF average should be seen as a measure of distinction.
While the problems indicated should not affect PBRF funding, which is based on total staff EFTs, it is argued that the PBRF reporting methodology, in terms of public perceptions generated by the TEC league tables, is fundamentally flawed.
Imagine two departments, each with 10 fulltime continuing academic
staff, all of whom score "B" in the PBRF exercise. However,
there is one difference: department 1 has very little in the way of
external research grants, whereas department 2 is very successful
at attracting research funds and on these grants employs 5 very good
postdoctoral fellows who, being promising early-career researchers with some
publications, each score "C". In this case - leaving aside the
important statistic of PhD thesis completions, which is not addressed in the
measure - one department is clearly more successful than the other:
it is department 2, with an average PBRF rating of 4.7, as opposed to
department 1 with an average of 6.0. Due to research grant success,
the conventional interpretation of the results is reversed!
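A minimal sketch of this arithmetic, assuming the quality weightings A = 10, B = 6, C = 2, R = 0 (these weightings reproduce the 6.0 and 4.7 figures above; fractional EFTs are ignored, and the names below are purely illustrative):

    # Minimal sketch of the averaging in the example above, assuming the
    # quality weightings A=10, B=6, C=2, R=0; fractional EFTs are ignored.
    WEIGHTS = {"A": 10, "B": 6, "C": 2, "R": 0}

    def pbrf_average(grades):
        """Average quality score over a list of letter grades."""
        return sum(WEIGHTS[g] for g in grades) / len(grades)

    dept1 = ["B"] * 10               # ten continuing academics, no postdocs
    dept2 = ["B"] * 10 + ["C"] * 5   # same academics plus five grant-funded postdocs

    print(pbrf_average(dept1))       # 6.0
    print(pbrf_average(dept2))       # 4.666..., i.e. about 4.7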
Although this example is hypothetical, effects of precisely this sort
do distort the "subject" league tables which have been drawn up in the TEC
report and widely quoted as defining who is "better" than whom.
Such crude measures, which totally ignore the career profiles of those
averaged, are potentially damaging, as they can lead to quite false
impressions.
To those who might argue that a department which is successful at
getting research grants would surely get more "A"s, I should point
out two things. Firstly, the average score of an "A" academic employing one
"C" postdoc is 6.0 (the same as two "B" researchers employing no one),
while the average of an "A" academic who brings in more money to employ two
"C" postdocs is less, at 4.7, and so on. If one employs a very new PhD who
ends up rated "R", it is simply bad luck.
Thus research grant success clearly does lower
the average, even if there are more "A"s to show for it.
The second point is that, judging by the known results of many colleagues,
a significant number of researchers who have
been successful in winning Marsden grants as Principal Investigator,
and who do employ postdocs, have been rated "B". In the Physical
Sciences panel at least, it is clear that anyone not already a
professor stood little chance of scoring an "A": the number of professors
in the physics subject sample was actually rather greater than the number of
A's awarded. Many mid-career researchers with international reputations
were not awarded "A". One suspects there is a significant
correlation between "A" and age.
If one is to compare departments, then the most honest basis for comparison
is the continuing academic staff only. It is the continuing
academic staff who are able to apply for research grants, who
lead the research programmes and
who define the reputation of a department. In the table
below I have therefore reanalysed the scores of my own subject area.
In particular, I have taken the "top" four physics
departments and accounted for continuing academic EFTs, using the 2003/2004
calendars of the institutions in question. Without access to individual
scores, I have made the simplifying
but reasonably accurate assumption that continuing academics, at Lecturer
level and above, were
responsible for the highest scores in each department, whereas
postdocs, tutors and technicians (where these were
included) were responsible for the lower scores. This will not be a
one-to-one match in every case, but it is very close. [See Table 1.]
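As a sketch of this splitting procedure (the grade list below is made up purely for illustration, since individual scores are not public; fractional EFTs are ignored and the same A = 10, B = 6, C = 2, R = 0 weightings as above are assumed):

    # Hypothetical sketch of the split behind Table 1: given a unit's grade
    # distribution and its continuing-academic headcount, credit the highest
    # grades to continuing academics and the remainder to "other researchers".
    WEIGHTS = {"A": 10, "B": 6, "C": 2, "R": 0}

    def average(grades):
        """Average quality score of a list of letter grades (0.0 if empty)."""
        return sum(WEIGHTS[g] for g in grades) / len(grades) if grades else 0.0

    def split_averages(grades, n_continuing):
        """Sort grades best-first, then average the top n_continuing grades
        (continuing academics) and the rest (other researchers) separately."""
        ranked = sorted(grades, key=lambda g: WEIGHTS[g], reverse=True)
        return average(ranked[:n_continuing]), average(ranked[n_continuing:])

    # Illustrative unit of 12 staff, 8 of them continuing academics:
    example = ["A", "B", "B", "B", "B", "B", "C", "C", "C", "C", "R", "R"]
    print(split_averages(example, 8))   # (5.5, 1.0) for this made-up grade list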
[2] In listing "top" physics departments I do not include
units with total EFTs less than 3 in the physics subject table (AUT,
Lincoln, Waikato). One peculiarity of the TEC
tables is that a university with no physics department can end up "at the
top" because a single physicist is employed there. Massey University did
return 11.5 EFTs in physics, but as its Institute of Fundamental
Sciences lumps together chemists, physicists and mathematicians, and there
is no separate heading for each discipline in its Calendar, it is impossible
to reliably estimate the number of continuing academic staff EFTs for
the physics sub-school.
[3] Apparent discrepancies arise since "subject" areas
do not correspond exactly to academic units. At Auckland one B researcher
and one R researcher with a fractional appointment in the Physics Department
did not nominate physics as their subject area, while a C researcher
outside the Physics Department with a half-time appointment
did nominate physics. At Canterbury there is likewise one C researcher
outside the Department of Physics and Astronomy who nominated
physics as subject. At Otago a total of 5 members of the Department
of Physics (one A, one B, two C's, one R, and including two
half-timers) do not identify with the subject physics. At
Victoria, where physicists are part of the School of Chemical and
Physical Sciences, there were 3 researchers in the School,
one with a B and 2 with a C who did not choose either
physics or chemistry as their subject area.
Engineering and Technology and Applied Mathematics seem to
be the likely subject areas for these three individuals, given VUW's tallies.
I have assumed that those listed under "Physics" in the VUW Calendar include these
three individuals. This would make the total declared EFTs of those listed
under "Physics" in the VUW Calendar 12.5 rather than 9.5,
which seems likely as they do employ postdocs, and the chemistry subject
figures tally with the chemists listed in the VUW Calendar.
[4] The "continuing academic" and "other researcher" EFTs are split by
Department (or Physics sub-school in the case of VUW) rather than
"subject" as these are the units listed in calendars, and also in most cases
the financial units to which PBRF money will ultimately flow.
The figures in the table indicate that in all four universities the average
score for continuing academic staff comes out essentially the same at
5.6-5.7, the 0.1 difference being well within the margin of error of
these "best estimates".
Although this may at first appear surprising, what it actually
says is that the majority of continuing academics rate "B", but that
among continuing academics there are a few more C's than A's.
Furthermore, it is interesting that the average scores for
researchers other than continuing academics all come out in the same
ballpark, with Victoria very slightly ahead.
Given the similarity of scores, perhaps we should call it a four-way draw.
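As a simple consistency check, the continuing-academic and "other researcher" averages in Table 1, weighted by their respective EFTs, do recombine into the quoted department-total averages. A quick sketch using the figures from the table (the helper function is mine, not part of the TEC methodology):

    # Recombining the split averages in Table 1, weighted by EFTs, should
    # recover the department-total averages quoted in the same table.
    def combined_average(eft_cont, avg_cont, eft_other, avg_other):
        return (eft_cont * avg_cont + eft_other * avg_other) / (eft_cont + eft_other)

    print(round(combined_average(21.3, 5.7,  9.0, 0.9), 1))  # Auckland:   4.3
    print(round(combined_average(17.8, 5.6, 11.4, 1.1), 1))  # Canterbury: 3.8
    print(round(combined_average(12.8, 5.6,  9.0, 0.9), 1))  # Otago:      3.7 (3.8 in the table, within the quoted error)
    print(round(combined_average( 9.5, 5.7,  3.0, 1.3), 1))  # Victoria:   4.6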
But there are some clear
distinctions between the departments: in terms of continuing academics
there is a great difference in the size of the departments, and thus
presumably also in the number of sub-fields in which each has excellence:
Auckland is the largest, Victoria the smallest. These differences in research
breadth are exacerbated by the
fact that 24% and 18%, respectively, of those in the two smallest departments
(Victoria and Otago) do not do research in physics but in another subject area. Given that
Otago and Victoria have no separate Engineering School, this is perhaps not
surprising. The sample naturally splits into two groups of two: large
physics departments at the two universities with engineering schools, and
small departments at those without.
Furthermore, if one looks at the ratio of "other researcher" to
"continuing academic" EFTs, then Canterbury and Otago,
with 64% and 59% respectively, clearly come out
ahead, as compared to 42% for Auckland and 32% for Victoria. Is this
an indication that the Canterbury and Otago physics departments are bringing
in more research grants per academic staff member, and hiring
proportionately more postdocs?
Before rushing to conclusions we should note there may be
differences in the non-academic staff who were included in the survey
at different institutions. Experimental research is generally carried
out by large teams: technical staff are vital to these efforts and
often contribute so much to the design of experiments that their names
end up on research publications, even if they never lead the research
and would not end up with an "A" or "B". There is a grey area as to
whether such staff are PBRF eligible; at Canterbury we included them,
whereas at other institutions they probably were not included. At Canterbury
the inclusion of these staff appears to have increased the number of R's:
having a couple of research papers was not enough to earn a "C", it turns out;
some "peer esteem" was required as well.
Despite differences in counting technical staff, I still believe our
Department is among the top when it comes to research grants per academic
staff member. If this is true, then such research success ought to be reflected in
numbers of graduate students. Unfortunately the TEC aggregated MSc and
PhD thesis completions by TEO rather than by Department. Independent surveys
such as the Australasia-wide one in the Physicist do not distinguish
between thesis and postgraduate coursework degrees. However, I do know from
anecdotal evidence that Canterbury Physics and Astronomy has more thesis
students than Auckland Physics. How we compare proportionately
with the two smaller departments,
at Victoria and Otago, I do not know.
But in terms of a race between the two large physics departments at least,
the position with regard to thesis students is clear.
And if we are really out in front, is such excellence reflected in our
undergraduate teaching? Well, last December at the Inaugural Australian
Physics Competition at the ANU in Canberra, which was open to teams of
third-year physics undergraduates from New Zealand universities as well as
Australian ones, the winning team in all of Australasia was from Canterbury.
The object of this exercise has not been to produce a statistic by which
my Department comes out on top - no single statistic is accurate enough to
encompass what it means to be "best". Given the discussion above,
the two largest physics
departments in the country, Auckland and Canterbury, may still have
room to quibble about who is better. However, I do wish
to express alarm at the poor methodology that has been used in analysing
the PBRF results. Physics will not be the
only subject affected. If comparisons
are to be made with other countries, where "whole department" methodologies
are used, the results of naive averaging could be especially misleading.
In any case, "whole department" methodologies which do not mix continuing
academics with postdocs are the only way one can really compare departments.
A simple average over the scores of late-career professors and fresh
postdocs is misleading, given that the most successful academics also hire
many postdocs.
Since actual funding is based on total staff EFTs, the results of
the averaging described here should be financially neutral to
departments if sensible models of fund distribution are used internally
within universities. But in the matter of public and peer perceptions
it is quite different, as many popular press reports have been
based on fundamentally flawed interpretations of the TEC league tables.
In 1979 in my last year of school in Palmerston North I asked a family
friend, a physics lecturer at the local university, which university was the
best for me if I wanted to study theoretical physics. That lecturer, Paul
Callaghan, did not suggest his own university (then Massey) but Canterbury.
Since all departments have research strengths in particular sub-fields,
and since the nature of those strengths changes over time, there is no
substitute for such personal advice. However,
knowing that not all young New Zealanders can call on
expert advice among their circle of acquaintances, I am concerned
that prospective students might use the league tables
to draw completely wrong conclusions
about individual departments. Before the next PBRF round a better
reporting methodology needs to be found.
I am not against the PBRF exercise, and I am not against department
rankings in principle; quite the contrary.
However, as a scientist I deplore shonky analysis
of statistics. We cannot expect the popular media to be as numerate
as scientists, and TEC therefore has a
social responsibility not to present its findings in ways that draw
the statistically naive to incorrect conclusions. Lamentably,
as with the whole of research funding in this country, I suspect that
not enough funds were allocated to the exercise.
David L Wiltshire
TABLE 1: PBRF averages in physics at four New Zealand universities
                         Declared EFTs by                    Average PBRF score by
University       Subject    Dept   Continuing   Others    Subject    Dept   Continuing   Others
                  total     total   academics               total     total   academics
U Auckland         29.5     30.3      21.3        9.0        4.2       4.3      5.7        0.9
U Canterbury       30.2     29.2      17.8       11.4        3.8       3.8      5.6        1.1
U Otago            17.8     21.8      12.8        9.0        3.6       3.8      5.6        0.9
Victoria U          9.5     12.5       9.5        3.0        5.1       4.6      5.7        1.3
Notes:
[1] Continuing academic staff EFTs, at Lecturer level or above, have been
gleaned from the 2003 and 2004 calendars. Where numbers increased from 2003
to 2004 (by one each at Auckland and Otago),
in the absence of knowledge as to whether the appointment was made before or
after the census date of 1 July 2003, I have counted the extra academic at
0.5 EFT. Furthermore, information concerning fractional appointments
is not given in university calendars, but it can often be worked out from
the TEC tables by comparing the grade distribution in the staff-number
and EFT-weighted columns. As an example, where twelve
individuals contribute 11.8 EFTs in category B in a unit, one such
individual must be at 0.8 EFT.
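That is, assuming the remaining eleven are fulltime appointments,

\[ 11 \times 1.0 + 1 \times 0.8 = 11.8 \ \mathrm{EFTs}. \]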
In general I have made a best guess by combining the calendars with the
subject and TEO nominated-units tables in the TEC report. Where
fractional EFTs of academics were unclear (e.g., for 1 person
at Auckland and 2 at Otago, all in category C), fulltime status of the
relevant academic staff was assumed. Overall a combined error of up to
1.0 EFTs might be expected, giving a typical error of 0.1-0.4 in
each assigned PBRF average.
Any small discrepancies that result from the assumptions used to derive the
table are likely to be of minor significance
compared to the huge variance in individual scores
which will arise from the non-linear rounding method
that converted individual numerical scores to letter grades in the
first place.
(Dept of Physics and Astronomy, University of Canterbury)
11 May, 2004