NCA-Forum Double Session on Scholarly Metrics in a Digital Age, E. Johanna Hartelius and Gordon R. Mitchell

SERRC, May 2, 2014

Author Information: E. Johanna Hartelius and Gordon R. Mitchell, University of Pittsburgh, ejh1979@pitt.edu / gordonm@pitt.edu

Hartelius, E. Johanna and Gordon R. Mitchell. “NCA-Forum Double Session on Scholarly Metrics in a Digital Age.” Social Epistemology Review and Reply Collective 3, no. 6 (2014): 1-29.

The PDF of the article gives specific page numbers. Shortlink: http://wp.me/p1Bfg0-1rw

This interview was conducted on November 23, 2013 at the 99th National Communication Association Convention in Washington, D.C. The session was co-chaired by Gordon Mitchell and Johanna Hartelius.

Gordon Mitchell: Greetings, my name is Gordon Mitchell, co-chair with Johanna Hartelius for this NCA [National Communication Association]-Forum double session. We are both from the University of Pittsburgh, and we are very delighted to have a special guest and excellent commentators to help us tackle this very challenging topic of scholarly metrics in a digital age. First of all, I want to introduce Carolyn Miller from North Carolina State University. She is one of the top scholars in our field, holding a Ph.D. from Rensselaer Polytechnic Institute. She is now the SAS Institute Distinguished Professor of Rhetoric and Technical Communication. Let’s welcome Carolyn Miller.

[audience applause]

The other panelist here for our first session is Dr. Lawrence Martin. He is former dean of the graduate school at Stony Brook University. He founded the firm Academic Analytics in 2005. That firm is now emerging as one of the top big data resources for higher education. He has basically grown this company and is now working with many universities, including ours, to enable use of this fascinating tool for measuring scholarly productivity. Let’s welcome Lawrence Martin.

[audience applause]

We are going to integrate audience response system polling, so let’s start off with a practice slide. Does everyone have a clicker? Here is the practice question; it is a quiz of sorts. Who said this: “Of all things the measure is Man, of the things that are, that they are; and of the things that are not, that they are not”? Go ahead and click now to indicate your answer. Okay, final call. There we have it; of course the answer is D) Protagoras. And this is actually a suitable keynote for this panel. Protagoras, one of the older Greek Sophists, was saying that humans can learn how to measure things, especially valuations, through communication. And of course he was also author of the famous “two-logoi” fragment, in addition to the “human-measure” fragment. Together, they speak to the centrality of the field of communication when we are thinking about scholarly metrics. We are lucky to have so much native knowledge and skill and expertise in this area, such as Carolyn Miller and Johanna Hartelius. Johanna Hartelius is one of our field’s leading scholars on the rhetoric of expertise; that is actually the title of her first book. I’m going to turn it over to Johanna to deliver a couple more framing remarks, and then we will go into the program.

Johanna Hartelius: Thanks, Gordon. On the note of Protagoras, via Isocrates and the politikos logos, expertise (the subtext of this whole exercise) is not only rhetorically constructed, but rhetorically negotiated and judged with a set of criteria that are also negotiated and judged within the contingencies of the rhetorical situation. The exigence for today’s session is the emergence in academe of certain metrics of scholarship that interpret and evaluate our expertise. These metrics function in particular ways, so we are providing a forum for you to deliberate on issues like: How will what I do be evaluated by my employer? This is a rather material and concrete concern. On another level: What are ways of knowing in the university, both for those of us who work in scholarship, and also those who work in administration and large-scale decision-making? The order of activities today is fairly straightforward, and printed in your program. We are going to hear from Professor Miller in a minute. Her perspective and expertise will certainly help to provide some rich fodder for discussion and conversation. After that we will hear from Professor Martin. He will introduce us to this tool, Academic Analytics. Information about that tool can also be gleaned from the handout material that you have. He will give us more background on that mechanism. After that, we get to the really fun part, which is going to be a conversation between everyone in the room, regarding how we use all of these ideas in a pragmatic way. So without further ado, I will turn it over to Professor Miller.

Carolyn Miller: I’m speaking today as someone who has no particular expertise in scholarly metrics, so I’m very interested to hear Dr. Martin. What I’m going to try to do is to speak for the field of rhetorical studies, as a humanistic discipline, and sort of outline some of the issues in scholarly metrics. I see them focusing on four topics: scholarly production, dissemination, influence, and assessment.

First, for production: in the digital age, this is crucial, because it is changing so much. We have online publishing, e-books, online journals, open access journals. But we have more than this. We have alternate forms of scholarly production: creation of archives, scholarly networks, curated collections, demonstrations, informational and entertainment installations, what Cheryl Ball calls “scholarly multi-media,” or web-texts. These are all nontraditional forms of scholarly production, or at least they can be. In the web-text, for example, multiple semiotic modes and production media are central to the argument, so that the visual and experiential are as important as the discursive. The exemplar here is the journal Kairos: A Journal of Rhetoric, Technology, and Pedagogy. For another example, one of my colleagues in Renaissance studies has produced, with the help of significant NEH funding, what he calls “The Virtual Paul’s Cross,” which is a digital installation and website that models St. Paul’s Cathedral and its churchyard in London, before the Great Fire of 1666. It reproduces the experience of hearing the delivery of a sermon by John Donne, as it unfolded on a particular day, in that specific physical location. There is a visual model, there is an acoustic model, and you can position yourself in different places in the churchyard to test the audibility. The team included engineers, architects, and a dialect coach, among others. It is an elaborate virtual installation. Another colleague is planning a project that will use Microsoft’s Kinect motion detection system to model rhetorical performances by experts and students, with comparison to historical sources, such as Gilbert Austin’s Chironomia. So those are examples of nontraditional forms of production.

The dissemination of such products is going to be much different from that which we are used to. I don’t know whether traditional metrics can capture their value or influence. In addition to the more traditional dissemination, through personal and library subscriptions and purchases, we have distribution via scholarly social media, such as ResearchGate, Academia.edu, and Mendeley, and also by archives and portals for professional associations and other less formal scholarly networks. With more informal dissemination, I think you lose some capability for systematic tracking. Evaluating the reputation of these disseminators, or “publishers,” is going to be difficult, or at least different, for the foreseeable future.

Under the topic of influence, then, even with more traditional forms of scholarly production, in online and open access journals, measuring influence is not as simple as citation counting, as imperfect as that has been in the humanities. Certainly it is easier than ever to locate and count citations, but if we take the case of open access publication, where one of the goals is literally to open up access to wider publics than to just privileged scholarly subscribers, how do you measure influence then? Do you simply look at hit rate? Do you care about unique IP address hits? Do you distinguish scholarly from public, or simply random access? How do you determine the influence of a project like “The Virtual Paul’s Cross”? How do you determine uptake into course syllabi, graduate student research, and public engagement in the humanities?

Finally, evaluation. Metrics, of course, are all about evaluation. It seems to me that evaluation that is fair, rather than comprehensive, is going to have to account for some of these changes in the manner of production, dissemination, and influence of scholarship. Metrics, I think, will have to be multiple in order to capture the variety of work that we are now seeing as valuable. And metrics, I believe, should only be one of the tools that we use to evaluate scholarship. Peer evaluation, I think, should continue to be essential to the assessment and evaluation process, with external review, departmental mentorship, and disciplinarily informed assessment that takes all of these factors into account. Thank you.

[audience applause]

Johanna Hartelius: While Gordon is setting this up, we’re going to pass around a consent form, because we hope to use whatever we talk about today, and the ideas we develop, for future scholarship. Of course, the data are available for anyone who is interested in them, so we are just asking for your agreement to participate.

Gordon Mitchell: Another small point of explanation. There is a three-page document that is coming around. This is a very useful document, as it is the listing, in the Academic Analytics database, of the peer departments in the Communication cluster. There are actually two different clusters: one is “Communication and Communication Studies,” and the other is “Rhetoric, Composition and Writing.” Some institutions overlap and are in both.

“Communication & Communication Studies” Cluster

Academic Analytics Disciplinary Taxonomy

(current as of November 2013)

Institution Name | Program Name
American University | Communication
Arizona State University | Communication
Bowling Green State University | Media and Communication
Clemson University | Rhetoric, Communication, and Information Design
Columbia University | Communications
Cornell University | Communication
Drexel University | Culture and Communication
Florida State University | Communication
George Mason University | Communication
Georgia State University | Communication Studies
Howard University | Communication and Culture
Illinois Institute of Technology | Technical Communication
Indiana University | Communication and Culture
Kent State University | Communication and Information
Louisiana State University | Communication Studies
Michigan State University | Communication
Michigan Technological University | Rhetoric and Technical Communication
New York University | Media, Culture and Communication
North Carolina State University | Communication Rhetoric and Digital Media
North Dakota State University | Communication
Northwestern University | Communication Studies
Ohio State University, The | Communication
Ohio University | Communication Studies
Pennsylvania State University, The | Communication Arts and Sciences
Purdue University | Communication
Regent University | Communication
Rensselaer Polytechnic Institute | Communication and Rhetoric
Rutgers – New Brunswick | Communication Information and Library Studies
Southern Illinois University Carbondale | Speech Communication
Stanford University | Communication
University at Albany, State University of New York | Communication
University at Buffalo, State University of New York | Communication
Texas A&M University | Agricultural Leadership, Education, and Communications
Texas A&M University | Communication
Texas Tech University | Technical Communication and Rhetoric
University of Tennessee, The | Communication and Information
University of Texas at Austin, The | Communication Studies
University of Alabama, The | Communication and Information Sciences
University of Arizona, The | Communication
University of California, Davis | Communication
University of California, San Diego | Communication
University of California, Santa Barbara | Communication
University of Colorado Boulder | Communication
University of Connecticut | Communication Processes
University of Denver | Human Communication Studies
University of Georgia | Communication Studies
University of Hawaii | Communication and Information Sciences
University of Illinois at Chicago | Communication
University of Illinois at Urbana-Champaign | Communications and Media
University of Illinois at Urbana-Champaign | Communication
University of Iowa, The | Communication Studies
University of Kansas, The | Communication Studies
University of Kentucky | Communication
University of Maine, The | Communication and Mass Communication
University of Maryland, College Park | Communication
University of Massachusetts Amherst | Communication
University of Memphis | Communication
University of Miami | Communication
University of Michigan | Communication Studies
University of Minnesota, Twin Cities | Communication Studies
University of Missouri | Communication
University of Nebraska – Lincoln | Communication Studies
University of New Mexico, The | Communication
University of North Carolina at Chapel Hill | Communication Studies
University of North Dakota, The | Communication and Public Discourse
University of Oklahoma | Communication
University of Pennsylvania | Communication
University of Pittsburgh | Rhetoric and Communication
University of South Florida | Communication
University of Southern California | Communication
University of Southern Mississippi, The | Communication Studies
University of Utah, The | Communication
University of Washington | Communication
University of Wisconsin – Madison | Communication Arts
University of Wisconsin – Milwaukee | Communication
Washington State University | Communication
Wayne State University | Communication
West Virginia University | Communication Studies

“Composition, Rhetoric & Writing” Cluster

Academic Analytics Disciplinary Taxonomy (current as of November 2013)

Institution Name | Program Name
Boston University | Editorial Studies
Carnegie Mellon University | Rhetoric
Clemson University | Rhetoric, Communication, and Information Design
Duquesne University | Rhetoric
Illinois Institute of Technology | Technical Communication
Indiana University of Pennsylvania | Composition and TESOL
Iowa State University | Rhetoric and Professional Communication
Miami University | English
Michigan State University | Rhetoric and Writing
Michigan Technological University | Rhetoric and Technical Communication
New Mexico State University | Rhetoric and Professional Communication
North Carolina State University | Communication Rhetoric and Digital Media
North Dakota State University | English, Rhetoric Writing and Culture
Ohio University | English Rhetoric and Composition
Ohio University | English – Creative Writing
Binghamton University, State University of New York | English/Creative Writing
Syracuse University | Composition and Cultural Rhetoric
Texas Tech University | Technical Communication and Rhetoric
Texas Woman’s University | Rhetoric
University of Texas at El Paso, The | Rhetoric and Composition
University of Arizona, The | Rhetoric, Composition and Teaching of English
University of California, Berkeley | Rhetoric
University of Houston | Literature and Creative Writing
University of Louisville | English Rhetoric and Composition
University of Minnesota, Twin Cities | Rhetoric and Scientific and Technical Communication
University of North Texas | Composition Studies
University of Pittsburgh | Rhetoric and Communication
University of Southern California | Literature and Creative Writing
Utah State University | Theory and Practice of Professional Communication
Virginia Polytechnic Institute and State University | Rhetoric and Writing
East Carolina University | Technical and Professional Discourse

The key thing to notice there is that when Academic Analytics is ranking institutions and departments, this is the peer group they use to define the entire set of departments. So without further ado, please welcome Lawrence Martin.

[audience applause]

Lawrence Martin: Thank you very much. It’s a pleasure to be here. I’m going to go fairly quickly through some slides, really to stimulate discussion more than anything else.

[Slide NCA_1]

The first thing that we always say is, faculty research productivity is only one aspect of faculty work. It is important that you place the other things that faculty are expected to do in context, and that you recognize the different kinds of demands that are placed on people at different kinds of institutions, different career stages and different disciplines, in terms of teaching, service and outreach, and so on. That should obviously provide the context within which levels of performance, in terms of research production, are based.

[Slide NCA_2]

So, at the heart of the Academic Analytics system for evaluation of faculty research productivity is that we start by identifying individual people in whom we have an interest. In this case, we started out with faculty at Ph.D. granting programs at U.S. universities. We have expanded that to include all faculty members at Ph.D. granting universities, so we are tracking about 330,000 faculty members a year. Then we go looking for their scholarship. So unlike the approach that people like those at Elsevier or Thomson take, where the unit of record is a grant, or a journal article, or a patent, for us the unit of record is a person. And you can imagine, in the Anglophone world, the “John Smith problem” is acute. Among scholars of Asian heritage, the number of people with a first name beginning with the letter Y and the last name Li is substantial.

[Slide NCA_3]

Finding out which one of those people each particular piece of scholarship belongs to is really what makes the difference between having accurate data or not. So a huge amount of our company’s effort is devoted to data verification. We have almost 100 people who do verification to ensure accuracy of the data.

[Slide NCA_4]

The great thing about having data attached to people is that it immediately gives you tremendous flexibility to configure those people in lots of different ways. So you can look at a department, a graduate program, an institute, a center, a system, a state, a discipline. Many of our clients will ask us to pull together different combinations. For instance, in the example that Gordon just gave, if people tell us that they would like to see the following 24 programs that we have labeled as this, and 14 programs labeled as that, and 42 labeled as something else, creating that comparative universe is just math, because we already have the data attached to the people. So reconfiguring the people in a meaningful way is very important. While it is very easy to think of Academic Analytics as being in the ranking business, at an early stage we made the terrible mistake of publishing rankings in the Chronicle of Higher Education. The truth is there is no such thing as a ranking of a program or of a department, or of a discipline. So the table here shows you the relative rankings, for one particular program, based on 30 different metrics. Assuming that the data are right, and we think they are, then every one of those rankings is true, but they are different. So you can rank on books per faculty member, you can rank based on journal articles per faculty member, you can rank based on citations per publication. Each of those rankings tells you something, and collectively they tell you a lot. But how you combine them together is very much based on your approach to the discipline. So we have a tool that allows you to select variables and assign weights that match your own scholarly objectives.
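
As a rough illustration of the point that each single-metric ranking can be true while differing from the others, and of what selecting variables and assigning weights might involve, here is a minimal Python sketch. The program names, the metric values, the min-max scaling, and the functions `rank_by` and `composite_rank` are all hypothetical assumptions made for illustration; the transcript does not describe how the Academic Analytics tool actually combines metrics.

```python
from typing import Dict, List

# Hypothetical per-faculty-member metrics for three made-up programs.
# These numbers are invented for illustration and are not real data.
programs: Dict[str, Dict[str, float]] = {
    "Program A": {"books": 0.9, "articles": 2.0, "citations_per_pub": 4.0},
    "Program B": {"books": 0.3, "articles": 5.0, "citations_per_pub": 6.0},
    "Program C": {"books": 0.6, "articles": 3.5, "citations_per_pub": 9.0},
}


def rank_by(metric: str) -> List[str]:
    """Return program names ordered best-first on one metric."""
    return sorted(programs, key=lambda p: programs[p][metric], reverse=True)


def composite_rank(weights: Dict[str, float]) -> List[str]:
    """Rank programs on a weighted sum of min-max scaled metrics.

    Min-max scaling is an assumed normalization, chosen only so that
    metrics with different units can be added together.
    """
    scores = {p: 0.0 for p in programs}
    for metric, weight in weights.items():
        values = [programs[p][metric] for p in programs]
        lo, hi = min(values), max(values)
        for p in programs:
            scaled = (programs[p][metric] - lo) / (hi - lo) if hi > lo else 0.0
            scores[p] += weight * scaled
    return sorted(scores, key=lambda p: scores[p], reverse=True)


if __name__ == "__main__":
    print(rank_by("books"))
    print(rank_by("articles"))
    print(composite_rank({"books": 0.7, "articles": 0.3}))
    print(composite_rank({"books": 0.3, "articles": 0.7}))
```

In this toy data, weighting books heavily puts the book-oriented program first, while weighting articles heavily reverses the order, which mirrors the remark that the combination depends on one's approach to the discipline.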

[Slide NCA_5]

This is what people tend to call the “productivity flower.” This is a graphic where the outside of the gray shows the highest performance level that exists in that discipline, nationally. The dark around the circle is the median performance, nationally. And the colors of the petals show the categories: journal articles are in blue, citations are in the reddish color, honors and awards are in gold, book publications are in purple, and research grants are in green.

[Slide NCA_6]

I think this shows you why a single measure of performance in one particular area is unlikely to give you a good description of the nature of the scholarly endeavor in that particular program. But by weaving these pieces of information together, you can construct a good understanding, a strong narrative that helps you understand performance in that particular discipline. We have multiple ways of doing this.

[Slide NCA_7]

You can do this by looking at national percentile ranking; the chart on the top right uses Z-scores, or standard deviation units, again allowing you to see in which areas you do well, and in which areas you do less well. The chart at the bottom left allows you to see where each of your faculty members is benchmarked against national standards in the discipline. We have also developed a lot of information to help you decide what the critical publications are.
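
For readers unfamiliar with the two benchmarking views mentioned here, the following Python sketch shows one common way to compute a Z-score (distance from the mean in standard deviation units) and a percentile rank against a peer group. The sample values and the function names `z_score` and `percentile_rank` are hypothetical, and the use of the population standard deviation and a "strictly below" percentile definition are assumptions; the transcript does not say which definitions Academic Analytics uses.

```python
from statistics import mean, pstdev
from typing import Sequence


def z_score(value: float, population: Sequence[float]) -> float:
    """Distance of value from the population mean, in standard deviations.

    Uses the population standard deviation (an assumption); returns 0.0
    when every value in the population is identical.
    """
    sigma = pstdev(population)
    if sigma == 0:
        return 0.0
    return (value - mean(population)) / sigma


def percentile_rank(value: float, population: Sequence[float]) -> float:
    """Percentage of population values strictly below value."""
    below = sum(1 for v in population if v < value)
    return 100.0 * below / len(population)


if __name__ == "__main__":
    # Hypothetical articles-per-faculty figures for five peer programs.
    peers = [1, 2, 3, 4, 5]
    print(z_score(4, peers))
    print(percentile_rank(4, peers))
```

A positive Z-score indicates performance above the peer-group mean on that metric, and a negative one below it, which is how such a chart can show at a glance where a program does well or less well.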

[Slide NCA_8]

This is an example from my own discipline of anthropology that shows three different ways of thinking about what is the top journal: Where is the most research in anthropology published? Where is the work published by anthropologists receiving the most citations? And what is the citation rate for papers written by people who are anthropologists in each of these journals? Among the problems with Journal Impact Factors is that we now know that in journals like PLoS, PNAS, Nature and Science, we have people from 120 different disciplines publishing, and there is a range of citation impact factors (a low of 5 and a high of 150), so when Thomson tells you it is 34, we don’t find that particularly meaningful. We have a similar way of approaching the weighting of book publishers. Most recently we have collected data on the year of Ph.D. and the source of Ph.D. for every faculty member in our database. And this is just a graphic that shows you the histogram as career progression, if you like, for the aggregate based on age cohort, that shows when scholars develop a particular level of output.
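
To make the concern about a single journal-wide figure concrete, here is a small Python sketch that compares a per-discipline citation rate with the pooled rate for the same journal. The records, the discipline labels, and the functions `citation_rate_by_discipline` and `overall_rate` are hypothetical; this is an illustrative calculation under those assumptions, not the method Academic Analytics or Thomson uses.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical (author discipline, citation count) records for papers
# in one made-up multidisciplinary journal. Not real data.
papers: List[Tuple[str, int]] = [
    ("anthropology", 4), ("anthropology", 6),
    ("physics", 40), ("physics", 60),
    ("genomics", 140), ("genomics", 160),
]


def citation_rate_by_discipline(records: List[Tuple[str, int]]) -> Dict[str, float]:
    """Average citations per paper, computed separately for each discipline."""
    totals = defaultdict(lambda: [0, 0])  # discipline -> [citations, papers]
    for discipline, citations in records:
        totals[discipline][0] += citations
        totals[discipline][1] += 1
    return {d: cites / count for d, (cites, count) in totals.items()}


def overall_rate(records: List[Tuple[str, int]]) -> float:
    """Average citations per paper across all disciplines pooled together."""
    return sum(citations for _, citations in records) / len(records)


if __name__ == "__main__":
    print(citation_rate_by_discipline(papers))
    print(overall_rate(papers))
```

In this invented data the pooled rate sits far from the rate for any one discipline, which is the sense in which a single journal-wide number may say little about how work from a particular field fares in that journal.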

[Slide NCA_9]

This has been cut off to show only the top 15 chemistry programs in the country, so they’re all very high-performing, but the other faculty don’t perform at that level, and that is a very important factor to think about. I was interested today to look at Communication and Communication Studies, and if you look at the left here, you are a fairly individualistic bunch.

[Slide NCA_11]

[Slide NCA_10]

Of the programs that we have classified, as you can see, 37 are called “Communication,” 14 are “Communication Studies,” three names appear twice, and then there are 20 uniquely named programs. So we have 77 Ph.D. programs in the category Communication/Communication Studies. But we also have 144 departments at those Ph.D. granting universities. So we have data on 1,546 faculty; 997 of those authored 5,214 articles; and 893 authors obtained citations (so about 100 people published articles that did not receive any citations), but the 893 authors who did obtain citations obtained 40,000 citations. This is a book discipline as well, so 722 people authored 1,602 books over an eight-year period. 145 people held 260 research grants totaling almost $50 million. And 256 have attributed to them 417 honors and awards. So what Academic Analytics tries to do is to provide universities, particularly administrators at universities, with discipline-level comparative data to help describe what scholarship is being done and to place it in comparative context, with the goal of driving institutional progress.
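
The figures quoted in this passage lend themselves to a few simple derived ratios. The Python sketch below works them out directly from the numbers as spoken; the variable names are hypothetical, the 40,000 citation total is treated as a round figure, and the ratios are plain averages that say nothing about how output is distributed across individuals.

```python
# Figures as quoted in the talk for Communication/Communication Studies.
faculty = 1546
article_authors = 997
articles = 5214
cited_authors = 893
citations = 40_000          # quoted as a round number in the talk
book_authors = 722
books = 1602                # over an eight-year period

# Authors who published articles but received no citations.
authors_without_citations = article_authors - cited_authors

# Simple averages derived from the quoted totals.
share_publishing = article_authors / faculty
articles_per_author = articles / article_authors
citations_per_cited_author = citations / cited_authors
books_per_book_author = books / book_authors

if __name__ == "__main__":
    print(f"Authors without citations: {authors_without_citations}")
    print(f"Share of faculty with articles: {share_publishing:.1%}")
    print(f"Articles per publishing author: {articles_per_author:.2f}")
    print(f"Citations per cited author: {citations_per_cited_author:.1f}")
    print(f"Books per book author: {books_per_book_author:.2f}")
```

The subtraction gives 104 authors without citations, which the speaker rounds to 100.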

Gordon Mitchell: Thank you very much, Lawrence. That was very succinct and informative. We also appreciate the tailoring specifically to the communication field—that was excellent. I should make a program announcement and clarification about personnel. Dean Barbara O’Keefe, of Northwestern University, was scheduled to participate in this panel. Subsequently, she informed us that she decided not to participate. We had switched things around to accommodate Lawrence Martin’s schedule, and she expressed a concern that she did not want to participate in what she perceived to be a pro-industry advertisement, and she did not want to provide “cover” for that. I should say, for the record, that we are not receiving any money from Lawrence Martin for the right to appear at today’s event.

[audience laughter]

. . . in fact, we are paying him! We vigorously tried to persuade Dean O’Keefe that in fact, this panel is not a pro-industry advertisement; it is a deliberative event. Lawrence just did a great job getting the conversation started, but now you have the opportunity to deliberate, and we will use the deliberative polling technology to help with that.

Johanna Hartelius: While Gordon is setting this up, the rhythm of this will be: first a question prompt, all of you will be invited to respond; following that, Lawrence Martin and Carolyn Miller will have an opportunity to speak to some of the things in play. Following that, we will re-poll, with the same prompt, to see if there is any difference. Then we will open it up to you, to discuss, not only with our esteemed speakers, but also with each other. Part of the nature of the space is that this is designed to promote conversation between colleagues.

Gordon Mitchell: So the first prompt, this is actually language from the Academic Analytics website; this is their promotional material: “Academic Analytics’ data are useful tools to guide university leaders in understanding strengths and weaknesses, establishing standards, allocating resources, and monitoring performance.” You have a seven-point Likert scale option to respond to that. It is challenging to absorb all the information that Lawrence just shared with us, and look at the one-page handout, and feel like you have a definitive answer at this point. But that is part of the exercise; through deliberation, you might learn more, and we will actually see if there is perhaps change of opinion. So go ahead and click your console if you would like to participate at this point.

Carolyn Miller: There is a question of whether anyone up here [panelists] should click a response. For example, I will not vote.

Lawrence Martin: Well, I certainly am not.

[audience laughter]

Gordon Mitchell: Okay, final call. Let’s see how it turned out.

[Slide NCA_12]

So we have zero percent “Strongly Agree”; 16% “Agree”; 42% “Somewhat Agree”; 11% “Neutral”; 5% “Somewhat Disagree”; 21% “Disagree”; and 5% “Strongly Disagree.” So a very wide spread in terms of where the group is coming down. No one is a “Strongly Agree” at this stage, at least. Let’s open it up to Carolyn Miller and Lawrence Martin for some reflections and comments on what you see in the poll.
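
The poll percentages reported here can be summarized with a short Python sketch. The grouping into an agree side and a disagree side, and the numeric coding of the seven-point scale from 7 (“Strongly Agree”) down to 1 (“Strongly Disagree”), are assumptions made for illustration; the session itself reports only the raw percentages.

```python
# Percentages as read out in the session for the first poll.
poll = {
    "Strongly Agree": 0,
    "Agree": 16,
    "Somewhat Agree": 42,
    "Neutral": 11,
    "Somewhat Disagree": 5,
    "Disagree": 21,
    "Strongly Disagree": 5,
}

# Assumed numeric coding of the seven-point Likert scale (7 = most agreement).
scale = {label: 7 - i for i, label in enumerate(poll)}

total = sum(poll.values())
agree_side = poll["Strongly Agree"] + poll["Agree"] + poll["Somewhat Agree"]
disagree_side = (
    poll["Somewhat Disagree"] + poll["Disagree"] + poll["Strongly Disagree"]
)
mean_score = sum(scale[label] * pct for label, pct in poll.items()) / total

if __name__ == "__main__":
    print(f"Total: {total}%")
    print(f"Agree side: {agree_side}%, disagree side: {disagree_side}%")
    print(f"Mean score on the assumed 1-7 coding: {mean_score:.2f}")
```

Under this coding the mean sits just above the neutral midpoint of 4, consistent with the description of a wide spread that leans toward qualified agreement.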

Carolyn Miller: I guess I would say first off, the 42% “Somewhat Agree” is interesting. The question I would ask—it seems to me that it might make a pretty big difference whether we are evaluating programs, institutions, or individual faculty member productivity. This question is phrased in a pretty global way, and I think I can see that what might be useful for assessing, say, departmental or programmatic productivity, I can see the utility there, whereas most of my remarks, I guess, were directed at this question of individual faculty productivity. I think that might account for some of this.

Gordon Mitchell: Interesting. Lawrence, what do you make of it when you look at the data? Do you think that this is kind of par for the course in terms of how others in academia might see the tool?

Lawrence Martin: Well, I think there are big differences between faculty members and administrators. There are big differences between faculty members in the lettered disciplines and faculty members in the science disciplines. There are big differences across national boundaries. In the UK, individual faculty members have been assessed for the last 25 years; there is a culture of assessment, even in the humanities, where it is much more difficult to assess people when the unit of work is something which is episodic and very substantial. It is much easier to evaluate a chemist, who publishes ten journal articles a year, than someone who may publish a book every three to seven years. I think that the important thing is the verb we chose, which is “to guide.” One of the things that we stress is that we do not make decisions for universities. We do not tell them what to do. We provide comparative information that thoughtful, intelligent people, with an understanding of the discipline, and an understanding of the university, can use to provide context, within which to interpret the performance of their own programs and departments.

Gordon Mitchell: Thank you very much. Let’s hear from the audience. I would be especially interested in someone who was “Neutral,” someone who was “Somewhat Disagree,” “Disagree,” or “Strongly Disagree.” If you were in one of those categories, maybe you could raise your hand, and I’ll recognize you, and then you can explain to Lawrence Martin what your reservations were, and give him a chance to persuade you, to possibly shift your opinion in the re-poll. Any volunteers for that?

Audience Member: I’ll give it a go. I think it would be useful as long as it is provided to us, along with some kind of rhetorical framing to address the university administrator on how to read this. I know when I show my university administrator data on my productivity, he says things like, “Well, how excited should I be by this?”—inviting me to provide the frame. So for me, it is a qualified “Strongly Disagree,” as long as that coaching, that framing, is provided.

Lawrence Martin: I think that is a very important point. One of the problems is if you have a provost who is a chemist, and you tell him, “Listen, I published two papers last year, and three the year before that.” And he’s like, “What the hell are you doing? You know, I published five papers in the last three months!” And they don’t realize that you are in political science, where the counts work differently. I think one of the things that is most important is that we provide discipline-level technical data, so they can say, “Hmmmm, that’s at the level of Berkeley, or MIT,” and that tells them something that helps them put it into context. And I think the new data on career path are important, where you can see that, for most of the lettered disciplines, and most of the science disciplines, it takes about 15 years post-Ph.D. for people to hit their maximum level of productivity. Almost no administrator I have ever met expected the data to be like that. What you’ll see here, what you’ll observe, is that young faculty work crazy long hours trying to get tenure, trying to build their resume, but they have a low level of output despite the tremendous level of effort that they expend. And that may be because they do not have friends that can help them get grants, or help them to get into journals, whatever. In the sciences it’s that they haven’t yet built the big lab. In the lettered disciplines, you know, historians hit their maximum period of productivity 18 years after completion of the Ph.D. One of the things that we found with regard to tenure and the external program review process is that having comparative data prevents people from doing what is really wrong, which is comparing the performance of chemistry, to physics, to biology, to English. In the absence of external comparative data in the discipline, people compare disciplines within the university, and always come to the wrong conclusion.

Carolyn Miller: I just wanted to address the distinction drawn between data and judgment. I think that is a very important one. This happens over and over and over again, and the way that a rhetorician would characterize this, is that there is a stasis shift. It is a very common stasis shift from the stasis of fact or definition to the stasis of quality or judgment, or what Linda Walsh calls “the upward pull” of the stases. It is very easy for us to hand administrators a bunch of numbers and say, “Here are the facts. Here is the information,” and it is very easy for an administrator to make the numbers into a judgment, without providing any additional thought to it.

Gordon Mitchell: Lawrence, that actually brings up something you mentioned before the panel started. You mentioned that the president at Stony Brook University will look at the flower petal and believe that that is all that he needs. Is that right?

Lawrence Martin: Yeah, he thinks that gives him the collective set of information that gives him a very strong picture of what’s happening in a particular discipline at a university.

Gordon Mitchell: Well, what do you think of Carolyn’s comment just now that it’s incomplete?

Lawrence Martin: Well, I think one measure is always incomplete. Our staff used to have a sign that said, “We will become what we measure,” because we will tend to become what we measure. So it is important to measure things where if the metric goes up, in general, we think something good is happening as a result of that. I think that universities know three things. They almost always know the research expenditures. They almost always know how many students they have. And they almost always have an idea of how many degrees that they give. I have never come across a university that knows how many books have been published by their faculty in any given year or in a particular time period. Immediately, if you are locked into the money disciplines, your scholarship is so off onto the side that they don’t bother to even keep a record. So when we go to universities and say, you know, your faculty published 137 books, and 50 of those were in the top presses in the field, I think it really changes the nature of the conversation. I read a study at a UK university very recently, at University College London, a very strong science university. The number one ranked program, when they built an index that they wanted, turned out to be English Language and Literature. And no one had ever seen a comparison of disciplines that had English Language and Literature at the top. Every comparison of disciplines is driven by things that are easy to count—journal articles, grants, and so on. So the fact that this system of measurement compares English programs against other English programs, and physics programs against other physics programs, led to a very rich conversation. The English Language and Literature’s status within the university was significantly affected.

Gordon Mitchell: Let’s have one more question, and then we will re-poll. So be thinking about how you might respond again when we repeat the question.

Audience Member: The reason why I clicked “Disagree” was because I was hesitant about the word “useful”— I don’t really know what it means in the context of this question. And the second thing is, it seems to really rely on a top-down approach to publishing; is there a dichotomy or a false binary between research and teaching, one that highlights quantity to a larger degree than quality? I do not know how you would address that.

Lawrence Martin: I think it is a very real concern, that if you just measure the number of papers, then publishing more papers is better, even if they are not better papers. That is one part of the argument about using citation data, looking at top presses. A book is a book, but if you publish a book in Lawrence’s Backyard Press, and you don’t apply a filter, that’s just as good as publishing at Harvard University Press. I mean, not really. We don’t think we should be in the business of making those judgments. We provide empirical data—where top people publish their books, where top-ranked programs publish their books—how many citations you’re likely to get, if you are someone in communication and you publish in the following 100 journals, so that you, within your discipline, can make informed judgments about whether there are different categorical units that would help you differentiate quality rather than quantity.

Carolyn Miller: One last point, I would just ask that you consider, as you repoll, how much scholarly productivity, in non-traditional forms of production, is not captured by this tool.

Gordon Mitchell: Okay, repolling on the first prompt is now open.

NCA_13

Gordon Mitchell: So Lawrence, it looks like you got a fan!

[audience laughter]

Gordon Mitchell: Let’s go back and see if we can learn something from the difference. Before, “Somewhat Agree” was the dominant category, with 42%. And it’s about the same. But we are also seeing some shifting negative.

Johanna Hartelius: Let me present it somewhat differently than Gordon did. If you changed your mind, and shifted opinion, or recorded it differently this go around compared to the last one, what might have accounted for that, in your own assessment?

Audience Member: I actually shifted a bit more toward “Agree.” I fixated also on the term “useful,” but I think in a different respect. I think that a question within this question is: Do you like your dean/provost? . . .

[audience laughter]

Audience Member: . . . Because I don’t doubt for a second that this is useful to university administrators. If I trust my dean, I like things that are useful to him. But if I don’t trust him, I am scared to death of things that are useful to him. A pertinent question is, and I think this is especially important for humanities and social sciences, so a discipline like ours is: What is your relationship to higher administration and the broader politics of higher education, which frankly have been very adversarial as of late. I don’t think this question asks for normative judgment about whether this will make our lives better or worse. Frankly it’s asking whether our provosts and deans will get use out of it. I don’t know, I don’t see much ambiguity in terms of the “yes,” but whether or not normatively that is a good thing is profoundly context-based.

Gordon Mitchell: Johanna, should we take one more question?

Johanna Hartelius: Yes.

Audience Member: I just wanted to follow up on that real quick. I am from UNC-Chapel Hill, where Republicans have taken over. Although I shifted it up—I agree that this is useful—in line with you. Even though you say, the worst thing you can do is compare across disciplines, that this is meant for a detailed, intra-disciplinary comparison across departments, that sentence is already too long for politics in any state higher education; it’s too big. It’s what Pat McCrory and others will do in North Carolina; they will say “ehh, cultural studies people, two publications. Science industry—39.” They will not make the fine-grained distinctions if there are political gains to be had, no adding any kind of nuanced thing about this product, which, as wonderful as it could be, certainly can be mobilized and utilized for very dire political ends, depending on the political orientation.

Gordon Mitchell: Thank you for that contribution. I think it is a good segue to the next slide, which is focusing on the flower chart. Again, we have language pulled directly from the Academic Analytics promotional material. The prompt is: “Academic Analytics’ ‘flower chart’ affords the viewer a visualization of the overall productivity of the faculty within a given academic discipline, facilitating rapid identification of the strongest and weakest areas in a given academic discipline.”

Johanna Hartelius: We should emphasize something Gordon said in passing there: the language both from the first prompt and the second prompt is not so much ours. It has been excerpted from material produced by Academic Analytics about itself. So these are, as far as we can represent them, the descriptions that the tool provides of itself.

Gordon Mitchell: So this prompt is about the flower chart. Polling is open. Again we have the Likert options. Final call. Okay, let’s see. No one really strongly agreeing or disagreeing. That’s interesting. Lawrence, what do you make of that?

NCA_14

Lawrence Martin: I think that if everybody had a copy of the flower chart for a program that they either knew or didn’t know within the discipline, and were asked to interpret the information, and had an hour to think about the information, and could write a paragraph or two about what they could tell about that program, people would be very surprised how much information they have. I go and do this at universities all the time. I’m an anthropologist. I can look at any discipline and say, for example, your economics program looks as if it is at this sort of stage, and they are usually very surprised. I think it is a graphic that you need to really spend some time on to really understand.

Gordon Mitchell: Thank you. Let’s hear from someone again, who was perhaps on the skeptical half, who has some reservations about the flower chart, and might not agree. Yes, sir.

Audience Member: About the issue of the flower chart being a visualization of the overall productivity of faculty, data don’t say anything without a warrant. Before I hear that a measure has some meaning, which is itself a claim for limitless amounts of productivity within a discipline, we are measuring upwards and upwards, to infinity. So then success is always about being in the top tier.

Gordon Mitchell: So basically, you are saying that the flower chart in some ways brings into the evaluative process a pure quantity metric?

Audience Member: Yes, and an unattainable ideal of productivity.

Carolyn Miller: I would, I guess, argue from anecdote here. Would the flower chart capture a project like the “Virtual Paul’s Cross”?

Lawrence Martin: No. We measure the things that have traditionally been measured by the National Research Council. What we did was to add books, and to go from 41 disciplines to 171 disciplines. We are now in the process of adding book chapters. We have added conference proceedings. We are trying to develop data on citations to books and citations to book chapters. We have sat with chairs, and there is no question the difficult disciplines to measure are things like performance disciplines, architecture for example. And yet the people in those disciplines know exactly how to establish standards; they know who is good, who is less good, and so on. And they can describe them. One of the problems that administrators have always had is that they have asked humanists to figure out how to measure themselves, and that’s not something that they’re so good at. Humanists are good at articulating how they know who to hire, how they know who to promote and tenure. And if you are good at listening to that, you can turn that into a system that you can tabulate data for and make measurements. But the cost for example, in music, the cost to tabulate everything that the performance discipline thinks is important, is probably on the order of $5 million a year. At the moment, that’s not a viable thing. Our solution to that is going to be that we will host a data warehouse, where if people in those disciplines want to help us to develop a categorical system for that performance activity, you can put data in and get data out. We hope to facilitate the exchange of information, but we don’t see a viable way of applying that information ourselves.

Gordon Mitchell: Another audience question or comment?

Johanna Hartelius: Yes, you, sir.

Audience Member: I have a general question for Lawrence, it’s a question about resistance. It seems like with a lot of the evaluation, we have to be careful with this kind of data, because it seems like Academic Analytics is sort of falling out of the sky. At every university that I know of, there is a long, calcified, hard-fought system of evaluation that people have put a lot of thought and effort into, to get what they want. Faculty have done that, administrators have done that, and people don’t want to see their part of that puzzle go away, often to the expense of the larger analytical question. So maybe you could provide a bit of the kind of context that you get when you go to provide this. Even though it might be good, at least I find, even if something is good and accurate, that’s not enough of a warrant to persuade people, especially when they are so heavily invested in a system that is a kind of hodgepodge political conception.

Gordon Mitchell: I would just follow on there to mention something that you mentioned earlier in the panel—you said you made a mistake because you published rankings in the Chronicle of Higher Education. There was resistance that was stimulated as a result of that, is that correct?

Lawrence Martin: Yeah. I had a different business partner than the one I have today. There are really two reasons it was a mistake. First, it focused information on rankings, rather than the rich information available to understand the program. And second was the data just weren’t good enough at that point. The level of accuracy hadn’t reached the point that we are at. We didn’t know that at the time. We thought they were good enough, pretty close to being right. It turns out that it is quite a bit harder to do it. We see great, quick adoption in things like external program review, where it is very useful to help to pick discipline-level peers, people who are doing scholarly work in the areas that we are working in, and doing it at your level. The places where it has been best adopted are where it has been pushed down to department heads and department heads are using the information, and discussing it with their faculty, talking about what they’re doing, where they really are, and whether there are things they can improve on.

Gordon Mitchell: Lawrence, on that point, let me speak to you as department chair. At the University of Pittsburgh, we are having a problem, which is the fact that our dean has said that the chairs can have access codes, but under the proviso that we cannot share the data with our faculty. What is your reaction to that?

Lawrence Martin: I think that it would be very difficult for us to say that every faculty member had full access, but I think for the chair not to be able to . . . I mean most of the chairs at Stony Brook prepare reports based on the data, and discuss those reports at faculty meetings. Faculty are inherently competitive people. You start showing them where they are, compared to where other people are in the discipline, performance just goes up. Steadily. People are, you know, they want to be the best. I don’t want to be critical of deans’ decisions, but it is very tempting to hold this high up in the administration, but universities do not get better because provosts or presidents say that they should. They get better because English gets better; physics gets better; and sociology gets better. And those departments get better because the people in them do more, they are better supported, they are encouraged to be better scholars, they hire better people, they tenure and promote better people. So if you don’t get the information in the hands of the disciplines, unless you are an all-powerful chair, who can just command your faculty—I know what we should be doing, so you should do it—I haven’t met many of those outside of medical schools. We did a study at Oxford University recently. They sat around and discussed the data we had. The faculty said, “You know what, we need to have a discussion about whether we should develop a publication strategy that compares our faculty to colleagues.” Is that a good idea, and if so, what should it be? And they said, “In 900 years, we’ve never had that discussion.”

Gordon Mitchell: Okay, let’s repoll on the flower chart prompt.

NCA_15

Gordon Mitchell: Again, looks like some shifting to the right, but also some to the left. Moderate movement. Let’s actually go to the final prompt. This is another kind of University of Pittsburgh experience question. It is interesting that the Academic Analytics tool itself was designed to measure faculty productivity. But this same tool can be used to assess graduate programs. So, “Should faculty scholarly productivity be the key index of graduate program strength?” Let’s poll on that prompt. As people are tallying their votes, let me also mention that we are learning from our administrators, who are working with Lawrence, that it is on the horizon that you will begin to have different tools that will begin to actually measure graduate program strength independently, using other metrics. But at this current stage, the only data that are in the database have to do with faculty productivity. Is that correct?

Lawrence Martin: Not for yours; others who subscribed later can use a different tool. We do have the outcomes of graduate education module that looks at placements of your graduates, and also their level of scholarship.

Gordon Mitchell: So the fact that we were an earlier adopter is limiting us?

Lawrence Martin: You can upgrade.

Gordon Mitchell: It’s interesting, on that point, I was at a meeting with Vice-Provost David DeJong and I asked him how much it cost, and he wouldn’t answer.

Lawrence Martin: I don’t know how much it would cost. I try to stay out of the business side of this.

Gordon Mitchell: Let’s take a look at the polling results. Lawrence and Carolyn, what do you make of them?

NCA_16

Lawrence Martin: I suppose I could begin. I was a graduate dean for almost twenty years. Faculty scholarly productivity should not be—this is not an Academic Analytics statement, by the way—it should not be, I think it defines the boundary conditions within which graduate education happens. If you have weak faculty, you’re not going to have a great program; but having strong faculty does not mean that you do have a great program. The graduate dean at Berkeley surveyed students from top programs and asked them, “Would you repeat the experience again?” And then they asked, “If you did this again, would you go back to the same program and repeat the process?” And programs that ranked in the top 20 nationally, on faculty performance, when you limited it to people who would do another Ph.D., in composition, what do you think would be the right percentage, people who would be recidivists? I thought about 70%. So the range for top 20 programs ranged from a high of 85% to a low of 15%. Fifteen percent! “Yes I would do it again, but I wouldn’t go there.” So the faculty is the boundary condition for where English graduate education falls. Now the counter-argument, I don’t know how many of you have looked at the National Research Council studies of 2010, but they had multiple ways for evaluation of graduate programs. They had research accomplishments, which is essentially faculty scholarly productivity; they had one based on a regression analysis of the reputation of a program, which turned out to be very similar to faculty scholarly productivity; and then they had a broader one, which included diversity, placement, time-to-degree, and completion rates. And when the faculty were polled on those, faculty naturally disagreed fundamentally with what you said—the only things they cared about were numbers of books, numbers of citations, numbers of grants, numbers of honors and awards. 
They didn’t care at all about graduate completion rates, placement, diversity, and things like that. It’s a very big data sample, and it shows that when people actually say how they want their program to be measured, they come down in favor of the classic metrics on faculty scholarly performance, even though earlier studies show very clearly that great faculty does not a great graduate program make.

Gordon Mitchell: Should we go out to the audience for another comment or question?

Johanna Hartelius: Yes.

Audience Member: It is appropriate for the last question you asked too, and I was curious, forgive me if you already explained this, but I’m curious particularly for the humanities, how you weight the prestige of various journals, the prestige of the presses that are being published in?

Lawrence Martin: So, historically, we haven’t weighted them. We have just now made a tool that allows you to make those judgments then put those values on. And we provide some empirical data on citation rates for people from a certain discipline in different journals, it’s like a discipline-specific impact factor, which is one way of thinking about it, looking at volume of publication, looking at where people from the top 20 programs in the discipline publish, looking at where the top 20% of people in the discipline publish, so we give sets of empirical data that will allow people within the discipline to decide which are the top journals in their discipline. I had an interesting discussion with some political science colleagues, who said, “Everybody in political science agrees on what are the top five journals.” Business people tell you that, economics as well. So I said, “Would you like to see where the best places are for political scientists to publish, if they are trying to get high citation rates?” And they said, “Well, it will be the same.” And it wasn’t. Not one of their top five was one of the top five places to publish if you want to get high citation rates in political science. It turns out that interdisciplinary work attracts much higher citation rates than narrowly focused disciplinary work. The high prestige journals within the discipline tend to be more narrow in focus. So I think it leads to a very interesting discussion about what you are trying to accomplish when you disseminate knowledge through scholarly publication. But I think that is a decision, I don’t think there is a right answer for a discipline; I think people in a university will have different right answers for a discipline, depending on where they are and what they are trying to accomplish. We just provide the information and the tools.

Carolyn Miller: I just wanted to ask Gordon to put on the table what the consequences are for the assessment of graduate programs at Pitt?

Gordon Mitchell: Three of the graduate programs are in the process of being suspended and/or terminated; they are in classics, religious studies and German. There is actually a big kerfuffle between the senior administration and chairs of those departments, and a broader discussion happening.

Carolyn Miller: And those decisions were based on this index, Academic Analytics?

Gordon Mitchell: Academic Analytics’ role is murky. That’s one of the reasons why we are doing this panel; we want to learn more. How about another audience question? Yes.

Audience Member: I wonder how does it work as an intervention tool? It seems like a lot of evaluation happens when we need to terminate programs, when we need to evaluate and figure out where we stand. But I’m specifically struck by how you talk about how awareness can drive a young scholar to achieve certain things. While I agree it is kind of infinite in terms of where you lead, the median is, at least, a good starting place for early scholars to figure out, you know, how to calibrate their workload and their production rate. I just wondered if you could speak to the process of intervention, specifically with regard to awareness?

Lawrence Martin: Yeah, I think that a lot of places now have this “year of Ph.D,” which kind of allows you to set an academic birthdate and look at the career progression. A lot of us have realized the value of setting reasonable standards and telling people, you know, typically in this department, or in this university, by the time you’re 10 years from your Ph.D., you will be in this range of performance activity. Most young faculty don’t get that kind of guidance. Obviously, it’s not that, you know, you are fine if you’ve got 10 articles, you’re dead if you’ve got nine articles. We understand there’s a lot more context. But that kind of intervention is important. You know, I hate to hear that programs are being shut down, and certainly if, I’m sure the data will be much more comprehensive than just the Academic Analytics data. But we do a lot of work modeling programs that get people thinking about creative approaches, you know, will it be strong? Will it be weak? That’s very helpful. I think we have done a lot of work with too small programs that aren’t particularly viable, and should you combine them with other things to enrich opportunities? That’s a helpful kind of intervention. Very few provosts make mistakes if they spend money; they tend to do the best they can with the money they have. By and large, when you are in a growing budget environment, decision-making is less risky. But if you are going to cut things, and some of us are going to have to cut things—funding for higher education is going down— I always thought that, I would rather have a surgeon who has a knowledge of anatomy rather than someone who is cutting blindly, without any kind of information to guide the decision. I’m not saying that this tells you the decision. This is one piece of a complex series of sets of information that he would need to have a proper understanding.

Gordon Mitchell: Okay, let’s repoll. This will be our final poll, to see whether the exchange we just had prompted you to take a second or third look. Well, it looks like a big change; now almost two-thirds “Disagree” or “Strongly Disagree” with the prompt—that’s an interesting ending for us.

NCA_17

Johanna Hartelius: We are almost at the end of the first session, and this is a good time for us to tease a little bit about what we are going to talk about in the next session, which is a conversation about the kinds of language that comes to the departmental level from our deans, our provosts, and so forth higher up in the hierarchy. In session two, we are going to try and take inventory of the kinds of arguments that we, as faculty members, are exposed to and engage in, when it comes to conversations about our productivity. That will be directed toward identification and explication of argumentative fallacies. So we will be drawing upon some of our own expertise in this discipline, and then hopefully directing that cataloging of fallacies into a productive discussion that is responsive—not just, “This is a fallacious argument, for the following reasons,” but “This is a fallacious argument, that may be responded to in the following way . . .” so that we are really equipping ourselves with our own expertise to pursue these challenges in an informed way. I really want to thank Professor Martin and Professor Miller for coming, and thank you all for your participation. We’ll be seeing you in 15 minutes.
