
Thursday, January 12, 2012

Research Works Act - seriously?

I am not a fan of the academic publishing industry, and have written before on the need for more openness in the publishing process. My position is very simple: it is not ethical to force taxpayers to buy access to scientific articles reporting research that the taxpayer already funded.

I am very dismayed at the introduction of the Research Works Act, a piece of legislation designed to end the NIH public access policy and block similar openness initiatives in the future.

Sigh... even in academic publishing, we're socializing the risks and privatizing the gains. Here, I agree completely with Michael Eisen's statement in the New York Times:
 "But the latest effort to overturn the N.I.H.’s public access policy should dispel any remaining illusions that commercial publishers are serving the interests of the scientific community and public."

As this bill was written by representatives taking money from the publishing industry, perhaps we should include lawmakers in that group as well.

Saturday, August 20, 2011

Bayesian truth serum, grading and student evaluations

In a recent post, I examined some proposals for making university grading more equitable and less prone to grade inflation. Currently, professors are motivated to inflate grades because high grades correlate with high student evaluations, and those evaluations are often the only available metric of teaching effectiveness. Is there a way to assess professors' teaching abilities independent of the subjective views of students? Similarly, is there a way to get students to provide more objective evaluation responses?

It turns out that one technique may be able to do both. Drazen Prelec, a behavioral economist at MIT, has a very interesting proposal for motivating people to give truthful opinions even when they know their opinion is a minority view. In this technique, awesomely named "Bayesian truth serum"*, each respondent gives two pieces of information: first, their honest opinion on the issue at hand, and second, an estimate of how they think other people will answer the first question.

How can this method tell whether you are giving a truthful response? The algorithm assigns more points to answers that are "surprisingly common", that is, answers that turn out to be more common than collectively predicted. For example, let's say you are being asked which political candidate you support. A candidate who is chosen (in the first question) by 10% of the respondents, but predicted as being chosen (in the second question) by only 5% of the respondents, is a surprisingly common answer. This elicits more truthful opinions because people systematically believe that their own views are unusual, and hence underestimate the degree to which other people share, and will predict, their true views; as a result, honest answers tend to end up surprisingly common.
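For the programmers in the audience, here's a rough sketch of that scoring in Python. It's a simplified reading of Prelec's scoring rule as I understand it (an information score plus a weighted prediction score); the function name, the default weight of 1, and the small epsilon guard are my own choices for illustration, not anything taken from the paper.

import numpy as np

def bts_scores(answers, predictions, alpha=1.0):
    """Simplified Bayesian-truth-serum scoring (illustrative sketch).

    answers     : length-n array of each respondent's chosen option (0..m-1)
    predictions : n-by-m array of each respondent's predicted distribution
                  of answers over the m options (each row sums to 1)
    Returns one score per respondent: information score + alpha * prediction score.
    """
    answers = np.asarray(answers)
    predictions = np.asarray(predictions, dtype=float)
    n, m = predictions.shape
    eps = 1e-9  # avoid log(0)

    # Observed frequency of each answer
    actual = np.bincount(answers, minlength=m) / n
    # Geometric mean of everyone's predicted frequency for each answer
    predicted = np.exp(np.log(predictions + eps).mean(axis=0))

    # Information score: high when your own answer is "surprisingly common",
    # i.e. more common in reality than the crowd collectively predicted
    info = np.log((actual[answers] + eps) / (predicted[answers] + eps))

    # Prediction score: how well your prediction matches the observed
    # distribution (a negative KL divergence), rewarding meta-knowledge
    pred = (actual * np.log((predictions + eps) / (actual + eps))).sum(axis=1)

    return info + alpha * pred

In the candidate example above, the answer is chosen by 10% of respondents but collectively predicted at only 5%, so anyone who picked that candidate gets an information score of roughly log(2), the reward for a surprisingly common answer.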

But, you might reasonably say, people also believe that their views are reasonable and popular. They are narcissists who assume that others tend to believe what they themselves believe. It turns out that this is handled by a corollary of the Bayesian truth serum. Let's say that you are evaluating beer (as I like to do), and let's also say that you're a big fan of Coors (I don't know why you would be, but for the sake of argument...). As a lover of Coors, you believe that most people like Coors, but you also recognize that you like Coors more than most people do. You therefore adjust your estimate of Coors' popularity downward to account for this, underestimating the popularity of Coors in the population.

It also turns out that this same method can be used to identify experts: the people with the most meta-knowledge also tend to be the people who provide the most reliable, unbiased ratings. Let's go back to the beer tasting example. Some characteristics of a beer might taste very good but reflect poor brewing technique, such as a lot of sweetness. Conversely, some properties of a beer are normal for a particular process but seem strange to a novice, such as yeast sediment. An expert will know that too much sweetness is bad and the sediment is fine, and will also know that a novice won't know this. Hence, while the novice will believe that most people agree with his opinion, the expert will accurately predict the novices' opinions.

So, what does all this have to do with grades and grade inflation? Glad you asked. Here, I propose two independent uses of BTS to help with the grading problem:

1. Student work is evaluated by multiple graders, and the grade the student gets is the "surprisingly common" answer (there's a small sketch of this after the list). This motivates graders to be more objective about the piece of work. We can also find the most expert graders by sorting them according to meta-knowledge. Of course, this means throwing more resources at grading in an already strained system.

2. When students evaluate the professor, the evaluation also uses BTS (an opinion plus a prediction of how classmates will answer) in an attempt to elicit more objective responses.
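To make proposal 1 a little more concrete, here is a toy sketch (again in Python, and again with made-up names and numbers rather than a worked-out policy): each grader submits a grade plus a prediction of how the other graders will grade the same work, the awarded grade is the most surprisingly common one, and graders are ranked by how well they predicted their peers.

import numpy as np

def grade_by_bts(grades, predictions):
    """Toy sketch of proposal 1: each grader assigns a grade (0..m-1) and
    predicts how the other graders will grade the same piece of work.
    The awarded grade is the most 'surprisingly common' one, and graders
    are ranked by meta-knowledge (how well they predicted their peers)."""
    grades = np.asarray(grades)
    predictions = np.asarray(predictions, dtype=float)
    n, m = predictions.shape
    eps = 1e-9

    actual = np.bincount(grades, minlength=m) / n                # observed grade frequencies
    predicted = np.exp(np.log(predictions + eps).mean(axis=0))   # collectively predicted frequencies

    # The grade that exceeds the collective prediction by the largest margin
    awarded = int(np.argmax(np.log((actual + eps) / (predicted + eps))))

    # Rank graders by how close their prediction is to the observed grade
    # distribution (higher is better)
    meta = (actual * np.log((predictions + eps) / (actual + eps))).sum(axis=1)
    ranking = np.argsort(-meta)
    return awarded, ranking

# Five hypothetical graders on a three-point scale (0 = C, 1 = B, 2 = A)
grades = [1, 2, 1, 1, 2]
predictions = [[0.1, 0.7, 0.2],
               [0.1, 0.6, 0.3],
               [0.2, 0.7, 0.1],
               [0.1, 0.8, 0.1],
               [0.2, 0.6, 0.2]]
awarded, ranking = grade_by_bts(grades, predictions)

In this toy run the minority grade (A) is awarded: only two of five graders gave it, but the group collectively predicted it would be much rarer than that, so it is the surprisingly common answer. The second grader, whose predicted split is closest to what actually happened, tops the expert ranking.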

* When I become a rock star, this will be my band name.

Friday, August 5, 2011

Proposed changes to IRBs

Institutional review boards (IRBs) are committees formed within universities and research organizations. Their job is to review proposed research that uses human subjects, evaluating it for ethical treatment of the human participants. It's an important job given the rather spotty history we have with ethical research (see here, here and here among others).

However, there is a wide range of activities that count as human subjects research, ranging from experimental vaccine trials to personality tests, from political opinion surveys to tests of color vision. Currently, all of this research is divided into two groups: "regular" human subjects research, which is subject to a full review process, and "minimal risk" research, which is subject to a faster review process. Research is defined as minimal risk when it poses no more potential for physical or psychological harm than ordinary activities of daily life.

My research falls into the minimal risk category. My experiments have been described by several subjects as being "like the world's most boring video game". Beyond being boring, they are not physically harmful, and there is no exposure of deep psychological secrets either. No matter. Each year, researchers like me fill out extensive protocols describing the types of experiments they propose to do, detailing all possible risks, outlining how subject confidentiality will be maintained, and so on. And each participant in a study (each time s/he participates) receives a 3-4 page legal document explaining all of the risks and benefits of the research, which the subject signs to give consent.

This does seem to be overkill for research that really doesn't pose any sort of physical or psychological threat to participants, and I applaud new efforts to modernize and streamline this process. (Read here for a great summary of the details. Researchers: you can comment until the end of September; the Department of Health and Human Services is soliciting opinions on a number of issues.)

Among the changes are moving minimal risk research from expedited review to no review, and eliminating the need for physical consent forms (a verbal "is this OK with you?" will suffice). These are both good things that would improve my life substantially. However, I believe that standardizing IRB policies across the country would do the most good.

I am currently at my 4th institution and have seen as many IRBs. Two of them have been entirely reasonable, requiring minimal paperwork and approving minimal risk research across the board. The other two, however, have been less helpful. As Tal Yarkoni points out, "IRB analysts have an incentive to be pedantic (since they rarely lose their jobs if they ask for too much detail, but could be liable if they give too much leeway and something bad happens)". However, I think it goes beyond this. In some sense, IRBs feel productive when they can show that they have stopped or delayed some proportion of the research that crosses their desks.

I have had an IRB reject my protocol because they didn't like my margin size, didn't like my font size, and didn't like the cute cartoon I put on my recruitment posters (apparently cartoons are coercive). I've had an IRB send an electrician into the lab with a volt meter to make sure my computer monitor wouldn't electrocute anyone. My last institution did not approve an experiment that was a cornerstone of my fellowship proposal because it required data to be gathered online (this is very common in my field) and I couldn't guarantee that no one outside my approved age range (18-50) was doing the experiment. Under the current rules, I couldn't simply use my collaborator's IRB approval, as every participating institution needs to approve a protocol. However, another of the proposed changes would require only one approval.

I'm very optimistic about these proposed changes... let's hope they happen!