Hammersley, M. 'Measurement in ethnography: the case of Pollard on teaching style', pages 49–60
Measurement can be found in ethnography if we define it suitably. The 'standard model' involves the development of specific categories which are exhaustive, mutually exclusive, and can be located on a particular scale. The rules should be 'explicitly stated, should specify concrete indicators, should be unambiguous, and should be applied in the same way to all of the data' (49).
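As a concrete illustration of that ideal (a sketch of my own, not from the article), a coding scheme can be treated as a set of explicit rules and checked mechanically for exhaustiveness and mutual exclusivity. The categories and indicator fields below are invented for the purpose:

    # Minimal sketch of the 'standard model': explicit rules, concrete indicators,
    # and a check that every observation falls into exactly one category.
    RULES = {
        'traditional': lambda obs: obs['whole_class_teaching'] and not obs['pupil_choice'],
        'progressive': lambda obs: obs['pupil_choice'] and not obs['whole_class_teaching'],
        'mixed':       lambda obs: obs['pupil_choice'] == obs['whole_class_teaching'],
    }

    def classify(obs):
        matches = [cat for cat, rule in RULES.items() if rule(obs)]
        if not matches:
            raise ValueError(f'scheme is not exhaustive for {obs}')
        if len(matches) > 1:
            raise ValueError(f'categories are not mutually exclusive for {obs}')
        return matches[0]

    print(classify({'whole_class_teaching': True, 'pupil_choice': False}))  # traditional

The point is only that rules 'explicitly stated' in this sense could be applied in the same way to all of the data by any coder.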
However, there is a more general issue: 'the linking of abstract concepts to particular data; and this problem faces ethnographers as much as it does any other social researcher'.
There is often 'some vagueness' here, and problems arise, for example, in the recent article on teaching styles by Pollard. This is an 'exemplary piece', with an explicit theory and a wide range of data. The theory is actually a complex one, presented 'in softened terms', using words like 'influence', 'implication' and 'reinforcement' rather than causality or determination, although he does argue that the working consensus in a class is 'determined by the coping strategies of teachers and pupils'.
Classroom regimes and teaching styles (51) are described as located on a 'progressive–traditional dimension', and individual teachers are allocated to positions on it using different sorts of evidence. This ranges from documentary information about teachers and classes, through observer description (often as a summary over time), more 'frequency-specified observer description' and time-specific observer description, to quotations from participants' accounts used to document perspectives and describe events. These are 'probably a representative sample of the kinds of information ethnographers usually employ' (53), but there are problems.
Accuracy means asking whether teachers and pupils actually did what is described, whether transcriptions are accurate, and whether the reported verbalisations have in some way been authored by the researcher. Possible inaccuracies obviously increase when these are recorded in field notes rather than by tape-recording. Non-verbal elements, however, are 'more difficult'. We require 'accurate portrayal of patterns of physical movement', and again field notes, the difficulty of recording rapid interaction, and the limits of memory all present problems.
Ethnomethodologists point to some particular problems, especially when researchers 'attribute intentions and attitudes on the basis of what people say or do' (54). Observers might disagree, and it is not usual to reveal the grounds for these interpretations. Other evidence is easier to assess, as when the conversations of participants support researcher interpretations. However, participants' accounts used as descriptions of events are particularly open to 'threats to accuracy', since here we are relying on others, not just the researcher. Multiple accounts, as in triangulation, are an acceptable strategy, 'but it is no panacea' (55).
There are further problems with generalisability, which involve researcher judgements about typicality. However, 'informal estimates of frequency are open to… large errors' (55). Providing frequencies can help, but we still need to know exactly how these were obtained, over what sort of period, and how the data (contacts between teachers and pupils in this case) were actually identified. In turn this requires some specification of the identification criteria and of how they were applied (e.g. live coding or video recording). In other words, we have the usual problems in devising category schemes, and we need information about how the most common variations ('definition, coding procedure and practice' (56)) were overcome.
Content validity refers to the extent to which the evidence exhausts all the components of the definition (in this case, of 'traditional' or 'progressive'). We need 'adequate definitions of key concepts'. If the researcher does not provide them, we could try to explicate them ourselves or borrow from other accounts, but the issue is how the researcher has defined them.
Construct validity is 'the extent to which an indicator accurately measures the concept or component of the concept it is supposed to measure'. Do variations in the indicator reflect actual variations in the variable? Definitions again are required. Empirical frequencies may not be very valid: for example, one aspect of an overall style (learning the tables, say) might be frequently seen, but not the other components. Other factors may also influence particular frequencies, for example a forthcoming visit by the inspectors 'or even the presence of the observer' (57). Another problem is the ad hoc use of indicators, and we might need to look at how stable interpretations of the data actually are, across observers for example. There might be random variation as well as systematic error.
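One way to make the question of stability across observers concrete (an illustration of my own, not Hammersley's) is a chance-corrected agreement statistic such as Cohen's kappa. The codings below are invented:

    from collections import Counter

    def cohens_kappa(codes_a, codes_b):
        # Chance-corrected agreement between two observers' codings.
        n = len(codes_a)
        observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
        freq_a, freq_b = Counter(codes_a), Counter(codes_b)
        # Agreement expected if both coded independently at their own base rates.
        expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
        return (observed - expected) / (1 - expected)

    # Two hypothetical observers coding ten episodes as progressive (P) or traditional (T).
    observer_1 = ['P', 'P', 'T', 'P', 'T', 'T', 'P', 'P', 'T', 'P']
    observer_2 = ['P', 'T', 'T', 'P', 'T', 'P', 'P', 'P', 'T', 'T']
    print(cohens_kappa(observer_1, observer_2))  # approx. 0.4: far from stable interpretation

Note that even high agreement between observers would not rule out systematic error shared by both.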
Overall we need to ask separate questions: whether descriptions and explanations are correct; whether the researcher has taken the best precautions and made the best checks to maximise validity; and whether the researcher has provided us with the necessary information about these precautions and checks. Usually, the researcher is not in a position to offer certainty on any of these. The point is that they need to be addressed in future research.
Scarth, J. and Hammersley, M. 'Some Problems in Assessing the Closedness of Classroom Tasks', pages 70–84
There have been a variety of ways to define teachers' questions and examination questions. Usually, most of the questions turn out to be closed, requiring 'low-level cognitive operations, notably memory' (70). There are methodological problems in measuring closedness, however, although they have not been given much attention; many researchers seem to underestimate them. The problems turn on: identifying the task; specifying what 'closed' means; the reliability and validity of the categories; and the weighting of tasks.
We have to identify tasks first, specifying rules which help decide whether a task has occurred and what that task might be.
There are particular problems with oral questions: if a teacher asks a question, the resulting exchange can offer different possibilities [described in a diagram on 72, ranging from pupils' answers, silences, or requests for clarification or challenges, which in turn produce a variety of teacher responses, from overt acceptance to answer guidance and negotiation]. Questions run into other questions and go through cycles: is every teacher elicitation and answer a separate question? There is a problem in identifying what an elicitation is, or matters such as the acceptance of answers: 'words and syntax are an imperfect guide to pragmatic function' (73); for example, questions might be rhetorical, and silences can elicit answers. Sometimes teacher and pupil talk overlaps. Some questions might elicit a sequence of answers, so it is not clear whether this is one task or multiple tasks. Sometimes one question is embedded within another. Repetitions and requests for clarification are not easy to code. Many other questions and answers go on off-task.
Overall there might be considerable inconsistency in what is recorded [an example follows], and different rules will produce different numbers of questions and answers [in the example cited there could be either 20 or three questions, depending on definitions]. Clearly the same problems arise in attempts to classify questions as open or closed, and in judging how significant any differences in scores might be. The same goes for attempts to classify written work [with an example]: there could be either 42 or seven tasks.
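To see how much hangs on the identification rules (a toy illustration of the same point, not the authors' own example), consider coding the same invented exchange under two different segmentation rules:

    # Hypothetical coded exchange: (speaker, move). 'elicit' starts a new question;
    # 'reelicit' repeats or rephrases the question already on the floor.
    transcript = [
        ('T', 'elicit'), ('P', 'answer'), ('T', 'reelicit'), ('P', 'answer'),
        ('T', 'reelicit'), ('P', 'answer'), ('T', 'elicit'), ('P', 'silence'),
        ('T', 'reelicit'), ('P', 'answer'),
    ]

    # Rule A: every teacher elicitation, including repeats, is a separate question.
    rule_a = sum(1 for s, m in transcript if s == 'T' and m in ('elicit', 'reelicit'))

    # Rule B: only moves that initiate a new question count; repeats belong to the same task.
    rule_b = sum(1 for s, m in transcript if s == 'T' and m == 'elicit')

    print(rule_a, rule_b)  # 5 2: the task count is an artefact of the segmentation rule

Nothing in the exchange itself settles which count is correct, which is exactly the problem with the 20-versus-three example above.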
Closedness is difficult to define. Does it reflect teacher expectations about answers, or the cognitive strategies pupils use? These do not always match, although they are quite often assumed to. Even taking teacher expectations, a closed task can be one where the teacher assumes there is a single right answer, one where the task is intended to demand lower-level cognitive activity, or even one where a clear indication is given to pupils about what would count as a right answer.
With category systems, the coding tasks are equally difficult. How many categories should be used? Should they be scaled or just used for classification? Are the categories 'clearly defined, mutually exclusive and exhaustive, so that each and every task is assignable, unambiguously, to one and only one category' (78)? Otherwise reliability suffers. Each task also has to be represented in an accurate way, as in construct validity.
'Few coding schemes for classroom tasks approximate these ideals' (79). Getting information about teacher intentions or the cognitive operations of pupils is clearly difficult, and these cannot be 'read off unproblematically' from what is observed. Contextual information might be important: in a revision session, for example, teachers might not explicitly require remembering, but will imply it. There is sometimes confusion between the psychological operations which pupils perform and the logical status of the information elicited: for example, if closed questions require recall and open ones require explanation, the problem is that explanations can also be recalled. Surface forms tell us nothing about psychological processes. We could ask teachers and pupils to comment and indicate their intentions, although this 'has rarely been used' (79).
There is always 'indeterminacy' with things like teacher intentions. They might actually change in response to pupils, especially if, as in Doyle and Carter, pupils try to simplify tasks and make them less risky. Demands are perhaps best seen as negotiated, but if this is so, they are difficult to code because they might change. Actual interaction might be emergent. Teachers themselves might not apply their own categories consistently, and their categories might differ from those of the researcher. If we are looking at what pupils do, there is another indeterminacy, because they might be answering questions or completing tasks in different ways. It is rare for these matters of construct validity even to be addressed in classroom research.
Tasks might need to be weighted rather than just counted. In examinations, they might attract different marks. Ranking them in terms of different time allocations is more problematic. Oral questions are particularly difficult, and problems of weighting have typically not been addressed. Nor do pupils and teachers necessarily see the importance of tasks as adequately reflected in their official weight. To ignore these problems 'is to weight all tasks equally', however (81).
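A toy arithmetic illustration of that last remark (invented numbers, not from the paper): treating every task as one unit is itself a weighting decision, and a closedness score can change substantially once official marks are used as weights.

    # Hypothetical exam paper: (is_closed, marks). Four short closed items, one long open essay.
    tasks = [(True, 2), (True, 2), (True, 2), (True, 2), (False, 12)]

    by_count = sum(closed for closed, _ in tasks) / len(tasks)
    by_marks = sum(marks for closed, marks in tasks if closed) / sum(m for _, m in tasks)

    print(f'{by_count:.0%} closed by count, {by_marks:.0%} closed by marks')
    # 80% closed by count, 40% closed by marks

The marks-based figure is no more 'correct' than the count, but choosing not to weight is still a choice.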
More attention needs to be given to these problems, even if 'effective solutions to most of them' are not available. Nevertheless we need 'systematic investigation of these problems, and of strategies for dealing with them' (82).