HOMEWORK 3 GRADING GUIDELINES
These guidelines describe how to evaluate and communicate your
evaluations of students' homework problems. For general instructions
on how to record comments and scores, and how to communicate concerns
back to the instructor, please see
http://classes.engineering.wustl.edu/cse547/ehomework/grading.html
GENERAL GUIDELINES AND EXPECTATIONS
(1) Every problem should receive not only a numerical score (see below
for the scale) but also polite, informative comments explaining any
shortcomings that led to a reduction in score. These comments must be
made, and the score must be recorded, as comments on the PDF itself.
The score should also be recorded on the spreadsheet included in your
Box folder.
(2) Follow the per-problem guidelines when assessing if errors or
omissions in the solution are minor, moderate, or major for scoring
purposes. Within those guidelines, I expect you to use good judgment,
and to be consistent across assignments. CONSISTENCY BETWEEN
ASSIGNMENTS AND BETWEEN GRADERS ON THE SAME PROBLEM IS VERY IMPORTANT.
(3) Do not take off points for careless, trivial mistakes, but do note
them if you see them. Do not nitpick grammar, spelling, handwriting,
or teeny weeny notation bugs. If *in the aggregate* these issues make
a solution very hard to read or understand, please leave a note to us
about it on the spreadsheet, but try to grade it as best you can.
(4) Solutions that do not follow the same approach as the instructor's
solution are fine, provided they are correct and well-justified.
There is often more than one way to solve a problem. If you see a lot
of solutions taking an approach that you consider to be unusual or
questionable, please let the instructor know ASAP.
(5) A proposed construction must always be justified. We have
established class norms for what constitutes an adequate
justification, which includes some sort of formal proof. Within this
framework, however, there is a lot of wiggle room as to how much
detail is sufficient. (The instructor's solutions tend to be somewhat
overly detailed!) A good guide is whether you would find a given
assertion surprising -- if so, and it is not backed up by
justification, the solution has an issue. "Obvious" observations need
not be justified. Induction on string length or number of moves is
often a good way to formalize an intuition about an automaton's
behavior on strings, but the proof may be persuasive even without it.
(6) If you cannot understand the approach (right or wrong) that the
student took to solving the problem after five minutes of trying,
please ask one of the TAs to look at it. If they can't help either,
leave a note on the spreadsheet for the instructor to look at it.
GRADING SCALE
Each problem will receive a score of between 0 and 6. No fractional
scores should be given. Problems with multiple parts should receive a
separate score for each part.
The meaning of the scores is as follows:
6 - close to perfect. Trivial errors or omissions that are
clearly typos or math errors (e.g. the odd off-by-one)
and are easily fixed are OK. Trivial notational errors are
also OK for a "6" solution.
5 - minor errors or omissions that are more than just typos and
indicate wrong thinking or missing pieces, in an otherwise
high-quality solution.
Examples of minor errors include important confusion between
states and sets or strings and classes; leaving out a small,
non-core part of the transition function for a defined automaton;
or omitting a minor case in a proof.
4 - a lot of minor errors, or at least one moderate error. The
solution is salvageable but clearly has issues.
Examples of moderate errors include leaving out an important case
in an otherwise correct proof; translating a clearly correctly
described intuition into a broken but salvageable formal
construction; leaving off the easy direction in a proof that
requires an iff; or giving a proof where the logic is generally OK
but makes an important unjustified or wrong step.
3 - multiple moderate errors or a serious error -- the solution is
broken or seriously unjustified.
Examples of serious errors include giving a good intuition and
construction, but little to no proof of correctness; leaving off
the hard direction in a proof that requires an iff; or giving a
very broken construction or incomplete justification that implies
wrong ideas about how to solve the problem.
2 - multiple serious errors; a very broken solution
Giving a good intuition but offering *no* supporting construction
or proof would be a 2, unless the missing details are super-obvious
(in which case, it would be a 3). Severely wrong intuition may be
a 2 or 3 depending on how far the solution gets before failing.
1 - hopeless / incoherent solution
Wrong intuition with no supporting construction or justification is
at most a 1.
0 - nothing submitted, or no attempt to solve the general problem
(as opposed to one or a few specific examples)
I would advise the graders for a given problem to TALK TO EACH OTHER
ensure that everyone has roughly the same understanding of how to
apply the grading scale.
I will provide some guidance for each problem on common minor,
moderate, and major errors.
SUGGESTED APPROACH
Start by looking for the intuition or, if one is not given, the
construction. Is it similar to something you recognize? If not, is it
obviously wrong or incomplete? Different solutions may give quite
different levels of detail -- if you can figure out what is going on,
this is perfectly OK.
Look for the correctness proof. If it tries to prove equivalence of
two things, are all the steps iff, or does the proof have two
directions? If it tries to justify the correctness of an automaton
construction, does the justification apply to arbitrary input strings?
Overall, are the proofs readable? Are any important assertions made
without justification?
You will probably find many solutions that are incomplete or go astray
at some point.
COLLABORATION POLICY ISSUES
If you see two or more solutions that look suspiciously similar,
please privately let the instructor know the names of the students
involved. Note that I'm looking for similarities that are beyond what
you might expect based on following the course collaboration policy,
which allows students to discuss their approach verbally in detail.
If you want me to act on a possible case of cheating, the evidence
needs to be fairly convincing.
------------------------------------------------------------------
* PER-PROBLEM GUIDELINES
(1)
For each of these, give at most a 3 if the wrong conclusion is reached
(i.e. the student falsely claims that the problem is decidable/undecidable).
You may opt to give less based on how incomplete the resulting solution is.
For each one, I need to see an appropriate justification/proof.
(a) This is the only one that could be done with Rice's Theorem, but I
explicitly said not to -- give at most a 2 if they do.
If they follow the right general approach, score based on whether you
can understand their construction. Several of the solutions I saw had
very confusing language -- it was not immediately clear if the
constructed TM accepted a regular language in one case and an
irregular language in the other. This would be a moderate error if
they get it wrong.
(b) My solution requires enumerating all the inputs up to k symbols,
and arguing the number of steps needed to determine if the TM will
ever pass the kth cell. If they do the second part but not the first,
that is a moderate error.
Don't be too harsh if they overcount the number of steps, but do point
it out. Definitely don't take off if they are off by a small constant
factor, but it is a minor error to be off by a lot (a factor of |Q| or
|Gamma|) in an othewise correct argument. Do not be picky about
whether the input starts at the first cell or the second cell of the
tape.
(c) Not much to say on this one -- make sure the argument is correct
and complete, and score 2-4 otherwise depending on severity of issues.
(2)
(a) If they claim the language is undecidable, that is a major error.
To show decidability, they need to bound the number of steps needed
to decide if the TM ever moves left. There are three cases of interest:
- does the TM halt without moving left?
- does the TM loop on a single cell without halting?
- does the TM move right forever without halting?
The interesting work is to bound the steps in the second and
*especially* the third case. If they don't address the second case at
all but consider only rightward moves throughout, that is a moderate
error.
For the third case, there are multiple arguments that work. The key
thing is that, after moving off the input, the TM cannot move right
too often (|Q| times) without looping. It is also correct to bound
the total number of moves *after* moving off the input by |Q||Gamma|.
If they argue this case invalidly, it is a moderate error.
(b) If they claim anything other than the regular language, that
is a major error.
If they don't show both directions (i.e. fail to prove FA -> TM),
that is a minor error.
The interesting part is TM -> FA. There will be different levels of
detail provided. The essential thing is to understand how the move
function of the TM maps to that of the FA.
If they don't handle "S" moves of the TM that do not move the head,
that is a moderate error. It is possible to simplify the TM to get rid
of these moves, or to describe the FA's delta in terms of the set of
rightward moves of the TM.
If they don't handle TMs that may accept or reject before the end
of the input, that is a moderate error.
If they don't handle TMs that may accept only after moving off the end
of the input (i.e. by having the FA be able to transition out of its
accepting state and/or nondeterministically guess when the input
ends), point out the issue, which I sent you in email, but do not take
off, as I didn't call anyone on it until after the homeworks came in.
(3) The only way I know to prove this one is using enumerating
machines. For such proofs, make sure they describe a generic
enumerator for the RE language, along with a method to sample its
output and enumerate only samples that are greater (lexicographically)
than the last sample.
It's not OK to enumerate only strings that are *longer* than the
previous string, because length does not necessarily correspond to
the fixed lexicographic order. This is a moderate error.
(4) The solutions I've seen are based on testing whether the tiles
all have top > bottom, all have bottom > top, or have a combination (in
which case you can easily combine at most two tiles to get a matching
top and bottom sequence).
If they omitted the case that there exists one tile with top = bottom
(in which case the problem is trivially solved by this tile), comment
but don't take off.
(5) The essential parts of a solution are:
1. describe how the RAM's memory is simulated by the TM
2. describe how to implement each instruction
3. describe how to set up the simulation given the input, and
how to produce the desired output on halting.
Leaving out 1 is a moderate error; leaving out 2 is hopeless. If they
leave out 3 but it is tolerably clear that it could trivially be
added, that is a minor error.
For #1, they may refer to the construction we did in class to support
multiple, variable-length cells on a single tape. There are several
ways to do it. If they screw up some details involving counting #'s
correctly or similar, that is a minor error.
For #2, the level of detail does not need to be very high. What I
wrote in my solution is adequate. If they give much more detailed
descriptions involving states and marking, do not take off as long as
it is correct.
A common source of confusion is how to store the instruction sequence
(program) and the PC. Because the instruction sequence is of constant
size, both the sequence and the PC can be encoded in the TM's finite
control and need never appear on any tape. But it's OK if they choose
to store the PC on a tape, or to write the instructions to a tape
after the TM starts. What they cannot do is assume that the
instruction sequence is part of the input or magically appears on a
tape when the TM starts -- that is a moderate error.