This is an interesting discussion. The issue will arise again in a
couple of weeks. As Jacob suggests, let's discuss it in detail at
that time. But I will note that I have been rather pleased with the
clarity with which people have been discussing it right now.
Cheers,
JS.
From: "Jacob M. Kline" <jkline(a)fas.harvard.edu>
Sender: gov1000-list-admin(a)fas.harvard.edu
To: Michael Richard Kellermann <kellerm(a)fas.harvard.edu>
cc: Jason Lakin <jlakin(a)fas.harvard.edu>, <gov1000-list(a)fas.harvard.edu>
Subject: Re: [gov1000-list] sample v. population confidence intervals
Date: Thu, 25 Sep 2003 14:15:37 -0400 (EDT)
You really don't need to use n-1; not the least of the reasons is
that the sample is quite large and the difference is negligible. Let this
go for now, because we have not discussed it in lecture and because
there are many more important things to study in detail.
Michael Richard Kellermann writes:
Hi -
I buy that argument for most of the things for which we would want to
estimate the mean and standard deviation of the sample in order to
construct a confidence interval. Say we were using a thermometer score
for feelings about the president (leaving aside the problems with such a
measure). In that case there are many different sets of n responses
that would yield the same estimate for the mean. To calculate the
confidence interval we would also have to estimate how tightly the
particular sample that we drew clustered around the mean. Since we would
be estimating two separate parameters, we burn an additional degree of
freedom.
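
A minimal sketch of the thermometer case, assuming two made-up samples of scores (the numbers are invented for illustration, not taken from any survey). Both samples give the same estimated mean, but they cluster around it very differently, so the spread has to be estimated as a second quantity:

```python
import statistics

# Two hypothetical samples of thermometer scores (0-100); values are invented.
sample_a = [48, 49, 50, 51, 52]
sample_b = [10, 30, 50, 70, 90]

# Same estimated mean for both samples.
mean_a = statistics.mean(sample_a)
mean_b = statistics.mean(sample_b)

# statistics.stdev uses the n-1 divisor (sample standard deviation).
sd_a = statistics.stdev(sample_a)
sd_b = statistics.stdev(sample_b)

print("means:", mean_a, mean_b)
print("sample sds:", sd_a, sd_b)
```

Knowing the mean alone does not pin down the standard deviation here, which is the sense in which two separate parameters are being estimated.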
In this case, however, since the responses are approve/disapprove
(success/failure), there is only one possible set of n responses for any
given estimate of the proportion of approvals. It seems to me that the
estimated sample variance follows directly from the estimated mean and
does not have to be estimated separately. I still don't see why we would
need to ever use n-1 for this particular type of question.
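
A rough sketch of the approve/disapprove case, assuming a hypothetical sample of 1000 responses with 600 approvals (both numbers are made up). Once the proportion is fixed, the variance computed with divisor n is exactly p(1-p), and the n-1 version differs from it only by the factor n/(n-1):

```python
def binary_variances(approvals, n):
    """Return (p_hat, variance with divisor n, variance with divisor n-1)
    for a 0/1 sample containing the given number of approvals."""
    p_hat = approvals / n
    responses = [1] * approvals + [0] * (n - approvals)
    sum_sq = sum((x - p_hat) ** 2 for x in responses)
    return p_hat, sum_sq / n, sum_sq / (n - 1)

# Hypothetical sample: 600 approvals out of 1000 responses.
n = 1000
p_hat, var_n, var_n_minus_1 = binary_variances(600, n)

print("p_hat:", p_hat)
print("divisor n:   ", var_n, "vs p(1-p) =", p_hat * (1 - p_hat))
print("divisor n-1: ", var_n_minus_1)
```

Neither variance needs any information beyond p_hat and n; the only open question is which divisor to use.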
Cheers,
Mike
On Wed, 24 Sep 2003, Jason Lakin wrote:
Mike, et al:
My interpretation of the events in question is that you always should use
n-1 because you never know the actual population mean. However, on the
homework, we were told (at least tonight we were told; it doesn't say this
on the homework) to assume that we did know the population mean, and were
working backwards. In that case, you can use the population sd. The
difference seems to have to do with the presumption about what the
population mean is. If you assume that your estimate is right (a false
assumption, but the one we were supposed to use in the homework), then you
can use n. Otherwise, it's n-1.
However, I would note that I have never heard of anyone actually doing this,
probably because the difference between n and n-1 is so small as to be
irrelevant in general. So most people just round the n-1 to n, and forget
about it. This is what I have learned in the past...
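
To put a rough number on "so small as to be irrelevant", here is a sketch assuming a hypothetical poll with n = 1000 and 60% approval, and a 95% interval using z = 1.96 (all three values are assumptions for illustration, not the homework's figures):

```python
import math

def half_width(p_hat, divisor, z=1.96):
    """Half-width of a normal-approximation confidence interval for a
    proportion, with the variance p(1-p) divided by the given divisor."""
    return z * math.sqrt(p_hat * (1 - p_hat) / divisor)

# Hypothetical poll: 1000 respondents, 60% approve.
n = 1000
p_hat = 0.6

hw_n = half_width(p_hat, n)            # divisor n
hw_n_minus_1 = half_width(p_hat, n - 1)  # divisor n-1

relative_diff = (hw_n_minus_1 - hw_n) / hw_n
print("half-width with n:  ", hw_n)
print("half-width with n-1:", hw_n_minus_1)
print("relative difference:", relative_diff)
```

At this sample size the two intervals differ by well under a tenth of a percent of their width; the gap only becomes noticeable for very small n.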
best
jason