I discovered (the hard way) yesterday that some of the error paths in
execv() in sol2 don't release the execv lock.
sigh.
Patch follows, and will also appear on the web site. (Those of you who
stuck to your own code can ignore this.)
Index: sol2/kern/userprog/runprogram.c
===================================================================
RCS file: /disk/disk0/cs161/CVSREPO/os161/sol2/kern/userprog/runprogram.c,v
retrieving revision 1.14
retrieving revision 1.15
diff -u -r1.14 -r1.15
--- runprogram.c 2002/03/07 22:59:39 1.14
+++ runprogram.c 2002/03/28 22:49:28 1.15
@@ -310,17 +310,20 @@
 	/* make up argv strings */
 	if (strlen(progname) + 1 > ARG_MAX) {
+		lock_release(argdata.lock);
 		return E2BIG;
 	}
 	/* allocate the space */
 	argdata.buffer = kmalloc(strlen(progname) + 1);
 	if (argdata.buffer == NULL) {
+		lock_release(argdata.lock);
 		return ENOMEM;
 	}
 	argdata.offsets = kmalloc(sizeof(size_t));
 	if (argdata.offsets == NULL) {
 		kfree(argdata.buffer);
+		lock_release(argdata.lock);
 		return ENOMEM;
 	}
@@ -397,11 +400,13 @@
 	/* allocate the space */
 	argdata.buffer = kmalloc(ARG_MAX);
 	if (argdata.buffer == NULL) {
+		lock_release(argdata.lock);
 		return ENOMEM;
 	}
 	argdata.offsets = kmalloc(NARG_MAX * sizeof(size_t));
 	if (argdata.offsets == NULL) {
 		kfree(argdata.buffer);
+		lock_release(argdata.lock);
 		return ENOMEM;
 	}
--
- David A. Holland / dholland(a)eecs.harvard.edu
One of the groups ran across some mysterious behavior with the `tlbp'
instruction that turned out to be the result of a hardware bug.
It wasn't clearing the "probe failed" bit on a successful probe, so if
you do multiple TLB_Probe calls without an intervening TLB_Read or
TLB_Write (which happen to clear that bit as a side effect), a probe
may falsely appear to fail even though a matching entry is present.
A new System/161 release (0.98) has been issued and deployed.
This has two implications:
(1) if a mysterious problem you've been seeing mysteriously
disappears and never comes back, and could conceivably be
related to this issue, this change may be why it disappeared.
(2) if you're working at home and/or have compiled your own
System/161 for some other reason, you should download and
install the new version.
--
- David A. Holland / dholland(a)eecs.harvard.edu
A. Student wrote:
> Kind of a last-minute discovery, but it might save someone from frantic
> bug-hunting anyway: there's a bug in malloctest.c, test 2. When it attempts
> to allocate a second block to ensure that the memory from the first has
> been correctly freed, it never frees that second block, so you end up very
> quickly leaking away most of your memory, until eventually it can't
> allocate any, at which point you have an infinite loop of "0 bytes: failed".
> If you only run test 2 once and don't run any other tests, you won't see
> the leak, but as soon as you start trying to run multiple tests it shows up.
Oops....
Everybody take note. :-|
--
- David A. Holland / dholland(a)eecs.harvard.edu
I respectfully offer up that it might be a good idea to test your code
with many different memory sizes. Some appropriate ones might include:
#31 busctl ramsize=524288
#31 busctl ramsize=1048576
#31 busctl ramsize=2097152
#31 busctl ramsize=4194304
#31 busctl ramsize=16777216
You might find yours works peachy for 512K, but is unhappy for other
sizes.
(Also, make sure your final test is with random autoseed.)
good luck,
-mike
Today's 4-5 section is cancelled. Instead, I am meeting with each group
that is sectioned for this time individually. If you wanted to attend
today's 4-5 PM section because you cannot attend the one you have signed
up for, you can arrange to meet with me. I am free today until 4 PM and
after 7 PM.
-- Sasha
Someone found a race condition in thread_exit in the base OS/161
system: if you get a timer interrupt at the wrong time, it may end up
calling as_activate on a stale address space pointer.
The quick fix is to move the splhigh() up before the call to
as_destroy, like in the enclosed patch.
(A better fix is to store the address in a temporary and set
curthread->vmspace to NULL before calling as_destroy.)
Index: src/kern/thread/thread.c
===================================================================
RCS file: /disk/disk0/cs161/CVSREPO/os161/src/kern/thread/thread.c,v
retrieving revision 1.22
diff -U6 -r1.22 thread.c
--- thread.c 2002/03/05 21:58:22 1.22
+++ thread.c 2002/03/17 18:27:27
@@ -438,23 +438,24 @@
 		assert(curthread->stack[0] == (char)0xae);
 		assert(curthread->stack[1] == (char)0x11);
 		assert(curthread->stack[2] == (char)0xda);
 		assert(curthread->stack[3] == (char)0x33);
 	}
+	splhigh();
+
 	if (curthread->vmspace) {
 		as_destroy(curthread->vmspace);
 		curthread->vmspace = NULL;
 	}
 	if (curthread->cwd) {
 		VOP_DECREF(curthread->cwd);
 		curthread->cwd = NULL;
 	}
-	splhigh();
 	assert(numthreads>0);
 	numthreads--;
 	mi_switch(S_ZOMB);
 	panic("Thread came back from the dead!\n");
 }
--
- David A. Holland / dholland(a)eecs.harvard.edu
It appears that the "sequential" argument to free_kpages, as called by
the kmalloc code, can have one of two values:
1, meaning the address being freed is a single page kmalloc was
using for small allocations; or
2, meaning the address being freed is either a large allocation
done with alloc_kpages, or a completely invalid free attempt.
In the first case the block being freed should be exactly one page
long; in the second case, however, it may also be exactly one page
long (or it may be longer).
It is thus questionable whether this argument provides any useful data
at all; you may find it simplest to ignore it.
In any event, I apologize for the undocumented and icky use of magic
signalling values.
--
- David A. Holland / dholland(a)eecs.harvard.edu
The solution set code for assignment 2 has been released; it can be
fetched (either as a complete tree or as a diff against sol1) from the
assignments page.
We apologize for the delay.
--
- David A. Holland / dholland(a)eecs.harvard.edu
> Vahalia page 23 describes UNIX as being:
> "... re-entrant, meaning that several processes may
> be involved in kernel activity concurrently."
>
> is os161 kernel designed to be similarly
> "re-entrant"?
Yes.
> if so, where is the code to handle kernel stacks, and
> is it written yet or do we (eventually) provide it?
Look at the trap code.
- M