We are proposing 28 patches (which modify the kernel binary) in a total of 8
files, to allow the addition of other patches which are binary neutral.
This is 150 lines of additions and 28 lines of deletions, including comments
and white space.
We have run lmbench and found that these patches have no measurable impact
on the base.  In some cases they could be eliminated altogether with the
use of macros.
Below we have attempted to explain the intent and impact of each patch.

A: drivers/char/n_tty.c
1. In is_ignored(), we add a test to see if the calling process is a
   svrproc (kernel thread servicing requests for remote processes).  This
   will never be true in the base (only if the clusterproc module is
   installed), so the only cost to the base is one extra if test.

B: fs/exec.c
1. In de_thread(), the call to ptrace_unlink(current) is
   replaced with:
	if (current->ptrace)
   which is just what ptrace_unlink() expands to.  The change is requested
   so we can put a hook in ptrace_unlink().  That hook (to get a sleep lock
   on the process) is not needed when the unlink is on "current".  We also
   propose moving the ptrace_unlink() to happen before the assignments to
   ptrace and parent.  This is requested because in the clusterwide case the
   __ptrace_unlink can release/reacquire the tasklist_lock and we want the
   assignments and the subsequent test of ptrace to be atomic w.r.t. the
2. Also in de_thread(), the ptrace_unlink(leader) call is replaced with:
		if (ptrace) {
                        if (clusterproc_ptrace_lock(leader, parent) >= 0)
                                ptrace = 0;
   Given that clusterproc_ptrace_lock() just returns 0 when CONFIG_CLUSTERPROC
   is not turned on, this ends up with the same code as the base.
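As a rough illustration of why the hook costs nothing in the base, the
following user-space sketch models the second de_thread() change.  It is
written under stated assumptions: struct task_model and
clear_ptrace_if_locked() are hypothetical stand-ins, and only the name
clusterproc_ptrace_lock() and its "returns 0 when CONFIG_CLUSTERPROC is not
turned on" behaviour come from the text above.

```c
/* Stand-in for the kernel's task structure (illustration only). */
struct task_model {
    int ptrace;
};

#ifdef CONFIG_CLUSTERPROC
/* With clusterproc configured, the real hook would be supplied elsewhere. */
int clusterproc_ptrace_lock(struct task_model *leader,
                            struct task_model *parent);
#else
/* Base build: the hook is an inline stub that always returns 0. */
static inline int clusterproc_ptrace_lock(struct task_model *leader,
                                          struct task_model *parent)
{
    (void)leader;
    (void)parent;
    return 0;
}
#endif

/*
 * Mirrors only the part of the de_thread() fragment quoted above.
 * Because the stub always returns 0, the inner test is always true in
 * the base and a compiler can reduce this to "if (ptrace) ptrace = 0;".
 */
static int clear_ptrace_if_locked(int ptrace, struct task_model *leader,
                                  struct task_model *parent)
{
    if (ptrace) {
        if (clusterproc_ptrace_lock(leader, parent) >= 0)
            ptrace = 0;
    }
    return ptrace;
}
```

Keeping the stub static inline in a header is one common way to let a hook
compile away entirely when the feature is not configured.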

C: include/linux/ptrace.h
1. Add the extern for the function do_ptrace_unlink(), which is now
   called in ptrace_unlink() instead of directly calling __ptrace_unlink().
   This gives an opportunity to put hooks into do_ptrace_unlink() to get
   sleep locks in the clusterwide case.
2. Make ptrace_unlink() return an int, which in the clusterwide case will
   indicate whether the function released/reacquired the tasklist_lock.
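The two changes in this header can be pictured with a small user-space
model.  This is a sketch only: the *_model names and the struct are
hypothetical, and treating "nonzero" as the signal that the lock was
released/reacquired is an assumption, since the text does not give the
exact return value.

```c
/* Stand-in for the kernel's task structure (illustration only). */
struct task_model {
    int ptrace;
};

/* Models the raw unlink work done by __ptrace_unlink(). */
static void raw_ptrace_unlink_model(struct task_model *child)
{
    child->ptrace = 0;
}

/*
 * Models do_ptrace_unlink(): the place where clusterwide hooks could
 * take and release a sleep lock around the raw unlink.  In this base
 * model there are no hooks, so it always reports 0, meaning the lock
 * was never released.
 */
static int do_ptrace_unlink_model(struct task_model *child)
{
    raw_ptrace_unlink_model(child);
    return 0;
}

/* Models ptrace_unlink(): same traced-process test, but now returns int. */
static inline int ptrace_unlink_model(struct task_model *child)
{
    if (child->ptrace)
        return do_ptrace_unlink_model(child);
    return 0;
}
```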

D: include/linux/sched.h
1. A void *clusterproc; is added to the end of the task structure.  If it
   is important, this could be done under ifdef CONFIG_CLUSTERPROC.  This
   pointer will point to an allocated structure if the clusterproc module is
2. A flag (PF_REMOTE) is added to the per process flags. It is proposed that
   this flag would only be set on kernel threads servicing requests from remote
   processes and on local surrogate task structures.  Checking for the flag
   is proposed to be done in a few places in the base.
3. The declaration for do_notify_parent() is changed to return an int rather
   than a void.  This will allow it to indicate to callers whether the
   tasklist_lock was released/reacquired during its execution, which would be
   the case if a remote parent had to be notified.
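A minimal user-space sketch of the first two additions follows.  The bit
value chosen for PF_REMOTE, the struct name task_model, and the helper
task_is_remote() are all assumptions made for illustration; the text gives
only the flag name and the void *clusterproc field.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit value: the text does not say which bit PF_REMOTE uses. */
#define PF_REMOTE 0x01000000u

/* Minimal stand-in for the task structure, not the kernel's definition. */
struct task_model {
    unsigned int flags;   /* per-process flags */
    void *clusterproc;    /* opaque pointer, NULL when nothing is attached */
};

/*
 * True only for tasks that carry PF_REMOTE.  Nothing in the base ever
 * sets the flag, so in the base this test is always false.
 */
static bool task_is_remote(const struct task_model *t)
{
    return (t->flags & PF_REMOTE) != 0;
}
```

An opaque void pointer keeps the layout of the clusterproc data private to
the module, so the base needs no knowledge of its type.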

E: kernel/exit.c
1. In is_orphaned_pgrp(), an initial call to clusterproc_is_orphan_pgrp() is
   made because in the clusterwide process case, the management of orphanness
   is done differently.  The base version of clusterproc_is_orphan_pgrp() will
   just return -EINVAL so will_become_orphaned_pgrp() will be called, as it
   is in the base.
2. In reparent_thread(), the call to do_notify_parent() can now return
   an error (to indicate that the tasklist_lock was released/reacquired),
   which is passed back to the callers.
3. In reparent_thread(), the code to determine if the process group of the
   exiting parent will now be orphaned is changed to allow an efficient
   implementation in the clusterwide case.  In the clusterwide case 
   will_become_orphaned_pgrp() will itself do the stopped_jobs and
   __kill_pg_info() calls so only if will... returns 1 does that code
   get executed (which is what will happen if clusterproc is not configured).
   Without clusterproc configured, the code should flow very much as it does
   in an unmodified base.
4. In forget_original_parent(), an if test is added (in 2 places) so that
   only local processes are reparented in the initial passes (remote
   processes are done later).
5. In forget_original_parent(), the calls to reparent_thread(),
   ptrace_unlink() and do_notify_parent() could release/reacquire the
   tasklist_lock (in the clusterwide case), so a test is added for this case;
   when it happens we start the list over.
6. In forget_original_parent(), after doing all the local processes there is
   a call to clusterproc_rmt_reparent_children() which for some reason
   slightly rearranges the binary, even though the routine just returns 0
   when CONFIG_CLUSTERPROC is not defined.
7. In exit_notify(), the call to will_become_orphaned_pgrp() can now
   return -EREMOTE so the test just looks for values > 0.
8. In exit_notify(), we propose to add the setting of tsk->exit_state to
   EXIT_ZOMBIE before calling do_notify_parent().  This is important to us
   because we will propose adding a hook in do_notify_parent to notify
   remotely executing parents and without this addition we won't know what
   to notify the parent about.  In the base, the parent can't see the
   child state until after the notification because the tasklist_lock is
   held.  In the clusterwide process model, the do_notify_parent() routine
   will have to release the tasklist_lock (to send the message to the
   parent execution node).
9. In wait_task_zombie(), wait_task_stopped() and wait_task_continued(),
   tests are made whether the process is remote and if so the function is
   executed on the node where the process is.
10. In do_wait(), the return value from wait_task_stopped() could already
   be -EAGAIN, which meant the tasklist_lock had been released and
   the wait should be retried.  We added -EAGAIN as a possible error for
   wait_task_zombie() and wait_task_continued() as well.
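Several items above share one pattern: a callee reports that the
tasklist_lock was released/reacquired, and the caller restarts its scan
because the list may have changed in the meantime.  The user-space sketch
below models that pattern.  The struct, the function names, and the use of
-EAGAIN as the "lock was dropped" signal in this particular loop are
assumptions for illustration; no real locking is performed.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of one child on the parent's list. */
struct child_model {
    int done;        /* already reparented */
    int drops_lock;  /* pretend handling this child drops the lock once */
};

/* Returns -EAGAIN if the (imaginary) lock was released and reacquired. */
static int reparent_model(struct child_model *c)
{
    c->done = 1;
    if (c->drops_lock) {
        c->drops_lock = 0;
        return -EAGAIN;
    }
    return 0;
}

/*
 * Walks the children and starts over from the head whenever a callee
 * reports that the lock was dropped.  Returns the number of restarts.
 */
static int reparent_all(struct child_model *kids, size_t n)
{
    int restarts = 0;
    size_t i = 0;

    while (i < n) {
        if (kids[i].done) {
            i++;            /* handled on an earlier pass */
            continue;
        }
        if (reparent_model(&kids[i]) < 0) {
            restarts++;
            i = 0;          /* list may have changed: start over */
            continue;
        }
        i++;
    }
    return restarts;
}
```

The loop terminates because each child is marked done before the restart,
so every pass makes progress.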

F: kernel/ptrace.c
1. In __ptrace_unlink(), we need to save the child's old parent to
   allow us to update that parent later if it is remote.
2. In __ptrace_unlink() we go to out: if the list is empty instead of
   returning so we can put a hook in out: to update the remote parent w.r.t.
   the child's ptrace flag.
3. We added the do_ptrace_unlink() routine to allow for hook routines
   to get and release the process sleep lock, which makes the ptrace_unlink()
   more atomic (some atomicity was lost in the clusterwide case because of
   remote parent notifications in __ptrace_unlink()).
4. In ptrace_check_attached(), we would like to avoid the task structure
   pointer comparisons and replace them with pid comparisons.  The reason is
   that this will allow us to call this routine from a kernel thread on
   behalf of a remotely running process (the svrproc will masquerade as the
   remote process w.r.t. pid but obviously not on task structure address).
   This same issue comes up in sys_setpgid().
5. In ptrace_attach(), we would again like to be able to call this from a
   kernel thread acting on behalf of a remote process.  The existing base code
   calls __ptrace_link() with "current".  If the caller is remote, we
   will be setting up a surrogate task structure (not executable) and would
   like to call __ptrace_link() with that value.  The change to the base is
   just to assign "current" to new_parent and then to use new_parent where
   "current" was used before.  
6. In ptrace_detach(), we replace the call to __ptrace_unlink() with one to
   ptrace_unlink() so the hook to get the proc_lock can be invoked.  In the
   base this will only mean an extra test of whether the process is traced.
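The ptrace_attach() change in item 5 can be sketched in user space as
follows.  All names ending in _model, and the surrogate parameter, are
hypothetical; the text only says that "current" is assigned to new_parent
and that a surrogate task structure would be used when the caller is
remote.

```c
#include <stddef.h>

/* Simplified stand-in for the kernel task structure (illustration only). */
struct task_model {
    int pid;
    int ptrace;
    struct task_model *parent;
};

/* Models __ptrace_link(): make new_parent the tracing parent of child. */
static void ptrace_link_model(struct task_model *child,
                              struct task_model *new_parent)
{
    child->ptrace = 1;
    child->parent = new_parent;
}

/*
 * Models the attach path.  'surrogate' is NULL in the base; in the
 * clusterwide case it would be the non-executable task structure that
 * stands in for the remote caller.
 */
static void ptrace_attach_model(struct task_model *child,
                                struct task_model *current_task,
                                struct task_model *surrogate)
{
    struct task_model *new_parent = current_task;

    if (surrogate != NULL)
        new_parent = surrogate;

    ptrace_link_model(child, new_parent);
}
```

With no surrogate the behaviour is the same as linking to the current task
directly, which is why the base change is only a renamed local variable.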

G: kernel/signal.c
1. The call to clusterproc_rmt_kill_all() in the function
   kill_something_info() modifies the base even when the hook is empty because
   parameters are passed by address.
2. do_notify_parent() and do_notify_parent_cldstop() are modified to return a 
   value (0 in the base) to allow the clusterproc hooks to return -EREMOTE as 
   a way to indicate that the tasklist_lock was released/reacquired.
3. In do_notify_parent() (and do_notify_parent_cldstop()), we test the
   PF_REMOTE flag and call the clusterproc routine if the parent is remote.
   We could do the remote test in the hook but the base code would still be
   changed because we are passing a parameter by address (so the code does not

H: kernel/sys.c
1. In sys_setpgid(), we would like to avoid the task structure
   pointer comparisons and replace them with pid comparisons.  The reason is
   that this will allow us to call this routine from a kernel thread on
   behalf of a remotely running process (the svrproc will masquerade as the
   remote process w.r.t. pid but obviously not on task structure address).
   This same issue comes up in ptrace_check_attached() as well.
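The difference between the two kinds of comparison can be shown with a
small user-space sketch.  The struct and helper names are hypothetical; the
point is only that a svrproc carrying the remote process's pid is a
different object in memory, so a pointer test fails where a pid test
succeeds.

```c
#include <stdbool.h>

/* Simplified stand-in for the kernel task structure (illustration only). */
struct task_model {
    int pid;
};

/* Existing style of test: identity of the task structure itself. */
static bool same_task_by_pointer(const struct task_model *a,
                                 const struct task_model *b)
{
    return a == b;
}

/* Proposed style of test: identity by process id. */
static bool same_task_by_pid(const struct task_model *a,
                             const struct task_model *b)
{
    return a->pid == b->pid;
}
```

For two distinct structures holding the same pid, the pointer comparison is
false and the pid comparison is true; for a task compared with itself, both
are true, so the base behaviour is unchanged.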

This page last updated on Tue May 10 20:45:42 2005 GMT