Solaris 8, afs3.6 2.6 problem

Mary P Washburn mpmw@silk.Stanford.EDU
Tue, 3 Apr 2001 08:15:22 -0700 (PDT)

In the past 3 months we have upgraded our AFS servers to Solaris 8
running afs3.6 2.6.  Recently we have been experiencing a problem where a
random client machine will lose sight of one or more AFS servers;
rebooting the client machine, purging the cache, or other client
tricks do not fix the problem.  The client machines are a variety of
hardware, OS levels, and AFS client versions.  The problem appears to
lie with the AFS server, in some internal database where that client's
name or IP address is cached incorrectly.  The only way to fix the
problem is to restart bos on the affected AFS servers.  Running bos
restart -all on the AFS server causing the particular "connection
timeout problem" provides a fix, but restarting bos causes an outage or
slowdown of 5 minutes or more for anyone using AFS on campus.  Since we
have 20 AFS servers and the connection timeout problem can affect any
one of these machines, this is not a comfortable fix.  Have any other
users experienced this problem?  Are there any other fixes aside from
bos restart?  We are considering downgrading our AFS server software
to afs3.6 2.0.

Thanks for any help you can provide.

  Mary Washburn
  Leland Systems
   Computing Systems and Services
   Stanford University
   Stanford, CA   94305-3090
   Telephone:  (650) 723-0334
   Fax:        (650) 725-9121