Lars Friend
2008-02-08 20:43:02 UTC
Hello all,
I've got a strange problem that has been plaguing me for a while,
and I have spent a good portion of today trying to get to the bottom of it.
We have a cluster of several machines (NetBSD 3.1 i386 running
release 3.1 GENERIC kernels), each performing different functions,
and we have our home directories mounted via NFS (the exports are split
between a couple of the servers) by running amd with the same maps on
each server. This allows a user to log in anywhere and have their
same home directory no matter which machine they are on. This is
generally very handy, and it works with remarkable stability, until I
go and move a [not logged in] user's home directory from one server
to another (for disk space management reasons, for instance).
The problem boils down to this: Every once in a while when I update
the amd maps, amd will catch the change quickly enough, and amq will
reflect the correct change, but the directory where the symlinks live
(which amd implements as a read-only local NFS filesystem, which we mount
on /home) will still have a symlink pointing to the old mapping location.
For instance, say I have server_a, server_b, server_c, and server_d,
where the home directories are mapped like this (on all four hosts):
bob host!=server_a;rhost:=server_a;rfs:=/data/export/home \
host==server_a;fs:=/data/export/home;type:=link
joe host!=server_b;rhost:=server_b;rfs:=/data/export/home \
host==server_b;fs:=/data/export/home;type:=link
dave host!=server_b;rhost:=server_b;rfs:=/data/export/home \
host==server_b;fs:=/data/export/home;type:=link
and I change that map to the following (after copying joe's home
directory from server_b to server_a):
bob host!=server_a;rhost:=server_a;rfs:=/data/export/home \
host==server_a;fs:=/data/export/home;type:=link
joe host!=server_a;rhost:=server_a;rfs:=/data/export/home \
host==server_a;fs:=/data/export/home;type:=link
dave host!=server_b;rhost:=server_b;rfs:=/data/export/home \
host==server_b;fs:=/data/export/home;type:=link
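To confirm exactly which keys a map edit touched, the two versions of the map can be compared mechanically. The following is a minimal sketch, not part of amd or amq: the helper names `parse_amd_map` and `changed_keys` are hypothetical, and the parser assumes only the simple layout shown above (a key, whitespace-separated locations, and backslash continuation lines).

```python
def parse_amd_map(text):
    """Return {key: [location, ...]} for a simple amd map (assumed layout)."""
    logical, pending = [], ""
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.endswith("\\"):
            # Continuation: keep accumulating the same logical entry.
            pending += line[:-1].strip() + " "
        else:
            logical.append(pending + line)
            pending = ""
    if pending.strip():
        logical.append(pending.strip())

    entries = {}
    for line in logical:
        parts = line.split(None, 1)
        entries[parts[0]] = parts[1].split() if len(parts) > 1 else []
    return entries


def changed_keys(old_text, new_text):
    """List the keys whose locations differ between two map versions."""
    old, new = parse_amd_map(old_text), parse_amd_map(new_text)
    return sorted(k for k in old.keys() | new.keys() if old.get(k) != new.get(k))


# Raw strings keep the trailing backslashes exactly as they appear in the map.
OLD_MAP = r"""
bob host!=server_a;rhost:=server_a;rfs:=/data/export/home \
    host==server_a;fs:=/data/export/home;type:=link
joe host!=server_b;rhost:=server_b;rfs:=/data/export/home \
    host==server_b;fs:=/data/export/home;type:=link
dave host!=server_b;rhost:=server_b;rfs:=/data/export/home \
    host==server_b;fs:=/data/export/home;type:=link
"""
NEW_MAP = OLD_MAP.replace(
    "joe host!=server_b;rhost:=server_b", "joe host!=server_a;rhost:=server_a"
).replace(
    "    host==server_b;fs:=/data/export/home;type:=link\ndave",
    "    host==server_a;fs:=/data/export/home;type:=link\ndave",
)

print(changed_keys(OLD_MAP, NEW_MAP))  # prints ['joe']
```

Only the keys it reports need to be checked under /home after the reload.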
After waiting for the map_reload_interval to expire, all four servers
(a,b,c,d) will (when asked with amq) tell me that the mapping has
been picked up and honored, but if I go and look in /home, sometimes
(but not always), the old symlink will still be there:
lrwxrwxrwx 1 root wheel 21 Feb 8 15:33 /home/joe -> /.automount/server_b/data/export/home/joe
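Stale links like this one can be spotted with a small script rather than by eye. This is a rough sketch under stated assumptions, not an amd feature: `link_host` and `stale_links` are hypothetical helpers, the expected user-to-server table is supplied by hand, remote targets are assumed to look like /.automount/<server>/... as in the listing above, and any target outside that prefix is assumed to be a local type:=link entry on the host running the check. Note that reading a link under /home may itself cause amd to mount the directory.

```python
import os


def link_host(target, automount_root="/.automount"):
    """Return the server named in an automount-style target, or None if local."""
    prefix = automount_root.rstrip("/") + "/"
    if not target.startswith(prefix):
        return None
    return target[len(prefix):].split("/", 1)[0]


def stale_links(home_dir, expected_hosts, local_host, automount_root="/.automount"):
    """Return (user, target, expected_host) for links pointing at the wrong server."""
    stale = []
    for user, expected in sorted(expected_hosts.items()):
        path = os.path.join(home_dir, user)
        try:
            target = os.readlink(path)
        except OSError:
            continue  # entry missing or not a symlink; nothing to compare
        # A target outside the automount root is treated as living on this host.
        actual = link_host(target, automount_root) or local_host
        if actual != expected:
            stale.append((user, target, expected))
    return stale


if __name__ == "__main__":
    # Hypothetical table matching the updated map; local_host is an assumption.
    expected = {"bob": "server_a", "joe": "server_a", "dave": "server_b"}
    for user, target, want in stale_links("/home", expected, local_host="server_c"):
        print(f"{user}: {target} (expected {want})")
```

Running it on each host after the map reload interval would list the users whose /home entry still points at the old server.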
This seems to occur more on servers that have lots of activity in
the other amd mapped home directories, but not reliably on any given
host. An amq -f will not clear this condition, and it seems that the
only way to get rid of this is to stop and restart amd (which I
obviously don't like to do with 50+ users logged in).
Does anybody have any suggestions?
Thanks Much,
-lars