Odd behavior from amd
Lars Friend
2008-02-08 20:43:02 UTC
Hello all,

I've got a strange problem that has been plaguing me for a while,
and I have spent a good portion of today trying to get to the bottom of it.

We have a cluster of several machines (NetBSD 3.1 i386 running
release 3.1 GENERIC kernels), each performing different functions,
and we have our home directories mounted via nfs (exported split
between a couple of the servers) by running amd with the same maps on
each server. This allows a user to log in anywhere and have their
same home directory no matter which machine they are on. This is
generally very handy, and it works with remarkable stability, until I
go and move a [not logged in] user's home directory from one server
to another (for disk space management reasons, for instance).

The problem boils down to this: Every once in a while when I update
the amd maps, amd will catch the change quickly enough, and amq will
reflect the correct change, but the directory where the symlinks live
(which amd implements as a read-only local NFS system which we mount
on /home) will still have a symlink pointing to the old mapping location.

For instance, if I have server_a, server_b, server_c, and server_d,
where the home directories are mapped like this (on all four hosts):

bob host!=server_a;rhost:=server_a;rfs:=/data/export/home \
host==server_a;fs:=/data/export/home;type:=link
joe host!=server_b;rhost:=server_b;rfs:=/data/export/home \
host==server_b;fs:=/data/export/home;type:=link
dave host!=server_b;rhost:=server_b;rfs:=/data/export/home \
host==server_b;fs:=/data/export/home;type:=link

and I change that map to: (after copying joe's home directory from
server b to server a)

bob host!=server_a;rhost:=server_a;rfs:=/data/export/home \
host==server_a;fs:=/data/export/home;type:=link
joe host!=server_a;rhost:=server_a;rfs:=/data/export/home \
host==server_a;fs:=/data/export/home;type:=link
dave host!=server_b;rhost:=server_b;rfs:=/data/export/home \
host==server_b;fs:=/data/export/home;type:=link
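
To double-check which keys actually moved between the two versions of the map, a small script along these lines may help. This is only a sketch in Python and not part of amd: `parse_rhosts` and `moved_keys` are hypothetical helper names, and the parser only understands simple entries like the ones above (a key followed by locations containing `rhost:=`), not the full amd map syntax.

```python
import re


def parse_rhosts(map_text):
    """Return {key: rhost} for a simplified amd map (assumption: one rhost per key)."""
    # Join continuation lines that end in a backslash.
    joined = re.sub(r"\\[ \t]*\n[ \t]*", " ", map_text)
    hosts = {}
    for line in joined.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) < 2:
            continue
        key, locations = parts
        match = re.search(r"rhost:=([^;\s]+)", locations)
        if match:
            hosts[key] = match.group(1)
    return hosts


def moved_keys(old_text, new_text):
    """Return {key: (old_rhost, new_rhost)} for keys whose rhost changed."""
    old, new = parse_rhosts(old_text), parse_rhosts(new_text)
    return {k: (old[k], new[k]) for k in old if k in new and old[k] != new[k]}


OLD_MAP = r"""
bob host!=server_a;rhost:=server_a;rfs:=/data/export/home \
host==server_a;fs:=/data/export/home;type:=link
joe host!=server_b;rhost:=server_b;rfs:=/data/export/home \
host==server_b;fs:=/data/export/home;type:=link
dave host!=server_b;rhost:=server_b;rfs:=/data/export/home \
host==server_b;fs:=/data/export/home;type:=link
"""

NEW_MAP = r"""
bob host!=server_a;rhost:=server_a;rfs:=/data/export/home \
host==server_a;fs:=/data/export/home;type:=link
joe host!=server_a;rhost:=server_a;rfs:=/data/export/home \
host==server_a;fs:=/data/export/home;type:=link
dave host!=server_b;rhost:=server_b;rfs:=/data/export/home \
host==server_b;fs:=/data/export/home;type:=link
"""

if __name__ == "__main__":
    for key, (old_host, new_host) in moved_keys(OLD_MAP, NEW_MAP).items():
        print(f"{key}: {old_host} -> {new_host}")
```

For the two maps above, only joe should be reported as moved.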

After waiting for the map_reload_interval to expire, all four servers
(a,b,c,d) will (when asked with amq) tell me that the mapping has
been picked up and honored, but if I go and look in /home, sometimes
(but not always), the old symlink will still be there:

lrwxrwxrwx 1 root wheel 21 Feb 8 15:33 /home/joe -> /.automount/server_b/data/export/home/joe

This seems to occur more on servers that have lots of activity in
the other amd mapped home directories, but not reliably on any given
host. An amq -f will not clear this condition, and it seems that the
only way to get rid of this is to stop and restart amd (which I
obviously don't like to do with 50+ users logged in).

Does anybody have any suggestions?

Thanks Much,

-lars
Christos Zoulas
2008-02-08 20:54:09 UTC
Post by Lars Friend
Does anybody have any suggestions?
Use 'amq -u' to unmount the offending partition before 'amq -f'.

christos
Lars Friend
2008-02-08 21:04:28 UTC
At 03:53 PM 2/8/2008, Christos Zoulas wrote:

(snip)
Post by Christos Zoulas
Post by Lars Friend
Does anybody have any suggestions?
Use 'amq -u' to unmount the offending partition before 'amq -f'.

I just tried this, and the symlink is still wrong. However, if I ask
amq, it says all is well.

Thanks though,

-lars
Christos Zoulas
2008-02-08 22:33:07 UTC
On Feb 8, 4:03pm, ***@mcci.com (Lars Friend) wrote:
-- Subject: Re: Odd behavior from amd

| At 03:53 PM 2/8/2008, Christos Zoulas wrote:
|
| (snip)
|
| > >Does anybody have any suggestions?
| >
| >Use 'amq -u' to unmount the offending partition before 'amq -f'.
|
| I just tried this, and the symlink is still wrong. However if I ask
| amq, it says all is well.
|
| Thanks though,
|

What kind of maps are you using? Did you verify that after amq -u the
symlink was gone?

christos
Lars Friend
2008-02-19 15:46:48 UTC
Running amq -u does not remove the symlinks.

We are using nfs maps with symlinks (like this):

juser host!=homehost;rhost:=homehost;rfs:=/wd1e/export/home \
host==homehost;fs:=/wd1e/export/home;type:=link

As I understand it, amd consists of two different logical parts:

1) A server that hooks file system access, looks up the mountpoint,
and mounts it. (in /amd/...)

2) A userland NFS server that serves out a file system full of
symlinks to the mountpoints managed by the above process (in /home).

It seems that part 1 is working: if I move a home directory and then
go crawling around in /amd/homehost/wd1e/export/home/..., or ask amq,
the correct information is picked up as expected and the new file
system is mounted. The problem is that part 2 never seems to get the
message. The galling thing is that it is _so_ close: the automounter
unmounts /amd/oldhost/wd1e/export/home/juser and correctly mounts the
new tree at /amd/homehost/wd1e/export/home/juser, but the symlink in
/home still points (until amd is restarted) to
/amd/oldhost/wd1e/export/home/juser, which of course no longer exists...
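
To make the mismatch concrete, the check I am effectively doing by hand could be sketched like this in Python. It is an illustration only, not anything amd provides: `link_host` and `stale_links` are hypothetical names, the `/amd` prefix is an assumption taken from the paths above, and the expected hosts have to be supplied by hand from the map. Note that reading a link under /home goes through amd itself, so this reports whatever amd is currently serving.

```python
import os


def link_host(target, prefix="/amd"):
    """Return the host component of an automount symlink target, or None."""
    prefix = prefix.rstrip("/") + "/"
    if not target.startswith(prefix):
        return None
    return target[len(prefix):].split("/", 1)[0]


def stale_links(home_dir, expected_hosts, prefix="/amd"):
    """List (user, actual_host, expected_host) for symlinks pointing at the wrong host.

    expected_hosts maps each user to the host the current map says
    the home directory lives on.
    """
    stale = []
    for user, expected in sorted(expected_hosts.items()):
        path = os.path.join(home_dir, user)
        try:
            target = os.readlink(path)
        except OSError:
            continue  # missing, or not a symlink
        actual = link_host(target, prefix)
        if actual != expected:
            stale.append((user, actual, expected))
    return stale


if __name__ == "__main__":
    # Hypothetical expectation: juser should now live on homehost.
    for user, actual, expected in stale_links("/home", {"juser": "homehost"}):
        print(f"{user}: symlink points at {actual}, map says {expected}")
```

On a host showing the problem, this should flag juser as still pointing at oldhost even though amq reports the new mapping.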

I hope that clears things up. That being said, it is only minimally
disruptive to:

sudo /etc/rc.d/amd stop

sudo /etc/rc.d/amd start

but it just doesn't feel right to have to do that.

-lars