Tuesday, June 2, 2009

GFS install on Gentoo Linux

GFS is a shared disk file system that started out under the GPL, went commercial, and then became GPL again after Red Hat acquired it.

I used to think GFS itself cost money, but after being corrected by a certain senior (z1x), I learned that the paid product is actually RHCS (Red Hat Cluster Suite), which ships all the related binaries pre-built; you just click through a GUI to create the cluster environment you want (it seems that senior needs this too). The interface itself is built with Conga.

I originally expected GFS to work like the Gluster, PVFS, and Lustre setups covered in earlier posts, but it turned out to be quite different, and I spent a considerable amount of time here.

Test environment:
kernel: 2.6.29-gentoo-r5
gfs: 2.03.09
openais: 0.80.3
Both cluster-2 and cluster-3 require a certain minimum Linux kernel version; see the official GFS website.
Otherwise you may get an error message like the following:
cluster-3.0.0.rc2 # ./configure
Configuring Makefiles for your system...
Checking tree: nothing to do
Checking kernel:
 Current kernel version: 2.6.28
 Minimum kernel version: 2.6.29
 FAILED!


Step1.
Because GFS sits on top of the OpenAIS framework, it is, well, mandatory.
You can build it from source yourself, but it depends on Corosync and even the nss (Network Security Services) library, plus header files for ldap, slang, and so on. I was lazy, so I installed it straight from Gentoo portage :D
$ emerge -uD sys-cluster/openais


Step2.
Build the kernel; remember to build multicast, nbd (Network block device support), gfs2, lock_dlm, and dlm as modules.

After building, reboot into the new kernel.
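For reference, the kernel options involved look roughly like this in .config (a sketch; these are the usual symbol names around 2.6.29, so double-check them in menuconfig for your tree):

```
# Networking: multicast support for the cluster heartbeat
CONFIG_IP_MULTICAST=y
# configfs is needed by dlm (loaded explicitly in Step4 below)
CONFIG_CONFIGFS_FS=m
# Distributed lock manager and GFS2 with DLM locking
CONFIG_DLM=m
CONFIG_GFS2_FS=m
CONFIG_GFS2_FS_LOCKING_DLM=m
# Network block device (used for the disk sharing in Step9)
CONFIG_BLK_DEV_NBD=m
```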


Step3.
Install the related userspace packages.
$ emerge -uD sys-libs/slang (needed by rgmanager)
$ emerge -uD sys-cluster/rgmanager
$ emerge -uD sys-fs/gfs2

These are the packages that would be merged, in order:

Calculating dependencies... done!
[ebuild  N    ] sys-cluster/cman-lib-2.03.09  1,743 kB
[ebuild  N    ] dev-python/pexpect-2.3  USE="-doc -examples" 148 kB
[ebuild  N    ] sys-cluster/ccs-2.03.09  0 kB
[ebuild  N    ] sys-cluster/openais-0.80.3-r1  USE="-debug" 468 kB
[ebuild  N    ] sys-cluster/dlm-lib-2.03.09  0 kB
[ebuild  N    ] sys-cluster/dlm-2.03.09  0 kB
[ebuild  N    ] sys-cluster/cman-2.03.09-r1  0 kB
[ebuild  N    ] perl-core/libnet-1.22  USE="-sasl" 67 kB
[ebuild  N    ] dev-perl/Net-SSLeay-1.35  130 kB
[ebuild  N    ] virtual/perl-libnet-1.22  0 kB
[ebuild  N    ] dev-perl/Net-Telnet-3.03-r1  35 kB
[ebuild  N    ] sys-cluster/fence-2.03.09-r1  0 kB
[ebuild  N    ] sys-fs/gfs2-2.03.09  USE="-doc" 0 kB

Total: 13 packages (13 new), Size of downloads: 2,588 kB

$ emerge -uD gnbd-kernel
$ emerge -uD sys-cluster/gnbd (if gnbd-kernel builds for you, prefer gnbd Orz...)
or
$ emerge -uD sys-block/nbd (if gnbd-kernel fails to build, use nbd = =")


Step4.
Load the kernel modules.
$ depmod -a
$ modprobe gfs2
$ modprobe configfs
$ modprobe dlm
$ modprobe lock_dlm
$ modprobe nbd
$ lsmod
Module                  Size  Used by
nbd                    10084  0
lock_dlm               14116  0
dlm                   112656  10 lock_dlm
configfs               22668  2 dlm
gfs2                  332196  1 lock_dlm
$ dmesg
GFS2 (built Jun  1 2009 17:46:14) installed
DLM (built Jun  1 2009 17:45:59) installed
Lock_DLM (built Jun  1 2009 17:46:25) installed
nbd: registered device at major 43


Step5.
Configuration.
$ cat /etc/cluster/cluster.conf (only needs to exist on one of the nodes; it is copied to the other nodes automatically when the cluster starts)
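The contents of cluster.conf were omitted here. A minimal sketch for this setup, reconstructed from the cluster name, node names, and node IDs that appear in the cman_tool output in Step7 (fencing is left empty for brevity; a production cluster needs real fence devices):

```xml
<?xml version="1.0"?>
<cluster name="mycluster" config_version="1">
  <clusternodes>
    <clusternode name="node26" nodeid="2">
      <fence/>
    </clusternode>
    <clusternode name="node27" nodeid="3">
      <fence/>
    </clusternode>
    <clusternode name="node28" nodeid="4">
      <fence/>
    </clusternode>
  </clusternodes>
  <fencedevices/>
</cluster>
```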

$ cat /etc/ais/openais.conf
totem {
        version: 2
        secauth: off
        threads: 0
        nodeid: 2
        interface {
                ringnumber: 0
                bindnetaddr: 140.110.x.0
                mcastaddr: 226.94.1.1
                mcastport: 5405
        }
}


Step6.
Start the services.
You can simply use /etc/init.d/gfs2:
$ /etc/init.d/gfs2 start
 * Loading dlm kernel module ... [ ok ]
 * Loading lock_dlm kernel module ... [ ok ]
 * Mounting ConfigFS ... [ ok ]
 * Starting ccsd ... [ ok ]
 * Starting cman ... [ ok ]
 * Waiting for quorum (300 secs) ... [ ok ]
 * Starting groupd ... [ ok ]
 * Starting fenced ... [ ok ]
 * Joining fence domain ... [ ok ]
 * Starting dlm_controld ... [ ok ]
 * Starting gfs_controld ... [ ok ]
 * Starting gfs2 cluster:
 * Loading gfs2 kernel module ... [ ok ]
Or start everything in debug mode as follows:
$ mount -t configfs none /sys/kernel/config
$ ccsd -n
$ cman_tool join -d
$ groupd -D
$ fenced -D
$ dlm_controld -D
$ gfs_controld -D
$ fence_tool join


Step7.
Test!
$ ccs_test connect
Connect successful.
 Connection descriptor = 1950
$ cman_tool status
Version: 6.2.0
Config Version: 1
Cluster Name: mycluster
Cluster Id: 56756
Cluster Member: Yes
Cluster Generation: 216
Membership state: Cluster-Member
Nodes: 3
Expected votes: 1
Total votes: 3
Node votes: 1
Quorum: 2
Active subsystems: 7
Flags: Dirty
Ports Bound: 0
Node name: node26
Node ID: 2
Multicast addresses: 226.94.1.1
Node addresses: 140.110.x.26
$ cman_tool services
type             level name     id       state
fence            0     default  00010002 none
[2 3 4]
$ cman_tool nodes
Node  Sts   Inc   Joined               Name
   2   M    208   2009-06-02 19:00:44  node26
   3   M    212   2009-06-02 19:03:57  node27
   4   M    216   2009-06-02 19:06:37  node28


Step8.
Format the partition and mount it.
$ mkfs -t gfs2 -p lock_dlm -t mycluster:testgfs2 -j 4 /dev/cciss/c0d1p1
This will destroy any data on /dev/cciss/c0d1p1.
  It appears to contain a LVM2_member raid.

Are you sure you want to proceed? [y/n] y

Device:                    /dev/cciss/c0d1p1
Blocksize:                 4096
Device Size                33.91 GB (8890316 blocks)
Filesystem Size:           33.91 GB (8890316 blocks)
Journals:                  4
Resource Groups:           136
Locking Protocol:          "lock_dlm"
Lock Table:                "mycluster:testgfs2"
$ mount -t gfs2 -v /dev/cciss/c0d1p1 /mnt/gfs
/sbin/mount.gfs2: mount /dev/cciss/c0d1p1 /mnt/gfs
/sbin/mount.gfs2: parse_opts: opts = "rw"
/sbin/mount.gfs2:   clear flag 1 for "rw", flags = 0
/sbin/mount.gfs2: parse_opts: flags = 0
/sbin/mount.gfs2: parse_opts: extra = ""
/sbin/mount.gfs2: parse_opts: hostdata = ""
/sbin/mount.gfs2: parse_opts: lockproto = ""
/sbin/mount.gfs2: parse_opts: locktable = ""
/sbin/mount.gfs2: message to gfs_controld: asking to join mountgroup:
/sbin/mount.gfs2: write "join /mnt/gfs gfs2 lock_dlm mycluster:testgfs2 rw /dev/cciss/c0d1p1"
/sbin/mount.gfs2: message from gfs_controld: response to join request:
/sbin/mount.gfs2: lock_dlm_join: read "0"
/sbin/mount.gfs2: message from gfs_controld: mount options:
/sbin/mount.gfs2: lock_dlm_join: read "hostdata=jid=0:id=262146:first=1"
/sbin/mount.gfs2: lock_dlm_join: hostdata: "hostdata=jid=0:id=262146:first=1"
/sbin/mount.gfs2: lock_dlm_join: extra_plus: "hostdata=jid=0:id=262146:first=1"
/sbin/mount.gfs2: mount(2) ok
/sbin/mount.gfs2: lock_dlm_mount_result: write "mount_result /mnt/gfs gfs2 0"
/sbin/mount.gfs2: read_proc_mounts: device = "/dev/cciss/c0d1p1"
/sbin/mount.gfs2: read_proc_mounts: opts = "rw,hostdata=jid=0:id=262146:first=1"
$ df -h
/dev/cciss/c0d1p1      34G  518M   34G   2% /mnt/gfs
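To remount after a reboot, a hypothetical /etc/fstab entry could look like the following (device and mount point match the mkfs/mount commands above; whether the init scripts mount gfs2 fstab entries automatically depends on your setup, so verify before relying on it):

```
/dev/cciss/c0d1p1  /mnt/gfs  gfs2  defaults,noatime  0 0
```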


Step9.
Disk sharing.
This uses the kernel's native nbd module; gnbd is the recommended choice.
9.1. nbd server configuration.
$ cat /etc/nbd-server/config (on node26)
[generic]
[export]
    exportname = /dev/cciss/c0d1p1
    port = 2000
    authfile = /etc/nbd-server/allow
$ cat /etc/nbd-server/allow (on node26)
140.110.x.26
140.110.x.27
140.110.x.28
140.110.x.0/24
9.2. nbd server export.
$ nbd-server (on node26)
9.3. nbd client import.
$ nbd-client node26 2000 /dev/nbd0 (on node27)
Negotiation: ..size = 35561264KB
bs=1024, sz=35561264
9.4. mount!
$ mount -t gfs2 /dev/nbd0 /mnt/gfs (on node27)
$ df -h (on node27)
/dev/nbd0              34G  518M   34G   2% /mnt/gfs
$ cman_tool services (on node27)
type             level name      id       state
fence            0     default   00010002 none
[2 3 4]
dlm              1     testgfs2  00020003 none
[3]
gfs              2     testgfs2  00010003 none
[3]


Step10.
Mount from another client.
$ nbd-client node26 2000 /dev/nbd0 (on node28)
Negotiation: ..size = 35561264KB
bs=1024, sz=35561264
$ mount -t gfs2 /dev/nbd0 /mnt/gfs/ (on node28)
$ df -h (on node28)
Filesystem            Size  Used Avail Use% Mounted on
/dev/nbd0              34G  518M   34G   2% /mnt/gfs
$ cman_tool services (on node28)
type             level name      id       state
fence            0     default   00010002 none
[2 3 4]
dlm              1     testgfs2  00020003 none
[3 4]
gfs              2     testgfs2  00010003 none
[3 4]
$ cman_tool services (on node27)
type             level name      id       state
fence            0     default   00010002 none
[2 3 4]
dlm              1     testgfs2  00020003 none
[3 4]
gfs              2     testgfs2  00010003 none
[3 4]


Step11.
Concurrent write test.
$ vim /mnt/gfs/concurrent_test.txt (on node28)
$ vim /mnt/gfs/concurrent_test.txt (on node27)
E325: ATTENTION
Found a swap file by the name ".concurrent_test.txt.swp"
          owned by: root   dated: Wed Jun  3 07:31:38 2009
         file name: /mnt/gfs/concurrent_test.txt
          modified: YES
         user name: root   host name: node28
        process ID: 4454
While opening file "concurrent_test.txt"

(1) Another program may be editing the same file.
    If this is the case, be careful not to end up with two
    different instances of the same file when making changes.
    Quit, or continue with caution.

(2) An edit session for this file crashed.
    If this is the case, use ":recover" or "vim -r concurrent_test.txt"
    to recover the changes (see ":help recovery").
    If you did this already, delete the swap file ".concurrent_test.txt.swp"
    to avoid this message.

Swap file ".concurrent_test.txt.swp" already exists!
"concurrent_test.txt" [New File]


Notes:
* Stopping the services:
umount [-v] "mountpoint"
nbd-client -d /dev/nbd0
fence_tool leave
cman_tool leave
* Updating cluster.conf:
ccs_tool update foo.conf (remember to bump config_version)

The revolution is not yet complete; the full architecture would be gfs2 + gnbd + clvm...

Update:
gnbd is no longer supported... see here and there.
