Friday, May 14, 2010

Hadoop install on Fedora

Recently I have had heavy statistics workloads that call for serious computing power, so I gave Hadoop a quick try to see whether it fits the need.
The installation itself is quite easy; the hard part is the Map/Reduce programs.

Before starting, I recommend reading HDFS Architecture to understand the roles involved in the configuration.

This install is on Fedora 12, using the latest release, hadoop-0.20.2.
Planned layout: master node: f180, slave nodes: f172, f173

Step1. Download and configure the package

Download the package (run on all nodes)
# mkdir /usr/src/hadoop/
# cd /usr/src/hadoop/
# wget http://ftp.twaren.net/Unix/Web/apache/hadoop/core/stable/hadoop-0.20.2.tar.gz
# tar xvf hadoop-0.20.2.tar.gz
# cd hadoop-0.20.2

Environment settings (run on all nodes)
Set the JAVA directory (required!)
# cat conf/hadoop-env.sh (Fedora 12 currently uses openjdk)
-# export JAVA_HOME=/usr/lib/j2sdk1.5-sun
+export JAVA_HOME=/usr/lib/jvm/jre-1.6.0-openjdk
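
Before moving on, it does not hurt to confirm that this path actually exists on each node (the exact directory name depends on the installed openjdk version, so adjust to match yours):

# ls -d /usr/lib/jvm/jre-1.6.0-openjdk
# /usr/lib/jvm/jre-1.6.0-openjdk/bin/java -version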
Specify the node information
# cat conf/core-site.xml
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/mnt/btrfs/hadoop/hadoop-${user.name}</value> <!-- temporary data directory -->
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://f180:9000</value> <!-- the HDFS master -->
  </property>
</configuration>

# cat conf/hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value> <!-- block replication factor -->
  </property>
</configuration>

# cat conf/mapred-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>f180:9001</value> <!-- the Map/Reduce master -->
  </property>
</configuration>
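
The configuration above must be identical on every node. Assuming the install path is the same everywhere, rsync is available, and passwordless ssh is already set up (see below), one way to push the master's conf/ out to the slaves is:

# for h in f172 f173; do rsync -av conf/ $h:/usr/src/hadoop/hadoop-0.20.2/conf/; done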

Configure the master node (run on the master only)
[root@f180 hadoop-0.20.2]# cat conf/masters
f180
Configure the slave nodes (run on the master only)
[root@f180 hadoop-0.20.2]# cat conf/slaves
f172
f173

Create the temporary storage space
[root@f180 hadoop-0.20.2]# mkfs.btrfs /dev/cciss/c0d1p1
[root@f180 hadoop-0.20.2]# mkdir /mnt/btrfs/hadoop/
[root@f180 hadoop-0.20.2]# mount -t btrfs /dev/cciss/c0d1p1 /mnt/btrfs/hadoop/

[root@f172 hadoop-0.20.2]# mkfs.btrfs /dev/cciss/c0d4p1
[root@f172 hadoop-0.20.2]# mkdir /mnt/btrfs/hadoop/
[root@f172 hadoop-0.20.2]# mount -t btrfs /dev/cciss/c0d4p1 /mnt/btrfs/hadoop/

[root@f173 hadoop-0.20.2]# mkfs.btrfs /dev/cciss/c0d4p1
[root@f173 hadoop-0.20.2]# mkdir /mnt/btrfs/hadoop/
[root@f173 hadoop-0.20.2]# mount -t btrfs /dev/cciss/c0d4p1 /mnt/btrfs/hadoop/
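
These mounts do not survive a reboot; if you want them permanent, an fstab entry along these lines should do (using each node's own device name, as above):

# echo '/dev/cciss/c0d1p1 /mnt/btrfs/hadoop btrfs defaults 0 0' >> /etc/fstab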

Make sure that every node can log in to every other node over ssh without a password.
For how to set this up, see: SSH Login Without Password
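
In short, it comes down to generating a key pair and distributing the public key. A minimal sketch, run as the user that starts the daemons (root here); note the master also ssh-es to itself when starting the secondarynamenode:

[root@f180 ~]# ssh-keygen -t rsa     (accept the defaults, empty passphrase)
[root@f180 ~]# ssh-copy-id root@f180
[root@f180 ~]# ssh-copy-id root@f172
[root@f180 ~]# ssh-copy-id root@f173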

Step2. Initialize the filesystem

[root@f180 hadoop-0.20.2]# bin/hadoop namenode -format
10/05/14 17:17:43 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = f180.twaren.net/211.79.x.180
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 0.20.2
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010
************************************************************/
10/05/14 17:17:43 INFO namenode.FSNamesystem: fsOwner=root,root,bin,daemon,sys,adm,disk,wheel
10/05/14 17:17:43 INFO namenode.FSNamesystem: supergroup=supergroup
10/05/14 17:17:43 INFO namenode.FSNamesystem: isPermissionEnabled=true
10/05/14 17:17:43 INFO common.Storage: Image file of size 94 saved in 0 seconds.
10/05/14 17:17:44 INFO common.Storage: Storage directory /mnt/btrfs/hadoop/hadoop-root/dfs/name has been successfully formatted.
10/05/14 17:17:44 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at f180.twaren.net/211.79.x.180
************************************************************/

Step3. Start the DFS daemons

[root@f180 hadoop-0.20.2]# bin/start-dfs.sh
starting namenode, logging to /usr/src/hadoop/hadoop-0.20.2/bin/../logs/hadoop-root-namenode-f180.twaren.net.out
f173.twaren.net: starting datanode, logging to /usr/src/hadoop/hadoop-0.20.2/bin/../logs/hadoop-root-datanode-f173.twaren.net.out
f172.twaren.net: starting datanode, logging to /usr/src/hadoop/hadoop-0.20.2/bin/../logs/hadoop-root-datanode-f172.twaren.net.out
f180.twaren.net: starting secondarynamenode, logging to /usr/src/hadoop/hadoop-0.20.2/bin/../logs/hadoop-root-secondarynamenode-f180.twaren.net.out
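
Before going further, you can verify that both datanodes actually registered with the namenode; dfsadmin reports this:

[root@f180 hadoop-0.20.2]# bin/hadoop dfsadmin -report
(the report should show 2 datanodes available)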

Step4. Start the Map/Reduce daemons

[root@f180 hadoop-0.20.2]# bin/start-mapred.sh
starting jobtracker, logging to /usr/src/hadoop/hadoop-0.20.2/bin/../logs/hadoop-root-jobtracker-f180.twaren.net.out
f173.twaren.net: starting tasktracker, logging to /usr/src/hadoop/hadoop-0.20.2/bin/../logs/hadoop-root-tasktracker-f173.twaren.net.out
f172.twaren.net: starting tasktracker, logging to /usr/src/hadoop/hadoop-0.20.2/bin/../logs/hadoop-root-tasktracker-f172.twaren.net.out
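
Another quick sanity check at this point is jps (it ships with the JDK; on Fedora the openjdk-devel package provides it), which lists the running Java daemons on each node. Roughly:

[root@f180 hadoop-0.20.2]# jps    (master: NameNode, SecondaryNameNode, JobTracker)
[root@f172 ~]# jps                (each slave: DataNode, TaskTracker)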

Step5. Run a test computation!
Basically the whole setup is complete at Step 4; you can check the records under the logs directory to see whether any errors were produced.

The example programs bundled with hadoop include a word count, so you can download some e-books from gutenberg.org to count.
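
If you want to reproduce this, Project Gutenberg plain-text files could be fetched directly at the time of writing; the URL pattern below matched 4300.txt in the listing that follows, though the exact links may have changed since:

# mkdir -p /tmp/gutenberg && cd /tmp/gutenberg
# wget http://www.gutenberg.org/files/4300/4300.txt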

I fetched six documents:
[root@f180 hadoop-0.20.2]# ls -al /tmp/gutenberg/
total 6196
drwxr-xr-x 2 root root 4096 2010-05-14 17:29 .
drwxrwxrwt. 8 root root 4096 2010-05-14 17:20 ..
-rw-r--r-- 1 root root 343694 2007-12-03 23:28 132.txt
-rw-r--r-- 1 root root 1945731 2007-04-14 04:34 19699.txt
-rw-r--r-- 1 root root 674762 2007-01-22 18:56 20417.txt
-rw-r--r-- 1 root root 1573044 2008-08-01 20:31 4300.txt
-rw-r--r-- 1 root root 1391706 2009-08-14 07:19 7ldvc10.txt
-rw-r--r-- 1 root root 393995 2009-03-18 19:51 972.txt

Before computing, the test files have to be loaded into HDFS
[root@f180 hadoop-0.20.2]# bin/hadoop dfs -copyFromLocal /tmp/gutenberg gutenberg
[root@f180 hadoop-0.20.2]# bin/hadoop dfs -ls
Found 1 items
drwxr-xr-x - root supergroup 0 2010-05-14 17:29 /user/root/gutenberg
[root@f180 hadoop-0.20.2]# bin/hadoop dfs -ls gutenberg
Found 6 items
-rw-r--r-- 2 root supergroup 343694 2010-05-14 17:29 /user/root/gutenberg/132.txt
-rw-r--r-- 2 root supergroup 1945731 2010-05-14 17:29 /user/root/gutenberg/19699.txt
-rw-r--r-- 2 root supergroup 674762 2010-05-14 17:29 /user/root/gutenberg/20417.txt
-rw-r--r-- 2 root supergroup 1573044 2010-05-14 17:29 /user/root/gutenberg/4300.txt
-rw-r--r-- 2 root supergroup 1391706 2010-05-14 17:29 /user/root/gutenberg/7ldvc10.txt
-rw-r--r-- 2 root supergroup 393995 2010-05-14 17:29 /user/root/gutenberg/972.txt

Run the Map/Reduce job!
[root@f180 hadoop-0.20.2]# bin/hadoop jar hadoop-0.20.2-examples.jar wordcount gutenberg gutenberg-output
10/05/14 17:33:51 INFO input.FileInputFormat: Total input paths to process : 6
10/05/14 17:33:52 INFO mapred.JobClient: Running job: job_201005141720_0001
10/05/14 17:33:53 INFO mapred.JobClient: map 0% reduce 0%
10/05/14 17:34:05 INFO mapred.JobClient: map 33% reduce 0%
10/05/14 17:34:08 INFO mapred.JobClient: map 66% reduce 0%
10/05/14 17:34:11 INFO mapred.JobClient: map 100% reduce 0%
10/05/14 17:34:14 INFO mapred.JobClient: map 100% reduce 33%
10/05/14 17:34:20 INFO mapred.JobClient: map 100% reduce 100%
10/05/14 17:34:22 INFO mapred.JobClient: Job complete: job_201005141720_0001
10/05/14 17:34:22 INFO mapred.JobClient: Counters: 17
10/05/14 17:34:22 INFO mapred.JobClient: Job Counters
10/05/14 17:34:22 INFO mapred.JobClient: Launched reduce tasks=1
10/05/14 17:34:22 INFO mapred.JobClient: Launched map tasks=6
10/05/14 17:34:22 INFO mapred.JobClient: Data-local map tasks=6
10/05/14 17:34:22 INFO mapred.JobClient: FileSystemCounters
10/05/14 17:34:22 INFO mapred.JobClient: FILE_BYTES_READ=4241310
10/05/14 17:34:22 INFO mapred.JobClient: HDFS_BYTES_READ=6322932
10/05/14 17:34:22 INFO mapred.JobClient: FILE_BYTES_WRITTEN=6936977
10/05/14 17:34:22 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=1353587
10/05/14 17:34:22 INFO mapred.JobClient: Map-Reduce Framework
10/05/14 17:34:22 INFO mapred.JobClient: Reduce input groups=123471
10/05/14 17:34:22 INFO mapred.JobClient: Combine output records=185701
10/05/14 17:34:22 INFO mapred.JobClient: Map input records=124099
10/05/14 17:34:22 INFO mapred.JobClient: Reduce shuffle bytes=2695475
10/05/14 17:34:22 INFO mapred.JobClient: Reduce output records=123471
10/05/14 17:34:22 INFO mapred.JobClient: Spilled Records=477165
10/05/14 17:34:22 INFO mapred.JobClient: Map output bytes=10427755
10/05/14 17:34:22 INFO mapred.JobClient: Combine input records=1067656
10/05/14 17:34:22 INFO mapred.JobClient: Map output records=1067656
10/05/14 17:34:22 INFO mapred.JobClient: Reduce input records=185701

The results are written to gutenberg-output
[root@f180 hadoop-0.20.2]# bin/hadoop dfs -ls
Found 2 items
drwxr-xr-x - root supergroup 0 2010-05-14 17:29 /user/root/gutenberg
drwxr-xr-x - root supergroup 0 2010-05-14 17:34 /user/root/gutenberg-output
[root@f180 hadoop-0.20.2]# bin/hadoop dfs -ls gutenberg-output
Found 2 items
drwxr-xr-x - root supergroup 0 2010-05-14 17:33 /user/root/gutenberg-output/_logs
-rw-r--r-- 2 root supergroup 1353587 2010-05-14 17:34 /user/root/gutenberg-output/part-r-00000

Retrieve the results from HDFS
[root@f180 hadoop-0.20.2]# mkdir /tmp/gutenberg-output
[root@f180 hadoop-0.20.2]# bin/hadoop dfs -getmerge gutenberg-output /tmp/gutenberg-output
[root@f180 hadoop-0.20.2]# head /tmp/gutenberg-output/gutenberg-output
" 34
"'Course 1
"'Spells 1
"'Tis 1
"'Twas 1
"'Twere 1
"'army' 1
"(1) 1
"(Lo)cra" 1
"13 4

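The merged output is sorted by word. If you would rather see the most frequent words, a plain shell pipeline over the merged file works, since each line is just word<TAB>count:

[root@f180 hadoop-0.20.2]# sort -k2 -nr /tmp/gutenberg-output/gutenberg-output | head
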
Finally, Hadoop provides several Web UIs for reference.
http://f180:50030/ - web UI for MapReduce job tracker(s)
http://f180:50060/ - web UI for task tracker(s)
http://f180:50070/ - web UI for HDFS name node(s)
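
When you are done, the matching stop scripts shut the whole cluster down from the master:

[root@f180 hadoop-0.20.2]# bin/stop-mapred.sh
[root@f180 hadoop-0.20.2]# bin/stop-dfs.sh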


Sunday, May 9, 2010

MySQL Replication

What I am doing here is a simple version of db replication, purely to replace the traditional mysqldump approach to backing up the db.
If you need the heavy-duty feature set, look at MySQL Cluster instead.
The setup is very simple. The mysql version here is 5.1.46, running on FreeBSD 7.3-STABLE / 8.0-STABLE.
By default, start from /usr/local/share/mysql/my-large.cnf:

# cp /usr/local/share/mysql/my-large.cnf /etc/my.cnf

The target setup: one sql master and one slave, with the master pushing updates to the slave in real time.

This is not limited to one master and one slave; it can be one master with multiple slaves, or a multi-tier hierarchy.

Step1. Master configuration
This is mainly done in my.cnf: tell the SQL server which role it plays this time, and set which database to replicate while you are at it.

Make sure the following lines exist in my.cnf (the bare minimum)
# cat /etc/my.cnf
log-bin=mysql-bin # mysql drives replication off this binary log.
binlog-do-db = 100mountain # the database to replicate.
server-id = 1 # identifies this machine's role.

Step2. Create the user and privileges needed for replication.

mysql> CREATE USER 'db_syncuser'@'%' IDENTIFIED BY 'syncuser_password';
mysql> GRANT REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'db_syncuser'@'%';
mysql> FLUSH PRIVILEGES;
mysql> SHOW MASTER STATUS;

Step3. restart mysql master server

# /usr/local/etc/rc.d/mysql-server restart

Step4. Slave configuration

# cat /etc/my.cnf (the bare minimum)
replicate-do-db = 100mountain # the database to replicate.
server-id = 2 # must differ from the master's.

master-host = db_master_ip
master-user = db_syncuser
master-password = syncuser_password

Step5. restart mysql slave server

# /usr/local/etc/rc.d/mysql-server restart
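
One caveat: the master-* options in my.cnf are deprecated in MySQL 5.1 and removed in later versions. If your build ignores them, you can point the slave at the master from the SQL prompt instead; the log file name and position below are placeholders you read off SHOW MASTER STATUS on the master:

mysql> CHANGE MASTER TO MASTER_HOST='db_master_ip',
    -> MASTER_USER='db_syncuser', MASTER_PASSWORD='syncuser_password',
    -> MASTER_LOG_FILE='mysql-bin.000001', MASTER_LOG_POS=106;
mysql> START SLAVE;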

Step6. Testing

Confirm that the 100mountain database exists on both sides
mysql> show databases;
+--------------------+
| Database |
+--------------------+
| information_schema |
| 100mountain |
| mysql |
| test |
+--------------------+
4 rows in set (0.00 sec)

Confirm that both the master and the slave started properly
mysql> SHOW MASTER STATUS;
mysql> SHOW SLAVE STATUS;
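
On the slave, the \G form is easier to read; the fields to watch are Slave_IO_Running and Slave_SQL_Running, which should both say Yes once replication is up:

mysql> SHOW SLAVE STATUS\G
(check: Slave_IO_Running: Yes, Slave_SQL_Running: Yes)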

mysql> use 100mountain; # at master
Database changed
mysql> create table dbtest (col1 INT);
Query OK, 0 rows affected (0.01 sec)

mysql> use 100mountain; # at slave
Database changed
mysql> show tables;
+-----------------------+
| Tables_in_100mountain |
+-----------------------+
| dbtest |
+-----------------------+
1 row in set (0.01 sec)