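I recently migrated a RAC database. The work covered several areas: building the RAC environment, configuring ASM, and migrating and upgrading the database.
This post is part two of that migration series: preparing for the RAC installation.
Add the same groups and user on all nodes:
On pre1, run: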
bash-3.00# groupadd oinstall
bash-3.00# groupadd dba
bash-3.00# mkdir -p /export/home/oracle
bash-3.00# useradd -u 200 -g oinstall -G dba -d /export/home/oracle oracle
bash-3.00# passwd oracle
New Password:
Re-enter new Password:
passwd: password successfully changed for oracle
The RAC installation also uses the nobody user. Check whether nobody exists; if it does not, create it:
bash-3.00# id nobody
uid=60001(nobody) gid=60001(nobody)
pre2:
During the RAC installation, Oracle checks that the user is consistent across all nodes, including its UID and GID.
First look up the oracle user on pre1:
bash-3.00# id -a oracle
uid=200(oracle) gid=100(oinstall) groups=101(dba)
On pre2, run:
bash-3.00# groupadd -g 100 oinstall
bash-3.00# groupadd -g 101 dba
bash-3.00# mkdir -p /export/home/oracle
bash-3.00# useradd -u 200 -g oinstall -G dba -d /export/home/oracle oracle
bash-3.00# passwd oracle
New Password:
Re-enter new Password:
passwd: password successfully changed for oracle
bash-3.00# id -a oracle
uid=200(oracle) gid=100(oinstall) groups=101(dba)
bash-3.00# id nobody
uid=60001(nobody) gid=60001(nobody)
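The UID/GID comparison above can also be done mechanically. The following is only a minimal sketch: it assumes the two `id -a oracle` output lines have been collected by hand, and the helper names `extract_ids` and `same_ids` are made up for illustration. It compares the UID and primary GID only, not the supplementary groups.

```shell
#!/bin/sh
# extract_ids: hypothetical helper, not part of Solaris or Oracle.
# Prints "UID GID" taken from one line of `id` output.
extract_ids() {
    printf '%s\n' "$1" |
        sed -n 's/^uid=\([0-9]*\)[^ ]* gid=\([0-9]*\).*/\1 \2/p'
}

# same_ids: succeeds when both lines carry the same UID and primary GID.
same_ids() {
    [ -n "$(extract_ids "$1")" ] &&
        [ "$(extract_ids "$1")" = "$(extract_ids "$2")" ]
}

# The two lines recorded on pre1 and pre2 in this post.
pre1_line='uid=200(oracle) gid=100(oinstall) groups=101(dba)'
pre2_line='uid=200(oracle) gid=100(oinstall) groups=101(dba)'
if same_ids "$pre1_line" "$pre2_line"; then
    echo "oracle UID/GID match"
else
    echo "oracle UID/GID differ"
fi
```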
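Installing RAC requires each node to have at least two physical network interfaces and three IP addresses: a public IP, a private IP, and a virtual IP. The public and virtual IPs must be on the same subnet.
On pre1:
Edit /etc/hosts and add the following: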
127.0.0.1 localhost
172.0.2.1 pre1 loghost
172.0.2.2 vip-pre1
10.0.0.1 priv-pre1
172.0.2.3 pre2
172.0.2.4 vip-pre2
10.0.0.2 priv-pre2
On pre2, add the following to /etc/hosts:
127.0.0.1 localhost
172.0.2.1 pre1
172.0.2.2 vip-pre1
10.0.0.1 priv-pre1
172.0.2.3 pre2 loghost
172.0.2.4 vip-pre2
10.0.0.2 priv-pre2
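A quick way to confirm that both hosts files carry all six names is sketched below. The helper name `missing_hosts` is hypothetical, and the sketch assumes an awk that accepts `-v` (on Solaris 10 that would be nawk or /usr/xpg4/bin/awk rather than /usr/bin/awk).

```shell
#!/bin/sh
# missing_hosts: hypothetical helper. Prints every name from the argument
# list that has no entry in the given hosts file. Comment lines are skipped,
# and only the name columns (field 2 onward) are compared.
missing_hosts() {
    hosts_file=$1
    shift
    for name in "$@"; do
        awk -v n="$name" '
            /^[ \t]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit !found }
        ' "$hosts_file" || printf '%s\n' "$name"
    done
}

# Usage on either node (prints nothing when all six names are present):
#   missing_hosts /etc/hosts pre1 vip-pre1 priv-pre1 pre2 vip-pre2 priv-pre2
```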
Use dladm to list the network devices on the server:
bash-3.00# dladm show-link
ce0 type: legacy mtu: 1500 device: ce0
ce1 type: legacy mtu: 1500 device: ce1
The output above shows two network interfaces: ce0 and ce1.
bash-3.00# ifconfig -a
lo0: flags=2001000849<UP,LOOPBACK,RUNNING,MULTICAST,IPv4,VIRTUAL> mtu 8232 index 1
inet 127.0.0.1 netmask ff000000
ce0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 2
inet 172.0.2.1 netmask ffffff00 broadcast 172.0.2.255
ether 0:3:ba:2c:da:de
The output above shows that only one interface is currently configured; the other has no IP address assigned yet.
bash-3.00# ifconfig ce1 plumb
bash-3.00# ifconfig ce1 10.0.0.1 netmask 255.255.255.0 broadcast 10.0.0.255 up
So that the interface gets its IP address automatically after a reboot, edit the /etc/hostname.ce1 file:
bash-3.00# vi /etc/hostname.ce1
priv-pre1
Edit /etc/netmasks and add the network numbers and their netmasks:
bash-3.00# chmod o+w /etc/netmasks
bash-3.00# vi /etc/netmasks
172.0.2.0 255.255.255.0
10.0.0.0 255.255.255.0
Set the default router:
bash-3.00# vi /etc/defaultrouter
172.0.2.252
Do the equivalent on the other node:
bash-3.00# ifconfig ce1 plumb
bash-3.00# ifconfig ce1 10.0.0.2 netmask 255.255.255.0 broadcast 10.0.0.255 up
bash-3.00# vi /etc/hostname.ce1
priv-pre2
bash-3.00# vi /etc/netmasks
172.0.2.0 255.255.255.0
10.0.0.0 255.255.255.0
bash-3.00# vi /etc/defaultrouter
172.0.2.252
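The broadcast addresses passed to ifconfig above follow directly from the IP address and netmask: each octet is the address OR-ed with the inverted mask. A small sketch of that arithmetic, with a made-up helper name `broadcast_addr`:

```shell
#!/bin/sh
# broadcast_addr: hypothetical helper. Prints the broadcast address for an
# IPv4 address and a dotted netmask. The body runs in a subshell so the
# changed IFS does not leak into the caller.
broadcast_addr() (
    IFS=.
    set -- $1 $2    # split both dotted quads into eight octets
    printf '%d.%d.%d.%d\n' \
        $(( $1 | (255 - $5) )) $(( $2 | (255 - $6) )) \
        $(( $3 | (255 - $7) )) $(( $4 | (255 - $8) ))
)

broadcast_addr 10.0.0.1 255.255.255.0     # prints 10.0.0.255
broadcast_addr 172.0.2.1 255.255.255.0    # prints 172.0.2.255
```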
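During the clusterware installation Oracle copies files to the other node, so ssh authentication must be set up between the two nodes so that connections between them do not prompt for a password. (Run these steps as the oracle user.)
➢ Generate RSA and DSA keys on all nodes
Press Enter at each prompt to accept the defaults.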
bash-3.00$ id
uid=200(oracle) gid=100(oinstall)
bash-3.00$ mkdir ~/.ssh
bash-3.00$ chmod 700 ~/.ssh
bash-3.00$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/export/home/oracle/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /export/home/oracle/.ssh/id_rsa.
Your public key has been saved in /export/home/oracle/.ssh/id_rsa.pub.
The key fingerprint is:
b0:c1:4b:92:b4:05:ff:4e:79:a0:61:89:ab:8d:7b:f8 oracle@pre1
bash-3.00$ ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/export/home/oracle/.ssh/id_dsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /export/home/oracle/.ssh/id_dsa.
Your public key has been saved in /export/home/oracle/.ssh/id_dsa.pub.
The key fingerprint is:
8f:e3:dc:ca:ff:05:b6:5f:a6:c6:25:9f:3a:35:d1:2a oracle@pre1
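➢ Add the public keys to the authorization file
This series of steps only needs to be run on one node (pre1 is used here):
First create an authorization file (ssh reads it at login) to hold the public keys: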
bash-3.00$ touch ~/.ssh/authorized_keys
Append the public keys from every node to the authorization file created in the previous step:
bash-3.00$ cd ~/.ssh
bash-3.00$ ls
authorized_keys id_dsa id_dsa.pub id_rsa id_rsa.pub
bash-3.00$ ssh pre1 cat /export/home/oracle/.ssh/id_rsa.pub >> authorized_keys
The authenticity of host 'pre1 (172.0.2.1)' can't be established.
RSA key fingerprint is 0e:3d:ae:3a:49:88:ad:bb:e5:0a:c3:2a:02:35:b2:19.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'pre1,172.0.2.1' (RSA) to the list of known hosts.
Password:
bash-3.00$ ssh pre2 cat /export/home/oracle/.ssh/id_rsa.pub >> authorized_keys
The authenticity of host 'pre2 (172.0.2.3)' can't be established.
RSA key fingerprint is ef:9c:17:53:50:e5:b6:23:d0:89:a5:d8:ef:69:e3:a8.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'pre2,172.0.2.3' (RSA) to the list of known hosts.
Password:
bash-3.00$ ssh pre1 cat /export/home/oracle/.ssh/id_dsa.pub >> authorized_keys
bash-3.00$ ssh pre2 cat /export/home/oracle/.ssh/id_dsa.pub >> authorized_keys
Password:
➢ On pre1, copy the authorization file holding the public keys to pre2
bash-3.00$ scp authorized_keys pre2:`pwd`
Password:
authorized_keys 100% |*********************************************************************************| 1644 00:00
bash-3.00$
➢ Set the permissions on the authorization file
Run on every node:
bash-3.00$ chmod 600 ~/.ssh/authorized_keys
➢ Enable user equivalence
On the node where you will run OUI, run the following as the oracle user (pre1 is used here):
bash-3.00$ exec /usr/bin/ssh-agent $SHELL
$ ssh-add
Identity added: /export/home/oracle/.ssh/id_rsa (/export/home/oracle/.ssh/id_rsa)
Identity added: /export/home/oracle/.ssh/id_dsa (/export/home/oracle/.ssh/id_dsa)
➢ Verify that ssh is configured correctly
As the oracle user, run the following on every node:
ssh pre1 date
ssh pre2 date
ssh priv-pre1 date
ssh priv-pre2 date
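If each command prints the date without asking for a password, ssh authentication is configured correctly. All of these commands must be run on both nodes, and each one asks you to answer yes the first time it runs.
If you skip these commands, the clusterware installation fails with the following error even though ssh authentication is already configured: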
The specified nodes are not clusterable
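The reason is that, even with ssh configured, you still have to answer yes on the first connection to each host name before access to the other server is truly prompt-free.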
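The four date checks can be wrapped in a loop that fails loudly instead of prompting. This is a sketch under one assumption: the ssh client accepts `-o BatchMode=yes` (an OpenSSH client option that disables password and host-key prompts; check that the ssh on your system supports it). The helper name `check_equivalence` is hypothetical. A FAIL line means the host still needs a manual first connection or the keys are not in place.

```shell
#!/bin/sh
# SSH can be overridden for a dry run; it defaults to the ssh found in PATH.
SSH=${SSH:-ssh}

# check_equivalence: hypothetical helper. Runs `date` on each host with
# prompting disabled, so a pending password or host-key question is a FAIL.
check_equivalence() {
    rc=0
    for host in "$@"; do
        if "$SSH" -o BatchMode=yes "$host" date </dev/null >/dev/null 2>&1; then
            echo "OK   $host"
        else
            echo "FAIL $host"
            rc=1
        fi
    done
    return "$rc"
}

# Usage as the oracle user on each node:
#   check_equivalence pre1 pre2 priv-pre1 priv-pre2
```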
The required packages and patches differ by database version and OS version. Installing 10g RAC on Solaris 10 SPARC requires the following system packages:
SUNWarc
SUNWbtool
SUNWhea
SUNWlibC
SUNWlibm
SUNWlibms
SUNWsprot
SUNWtoo
SUNWi1of
SUNWxwfnt
The following command checks whether the required packages are installed:
bash-3.00# pkginfo -i SUNWarc SUNWbtool SUNWhea SUNWlibC SUNWlibm SUNWlibms SUNWsprot SUNWtoo SUNWi1of SUNWxwfnt
system SUNWarc Lint Libraries (usr)
system SUNWbtool CCS tools bundled with SunOS
system SUNWhea SunOS Header Files
system SUNWi1of ISO-8859-1 (Latin-1) Optional Fonts
system SUNWlibC Sun Workshop Compilers Bundled libC
system SUNWlibm Math & Microtasking Library Headers & Lint Files (Usr)
system SUNWlibms Math & Microtasking Libraries (Usr)
system SUNWsprot Solaris Bundled tools
system SUNWtoo Programming Tools
system SUNWxwfnt X Window System platform required fonts
Installing Oracle 10g RAC on Solaris 10 SPARC does not require any OS patches.
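When a package is missing, pkginfo reports it among a lot of other output. The sketch below reduces the result to just the missing names. The helper name `missing_pkgs` is made up, and it assumes the three-column pkginfo layout shown above (category, package name, description).

```shell
#!/bin/sh
# Package list copied from this post.
REQUIRED_PKGS="SUNWarc SUNWbtool SUNWhea SUNWlibC SUNWlibm SUNWlibms SUNWsprot SUNWtoo SUNWi1of SUNWxwfnt"

# missing_pkgs: hypothetical helper. Reads pkginfo output on stdin and
# prints each required package whose name is not in the second column.
missing_pkgs() {
    installed=" $(awk '{ print $2 }' | tr '\n' ' ') "
    for pkg in $REQUIRED_PKGS; do
        case $installed in
            *" $pkg "*) ;;
            *) printf '%s\n' "$pkg" ;;
        esac
    done
}

# Usage on a Solaris node (prints nothing when everything is installed):
#   pkginfo -i $REQUIRED_PKGS 2>/dev/null | missing_pkgs
```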
Set the environment variables on every node:
pre1: edit ~/.profile and add the following:
umask 022
ORACLE_SID=prerac1
ORACLE_BASE=/oracle/app
ORACLE_HOME=$ORACLE_BASE/product/10.2/database
ORA_CRS_HOME=$ORACLE_BASE/product/10.2/crs
NLS_LANG='SIMPLIFIED CHINESE_CHINA.ZHS16GBK'
PATH=$PATH:$ORACLE_HOME/bin
export ORACLE_SID ORACLE_BASE ORACLE_HOME ORA_CRS_HOME NLS_LANG PATH
pre2: edit ~/.profile and add the following:
umask 022
ORACLE_SID=prerac2
ORACLE_BASE=/oracle/app
ORACLE_HOME=$ORACLE_BASE/product/10.2/database
ORA_CRS_HOME=$ORACLE_BASE/product/10.2/crs
NLS_LANG='SIMPLIFIED CHINESE_CHINA.ZHS16GBK'
PATH=$PATH:$ORACLE_HOME/bin
export ORACLE_SID ORACLE_BASE ORACLE_HOME ORA_CRS_HOME NLS_LANG PATH
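On both nodes, create the directories and set their ownership and permissions (they must match the environment variables above):
mkdir -p /oracle/app/product/10.2/{database,crs}
chown -R oracle:oinstall /oracle
chmod -R 775 /oracle
On both nodes, add the following kernel parameters to /etc/system, then reboot: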
set noexec_user_stack=1
set semsys:seminfo_semmni=100
set semsys:seminfo_semmns=1024
set semsys:seminfo_semmsl=256
set semsys:seminfo_semvmx=32767
set shmsys:shminfo_shmmax=4294967295
set shmsys:shminfo_shmmin=1
set shmsys:shminfo_shmmni=100
set shmsys:shminfo_shmseg=10
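Before rebooting, it can help to confirm that every parameter made it into the file. A minimal sketch, assuming the settings live in /etc/system as `set name=value` lines and an awk that accepts `-v` (nawk on Solaris 10); the helper name `missing_params` is hypothetical.

```shell
#!/bin/sh
# Parameter names copied from this post.
REQUIRED_PARAMS="noexec_user_stack semsys:seminfo_semmni semsys:seminfo_semmns semsys:seminfo_semmsl semsys:seminfo_semvmx shmsys:shminfo_shmmax shmsys:shminfo_shmmin shmsys:shminfo_shmmni shmsys:shminfo_shmseg"

# missing_params: hypothetical helper. Prints every required parameter that
# has no "set name=value" line in the file given as the first argument.
missing_params() {
    for param in $REQUIRED_PARAMS; do
        awk -v p="$param" '
            $1 == "set" { split($2, kv, "="); if (kv[1] == p) found = 1 }
            END { exit !found }
        ' "$1" || printf '%s\n' "$param"
    done
}

# Usage (prints nothing when all nine parameters are set):
#   missing_params /etc/system
```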
Tune the UDP buffer sizes to make the network more efficient:
# ndd -set /dev/udp udp_xmit_hiwat 65536
# ndd -set /dev/udp udp_recv_hiwat 65536
To keep the change across reboots, create the file /etc/init.d/nddudp:
# vi /etc/init.d/nddudp
ndd -set /dev/udp udp_xmit_hiwat 65536
ndd -set /dev/udp udp_recv_hiwat 65536
Then create links to it in the rc1.d, rc2.d, and rcS.d directories under /etc. The link names must start with S70 or S71:
# ln -s -f /etc/init.d/nddudp /etc/rc1.d/S70nddudp
# ln -s -f /etc/init.d/nddudp /etc/rc2.d/S70nddudp
# ln -s -f /etc/init.d/nddudp /etc/rcS.d/S70nddudp
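After these steps, reboot the server so the settings take effect. Solaris 10 can also change these kernel parameters dynamically through resource controls without a reboot; that is not covered here.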
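The nddudp file shown earlier holds only the two ndd commands. A slightly fuller variant is sketched here, assuming the rc links call the script with the usual `start` argument. The `NDD` override and the function name `set_udp_buffers` are additions for illustration, not part of the original file.

```shell
#!/bin/sh
# Sketch of /etc/init.d/nddudp with a start/stop wrapper.
# NDD can be overridden for a dry run; on Solaris the tool is /usr/sbin/ndd.
NDD=${NDD:-/usr/sbin/ndd}

# Apply the two UDP buffer sizes used in this post.
set_udp_buffers() {
    "$NDD" -set /dev/udp udp_xmit_hiwat 65536
    "$NDD" -set /dev/udp udp_recv_hiwat 65536
}

case "${1:-}" in
start)
    set_udp_buffers
    ;;
stop)
    # Nothing to undo: ndd settings do not survive a reboot.
    ;;
esac
```

Run without arguments, the script does nothing, which makes it safe to source when testing the function on its own.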
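Oracle requires at least 1 GB of physical memory, swap space of 1.5 to 2 times physical memory, and at least 400 MB free in /tmp.
Check memory and disk space on all nodes with the following commands: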
$ /usr/sbin/prtconf | grep "Memory size"
Memory size: 4096 Megabytes
$ /usr/sbin/swap -s
total: 165384k bytes allocated + 30456k reserved = 195840k used, 11421296k available
$ df -k /tmp
Filesystem kbytes used avail capacity Mounted on
swap 11419720 112 11419608 1% /tmp
Check the processor architecture to make sure the software you downloaded matches it:
$ /bin/isainfo -kv
64-bit sparcv9 kernel modules
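The memory and /tmp thresholds stated above (1 GB and 400 MB) can be tested against the command output rather than read by eye. This is a sketch only; `mem_ok` and `tmp_ok` are hypothetical helper names, and they assume the prtconf and `df -k` output layouts shown in this post. The swap check is left out.

```shell
#!/bin/sh
# mem_ok: hypothetical helper. Reads prtconf output on stdin and succeeds
# when the "Memory size:" line reports at least 1024 MB.
mem_ok() {
    awk '/^Memory size:/ { mb = $3 + 0 } END { exit !(mb >= 1024) }'
}

# tmp_ok: hypothetical helper. Reads `df -k /tmp` output on stdin and
# succeeds when the avail column of the data line is at least 400 MB,
# expressed in KB.
tmp_ok() {
    awk 'NR == 2 { avail = $4 + 0 } END { exit !(avail >= 400 * 1024) }'
}

# Usage on a Solaris node:
#   /usr/sbin/prtconf | mem_ok && echo "memory ok"
#   df -k /tmp | tmp_ok && echo "/tmp ok"
```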
A raw device can be thought of as a partition. Following the earlier design, four raw devices are needed in total:
Ocr -- /dev/rdsk/c3t0d3s5
voting -- /dev/rdsk/c3t0d3s6
ASMDISK -- /dev/rdsk/c3t0d0s6 /dev/rdsk/c3t0d2s6
These raw devices must be made accessible to the oracle user.
On every node, run as root:
chown oracle:dba /dev/rdsk/c3t0d3s5
chown oracle:dba /dev/rdsk/c3t0d3s6
chown oracle:dba /dev/rdsk/c3t0d0s6
chown oracle:dba /dev/rdsk/c3t0d2s6
chmod 660 /dev/rdsk/c3t0d3s5
chmod 660 /dev/rdsk/c3t0d3s6
chmod 660 /dev/rdsk/c3t0d0s6
chmod 660 /dev/rdsk/c3t0d2s6
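Note: the cXtYdZsN entries are all symbolic links. After the change, ls -l on /dev/rdsk/c3t0d3s5 still shows root.root because it reports the link itself, but the ownership and permissions of the underlying device have changed (ls -lL follows the link and shows them).
➢ Verify that the network meets the requirements: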
$ ./runcluvfy.sh comp nodecon -n pre1,pre2 -verbose
Verifying node connectivity
ERROR:
User equivalence unavailable on all the nodes.
Verification cannot proceed.
Verification of node connectivity was unsuccessful on all the nodes.
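This error is common on Solaris: Oracle looks for the ssh and scp commands in /usr/local/bin, whereas on Solaris they are in /usr/bin.
The fix is simple: create links in /usr/local/bin that point to /usr/bin/ssh and /usr/bin/scp.
The steps are as follows.
On the node where cluvfy will be run:
As root, create the links: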
bash-3.00# mkdir -p /usr/local/bin
bash-3.00# ln -s -f /usr/bin/ssh /usr/local/bin/ssh
bash-3.00# ln -s -f /usr/bin/scp /usr/local/bin/scp
As the oracle user, load the ssh keys into the agent again:
$ exec /usr/bin/ssh-agent $SHELL
$ /usr/bin/ssh-add
Identity added: /export/home/oracle/.ssh/id_rsa (/export/home/oracle/.ssh/id_rsa)
Identity added: /export/home/oracle/.ssh/id_dsa (/export/home/oracle/.ssh/id_dsa)
Running the verification again now succeeds:
bash-3.00$ ./runcluvfy.sh comp nodecon -n pre1,pre2
Verifying node connectivity
Checking node connectivity...
Node connectivity check passed for subnet "172.0.2.0" with node(s) pre2,pre1.
Node connectivity check passed for subnet "10.0.0.0" with node(s) pre2,pre1.
Suitable interfaces for VIP on subnet "172.0.2.0":
pre2 ce0:172.0.2.3 ce0:172.0.2.4
pre1 ce0:172.0.2.1 ce0:172.0.2.2
Suitable interfaces for the private interconnect on subnet "10.0.0.0":
pre2 ce1:10.0.0.2
pre1 ce1:10.0.0.1
Node connectivity check passed.
Verification of node connectivity was successful.
➢ Check that the system meets the RAC installation requirements
On pre1, run the following command as the oracle user:
$ ./runcluvfy.sh comp sys -n pre1,pre2 -p crs -osdba crs -orainv oinstall
Verifying system requirement
Checking system requirements for 'crs'...
Total memory check passed.
Free disk space check passed.
Swap space check passed.
System architecture check passed.
Operating system version check passed.
Package existence check passed for "SUNWarc".
Package existence check passed for "SUNWbtool".
Package existence check passed for "SUNWhea".
Package existence check passed for "SUNWlibm".
Package existence check passed for "SUNWlibms".
Package existence check passed for "SUNWsprot".
Package existence check passed for "SUNWsprox".
Package existence check passed for "SUNWtoo".
Package existence check passed for "SUNWi1of".
Package existence check passed for "SUNWi1cs".
Package existence check passed for "SUNWi15cs".
Package existence check passed for "SUNWxwfnt".
Package existence check passed for "SUNWlibC".
Package existence check failed for "SUNWscucm:3.1".
Check failed on nodes:
pre2,pre1
Package existence check failed for "SUNWudlmr:3.1".
Check failed on nodes:
pre2,pre1
Package existence check failed for "SUNWudlm:3.1".
Check failed on nodes:
pre2,pre1
Package existence check failed for "ORCLudlm:Dev_Release_06/11/04,_64bit_3.3.4.8_reentrant".
Check failed on nodes:
pre2,pre1
Package existence check failed for "SUNWscr:3.1".
Check failed on nodes:
pre2,pre1
Package existence check failed for "SUNWscu:3.1".
Check failed on nodes:
pre2,pre1
Group existence check failed for "crs".
Check failed on nodes:
pre2,pre1
Group existence check passed for "oinstall".
User existence check passed for "oracle".
User existence check passed for "nobody".
System requirement failed for 'crs'
Verification of system requirement was unsuccessful on all the nodes.
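Although some of the checks above failed, the failed package checks all relate to Sun Cluster. We are using CRS here, so those errors can be ignored. The failed group check for "crs" comes from the -osdba crs argument on the command line; no group with that name was created.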
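With output this long it is easy to miss a failure that matters. The sketch below pulls out only the failed checks so they can be reviewed one by one. The helper name `failed_checks` is hypothetical, and it assumes the message format shown in the output above.

```shell
#!/bin/sh
# failed_checks: hypothetical helper. Reads cluvfy output on stdin and
# prints one "<check type>: <item>" line for each failed check.
failed_checks() {
    sed -n 's/^\(.*\) check failed for "\([^"]*\)"\.$/\1: \2/p'
}

# Usage:
#   ./runcluvfy.sh comp sys -n pre1,pre2 -p crs -osdba crs -orainv oinstall | failed_checks
```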
The preparation is now complete. The next step is installing clusterware.