Today I set up my ODA as a virtualized platform, which has been possible since the 2.5 image. I did my installation with the latest stable release (2.9) available right now. I want to show you how to set up a shared repository, which has been possible since the 2.8 image. First the theory:
I will walk through everything from setting up the repository to installing the virtual machine:
1. Create a shared repository
Show the current configuration:
>oakcli show repo
NAME      TYPE    NODENUM  STATE
odarepo1  local   0        N/A
odarepo2  local   1        N/A
Create repository shared1 in Diskgroup +RECO with the size of 250GB:
>oakcli create repo shared1 -dg reco -size 250
Created Shared Repo : SharedRepoType
Show new configuration:
>oakcli show repo
NAME      TYPE    NODENUM  STATE
odarepo1  local   0        N/A
odarepo2  local   1        N/A
shared1   shared  0        ONLINE
shared1   shared  1        ONLINE
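If you want to script this check, a minimal sketch could parse the output of oakcli show repo and confirm that the repository is ONLINE on both nodes. The function name repo_online_on_all_nodes is my own invention (it is not part of oakcli), and the sketch assumes the four-column layout NAME TYPE NODENUM STATE shown above. It reads the listing from stdin, so it can be tried without an ODA.

```shell
#!/bin/sh
# Hypothetical helper (not part of oakcli): succeeds only if the named repo
# is listed as ONLINE on the expected number of nodes (default: 2).
# Assumes whitespace-separated columns: NAME TYPE NODENUM STATE.
repo_online_on_all_nodes() {
    repo="$1"
    expected="${2:-2}"
    awk -v repo="$repo" -v expected="$expected" '
        $1 == repo && $4 == "ONLINE" { online++ }
        END { exit ((online + 0 == expected + 0) ? 0 : 1) }
    '
}

# Usage on ODA_BASE (shown as a comment, not executed here):
#   oakcli show repo | repo_online_on_all_nodes shared1 \
#       && echo "shared1 is online on both nodes"
```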
Show details:
>oakcli show repo shared1 -node 0
Resource: shared1_0
        AutoStart       : restore
        DG              : RECO
        Device          : /dev/asm/shared1-72
        ExpectedState   : Online
        MountPoint      : /u01/app/sharedrepo/shared1
        Name            : shared1_0
        Node            : all
        RepoType        : shared
        Size            : 256000
        State           : Online
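The Size value in the details appears to be reported in MB rather than GB; this is my reading of the output, not something the oakcli output states. The arithmetic matches the 250 GB requested at creation time, as this small sketch shows (gb_to_mb is a hypothetical helper name):

```shell
#!/bin/sh
# Hypothetical helper: convert a size in GB to MB (1 GB = 1024 MB).
gb_to_mb() {
    echo $(( $1 * 1024 ))
}

gb_to_mb 250   # prints 256000, the Size shown for shared1
```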
2. Check the implementation on the oda_base nodes
>showmount -e
Export list for oda_base_node0:
/u01/app/sharedrepo/shared1 192.168.16.24

>showmount -e
Export list for oda_base_node1:
/u01/app/sharedrepo/shared1 192.168.16.25
ATTENTION: The exports are created dynamically by oak. If you restart the NFS service on ODA_BASE, the configuration is lost until oak reapplies it. Note that >cat /etc/exports will return nothing!
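Because /etc/exports stays empty, the export list from showmount -e is the place to look after an NFS restart. A minimal sketch, assuming the output format shown above; export_present is a hypothetical helper name, and it reads the export list from stdin:

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given path appears as an exported
# directory in a `showmount -e` style listing read from stdin.
export_present() {
    awk -v path="$1" '
        $1 == path { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Usage on ODA_BASE (shown as a comment, not executed here):
#   showmount -e | export_present /u01/app/sharedrepo/shared1 \
#       || echo "export missing, oak has not reapplied it yet"
```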
3. Check the implementation in vm hosts
[root@odavm01 ~]# df -h
Filesystem                                 Size  Used Avail Use% Mounted on
/dev/md1                                    19G  1.1G   17G   6% /
tmpfs                                      1.7G     0  1.7G   0% /dev/shm
/dev/md3                                   519G  209G  284G  43% /OVS
/dev/md0                                   487M   41M  421M   9% /boot
none                                       1.7G  112K  1.7G   1% /var/lib/xenstored
192.168.16.27:/u01/app/sharedrepo/shared1  250G  649M  250G   1% /OVS/Repositories/shared1

[root@odavm02 ~]# df -h
Filesystem                                 Size  Used Avail Use% Mounted on
/dev/md1                                    19G  1.1G   17G   6% /
tmpfs                                      1.7G     0  1.7G   0% /dev/shm
/dev/md3                                   519G  181G  311G  37% /OVS
/dev/md0                                   487M   41M  421M   9% /boot
none                                       1.7G  112K  1.7G   1% /var/lib/xenstored
192.168.16.28:/u01/app/sharedrepo/shared1  250G  649M  250G   1% /OVS/Repositories/shared1
ATTENTION: The NFS repository is mounted with Oracle best-practice options, so these are HARD mounts, which will cause a problem on a simple reboot!
192.168.16.27:/u01/app/sharedrepo/shared1 on /OVS/Repositories/shared1 type nfs (rw,bg,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,nfsvers=3,timeo=600,addr=192.168.16.27)
192.168.16.28:/u01/app/sharedrepo/shared1 on /OVS/Repositories/shared1 type nfs (rw,bg,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,nfsvers=3,timeo=600,addr=192.168.16.28)
So please stop all VMs except oda_base on the VM host before rebooting. Otherwise the reboot will hang while unmounting the NFS mount point!
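Before a reboot it can help to see which repositories are hard-mounted over NFS on the VM host. The following is a minimal sketch, assuming the mount output format shown above; list_hard_nfs_mounts is a hypothetical helper name, and it reads the mount listing from stdin:

```shell
#!/bin/sh
# Hypothetical helper: print the mount point of every NFS mount on stdin
# (in `mount` output format) whose option list contains "hard".
# Expected fields: <device> on <mountpoint> type nfs (<opt>,<opt>,...)
list_hard_nfs_mounts() {
    awk '
        $4 == "type" && $5 == "nfs" {
            opts = $6
            gsub(/[()]/, "", opts)          # strip the surrounding parentheses
            n = split(opts, o, ",")
            for (i = 1; i <= n; i++)
                if (o[i] == "hard") { print $3; break }
        }
    '
}

# Usage on the VM host (shown as a comment, not executed here):
#   mount | list_hard_nfs_mounts
```

If the function prints /OVS/Repositories/shared1, the VMs on that repository should be stopped before the host is rebooted.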
4. Create vm template
There is nothing special about importing on the local node, so it is straightforward:
>oakcli import vmtemplate w2k8_template -assembly /tmp/stage/w2k8_template.ova -repo odarepo1
Imported VM Template

>oakcli show vmtemplate
NAME           MEMORY  VCPU  REPOSITORY
w2k8_template  4096    1     odarepo1
5. Create virtual machine
>oakcli clone vm w2k8 -vmtemplate w2k8_template -repo shared1 -node 0
Cloned VM Template
We have created a virtual machine that starts only on node 0 and is located on the shared repository “shared1”.
5. Configure virtual machine
>oakcli show vm w2k8
Resource: w2k8
        AutoStart       : restore
        CPUPriority     : 100
        Disks           : |file:/OVS/Repositories/shared1/VirtualMachines/w2k8/183d0200dd0c485f8a47a2880b7daf0a.img,xvda,w|
        Domain          : XEN_PVM
        ExpectedState   : online
        FailOver        : true
        IsSharedRepo    : true
        Keyboard        : de
        MaxMemory       : 8192
        MaxVcpu         : 4
        Memory          : 8192
        Mouse           : USB_MOUSE
        Name            : w2k8
        Networks        : |type=netfront,bridge=net1||type=netfront,bridge=net2|
        NodeNum         : 1
        NodeNumStart    : 1
        OS              : WIN_2008
        PrefNodeNum     : 0
        PrivateIP       : None
        ProcessorCap    : 100
        RepoName        : shared1
        State           : Online
        TemplateName    : otml_w2k8_template
        Vcpu            : 4
        cpupool         : reports_pool
        vncport         : 0
You may configure your vm now.
ATTENTION: While cloning a VM you may run into a bug that causes an Oracle cluster crash of ASM on one of the two nodes. The private interconnect may stop responding until you force-unmount the ACFS repository on the VM host!
You may get the following:
...
64 bytes from 192.168.16.28: icmp_seq=675 ttl=64 time=0.347 ms
64 bytes from 192.168.16.28: icmp_seq=676 ttl=64 time=0.290 ms
64 bytes from 192.168.16.28: icmp_seq=677 ttl=64 time=0.298 ms
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
...
or
...
nfs: server 192.168.16.27 not responding, still trying
nfs: server 192.168.16.27 not responding, still trying
INFO: task cp:29088 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
cp D 0000000000000000 0 29088 29087 0x00000000
ffff880146cddb68 0000000000000282 ffff880146cdda98 0000000000012180
ffff88015121c240 ffff880154ebe2c0 ffff880146cddb28 ffff8801011f3bc0
ffff880146cddab8 ffffffff810410c1 ffff880146cddab8 00000000009824aa
Call Trace:
[] ? pvclock_get_nsec_offset+0x11/0x50
[] ? pvclock_clocksource_read+0x29/0x80
[] ? xen_clocksource_read+0x20/0x30
[] ? xen_clocksource_get_cycles+0x9/0x10
[] ? ktime_get_ts+0xb1/0xf0
[] schedule+0x45/0x60
[] io_schedule+0x71/0xb0
[] sleep_on_page+0xe/0x20
[] __wait_on_bit+0x5e/0x90
[] ? __lock_page+0x80/0x80
[] wait_on_page_bit+0x75/0x80
[] ? autoremove_wake_function+0x50/0x50
[] ? pagevec_lookup_tag+0x27/0x40
[] filemap_fdatawait_range+0xf7/0x180
[] ? nfs_file_direct_read_iter+0x50/0x50 [nfs]
[] filemap_fdatawait+0x2b/0x30
[] writeback_single_inode+0x1ce/0x230
[] sync_inode+0x4a/0x80
[] nfs_wb_all+0x46/0x50 [nfs]
[] nfs_setattr+0x12f/0x140 [nfs]
[] notify_change+0x18b/0x2f0
[] do_truncate+0x63/0x90
[] do_sys_ftruncate+0x114/0x130
[] sys_ftruncate+0x13/0x20
[] system_call_fastpath+0x16/0x1b
...
Reference:
Bug 17896838 – DOMU LOST NETWORK CONNECTION DURING COPYING BIG FILES OVER NFS
Workaround:
This issue is caused by known kernel bug 17896838, which is fixed in kernel 2.6.39-214.1 and above. The ODA July release (2.10.0.0.0?) will come with a new kernel that includes the fix. In the meantime, the workaround is to set or increase dom0_mem in /boot/grub/grub.conf to 8192M on dom0 and then reboot dom0:
kernel /xen.gz dom0_mem=8192M crashkernel=256M@64M
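To see which value is currently configured before editing the file, a minimal sketch could extract dom0_mem from the kernel line. The helper name dom0_mem_value is hypothetical, and the sketch assumes the kernel line looks like the one above; it reads the grub.conf contents from stdin and ignores commented-out lines:

```shell
#!/bin/sh
# Hypothetical helper: print the dom0_mem value from the first uncommented
# "kernel" line read from stdin (grub.conf format). Prints nothing if the
# parameter is not set.
dom0_mem_value() {
    sed -n 's/^[[:space:]]*kernel[[:space:]].*dom0_mem=\([^[:space:]]*\).*/\1/p' \
        | head -n 1
}

# Usage on dom0 (shown as a comment, not executed here):
#   dom0_mem_value < /boot/grub/grub.conf
```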
As a result of my SR, a note has been published that you can refer to:
ODA DOM0 Lost Network Communications Between Nodes While running ‘oakcli clone vm’ or Copy (cp) Commands of Large Files over NFS (Doc ID 1665055.1)
6. Done, now start up vm and have fun!
>oakcli show vm
NAME  NODENUM  MEMORY  VCPU  STATE   REPOSITORY
w2k8  1        8192    4     ONLINE  shared1