Oracle Database Appliance // Virtualization // Shared Repositories


Today I’ve set up my ODA as a virtualized platform, which has been possible since the 2.5 image. I did the installation with the latest stable release available right now (2.9). I want to show you how to set up a shared repository, which has been possible since the 2.8 image. First the theory:

[Figure: shared repository architecture (shared_repo_arch). Source: docs.oracle.com]

I will walk through everything from setting up the repository to installing the virtual machine:

1. Create a shared repository

Show the current configuration:

>oakcli show repo

          NAME                          TYPE            NODENUM         STATE
          odarepo1                      local           0               N/A       
          odarepo2                      local           1               N/A    

Create the repository shared1 in disk group +RECO with a size of 250 GB:

>oakcli create repo shared1 -dg reco -size 250

Created Shared Repo : SharedRepoType

Show new configuration:

>oakcli show repo       
           
          NAME                          TYPE            NODENUM         STATE                                                                
          odarepo1                      local           0               N/A 
          odarepo2                      local           1               N/A 
          shared1                       shared          0               ONLINE                                                               
          shared1                       shared          1               ONLINE                   
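
If you want to script this check, the listing can be filtered with awk. This is only a minimal sketch: it assumes the four-column layout (NAME, TYPE, NODENUM, STATE) shown above stays the same, and the function name repo_online_nodes is made up for illustration, it is not part of oakcli.

```shell
# Hypothetical helper, not part of oakcli.
# Reads `oakcli show repo` output on stdin and prints the node numbers
# on which the given repository is reported as ONLINE.
# Assumes the column order NAME TYPE NODENUM STATE shown above.
repo_online_nodes() {
    awk -v repo="$1" '$1 == repo && $4 == "ONLINE" { print $3 }'
}

# Usage on ODA_BASE:
#   oakcli show repo | repo_online_nodes shared1
```

For a healthy shared repository you would expect one line per node.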

Show details:

>oakcli show repo shared1 -node 0
Resource: shared1_0
        AutoStart       :       restore        
        DG              :       RECO           
        Device          :       /dev/asm/shared1-72
        ExpectedState   :       Online         
        MountPoint      :       /u01/app/sharedrepo/shared1
        Name            :       shared1_0      
        Node            :       all            
        RepoType        :       shared         
        Size            :       256000         
        State           :       Online        

2. Check the implementation on the oda_base nodes

>showmount -e
Export list for oda_base_node0:
/u01/app/sharedrepo/shared1 192.168.16.24

>showmount -e
Export list for oda_base_node1:
/u01/app/sharedrepo/shared1 192.168.16.25

ATTENTION: The exports are created dynamically by oak. If you restart the NFS service on ODA_BASE, the exports are lost until oak reapplies them. They are not written to the exports file, so >cat /etc/exports will show nothing!
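
Because the export can silently disappear after an NFS restart, a quick check against the showmount output may be handy. A minimal sketch, assuming the showmount -e format shown above (one header line, then one "path clients" line per export); export_listed is a hypothetical name of my own.

```shell
# Hypothetical helper: reads `showmount -e` output on stdin and returns
# success (exit code 0) only if the given path is in the export list.
# The first line ("Export list for ...") is skipped.
export_listed() {
    awk -v path="$1" 'NR > 1 && $1 == path { found = 1 } END { exit !found }'
}

# Usage on ODA_BASE:
#   showmount -e | export_listed /u01/app/sharedrepo/shared1 \
#       || echo "export for shared1 is missing"
```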

3. Check the implementation in vm hosts

[root@odavm01 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/md1               19G  1.1G   17G   6% /
tmpfs                 1.7G     0  1.7G   0% /dev/shm
/dev/md3              519G  209G  284G  43% /OVS
/dev/md0              487M   41M  421M   9% /boot
none                  1.7G  112K  1.7G   1% /var/lib/xenstored
192.168.16.27:/u01/app/sharedrepo/shared1
                      250G  649M  250G   1% /OVS/Repositories/shared1

[root@odavm02 ~]# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/md1               19G  1.1G   17G   6% /
tmpfs                 1.7G     0  1.7G   0% /dev/shm
/dev/md3              519G  181G  311G  37% /OVS
/dev/md0              487M   41M  421M   9% /boot
none                  1.7G  112K  1.7G   1% /var/lib/xenstored
192.168.16.28:/u01/app/sharedrepo/shared1
                      250G  649M  250G   1% /OVS/Repositories/shared1

ATTENTION: The NFS share is mounted with the Oracle best-practice options, so these are HARD mounts, which become a problem on a simple reboot!

192.168.16.27:/u01/app/sharedrepo/shared1 on /OVS/Repositories/shared1 type nfs (rw,bg,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,nfsvers=3,timeo=600,addr=192.168.16.27)

192.168.16.28:/u01/app/sharedrepo/shared1 on /OVS/Repositories/shared1 type nfs (rw,bg,hard,nointr,rsize=32768,wsize=32768,tcp,actimeo=0,nfsvers=3,timeo=600,addr=192.168.16.28)

So please stop all VMs except oda_base on the VM host before rebooting. Otherwise the reboot will hang while unmounting the NFS mount point!
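
To see before a reboot which mount points are affected, the mount output can be filtered for NFS entries that carry the hard option. A minimal sketch, assuming the classic "SRC on DIR type nfs (options)" line format shown above; hard_nfs_mounts is a hypothetical helper name.

```shell
# Hypothetical helper: reads `mount` output on stdin and prints the mount
# point of every NFS mount whose option list contains "hard".
# Expects lines of the form: SRC on DIR type nfs (opt1,opt2,...)
hard_nfs_mounts() {
    awk '$5 == "nfs" || $5 == "nfs4" {
        opts = $6
        gsub(/[()]/, "", opts)          # strip the surrounding parentheses
        n = split(opts, o, ",")
        for (i = 1; i <= n; i++)
            if (o[i] == "hard") { print $3; break }
    }'
}

# Usage on the VM host:
#   mount | hard_nfs_mounts
```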

4. Create vm template

There is nothing special about this; the import is done on the local node. So, straightforward:

>oakcli import vmtemplate w2k8_template -assembly /tmp/stage/w2k8_template.ova -repo odarepo1

Imported VM Template 

>oakcli show vmtemplate

        NAME                                            MEMORY          VCPU            REPOSITORY
        w2k8_template                                   4096               1            odarepo1            

5. Create virtual machine

>oakcli clone vm w2k8 -vmtemplate w2k8_template -repo shared1 -node 0

Cloned VM Template 

We have created a virtual machine that starts only on node 0 and is located in the shared repository “shared1”.

5. Configure virtual machine

>oakcli show vm w2k8
Resource: w2k8
        AutoStart       :       restore        
        CPUPriority     :       100            
        Disks           :       |file:/OVS/Repositories/shared1/Vir
                                tualMachines/w2k8/183d0200dd0c
                                485f8a47a2880b7daf0a.img,xvda,w|
        Domain          :       XEN_PVM        
        ExpectedState   :       online         
        FailOver        :       true           
        IsSharedRepo    :       true           
        Keyboard        :       de             
        MaxMemory       :       8192           
        MaxVcpu         :       4              
        Memory          :       8192           
        Mouse           :       USB_MOUSE      
        Name            :       w2k8      
        Networks        :       |type=netfront,bridge=net1||type=ne
                                tfront,bridge=net2|
        NodeNum         :       1              
        NodeNumStart    :       1              
        OS              :       WIN_2008       
        PrefNodeNum     :       0              
        PrivateIP       :       None           
        ProcessorCap    :       100            
        RepoName        :       shared1        
        State           :       Online         
        TemplateName    :       otml_w2k8_template
        Vcpu            :       4              
        cpupool         :       reports_pool   
        vncport         :       0              
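
If you need single values from this listing in a script, for example to compare NodeNum with PrefNodeNum, the "Key : Value" lines can be picked apart with awk. A minimal sketch, assuming the layout shown above; vm_field is a hypothetical name, and wrapped continuation lines (as in Disks and Networks) are ignored, so only the first line of such values is returned.

```shell
# Hypothetical helper: reads `oakcli show vm <name>` output on stdin and
# prints the value of the first line whose key matches exactly.
# Splits at the first colon only, so values containing ":" stay intact.
vm_field() {
    awk -v key="$1" '{
        i = index($0, ":")
        if (i == 0) next                 # continuation line, no key
        k = substr($0, 1, i - 1)
        v = substr($0, i + 1)
        gsub(/^[ \t]+|[ \t]+$/, "", k)   # trim key
        gsub(/^[ \t]+|[ \t]+$/, "", v)   # trim value
        if (k == key && !done) { print v; done = 1 }
    }'
}

# Usage on ODA_BASE:
#   oakcli show vm w2k8 | vm_field NodeNum
#   oakcli show vm w2k8 | vm_field PrefNodeNum
```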

You may configure your vm now.

ATTENTION: While cloning a VM you may run into a bug that causes an Oracle cluster (ASM) crash on one of the two nodes. The private interconnect may stop responding until you force-unmount the ACFS repository on the VM host!
You may get the following:

...
64 bytes from 192.168.16.28: icmp_seq=675 ttl=64 time=0.347 ms
64 bytes from 192.168.16.28: icmp_seq=676 ttl=64 time=0.290 ms
64 bytes from 192.168.16.28: icmp_seq=677 ttl=64 time=0.298 ms
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
ping: sendmsg: No buffer space available
...

or

...
nfs: server 192.168.16.27 not responding, still trying
nfs: server 192.168.16.27 not responding, still trying
INFO: task cp:29088 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
cp              D 0000000000000000     0 29088  29087 0x00000000
 ffff880146cddb68 0000000000000282 ffff880146cdda98 0000000000012180
 ffff88015121c240 ffff880154ebe2c0 ffff880146cddb28 ffff8801011f3bc0
 ffff880146cddab8 ffffffff810410c1 ffff880146cddab8 00000000009824aa
Call Trace:
 [] ? pvclock_get_nsec_offset+0x11/0x50
 [] ? pvclock_clocksource_read+0x29/0x80
 [] ? xen_clocksource_read+0x20/0x30
 [] ? xen_clocksource_get_cycles+0x9/0x10
 [] ? ktime_get_ts+0xb1/0xf0
 [] schedule+0x45/0x60
 [] io_schedule+0x71/0xb0
 [] sleep_on_page+0xe/0x20
 [] __wait_on_bit+0x5e/0x90
 [] ? __lock_page+0x80/0x80
 [] wait_on_page_bit+0x75/0x80
 [] ? autoremove_wake_function+0x50/0x50
 [] ? pagevec_lookup_tag+0x27/0x40
 [] filemap_fdatawait_range+0xf7/0x180
 [] ? nfs_file_direct_read_iter+0x50/0x50 [nfs]
 [] filemap_fdatawait+0x2b/0x30
 [] writeback_single_inode+0x1ce/0x230
 [] sync_inode+0x4a/0x80
 [] nfs_wb_all+0x46/0x50 [nfs]
 [] nfs_setattr+0x12f/0x140 [nfs]
 [] notify_change+0x18b/0x2f0
 [] do_truncate+0x63/0x90
 [] do_sys_ftruncate+0x114/0x130
 [] sys_ftruncate+0x13/0x20
 [] system_call_fastpath+0x16/0x1b
...

Reference:
Bug 17896838 – DOMU LOST NETWORK CONNECTION DURING COPYING BIG FILES OVER NFS

Workaround:
This issue is caused by the known kernel bug 17896838, which is fixed in kernel 2.6.39-214.1 and above. The ODA July release (2.10.0.0.0?) will ship with a new kernel that includes the fix. In the meantime, the workaround is to set or increase dom0_mem to 8192M in /boot/grub/grub.conf on dom0 and then reboot dom0:

kernel /xen.gz dom0_mem=8192M crashkernel=256M@64M
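
To verify what is currently configured before and after the change, the dom0_mem value can be pulled out of the kernel line. A minimal sketch, assuming a legacy GRUB config with a "kernel /xen.gz ..." line like the one above; dom0_mem_setting is a hypothetical helper name.

```shell
# Hypothetical helper: prints the dom0_mem value from every uncommented
# "kernel ... xen.gz ..." line. Reads the files given as arguments, or
# stdin when called without arguments.
dom0_mem_setting() {
    sed -n 's/^[[:space:]]*kernel .*xen\.gz.*dom0_mem=\([^[:space:]]*\).*/\1/p' "$@"
}

# Usage on dom0:
#   dom0_mem_setting /boot/grub/grub.conf
```

Note that this only shows what is in the file; the new value takes effect after the dom0 reboot.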

As a result of my SR, a note has been published that you can refer to:
ODA DOM0 Lost Network Communications Between Nodes While running ‘oakcli clone vm’ or Copy (cp) Commands of Large Files over NFS (Doc ID 1665055.1)

6. Done. Now start up the VM and have fun!

>oakcli show vm

          NAME                                  NODENUM         MEMORY          VCPU            STATE           REPOSITORY
          w2k8                                  1               8192               4            ONLINE          shared1