Friday, April 10, 2015

Flexpod - Setting up a UCS Mini with Direct Attached Storage (DAS)

Overview
This entry documents what we did to set up a UCS Mini Flexpod with direct-attached storage for a remote office, using iSCSI.

This document will not cover step-by-step instructions for common UCS or NetApp tasks (such as creating service profiles or a storage volume), only the things specific to setting them up in the UCS Mini configuration.

Equipment
The hardware used:

Type                   Model                           Firmware / OS
Chassis                UCS 5108 AC2                    3.0(1c)
Fabric Interconnects   UCS 6324                        3.0(1c)
Blades                 UCS B200 M3                     3.0(1c)
Storage                FAS 2552 (no external storage)  Switchless cDOT 8.3

Reference
For information on how to set up a UCS chassis from scratch, please see this awesome guide:

I have used this guide to set up two standard UCS chassis, as well as a framework for setting up the UCS Mini.

Assumptions
I am assuming:
  • The equipment is un-boxed and racked
    • Do not turn anything on yet, just get it in the racks
  • Proper network infrastructure exists (Cisco switches)
    • I am not a network guy, so sadly this guide will not include any network configurations besides setting IP addresses.
Storage Configuration
The storage will be set up first.  My example is a FAS 2552 with dual controllers and no external storage, in a switchless cDOT 8.3 cluster.

  1. Download the official NetApp setup program and guide from the NOW site
    • http://mysupport.netapp.com/NOW/public/system_setup/
    • I am using the FAS 2552 setup guide
  2. Plug in the power, but do not turn it on. Leave the power supply switch off.
  3. Using the supplied network cable, connect the two ACP ports (each ACP port is marked with a wrench-and-padlock icon).
  4. Using SAS cables, connect the left SAS port on controller 1 into the right SAS port on controller 2.
  5. Connect the right SAS port on controller 1 into the left SAS port on controller 2.
  6. Using 10 Gb SFP+ cables, connect e0e on controller 1 to e0e on controller 2.
  7. Connect e0f on controller 1 to e0f on controller 2.
  8. Connect Remote Management port (wrench icon) to your management network.
  9. When fully cabled, it should look something like this (photo not shown here):
    • Forgive the cable mess; this was a temporary setup before we shipped it to the remote site.
    • Ports e0c and e0d go to the Fabric Interconnects; we'll get to that later.
  10. Cable a Windows workstation to the same network that the NetApp's management ports are connected to.
  11. Install and run the NetApp_SystemStartup.exe
  12. Press the Discover button
  13. Turn on the controller power switches when directed
  14. Follow the onscreen instructions to configure cluster information, authentication, disk, licenses, etc.
    • Unfortunately, I did not take any screen shots during this part or take good notes like I should have.
  15. If everything is cabled and configured correctly, the NetApp can be accessed over SSH or using OnCommand.
  16. Create an SVM with iSCSI enabled (a rough CLI sketch of this step follows below).
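
For reference, here is a rough CLI sketch of step 16, run from the cluster shell. The SVM, root volume, aggregate, and LIF names below are placeholders I made up; adjust them (and the IP information) for your environment:

  Create the SVM:
    vserver create -vserver iscsi_svm -rootvolume iscsi_svm_root -aggregate aggr1_node1 -rootvolume-security-style unix
  Enable iSCSI on it:
    vserver iscsi create -vserver iscsi_svm
  Add an iSCSI data LIF on one of the appliance-facing ports (repeat per port and controller):
    network interface create -vserver iscsi_svm -lif iscsi_lif_a1 -role data -data-protocol iscsi -home-node <node1> -home-port e0c -address <ip address> -netmask <netmask>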

UCS Mini Configuration
Please follow the Speak Virtual UCS setup guide to get the UCS Mini mostly set up; I am going to document what was different here. Also remember that this guide configures everything for use with iSCSI.

The first thing that was different was configuring the Unified Ports. On each Fabric Interconnect:
  1. On LAN > Appliances > Fab a/b > VLANs
    • Create a vlan. I called mine "iSCSI-App-A/B"
      • Type in a vlan ID that is not in use on your network
      • Select Fab A or B, not both
      • Not native vlan
  2. On LAN > Policies > Appliances > Network Control Policies
    • Create a policy called "AppModeNCP"
      • CDP Disabled
      • MAC Register Mode: Only native vlan
      • Action on Uplink Fail: Warning
  3. On Equipment > Fab Interconnects > fab a/b > Select Configure Unified Ports
    • Set ports 1 and 2 to Appliance Ports. 
      • Set the VLAN to the VLAN created in step 1
      • Set the access control policy to AppModeNCP
      • The mode is Access
    • Set ports 3 and 4 to Network port
      • Mode is Trunk
  4. Wait for the Interconnect to reboot. Then change the other Fabric Interconnect.
  5. On LAN > Policies  > root > vNIC Templates
    1. Create a vNIC Template for iSCSI_A
      1. Select Fab A
      2. Select Adapter
      3. Select iSCSI-App-A for vlan
      4. Updating Template
      5. 9000 MTU
      6. ISCSI_A MAC pool (see the Speak Virtual doc on how to create a MAC pool)
      7. QoS Policy: VMWare
      8. Network Control Policy: AppModeNCP
      9. Dynamic vNIC
    2. Create a vNIC Template for iSCSI_B
      1. Select Fab B
      2. Select Adapter
      3. Select iSCSI-App-B for vlan
      4. Updating Template
      5. 9000 MTU (a jumbo-frame check from ESXi is sketched after this list)
      6. ISCSI_B MAC pool (see the Speak Virtual doc on how to create a MAC pool)
      7. QoS Policy: VMWare
      8. Network Control Policy: AppModeNCP
      9. Dynamic vNIC
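
Since both vNIC templates above are set to 9000 MTU, it is worth confirming that jumbo frames actually pass end to end once an ESXi host is running on the blades. A minimal check I'd run from the ESXi shell, assuming vmk1 is the iSCSI VMkernel port and 192.168.10.10 is one of the NetApp iSCSI LIFs (both made up here):

  vmkping -I vmk1 -d -s 8972 192.168.10.10

The -d flag disables fragmentation and 8972 is 9000 bytes minus the IP/ICMP overhead, so a reply means jumbo frames work across the whole path (vNIC, Fabric Interconnect, appliance port, and NetApp port).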

Next up is cabling; a quick port check from the NetApp side is sketched after this list.
  1. On Fab A
    • Connect the top two ports to e0c on both NetApp controllers 1 and 2
    • Connect bottom two ports to 10 gig ports on the network switches.
  2. On Fab B
    • Connect the top two ports to e0d on both NetApp controllers 1 and 2.
    • Connect bottom two ports to 10 gig ports on the network switches.
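
With the cabling in place, a quick sanity check from the NetApp cluster shell should show the links to the Fabric Interconnects as up (this is just a sketch; the exact fields available can vary slightly between ONTAP releases):

  network port show -fields link,mtu

Look for e0c and e0d on both controllers reporting the link as up, and verify the MTU matches what you expect for your iSCSI traffic.
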
There's some more to it; I will update this page when I figure out the rest of what I did.

Friday, March 13, 2015

NetApp - Using SnapMirror to Migrate a 7mode NAS Volume to cDOT

At work, we're in the process of migrating from two FAS3250s running 7-mode to two FAS8040s running clustered Data ONTAP (cDOT). Both sets of controllers are running side by side, and we need to migrate everything from the 3250s to the 8040s.

First hurdle: we have a bunch of ESXi hosts that boot from SAN, and it turns out it is impossible to SnapMirror SAN (FCP or iSCSI) volumes between 7-mode and cDOT. We had to rebuild all our ESXi hosts and do a full restore of our SQL cluster database, since it was also on SAN volumes.

Next challenge: getting our many NFS volumes transferred over.  NetApp offers a transition tool to help with this process, but I did not try it. I found a bunch of guides on the NOW site and random blogs, but they all missed a couple of important steps which I wanted to document.

Here is the process I followed for migrating a 7-mode NAS (NFS) volume to cDOT. THIS DOES NOT WORK FOR SAN (FCP/iSCSI) VOLUMES! I'm assuming the reader already has their cDOT system up and running, with SVMs and LIFs configured.

CLUSTER = cDOT cluster name
NODE = individual controller node in the cDOT cluster
VSERVER = NAS SVM name
7MODE = 7mode controller with the source volume on it

Tab completion is your friend!

  1. SSH into the cluster using your favorite SSH client. Sign in with admin account.
  2. Set up a peer relationship
    1. vserver peer transition create -local-vserver VSERVER -src-filer-name 7MODE
  3. Create destination volume to snap mirror to
    1. volume create -volume <volume name> -aggregate <aggrname> -size 100GB -type DP
  4. Create an Intercluster LIF
    1. network interface create -vserver NODE -lif intclust01 -role intercluster -home-node NODE -home-port <port that has connectivity to 7MODE> -address <ip address> -netmask <netmask>
    2. network routing-groups route create -vserver NODE -routing-group <name> -destination 0.0.0.0/0 -gateway <gateway of the intercluster LIF IP>
  5. Verify communication
    1. network ping -lif intclust01 -lif-owner NODE -destination 7MODE
      1. It should report that the destination is alive.
  6. IMPORTANT! This is where the other guides I found on the Internet failed.
  7. On 7MODE, edit the /etc/snapmirror.allow file.
  8. On a new line, put
    1. interclusterLIF:<volume to SnapMirror>
    2. example: fas8040-01:nfs_volume_01
    3. The intercluster LIF name needs to resolve on the 7MODE system exactly as you typed it here. I had to edit the /etc/hosts file on the 7MODE system and add an entry mapping my intercluster IP address to fas8040-01 before it would work for me (see the sketch after this list).
  9. Back on the CLUSTER shell. Create the SnapMirror relationship
    1. snapmirror create -source-path 7MODE:<src_volume> -destination-path VSERVER:<volume created in step 3> -type TDP
  10. Initialize the SnapMirror
    1. snapmirror initialize -destination-path VSERVER:<volume>
  11. View progress
    1. snapmirror show -destination-path VSERVER:<volume>
  12. Status should be transferring. It took a few minutes for it to actually start transferring data for me.
  13. If it failed, log onto the 7MODE node and try to initialize the SnapMirror again from the cDOT system; the 7MODE console might give you a clue as to what is wrong. You might have to go mess with /etc/snapmirror.allow or /etc/hosts.
  14. The volume should now be SnapMirrored
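
To make steps 7 and 8 concrete, here is roughly what the two files on my 7MODE system ended up looking like. The fas8040-01 name and the nfs_volume_01 entry are from the example above; the 10.0.0.50 address is made up and stands in for your intercluster LIF IP:

  /etc/hosts (on 7MODE):
    10.0.0.50   fas8040-01

  /etc/snapmirror.allow (on 7MODE):
    fas8040-01:nfs_volume_01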

Thursday, February 19, 2015

Red Hat Satellite - 5.7 - Kickstart hangs on running post installation scripts

After I upgraded from Satellite 5.6 to 5.7, I was unable to kickstart any new servers using my existing kickstart scripts.  The kickstart would start, do all my partitioning and package installation and then hang on "Running post installation scripts."  I let it sit for over 30 minutes, no go.

The server had not registered with my Satellite server yet, so I knew it had not processed the entire kickstart file. I started making a new kickstart profile to see if something hadn't converted well in the upgrade. When I got to the scripts part, I saw a new post script listed called "Registration and server actions".

I could not do anything to the script besides reorder it. I reordered the scripts so that this new script was at the bottom, updated the profile, and tried the kickstart again. This time, it worked.

TLDR: Put the "Registration and server actions" post script on the bottom of your kickstart script listing.

Monday, February 2, 2015

Red Hat Satellite 5 - Ran out of disk after 5.7 update

I'm still using Satellite 5 in production because Satellite 6 is just not ready for prime time yet. At least, not in my opinion.  The recent Satellite 5.7 update changed the entire interface to look a lot more modern and included a bunch of bug fixes.

The 5.6 to 5.7 update went perfectly for me; nothing broke and I did not lose any functionality.  However, today my Satellite 5 server stopped working: I could not get to the web interface and client servers could not install packages. I signed into the server and saw that the / partition was full.

Some searching turned up that one of the 5.7 changes moved the location where Postgres keeps its data. The upgrade process moved all of the data out of the old location (where I had a separate mount point) and into a new location under /opt, where I don't have enough disk.

  • Old location in 5.6
    • /var/lib/pgsql
  • New location in 5.7
    • /opt/rh/postgresql92/root/var/lib/pgsql
I made sure Satellite was fully stopped
  • rhn-satellite stop
Then I copied the data directory from the new location to the old location. After that, I unmounted /var/lib/pgsql, changed the mount point in fstab, and remounted the partition at the new location.  Then I started Satellite back up
  • rhn-satellite start
Once I verified that Satellite was back and happy, I stopped Satellite again, unmounted the partition, and deleted everything in the underlying data directory so my / partition would get some space back. Then I remounted pgsql and started Satellite. Everything is happy (the whole sequence is sketched below).
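
Here is a condensed sketch of that whole sequence. The device name /dev/mapper/vg00-pgsql is made up; substitute whatever your dedicated pgsql partition actually is, and double-check paths before running anything like the rm:

  rhn-satellite stop
  # copy the relocated data onto the dedicated partition, which is still mounted at the old path
  cp -a /opt/rh/postgresql92/root/var/lib/pgsql/. /var/lib/pgsql/
  umount /var/lib/pgsql
  # edit /etc/fstab so the partition mounts at the new path, e.g.:
  #   /dev/mapper/vg00-pgsql  /opt/rh/postgresql92/root/var/lib/pgsql  ext4  defaults  1 2
  mount /opt/rh/postgresql92/root/var/lib/pgsql
  rhn-satellite start

  # once Satellite is confirmed healthy, reclaim the space on /
  rhn-satellite stop
  umount /opt/rh/postgresql92/root/var/lib/pgsql
  rm -rf /opt/rh/postgresql92/root/var/lib/pgsql/*
  mount /opt/rh/postgresql92/root/var/lib/pgsql
  rhn-satellite start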

Friday, January 30, 2015

Red Hat Satellite 6.0 - Error 400 on Server

I believe that I have run into a bug with Red Hat Satellite 6 and Puppet.  None of my Puppet agents are able to retrieve the catalog from the Puppet master anymore.

[root@server ~]# puppet agent -tv
Info: Retrieving plugin
Info: Loading facts in /var/lib/puppet/lib/facter/pe_version.rb
Info: Loading facts in /var/lib/puppet/lib/facter/facter_dot_d.rb
Info: Loading facts in /var/lib/puppet/lib/facter/root_home.rb
Info: Loading facts in /var/lib/puppet/lib/facter/puppet_vardir.rb
Error: Could not retrieve catalog from remote server: Error 400 on SERVER: wrong number of arguments (2 for 1)
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run

After much digging around, I noticed that Puppet on my Satellite server was at a much newer version (3.7.x) than my clients (3.6.x). I think what happened is that when I was messing around with building Puppet modules, a tutorial had me create a Gemfile, which installed the latest Puppet through the gem command.

To resolve the issue, I did:

  • gem uninstall puppet
  • yum reinstall puppet
  • log out and back in
Now if I run "puppet help" it reports 3.6.2, and I am able to run puppet agent -tv on a client server without error as well (a couple of quick checks are sketched below).
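
A few quick checks to confirm the gem copy is gone and the RPM-packaged puppet is back in charge (3.6.2 is just what my Satellite ships; your version may differ):

  gem list puppet     # should no longer show a puppet gem
  rpm -q puppet       # shows the RPM-packaged version that is installed
  puppet --version    # should now match the RPM version (3.6.2 for me)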

Monday, January 26, 2015

Red Hat Satellite 6 - Puppet, What the Hell do I do Now?

One of my major issues with Satellite 6 has been centralized configuration management.  In fact, not having configuration management working correctly is the major reason why my production systems are still using 5.7.

All the documentation I have found talks about what puppet is, where to get modules and how to get the modules into a content view. That's great.  After I get a module added to a content view, what then?  How do I actually make it do something?

After much Internet research, I have finally made the ntp puppet module actually configure something on a remote server through Satellite 6.  Here is what I did:

This document assumes you have already downloaded the ntp puppet module from the puppet forge (https://forge.puppetlabs.com/puppetlabs/ntp), uploaded it to a custom repository and published to a content view the server is subscribed to. Pretty much everything the official documentation has you do, besides actually doing something with it.

Configuring the ntpd pool server using the puppet module in Satellite 6.

  1. Sign into Satellite
  2. Click on Configure > Puppet classes
  3. Click on the ntp class name. (If there is nothing on this screen, you didn't do something correctly)
  4. Click on the Smart Class Parameter tab
  5. This screen shows all the available parameters which can be set for this puppet module.
  6. I want to make sure ntp is installed, is running and has my pool servers.
  7. Scroll down to "package ensure" and click on it.
  8. Check the box next to "Override." Set "Parameter type" to string. In "Default Value", type latest
  9. Scroll down to "servers" and click on it.
  10. Click the "Override" check box
  11. Set "Paramater type" to "array"
  12. In "Default value" type in a comma separated list of your ntp pool servers as follows:
    1. ["ntp1.example.com","ntp2.example.com"]
  13. Scroll to "service enable" and click on it
  14. Check the "Override" box
  15. Change "Parameter type" to "boolean"
  16. Type "true" into default value
  17. Scroll all the way down to the bottom of the page and press submit.
  18. Browse to Hosts > All Hosts
  19. Click on the server
  20. Click on Edit
  21. Click on Puppet Classes
  22. Expand the ntp class and click on ntp; ntp should show up under the "Included classes" area. Do not select any of the other things; I'm still looking into what those do.
  23. Click submit
  24. Click on the YAML button; this will show you exactly what is being passed to the puppet agent. You should see your additional configuration settings listed (roughly like the sketch after this list).
  25. Click back
  26. Go onto the console of the server and run "puppet agent -tv" as root.  If everything is correct, your /etc/ntp.conf file should now have your servers in there. 
  27. I uninstalled the ntpd package and ran puppet again; it installed the package, started the service, and set up my configuration file.
  28. Success!
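
For reference, once the overrides above are in place, the YAML for the host should contain something along these lines for the ntp class. This is only a rough sketch of the relevant portion (Satellite/Foreman adds more around it), using the parameter values set earlier:

  ---
  classes:
    ntp:
      package_ensure: latest
      servers:
        - ntp1.example.com
        - ntp2.example.com
      service_enable: true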