Wednesday, July 23, 2008

Change SHMMAX without rebooting

The shmmax parameter sets the maximum size of a single shared memory segment (and the Oracle SGA is built out of these shared memory segments).

New to the Solaris 8 release is the modular debugger, mdb(1), which is unique among
available Solaris debuggers because it is easily extensible. Mdb(1) also includes a number of desirable usability features, including command-line editing, command history, a built-in output pager, syntax checking, and command pipelining. It is the recommended post-mortem debugger for the kernel.

To change the value of the integer variable shminfo_shmmax without rebooting the server, do the following. (The transcript below uses the example values 81920 and 102400; the same procedure works for real-world sizes such as 8 GB or 10 GB.)

# cp /etc/system /etc/system_old

# grep shminfo_shmmax /etc/system
set shmsys:shminfo_shmmax=81920
# mdb -k
Loading modules: [ unix krtld genunix ip usba s1394 ipc nfs ptm logindmux random ]
> shminfo_shmmax /D
shminfo_shmmax:
shminfo_shmmax: 1
> shminfo_shmmax /E
shminfo_shmmax:
shminfo_shmmax: 81920
> $q

As we can see, "shminfo_shmmax" holds a 64-bit value (/D prints only 4 bytes, while /E prints all 8), so let's start changing the value:

# mdb -kw
Loading modules: [ unix krtld genunix ip usba s1394 ipc nfs ptm logindmux random ]
> shminfo_shmmax /Z 0t102400
shminfo_shmmax: 0x5f5e10000 = 0x19000
> shminfo_shmmax /E
shminfo_shmmax:
shminfo_shmmax: 102400
> $q

After the change succeeds, set the parameter "shminfo_shmmax" in /etc/system to the same value you set in mdb, so that it persists across reboots:

# vi /etc/system
set shmsys:shminfo_shmmax=102400
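
To double-check the limit the running kernel is using, sysdef reports the System V IPC tunables on most Solaris releases (an optional sanity check; the output format varies by release):

# sysdef | grep -i shmmax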


Taken from: http://sysinfo.bascomp.org/2008/02/21/change-shmmax-without-rebooting/

Thursday, June 19, 2008

How to submit dump/snap file to IBM

Open a case with IBM.
Log in as root.

At the command line, enter:
# sysdumpdev -L

Look at the dump size and then run # df -Im to find a filesystem with enough free space for packaging. These directions assume /tmp has enough space.

# snap -gfkGLDN
# cd /tmp/ibmsupt/dump
# ls

Ensure that unix.Z, dump.snap, and dump.Z (or dump.BZ) are present.
# cd /tmp/ibmsupt
# snap -c

This will create a snap.pax.Z file in the /tmp/ibmsupt directory.

The snap file will need to be renamed to pmr#.branch#.countrycode.snap.pax.Z (US=000)
# mv snap.pax.Z <pmr#>.<branch#>.<countrycode>.snap.pax.Z
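
For example, with a hypothetical PMR number 12345, branch 678, and the US country code:

# mv snap.pax.Z 12345.678.000.snap.pax.Z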

After the snap files have been renamed and you have a PMR number, ftp it to IBM:
ftp testcase.software.ibm.com
login: anonymous
password:
ftp> cd /toibm/aix
ftp> bin
ftp> put <pmr#>.<branch#>.<countrycode>.snap.pax.Z
ftp> quit

How to replace mirrored hard disk in Sun Solaris 8 Server

Failed hard disk: c1t1d0

If the hard disk is hot-swappable, it can be replaced without server downtime; otherwise, the server must be shut down at the replacement step. In either case, break the mirrors on the failing disk before replacing it.

# metastat -p
d10 -m d11 d12 1
d11 1 1 c1t0d0s0
d12 1 1 c1t1d0s0
d23 -m d22 d21 1
d22 1 1 c1t1d0s1
d21 1 1 c1t0d0s1
d30 -m d31 d32 1
d31 1 1 c3t8d0s0
d32 1 1 c4t8d0s0
d40 -m d41 d42 1
d41 1 1 c3t8d0s1
d42 1 1 c4t8d0s1
d50 -m d51 d52 1
d51 1 1 c1t0d0s4
d52 1 1 c1t1d0s4
d60 -m d61 d62 1
d61 1 1 c1t0d0s3
d62 1 1 c1t1d0s3
d90 -m d91 d92 1
d91 1 1 c3t8d0s5
d92 1 1 c4t8d0s5
d100 -m d101 d102 1
d101 1 2 c3t9d0s0 c3t11d0s0 -i 32b
d102 1 2 c4t9d0s0 c4t11d0s0 -i 32b
d110 -m d111 d112 1
d111 1 1 c3t12d0s0
d112 1 1 c4t12d0s0


# metadetach -f d50 d52
d50: submirror d52 is detached


# metadetach -f d60 d62
d60: submirror d62 is detached


# metadetach -f d23 d22
d23: submirror d22 is detached


# metadetach -f d10 d12
d10: submirror d12 is detached


# metastat -p
d10 -m d11 1
d11 1 1 c1t0d0s0
d23 -m d21 1
d21 1 1 c1t0d0s1
d30 -m d31 d32 1
d31 1 1 c3t8d0s0
d32 1 1 c4t8d0s0
d40 -m d41 d42 1
d41 1 1 c3t8d0s1
d42 1 1 c4t8d0s1
d50 -m d51 1
d51 1 1 c1t0d0s4
d60 -m d61 1
d61 1 1 c1t0d0s3
d90 -m d91 d92 1
d91 1 1 c3t8d0s5
d92 1 1 c4t8d0s5
d100 -m d101 d102 1
d101 1 2 c3t9d0s0 c3t11d0s0 -i 32b
d102 1 2 c4t9d0s0 c4t11d0s0 -i 32b
d110 -m d111 d112 1
d111 1 1 c3t12d0s0
d112 1 1 c4t12d0s0
d12 1 1 c1t1d0s0
d22 1 1 c1t1d0s1
d52 1 1 c1t1d0s4
d62 1 1 c1t1d0s3


# metaclear d12
d12: Concat/Stripe is cleared


# metaclear d22
d22: Concat/Stripe is cleared


# metaclear d52
d52: Concat/Stripe is cleared


# metaclear d62
d62: Concat/Stripe is cleared


# metadb -i
flags first blk block count
a m p luo 16 1034 /dev/dsk/c1t0d0s7
a p luo 1050 1034 /dev/dsk/c1t0d0s7
a p luo 2084 1034 /dev/dsk/c1t0d0s7
W p l 16 1034 /dev/dsk/c1t1d0s7
W p l 1050 1034 /dev/dsk/c1t1d0s7
W p l 2084 1034 /dev/dsk/c1t1d0s7

a p luo 16 1034 /dev/dsk/c3t9d0s7
a p luo 1050 1034 /dev/dsk/c3t9d0s7
a p luo 2084 1034 /dev/dsk/c3t9d0s7
a p luo 16 1034 /dev/dsk/c3t11d0s7
a p luo 1050 1034 /dev/dsk/c3t11d0s7
a p luo 2084 1034 /dev/dsk/c3t11d0s7
a p luo 16 1034 /dev/dsk/c3t12d0s7
a p luo 1050 1034 /dev/dsk/c3t12d0s7
a p luo 2084 1034 /dev/dsk/c3t12d0s7
a p luo 16 1034 /dev/dsk/c4t8d0s7
a p luo 1050 1034 /dev/dsk/c4t8d0s7
a p luo 2084 1034 /dev/dsk/c4t8d0s7
a p luo 16 1034 /dev/dsk/c4t9d0s7
a p luo 1050 1034 /dev/dsk/c4t9d0s7
a p luo 2084 1034 /dev/dsk/c4t9d0s7
a p luo 16 1034 /dev/dsk/c4t11d0s7
a p luo 1050 1034 /dev/dsk/c4t11d0s7
a p luo 2084 1034 /dev/dsk/c4t11d0s7
a p luo 16 1034 /dev/dsk/c4t12d0s7
a p luo 1050 1034 /dev/dsk/c4t12d0s7
a p luo 2084 1034 /dev/dsk/c4t12d0s7
o - replica active prior to last mddb configuration change
u - replica is up to date
l - locator for this replica was read successfully
c - replica's location was in /etc/lvm/mddb.cf
p - replica's location was patched in kernel
m - replica is master, this is replica selected as input
W - replica has device write errors
a - replica is active, commits are occurring to this replica
M - replica had problem with master blocks
D - replica had problem with data blocks
F - replica had format problems
S - replica is too small to hold current data base
R - replica had device read errors


# metadb -d c1t1d0s7


# metadb -i
flags first blk block count
a m p luo 16 1034 /dev/dsk/c1t0d0s7
a p luo 1050 1034 /dev/dsk/c1t0d0s7
a p luo 2084 1034 /dev/dsk/c1t0d0s7
a p luo 16 1034 /dev/dsk/c3t9d0s7
a p luo 1050 1034 /dev/dsk/c3t9d0s7
a p luo 2084 1034 /dev/dsk/c3t9d0s7
a p luo 16 1034 /dev/dsk/c3t11d0s7
a p luo 1050 1034 /dev/dsk/c3t11d0s7
a p luo 2084 1034 /dev/dsk/c3t11d0s7
a p luo 16 1034 /dev/dsk/c3t12d0s7
a p luo 1050 1034 /dev/dsk/c3t12d0s7
a p luo 2084 1034 /dev/dsk/c3t12d0s7
a p luo 16 1034 /dev/dsk/c4t8d0s7
a p luo 1050 1034 /dev/dsk/c4t8d0s7
a p luo 2084 1034 /dev/dsk/c4t8d0s7
a p luo 16 1034 /dev/dsk/c4t9d0s7
a p luo 1050 1034 /dev/dsk/c4t9d0s7
a p luo 2084 1034 /dev/dsk/c4t9d0s7
a p luo 16 1034 /dev/dsk/c4t11d0s7
a p luo 1050 1034 /dev/dsk/c4t11d0s7
a p luo 2084 1034 /dev/dsk/c4t11d0s7
a p luo 16 1034 /dev/dsk/c4t12d0s7
a p luo 1050 1034 /dev/dsk/c4t12d0s7
a p luo 2084 1034 /dev/dsk/c4t12d0s7
o - replica active prior to last mddb configuration change
u - replica is up to date
l - locator for this replica was read successfully
c - replica's location was in /etc/lvm/mddb.cf
p - replica's location was patched in kernel
m - replica is master, this is replica selected as input
W - replica has device write errors
a - replica is active, commits are occurring to this replica
M - replica had problem with master blocks
D - replica had problem with data blocks
F - replica had format problems
S - replica is too small to hold current data base
R - replica had device read errors


# cfgadm -al
Ap_Id Type Receptacle Occupant Condition
c0 scsi-bus connected configured unknown
c0::dsk/c0t0d0 CD-ROM connected configured unknown
c1 scsi-bus connected configured unknown
c1::dsk/c1t0d0 disk connected configured unknown
c1::dsk/c1t1d0 disk connected configured unknown
c1::dsk/c1t2d0 disk connected configured unknown
c1::dsk/c1t3d0 disk connected configured unknown
c2 scsi-bus connected unconfigured unknown
c3 scsi-bus connected configured unknown
c3::dsk/c3t11d0 disk connected configured unknown
c3::dsk/c3t12d0 disk connected configured unknown
c3::dsk/c3t8d0 disk connected configured unknown
c3::dsk/c3t9d0 disk connected configured unknown
c3::es/ses0 processor connected configured unknown
c4 scsi-bus connected configured unknown
c4::dsk/c4t11d0 disk connected configured unknown
c4::dsk/c4t12d0 disk connected configured unknown
c4::dsk/c4t8d0 disk connected configured unknown
c4::dsk/c4t9d0 disk connected configured unknown


# cfgadm -c unconfigure c1::dsk/c1t1d0


# cfgadm -al
Ap_Id Type Receptacle Occupant Condition
c0 scsi-bus connected configured unknown
c0::dsk/c0t0d0 CD-ROM connected configured unknown
c1 scsi-bus connected configured unknown
c1::dsk/c1t0d0 disk connected configured unknown
c1::dsk/c1t1d0 unavailable connected unconfigured unknown
c1::dsk/c1t2d0 disk connected configured unknown
c1::dsk/c1t3d0 disk connected configured unknown
c2 scsi-bus connected unconfigured unknown
c3 scsi-bus connected configured unknown
c3::dsk/c3t11d0 disk connected configured unknown
c3::dsk/c3t12d0 disk connected configured unknown
c3::dsk/c3t8d0 disk connected configured unknown
c3::dsk/c3t9d0 disk connected configured unknown
c3::es/ses0 processor connected configured unknown
c4 scsi-bus connected configured unknown
c4::dsk/c4t11d0 disk connected configured unknown
c4::dsk/c4t12d0 disk connected configured unknown
c4::dsk/c4t8d0 disk connected configured unknown
c4::dsk/c4t9d0 disk connected configured unknown


Shut down the server and replace the hard disk at this point if the server does not support hot-swapping.


# devfsadm

# cfgadm -c configure c1::dsk/c1t1d0
or
# cfgadm -x replace_device c1::sd1
Replacing SCSI device: /devices/pci@1c,600000/scsi@2/sd@1,0
This operation will suspend activity on SCSI bus: c1
Continue (yes/no)? yes
SCSI bus quiesced successfully.
It is now safe to proceed with hotplug operation.
Enter y if operation is complete or n to abort (yes/no)? y


# cfgadm -al
Ap_Id Type Receptacle Occupant Condition
c0 scsi-bus connected configured unknown
c0::dsk/c0t0d0 CD-ROM connected configured unknown
c1 scsi-bus connected configured unknown
c1::dsk/c1t0d0 disk connected configured unknown
c1::dsk/c1t1d0 disk connected configured unknown
c1::dsk/c1t2d0 disk connected configured unknown
c1::dsk/c1t3d0 disk connected configured unknown
c2 scsi-bus connected unconfigured unknown
c3 scsi-bus connected configured unknown
c3::dsk/c3t11d0 disk connected configured unknown
c3::dsk/c3t12d0 disk connected configured unknown
c3::dsk/c3t8d0 disk connected configured unknown
c3::dsk/c3t9d0 disk connected configured unknown
c3::es/ses0 processor connected configured unknown
c4 scsi-bus connected configured unknown
c4::dsk/c4t11d0 disk connected configured unknown
c4::dsk/c4t12d0 disk connected configured unknown
c4::dsk/c4t8d0 disk connected configured unknown
c4::dsk/c4t9d0 disk connected configured unknown


# format
Searching for disks...done


AVAILABLE DISK SELECTIONS:
0. c1t0d0
/pci@1c,600000/scsi@2/sd@0,0
1. c1t1d0
/pci@1c,600000/scsi@2/sd@1,0
2. c1t2d0
/pci@1c,600000/scsi@2/sd@2,0
3. c1t3d0
/pci@1c,600000/scsi@2/sd@3,0
4. c3t8d0
/pci@1d,700000/pci@1/scsi@4/sd@8,0
5. c3t9d0
/pci@1d,700000/pci@1/scsi@4/sd@9,0
6. c3t11d0
/pci@1d,700000/pci@1/scsi@4/sd@b,0
7. c3t12d0
/pci@1d,700000/pci@1/scsi@4/sd@c,0
8. c4t8d0
/pci@1d,700000/pci@1/scsi@5/sd@8,0
9. c4t9d0
/pci@1d,700000/pci@1/scsi@5/sd@9,0
10. c4t11d0
/pci@1d,700000/pci@1/scsi@5/sd@b,0
11. c4t12d0
/pci@1d,700000/pci@1/scsi@5/sd@c,0
Specify disk (enter its number): ^D


# prtvtoc /dev/rdsk/c1t0d0s2 | fmthard -s - /dev/rdsk/c1t1d0s2
fmthard: New volume table of contents now in place.
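
Optionally, print the new disk's VTOC to confirm the partition table was copied:

# prtvtoc /dev/rdsk/c1t1d0s2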

# metadb -a c1t1d0s7

# metainit d12 1 1 c1t1d0s0
d12: Concat/Stripe is setup

# metainit d22 1 1 c1t1d0s1
d22: Concat/Stripe is setup

# metainit d52 1 1 c1t1d0s4
d52: Concat/Stripe is setup

# metainit d62 1 1 c1t1d0s3
d62: Concat/Stripe is setup

# metattach d60 d62
d60: submirror d62 is attached

# metattach d50 d52
d50: submirror d52 is attached

# metattach d23 d22
d23: submirror d22 is attached

# metattach d10 d12
d10: submirror d12 is attached

# metastat -p
d10 -m d11 d12 1
d11 1 1 c1t0d0s0
d12 1 1 c1t1d0s0
d23 -m d22 d21 1
d22 1 1 c1t1d0s1
d21 1 1 c1t0d0s1
d30 -m d31 d32 1
d31 1 1 c3t8d0s0
d32 1 1 c4t8d0s0
d40 -m d41 d42 1
d41 1 1 c3t8d0s1
d42 1 1 c4t8d0s1
d50 -m d51 d52 1
d51 1 1 c1t0d0s4
d52 1 1 c1t1d0s4
d60 -m d61 d62 1
d61 1 1 c1t0d0s3
d62 1 1 c1t1d0s3
d90 -m d91 d92 1
d91 1 1 c3t8d0s5
d92 1 1 c4t8d0s5
d100 -m d101 d102 1
d101 1 2 c3t9d0s0 c3t11d0s0 -i 32b
d102 1 2 c4t9d0s0 c4t11d0s0 -i 32b
d110 -m d111 d112 1
d111 1 1 c3t12d0s0
d112 1 1 c4t12d0s0

Check metastat for the mirror resync status.

# metastat
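
The full metastat output is long; to watch only the resync progress of the rebuilt mirrors, a filter along these lines (illustrative) helps:

# metastat d10 d23 d50 d60 | grep -i "resync"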

Friday, May 16, 2008

Setting Up a Solaris DHCP Client

Introduction

One of the problems that can arise when trying to use a Solaris box as a DHCP client is that by default, the server is expected to supply a hostname, in addition to all the other stuff (like IP address, DNS servers, etc.). Most cable modems and home routers don't supply a (usable) hostname, so it gets set to "unknown". This page describes how to get around that. (Where this page says "cable modem", "DSL modem" can be substituted.)

This page assumes that le0 is the interface you're using for your DHCP connection. Substitute hme0 or whatever interface you're actually using in the examples below.

Setting up DHCP

There are two ways of using DHCP:

  • DHCP has limited control
  • DHCP has full control

The first case may be where you want to use your own /etc/resolv.conf and so on, with a minimum of hassle.

The second case would be the normal situation, especially if your cable modem provider has a habit of changing DNS name server IP addresses on you (like mine does!), so I'll concentrate on that here. I have a script to automate the first method, should you want to use it. You'll need to change the DEFAULT_ADDR and INTERFACE variables as required.

The first thing to do is to create an empty /etc/hostname.le0, like this:

> /etc/hostname.le0

Creating this file ensures that the interface gets plumbed, ready for the DHCP software to do its stuff.

Next, you create /etc/dhcp.le0. This file can be empty if you want to accept the defaults, but may also contain one or both of these directives:

  • wait time, and
  • primary

By default, ifconfig will wait 30 seconds for the DHCP server to respond (after which time, the boot will continue, while the interface gets configured in the background). Specifying the wait directive tells ifconfig not to return until the DHCP server has responded. time can be set to the special value of forever, with obvious meaning. I use a time value of 300, which seems to be long enough for my cable provider.

The primary directive indicates to ifconfig that the current interface is the primary one, if you have more than one interface under DHCP control. If you only have one interface under DHCP control, then it is automatically the primary one, so primary is redundant (although it's permissible).
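
Putting the two directives together, a sample /etc/dhcp.le0 matching the settings described above would look like this (illustrative; both lines are optional):

wait 300
primary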

With these files in place, subsequent reboots will place le0 under DHCP control: you're ready to go!

Unknown hostname

Actually, there's one snag: most (if not all) cable modem DHCP servers don't provide you with a hostname (even if they did, odds are it won't be one you want anyway!). This wouldn't be a problem, except that the boot scripts (/etc/init.d/rootusr in particular) try to be clever, and set your hostname to "unknown" in this case, which is not at all useful!

The trick is to change your hostname back to the right one, preferably without changing any of the supplied start-up scripts, which are liable to be stomped on when you upgrade or install a patch. You've also got to do it early enough in the boot process that rpcbind, sendmail, and friends don't get confused by using the wrong hostname. To solve this problem, put a little script like the one sketched below into /etc/init.d/set_hostname, with a symbolic link to it from /etc/rc2.d/S70set_hostname.
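
The script itself is linked from the original page; here is a minimal sketch of such a script, assuming the desired hostname is simply hardcoded (replace myhost with your real hostname):

#!/sbin/sh
# /etc/init.d/set_hostname: reset the hostname that DHCP left as "unknown"
HOSTNAME=myhost        # the hostname you actually want
uname -S $HOSTNAME     # set the system nodename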

Starting with Solaris 10, the preceding paragraph can be ignored. Instead, just make sure that the hostname you want to use is in /etc/nodename; the contents of that file will then be used to set the hostname. (Note that it is essential that the hostname you put into /etc/nodename is terminated with a carriage return. Breakage will happen if this is not the case.) Also, from Solaris 8 it is possible to tell the DHCP software not to request a hostname from the DHCP server. To do this, remove the token 12 from the PARAM_REQUEST_LIST line in /etc/default/dhcpagent. (/etc/default/dhcpagent describes what the default tokens are; 12 is the hostname, 3 is the default router, 6 is the DNS server, and so on.)
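
For instance, the stock PARAM_REQUEST_LIST line looks something like the following (the exact token list may vary by release); deleting the 12 stops the hostname request:

PARAM_REQUEST_LIST=1,3,6,12,15,28,43
becomes
PARAM_REQUEST_LIST=1,3,6,15,28,43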

With these modifications in place, reboot, and you'll be using your cable modem in no time!


Taken from: http://www.rite-group.com/rich/solaris_dhcp.html

Wednesday, May 14, 2008

Quick Tips to Find Files on Linux File System

One of the first hurdles that every Linux newbie working on the Command Line Interface (CLI) bumps into is finding files on the file system. Administrators who switch from a Windows environment are so used to the click-and-find mentality that discovering files via the Linux CLI is painful for them. This tutorial is written for those friends who work on Linux and don't have the luxury of a Graphical User Interface (GUI).

I started playing with Linux during my internship, working with Snort (Intrusion Detection System), Nessus (Vulnerability Scanner), and IPTables (Firewall). Like most programs, these tools have quite a few configuration files. Initially, it was difficult for me to remember the path to each file, so I started to use the power of the 'find' and 'locate' commands, which I will share with you in this tutorial.

Method 1: LOCATE
Before we start playing around with the LOCATE command, it's important to learn about "updatedb". Every day, your system automatically runs the updatedb command via cron to create or update a database that keeps a record of all filenames. The locate command then searches through this database to find files.

This database is stored by default at /var/lib/mlocate/mlocate.db. Naturally, we are curious about what this database looks like, so first I run ls -lh to find the size of this file.
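
The command itself is just:

# ls -lh /var/lib/mlocate/mlocate.db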

Since this is in db format, I doubt we would see anything legible with a "cat" command. So instead I used the strings command, which threw a lot of file names onto the screen (132516 to be exact). Hence, I used grep to see only the filenames that contain lighttpd, a web server installed on my system.
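
Reconstructing that pipeline (the file-name count will obviously differ on your system):

# strings /var/lib/mlocate/mlocate.db | wc -l
132516
# strings /var/lib/mlocate/mlocate.db | grep lighttpd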


But of course this is not the right way to do searches; we did this just to see what updatedb is doing. Now let's get back to "locate". Remember that locate reads the database created by updatedb, so your results are only as fresh as the last run of the updatedb command. You can always run updatedb manually from the CLI and then use the locate command.

Let's start exercising this command. I begin by looking for the PDF documentation files for "snort". If I just type in "locate snort", it gives me 1179 file names as the result.

[root@localhost:~] locate snort | less
/etc/snort
/etc/snort/rules
/etc/snort/rules/VRT-License.txt
/etc/snort/rules/attack-responses.rules
/etc/snort/rules/backdoor.rules
/etc/snort/rules/bad-traffic.rules
/etc/snort/rules/cgi-bin.list
/etc/snort/rules/chat.rules
/etc/snort/rules/classification.config
/etc/snort/rules/ddos.rules
/etc/snort/rules/deleted.rules
....

But I want the documentation files, which I already know are in PDF format. So now I will use the power of regular expressions to further narrow down my results.

The "-r" option is used to tell the "locate" command to expect a regular expression. In the above case, I use pdf$ in the regex to show only files whose names end with pdf.
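
Reconstructed, the search looks something like this (the exact regex is illustrative):

[root@localhost:~] locate -r 'snort.*\.pdf$'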

Remember that updatedb excludes temporary folders, so locate may not give you the results you expect. To remove these limitations, there is the command "find".

Method 2: Find
The find command is the most useful of all the commands I have used in my few years of managing Linux machines. Still, this command is not fully understood and utilized by many administrators. Unlike the "locate" command, the "find" command actually traverses the file system and looks for the pattern you define while running the command.

The most common usage of the "find" command is to search for a file with a specific file name.
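
For example (snort.conf is just an illustrative target):

[root@localhost:~] find /etc -name snort.conf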

Like "-name", the find command has other qualifiers, based on time, as shown below. These are also very helpful if you are doing forensic analysis on your Linux machine.

-iname = same as -name, but case insensitive
-atime n = true if the file was accessed n days ago
-amin n = true if the file was accessed n minutes ago
-mtime n = true if the file's contents were changed n days ago
-mmin n = true if the file's contents were changed n minutes ago
-ctime n = true if the file's attributes were changed n days ago
-cmin n = true if the file's attributes were changed n minutes ago

To help the reader understand these qualifiers, I created a file named "foobar.txt" four minutes back and then ran "find /root -mmin -5" to show all files in the /root folder whose last modification time is less than 5 minutes ago, and it shows me the foobar.txt file. However, if I change the value of -mmin to less than 2 minutes, it shows me nothing.
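
A reconstruction of that experiment:

[root@localhost:~] touch /root/foobar.txt
(wait four minutes)
[root@localhost:~] find /root -mmin -5
/root/foobar.txt
[root@localhost:~] find /root -mmin -2
[root@localhost:~]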


There is another very useful qualifier, -size, which searches by file size.
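
For example, to find files larger than 100 MB (GNU find understands the M suffix):

[root@localhost:~] find / -size +100M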

Some other qualifiers that I always use while administering Linux servers are:

-regex expression = select files whose path matches the regular expression
-iregex expression = same as above, but case insensitive
-empty = select files and directories which are empty
-type filetype = select files by file type
-user username = select files owned by the given user
-group groupname = select files owned by the given group

There are a few more qualifiers, but I leave those as homework for you: read the manpage and enhance your knowledge.

NOTE: One thing you will notice is that "locate" runs super fast; that's because it reads from a database file rather than actually traversing the file system.

This was a very short and crisp introduction to the find and locate commands, but they are among the most important commands for any administrator. Once you get used to them, you will wish there were something similar and as powerful in Windows.

Taken from: http://www.secguru.com/article/quick_tips_find_files_linux_file_system

Configuring sar for your system

Before you can tune a system properly, you must decide which system characteristics are important, and which ones are less so. Once you decide your priorities, you then need to find a way
to measure the system performance according to those priorities. In fact, the system activity reporter programs are a good measuring tool for many aspects of system performance. In this article, we'll introduce you to the sar utility, which can give you detailed performance information about your system.


What does sar measure?

Since system tuning involves the art of finding acceptable compromises, you need the ability to see the impact of your changes on multiple subsystems. The system activity reporter (sar) programs collect system-performance information in distinct groups. Table A shows how sar groups the performance information. The first column shows the switch you give to sar in order to request that particular information group, and the second column briefly describes the information group.

Table A

Switch Performance Monitoring Group
A All monitoring groups
a File access statistics
b Buffer activity
c System call activity
d Block device activity
g Paging out activity
k Kernel memory allocation
m Message and semaphores
p Paging in activity
q CPU Run queue statistics
r Unused memory and disk pages
u CPU usage statistics (default)
v Report status of system tables
w System swapping and switching
y TTY device activity



One way you can run sar is to specify a sampling interval and the number of times you want it to run. So, if you want to check the file-access statistics every 20 seconds for the next five minutes, you'd run sar like this (the first line is the command; the rest is the command's output):


$ sar -a 20 15

SunOS Devo 5.5.1 Generic_103641-08 i86pc    11/05/97
01:06:02  iget/s namei/s dirbk/s
01:06:22     270     397     278
01:06:42     602     785     685
01:07:02     194     238     215

Configuring sar to collect data

Notice that you can't just run sar right now. If you try to run the sar command without first configuring
it, it gives you an error message like this:

$ sar -a 20 15
sar: can't open /var/adm/sa/sa03
No such file or directory

Sure enough, if you look at the /var/adm/sa directory, you won't see any files in it, much less that
sa03 file it's complaining about. If you create a blank file, using touch, for example, sar will start to work. However, why must you do something so strange to make sar work? And if you try to run sar tomorrow, you'll get a similar error, but this time it will complain about a different file, such as sa04.


It turns out that the sar program is only one part of the performance monitoring package. Three commands in the /usr/lib/sa directory also contribute to the whole. The sadc command collects system data and stores it to a binary file, suitable for sar to use. The shell script sa1 is a wrapper for sadc, suitable for use in cron jobs, so it can be run automatically. The sa2 script is a wrapper for sar that forces it to print a report in ASCII format from the binary information in the files sadc creates.
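
If you want today's data file to exist immediately, without waiting for cron, you can prime it by hand with sadc; a one-off sketch that writes a single sample to today's file:

# /usr/lib/sa/sadc 1 1 /var/adm/sa/sa`date +%d`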


If you run the sa1 script as intended, it creates a binary file containing all the performance statistics for the day. This file allows sar to read the data and report on it without forcing you to wait and collect it. Since you may want to investigate the data a bit later, or compare one day's worth of information against another, the sar, sa1, and sa2 programs name the data file using the
same format: /var/adm/sa/saX, where X is the day number. Therefore, when you run sar, one of the first things it does is look for today's binary file. When it doesn't find the file, it prints the error.


The best way to run sa1 and sa2 is from a cron job. Sun provides an example of how to create the cron job instead of forcing you to figure it out for yourself. Thus, if you edit the crontab for the account sys, you'll see commented-out sample cron schedules for sa1 and sa2, as shown in Figure A.


Figure A: The sys account already has prototype entries for running sa1 and sa2, which you can uncomment and use.


#ident  "@(#)sys        1.5     92/07/14 SMI"   /* SVr4.0 1.2   */
#
# The sys crontab should be used to do performance collection. See cron
# and performance manual pages for details on startup.
#
#0 * * * 0-6 /usr/lib/sa/sa1
#20,40 8-17 * * 1-5 /usr/lib/sa/sa1
#5 18 * * 1-5 /usr/lib/sa/sa2 -s 8:00 -e 18:01 -i 1200 -A

The first cron schedule uses sa1 to take a snapshot of system performance at the beginning of every hour every day. The second cron schedule adds a snapshot at 20 minutes (:20) and 40
minutes (:40) after the hour between 8:00 A.M. and 5:00 P.M., every Monday through Friday. As a result, you get more detail during business hours, and less during the evenings and weekends.

The final line schedules sa2 to run at 6:05 P.M. every Monday through Friday to create an ASCII report from the data collected by sa1. This ASCII data is stored using a similar filename convention: /var/adm/sa/sarX, again where X is the day number.


The simplest way to configure sar to run is to edit the sys account's crontab and remove the # signs from the start of the sa1 and sa2 command lines. However, you may want to customize the cron schedules to suit your own preferences. For example, your company might run multiple shifts, and you may want more detailed data. Thus, you can modify the cron job to run sa1 at 15-minute intervals, every business day.


You can't just log into the sys account and edit the cron job, though, because the sys account is usually locked. Instead, you must log in as root, then su to the sys account, like so:


$ su
Password:
# su sys
#
At this point, be sure to set the EDITOR environment variable to your favorite editor, and edit the crontab file, like this:

# EDITOR=vi
# export EDITOR
# crontab -e

Now, your favorite editor (vi, in this case) comes up, and you can edit the cron schedules. For our example, we just want to run sa1 every 15 minutes every day, and have the sa2 program generate ASCII versions of the data just before midnight. So we'll change the cron schedule to look like this:

0,15,30,45 * * * 0-6 /usr/lib/sa/sa1
55 23 * * 0-6 /usr/lib/sa/sa2 -A

Next, we save the file and exit, and crontab will install the new schedule for us. That's all you must do to configure sar. Once you do so, you can use sar without worrying about the file open errors any more.

Using the binary data files

Once the system is creating the binary data files, you can use sar without specifying the interval between samples and the number of samples you want to take. You can simply specify the
data sets you want to see, and sar will print all that's accumulated thus far for the day. Therefore, if you're interested in CPU use and paging activity, you'd run sar as shown in Figure B. Since we ran sar near the end of the day, and we're sampling every 15 minutes, we're inundated with details. That's the major problem with detail--it's easy to get swamped.

Figure B: The sar -up command reports detailed information about the CPU and paging use up to the current time.


$ sar -up
SunOS Devo 5.5.1 Generic_103641-08 i86pc 11/04/97
00:00:01    %usr    %sys    %wio   %idle
00:15:00       0       0       0      99
00:30:00       0       0       0      99
00:45:00       0       1       0      99
22:15:00       0       0       0      99
22:30:00       0       0       0      99
22:45:00       1       1       3      95
Average        3       1       4      92
 
00:00:01  atch/s  pgin/s ppgin/s  pflt/s  vflt/s slock/s
00:15:00    0.00    0.02    0.03    1.82    2.93    0.00
00:30:00    0.00    0.00    0.00    4.35    6.15    0.00
00:45:00    0.00    0.02    0.02   38.95   44.79    0.00

Getting the bigger picture

While getting a detailed picture of your system is wonderful, you probably don't need or want such a detailed report very often. After all, your job is to manage the system, not micromanage it. Do you think the president of your company monitors the details of the day-to-day operations of the company? Of course not--the president is happy to see the weekly reports showing that the business is chugging along smoothly. It's only when the business is having problems that the president starts to examine and analyze details. Your role as system administrator is similar to that of the company president: As long as the system is running smoothly, you merely want to glance at a report to see that everything is going nicely. You don't want to delve into a morass of details unless something's awry. Consequently, what we usually want from sar isn't a detailed report on
all the system statistics, but rather a simple summary.

The sar command provides three command-line switches to let you control how you want sar to summarize its data. The -s and -e options allow you to select the starting and ending times of the report, and the -i option allows you to specify the reporting interval. So you can see an hourly summary of CPU usage during working hours by using sar like this:


$ sar -s 08 -e 18 -i 3600 -u
SunOS Devo 5.5.1 Generic_103641-08 i86pc    11/03/97
08:00:00    %usr    %sys    %wio   %idle
09:00:01       0       1       2      97
10:00:00       3       3       1      94
11:00:00       0       0       0     100
12:00:00       0       0       0     100
13:00:00       0       0       0     100
14:00:00       0       0       0     100
15:00:00       5      56      30       8
16:00:01       3      68      24       5
17:00:00       0      11      10      79
18:00:00       0       0       0     100
 
Average        1      14       7      78

If we had a performance problem during the day, we could quickly tell when it occurred using this summary report. Then, we'd adjust our -s, -e, and -i options to focus on the details we're actually interested in seeing. Instead of wading through pages of data, we can be selective.

Conclusion

Once you get sar configured, it can capture all the performance statistics for your machine. It's a good idea to browse through the man page for sar a few times to get acquainted with the values it can capture. You don't have to understand all of it, especially at the beginning. To start with, it's a good policy to become familiar with the numbers when your system is operating normally, because then you'll be able to pinpoint which system characteristics are degrading, and begin addressing the problems.

Taken from: http://members.tripod.com/Dennis_Caparas/Configuring_sar_for_your_system.html

Friday, May 9, 2008

How to Expand a Solaris File System

Note

Solaris Volume Manager volumes can be expanded. However, volumes cannot be reduced in size.

· A volume can be expanded whether it is used for a file system, application, or database. You can expand RAID-0 (stripe and concatenation) volumes, RAID-1 (mirror) volumes, RAID-5 volumes, and soft partitions.

· You can concatenate a volume that contains an existing file system while the file system is in use. As long as the file system is a UFS file system, the file system can be expanded (with the growfs command) to fill the larger space. You can expand the file system without interrupting read access to the data.

· Once a file system is expanded, it cannot be reduced in size, due to constraints in the UFS file system.

· Applications and databases that use the raw device must have their own method to expand the added space so that they can recognize it. Solaris Volume Manager does not provide this capability.

· When a component is added to a RAID-5 volume, it becomes a concatenation to the volume. The new component does not contain parity information. However, data on the new component is protected by the overall parity calculation that takes place for the volume.

· You can expand a log device by adding additional components. You do not need to run the growfs command, as Solaris Volume Manager automatically recognizes the additional space on reboot.

· Soft partitions can be expanded by adding space from the underlying volume or slice. All other volumes can be expanded by adding slices.

Taken from: http://docs.huihoo.com/opensolaris/solaris-volume-manager-administration-guide/html/ch20s06.html


Steps for expanding a file system built on a soft partition:

1. Check Prerequisites

# df -k /local
Filesystem kbytes used avail capacity Mounted on
/dev/md/dsk/d46 62992061 58223588 4138553 94% /local

# metastat -p d46
d46 -p d50 -o 763363392 -b 117440512 -o 922747008 -b 10485760
d50 -m d49 1
d49 2 2 c8t60060E8004EAEA000000EAEA000027FFd0s6 c8t60060E8004EAEA000000EAEA0000273Bd0s6 -i 32b \
1 c8t60060E8004EAEA000000EAEA000027F7d0s6

# metarecover -n -v /dev/md/rdsk/d50 -p -m | grep FREE
NONE 0 FREE 0 31
NONE 0 FREE 763363360 31
NONE 0 FREE 880803904 31
NONE 0 FREE 922746976 31
NONE 0 FREE 933232768 31
NONE 0 FREE 1008730272 391510367

2. Expand the soft partition

# metattach d46 24gb
d46: Soft Partition has been grown

3. Expand the filesystem

# growfs -M /local /dev/md/rdsk/d46
Warning: 2560 sector(s) in last cylinder unallocated
/dev/md/rdsk/d46: 178257920 sectors in 23211 cylinders of 15 tracks, 512 sectors
87040.0MB in 1658 cyl groups (14 c/g, 52.50MB/g, 6400 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
32, 108064, 216096, 324128, 432160, 540192, 648224, 756256, 864288, 972320,
Initializing cylinder groups:
................................
super-block backups for last 10 cylinder groups at:
177192992, 177301024, 177409056, 177517088, 177625120, 177733152, 177841184,
177949216, 178057248, 178165280,

4. Verify filesystem status

# df -k /local
Filesystem kbytes used avail capacity Mounted on
/dev/md/dsk/d46 87775990 58226660 28919410 67% /local