Friday, May 16, 2008

Setting Up a Solaris DHCP Client

Introduction

One of the problems that can arise when trying to use a Solaris box as a DHCP client is that by default, the server is expected to supply a hostname, in addition to all the other stuff (like IP address, DNS servers, etc.). Most cable modems and home routers don't supply a (usable) hostname, so it gets set to "unknown". This page describes how to get around that. (Where this page says "cable modem", "DSL modem" can be substituted.)

This page assumes that le0 is the interface you're using for your DHCP connection. Substitute hme0 or whatever interface you're actually using in the examples below.

Setting up DHCP

There are two ways of using DHCP:

  • DHCP has limited control
  • DHCP has full control

The first case may be where you want to use your own /etc/resolv.conf and so on, with a minimum of hassle.

The second case would be the normal situation, especially if your cable modem provider has a habit of changing DNS name server IP addresses on you (like mine does!), so I'll concentrate on that here. I have a script to automate the first method, should you want to use it. You'll need to change the DEFAULT_ADDR and INTERFACE variables as required.

The first thing to do is to create an empty /etc/hostname.le0, like this:

> /etc/hostname.le0

Creating this file ensures that the interface gets plumbed, ready for the DHCP software to do its stuff.

Next, you create /etc/dhcp.le0. This file can be empty if you want to accept the defaults, but may also contain one or both of these directives:

  • wait time, and
  • primary

By default, ifconfig will wait 30 seconds for the DHCP server to respond (after which time, the boot will continue, while the interface gets configured in the background). Specifying the wait directive tells ifconfig not to return until the DHCP server has responded. time can be set to the special value forever, with the obvious meaning. I use a time value of 300, which seems to be long enough for my cable provider.

The primary directive indicates to ifconfig that the current interface is the primary one, if you have more than one interface under DHCP control. If you only have one interface under DHCP control, then it is automatically the primary one, so primary is redundant (although it's permissible).
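
For illustration, an /etc/dhcp.le0 that uses both directives would contain a single line along these lines (the directives are handed to ifconfig when it starts DHCP; the value 300 is just my choice from above):

wait 300 primary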

With these files in place, subsequent reboots will place le0 under DHCP control: you're ready to go!

Unknown hostname

Actually, there's one snag: most (if not all) cable modem DHCP servers don't provide you with a hostname (and even if they did, odds are it wouldn't be one you want anyway!). This wouldn't be a problem, except that the boot scripts (/etc/init.d/rootusr in particular) try to be clever and set your hostname to "unknown" in this case, which is not at all useful!

The trick is to change your hostname back to the right one, preferably without changing any of the supplied start-up scripts, which are liable to be stomped on when you upgrade or install a patch. You've also got to do it early enough in the boot process that rpcbind, sendmail, and friends don't get confused by using the wrong hostname. To solve this problem, put a little script into /etc/init.d/set_hostname, with a symbolic link to it from /etc/rc2.d/S70set_hostname.
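
A minimal sketch of such a script (assuming you keep the desired name in /etc/nodename, which is my choice here, not part of the original page) might look like this:

#!/sbin/sh
# set_hostname -- restore the hostname that the DHCP client set to "unknown".
# Sketch only: reads the desired name from /etc/nodename (an assumption).
HOSTNAME=`cat /etc/nodename 2>/dev/null`
if [ -n "$HOSTNAME" ]; then
        uname -S "$HOSTNAME"
fi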

Starting with Solaris 10, the preceding paragraph can be ignored. Instead, just make sure that the hostname you want to use is in /etc/nodename; the contents of that file will then be used to set the hostname. (Note that it is essential that the hostname you put into /etc/nodename is terminated with a newline. Breakage will happen if this is not the case.) Also, from Solaris 8 it is possible to tell the DHCP software not to request a hostname from the DHCP server. To do this, remove the token 12 from the PARAM_REQUEST_LIST line in /etc/default/dhcpagent. (/etc/default/dhcpagent describes what the default tokens are; 12 is the hostname, 3 is the default router, 6 is the DNS server, and so on.)
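
For example, the change looks like this (the "before" line is the stock default as I remember it; check your own file):

PARAM_REQUEST_LIST=1,3,6,12,15,28,43    (before)
PARAM_REQUEST_LIST=1,3,6,15,28,43       (after)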

With these modifications in place, reboot, and you'll be using your cable modem in no time!


Taken from: http://www.rite-group.com/rich/solaris_dhcp.html

Wednesday, May 14, 2008

Quick Tips to Find Files on Linux File System

One of the first hurdles that every Linux newbie working on the Command Line Interface (CLI) bumps into is finding files on the file system. Administrators who switch from a Windows environment are so used to the click-and-find mentality that discovering files via the Linux CLI is painful for them. This tutorial is written for those friends who work on Linux and don't have the luxury of a Graphical User Interface (GUI).

I started playing with Linux during my internship, working with Snort (Intrusion Detection System), Nessus (Vulnerability Scanner) and IPTables (Firewall). Like most programs, these tools have quite a few configuration files. Initially, it was difficult for me to remember the path to each file, so I started to use the power of the 'find' and 'locate' commands, which I will share with you in this tutorial.

Method 1: LOCATE
Before we start playing around with the LOCATE command, it's important to learn about "updatedb". Every day, your system automatically runs the updatedb command via cron to create or update a database that keeps a record of all filenames. The locate command then searches through this database to find files.

This database is by default stored at /var/lib/mlocate/mlocate.db. Obviously we are curious about what this database looks like, so first I run ls -lh to find the size of this file.

Since this is in a binary database format, I doubt we would see anything legible with the "cat" command. So instead I used the strings command, which threw a lot of file names onto the screen (132516 to be exact). Hence, I used grep to see only the filenames which contain lighttpd, a web server installed on my system.
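
Put together, that inspection amounts to something like the following (the paths assume a stock mlocate installation):

[root@localhost:~] ls -lh /var/lib/mlocate/mlocate.db
[root@localhost:~] strings /var/lib/mlocate/mlocate.db | grep lighttpd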


But of course this is not the right way to do searches; we did this just to see what updatedb is doing. Now let's get back to "locate". Remember that since locate reads the database created by updatedb, your results will only be as fresh as the last run of updatedb. You can always run updatedb manually from the CLI and then use the locate command.
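
For example, to make a file created a moment ago findable right away:

[root@localhost:~] updatedb
[root@localhost:~] locate lighttpd.conf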

Let’s start exercising this command by searching for commands. I start by looking for pdf documentation files for “snort”. If I just type in “locate snort” it gives me 1179 file names in result.

[root@localhost:~] locate snort | less
/etc/snort
/etc/snort/rules
/etc/snort/rules/VRT-License.txt
/etc/snort/rules/attack-responses.rules
/etc/snort/rules/backdoor.rules
/etc/snort/rules/bad-traffic.rules
/etc/snort/rules/cgi-bin.list
/etc/snort/rules/chat.rules
/etc/snort/rules/classification.config
/etc/snort/rules/ddos.rules
/etc/snort/rules/deleted.rules
....

But, I want the documentation files, which I already know are in PDF format. So now I will use the power of regular expressions to further narrow down my results.

The "-r" option is used to tell the "locate" command to expect a regular expression. In this case, I use pdf$ in the regex to show me only files which end with pdf.
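
A sketch of that search (the exact pattern is my reconstruction):

[root@localhost:~] locate -r 'snort.*pdf$'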

Remember that updatedb excludes temporary folders, so it may not give you the results you expect. To get around these limitations, there is the command "find".

Method 2: Find
The find command is the most useful of all the commands I have used in my few years of managing Linux machines. Still, this command is not fully understood and utilized by many administrators. Unlike the "locate" command, the "find" command actually walks the file system and looks for the pattern you define while running the command.

The most common usage of the "find" command is to search for a file with a specific file name.
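
A minimal sketch, assuming the configuration file lives under /etc as shown earlier:

[root@localhost:~] find /etc -name snort.conf
/etc/snort/snort.conf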

Like "-name", the find command has other qualifiers based on time, as shown below. These are also very helpful if you are doing forensic analysis on your Linux machine.

-iname = same as -name, but case insensitive
-atime n = true, if file was accessed n days ago
-amin n = true, if file was accessed n minutes ago
-mtime n = true, if file contents were changed n days ago
-mmin n = true, if file contents were changed n minutes ago
-ctime n = true, if file attributes were changed n days ago
-cmin n = true, if file attributes were changed n minutes ago

To help you understand these qualifiers, I created a file named "foobar.txt" four minutes earlier and then ran "find /root -mmin -5" to show all files in the /root folder whose last modification time is less than 5 minutes ago, and it shows me the foobar.txt file. However, if I change the value of -mmin to -2 (less than 2 minutes), it shows me nothing.
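
Reconstructed, that experiment looks like this:

[root@localhost:~] touch /root/foobar.txt
(wait four minutes)
[root@localhost:~] find /root -mmin -5
/root/foobar.txt
[root@localhost:~] find /root -mmin -2
[root@localhost:~]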


There is another very useful qualifier, which searches on file size.
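
For example, with GNU find, this lists files larger than 100 MB anywhere on the system:

[root@localhost:~] find / -size +100M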

Some other qualifiers that I always use while administering Linux servers are:

-regex expression = select files which match the regular expression
-iregex expression = same as above, but case insensitive
-empty = select files and directories which are empty
-type filetype = select files by file type
-user username = select files owned by the given user
-group groupname = select files owned by the given group

There are a few more qualifiers, but I leave those as homework for you: read the man page and enhance your knowledge.
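
As one last illustrative combination (the path and user are made up):

[root@localhost:~] find /var/www -type f -user lighttpd -empty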

NOTE: One thing you will notice is that "locate" runs super fast; that's because it reads from a database file rather than actually traversing the file system.

This was a very short and crisp introduction to the find and locate commands, but these are the most important commands for any administrator. Once you get used to them, you will wish there were something similar and as powerful in Windows.

Taken from: http://www.secguru.com/article/quick_tips_find_files_linux_file_system

Configuring sar for your system

Before you can tune a system properly, you must decide which system characteristics are important, and which ones are less so. Once you decide your priorities, you then need to find a way
to measure the system performance according to those priorities. In fact, the system activity reporter programs are a good measuring tool for many aspects of system performance. In this article, we'll introduce you to the sar utility, which can give you detailed performance information about your system.


What does sar measure?

Since system tuning involves the art of finding acceptable compromises, you need the ability to see the impact of your changes on multiple subsystems. System activity reporter (SAR) programs collect system-performance information in distinct groups. Table A shows how sar groups the performance information. The first column shows the switch you give to sar in order to request that particular information group, and the second column briefly describes the information group.

Table A

Switch   Performance Monitoring Group
A        All monitoring groups
a        File access statistics
b        Buffer activity
c        System call activity
d        Block device activity
g        Paging out activity
k        Kernel memory allocation
m        Message and semaphore activity
p        Paging in activity
q        CPU run queue statistics
r        Unused memory and disk pages
u        CPU usage statistics (default)
v        Report status of system tables
w        System swapping and switching
y        TTY device activity



One way you can run sar is to specify a sampling interval and the number of times you want it to run. So, if you want to check the file-access statistics every 20 seconds for the next five minutes, you'd run sar like this (the first line is the command you type; the lines that follow are its output):


$ sar -a 20 15

SunOS Devo 5.5.1 Generic_103641-08 i86pc    11/05/97
01:06:02  iget/s namei/s dirbk/s
01:06:22     270     397     278
01:06:42     602     785     685
01:07:02     194     238     215

Configuring sar to collect data

Notice that you can't just run sar right now. If you try to run the sar command without first configuring
it, it gives you an error message like this:

$ sar -a 20 15
sar: can't open /var/adm/sa/sa03
No such file or directory

Sure enough, if you look at the /var/adm/sa directory, you won't see any files in it, much less that
sa03 file it's complaining about. If you create a blank file, using touch, for example, sar will start to work. However, why must you do something so strange to make sar work? And if you try to run sar tomorrow, you'll get a similar error, but this time it will complain about a different file, such as sa04.


It turns out that the sar program is only one part of the performance monitoring package. Three commands in the /usr/lib/sa directory also contribute to the whole. The sadc command collects system data and stores it to a binary file, suitable for sar to use. The shell script sa1 is a wrapper for sadc, suitable for use in cron jobs, so it can be run automatically. The sa2 script is a wrapper for sar that forces it to print a report in ASCII format from the binary information in the files sadc creates.


If you run the sa1 script as intended, it creates a binary file containing all the performance statistics for the day. This file allows sar to read the data and report on it without forcing you to wait and collect it. Since you may want to investigate the data a bit later, or compare one day's worth of information against another, the sar, sa1, and sa2 programs name the data file using the
same format: /var/adm/sa/saX, where X is the day number. Therefore, when you run sar, one of the first things it does is look for today's binary file. When it doesn't find the file, it prints the error.


The best way to run sa1 and sa2 is from a cron job. Sun provides an example of how to create the cron job instead of forcing you to figure it out for yourself. Thus, if you edit the crontab for the account sys, you'll see commented-out sample cron schedules for sa1 and sa2, as shown in Figure A.


Figure A: The sys account already has prototype entries for running sa1 and sa2, which you can uncomment and use.


#ident  "@(#)sys        1.5     92/07/14 SMI"   /* SVr4.0 1.2   */
#
# The sys crontab should be used to do performance collection. See cron
# and performance manual pages for details on startup.
#
#0 * * * 0-6 /usr/lib/sa/sa1
#20,40 8-17 * * 1-5 /usr/lib/sa/sa1
#5 18 * * 1-5 /usr/lib/sa/sa2 -s 8:00 -e 18:01 -i 1200 -A

The first cron schedule uses sa1 to take a snapshot of system performance at the beginning of every hour, every day. The second cron schedule adds a snapshot at 20 minutes (:20) and 40 minutes (:40) after the hour between 8:00 A.M. and 5:00 P.M., every Monday through Friday. As a result, you get more detail during business hours, and less during the evenings and weekends.

The final line schedules sa2 to run at 6:05 P.M. every Monday through Friday to create an ASCII report from the data collected by sa1. This ASCII data is stored using a similar filename convention: /var/adm/sa/sarX, again where X is the day number.


The simplest way to configure sar to run is to edit the sys account's crontab and remove the # signs from the start of the sa1 and sa2 command lines. However, you may want to customize the cron schedules to suit your own preferences. For example, your company might run multiple shifts, and you may want more detailed data. Thus, you can modify the cron job to run sa1 at 15-minute intervals, every business day.


You can't just log into the sys account and edit the cron job, though, because the sys account is usually locked. Instead, you must log in as root, then su to the sys account, like so:


$ su
Password:
# su sys
#
At this point, be sure to set the EDITOR environment variable to your favorite editor, and edit the crontab file, like this:

# EDITOR=vi
# export EDITOR
# crontab -e

Now, your favorite editor (vi, in this case) comes up, and you can edit the cron schedules. For our example, we just want to run sa1 every 15 minutes every day, and the sa2 program should generate ASCII versions of the data just before midnight. So we'll change the cron schedule to look like this:

0,15,30,45 * * * 0-6 /usr/lib/sa/sa1
55 23 * * 0-6 /usr/lib/sa/sa2 -A
Next, we save the file and exit, and crontab will start the appropriate cron jobs for us. That's all you
must do to configure sar. Once you do so, you can use sar without worrying about the file open errors any more.

Using the binary data files

Once the system is creating the binary data files, you can use sar without specifying the interval between samples and the number of samples you want to take. You can simply specify the
data sets you want to see, and sar will print all that's accumulated thus far for the day. Therefore, if you're interested in CPU use and paging activity, you'd run sar as shown in Figure B. Since we ran sar near the end of the day, and we're sampling every 15 minutes, we're inundated with details. That's the major problem with detail--it's easy to get swamped.

Figure B: The sar -up command reports detailed information about the CPU and paging use up to the current time.


$ sar -up
SunOS Devo 5.5.1 Generic_103641-08 i86pc 11/04/97
00:00:01    %usr    %sys    %wio   %idle
00:15:00       0       0       0      99
00:30:00       0       0       0      99
00:45:00       0       1       0      99
22:15:00       0       0       0      99
22:30:00       0       0       0      99
22:45:00       1       1       3      95
Average        3       1       4      92
 
00:00:01  atch/s  pgin/s ppgin/s  pflt/s  vflt/s slock/s
00:15:00    0.00    0.02    0.03    1.82    2.93    0.00
00:30:00    0.00    0.00    0.00    4.35    6.15    0.00
00:45:00    0.00    0.02    0.02   38.95   44.79    0.00

Getting the bigger picture

While getting a detailed picture of your system is wonderful, you probably don't need or want such a detailed report very often. After all, your job is to manage the system, not micromanage it. Do you think the president of your company monitors the details of the day-to-day operations of the company? Of course not--the president is happy to see the weekly reports showing that the business is chugging along smoothly. It's only when the business is having problems that the president starts to examine and analyze details. Your role as system administrator is similar to that of the company president: As long as the system is running smoothly, you merely want to glance at a report to see that everything is going nicely. You don't want to delve into a morass of details unless something's awry. Consequently, what we usually want from sar isn't a detailed report on
all the system statistics, but rather a simple summary.

The sar command provides three command-line switches to let you control how you want sar to summarize its data. The -s and -e options allow you to select the starting and ending times of the report, and the -i option allows you to specify the reporting interval. So you can see an hourly summary of CPU usage during working hours by using sar like this:


$ sar -s 08 -e 18 -i 3600 -u
SunOS Devo 5.5.1 Generic_103641-08 i86pc    11/03/97
08:00:00    %usr    %sys    %wio   %idle
09:00:01       0       1       2      97
10:00:00       3       3       1      94
11:00:00       0       0       0     100
12:00:00       0       0       0     100
13:00:00       0       0       0     100
14:00:00       0       0       0     100
15:00:00       5      56      30       8
16:00:01       3      68      24       5
17:00:00       0      11      10      79
18:00:00       0       0       0     100
 
Average        1      14       7      78

If we had a performance problem during the day, we could quickly tell when it occurred using this summary report. Then, we'd adjust our -s, -e, and -i options to focus on the details we're actually interested in seeing. Instead of wading through pages of data, we can be selective.

Conclusion

Once you get sar configured, it can capture all the performance statistics for your machine. It's a good idea to browse through the man page for sar a few times to get acquainted with the values it can capture. You don't have to understand all of it, especially at the beginning. To start with, it's a good policy to become familiar with the numbers when your system is operating normally, because then you'll be able to pinpoint which system characteristics are degrading, and begin addressing the problems.

Taken from: http://members.tripod.com/Dennis_Caparas/Configuring_sar_for_your_system.html

Friday, May 9, 2008

How to Expand a Solaris File System

Note

Solaris Volume Manager volumes can be expanded. However, volumes cannot be reduced in size.

· A volume can be expanded whether it is used for a file system, application, or database. You can expand RAID-0 (stripe and concatenation) volumes, RAID-1 (mirror) volumes, and RAID-5 volumes and soft partitions.

· You can concatenate a volume that contains an existing file system while the file system is in use. As long as the file system is a UFS file system, the file system can be expanded (with the growfs command) to fill the larger space. You can expand the file system without interrupting read access to the data.

· Once a file system is expanded, it cannot be reduced in size, due to constraints in the UFS file system.

· Applications and databases that use the raw device must have their own method to expand the added space so that they can recognize it. Solaris Volume Manager does not provide this capability.

· When a component is added to a RAID-5 volume, it becomes a concatenation to the volume. The new component does not contain parity information. However, data on the new component is protected by the overall parity calculation that takes place for the volume.

· You can expand a log device by adding additional components. You do not need to run the growfs command, as Solaris Volume Manager automatically recognizes the additional space on reboot.

· Soft partitions can be expanded by adding space from the underlying volume or slice. All other volumes can be expanded by adding slices.

Taken from: http://docs.huihoo.com/opensolaris/solaris-volume-manager-administration-guide/html/ch20s06.html


Steps for expanding a file system on a soft partition:

1. Check Prerequisites

# df -k /local
Filesystem kbytes used avail capacity Mounted on
/dev/md/dsk/d46 62992061 58223588 4138553 94% /local

# metastat -p d46
d46 -p d50 -o 763363392 -b 117440512 -o 922747008 -b 10485760
d50 -m d49 1
d49 2 2 c8t60060E8004EAEA000000EAEA000027FFd0s6 c8t60060E8004EAEA000000EAEA0000273Bd0s6 -i 32b \
1 c8t60060E8004EAEA000000EAEA000027F7d0s6

# metarecover -n -v /dev/md/rdsk/d50 -p -m | grep FREE
NONE 0 FREE 0 31
NONE 0 FREE 763363360 31
NONE 0 FREE 880803904 31
NONE 0 FREE 922746976 31
NONE 0 FREE 933232768 31
NONE 0 FREE 1008730272 391510367

2. Expand the soft partition

# metattach d46 24gb
d46: Soft Partition has been grown

3. Expand the filesystem

# growfs -M /local /dev/md/rdsk/d46
Warning: 2560 sector(s) in last cylinder unallocated
/dev/md/rdsk/d46: 178257920 sectors in 23211 cylinders of 15 tracks, 512 sectors
87040.0MB in 1658 cyl groups (14 c/g, 52.50MB/g, 6400 i/g)
super-block backups (for fsck -F ufs -o b=#) at:
32, 108064, 216096, 324128, 432160, 540192, 648224, 756256, 864288, 972320,
Initializing cylinder groups:
................................
super-block backups for last 10 cylinder groups at:
177192992, 177301024, 177409056, 177517088, 177625120, 177733152, 177841184,
177949216, 178057248, 178165280,

4. Verify filesystem status

# df -k /local
Filesystem kbytes used avail capacity Mounted on
/dev/md/dsk/d46 87775990 58226660 28919410 67% /local

NFS Mount Point Permission Issue (ls: ..: Permission denied)

This is to fix the mount point permission issue seen by an NFS user.

# su - nfsuser
$ cd /var
$ ls -la
ls: ..: Permission denied


For Example:

1. Boot to OK prompt
# init 0

2. Boot to maintenance mode
OK> boot -s

3. Make sure /var is not mounted
# mount | grep var

4. If mounted:

Solaris 8 has an option:
# umount -f /var

which will forcibly unmount a partition from its mount point. This should only be used in extreme circumstances, since anyone accessing a file from that partition will now get an error (EIO).

In other releases of Solaris:
If you get the error "umount: /var busy", you can use the command fuser -f /var.

This will return the following:
# fuser -f /var
/var: 2885c 2857c

Next perform the following:
# ps -ef | grep 2857
root 2890 2857 0 13:40:30 pts/2 0:00 -sh
root 2857 2855 0 13:30:10 pts/2 0:00 -sh

If you kill the offending shell with kill or kill -9, you should be able to unmount the partition.

5. Set permissions to 755 for /var mount point
# chmod 755 /var

6. Exit to multiuser mode
# exit

7. Verify that permission has been corrected:
# su - nfsuser
$ cd /var
$ ls -la


8. The user should have no permission issue running ls now.

Saturday, May 3, 2008

Managing swap in the Solaris OS

Amit Dixit, October 2006

Installation of the Solaris OS creates swap space, allocating 512 Mbytes by default. The Solaris OS supports applying swap to raw disk partitions and to file systems, and it also uses physical RAM as a swap area. Usually physical memory is more efficient, but we are always restricted by the amount of physical memory installed on the system.

It's always a good idea to apply swap to a raw partition, as compared to a file system, because a raw partition doesn't involve the overhead of the file system.

(Note: I've written this for Solaris versions 7, 8, 9, and 10. That said, I am pretty sure this is applicable to all the versions.)


Adding Raw Partition swap Space

To add a raw swap partition you need to perform the following steps on your system:

1. Identify a free disk partition on your system.

2. Add an entry to /etc/vfstab for the new raw partition as a swap partition:
/dev/dsk/c0t1d0s0 - - swap - no -

3. To enable this swap partition, issue the following command:
#swap -a /dev/dsk/c0t1d0s0

4. To view the current swap details, use the following command:
#swap -l


Adding File System swap

The Solaris OS supports applying swap to a file. To enable a file system swap you need to perform the following tasks:

1. Create a file using mkfile:
#mkfile 250m /opt/myswapfile

This will create a 250 Meg file, which the Solaris OS can use for swap.

2. To use this swap file, enable it with the following command:
#swap -a /opt/myswapfile

3. Check your change:
#swap -l

Note: To enable the new swap file at the next system boot, add the following entry to /etc/vfstab:
/opt/myswapfile - - swap - no -


Disabling swap Space

The Solaris OS provides the ability to disable a swap file while the system is running. This is done with the -d option for swap. All allocated blocks are copied to other swap areas.

solaris# swap -d /opt/myswapfile

To check your change, type this:

solaris# swap -l


Monitoring swap

It's always important to configure the right amount of swap space: Too little will result in poor performance and too much will waste disk space.

The Solaris OS starts using swap if it's running out of physical memory. This is called paging.

Here's how to get a summary of swap space:

solaris# swap -s
total: 3500744k bytes allocated + 3048720k reserved = 6549464k used, 23869824k available

And here's how to get details on the individual device or file that constitutes swap space:

solaris# swap -l
swapfile dev swaplo blocks free
/dev/md/dsk/d1 85,1 16 41945456 41945456

If your system is running out of swap space you will see the following errors:

Not Enough Space

or

WARNING /tmp: File system full, swap space limit exceeded

To see if the system is running short of physical memory you can use vmstat and iostat.

solaris# vmstat
kthr memory page disk faults cpu
r b w swap free re mf pi po fr de sr m0 m1 m3 m4 in sy cs us sy id
0 0 0 24137360 6421168 70 179 21 14 14 0 0 0 0 0 0 472 3363 1776 4 2 94
0 0 0 23869912 5953040 11 13 0 0 0 0 0 0 0 0 0 430 1071 1545 7 1 92
0 0 0 23870896 5953904 58 313 0 2 2 0 0 0 0 0 0 578 2369 1798 20 1 78
0 0 0 23874712 5957216 11 11 0 0 0 0 0 0 0 0 0 417 1325 1648 0 0 100
0 0 0 23874744 5957248 22 64 0 3 3 0 0 0 0 0 0 423 1578 1629 1 2 97

Watch the column sr (Scan Rate) in the vmstat output.

solaris# iostat -Pxn

extended device statistics
r/s w/s kr/s kw/s wait actv wsvc_t asvc_t %w %b device
0.1 2.7 1.1 5.6 0.0 0.1 0.2 25.5 0 2 c1t0d0s0
0.0 0.0 0.0 0.0 0.0 0.0 0.0 9.7 0 0 c1t0d0s1
0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.2 0 0 c1t0d0s2

Watch the r/s and w/s columns in the iostat output for the device that is configured as the swap device. If the values are high, it means a large amount of I/O is being generated to free up pages.

If physical memory is too low, the system will be busy paging to the swap device, generating heavy I/O on it. In this state the system's CPU utilization will also increase.


Summary

For improved system performance it's important that you have allocated sufficient swap space to the system. To start with, configure 1.5 times the physical memory installed on the system. If required, allocate more swap space.
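
To take that rule of thumb concretely: a machine with 8 Gbytes of physical memory would start out with roughly 12 Gbytes of swap.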

Taken from: http://www.sun.com/bigadmin/content/submitted/manage_swap.html

Understanding and setting up Solstice DiskSuite in Solaris

About Solstice DiskSuite:

Solstice DiskSuite 4.2.1 is a software product that manages data and disk drives.
Solstice DiskSuite 4.2.1 runs on all SPARC systems running Solaris 8, and on all x86 systems running Solaris 8.

DiskSuite's diskset feature is supported only on the SPARC platform edition of Solaris. This feature is not supported on x86 systems.

1. Advantages of DiskSuite
Solstice DiskSuite provides three major functions:
1. Overcoming the disk size limitation, by joining multiple disk slices to form a bigger volume.
2. Fault tolerance, by allowing mirroring of data from one disk to another and by keeping parity information in RAID5.
3. Performance enhancement, by spreading the data over multiple disks.


2. DiskSuite terms
Metadevice: A virtual device composed of several physical devices (slices/disks). All operations are carried out using the metadevice name and are transparently applied to the individual devices.

RAID: A group of disks used for creating a virtual volume is called an array, and depending on the disk/slice arrangement these are called various types of RAID (Redundant Array of Independent Disks):
RAID 0 Concatenation/Striping
RAID 1 Mirroring
RAID 5 Striped array with rotating parity

Concatenation: Concatenation is the joining of two or more disk slices to add up the disk space. Concatenation is serial in nature, i.e., sequential data operations are performed serially on the first disk, then the second disk, and so on. Due to this serial nature, new slices can be added without having to back up the entire concatenated volume, add the slice, and restore the backup.

Striping: Spreading of data over multiple disk drives, mainly to enhance performance by distributing data in alternating chunks (a 16 KB interleave, by default) across the stripes. Sequential data operations are performed in parallel on all the stripes by reading/writing 16 KB data blocks alternately from the disk stripes.
Mirroring: Mirroring provides data redundancy by simultaneously writing data onto two submirrors of a mirrored device. A submirror can be a stripe or concatenated volume, and a mirror can have up to three submirrors. The main concern here is that a mirror needs as much space as the volume to be mirrored.

RAID 5: RAID 5 provides data redundancy and the advantages of striping, and it uses less space than mirroring. A RAID 5 volume is made up of at least three disks, which are striped with parity information written alternately on all the disks. In case of a single disk failure, the data can be rebuilt using the parity information from the remaining disks.



3. DiskSuite Packages:

Solstice DiskSuite is a part of the server edition of the Solaris OS and is not included with the desktop edition. The software is in pkgadd format and can be found in the following locations on CD:
Solaris 2.6 - "Solaris Server Intranet Extensions 1.0" CD
Solaris 7 - "Solaris Easy Access Server 3.0"
Solaris 8 - "Solaris 8 Software 2 of 2"

For Solaris 2.6 and 7, the Solstice DiskSuite version is 4.2. The following packages are part of it, but "SUNWmd" plus a patch is the minimum required:
SUNWmd - Solstice DiskSuite
SUNWmdg - Solstice DiskSuite Tool
SUNWmdn - Solstice DiskSuite Log Daemon
Patch No. 106627-04 (obtain latest revision)

For Solaris 8, the DiskSuite version is 4.2.1. The following are the minimum required packages:
SUNWmdr Solstice DiskSuite Drivers (root)
SUNWmdu Solstice DiskSuite Commands
SUNWmdx Solstice DiskSuite Drivers (64-bit)


4. Installing DiskSuite 4.2.1 in Solaris 8

# cd /cdrom/sol_8_401_sparc_2/Solaris_8/EA/products/DiskSuite_4.2.1/sparc/Packages

# pkgadd -d .
The following packages are available:
1 SUNWmdg Solstice DiskSuite Tool
(sparc) 4.2.1,REV=1999.11.04.18.29
2 SUNWmdja Solstice DiskSuite Japanese localization
(sparc) 4.2.1,REV=1999.12.09.15.37
3 SUNWmdnr Solstice DiskSuite Log Daemon Configuration Files
(sparc) 4.2.1,REV=1999.11.04.18.29
4 SUNWmdnu Solstice DiskSuite Log Daemon
(sparc) 4.2.1,REV=1999.11.04.18.29
5 SUNWmdr Solstice DiskSuite Drivers
(sparc) 4.2.1,REV=1999.12.03.10.00
6 SUNWmdu Solstice DiskSuite Commands
(sparc) 4.2.1,REV=1999.11.04.18.29
7 SUNWmdx Solstice DiskSuite Drivers(64-bit)
(sparc) 4.2.1,REV=1999.11.04.18.29
Select packages 1, 3, 4, 5, 6, and 7.

Enter 'yes' to the questions asked during installation, and reboot the system after installation.

Put /usr/opt/SUNWmd/bin in root's PATH, as the DiskSuite commands are located in this directory.
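
For a Bourne-style root shell, that is:

# PATH=$PATH:/usr/opt/SUNWmd/bin
# export PATH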


5. Creating the State Database:

The state metadevice database, metadb, keeps information about the metadevices and is needed for DiskSuite operation. DiskSuite cannot function without the metadb, so copies of the replica database are placed on different disks to ensure that a copy is available in case of a complete disk failure.

The metadb needs a dedicated disk slice, so create partitions of about 5 MB on the disks for the metadb. If there is no space available for the metadb, it can be taken from swap. Having metadb replicas on only two disks can create problems: DiskSuite requires the number of available database replicas to be greater than 50% of the total, and if one of the two disks crashes, the available replicas fall to exactly 50%. On the next reboot the system will come up in single-user mode, and you will have to recreate additional replicas to correct the metadb errors.

The following command creates three replicas of metadb on three disk slices.

#metadb -a -f -c 3 /dev/dsk/c0t1d0s6 /dev/dsk/c0t2d0s6 /dev/dsk/c0t3d0s6
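
To verify that the replicas were created, run metadb with the -i flag, which also prints a legend explaining the status flags:

# metadb -i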


6. Creating Metadevices:
Metadevices can be created in two ways:
1. Directly from the command line.
2. By editing /etc/opt/SUNWmd/md.tab, following the example entries in that file, and then initializing the devices on the command line using metainit.

6.1) Creating a concatenated metadevice:
#metainit d0 3 1 /dev/dsk/c0t0d0s4 1 /dev/dsk/c0t1d0s4 1 /dev/dsk/c0t2d0s4

d0 - metadevice name
3 - total number of slices
1 - number of slices to be added, followed by the slice name

6.2) Creating a stripe with a 32 KB interleave:
# metainit d10 1 2 c0t1d0s2 c0t2d0s2 -i 32k

d10 - metadevice name
1 - total number of stripes
2 - number of slices to be added to the stripe, followed by the slice names
-i 32k - the interlace: the size of the chunks of data written alternately across the stripes

6.3 ) Creating a Mirror :
A mirror is a metadevice composed of one or more submirrors. A submirror is made of one or more striped or concatenated metadevices.
Mirroring data provides you with maximum data availability by maintaining multiple copies of your data. The system must contain at least three state database replicas before you can create mirrors. Any file system including root (/), swap, and /usr, or any application such as a database, can use a mirror.
6.3.1 ) Creating a simple mirror from new partitions

1. Create two stripes for the two submirrors, d21 and d22:

# metainit d21 1 1 c0t0d0s2
d21: Concat/Stripe is setup
# metainit d22 1 1 c1t0d0s2
d22: Concat/Stripe is setup

2. Create a mirror device (d20) using one of the submirrors (d21):

# metainit d20 -m d21
d20: Mirror is setup

3. Attach the second submirror (d22) to the main mirror device (d20):

# metattach d20 d22
d20: Submirror d22 is attached

4. Make a file system on the new metadevice:

#newfs /dev/md/rdsk/d20

Then edit /etc/vfstab to mount /dev/md/dsk/d20 on a mount point.
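
A hypothetical vfstab entry for the new mirror (assuming a mount point of /local) would read:

/dev/md/dsk/d20 /dev/md/rdsk/d20 /local ufs 2 yes -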

6.3.2) Mirroring a partition with data which can be unmounted

# metainit -f d1 1 1 c1t0d0s0
d1: Concat/Stripe is setup
# metainit d2 1 1 c2t0d0s0
d2: Concat/Stripe is setup
# metainit d0 -m d1
d0: Mirror is setup
# umount /local
(Edit the /etc/vfstab file so that the file system references the mirror)
# mount /local
# metattach d0 d2
d0: Submirror d2 is attached

6.3.3) Mirroring partitions with data which cannot be unmounted (root and /usr)
· /usr mirroring
# metainit -f d12 1 1 c0t3d0s6
d12: Concat/Stripe is setup
# metainit d22 1 1 c1t0d0s6
d22: Concat/Stripe is setup
# metainit d2 -m d12
d2: Mirror is setup
(Edit the /etc/vfstab file so that /usr references the mirror)
# reboot
...
...
# metattach d2 d22
d2: Submirror d22 is attached
· root mirroring
# metainit -f d11 1 1 c0t3d0s0
d11: Concat/Stripe is setup
# metainit d12 1 1 c1t3d0s0
d12: Concat/Stripe is setup
# metainit d10 -m d11
d10: Mirror is setup
# metaroot d10
# lockfs -fa
# reboot


# metattach d10 d12
d10: Submirror d12 is attached

6.3.4 ) Making Mirrored disk bootable
a.) # installboot /usr/platform/`uname -i`/lib/fs/ufs/bootblk /dev/rdsk/c0t1d0s0

6.3.5) Creating an alternate name for the mirrored boot disk

a.) Find physical path name for the second boot disk
# ls -l /dev/rdsk/c1t3d0s0
lrwxrwxrwx 1 root root 55 Sep 12 11:19 /dev/rdsk/c1t3d0s0 ->../../devices/sbus@1,f8000000/esp@1,200000/sd@3,0:a

b.) Create an alias for booting from disk2
ok> nvalias bootdisk2 /sbus@1,f8000000/esp@1,200000/sd@3,0:a
ok> boot bootdisk2

6.4 ) Creating a RAID 5 volume :

The system must contain at least three state database replicas before you can create RAID5 metadevices.

A RAID5 metadevice can only handle a single slice failure. A RAID5 metadevice can be grown by concatenating additional slices to the metadevice. The new slices do not store parity information; however, they are parity protected. The resulting RAID5 metadevice continues to handle a single slice failure. Creating a RAID5 metadevice from a slice that contains an existing file system will erase the data during the RAID5 initialization process. The interlace value is key to RAID5 performance. It is configurable at the time the metadevice is created; thereafter, the value cannot be modified. The default interlace value is 16 KB, which is reasonable for most applications.

6.4.1) To set up RAID5 on three slices of different disks:

# metainit d45 -r c2t3d0s2 c3t0d0s2 c4t0d0s2
d45: RAID is setup

6.5) Creating a Trans Metadevice:

Trans metadevices enable UFS logging. There is one logging device and one master device; all file system changes are written into the logging device and then posted to the master device. This greatly reduces the fsck time for very large file systems, as fsck has to check only the logging device, which is usually of 64 MB maximum size. The logging device should preferably be mirrored and located on a different drive and controller than the master device.

UFS logging cannot be done for the root partition.

6.5.1) Trans Metadevice for a File System That Can Be Unmounted
· /home2
1. Setup metadevice

# umount /home2
# metainit d63 -t c0t2d0s2 c2t2d0s1
d63: Trans is setup
Logging becomes effective for the file system when it is remounted

2. Change vfstab entry & reboot

from
/dev/md/dsk/d2 /dev/md/rdsk/d2 /home2 ufs 2 yes -
to
/dev/md/dsk/d63 /dev/md/rdsk/d63 /home2 ufs 2 yes -
# mount /home2

The next reboot displays the following message for the logging device:
# reboot
...
/dev/md/rdsk/d63: is logging

6.5.2 ) Trans Metadevice for a File System That Cannot Be Unmounted
· /usr
1.) Setup metadevice
# metainit -f d20 -t c0t3d0s6 c1t2d0s1
d20: Trans is setup

2.) Change vfstab entry & reboot:
from
/dev/dsk/c0t3d0s6 /dev/rdsk/c0t3d0s6 /usr ufs 1 no -
to
/dev/md/dsk/d20 /dev/md/rdsk/d20 /usr ufs 1 no -
# reboot

6.5.3 ) TransMeta device using Mirrors

1.) Setup metadevice

#umount /home2
#metainit d64 -t d30 d12
d64: Trans is setup

2.) Change vfstab entry & reboot:
from
/dev/md/dsk/d30 /dev/md/rdsk/d30 /home2 ufs 2 yes -
to
/dev/md/dsk/d64 /dev/md/rdsk/d64 /home2 ufs 2 yes -

6.6 ) HotSpare Pool

A hot spare pool is a collection of slices reserved by DiskSuite to be automatically substituted in case of a slice failure in either a submirror or a RAID5 metadevice. A hot spare cannot be a metadevice, and it can be associated with multiple submirrors or RAID5 metadevices. However, a submirror or RAID5 metadevice can only be associated with one hot spare pool. Replacement is based on a first fit for the failed slice, and used hot spares need to be replaced with repaired or new slices. Hot spare pools may be allocated, deallocated, or reassigned at any time, unless a slice in the hot spare pool is being used to replace a damaged slice of its associated metadevice.

6.6.1) Associating a Hot Spare Pool with Submirrors

# metaparam -h hsp100 d10
# metaparam -h hsp100 d11
# metastat d0
d0: Mirror
Submirror 0: d10
State: Okay
Submirror 1: d11
State: Okay
...
d10: Submirror of d0
State: Okay
Hot spare pool: hsp100
...
d11: Submirror of d0
State: Okay
Hot spare pool: hsp100

6.6.2 ) Associating or changing a Hot Spare Pool with a RAID5 Metadevice

#metaparam -h hsp001 d10
#metastat d10
d10:RAID
State: Okay
Hot spare Pool: hsp001

6.6.3 ) Adding a Hot Spare Slice to All Hot Spare Pools

# metahs -a -all /dev/dsk/c3t0d0s2
hsp001: Hotspare is added
hsp002: Hotspare is added
hsp003: Hotspare is added

6.7 ) Disksets

A few important points about disksets:
A diskset is a set of shared disk drives containing DiskSuite objects that can be shared exclusively (but not concurrently) by one or two hosts. Disksets are used in high-availability failover situations, where the ownership of the failed machine's diskset is transferred to the other machine. Disksets are connected to two hosts for sharing and must have the same attributes (controller/target/drive) on both machines, except for the ownership.
DiskSuite must be installed on each host that will be connected to the diskset. There is one metadevice state database per shared diskset, and one on the "local" diskset. Each host must have its local metadevice state database set up before you can create disksets. Each host in a diskset must have a local diskset besides the shared diskset. A diskset can be created separately on one host and then added to the second host later.
A drive should not be in use by a file system, database, or any other application when it is added to a diskset.
When a drive is added to a diskset it is repartitioned so that the metadevice state database replica for the diskset can be placed on the drive. Drives are repartitioned when they are added to a diskset only if Slice 7 is not set up correctly. A small portion of each drive is reserved in Slice 7 for use by DiskSuite. The remainder of the space on each drive is placed into Slice 0. After adding a drive to a diskset, it may be repartitioned as necessary, provided that no changes are made to Slice 7. If Slice 7 starts at cylinder 0, and is large enough to contain a state database replica, the disk is not repartitioned.
When drives are added to a diskset, DiskSuite re-balances the state database replicas across the remaining drives. Later, if necessary, you can change the replica layout with the metadb(1M) command.
To create a diskset, root must be a member of group 14, or the /.rhosts file on each host must contain an entry for the other host.

6.7.1 ) Creating Two Disksets

host1# metaset -s diskset0 -a -h host1 host2
host1# metaset -s diskset1 -a -h host1 host2
host1# metaset
Set name = diskset0, Set number = 1
Host Owner
host1
host2
Set name = diskset1, Set number = 2
Host Owner
host1
host2

6.7.2 ) Adding Drives to a Diskset

host1# metaset -s diskset0 -a c1t2d0 c1t3d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0

host1# metaset
Set name = diskset0, Set number = 1
Host Owner
host1 Yes
host2

Drive Dbase
c1t2d0 Yes
c1t3d0 Yes
c2t2d0 Yes
c2t3d0 Yes
c2t4d0 Yes
c2t5d0 Yes

Set name = diskset1, Set number = 2
Host Owner
host1
host2

6.7.3 ) Creating a Mirror in a Diskset

# metainit -s diskset0 d51 1 1 /dev/dsk/c0t0d0s2
diskset0/d51: Concat/Stripe is setup

# metainit -s diskset0 d52 1 1 /dev/dsk/c1t0d0s2
diskset0/d52: Concat/Stripe is setup

# metainit -s diskset0 d50 -m d51
diskset0/d50: mirror is setup

# metattach -s diskset0 d50 d52
diskset0/d50: Submirror d52 is attached

7.0 Troubleshooting

7.1 ) Recovering from Stale State Database Replicas

Problem: State database corrupted or unavailable.
Causes: Disk failure, disk I/O error.
Symptoms: An error message at boot time if the available databases are <= 50% of the total. The system comes up in single-user mode.

ok boot
...
Hostname: host1
metainit: host1: stale databases

Insufficient metadevice database replicas located.
Use metadb to delete databases which are broken.
Ignore any "Read-only file system" error messages.
Reboot the system when finished to reload the metadevice database.
After reboot, repair any broken database replicas which were deleted.

Type Ctrl-d to proceed with normal startup,
(or give root password for system maintenance):

Entering System Maintenance Mode.

1.) Use the metadb command to look at the metadevice state database and see which state database replicas are not available; they are marked by "unknown" and the M flag:

# /usr/opt/SUNWmd/metadb -i
        flags          first blk    block count
     a m  p  lu        16           1034         /dev/dsk/c0t3d0s3
     a    p  l         1050         1034         /dev/dsk/c0t3d0s3
     M    p            unknown      unknown      /dev/dsk/c1t2d0s3
     M    p            unknown      unknown      /dev/dsk/c1t2d0s3

2.) Delete the state database replicas on the bad disk using the -d option to the metadb(1M) command.
At this point, the root (/) file system is read-only. You can ignore the mddb.cf error messages:

# /usr/opt/SUNWmd/metadb -d -f c1t2d0s3
metadb: demo: /etc/opt/SUNWmd/mddb.cf.new: Read-only file system

Verify the deletion:

# /usr/opt/SUNWmd/metadb -i
        flags          first blk    block count
     a m  p  lu        16           1034         /dev/dsk/c0t3d0s3
     a    p  l         1050         1034         /dev/dsk/c0t3d0s3

3.) Reboot.

4.) Use the metadb command to add back the state database replicas, and check that the state database replicas are correct:

# /usr/opt/SUNWmd/metadb -a -c 2 c1t2d0s3
# /usr/opt/SUNWmd/metadb
        flags          first blk    block count
     a m  p  luo       16           1034         /dev/dsk/c0t3d0s3
     a    p  luo       1050         1034         /dev/dsk/c0t3d0s3
     a        u        16           1034         /dev/dsk/c1t2d0s3
     a        u        1050         1034         /dev/dsk/c1t2d0s3
7.2) Metadevice Errors:

Problem: Submirrors out of sync, in the "Needs maintenance" state.
Causes: Disk problem/failure, improper shutdown, communication problems between the two mirrored disks.
Symptoms: "Needs maintenance" errors in the metastat output:

# /usr/opt/SUNWmd/metastat
d0: Mirror
    Submirror 0: d10
      State: Needs maintenance
    Submirror 1: d20
      State: Okay
...
d10: Submirror of d0
    State: Needs maintenance
    Invoke: "metareplace d0 /dev/dsk/c0t3d0s0 <new device>"
    Size: 47628 blocks
    Stripe 0:
        Device             Start Block  Dbase  State        Hot Spare
        /dev/dsk/c0t3d0s0  0            No     Maintenance

d20: Submirror of d0
    State: Okay
    Size: 47628 blocks
    Stripe 0:
        Device             Start Block  Dbase  State        Hot Spare
        /dev/dsk/c0t2d0s0  0            No     Okay

Solution:

1.) If the disk is all right, enable the failed metadevice with the metareplace command.
If the disk has failed, replace the disk, create partitions similar to those on the failed disk, and enable the new device with the metareplace command:

# /usr/opt/SUNWmd/metareplace -e d0 c0t3d0s0
Device /dev/dsk/c0t3d0s0 is enabled

2.) If the disk has failed and you want to move the failed device to a new disk with a different ID (cNtNdN), add the new disk, format it to create a partition scheme similar to that of the failed disk, and use the metareplace command:

# /usr/opt/SUNWmd/metareplace d0 c0t3d0s0 <new device>

The metareplace command above can also be used for concat or stripe replacement in a volume, but that would involve restoring from backup if the volume is not mirrored.


Taken from: http://www.adminschoice.com/docs/solstice_disksuite.htm

Copyright ©2008 PreciousTulips. All rights reserved.