<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Ardent Performance Computing</title>
	<atom:link href="http://www.ardentperf.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.ardentperf.com</link>
	<description>Jeremy Schneider</description>
	<lastBuildDate>Thu, 02 Sep 2010 22:32:08 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0.1</generator>
		<item>
		<title>RAC Investigation on Low-Memory Linux</title>
		<link>http://www.ardentperf.com/2010/09/02/rac-investigation-on-low-memory-linux/</link>
		<comments>http://www.ardentperf.com/2010/09/02/rac-investigation-on-low-memory-linux/#comments</comments>
		<pubDate>Thu, 02 Sep 2010 19:05:44 +0000</pubDate>
		<dc:creator>Jeremy</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://www.ardentperf.com/?p=987</guid>
		<description><![CDATA[Back in the Oracle 9i days, I was one of those people who got on eBay to buy firewire PCI cards and disks that could do non-exclusive login.  Remember that?  The first time a little test cluster could be cheap enough for the home enthusiast?  I still have the parts in my closet. Of course, [...]]]></description>
			<content:encoded><![CDATA[<p>Back in the Oracle 9i days, I was one of those people who got on eBay to buy firewire PCI cards and disks that could do non-exclusive login.  Remember that?  The first time a little test cluster could be cheap enough for the home enthusiast?  I still have the parts in my closet.</p>
<p>Of course, we all know what happened after that &#8211; virtualization.  It didn&#8217;t take long before my home-built test clusters were running on VMware.  (Personally, I think that virtualization really started because of those NES and SNES emulators.  Most great achievements start with a geek who wants to play more video games.)  There are lots of people now who run RAC on virtual environments and it&#8217;s easy to find tutorials on the web for many different OS and VM combinations.</p>
<h4>Low-Memory Linux</h4>
<p>Something I haven&#8217;t seen many other people do is RAC with a very small memory configuration.  Like 760M of memory per server.  (!)  Of course you&#8217;d only do this for a hobby setup &#8211; never on a system where you want any kind of support.  But I&#8217;m kinda cheap&#8230; and running RAC on these small VMs means that I don&#8217;t have to go buy an expensive new home computer.  My current Gateway laptop with Vista Home does the job quite nicely!</p>
<p>10.2 and 11.1 RAC will install and run on servers with 760M of memory.  But things were a little unstable at first.  Now I&#8217;m the curious type&#8230; I like to fiddle with things&#8230; so I investigated a little bit.</p>
<h4>Basic Unix Investigation</h4>
<p>There are two basic investigation scenarios:</p>
<table>
<tr>
<td>what happened in the past</td>
<td>My main tool is <strong>sar</strong> (System Activity Reporter).  Or <a href="http://sourceforge.net/projects/ksar/">Java-based ksar</a> on my desktop &#8211; it gets data via ssh and graphs it.</td>
</tr>
<tr>
<td>what is happening now</td>
<td>My starting point is <strong>vmstat</strong> and <strong>top</strong>.  To dig a little deeper, I might then use other tools like <strong>ps</strong>, <strong>free</strong>, <strong>iostat</strong> or <strong>netstat</strong>.</td>
</tr>
</table>
<p>In this particular case, I noticed pretty quickly from the <strong>top</strong> utility that one process was consuming over 30% of the system&#8217;s memory!  <span id="more-987"></span><em>(Note: in <strong>top</strong>, you can press the &#8216;&lt;&#8217; and &#8216;&gt;&#8217; keys to move the sort column left and right.  The initial sort column is %CPU.  I moved it one column to the right, sorting by %MEM.)</em></p>
<pre><code>top - 18:41:13 up  5:28,  3 users,  load average: 0.04, 0.35, 0.60
Tasks: 180 total,   2 running, 178 sleeping,   0 stopped,   0 zombie
Cpu(s):  0.7%us,  3.7%sy,  0.0%ni, 89.4%id,  6.3%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:    767020k total,   754296k used,    12724k free,     7084k buffers
Swap:  1540088k total,   654696k used,   885392k free,   361800k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
<span style='background-color:#DFDFDF'>17435 oracle    RT   0  230m 229m  31m S  0.3 30.6   0:26.21 /u01/crs/oracle/product/11.1.0/crs/bin/ocssd.bin</span>
 7765 oracle    15   0  464m 111m 102m S  0.0 14.9   0:04.52 ora_smon_RAC1
 7743 oracle    -2   0  445m  90m  83m S  0.0 12.1   0:12.01 ora_lms0_RAC1
 7783 oracle    15   0  440m  70m  66m S  0.0  9.4   0:05.35 ora_mmon_RAC1
20321 oracle    15   0  438m  48m  45m S  0.0  6.5   0:00.88 oracleRAC1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
 8676 oracle    16   0  437m  46m  45m S  0.0  6.2   0:02.69 oracleRAC1 (DESCRIPTION=(LOCAL=YES)(ADDRESS=(PROTOCOL=beq)))
 8801 oracle    15   0  440m  44m  41m S  0.0  6.0   0:04.57 ora_cjq0_RAC1</code></pre>
<p>This process (ocssd) is Oracle&#8217;s <em>Cluster Synchronization Services</em> Daemon.  It&#8217;s the process that sends and receives heartbeats from other nodes.  Any delays sending or receiving those heartbeats can cause node evictions (a.k.a. server reboots) &#8211; so it&#8217;s a pretty important process!  That&#8217;s why it runs with realtime (RT) scheduling priority, as you can see in the above output from <strong>top</strong>.</p>
<p>I was surprised that CSS uses so much physical memory &#8211; usually Linux is very good at memory management.  In <strong>top</strong>, the VIRT column shows how much total memory each process is using, while the RES column shows how much actual physical memory Linux has allocated to it.  It&#8217;s clear that Linux is pretty actively managing the physical memory for other processes.</p>
<p>A quick glance at <strong>vmstat</strong> shows that although we are actively swapping, it seems under control.  This is about what I&#8217;d expect when we&#8217;re idle and all of the processes except CSS are sharing only 500M of memory:</p>
<pre><code>collabn1:/home/oracle[RAC1]$ vmstat 5
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0 604144  14024  34948 300632    0    0    32    86 1044 1797  1  8 84  7  0
 2  1 603912  11224  34968 303968   87    0   763    81 1083 1902  3 11 56 30  0
 0  0 604840  12660  34912 303904   25    0   253    38 1050 1975  2 11 69 18  0
 0  0 604840  12728  34924 303900    0    0    68   106 1043 1975  1  9 82  8  0
 0  0 604784  14536  34936 304140   19    0    89    71 1043 2042  1  5 81 13  0
 1  0 604752  14168  34944 304496    6    0   119    21 1040 2016  2  8 80 11  0
 0  1 604736  18732  35020 306212    0    0   384    73 1050 2489  1 10 70 19  0
 1  1 604736  11144  35232 311252  104    0  1209   128 1074 2024  2 12 38 47  0
 3  1 607900   8056  30788 307352   77 1500   504  1661 1055 2642  9 16 56 19  0
 0  0 607900   9836  30800 307360    0    0    71    70 1031 1798  1  6 85  8  0
 1  0 607884  10536  30812 307400    8    0    44    93 1031 1832  1  4 87  8  0</code></pre>
<p>The so column tells us when memory is written out to swap on disk (and removed from physical memory).  The si column tells us when memory is read back in from disk (and put back in physical memory).  On a side note, remember that on a healthy Unix system the free memory is always small, because the kernel uses otherwise-idle memory for buffers and cache.  Sometimes this is confusing at first.</p>
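<p>A small filter can make the swapping intervals easier to spot.  The sketch below is only an illustration: the function name swap_activity is made up, and it assumes the column layout shown above, where si and so are the 7th and 8th fields.</p>
<pre><code># Hypothetical helper: print only the vmstat rows with swap-in or swap-out activity.
# Assumes si is field 7 and so is field 8, as in the output above.
swap_activity() {
    awk '$7 + $8 != 0'
}

# Usage (runs until interrupted):
#   vmstat 5 | swap_activity</code></pre>
<p>The header lines drop out as well, because their 7th and 8th fields are not numbers and awk treats them as zero.</p>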
<h4>Linux Process Memory Investigation</h4>
<p>Nonetheless, I&#8217;m not happy that CSS is using 30% of my physical memory in this highly-constrained hobby environment.  Why is Linux allowing this?  The first clue comes simply from the output of the familiar Unix <strong>ps</strong> utility:</p>
<pre><code>collabn1:/home/oracle[RAC1]$ ps v -C ocssd.bin
  PID TTY      STAT   TIME  MAJFL   TRS   DRS   RSS %MEM COMMAND
17435 ?        SLl    0:08      7   588 235331 234840 30.6 /u01/crs/oracle/product/11.1.0/crs/bin/ocssd.bin</code></pre>
<p>On Linux, the &#8220;v&#8221; flag tells ps to give information relevant to virtual memory.  TRS and DRS tell me how much physical (resident) memory is used for machine executable code (text) and data, respectively.  But more importantly &#8211; the STAT column gives some informative BSD-style flags about the process.  That capital-L indicates that CSS has some pages that are locked into physical memory.  Bingo.</p>
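<p>The same flag can be used to hunt for any other processes with locked pages.  A minimal sketch, assuming the procps version of <strong>ps</strong> found on Linux; the function name is hypothetical:</p>
<pre><code># Hypothetical helper: keep only rows whose STAT field contains a capital L
# (pages locked into memory).  Expects "pid stat comm" columns on stdin.
locked_procs() {
    awk '$2 ~ /L/'
}

# Usage:
#   ps -eo pid,stat,comm | locked_procs</code></pre>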
<p>If I have root access, then I can get a very detailed report on process memory usage with the <strong>pmap</strong> command.  The output was a little long, so I&#8217;ve abbreviated it here:</p>
<pre><code>[root@collabn1 ~]# pmap -x 17435
17435:   /u01/crs/oracle/product/11.1.0/crs/bin/ocssd.bin
Address   Kbytes     RSS    Anon  Locked Mode   Mapping
00110000     656       -       -       - r-x--  libhasgen11.so
001b4000       8       -       -       - rwx--  libhasgen11.so
<span style='background-color:#DFDFDF'>  ... 37 more library blocks</span>
02a0d000     100       -       -       - rwx--    [ anon ]
02a26000       4       -       -       - --x--    [ anon ]
02a27000   10240       -       -       - rwx--    [ anon ]
03427000       4       -       -       - --x--    [ anon ]
03428000   10240       -       -       - rwx--    [ anon ]
<span style='background-color:#DFDFDF'>  ... 10 more anonymous blocks, half are 10240K</span>
08048000     592       -       -       - r-x--  ocssd.bin
080dc000       4       -       -       - rwx--  ocssd.bin
<span style='background-color:#DFDFDF'>  ... 30 more anonymous blocks, half are 10240K</span>
bfe40000     148       -       -       - rwx--    [ stack ]
bfe65000       8       -       -       - rw---    [ anon ]
-------- ------- ------- ------- -------
total kB  235920       -       -       -</code></pre>
<p>Interestingly, the Linux <strong>pmap</strong> utility does not indicate any locked memory!  I don&#8217;t know whether that output column is non-functional or if it refers only to some particular kind of locking.  But at any rate, I know <em>something</em> is locked.  I couldn&#8217;t think of anything better, so the next place I looked was in the Linux /proc pseudo-filesystem.</p>
<pre><code>collabn1:/home/oracle[RAC1]$ grep Vm /proc/17435/status
VmPeak:   235924 kB
VmSize:   235920 kB
<span style='background-color:#DFDFDF'>VmLck:    235920 kB</span>
VmHWM:    234844 kB
VmRSS:    234840 kB
VmData:   202272 kB
VmStk:       156 kB
VmExe:       592 kB
VmLib:     31960 kB
VmPTE:       268 kB</code></pre>
<p>Now we&#8217;re talking. The process has a total of 235920 kB of memory &#8211; and it&#8217;s <em>ALL</em> locked.  On a normal RAC system you&#8217;d want this. Generally, important realtime processes should be locked so that they are never delayed by paging or swapping.  (Remember how that could cause node reboots?)</p>
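<p>To see how much memory every process on the box has locked, the same field can be read for each PID.  This is a sketch under the assumption that the kernel exposes VmLck in /proc/PID/status as shown above; the function name is made up for illustration.</p>
<pre><code># Hypothetical helper: print "kB name" for one status file if VmLck is non-zero.
locked_kb() {
    awk '/^Name:/ {name = $2} /^VmLck:/ {if ($2 + 0 != 0) print $2, name}' "$1"
}

# Usage: report every process with locked memory, largest first.
#   for f in /proc/[0-9]*/status; do locked_kb "$f"; done | sort -rn</code></pre>
<p>A process that exits while the loop is running will make awk print an error for the missing file, which is harmless here.</p>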
<p>But I personally doubt that all of the memory really NEEDS to be locked, and I think that Linux will actually do a decent job of not swapping the most important parts.  And my highly constrained hobby environment will probably run much smoother if Linux has more flexibility when managing a measly 760M of memory.</p>
<h4>Unlocking Linux Process Memory</h4>
<p>But is it actually possible to unlock the process memory?  As far as I know, Oracle provides no option to disable CSS memory locking.  (For good reason.)  There is the <a href="http://linux.die.net/man/2/munlockall">system call munlockall()</a> &#8211; which unlocks all of the calling process&#8217;s memory.  But that means the CSS process itself would have to call this function.  And of course it will not.  Or will it?</p>
<p>If you&#8217;ve got root, then there&#8217;s a hacker-back-door way of doing this.  Remember, you&#8217;d be crazy to try this anywhere besides a dark closet at home.  And if you type too slowly then CSS could reboot your machine.  </p>
<p>But watch this&#8230;</p>
<pre><code>[root@collabn1 ~]# grep Vm /proc/17435/status
VmPeak:   235924 kB
VmSize:   235920 kB
VmLck:    235920 kB
VmHWM:    234844 kB
VmRSS:    234840 kB
VmData:   202272 kB
VmStk:       156 kB
VmExe:       592 kB
VmLib:     31960 kB
VmPTE:       268 kB

[root@collabn1 ~]# gdb -p 17435 &lt;&lt;EOF
&gt; call munlockall()
&gt; quit
&gt; EOF

GNU gdb Fedora (6.8-27.el5)
Copyright (C) 2008 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later &lt;http://gnu.org/licenses/gpl.html&gt;
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "i386-redhat-linux-gnu".
Attaching to process 17435
Reading symbols from /u01/crs/oracle/product/11.1.0/crs/bin/ocssd.bin...done.
<span style='background-color:#DFDFDF'>  ... 17 more Reading/Loading symbols.</span>
[Thread debugging using libthread_db enabled]
[New Thread 0xb7f629f0 (LWP 17435)]
<span style='background-color:#DFDFDF'>  ... 19 more New Threads.</span>
Loaded symbols for /lib/libpthread.so.0
Reading symbols from /lib/libnsl.so.1...done.
<span style='background-color:#DFDFDF'>  ... 13 more Loaded/Reading symbols.</span>
0x008e7402 in __kernel_vsyscall ()
(gdb) $1 = 0
(gdb) The program is running.  Quit anyway (and detach it)? (y or n) [answered Y; input not from terminal]
Detaching from program: /u01/crs/oracle/product/11.1.0/crs/bin/ocssd.bin, process 17435

[root@collabn1 ~]# grep Vm /proc/17435/status
VmPeak:   235924 kB
VmSize:   235920 kB
VmLck:         0 kB
VmHWM:    234844 kB
VmRSS:    234840 kB
VmData:   202272 kB
VmStk:       156 kB
VmExe:       592 kB
VmLib:     31960 kB
VmPTE:       268 kB</code></pre>
<p>Ha. This is something you won&#8217;t find on Metalink. After generating some activity on the system, the <strong>top</strong> utility shows me that Linux has significantly reduced the physical memory used by CSS.</p>
<p>These days I&#8217;ve actually scripted this for my home and classroom VM environments.  I haven&#8217;t done a careful comparison or analysis, but it really has seemed to me that my low-memory Linux systems run noticeably smoother.</p>
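<p>For anyone who wants to try the same thing, here is a rough sketch of what such a script could look like.  It is not the script mentioned above, just an illustration built from the commands already shown; the function name and the GDB variable are made up, and the same warnings apply: root only, hobby systems only.</p>
<pre><code># Hypothetical sketch: ask a running process to call munlockall() via gdb.
# GDB can be overridden (for example with echo) to preview the command.
unlock_pid() {
    "${GDB:-gdb}" -batch -ex 'call munlockall()' -p "$1"
}

# Usage, as root:
#   for pid in $(pgrep -x ocssd.bin); do
#       unlock_pid "$pid"
#       grep VmLck "/proc/$pid/status"
#   done</code></pre>
<p>Running gdb in batch mode means it exits, and so detaches, as soon as the call returns, which should keep the time that CSS spends stopped under the debugger short.</p>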
]]></content:encoded>
			<wfw:commentRss>http://www.ardentperf.com/2010/09/02/rac-investigation-on-low-memory-linux/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>DBCA Missing – Oracle 11.2 ASM/Grid</title>
		<link>http://www.ardentperf.com/2010/08/16/dbca-missing-oracle-11-2-asmgrid/</link>
		<comments>http://www.ardentperf.com/2010/08/16/dbca-missing-oracle-11-2-asmgrid/#comments</comments>
		<pubDate>Mon, 16 Aug 2010 21:54:04 +0000</pubDate>
		<dc:creator>Jeremy</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://www.ardentperf.com/?p=969</guid>
		<description><![CDATA[Oracle provides three ways to manage ASM: (1) through SQL, (2) through the web-based database console or grid control, and (3) through the server-based java GUI tool DBCA.  These are your choices for adding storage, replacing a disk, growing a volume, etc. But if you&#8217;re an experienced DBA who recently started playing with 11gR2 ASM, [...]]]></description>
			<content:encoded><![CDATA[<p>Oracle provides three ways to manage ASM: (1) through SQL, (2) through the web-based database console or grid control, and (3) through the server-based Java GUI tool DBCA.  These are your choices for adding storage, replacing a disk, growing a volume, etc. But if you&#8217;re an experienced DBA who recently started playing with 11gR2 ASM, you may have been surprised to find that DBCA is missing from the ASM/grid installation!</p>
<p>DBCA is not missing by mistake &#8211; this is a change in Oracle 11g Release 2.  There is a new server-based Java GUI tool for ASM management: <strong>ASMCA</strong>.</p>
<h4>A Few ASM-related 11gR2 Changes</h4>
<p>You might remember that there haven&#8217;t been <em>major</em> changes in Oracle clustering since version 10gR1 &#8211; when Oracle introduced their Clusterware package. But 11gR2 brings big changes and big new features to database clustering.</p>
<p>Here are a few that I found very noticeable when installing a cluster:</p>
<div id="attachment_972" class="wp-caption alignright" style="width: 160px"><a href="http://www.ardentperf.com/wp-content/uploads/2010/08/asmca.png"><img class="size-thumbnail wp-image-972" title="ASMCA" src="http://www.ardentperf.com/wp-content/uploads/2010/08/asmca-150x150.png" alt="" width="150" height="150" /></a><p class="wp-caption-text">New Tool: ASMCA</p></div>
<ul>
<li>You can&#8217;t defer storage configuration until after CRS installation.  Furthermore, raw/block storage is no longer supported.  Before you begin the CRS install you must either (1) install and configure a Cluster Filesystem or (2) prepare storage for ASM.</li>
<li>The CRS install includes ASM.  This means that the bare-minimum space requirements are higher by about 3 or 4 GB.  (I used to run ASM and DB from a single home for VMware test/educational environments.)</li>
<li>You can&#8217;t do any ASM operations when connected as sysdba!  A new role called &#8220;sysasm&#8221; was introduced in 11gR1&#8230; it&#8217;s now the <em>only</em> way to manage ASM.  Connect / as sysasm.</li>
<li>There&#8217;s a new tool called ASMCA to manage ASM.  DBCA isn&#8217;t even installed in the ASM/Grid home.  This tool is pretty slick &#8211; it also manages the new ASM dynamic volumes and ASM cluster filesystems.</li>
</ul>
<p>I noticed that there wasn&#8217;t much on Google about ASM and DBCA.  Hopefully this will save you a few minutes of head-scratching if you&#8217;re trying out 11g Release 2 for the first time.  Happy testing!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ardentperf.com/2010/08/16/dbca-missing-oracle-11-2-asmgrid/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Michigan OakTable &#8211; Illinois Visitors?</title>
		<link>http://www.ardentperf.com/2010/07/29/michigan-oaktable-illinois-visitors/</link>
		<comments>http://www.ardentperf.com/2010/07/29/michigan-oaktable-illinois-visitors/#comments</comments>
		<pubDate>Thu, 29 Jul 2010 15:39:28 +0000</pubDate>
		<dc:creator>Jeremy</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://www.ardentperf.com/?p=879</guid>
		<description><![CDATA[Quite a few people have already plugged this event (Charles Hooper, Jonathan Lewis, Tanel Poder, Doug Burns, Christian Antognini, Randolf Geist, etc) &#8211; but I want to chime in anyway.  It&#8217;s so close by and the price goes up in two days! The Thursday and Friday before OpenWorld (September 16-17), some kind folks in Michigan are [...]]]></description>
			<content:encoded><![CDATA[<p>Quite a few people have already plugged this event (<a href="http://hoopercharles.wordpress.com/2010/01/08/the-oaktable-network-invades-michigan-usa-advert/">Charles Hooper</a>, <a href="http://jonathanlewis.wordpress.com/2010/07/15/advert-odtug/">Jonathan Lewis</a>, <a href="http://blog.tanelpoder.com/2010/02/11/future-appearances-conferences-and-seminars/">Tanel Poder</a>, <a href="http://www.oracledoug.com/serendipity/index.php?/archives/1566-Advert-Michigan-Oak-Table-Symposium-registration-open.html">Doug Burns</a>, <a href="http://antognini.ch/2010/04/michigan-oaktable-symposium-2010/">Christian Antognini</a>, <a href="http://oracle-randolf.blogspot.com/2010/04/michigan-oaktable-symposium-mots-2010.html">Randolf Geist</a>, etc) &#8211; but I want to chime in anyway.  It&#8217;s <a href="http://maps.google.com/maps?f=q&amp;source=s_q&amp;hl=en&amp;geocode=FbGUfgId_JDG-inty_TQPCwOiDEAwMAJrabgrw%3BFZgAhQIdB3AC-ykzH0PUDbA8iDHitciGRvkJ2w&amp;q=chicago+to+ann+arbor&amp;sll=41.997718,-87.689879&amp;sspn=0.008005,0.01929&amp;ie=UTF8&amp;z=8&amp;saddr=chicago&amp;daddr=ann+arbor">so close by</a> and the price goes up in two days!</p>
<p>The Thursday and Friday before OpenWorld (September 16-17), some kind folks in Michigan are bringing many of the best and brightest minds (see them <a href="http://michiganoaktable.intuitwebsites.com/about.html">here</a> and <a href="https://www.speak-tech.com/seminars/seminar_detail.php?SeminarID=84">here</a>) to the Midwest to share their experience.  It&#8217;s like attending the best sessions from San Francisco &#8211; for less than a third of the cost.  The event is being called the <a href="http://michiganoaktable.intuitwebsites.com/">Michigan OakTable Symposium</a> or MOTS.</p>
<p>If anyone lives around the Midwest and won&#8217;t be at OpenWorld, then I think this would be especially good.  It&#8217;s only a two-day event!  But of course I&#8217;m sure there&#8217;ll also be folks who go to the Michigan Symposium and still catch a last-minute flight to San Francisco&#8230;</p>
<p>I&#8217;ve always been very enthusiastic about local user groups and independent talent, so this is really my kind of event.  I&#8217;m pretty sure that any enthusiasm you hear will be based solely on proven real-world usefulness &#8211; with no influence from sales commissions (direct or indirect).  It&#8217;ll probably be just as much relational as it is technical.  You can bring an especially hard problem that you can&#8217;t solve &#8211; maybe someone will have a useful tip or new approach.  But taking home an &#8220;answer&#8221; is insignificant compared with taking home a better problem-solving process.  If I were you, I&#8217;d come to <a href="http://michiganoaktable.intuitwebsites.com/Abstracts.html">these sessions</a> looking for that.</p>
<p>Anyone in Illinois who is heading over for this?  I think it would be worth the drive (only 4 hours from Chicago).  Maybe we could even arrange for Illinois folks to hang out together at some point.</p>
<p>Reminder &#8211; seating is limited and prices go up Sunday.  If you are interested, then you might want to try for management sign-off and event registration by tomorrow.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ardentperf.com/2010/07/29/michigan-oaktable-illinois-visitors/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ASM Mirroring &#8211; No Hot Spare Disk</title>
		<link>http://www.ardentperf.com/2010/07/15/asm-mirroring-no-hot-spare-disk/</link>
		<comments>http://www.ardentperf.com/2010/07/15/asm-mirroring-no-hot-spare-disk/#comments</comments>
		<pubDate>Thu, 15 Jul 2010 23:36:49 +0000</pubDate>
		<dc:creator>Jeremy</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://www.ardentperf.com/?p=690</guid>
		<description><![CDATA[Some time ago, we installed Oracle on a Sun box with 48 local 1TB disks (spread across six controllers). I explained to the storage and system teams that we would use ASM as the volume manager &#8211; and as such, it would take care of mirroring. One storage admin asked me which disk was the [...]]]></description>
			<content:encoded><![CDATA[<p>Some time ago, we installed Oracle on <a href="http://www.oracle.com/us/products/servers-storage/servers/x86/031210.htm">a Sun box with 48 local 1TB disks (spread across six controllers)</a>.  I explained to the storage and system teams that we would use ASM as the volume manager &#8211; and as such, it would take care of mirroring.  One storage admin asked me which disk was the hot spare. I replied that there was no hot spare.</p>
<p>What?!!  You call this a volume manager and it doesn&#8217;t support a hot spare disk?!  That&#8217;s it, we&#8217;re going back to filesystems!</p>
<p>Not so fast.</p>
<p>You&#8217;ll remember that the purpose of a hot spare disk is to restore protection in the event of a disk failure.  When the RAID controller or volume manager detects a failed disk (R.I.P.), it uses the spare and reproduces a copy of the complete now-missing disk (R.I.P.) by reading the mirror or parity data.</p>
<h4>Restoring Protection Without A Hot Spare Disk</h4>
<p>In fact, ASM does even better.  Instead of a <em>hot spare disk</em>, it has <strong>hot spare capacity</strong> &#8211; spread across all of the disks.  There are two immediate effects of this: (1) no loss of potential performance from a usable disk head sitting idle and (2) the hot spare capacity is available for an emergency situation (like the archivelog volume filling up).</p>
<p>How is this possible?  The key of course is understanding how ASM mirrors data &#8211; by <em>extent</em> rather than <em>disk</em>. <span id="more-690"></span>So if I allocate three extents on disk 1, the first could be mirrored on disk 10, the second on disk 11 and the third on disk 12. </p>
<p>Now if disk 1 fails, I don&#8217;t need to rebuild the disk &#8211; I only need to ensure that each extent is mirrored. For example: let&#8217;s copy the first extent to disk 11, the second to disk 12, and the third to disk 10.  Once everything has two copies, my data is protected &#8211; even though the first disk was never rebuilt!</p>
<p>You&#8217;ll also notice that the I/O for the rebuild operation is spread across all of the disks.  Faster and better.</p>
<p>So there actually is a &#8220;hot spare&#8221; of sorts &#8211; spare capacity instead of a spare disk. And unlike RAID, this hot spare isn&#8217;t optional: ASM will <em>always</em> reserve enough spare capacity to bring the diskgroup back to its proper level of redundancy after a failure.</p>
<h4>Failure Groups Mean A Little Less Usable Space</h4>
<p>WAIT WAIT WAIT!!  I know you&#8217;re excited, but before you press enter and execute that migration script, I need to tell you about one more very important detail.  How does ASM know what amount of spare capacity to reserve?  The size of the largest disk, right?  NO!!!</p>
<p>This is, in fact, what <strong>failure groups</strong> are for!!  (You always wondered, right?!)  Because ASM doesn&#8217;t mirror across <em>disks</em> &#8211; it mirrors across <em>failure groups</em>.  And accordingly, ASM will always reserve enough spare capacity to restore protection after losing the largest failure group.</p>
<p>Now it&#8217;s important to understand how ASM will report this.  Let&#8217;s look quickly at asmcmd&#8217;s <strong>lsdg</strong> command.  This command (11gR1) gives us four important numbers: Total_MB, Free_MB, Req_mir_free_MB and Usable_file_MB.  The <a href="http://download.oracle.com/docs/cd/E11882_01/server.112/e10500/asm_util004.htm#BABJBAEG">11gR2 Storage Administrator&#8217;s Guide</a> gives good descriptions of the output:</p>
<table>
<tr>
<td>
Total_MB
</td>
<td>
Size of the disk group in megabytes.
</td>
</tr>
<tr>
<td>
Free_MB
</td>
<td>
Free space in the disk group in megabytes, <strong>without regard to redundancy</strong>. From the V$ASM_DISKGROUP view.
</td>
</tr>
<tr>
<td>
Req_mir_free_MB
</td>
<td>
Amount of space that must be available in the disk group to restore full redundancy after the most severe failure that can be tolerated by the disk group. This is the REQUIRED_MIRROR_FREE_MB column from the V$ASM_DISKGROUP view.
</td>
</tr>
<tr>
<td>
Usable_file_MB
</td>
<td>
Amount of free space, <strong>adjusted for mirroring</strong>, that is available for new files. From the V$ASM_DISKGROUP view.
</td>
</tr>
</table>
<p>Putting it all together, we see that Req_mir_free_MB should be equal to the size of the largest failure group.  Secondly, Usable_file_MB is the most important number &#8211; that&#8217;s how much space we can <em>actually</em> use.  With normal (2-way) mirroring, it should always be half of the available raw free space, where available raw free space is Free_MB minus Req_mir_free_MB.</p>
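<p>The arithmetic is simple enough to express in a couple of lines of shell.  A minimal sketch, assuming normal (2-way) redundancy by default; the function name is made up and the numbers in the usage line are invented for illustration:</p>
<pre><code># Hypothetical helper: usable space = (Free_MB - Req_mir_free_MB) / copies.
# Arguments: Free_MB, Req_mir_free_MB, number of mirror copies (default 2).
usable_file_mb() {
    echo $(( ($1 - $2) / ${3:-2} ))
}

# Usage:
#   usable_file_mb 1000 200    # prints 400</code></pre>
<p>Note that the result goes negative as soon as Free_MB drops below Req_mir_free_MB.</p>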
<h4>Example</h4>
<p>So let&#8217;s see if we can observe this on a system in the wild.  We&#8217;ll revisit one that I <a href="http://www.ardentperf.com/2009/03/12/asm-space-calculations-and-hot-spares/">briefly described last year</a>.</p>
<p>The system had 46 local disks (500G each).  The disks were attached to six controllers: two controllers had seven disks each, the other four controllers had eight disks each.  We configured ASM so that each controller is a failure group.</p>
<p>First let&#8217;s get the output of lsdg, which we&#8217;ll verify step-by-step:</p>
<pre><code>ASMCMD&gt; lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB   Free_MB  Req_mir_free_MB  Usable_file_MB  Offline_disks  Name
MOUNTED  NORMAL  N         512   4096  1048576  18041706  18041326          3137688         7451819              0  DATA01/</code></pre>
<p>Now &#8211; to begin our verification &#8211; let&#8217;s look at the exact size of a single disk:</p>
<pre><code># sfdisk -suM /dev/sdav1
401624968</code></pre>
<p>The Linux kernel is reporting KB and we need MB &#8211; so we&#8217;ll divide by 1024.</p>
<p>401624968 /1024 = 392211.8828125 M</p>
<p>ASM uses &#8220;allocation units&#8221; of 1M and can&#8217;t use a partial AU &#8211; so we truncate that to 392211.  Now there were 46 disks &#8211; so the total amount of space available should be 392211 * 46 = 18041706.  And this perfectly matches Total_MB from the lsdg output!  So far, so good!</p>
<p>Next, let&#8217;s look at the largest failure group.  The four large failure groups were each the same size with eight disks: 392211 * 8 = 3137688.  This perfectly matches Req_mir_free_MB from the lsdg output!  Lovely!</p>
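<p>These two checks can be reproduced with shell integer arithmetic, which truncates on division just like the partial-AU rule.  A sketch using only the numbers quoted above; the variable names are arbitrary:</p>
<pre><code># Recompute Total_MB and Req_mir_free_MB from the sfdisk output above.
disk_kb=401624968                 # size of one disk partition in KB
disk_mb=$(( disk_kb / 1024 ))     # whole 1M allocation units per disk
total_mb=$(( disk_mb * 46 ))      # 46 disks in the diskgroup
req_mir_mb=$(( disk_mb * 8 ))     # largest failure group has 8 disks
echo "$disk_mb $total_mb $req_mir_mb"</code></pre>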
<p>Finally, let&#8217;s see how much space I can actually use for datafiles and such &#8211; accounting for mirroring.  The available raw free space should be Free_MB minus our calculated Req_mir_free_MB: 18041326 &#8211; 3137688 = 14903638.  I&#8217;m using normal mirroring, so my available/usable free space should be half of that: 14903638 / 2 = 7451819.  Perfectly matches Usable_file_MB from the lsdg output.  Wow.</p>
<h4>The Big Caveat</h4>
<p>Sounds amazing, right?  Well there is one very important caveat.  I alluded to it earlier: Oracle will allow you to use this reserved &#8220;hot spare&#8221; capacity.  This is very different from a traditional &#8220;hot spare&#8221; disk.  What happens if you use the space?  (1) The &#8220;usable free space&#8221; <a href="http://www.ardentperf.com/2009/03/17/asm-negative-free-space/">reports a negative number</a>!!!  (2) If a disk fails, then you will not lose any data &#8211; but ASM cannot automatically restore protection until you replace the disk or give ASM additional storage somewhere else.</p>
<p>That sounds serious!  What&#8217;s worse, you can get into this situation without even realizing it.  If you create a datafile that&#8217;s bigger than the available space then ASM won&#8217;t warn you &#8211; it will just create the datafile and you lose your protection from double-failure.</p>
<p>You get alerts when your traditional filesystems are filling up.  (How many times has your inbox been flooded because of those stupid archivelogs?!)  If you deploy ASM, then MAKE SURE YOU HAVE ADEQUATE ALERTING.  Oracle&#8217;s management tools can help out, and there are other solutions too.  Don&#8217;t overlook it!</p>
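<p>As one example of a home-grown check, the output of lsdg can be scanned for diskgroups that are running low.  This is only a sketch: the function name and the threshold are made up, and it locates the Usable_file_MB and Name columns by their header text on the assumption that the header is the first line of output.</p>
<pre><code># Hypothetical check: read "asmcmd lsdg" output on stdin and warn about any
# diskgroup whose Usable_file_MB is below the threshold given as argument 1.
check_usable() {
    awk 'NR == 1 {
             for (i = 1; i != NF + 1; i++) {
                 if ($i == "Usable_file_MB") u = i
                 if ($i == "Name") n = i
             }
             next
         }
         u { print $n, $u }' |
    while read -r name usable; do
        if [ "$usable" -lt "$1" ]; then
            echo "WARNING: $name has only $usable MB usable"
        fi
    done
}

# Usage (threshold in MB):
#   asmcmd lsdg | check_usable 10240</code></pre>
<p>Because the comparison is a plain integer test, a negative Usable_file_MB is flagged too.</p>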
<h4>Conclusion</h4>
<p>At this point, it should be evident that anyone with a storage background needs to be careful around ASM.  It doesn&#8217;t work the same as a traditional volume manager or RAID controller.  Oracle Corp really does love SUPER EXCITING NEW WORDS (aka marketing buzz)&#8230; but this one isn&#8217;t just a matter of terminology. There is some real difference in the underlying concepts.</p>
<p>A few key takeaways:</p>
<ul>
<li>
<p>To automatically restore protection after a failure, ASM reserves spare capacity instead of a spare disk.</p>
</li>
<li>
<p>In V$ASM_DISKGROUP and asmcmd&#8217;s lsdg command, Total_MB and Free_MB report <strong>raw</strong> capacity while Usable_file_MB reports <strong>usable</strong> capacity based on the default mirroring level.*</p>
</li>
<li>
<p>ASM will allow you to use the spare capacity, after which it will report negative free space and won&#8217;t tolerate a double-failure.  Make sure you have adequate reporting so that this is always caught and addressed quickly.</p>
</li>
</ul>
<p><small>*Note: Various objects can have different mirroring levels in ASM.</small></p>
<p>Well I think that this should serve as a good primer on ASM and mirroring.  Let me know if I&#8217;ve left anything out!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ardentperf.com/2010/07/15/asm-mirroring-no-hot-spare-disk/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Five Reasons to do RAC Attack at Collaborate</title>
		<link>http://www.ardentperf.com/2009/04/06/five-reasons-to-do-rac-attack-at-collaborate/</link>
		<comments>http://www.ardentperf.com/2009/04/06/five-reasons-to-do-rac-attack-at-collaborate/#comments</comments>
		<pubDate>Mon, 06 Apr 2009 12:17:34 +0000</pubDate>
		<dc:creator>Jeremy</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://www.ardentperf.com/?p=738</guid>
		<description><![CDATA[Last week Dan broke the news that we&#8217;re bringing RAC Attack back to Collaborate! We&#8217;ve run this workshop several times now: it&#8217;s gotten better every time and we&#8217;ve always received overwhelmingly positive feedback. This is going to be a great workshop in Orlando that you don&#8217;t want to miss! A Short History We first ran [...]]]></description>
			<content:encoded><![CDATA[<p>Last week <a href="http://www.dannorris.com/2009/03/30/adv-rac-attack-hands-on-event-at-collaborate09/">Dan broke the news that we&#8217;re bringing RAC Attack back to Collaborate</a>! We&#8217;ve run this workshop several times now: it&#8217;s gotten better every time and we&#8217;ve always received overwhelmingly positive feedback. This is going to be a great workshop in Orlando that you don&#8217;t want to miss!</p>
<h4>A Short History</h4>
<p>We first ran the RAC Attack <a href="http://iougew.prod.web.sba.com/displaymod/SearchEvents.cfm?code=739&#038;detail=1&#038;conference_id=52&#038;searchType=8">last year at Collaborate in Denver</a>. On the first day of Collaborate we heard that one of the computer labs was going to be sitting empty for several hours on Wednesday. As a fervent believer in hands-on learning, I thought this sounded like a terrible waste!  So at the last minute we arranged to deliver the RAC Attack hands-on lab <a href="http://iougew.prod.web.sba.com/displaymod/SearchEvents.cfm?code=739&#038;detail=1&#038;conference_id=52&#038;searchType=8">during the open slot</a>.  If I remember right, we had something like 20 people and everyone seemed to come away having learned something new. The lab focused on getting a running system.</p>
<p>In August we ran <a href="http://www.ioug.org/networking/SIGs/RACAttackInformation.cfm">RAC Attack as a standalone workshop in Chicago</a>. In addition to having several excellent speakers, we dramatically expanded the scope of the lab to include topics like backup &#038; recovery, services, and clusterware. Even with a full day allocated, only a few people completed every possible exercise.</p>
<h4>Five Reasons To Sign Up</h4>
<p>This year <a href="http://www.ioug.org/collaborate09/attending/university.cfm#u18">at Collaborate we&#8217;re doing it again</a>. The lab will again be expanded and improved when we bring it to Orlando. Furthermore, <strong>I&#8217;m taking requests</strong> &#8211; if there&#8217;s a topic you&#8217;d like to learn about then leave a comment and I&#8217;ll try to get it included!  I&#8217;m not sure how quickly the session will fill, but I know that there are a limited number of seats &#8211; so you should sign up soon.</p>
<p>Here are five reasons I think you should come:</p>
<ol>
<li>
<p>Get exposure to and practice on all of the latest software, including VMware Server 2.0.1, Oracle Enterprise Linux (same as RedHat) 5.3, and Oracle RAC 11.1.0.7</p>
</li>
<li>
<p>There are hands-on challenges for every experience level. Meticulous step-by-step guide for beginners (with the choice to skip difficult steps), start-from-scratch or instant running environment for masters.</p>
</li>
<li>
<p>Learn from professionals with years of RAC experience. Several of us will be on-hand to answer questions and assist with the labs. If you have never touched a cluster database then we are here to help you complete the labs and understand what you&#8217;re doing!</p>
</li>
<li>
<p>Choose your own adventure. ASM or OCFS2. Parallel SQL or clusterware callouts. Got something specific you want to learn about? Leave a comment and I&#8217;ll try to get it included.</p>
</li>
<li>
<p>Walk out with a lab binder that tells you how to do it all on your own laptop after you get home. It won&#8217;t be possible to finish all the labs during the timeslot we have at Collaborate, but you can finish them later!</p>
</li>
</ol>
<p>Finances are tight and many companies are cutting training budgets &#8211; but remember that generally speaking, Collaborate is one of the best value opportunities for Oracle training that&#8217;s available in the United States. You have world-class educational sessions and networking opportunities, complemented by top-notch deep-dive university seminars such as RAC Attack.  (Well&#8230; I&#8217;m a little biased about this session!)  And with a slightly greater degree of independence you won&#8217;t just hear the corporate messages &#8211; it&#8217;ll be a little easier to find real-life users telling the great, the good, the bad and the ugly.</p>
<p>Hope to see you there!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ardentperf.com/2009/04/06/five-reasons-to-do-rac-attack-at-collaborate/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>ASM Negative Free Space</title>
		<link>http://www.ardentperf.com/2009/03/17/asm-negative-free-space/</link>
		<comments>http://www.ardentperf.com/2009/03/17/asm-negative-free-space/#comments</comments>
		<pubDate>Tue, 17 Mar 2009 13:03:57 +0000</pubDate>
		<dc:creator>Jeremy</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://www.ardentperf.com/?p=717</guid>
		<description><![CDATA[In the last post I showed mathematically how ASM calculates the free usable space that it displays. However one of the first questions I received after presenting that internally was from someone concerned about all of the unusable space being saved just in case we lose a disk. (In my worked example it was 1/6 [...]]]></description>
			<content:encoded><![CDATA[<p>In the last post I showed mathematically how ASM calculates the free usable space that it displays. However one of the first questions I received after presenting that internally was from someone concerned about all of the unusable space being saved just in case we lose a disk. (In my worked example it was 1/6 of the disk capacity! And what if we don&#8217;t care about restoring redundancy before we replace the failed disk?) </p>
<p>But it turns out that ASM actually handles this quite nicely &#8211; the &#8220;usable space&#8221; is used only for display and alerting. If a datafile needs to autoextend or if an archivelog needs to be created then ASM will keep using space until it completely exhausts everything physically available. It will go ahead and use the &#8220;reserved space&#8221; and just report a negative amount of free space! Here&#8217;s the output of my test confirming this behavior:</p>
<pre><code>SQL&gt; alter database datafile
2&gt;  '+TEST01/js1_rh5lab20/datafile/filler.256.681667743' resize 18G;

Database altered.

.

rh5lab20$ asmcmd dgs
State    Type    Rebal  Sector  Block       AU  Total_MB  Free_MB
MOUNTED  NORMAL  N         512   4096  1048576     38561     1529

  Req_mir_free_MB  Usable_file_MB  Offline_disks  Name
             7844           -3157              0  TEST01/

.

rh5lab20$ asmdf
Filesystem        SizeKB     UsedKB    AvailKB    Use% Mounted on
+TEST01         15727104   18959872   -3232768 120.56% TEST01</code></pre>
<p>There is also a second way to get negative usable space in ASM. If you do experience a failure, then ASM will restore your redundancy level by making new copies of the data that was on the failed disk(s). It will use the &#8220;reserved space&#8221; on your remaining disks. After the rebuild is complete, ASM will recalculate the amount of reserved space it needs to protect from a <em>second</em> failure. If you were using most of the space before the first failure then ASM might report a negative amount of usable space after the rebuild.</p>
<p>It&#8217;s important for you to remember that negative free space doesn&#8217;t mean your data isn&#8217;t safe &#8211; you just won&#8217;t be able to restore redundancy after a failure. In other words, if you have negative free space then you cannot physically survive <em>two</em> successive failures without data loss (or three with high redundancy/triple mirroring). But ASM allows the negative numbers because it&#8217;s smart enough not to hang your database when it can be avoided (for example if archivelogs fill up the reported usable space).</p>
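<p>The numbers in the test output above hang together, and a few lines of Python make the arithmetic explicit.  Treat this as a sketch based on my reading of the output: the truncation toward zero and the way asmdf derives its KB columns are assumptions that happen to reproduce these figures, not documented behavior.</p>
<pre><code># Values from the asmcmd output above (all in MB).
total_mb = 38561
free_mb = 1529
req_mir_free_mb = 7844
copies = 2  # normal redundancy keeps two copies of everything

# Assumption: the displayed value is truncated toward zero, as int() does.
usable_file_mb = int((free_mb - req_mir_free_mb) / copies)

# Assumption: asmdf builds its KB columns from the same figures.
size_kb = int((total_mb - req_mir_free_mb) / copies * 1024)
avail_kb = usable_file_mb * 1024
used_kb = size_kb - avail_kb
use_pct = 100 * used_kb / size_kb

print(usable_file_mb)              # -3157
print(size_kb, used_kb, avail_kb)  # 15727104 18959872 -3232768
print(f"{use_pct:.2f}%")           # 120.56%</code></pre>
<p>In other words, the reported &#8220;size&#8221; already excludes the reserve, which is why usage can climb past 100%.</p>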
]]></content:encoded>
			<wfw:commentRss>http://www.ardentperf.com/2009/03/17/asm-negative-free-space/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>ASM Space Calculations and Hot Spares</title>
		<link>http://www.ardentperf.com/2009/03/12/asm-space-calculations-and-hot-spares/</link>
		<comments>http://www.ardentperf.com/2009/03/12/asm-space-calculations-and-hot-spares/#comments</comments>
		<pubDate>Thu, 12 Mar 2009 18:16:08 +0000</pubDate>
		<dc:creator>Jeremy</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://www.ardentperf.com/?p=710</guid>
		<description><![CDATA[This is quick and dirty. I hope to get a more polished write-up of what it means and how it works, but since I&#8217;m so busy right now I don&#8217;t know if I&#8217;ll get to it soon. In the meantime someone just might find this useful or informative so I&#8217;m just going to put it [...]]]></description>
			<content:encoded><![CDATA[<p>This is quick and dirty. I hope to get a more polished write-up of what it means and how it works, but since I&#8217;m so busy right now I don&#8217;t know if I&#8217;ll get to it soon. In the meantime someone just might find this useful or informative so I&#8217;m just going to put it out there as it is.</p>
<p>This is a worked-through example of how ASM reports free space, taking into consideration the &#8220;hot spare capacity&#8221; that is reserved to restore redundancy after the largest failure group dies.</p>
<pre><code>ASM has 46 local disks attached to 6 controllers
- two controllers have 6 disks, other four have 8
- ASM has failure groups by controller

===========================================================================

Each disk is partitioned. Get the size of the ASM partition according to
the Linux kernel. All partitions are the same size.

# sfdisk -s /dev/sdav1
401624968

Now get the size info from ASM.

ASMCMD&gt; lsdg
State    Type    Rebal  Sector  Block       AU  Total_MB   Free_MB
MOUNTED  NORMAL  N         512   4096  1048576  18041706  18041326
Req_mir_free_MB  Usable_file_MB  Offline_disks  Name
        3137688         7451819              0  DATA01/

===========================================================================

-- size of individual disk in MB is kernel-reported KB divided by 1024
401624968 /1024 = 392211.8828125 M

-- ASM can't use partial AU's so truncate and we get Total_MB
392211 * 46 = 18041706

-- largest failure group [controller] has 8 disks so get Req_mir_free_MB
     (amount of "hot spare" capacity that is reserved to restore redundancy
     after a controller failure)
392211 * 8 = 3137688

-- asm reports that some space is used; use "Free_MB" to see total free space
= 18041326

-- with normal-mirroring, usable space is (totalfree-spare)/2
(18041326 -  3137688) / 2 = 7451819</code></pre>
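<p>The same arithmetic can be written as a small Python sketch.  The function and parameter names are mine (hypothetical, not anything ASM exposes), and it assumes equal-sized disks, a 1 MB allocation unit and normal redundancy, as in the example above.</p>
<pre><code>def asm_space(partition_kb, disk_count, largest_failgroup_disks,
              free_mb, copies=2):
    """Reproduce Total_MB, Req_mir_free_MB and Usable_file_MB by hand."""
    # ASM only uses whole allocation units, so drop the partial megabyte.
    disk_mb = partition_kb // 1024
    total_mb = disk_mb * disk_count
    # Reserve enough room to re-mirror the largest failure group.
    req_mir_free_mb = disk_mb * largest_failgroup_disks
    # Whatever is free beyond the reserve is shared by all mirror copies.
    usable_file_mb = (free_mb - req_mir_free_mb) // copies
    return total_mb, req_mir_free_mb, usable_file_mb


# Inputs: sfdisk size in KB, 46 disks, 8 disks on the biggest controller,
# and Free_MB as reported by lsdg.
print(asm_space(401624968, 46, 8, 18041326))
# (18041706, 3137688, 7451819)</code></pre>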
]]></content:encoded>
			<wfw:commentRss>http://www.ardentperf.com/2009/03/12/asm-space-calculations-and-hot-spares/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Future of OCFS2</title>
		<link>http://www.ardentperf.com/2009/02/05/future-of-ocfs2/</link>
		<comments>http://www.ardentperf.com/2009/02/05/future-of-ocfs2/#comments</comments>
		<pubDate>Fri, 06 Feb 2009 01:01:16 +0000</pubDate>
		<dc:creator>Jeremy</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://www.ardentperf.com/?p=698</guid>
		<description><![CDATA[At the company where I&#8217;m working right now, I&#8217;m part of an architecture effort to come up with our standard design for RAC on Linux across the firm. There will be dozens or possibly hundreds of deployments globally using the design we settle on. We&#8217;re internally debating whether or not we should include OCFS2 in [...]]]></description>
			<content:encoded><![CDATA[<p>At the company where I&#8217;m working right now, I&#8217;m part of an architecture effort to come up with our standard design for RAC on Linux across the firm. There will be dozens or possibly hundreds of deployments globally using the design we settle on.</p>
<p>We&#8217;re internally debating whether or not we should include OCFS2 in this design right now, and I&#8217;m curious if anyone has arguments one way or the other to share. Our standard design on Solaris does utilize a cluster filesystem and we would welcome a similar design, but there are some concerns about the readiness, stability and future of OCFS2.</p>
<p>OCFS2 is being considered for these four use cases:</p>
<ul>
<li>database binaries (vs local files or NFS)</li>
<li>diag top (11g) or admin tree (10g) (vs local files or NFS)</li>
<li>archived logs</li>
<li>backups</li>
</ul>
<p><br/></p>
<p>Other files will be stored in ASM.</p>
<p>I have seen mention in blogs such as <a href="http://bigdaveroberts.wordpress.com/">http://bigdaveroberts.wordpress.com/</a> of something called ASMFS in 11gR2 and I&#8217;m wondering &#8211; will this feature (if included) have any impact on Oracle&#8217;s commitment to OCFS2 development? Could Oracle conceivably develop a whole new cluster filesystem and put their full weight behind it as they did for ASM storage, leaving OCFS2 as a lower priority for new features and improvements? Has Oracle demonstrated significant commitment to OCFS2 development and support in the past, and is this a mature enough technology for wide-scale deployment?</p>
<p>Just looking for opinions. :)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ardentperf.com/2009/02/05/future-of-ocfs2/feed/</wfw:commentRss>
		<slash:comments>6</slash:comments>
		</item>
		<item>
		<title>Oracle ASM Stripe Size</title>
		<link>http://www.ardentperf.com/2008/12/10/oracle-asm-stripe-size/</link>
		<comments>http://www.ardentperf.com/2008/12/10/oracle-asm-stripe-size/#comments</comments>
		<pubDate>Wed, 10 Dec 2008 21:48:35 +0000</pubDate>
		<dc:creator>Jeremy</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://www.ardentperf.com/?p=665</guid>
		<description><![CDATA[Right now I&#8217;m working on some internal documentation and referencing Oracle 11g&#8217;s Storage Administrators Guide (ASM manual). And I think there&#8217;s a bit of a contradiction here about the stripe size. ASM uses a variable &#8220;data extent&#8221; size when allocating space for files (like any sane filesystem) &#8211; the first 20000 data extents are equal [...]]]></description>
			<content:encoded><![CDATA[<p>Right now I&#8217;m working on some internal documentation and referencing Oracle 11g&#8217;s <a href="http://download.oracle.com/docs/cd/B28359_01/server.111/b31107/toc.htm">Storage Administrators Guide (ASM manual)</a>. And I think there&#8217;s a bit of a contradiction here about the stripe size. ASM uses a variable &#8220;data extent&#8221; size when allocating space for files (like any sane filesystem) &#8211; the first 20000 data extents are equal to the diskgroup allocation unit (AU), the next 20000 are AU*8, etc. But it&#8217;s a little unclear to me how the stripe size is determined. Seems to me that you&#8217;d stripe &#8220;data extents&#8221; rather than allocation units and I think I&#8217;ve even heard before that &#8220;<em>extents</em> are mirrored [and presumably striped]&#8221; in ASM. In fact the illustration in the manual shows the stripes in exactly that manner:</p>
<div id="attachment_666" class="wp-caption alignnone" style="width: 310px"><a href="http://download.oracle.com/docs/cd/B28359_01/server.111/b31107/asmcon.htm#BABHCCDB"><img src="http://www.ardentperf.com/wp-content/uploads/2008/12/extents-300x194.gif" alt="ASM Extents" title="extents" width="300" height="194" class="size-medium wp-image-666" /></a><p class="wp-caption-text">ASM Extents</p></div>
<p>However the manual states that the opposite is true &#8211; striping based on AU instead of data extent &#8211; in two different places on that page. <span id="more-665"></span> First, <a href="http://download.oracle.com/docs/cd/B28359_01/server.111/b31107/asmcon.htm#BABCGDBF">just before the illustration</a>:</p>
<blockquote><p>The ASM coarse striping is always equal to the disk group AU size, but fine striping size always remains 128KB in any configuration (not shown in the figure). The AU size is determined at creation time with the allocation unit size (AU_SIZE) disk group attribute. The values can be 1, 2, 4, 8, 16, 32, and 64 MB.</p></blockquote>
<p>And again <a href="http://download.oracle.com/docs/cd/B28359_01/server.111/b31107/asmcon.htm#CJHCFBBI">right after it</a>:</p>
<blockquote><p>To stripe data, ASM separates files into stripes and spreads data evenly across all of the disks in a disk group. The stripes are equal in size to the effective AU. The coarse-grained stripe size is always equal to the AU size. The fine-grained stripe size always equals 128 KB; this provides lower I/O latency for small I/O operations such as redo log writes.</p></blockquote>
<p>As far as I can tell, it seems to be a contradiction in the manual. I commented on the documentation page and I&#8217;ve opened an SR; I&#8217;ll post an update when I get a reply.</p>
<p><strong>Update 4:45pm:</strong></p>
<p>The SR got answered very quickly (less than an hour) &#8211; and said that striping is based on the data extent size. The usage of the term &#8220;allocation unit&#8221; was what I found confusing (even in the SR resolution)&#8230; but at least we have an answer. :)</p>
<p>Suppose you have a diskgroup with 1MB AU size. After a particular file has 20000 extents, <em>the allocation unit for that file</em> becomes 8MB.</p>
<pre><code>allocation unit = extent size
40000&gt;extent&gt;20000 =&gt; FILE_AU=DISKGROUP_AU*8.</code></pre>
<p>Here&#8217;s the definitive quote from the SR: &#8220;[After 20000 extents] the 8meg AU will be striped, with the first 8meg AU on disk one, the second 8meg AU on disk 2 and so on.&#8221;</p>
<p>This would apply for both striping and mirroring. My gut feeling was right; I was just a bit confused by the <a href="http://download.oracle.com/docs/cd/B28359_01/server.111/b31107/asmcon.htm#sthref33">definitions of extents and allocation units in the manual</a> (which are incorrect).</p>
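<p>To make the variable extent size concrete, here&#8217;s a minimal Python sketch of the two tiers described in this post.  The function names and the zero-based extent numbering are my own assumptions, and it deliberately stops at extent 39999 because this post doesn&#8217;t spell out what comes after.</p>
<pre><code>EXTENTS_PER_TIER = 20000
# Tier number mapped to its AU multiplier; only the two tiers from this post.
TIER_MULTIPLIERS = {0: 1, 1: 8}


def extent_size_mb(extent_number, au_mb=1):
    """Size of one extent of a file, numbering the extents from zero."""
    tier = extent_number // EXTENTS_PER_TIER
    if tier not in TIER_MULTIPLIERS:
        raise ValueError("extent number is outside the tiers modelled here")
    return au_mb * TIER_MULTIPLIERS[tier]


def file_size_mb(extent_count, au_mb=1):
    """Total size of a file made of the first extent_count extents."""
    return sum(extent_size_mb(n, au_mb) for n in range(extent_count))


print(extent_size_mb(19999))  # 1: last extent of the first tier
print(extent_size_mb(20000))  # 8: first extent of the second tier
print(file_size_mb(20001))    # 20008: 20000 small extents plus one large one</code></pre>
<p>Per the SR answer, it&#8217;s these extent-sized pieces (1 MB, then 8 MB) that get striped across the disks and mirrored.</p>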
]]></content:encoded>
			<wfw:commentRss>http://www.ardentperf.com/2008/12/10/oracle-asm-stripe-size/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Robust Software Version Numbering</title>
		<link>http://www.ardentperf.com/2008/12/05/robust-software-version-numbering/</link>
		<comments>http://www.ardentperf.com/2008/12/05/robust-software-version-numbering/#comments</comments>
		<pubDate>Sat, 06 Dec 2008 01:55:51 +0000</pubDate>
		<dc:creator>Jeremy</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://www.ardentperf.com/?p=658</guid>
		<description><![CDATA[This article isn&#8217;t directly database-related, but I think it&#8217;s a great software engineering topic so it seemed worth writing about. Right now I&#8217;m involved in a project that involves releasing software packages of a few different flavors. Some of them are other people&#8217;s software that we&#8217;re re-packaging (like oracle database binaries) and some are code [...]]]></description>
			<content:encoded><![CDATA[<p>This article isn&#8217;t directly database-related, but I think it&#8217;s a great software engineering topic so it seemed worth writing about. Right now I&#8217;m involved in a project that involves releasing software packages of a few different flavors. Some of them are other people&#8217;s software that we&#8217;re re-packaging (like oracle database binaries) and some are code that was completely written and is maintained in-house. And as we&#8217;ve been working through the software development process this question has recently come to mind: how should we assign version numbers?</p>
<p>Actually it&#8217;s part of a bigger question of how we develop elegant software &#8211; other related design issues in this project are clear end-to-end traceability (requirement -> source code version -> released binary package), solid change control, integrated testing framework and readiness for automation (build and test). But the one I&#8217;m interested in right now is this: how can I develop a robust version numbering strategy?</p>
<p>First of all, there really isn&#8217;t a one-size-fits-all &#8220;correct&#8221; version numbering scheme. The best way to choose version numbers is influenced by your <a href="http://en.wikipedia.org/wiki/List_of_software_development_philosophies">software development philosophy</a> (and resultant <a href="http://en.wikipedia.org/wiki/Software_development_process">development lifecycle</a>), <a href="http://en.wikipedia.org/wiki/Project_management">project management approach</a> and development team. (Globally distributed informal volunteer network or local engineers on payroll?) So the best solution for my current project might not be the best solution for my next project.</p>
<p>But let&#8217;s take a closer look at this project. First, a little background on the art of version numbering. <span id="more-658"></span> In the open-source world, <a href="http://lists.xml.org/archives/xml-dev/200112/msg00327.html">many projects follow a three-part version</a>: <em>MAJOR.MINOR.PATCH</em>. Also, it&#8217;s clear that <a href="http://hubpages.com/hub/Software_Version_Numbers_Explained">the version number is closely tied to the development lifecycle</a>. You can find plenty of example discussion from open-source projects such as <a href="http://kerneltrap.org/Linux/Kernel_Release_Numbering_Redux">the Linux kernel</a>, <a href="http://arstechnica.com/articles/paedia/track-linux.ars/1">Linux desktop distros</a>, <a href="http://wiki.eclipse.org/index.php/Version_Numbering">Eclipse</a>, <a href="http://apr.apache.org/versioning.html">Apache APR</a> and <a href="http://blog.daemon.com.au/archives/000276.html">FarCry CMS</a>. Before the 2.6 series, the Linux kernel used an even-odd scheme (even minor numbers for stable series, odd for development) to distinguish between stable and unstable releases, but that has now been abandoned and most people seem to regard this as a good thing.</p>
<p>The best overview of the topic that I could find was <a href="http://en.wikipedia.org/wiki/Software_versioning">the wikipedia article on Software Versioning</a>. The section about pre-release versions does a good job of describing one requirement of my current project. I also found the section about internal version numbers interesting; I&#8217;ve worked on projects in the past with that strategy.</p>
<p>I found an <a href="http://www.linux.com/feature/45507">article on linux.com written by Nathan Willis a few years ago</a> offering some principles. I agree with his first and third suggestions: &#8220;Pick a numbering scheme, then don&#8217;t change it. Ever.&#8221; and &#8220;Make friends with infinity.&#8221; Best to think this thing out at the front end of a project (before 1.1) and you shouldn&#8217;t be bothered by big version numbers. However I don&#8217;t buy his second point about the meaning of the decimal; I think most everyone today is used to the meaning of the decimal in version numbers.</p>
<p>For some alternate viewpoints I came across <a href="http://www.panix.com/~zackw/exbib/2002/June/20">something Zack Weinberg wrote back in 2002</a>. He states strongly that the decimal point isn&#8217;t mathematical, which I agree with. However, contrary to his article, in our development lifecycle the test releases do need version numbers. But he has an interesting suggestion of using date-based numbers for development snapshots &#8211; that&#8217;s an idea which I&#8217;ll have to think about. His list of ways to do it wrong has a good point about distinguishing between version x and version x.0, although I&#8217;m not in full agreement with the other list items. I actually find <a href="http://www.gnu.org/prep/maintain/maintain.html#Test-Releases">the GNU suggestion for numbering test versions</a> very interesting &#8211; I&#8217;ll have to give that some thought. (Test versions for 4.6 are numbered 4.5.90 &#8211; 4.5.99)</p>
<p>Ultimately I don&#8217;t think either of these two sets of guidelines is robust enough for me &#8211; especially for repackaging the oracle database binaries.  Some of these might seem obvious, but I thought a good place to start might be by listing what I would consider to be requirements for an elegant and robust version numbering scheme:</p>
<ol>
<li>
elegant, clear, understandable version numbers that are as simple as possible
</li>
<li>
support for single-digit or double-digit versions (1, 2, 3, 4 or 1.0, 1.1, 1.2, 2.0)
</li>
<li>
versions for pre-release builds such as test builds (for the quality process) and early-access beta builds.
</li>
<li>
very clear difference between pre-release and release versions, so someone doesn&#8217;t mistakenly install a pre-release build when they want a production build.
</li>
<li>
names need to sort correctly &#8211; pre-release before release. it&#8217;s critical that RPM and RHN sort them right; it would be nice if unix &#8220;ls&#8221; did too.
</li>
<li>
support for continuing development.  after release 1.0, the 2.0-beta version should sort after release 1.0 and before release 2.0
</li>
<li>
support for special branch versions &#8211; version 1.0+patch12345, where patch12345 is not included in version 2.0 which has different patches
</li>
<li>
clear difference between pre-release and release versions on special branches.
</li>
</ol>
<p><br/></p>
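<p>To illustrate the sorting requirements (pre-release before release, and 2.0-beta between 1.0 and 2.0), here&#8217;s a minimal Python sketch of a sort key.  The <code>NUMBER-tag</code> format and the <code>version_key</code> function are assumptions for illustration only; this is not how RPM or RHN compare versions, so a real scheme would still have to be checked against those tools.</p>
<pre><code>def version_key(version):
    """Sort key: numeric parts first, then pre-release before release."""
    base, _, tag = version.partition("-")
    numbers = tuple(int(part) for part in base.split("."))
    # An empty tag means a final release. It must sort after every tagged
    # pre-release of the same number, so releases get rank 1 and
    # pre-releases get rank 0.
    rank = 0 if tag else 1
    return (numbers, rank, tag)


versions = ["2.0", "1.0", "2.0-beta", "1.1", "2.0-alpha"]
print(sorted(versions, key=version_key))
# ['1.0', '1.1', '2.0-alpha', '2.0-beta', '2.0']</code></pre>
<p>Note that a plain string sort (like unix &#8220;ls&#8221;) would put 2.0 <em>before</em> 2.0-beta, which is exactly the problem in the fifth requirement.</p>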
<p>I have to be honest &#8211; this is tough. I&#8217;ll be giving it some thought over the next few weeks. Let me know if you have any suggestions. :)</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ardentperf.com/2008/12/05/robust-software-version-numbering/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
