<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Ardent Performance Computing &#187; Technical</title>
	<atom:link href="http://www.ardentperf.com/category/technical/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.ardentperf.com</link>
	<description>Jeremy Schneider</description>
	<lastBuildDate>Mon, 13 Aug 2012 16:28:33 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.5.1</generator>
		<item>
		<title>Lessons from Africa, Part 2</title>
		<link>http://www.ardentperf.com/2012/08/13/lessons-from-africa-part-2/</link>
		<comments>http://www.ardentperf.com/2012/08/13/lessons-from-africa-part-2/#comments</comments>
		<pubDate>Mon, 13 Aug 2012 13:00:35 +0000</pubDate>
		<dc:creator>Jeremy Schneider</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://www.ardentperf.com/?p=1528</guid>
		<description><![CDATA[Last week was busy&#8230; making travel arrangements for this week&#8217;s trip to New York (technically Jersey) and some light analysis of AWR reports from exadata RAT runs and some heavy troubleshooting of a Solaris x86 RAC cluster with random node reboots. (I think I finally traced the node reboots to a kernel CPU/scheduling problem). I [...]]]></description>
				<content:encoded><![CDATA[<p>Last week was busy&#8230; making travel arrangements for this week&#8217;s trip to New York (technically Jersey) and some light analysis of AWR reports from exadata RAT runs and some heavy troubleshooting of a Solaris x86 RAC cluster with random node reboots.  (I think I finally traced the node reboots to a kernel CPU/scheduling problem).  I really did thoroughly enjoy my time in <a href="http://www.ardentperf.com/africa/" title="Africa">Africa</a> despite being nowhere near Oracle software &#8211; but it feels good to be working on challenging cluster problems again!</p>
<p>Before I completely forget the details from my work in Africa I want to wrap up my <a href="http://www.ardentperf.com/2012/08/08/lessons-from-rural-africa/" title="Lessons From Rural Africa">article about high-level lessons learned</a> earlier this week.  By the way, I&#8217;m not just stretching obscure aspects of my work in Africa to get stuff that sounds good.  I view these cultural lessons which we learned together as the most central and most important technical aspect of my work at the hospital.  And it might be surprising, but it&#8217;s true: the same cultural adjustments are important and oft-missed here in corporate America.</p>
<p>The <a href="http://www.ardentperf.com/2012/08/08/lessons-from-rural-africa/" title="Lessons From Rural Africa">first two lessons</a> were to (1) understand the fundamentals and (2) avoid unjustifiable complexity.  The remaining two lessons I want to talk about are slightly less technical but equally important.</p>
<ol start="3">
<li><strong>People</strong> first, Technology second
<p>I&#8217;m using the word &#8220;people&#8221; here to sum up three major components of our accomplishments at the hospital: organizational policy, user education and technical training.</p>
<p>First, a fun technical story:</p>
<hr/>
<p>On day 1 after our arrival at the hospital, several issues required immediate attention &#8211; I discussed one example in the previous article.  A second issue involved the network links to the outside world.  In particular, sites like Gmail were often completely unreachable.  I worked hard on this one.  Eventually I got a reproducible test case by connecting to an AWS instance very close to the other end of their link.  I compared low-level packet traces from both sides to see what was happening.</p>
<p>I could initiate a very large download and throttle the connection on our end, causing TCP window full messages to cascade all the way up to the AWS source server &#8211; normal behavior for throttling connections.  But I did notice that the source server sent a <em>very large</em> chunk of data before it started throttling to the same rate that I was demanding.</p>
<p>Next I opened a second connection with a different protocol on a different port.  The throttled connection wasn&#8217;t even using a fraction of our contracted bandwidth &#8211; but every single packet on the second port took a consistent 60 seconds or so to get through.  If I killed the download then things would zoom at normal high speeds again.</p>
<p>The killer was that the TLS handshake in HTTPS connections needed a response to the initial packet within 30 seconds in order to continue.  Small file downloads had no impact &#8211; but if a download was large enough, then SSL quit working completely &#8211; although HTTP would still (slowly) connect.</p>
<p>My best guess was that this network provider (not an African company BTW) was running some enormous cache in the middle combined with a single packet queue per customer &#8211; no fair queuing by TCP connection.  It doesn&#8217;t make complete sense but I haven&#8217;t yet thought of anything better.  Maybe those other packets were just stuck in line on a huge cache throttled by packets in front of them?  I actually didn&#8217;t think it was possible to configure network equipment to be this stupid&#8230; but maybe?  Whatever the cause, the end result was that a large download &#8211; even throttled &#8211; generally trashed our whole network connection.</p>
<p><span id="more-1528"></span>I did pursue a ticket with the network provider, but it was a lot of effort to keep the issue moving.  We discussed technical solutions on our end.  Block large downloads?  Sometimes they&#8217;re needed for the business.  I spent quite a bit of time researching various QoS solutions &#8211; and this was when I learned a very important lesson.  I learned a great deal from an ebook called <a href="http://bwmo.net/">How To Accelerate Your Internet</a> (with several good African case studies) and from <a href="http://www2.aau.org/renu/ws/afren07/docs/day2.3.%20afren_bw_opt.pdf">some slides from Christian Benvenuti</a> (International Centre for Theoretical Physics).</p>
<p>Communication and people can do things that technology can&#8217;t.</p>
<hr/>
<p>We weren&#8217;t able to resolve this issue in a &#8220;technical&#8221; way &#8211; so how did we solve it?  We had something that I&#8217;ll call <strong>our PEOPLE strategy</strong> &#8211; it addressed this challenge and many others too.  Here are a few elements of the strategy:</p>
<ol>
<li>We wrote a <strong>computer policy</strong> for the hospital, approved and enforceable at the top.
</li>
<li>Made sure a few tools were in place to identify violations.
</li>
<li>Set up a new wiki for the entire hospital community to use for any purpose.
</li>
<li>Designed the network so that every new device was redirected to a wiki page with the policy.
</li>
<li>Held extensive discussions with all staff and compound residents on several technical subjects.
</li>
<li>Hand-picked a few users for special training on several topics.
</li>
</ol>
<p>We kept the policy short and simple and easy to remember &#8211; it was three rules that I could recite on my fingers. But it covered our needs well, protecting us from bandwidth problems and from malware.</p>
<p>I worked to catalyze broad <strong>user education</strong>.  Lots of conversations with non-technical people.  Basic conversations, not fancy ones!  It began to cultivate a new culture around technology.</p>
<p>I briefly mentioned a new wiki up there &#8211; this was also a big part of our user education strategy.  Except for a small restricted section, the wiki was 100% open to be updated by anybody.  Part of our cultural change included training everyone at the hospital to use this wiki as a central repository for processes and information.  With a high frequency of arrivals and departures, with people often rotating into and out of well-defined functions, the usefulness of the wiki was immediately apparent.</p>
<p>The third important part of our people strategy was the <strong>special training</strong>.  There were three unique qualities of this training:</p>
<ul>
<li><strong>Documentation</strong>: Much like the work I&#8217;ve done for <a href="http://racattack.org">Oracle RAC Attack</a>, we worked very hard to boil down tough concepts and processes into extremely verbose step-by-step instructions on the wiki. Whenever possible, I taught people to start practicing step-by-step documentation for everything they do.
</li>
<li><strong>Reproducibility</strong>: Enabled by the growing documentation library on the wiki, we expected that everyone who received training would be able to train someone else on the same thing.  When we switched to a new internet access system, I trained a few people on setting up accounts and had those few people handle the rest of the compound.
</li>
<li><strong>Functional Selection</strong> (rather than technical selection): also enabled by the documentation library, we started choosing the most logically positioned people for jobs instead of the most technical people.  As one example, several non-IT people helped set up the new internet accounts.  As a second example, the special projects coordinator &#8211; despite not being an IT person at all &#8211; followed extensive documentation to practice a bare-metal server recovery&#8230; and built our new server in the process!  As a previous director, she was logically positioned for training which could potentially provide access to any data at the hospital.
</li>
</ul>
<p>In Corporate America, we typically have more financial resources, more technology at our disposal, and much longer staff retention.  Even if you have the same problems that we had in Africa, your situation will require different strategies.  But the main point here is universally, absolutely <em>crucial</em>: there are two sides to every project, the technical side and the people side.</p>
<p>Even if you&#8217;re working with outside partners to provide technical expertise, there is an in-house &#8220;people&#8221; component to your project.  Even if it&#8217;s being called an appliance, it&#8217;s gonna be your baby to use &#038; maintain &#038; retire someday.  Even if it&#8217;s a cloud-based service, you have to deal with the upgrades &#038; functionality changes &#038; workarounds for buggy, uncommon use cases.  Every new technology you start working with will require somebody in-house to start learning about it.  Never underestimate the people side of technology!  It&#8217;s always there and it&#8217;s always important.</p>
<p>I think that we easily get immersed in the technology side of our projects (myself included) and we often need a reminder to keep the people side in view.</p>
</li>
<li><strong>Adapt the Process</strong> instead of Customizing the Product
<p>Finally, I have one quick word about processes and products.  Now I don&#8217;t want to overstate this point; obviously there&#8217;s a cost-benefit analysis that happens for each case &#8211; and often changing business processes is not an easy thing.  But I do believe that as technologists we have a tendency to favor customization a little more than we should.  And in general (technologists and managers and executives) I think that we tend to <em>not</em> investigate many possible process changes because we <em>think</em> that it&#8217;s not really plausible.  I think that if we start asking, we might be surprised how willing people sometimes are to change the way they work in order to have the business as a whole work better.  If we&#8217;re getting some good new tools then people could totally understand changes to better accommodate those tools.</p>
<p>When I was working at the hospital, there were two specific places where this discussion happened.  The first was around a new pharmacy system that we were putting in place.  The present system is heavily paper-based and for many reasons the hospital is pushing ahead with a new computerized system.  It&#8217;s the classic conversation about software customization and frankly I didn&#8217;t do an awful lot besides convince hospital staff to work more closely with the company who writes the pharmacy software.  (Great company &#8211; and eager to help &#8211; and they totally understand this conversation.)</p>
<p>The second conversation was a little more unusual: server administration.  Specifically, around NTFS file permissions and administrator access to user home directories.  One requirement of the hospital was the ability to audit contents of redirected user home directories.  By default these directories are created so that even Administrators cannot peer into them.</p>
<p>Of course none of us were experienced Windows server administrators.  With some fiddling we managed to get folder redirection to work.  I had found that many folder redirections on the old network didn&#8217;t work because of incorrect permissions on home directory folders.  We could have done more testing to figure out what permissions worked &#8211; but we were running short on time.  So this was where I encouraged the staff to &#8220;go light on changing things&#8221;.  In the end we found another very good solution which didn&#8217;t require any NTFS permission changes.</p>
<p>I often aim for environments to be as close as possible to the engineers who are building the software.  I&#8217;ll ask people from my vendors, &#8220;what do your engineers mostly develop on?&#8221;  With some companies it&#8217;s hard to get a straight answer, but I like to ask!</p>
<p>When it comes to customization in Oracle databases, of course underscore parameters come to mind right away.  Those little buggers can be life-savers sometimes&#8230; but their indiscriminately global effects can also be killers.  Be careful!</p>
</li>
</ol>
<p>Well I hope you can see a bit of why I enjoyed the work in Africa.  Even though I wasn&#8217;t working for a Fortune 500 company with millions to invest in bleeding-edge technology, there were still interesting and challenging projects.  It was a great opportunity to hone my skills at helping people find the best ways to use technology in a real life business.  And to be honest, I think that&#8217;s the one thing I&#8217;m most passionate about.</p>
<p>I hope that I&#8217;ve challenged your thinking a little bit around how technology serves your company.  I hope that these new ideas push you to the next level in your professional career.  And with any luck I might have even convinced a few people to help out in the non-profit world.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ardentperf.com/2012/08/13/lessons-from-africa-part-2/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Lessons From Rural Africa</title>
		<link>http://www.ardentperf.com/2012/08/08/lessons-from-rural-africa/</link>
		<comments>http://www.ardentperf.com/2012/08/08/lessons-from-rural-africa/#comments</comments>
		<pubDate>Wed, 08 Aug 2012 16:40:29 +0000</pubDate>
		<dc:creator>Jeremy Schneider</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://www.ardentperf.com/?p=1446</guid>
		<description><![CDATA[It has been nine months since I&#8217;ve written here. Needless to say, a lot has happened! First, my family was living in Africa for three months earlier this year while I did some tech work at an NGO hospital. Second, upon our return I decided to join the good people at Pythian. I&#8217;m not moving [...]]]></description>
				<content:encoded><![CDATA[<p>It has been nine months since I&#8217;ve written here.  Needless to say, a lot has happened!</p>
<p>First, my family was living in <a href="/africa/">Africa</a> for three months earlier this year while I did some tech work at an NGO hospital.  Second, upon our return I decided to join the good people at <a href="http://www.pythian.com/">Pythian</a>.  I&#8217;m not moving to Canada, although I will travel a decent bit as part of the company&#8217;s consulting group.</p>
<p>If you&#8217;re interested in the Africa trip, look at the <a href="/africa/">Africa page</a>.  I wasn&#8217;t working with Oracle technology but it was still a very interesting, challenging and engaging project.</p>
<p>I thought I&#8217;d briefly share a few high-level insights.  You might be surprised how well these lessons apply almost anywhere (even Oracle-related projects)!</p>
<h3>Four Lessons from IT in Rural Africa</h3>
<ol>
<li>Understand the <strong>Fundamentals</strong>
<p>Two fun and important projects at the hospital in Africa:</p>
<ul>
<li>protecting assets (both physical equipment and their data) from very unreliable electricity, and
</li>
<li>doing some field testing to see how radio signals from wifi routers would behave in a large compound with concrete walls and tin roofs.
</li>
</ul>
<p>How much more fundamental does it get than copper wires and radio signals?  And in our industry, it doesn&#8217;t matter what you do &#8211; these are also the fundamentals underneath what you&#8217;re building.</p>
<p>In the days of [everything]-as-a-service and engineered-[anything]-appliances, we&#8217;re building at a higher level than ever before.  You might think that building on clouds (or any abstracted/virtualized platform) means we can leave the platform implementation details to specialists.</p>
<p>But smart companies and experienced engineers still pay <strong>a lot of attention</strong> to the fundamentals.  There&#8217;s no magic or voodoo in computer systems.  An experienced engineer can understand how a particular stack works from top to bottom &#8211; and you should be wary of anyone who won&#8217;t explain at some level of detail how their piece works.</p>
<p>Do you remember <a href="http://www.youtube.com/watch?v=tDacjrSCeq4">that youtube video where the data center guy shouts into a bank of hard drives</a>?  Even if you&#8217;re buying pre-engineered, pre-packaged systems that come by the rack and fill half a room, you still need to ask the same basic environmental questions that you would with any other deployment into your datacenter.</p>
<p>Bottom line: everything generally comes down to the same few basic things &#8211; for example processors and I/O and memory/storage hierarchies.  Ninety percent of what you need to know, even for very complex systems, is in chapter one of my computer systems college textbook.  Know your fundamentals and find them even in your complex systems.</p>
</li>
<li>Avoid Unjustifiable <strong>Complexity</strong>
<p>When I arrived in Africa, there were a number of issues crying out for immediate attention.  For example: the head of finance couldn&#8217;t login to his workstation unless it was unplugged from the network.  Every morning, this friendly Canadian guy unplugged his desktop, logged in to his domain account, then re-connected the network cable so he could access his network shares.</p>
<p>This problem had existed almost a year. A sudden power outage had corrupted a virtual server running as a domain controller.  To get systems back up, overseas IT support had directed local employees to restore a previous backup image of the VM.  This did restore that server&#8217;s file shares but caused havoc among the multi-master DC setup &#8211; which was never totally resolved.</p>
<p><span id="more-1446"></span>A simple workaround beyond the unplug-and-login trick was not forthcoming.  But now &#8211; after discussion with the director of the hospital &#8211; we made a key decision and we changed course.  A major upgrade was on the wish list&#8230; so rather than diving into a complex debug &#038; repair operation on the multi-master Windows domain, we put all our effort into the upgrade.  And we re-architected the system so that this particular problem could never happen again.</p>
<p>When I say &#8220;re-architected&#8221; &#8230; I mean we started at the very beginning.  Did you pay attention during the requirements engineering section of your software engineering college class?</p>
<ul>
<li><strong>Functional Requirements</strong>: What does this hospital use IT for in the first place?  How would we prioritize these functions?  &#8230; <em>(e.g. communication/email, collaboration/file sharing/printing, internet access for guests &#038; residences, specialized programs like accounting &#038; pharm&#8230;)</em>
</li>
<li><strong>Non-Functional Requirements</strong>: Assuming that we must have IT, what specific qualities are important to us?  How would we prioritize these qualities?  &#8230; <em>(e.g. data protection after >24hr power loss or theft or emergency evacuation, data retention after mistakes or theft or evacuation or catastrophe, ability for short-term IT staff to become productive quickly, can be supported remotely in lieu of local IT staff, protection from malware and viruses, etc.)</em>
</li>
</ul>
<p>Asking these simple questions led to two very important findings:</p>
<ol>
<li>There&#8217;s high turnover in the nonprofit world.  We might get people for a few months or a year&#8230; but a lot of the IT support has to be do-able from overseas.
</li>
<li>There is no need for 99.99% uptime.  If there&#8217;s a serious problem and we&#8217;re offline for a day then that&#8217;s really not a big deal at all.
</li>
</ol>
<p>We realized that it was much better to have a simple system which would nonetheless prove reliable and easy to maintain, instead of a complex system which was hard to fix if something ever went wrong.  The result was &#8211; as I wrote <a href="/africa/">in my technical summary of the trip</a> &#8211; we completely got rid of virtualization and the multi-master domain controller &#8220;cluster&#8221;.  We migrated from four operating systems on two servers to a single server with a single operating system.  But we also added a few things: on-disk encryption, RAID mirroring, improved &#038; thoroughly tested backups, a printed &#038; tested DR plan, a test server/domain exactly like the production server/domain, and a wiki.</p>
<p>The most important word here is actually not <em>complexity</em> but rather <strong>unjustifiable</strong>.  We justified everything in the architecture by showing traceability to this organization&#8217;s unique requirements.</p>
<p>&#8230;</p>
<p>Here in the States, I spend a lot of time working with clusters and I really enjoy it.  But I can&#8217;t count how many times I&#8217;ve thought someone forgot these simple questions.  <strong>Do you know your requirements?</strong></p>
<p>Maybe you&#8217;re buying pre-engineered or pre-packaged systems that come by the rack and fill half a room.  You already learned from my first lesson and you have engineers who understand the fundamentals for these systems.  But the most critical part is this step, the second one: now you need to justify those fundamental architectural characteristics by connecting them to the unique requirements of your organization.</p>
<p>Big job?  Yes.  Somebody else&#8217;s job?  I don&#8217;t buy it.  No matter where you are in your organization, you can start asking questions and learning.  Don&#8217;t assume it&#8217;s not your problem just because you&#8217;re not the decision-maker.  If you become an expert on both your business and your technology, then before long everyone will be specifically asking for your input!</p>
<p>If it&#8217;s worthwhile for a non-profit hospital in the middle of Africa, then how much more will it be beneficial for you?</p>
</li>
</ol>
<p>&#8230;</p>
<p>I do have four lessons, but I think I&#8217;ll save the other two for another article.  (This got long.  &lt;g&gt;)  Hope it&#8217;s thought-provoking and helpful.</p>
<p>Update: <a href="http://www.ardentperf.com/2012/08/13/lessons-from-africa-part-2/" title="Lessons from Africa, Part 2">Lessons from Africa, Part 2</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.ardentperf.com/2012/08/08/lessons-from-rural-africa/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Making Simple Performance Charts</title>
		<link>http://www.ardentperf.com/2011/11/28/making-simple-performance-charts/</link>
		<comments>http://www.ardentperf.com/2011/11/28/making-simple-performance-charts/#comments</comments>
		<pubDate>Mon, 28 Nov 2011 11:35:46 +0000</pubDate>
		<dc:creator>Jeremy Schneider</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://www.ardentperf.com/?p=1334</guid>
		<description><![CDATA[Before I dive into this blog post, quick heads up for anyone attending UKOUG: on Tuesday only, I&#8217;ll be hanging out with some very smart people from the IOUG RAC Special Interest Group in the &#8220;gallery&#8221; above the exhibition hall. We&#8217;re ready to help anyone run a RAC cluster in a virtual environment on their [...]]]></description>
				<content:encoded><![CDATA[<p>Before I dive into this blog post, quick heads up for anyone attending UKOUG: <a href="http://2011.ukoug.org/default.asp?p=9188">on Tuesday only</a>, I&#8217;ll be <a href="http://racattack.org/e">hanging out with some very smart people</a> from the IOUG RAC Special Interest Group in the &#8220;gallery&#8221; above the exhibition hall.  We&#8217;re ready to help anyone run a RAC cluster in a virtual environment on their own laptop.  And if your laptop <a href="http://racattack.org/o">doesn&#8217;t meet the minimum requirements</a> then you can try with one of our demo workstations.  Come find us!!</p>
<h3>Why Make Charts</h3>
<p>I&#8217;ve heard <a href="http://dboptimizer.com/">Kyle Hailey</a> speak on a few different occasions, and more than once he&#8217;s talked about the power of visualizing data.  (In fact Kyle was a key person behind <a href="https://sites.google.com/site/youvisualize/oem-10g-performance-pages---before-and-after">Grid Control&#8217;s performance screens</a>.)</p>
<p>I couldn&#8217;t agree more.  I regularly visualize data when I&#8217;m working.  Two reasons come immediately to mind:</p>
<ol>
<li>It helps <strong>me</strong> to better understand what&#8217;s happening.  There have been times when I&#8217;ve had an &#8220;aha&#8221; moment very quickly after seeing the picture.
</li>
<li>It helps <strong>others</strong> more easily understand what I&#8217;m trying to communicate.  It&#8217;s great for management reports and such &#8211; not because it&#8217;s fluff, but because it&#8217;s a good communication tool.</li>
</ol>
<p>Last week, I made a few quick charts as illustrations for a performance report.  The process really isn&#8217;t that complicated, but I thought I&#8217;d put the steps into a blog post&#8230; for myself to reference in the future and for anyone else who might find this helpful.  :)</p>
<p><a href="http://www.ardentperf.com/wp-content/uploads/2011/11/more-research_html_m5273695a.jpg"><img src="http://www.ardentperf.com/wp-content/uploads/2011/11/more-research_html_m5273695a-300x175.jpg" alt="" title="more-research_html_m5273695a" width="300" height="175" class="aligncenter size-medium wp-image-1369" /></a></p>
<h3>Making Simple Charts</h3>
<p>This demonstration will use data from the AWR to build graphs.  Note that if you run these queries, Oracle legally requires you to purchase the extra-cost &#8220;diagnostic pack&#8221; license.  But similar queries could be written from free <a href="http://kerryosborne.oracle-guy.com/2008/11/statspack-still-works-in-10g-and-11g/">statspack</a> or <a href="http://ashmasters.com/ash-simulation/simulation-v22/">S-ASH</a> tables.</p>
<p>You need multiple data points to make a graph.  For this demo, the AWR was configured to take snapshots every 30 minutes and I&#8217;m looking at a query which ran for about 10 hours.  Also, it was the only query running in the instance for most of that time &#8211; so I will also look at some instance-wide statistics.<br />
<span id="more-1334"></span></p>
<ol>
<li>
<p>The first step is to get any needed parameters for pulling performance data.  In the case of the AWR, I will need the INSTANCE_NUMBER, the SQL_ID and the first/last SNAP_ID.</p>
<p>It&#8217;s pretty easy to get this information from Grid Control or from Database Console.  But if you don&#8217;t have access to the web console then you can still get the info from SQLPlus.</p>
<p>Here&#8217;s a useful query to get an overview of the SNAP_IDs:</p>
<pre><code>SQL&gt; select to_char(BEGIN_INTERVAL_TIME,'MON YYYY') month, 
       min(snap_id) min_snap, max(snap_id) max_snap
     from dba_hist_snapshot 
     where instance_number=4
     group by to_char(BEGIN_INTERVAL_TIME,'MON YYYY')
     order by min_snap;

MONTH        MIN_SNAP     MAX_SNAP
-------- ------------ ------------
APR 2009        10239        10240
DEC 2009        28752        31924
FEB 2010        38115        40939
MAY 2010        47498        48783
AUG 2010        54975        55013
NOV 2010        60979        61986
DEC 2010        61987        64218
JAN 2011        64219        66448
FEB 2011        66449        67803
MAR 2011        67804        69291
APR 2011        69292        70731
MAY 2011        70732        72219
JUN 2011        72220        73655
JUL 2011        73656        75139
AUG 2011        75140        76608
SEP 2011        76609        78048
OCT 2011        78049        79536
NOV 2011        79537        80338

18 rows selected.</code></pre>
<p>Something similar to this might also be useful:</p>
<pre><code>SQL&gt; select snap_id,instance_number,begin_interval_time,snap_level 
     from dba_hist_snapshot
     where begin_interval_time between to_timestamp('11-nov-11 17:30','DD-MON-RR HH24:MI')
                                   and to_timestamp('11-nov-11 19:00','DD-MON-RR HH24:MI')
       and instance_number=4
     order by snap_id, instance_number;</code></pre>
<p>For this demo I&#8217;m going to use INSTANCE_NUMBER 4 and SQL_ID 8suhywrkmpj5c between snaps 80298 and 80318.</p>
</li>
<li style="margin-top:4em">
<p>Now create a new spreadsheet in your office suite.  I use the free OpenOffice spreadsheet application, but Excel or iWork Numbers should work pretty much the same.</p>
<p>In the second row of the new spreadsheet (cell A2), enter the time of the first snapshot you&#8217;re going to analyze.  In the third row (cell A3), enter this formula:</p>
<pre><code>= A2 + 1/24/60 * [minutes between snaps]</code></pre>
<p>Select the formula cell together with several rows below it and choose Edit > Fill > Down to copy the formula into those rows.  Repeat this until you have reached the end of your analysis window.</p>
<p><a href="http://www.ardentperf.com/wp-content/uploads/2011/11/ooo-filldown.png"><img src="http://www.ardentperf.com/wp-content/uploads/2011/11/ooo-filldown-286x300.png" alt="" title="ooo-filldown" width="286" height="300" class="aligncenter size-medium wp-image-1358" /></a></p>
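<p>If you would rather script this column than fill it down by hand, here is a minimal Python sketch of the same arithmetic.  The spreadsheet stores times as fractions of a day, so 1/24/60 is one minute; the sketch uses a timedelta instead.  The start time and the function name are my own assumptions for illustration: 17:30 is only borrowed from the example query above, and 21 is simply the number of snapshots from 80298 through 80318.</p>

```python
from datetime import datetime, timedelta


def snapshot_times(first, minutes_between_snaps, count):
    """Return `count` timestamps starting at `first`, one per snapshot interval."""
    step = timedelta(minutes=minutes_between_snaps)
    return [first + i * step for i in range(count)]


# Assumed start time (illustrative only); 30-minute snapshots, 21 of them.
times = snapshot_times(datetime(2011, 11, 11, 17, 30), 30, 21)

for t in times:
    print(t.strftime("%Y-%m-%d %H:%M"))
```

<p>Twenty-one snapshots at 30-minute spacing cover 10 hours, which matches the analysis window in this demo.</p>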
</li>
<li style="margin-top:4em">
<p>Open a SQLPlus session.  We will copy-and-paste directly from SQLPlus into the spreadsheet.</p>
<p>Use a SQL like this to retrieve data for one system statistic:</p>
<pre><code>set pagesize 999
col value format 999999999999999

select value from dba_hist_sysstat
 where instance_number=4 and snap_id between 80298 and 80318
   and stat_name='transaction tables consistent read rollbacks'
 order by snap_id;</code></pre>
<p>You can copy the statistic name directly from an AWR report if there&#8217;s a certain stat you&#8217;re interested in.  You can find more information about <a href="http://docs.oracle.com/cd/E11882_01/server.112/e25513/stats002.htm#i375475">system statistics in Oracle&#8217;s docs</a>.</p>
<p>Now move right to the next empty column. First, copy the name of this statistic into the first row. Then, in the second box, enter a formula to find the difference between its left peer and the left upper peer.  For cell C3, the formula is B3-B2.  Choose Edit > Fill > Down again, as before.</p>
<pre><code>= B3 - B2</code></pre>
<p><a href="http://www.ardentperf.com/wp-content/uploads/2011/11/ooo-paste.png"><img src="http://www.ardentperf.com/wp-content/uploads/2011/11/ooo-paste-300x194.png" alt="" title="ooo-paste" width="300" height="194" class="aligncenter size-medium wp-image-1373" /></a></p>
<p>You can repeat this step to retrieve further system statistics.  You can also add columns that divide or multiply one statistic by another.</p>
<p><a href="http://www.ardentperf.com/wp-content/uploads/2011/11/more-research_html_7559fa03.jpg"><img src="http://www.ardentperf.com/wp-content/uploads/2011/11/more-research_html_7559fa03-300x175.jpg" alt="" title="more-research_html_7559fa03" width="300" height="175" class="aligncenter size-medium wp-image-1371" /></a></p>
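<p>The per-snapshot difference can also be computed in the query itself rather than in the spreadsheet.  The following is a minimal sketch using the <code>LAG</code> analytic function with the same demo values; the first row comes back NULL because it has no earlier snapshot, and a negative delta would indicate that the instance restarted (and its counters reset) inside the window.</p>
<pre><code>set pagesize 999
col delta format 999999999999999

-- Subtract the previous snapshot's cumulative value from the current one
select snap_id,
       value - lag(value) over (order by snap_id) delta
  from dba_hist_sysstat
 where instance_number=4 and snap_id between 80298 and 80318
   and stat_name='transaction tables consistent read rollbacks'
 order by snap_id;</code></pre>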
</li>
<li style="margin-top:4em">
<p>The previous SQL statement retrieved system statistics.  Another easy query runs against historical V$SQLSTAT snapshots.  (This only works for long-running queries.)</p>
<pre><code>SQL&gt; select BUFFER_GETS_TOTAL value from dba_hist_sqlstat
     where instance_number=4 and snap_id between 80298 and 80318
       and sql_id='8suhywrkmpj5c'
     order by snap_id;</code></pre>
<p>Once again, you can read about the available fields and statistics <a href="http://docs.oracle.com/cd/E11882_01/server.112/e25513/statviews_4052.htm#REFRN23447">in the Oracle docs</a>.  You can repeat this step to quickly get additional statistics for a particular SQL statement, and you can then combine some stats for better graphs.</p>
<p><a href="http://www.ardentperf.com/wp-content/uploads/2011/11/more-research_html_7e8de68c.jpg"><img src="http://www.ardentperf.com/wp-content/uploads/2011/11/more-research_html_7e8de68c-300x177.jpg" alt="" title="more-research_html_7e8de68c" width="300" height="177" class="aligncenter size-medium wp-image-1375" /></a></p>
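<p>DBA_HIST_SQLSTAT also carries a <code>_DELTA</code> column alongside each <code>_TOTAL</code> column, so for SQL statistics the subtraction step can be skipped.  As a rough sketch of combining two stats, assuming the same SQL_ID and snapshot range as above, this returns buffer gets per execution for each snapshot (the <code>nullif</code> avoids dividing by zero when no execution completed in an interval):</p>
<pre><code>select snap_id,
       buffer_gets_delta,
       executions_delta,
       round(buffer_gets_delta / nullif(executions_delta, 0)) gets_per_exec
  from dba_hist_sqlstat
 where instance_number=4 and snap_id between 80298 and 80318
   and sql_id='8suhywrkmpj5c'
 order by snap_id;</code></pre>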
</li>
</ol>
<p>That&#8217;s it!  I know, really not that complicated.  I hope it&#8217;s helpful.  :)</p>
<h3>Post Script</h3>
<p>For anyone who&#8217;s curious, the charts in this article are related to a SQL report which was <a href="http://www.freelists.org/post/oracle-l/index-with-very-high-LIO">recently discussed on the mailing list</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ardentperf.com/2011/11/28/making-simple-performance-charts/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>RAC Attack in OTN Lounge at OOW11</title>
		<link>http://www.ardentperf.com/2011/09/21/rac-attack-in-otn-lounge-at-oow11/</link>
		<comments>http://www.ardentperf.com/2011/09/21/rac-attack-in-otn-lounge-at-oow11/#comments</comments>
		<pubDate>Wed, 21 Sep 2011 17:41:38 +0000</pubDate>
		<dc:creator>Jeremy Schneider</dc:creator>
				<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://www.ardentperf.com/?p=1308</guid>
		<description><![CDATA[Want to get your hands on a key technology in both the Exadata Database Machine and the newly announced Oracle Database Appliance? If you&#8217;ll be at OpenWorld &#8211; in just 11 days &#8211; then the IOUG RAC SIG is putting together a special event for you!  (You might have already heard about this on Twitter [...]]]></description>
				<content:encoded><![CDATA[<p>Want to get your hands on a key technology in both the <a href="http://www.oracle.com/us/products/database/exadata-database-machine/index.html">Exadata Database Machine</a> and the newly announced <a href="http://www.oracle.com/technetwork/server-storage/database-appliance/index.html">Oracle Database Appliance</a>?</p>
<p>If you&#8217;ll be at OpenWorld &#8211; in just 11 days &#8211; then the IOUG RAC SIG is putting together a special event for you!  (You might have already heard about this on Twitter or from <a href="http://blogs.oracle.com/otn/entry/oracle_technology_network_lights_up">Justin at the OTN Blog</a>.)</p>
<p>Every day from <strong>9am to 1pm</strong>, find our table in the <strong>OTN Lounge</strong> (on Howard Street) and we&#8217;ll help you get an 11gR2 RAC cluster database running inside virtual machines on your own Windows-based laptop. You can experiment boldly &#8211; if you make a mistake then you won&#8217;t have to start over; we can easily &#8220;reset&#8221; your virtual machines to any point.</p>
<p>The basic idea is to facilitate some community-driven direct mentoring.  We&#8217;ll have <a href="http://en.wikibooks.org/wiki/RAC_Attack_-_Oracle_Cluster_Database_at_Home/Events#Volunteers">experienced RAC DBAs on hand</a> to answer any questions you come up with.  We&#8217;ve run limited-seat classes (always full) at several past Collaborate conventions including <a href="http://www.ardentperf.com/2011/05/17/rac-attack-oracle-cluster-database-at-home/">this year in Orlando</a>; but we&#8217;re changing the format at OpenWorld so that more people can participate.</p>
<p>It&#8217;s not required, but to have the best experience we recommend bringing a USB-powered external hard drive with 100G of free space &#8211; you can buy one online for about $40.   Detailed hardware requirements are <a href="http://en.wikibooks.org/wiki/RAC_Attack_-_Oracle_Cluster_Database_at_Home/Hardware_and_Windows_Minimum_Requirements#Hardware_Minimum_Requirements">in the online lab handbook</a>.  But come visit our table and say hello regardless &#8211; besides, who knows what other interesting stuff we might be hacking on&#8230;  :)</p>
<p>Hope to see you there!</p>
<p><a href="http://racattack.org/e">http://racattack.org/e</a></p>
<h3>Post-Script</h3>
<p>If you know a DBA who already has experience with RAC, then they can help us meet and mentor newer DBAs.  The last 5 volunteer slots are still open.  (Each volunteer can sign up for 2 max.)  Pass along the word.  Anyone can sign up for a volunteer slot by directly editing the wiki page at <a href="http://racattack.org/e">http://racattack.org/e</a> and emailing me.</p>
<p><a href="http://racattack.org/e"><img class="size-full wp-image-1324 alignleft" title="oow-sign-preview" src="http://www.ardentperf.com/wp-content/uploads/2011/09/oow-sign-preview.png" alt="" width="500" height="673" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.ardentperf.com/2011/09/21/rac-attack-in-otn-lounge-at-oow11/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Performance Tuning for Oracle Developers</title>
		<link>http://www.ardentperf.com/2011/08/30/performance-tuning-for-oracle-developers/</link>
		<comments>http://www.ardentperf.com/2011/08/30/performance-tuning-for-oracle-developers/#comments</comments>
		<pubDate>Tue, 30 Aug 2011 14:50:56 +0000</pubDate>
		<dc:creator>Jeremy Schneider</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://www.ardentperf.com/?p=1278</guid>
		<description><![CDATA[One of my recent customers was a company with a somewhat large warehouse (around 60TB) on Oracle 10gR2.  The system was using RAC, though it was a fairly simple setup: two nodes, very large AIX LPARs, workload manually partitioned between them and somewhat evenly balanced.  The most important demand of their business is a large [...]]]></description>
				<content:encoded><![CDATA[<p>One of my recent customers was a company with a somewhat large warehouse (around 60TB) on Oracle 10gR2.  The system was using RAC, though it was a fairly simple setup: two nodes, very large AIX LPARs, workload manually partitioned between them and somewhat evenly balanced.  The most important demand of their business is a large number of reports that must be generated every day from the warehouse.  These reports were beginning to take most of the day and consume a large amount of resources&#8230; and the current forecast is for dramatic data growth later this year.  So our project goal was to improve performance.</p>
<p>Our project was very successful &#8211; the key changes had dramatic effects and we are now running the same set of reports much faster.  (Some reports went from many hours to less than 30 minutes.)  My final project summary included a small section where I passed along a few suggestions specifically for these developers &#8211; but looking over the suggestions now, I think they might actually be good tips for a wider audience.  At any rate it might be interesting to put them out for comments!</p>
<p>Of course these tips aren&#8217;t my own original work; in fact I think they&#8217;re pretty widely accepted in the Oracle professional community.  We&#8217;ve been hearing these ideas for years now at conferences and user groups.  I just picked a few that could make the biggest impact for this team right now &#8211; and I put the ideas in my own words.  These were my top three:</p>
<ul>
<li>Aggressively search for opportunities to do less computational work.
<ul>
<li>Fewer switches between SQL, PL/SQL and Java.</li>
<li>Java is slowest, PL/SQL is faster, C and SQL are fastest.</li>
<li>Operate in bulk whenever possible.
<ul>
<li>Avoid using any kind of loop in your code whenever possible. If loops are required, operate on large chunks at a time (for example, by increasing fetch sizes).</li>
<li>Avoid coding in multiple steps whenever possible. If multiple steps are faster, ask how your code works and why Oracle isn&#8217;t automatically doing the same.</li>
</ul>
</li>
</ul>
</li>
<li>Always utilize tracing/logging when coding and tuning. Create logs at the most granular level possible, then summarize the results by time (aka profiling). Calibrate everything in terms of wall clock elapsed time for the entire job.
<ul>
<li>Utilize database 10046 log files for any jobs where the database is an important component. There are several profilers available with various feature levels and price points (starting with free); utilize one of these to summarize the logs. These logs can safely be generated in production.</li>
<li>Use the Oracle PL/SQL Profiler to generate profiles on critical path PL/SQL code.</li>
</ul>
</li>
<li>When tuning SQL, follow these steps:
<ol>
<li>Remove all hints.</li>
<li>Use dbms_xplan, 10046 and 10053 traces &#8211; in that order &#8211; to determine where the CBO went wrong. Compare CBO row count estimates with actual row counts before time-consuming, in-depth analysis of 10053 logs.</li>
<li>Add hints only with justification about why the hint is preferable to fixing the underlying cause. Generally speaking, hints should be viewed as band-aids.</li>
<li>When a good plan is found and underlying causes cannot be addressed, use DBMS_XPLAN functions with the ADVANCED flag to retrieve a full set of hints. Use this to lock in the good plan and document the situation with SQL comments. (11g will offer new options for plan management, but this method will be sufficient for 10g.)</li>
</ol>
</li>
</ul>
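<p>As a rough illustration of the tracing and SQL tuning steps above, a SQL*Plus session might look something like this.  Treat it as a sketch under stated assumptions: the tracefile identifier, the <code>orders</code> table, its predicate, and the SQL_ID are made-up placeholders, and the <code>gather_plan_statistics</code> hint is there only to collect actual row counts for comparison with the optimizer&#8217;s estimates.</p>
<pre><code>-- display_cursor(null, null, ...) reports on the previous statement,
-- so keep serveroutput off or SQL*Plus will slip its own call in between
set serveroutput off

-- Tag the tracefile so it is easy to find, then enable 10046 with waits and binds
alter session set tracefile_identifier = 'report_test';
alter session set events '10046 trace name context forever, level 12';

-- Placeholder statement; substitute the SQL being tuned
select /*+ gather_plan_statistics */ count(*)
  from orders
 where status = 'OPEN';

-- Compare estimated rows (E-Rows) with actual rows (A-Rows)
select * from table(dbms_xplan.display_cursor(null, null, 'ALLSTATS LAST'));

alter session set events '10046 trace name context off';

-- Once a good plan is found, pull its full hint set (Outline Data);
-- the SQL_ID here is a placeholder for the statement in question
select * from table(dbms_xplan.display_cursor('8suhywrkmpj5c', null, 'ADVANCED'));</code></pre>
<p>The estimate-versus-actual comparison comes first because it is cheap; a large mismatch on one plan line usually tells you where to look before opening a 10053 trace.</p>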
<p>I&#8217;m sure that these tips aren&#8217;t perfect &#8211; so please share your thoughts!  How could I better word them?  Do they apply to your development team?</p>
<h3>Post-Script</h3>
<p>I&#8217;m definitely no guru of Oracle performance yet, but I have been involved in a number of performance-oriented projects. My experience has been that it&#8217;s pretty common to have certain assumptions built in to the project through scoping and goal statements before I arrive. Often, project sponsors have already decided that one of the most important items in a performance project is that someone with experience in storage and system configuration for Oracle (this is usually me) will review settings and statistics for things like database parameters, AWR reports, operating system kernel settings, networking, I/O, etc.  The contractual &#8220;statement of work&#8221; might say something like &#8220;review and assess configuration and architecture&#8221; with a list of specific areas of configuration.</p>
<p>I&#8217;m not opposed to periodic configuration audits or having a second set of eyes look over the setup.  But this very customer was a perfect example of how these assumptions about performance tuning through system review really are not the best way to work.  Due to the many parties involved in this contract and various expectations, it was important for me to review the system configuration &#8211; so I did spend time on this review.  However I was lucky that the team and managers who I directly worked with were sharp and open-minded. After some spirited discussion we slightly altered our approach: starting at the top of the stack and working down rather than the other way around.</p>
<p>Our first two weeks were focused on the most important business reports and only during the last two weeks did I take an in-depth look at storage and system configuration.  Can you guess what happened?  Looking at the top business problems led us straight to the most important bottlenecks impacting reports across the board.  It also showed me which areas of system configuration could have the biggest impact on the business.  By the time I started doing a system review, I already knew which areas to focus on.  Furthermore, I could already make some estimations about best-case impact of system changes on overall report run-times.  Not surprisingly, I knew that no system change was going to give us the improvement we wanted &#8211; but tuning a few small pieces of code that were used widely gave us even more improvement than we had thought possible.</p>
<p>Again, this isn&#8217;t new&#8230; it&#8217;s stuff we&#8217;ve been hearing at conferences and user groups for years now.  But it seems to me that it&#8217;s still worth repeating.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ardentperf.com/2011/08/30/performance-tuning-for-oracle-developers/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Developer Access To 10046 Trace Files</title>
		<link>http://www.ardentperf.com/2011/08/19/developer-access-to-10046-trace-files/</link>
		<comments>http://www.ardentperf.com/2011/08/19/developer-access-to-10046-trace-files/#comments</comments>
		<pubDate>Fri, 19 Aug 2011 16:20:59 +0000</pubDate>
		<dc:creator>Jeremy Schneider</dc:creator>
				<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://www.ardentperf.com/?p=1249</guid>
		<description><![CDATA[Let&#8217;s suppose you are a DBA at a large company. You have some great developers, and they&#8217;re learning all about how to turn on full logging of their code through the 10046 database trace. They just learned how to use this data in summary form to find out &#8211; at a very detailed level &#8211; [...]]]></description>
				<content:encoded><![CDATA[<p>Let&#8217;s suppose you are a DBA at a large company.  You have some great developers, and they&#8217;re learning all about how to turn on full logging of their code through the 10046 database trace.  They just learned how to use this data in summary form to find out &#8211; at a very detailed level &#8211; what&#8217;s REALLY taking up all the time during their big batch program which runs too long.  They&#8217;re salivating over this trace data &#8211; but you work for a big company with security policies that can&#8217;t be easily changed, where developers rarely get any kind of shell-level or filesystem-level access to a database server.  You WANT them to have the ability to profile their own database code&#8230; but every time they run a trace, you get dragged into a long email exchange to locate their tracefile and transfer it to a network drive where they can access it.  We&#8217;re so close to a great situation&#8230; but this last part is such a drag!!!</p>
<p><em>However: please notice the Security Addendum at the end of this article.</em></p>
<p>Of course if you&#8217;re lucky enough that your developers use certain tools from Oracle, then there are some slick 3rd party plugins that will help download and manage tracefiles for you.  But what if your developers don&#8217;t want to add a whole new environment to their already memory-bound workstations?  What if there are corporate policies making this difficult &#8211; such as a time-consuming review and approval process for any new software installs?</p>
<p>Wouldn&#8217;t it be nice if Oracle had a system view that your developers could just QUERY to find tracefiles?  Maybe something like this (which of course displayed data in real time, reflecting the current status of the filesystem):</p>
<pre><code>SQL&gt; desc all_trace_files
 Name                          Null?    Type
 ----------------------------- -------- --------------------
 DIRECTORY                              CHAR(5)
 FILE_NAME                              VARCHAR2(400)
 FILE_TIME                              DATE
 FILE_BYTES                             NUMBER
 FILE_TRACEID                           VARCHAR2(100)
 FIRST_TIME                             DATE
 FIRST_ACTION                           VARCHAR2(100)
 FIRST_MODULE                           VARCHAR2(100)
 FIRST_SERVICE                          VARCHAR2(100)
 SID                                    NUMBER
 SERIAL                                 NUMBER
 ERROR                                  VARCHAR2(400)</code></pre>
<p>Sure, this looks nice &#8211; maybe in the next version of Oracle.  When you upgrade to it, 10 years from now.  But what about NOW &#8211; your old 10g database running on a huge mainframe?</p>
<p>We all know how nice it would be if your developers could run a query like this&#8230; today&#8230; on almost any database:</p>
<pre style="clear:right"><code>SQL&gt; select directory, file_name, file_traceid, first_action from all_trace_files 
  2  where first_action is not null;

DIREC FILE_NAME                                FILE_TRACE FIRST_ACTION
----- ---------------------------------------- ---------- ----------------------------------------
UDUMP ardentp1_ora_12386502.trc                           Test Window - New
UDUMP ardentp1_ora_22347974.trc                           Test Window - New
UDUMP ardentp1_ora_22675498.trc                           SQL Window - xml_file_writing_te
UDUMP ardentp1_ora_24248536.trc                           SQL Window - test_cases.sql
UDUMP ardentp1_ora_24641726.trc                           DEQ
UDUMP ardentp1_ora_2621852.trc                            Test Window - New
UDUMP ardentp1_ora_3342766.trc                            problemTbsp
UDUMP ardentp1_ora_4063428.trc                            SQL Window - New
UDUMP ardentp1_ora_45350944.trc                           Debug Test Window - New
UDUMP ardentp1_ora_45547538.trc                           Main session
UDUMP ardentp1_ora_48103606.trc                           SQL Window - SELECT * FROM ext.e
UDUMP ardentp1_ora_50593842.trc                           SQL Window - test_cases.sql
UDUMP ardentp1_ora_50790578.trc                           SQL Window - New
UDUMP ardentp1_ora_51380240.trc                           Debug Test Window - New
UDUMP ardentp1_ora_52429026.trc                           UserBlock
UDUMP ardentp1_ora_52691004.trc                           SQL Window - New
UDUMP ardentp1_ora_60817584.trc                           Test Window - New
UDUMP ardentp1_ora_61538330.trc                           Test Window - New
UDUMP ardentp1_ora_62783512_schn_02.trc        schn_02    Test Window - New
UDUMP ardentp1_ora_62783512_schn_03.trc        schn_03    Test Window - New
UDUMP ardentp1_ora_6291778.trc                            Test Window - New
UDUMP ardentp1_ora_63176754.trc                           Test Window - New
UDUMP ardentp1_ora_66388158.trc                           Test Window - New
UDUMP ardentp1_ora_8192480.trc                            Test Window - New
UDUMP ardentp1_ora_8651144.trc                            SQL Window - test_cases.sql
BDUMP ardentp1_m000_12517418.trc                          Monitor Tablespace Thresholds
BDUMP ardentp1_m000_20119784.trc                          Monitor Tablespace Thresholds
BDUMP ardentp1_m000_24248408.trc                          Monitor Tablespace Thresholds
BDUMP ardentp1_m000_45154330.trc                          Monitor Tablespace Thresholds
BDUMP ardentp1_mmnl_52560064.trc                          Monitor Tablespace Thresholds
BDUMP ardentp1_mmon_52756580.trc                          Monitor Tablespace Thresholds

31 rows selected.</code></pre>
<p>And of course &#8211; last but not least, you&#8217;d like your developers to actually be able to download the tracefile themselves, right?  Actually, this has already been done &#8211; Dion Cho blogged some sample code about two and a half years ago.  He has a very elegant solution using a pipelined PL/SQL function and UTL_FILE.</p>
<p><a href="http://dioncho.wordpress.com/2009/03/19/another-way-to-use-trace-file/">http://dioncho.wordpress.com/2009/03/19/another-way-to-use-trace-file/</a></p>
<p>Just one minor tweak would be needed, to grab files from more than one directory.  (In addition to your session traces, you might need to get logs from parallel query slaves or DBMS_JOB/SCHEDULER processes.)</p>
<p>Not only can you write queries to your heart&#8217;s content on this, but just about every developer environment on the planet can easily take this query output and save it to a file &#8211; then they can run the profiler of their choice on the tracefile.  That would be cool, right?!</p>
<pre><code>SQL&gt; select * from table(textfile('ardentp1_ora_52691004.trc','UDUMP')) where rownum&lt;20;

COLUMN_VALUE
-------------------------------------------------------------------------------------------
/u0001/app/oracle/admin/ardentp/udump/ardentp1_ora_52691004.trc
Oracle Database 10g Enterprise Edition Release 10.2.0.5.0 - 64bit Production
With the Partitioning, Real Application Clusters, OLAP, Data Mining
and Real Application Testing options
ORACLE_HOME = /u0001/app/oracle/product/db/10.2.0/ardentp
System name:    AIX
Node name:      ardent4
Release:        1
Version:        6
Machine:        00F66BE22C00
Instance name: ardentp1
Redo thread mounted by this instance: 1
Oracle process number: 59
Unix process pid: 52691004, image: oracle@ardent4

*** ACTION NAME:(SQL Window - New) 2011-08-15 16:35:24.791
*** MODULE NAME:(PL/SQL Developer) 2011-08-15 16:35:24.791
*** SERVICE NAME:(ardentp_dba) 2011-08-15 16:35:24.791
*** SESSION ID:(572.24734) 2011-08-15 16:35:24.791

19 rows selected.</code></pre>
<p>Well I AGREE that this would be cool!  And as it happens, the customer I&#8217;m working with now would find it very useful.  And honestly&#8230; it&#8217;s really not that complicated to code something like this.  So last night I stayed late a few hours to write it up and test it.  It&#8217;s mainly built on a pipelined Java function, and of course a minor tweak on Dion&#8217;s code for getting the file.</p>
<p>Basically, every time you query the view it scans the first 30 lines of all files in the bdump and udump directories for timestamps and identification information (module, action, service, sid, serial).  This information is combined with basic file stats (name, size, last-modified-date) and the tracefile_identifier is calculated from the filename.  Any errors encountered are reported in the error column (file permissions, etc).</p>
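<p>To make that concrete, a developer&#8217;s session might go roughly like this.  This is an illustrative sketch rather than output from a real system: the identifier <code>schn_04</code>, the tracefile name, and the spool file name are hypothetical, and it assumes the <code>all_trace_files</code> view and <code>textfile</code> function shown above are installed and granted to the developer.</p>
<pre><code>-- Tag the tracefile for this session so it can be found by FILE_TRACEID
alter session set tracefile_identifier = 'schn_04';
alter session set events '10046 trace name context forever, level 12';
-- ... run the code being profiled ...
alter session set events '10046 trace name context off';

-- Locate the tracefile through the view
select directory, file_name, file_bytes
  from all_trace_files
 where file_traceid = 'schn_04';

-- Save the contents locally from SQL*Plus, using the FILE_NAME returned above
-- (hypothetical name shown; formatting settings may need adjusting)
set pagesize 0 linesize 4000 trimspool on feedback off
spool schn_04.trc
select * from table(textfile('ardentp1_ora_62783512_schn_04.trc','UDUMP'));
spool off</code></pre>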
<p>You can download the entire thing at <a href="https://github.com/ardentperf">my github repo</a> (specifically, the files <a href="https://github.com/ardentperf/tracefile_helper/blob/master/tracefile_find.txt">tracefile_find.txt</a> and <a href="https://github.com/ardentperf/tracefile_helper/blob/master/tracefile_get.txt">tracefile_get.txt</a>).  All examples in this blog post are from a real system &#8211; nothing here is cooked, except that I changed database and host names to protect the innocent.</p>
<p>Enjoy!</p>
<h2>Security Addendum</h2>
<p>As pointed out in the comments after I published this post, there are some important security considerations with allowing access to trace files.  A good explanation can be found in Pete Finnigan&#8217;s blog article: <a href="http://www.petefinnigan.com/weblog/archives/00001234.htm">Is it possible to steal data with just ALTER SESSION?</a>  To summarize: if a clever hacker is able to access tracefiles then it is possible for them to see things that the database normally protects.  This certainly includes table data, and it may even include passwords for other database accounts &#8211; <a href="http://lists.jammed.com/pen-test/2001/08/0138.html">sometimes even in cleartext</a>.  This is very important if you&#8217;re considering any tool that grants access to tracefiles &#8211; including the scripts I&#8217;ve published here.  Although he generally disapproves of any tracefile access, Pete&#8217;s comment suggested a way to reduce the risk. The second table (which actually gives tracefile contents) can be limited to DBAs and then developers can get a view which only shows files that the DBAs explicitly grant.  I think that the most useful part of my program is the first view anyway, which scans tracefiles and lets you use SQL to search on module/action/service.</p>
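<p>One way that restriction could be sketched is shown below.  This is not part of the published scripts; the table, view, and account names are hypothetical, and it assumes the <code>textfile</code> function itself is executable only by the DBA-owned schema that owns this view.</p>
<pre><code>-- Hypothetical table, maintained by the DBAs, listing which files each user may read
create table tracefile_grants (
  grantee    varchar2(30),
  directory  varchar2(5),
  file_name  varchar2(400)
);

-- Developers query this view instead of calling textfile() directly;
-- only files explicitly granted to the connected user are returned
create or replace view my_tracefile_contents as
select g.directory, g.file_name, t.column_value
  from tracefile_grants g,
       table(textfile(g.file_name, g.directory)) t
 where g.grantee = user;

-- some_developer is a placeholder account name
grant select on my_tracefile_contents to some_developer;</code></pre>
<p>The DBAs still decide file by file what gets exposed, but the email exchange shrinks to a single insert into the grants table.</p>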
<p>One further quick postscript&#8230; lest you think I was terribly irresponsible with this client, we did consider security implications before endorsing this script.  Admittedly I didn&#8217;t consider every scenario that Pete has written about, but I do still think we made an appropriate decision.  (Also&#8230; the script wasn&#8217;t yet installed into production when I left the client, because it had to go through an approval process itself.  &lt;g&gt;)  In this particular case, the small group of developers getting access to tracefiles were people who already had full access to all data in the database and they had passwords for all key application accounts.  Although they didn&#8217;t have SYSDBA passwords or shell access, they essentially had everything else &#8211; including all the data.  Furthermore, they were in a stage of performance troubleshooting where their productivity could be significantly increased with the ability to run many traces and quickly access the tracefiles (think Method R).  And finally, these developers were on the same small, tight team as the DBAs who took care of backups and patching, all sitting together in one corner of the office building.  The DBAs were involved in the decision and seemed to find their reduced workload a very positive thing &#8211; especially since they knew all the developers who would get access.</p>
<p>One could certainly comment on the security policies of this client, and their decisions about balancing trust and productivity.  But I do not think that we were circumventing the company security policies in this case.  And certainly my scripts are not intended to support such activity.  Nonetheless, I do think there are cases where this is a very useful script.  It&#8217;s much more convenient for developers who &#8211; even if they can get shell access &#8211; may not know much about navigating unix and are more comfortable working in their SQL development environment.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ardentperf.com/2011/08/19/developer-access-to-10046-trace-files/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Finally on Twitter</title>
		<link>http://www.ardentperf.com/2011/05/24/finally-on-twitter/</link>
		<comments>http://www.ardentperf.com/2011/05/24/finally-on-twitter/#comments</comments>
		<pubDate>Tue, 24 May 2011 16:45:31 +0000</pubDate>
		<dc:creator>Jeremy Schneider</dc:creator>
				<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://www.ardentperf.com/?p=1240</guid>
		<description><![CDATA[Overheard in an IRC chat room (Freenode#oracle) this morning&#8230; [24 May 11 09:30] * cheboygan: glad to see that someone read the  blog post though.  at least i know one person read it. [24 May 11 09:30] * rizzo: it got re-tweeted a lot last week [24 May 11 09:30] * rizzo: get yourself on [...]]]></description>
				<content:encoded><![CDATA[<p>Overheard in an IRC chat room (Freenode#oracle) this morning&#8230;</p>
<pre><code>[24 May 11 09:30] * cheboygan: glad to see that someone read the
  blog post though.  at least i know one person read it.
[24 May 11 09:30] * rizzo: it got re-tweeted a lot last week
[24 May 11 09:30] * rizzo: get yourself on twitter, man
[24 May 11 09:30] * cheboygan: i've been resisting...
[24 May 11 09:30] * rizzo: give in
[24 May 11 09:30] * rizzo: it calls to you
[24 May 11 09:31] * cheboygan is still fighting it
[24 May 11 09:31] * rizzo: resistance is futile
[24 May 11 09:31] * cheboygan will never be assimilated
[24 May 11 09:32] * cheboygan: is there a point to having a twitter
  acct  if it's not tied to some kind of mobile device?
[24 May 11 09:34] * rizzo: cheboygan: sure, most of my twitter
  stuff is done at my desk
[24 May 11 09:34] * rizzo: just reading and posting updates
[24 May 11 09:35] * cheboygan: hm.  i just did a search and it 
  turns out there's a twitter plugin for my chat program.  might
  give that a try actually.  wow, really didn't think i'd be so
  easily persuaded.
[24 May 11 09:49] * rizzo: Dan Norris will thank me for this
[24 May 11 09:49] * cheboygan: have to admit that i did hold out
  for an impressively long time :)

[24 May 11 09:50] * rizzo: what's your twitter handle?
[24 May 11 09:50] * cheboygan: haven't setup the account quite
  yet... give me a few more minutes :)
[24 May 11 09:50] * rizzo: heh ok
[24 May 11 09:50] * rizzo refreshes madly

[24 May 11 09:56] * rizzo: bieber4life?
[24 May 11 09:57] * cheboygan: are you kidding?  bieber4life is
  definitely taken.

[24 May 11 10:04] * cheboygan: the problem with holding out on
  something like twitter, is that the longer you wait the more
  impossible it is to have any readable kind of handle.
[24 May 11 10:06] * cheboygan: maybe i should write a blog post
  and elicit suggestions or something.  this is ridiculous.  in
  fact, even the word ridiculous is taken.
[24 May 11 10:10] * Ramen: use l33t
[24 May 11 10:11] * cheboygan: just dicovered underscore is ok.
  new possibilities.</code></pre>
<p>Just a few minutes after this, I finally caved in and signed up for Twitter.  My account is @jer_s &#8211; feel free to find me over there while I try to figure out what you&#8217;re supposed to do with this Twitter thing.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ardentperf.com/2011/05/24/finally-on-twitter/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>RAC Attack &#8211; Oracle Cluster Database at Home</title>
		<link>http://www.ardentperf.com/2011/05/17/rac-attack-oracle-cluster-database-at-home/</link>
		<comments>http://www.ardentperf.com/2011/05/17/rac-attack-oracle-cluster-database-at-home/#comments</comments>
		<pubDate>Tue, 17 May 2011 15:17:13 +0000</pubDate>
		<dc:creator>Jeremy Schneider</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Oracle]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://www.ardentperf.com/?p=1219</guid>
		<description><![CDATA[First of all, the RAC Attack deep dive at Collaborate went great &#8211; thanks to everyone who participated! The room was full (20 participants) and I got evaluations from about half of them. Here&#8217;s a summary of the eval results: 100% class met expectations, would recommend to others 66% easy to follow, could use skills [...]]]></description>
				<content:encoded><![CDATA[<p>First of all, the RAC Attack deep dive at Collaborate went great &#8211; thanks to everyone who participated!  The room was full (20 participants) and I got evaluations from about half of them. Here&#8217;s a summary of the eval results:</p>
<ul>
<li>100% class met expectations, would recommend to others</li>
<li>66% easy to follow, could use skills in working environment</li>
<li>100% already familiar with Oracle, 90% use Oracle daily</li>
<li>0 negative reviews of instructor (phew!)</li>
<li>1 negative review of curriculum: said practice exercises weren&#8217;t relevant but training manual was still above average.</li>
<li>0 negative &#8220;comments&#8221;</li>
</ul>
<p>There were several positive comments such as this: &#8220;I would recommend this class to others. This setup is perfect to pick up new skills and expose what ifs w/out worrying about pressing the wrong button.&#8221;</p>
<p>There were also some suggestions for improvements which I&#8217;ve taken note of. The most common suggestion was this: &#8220;More time to practice / explore new concepts.&#8221;</p>
<h4>What&#8217;s Next</h4>
<p>The RAC Attack curriculum has been around for over three years now &#8211; and I&#8217;m still getting requests to put it on.  Seems that a number of people have felt that it&#8217;s a decent quality class and handbook.  I talked to several other Oracle instructors and speakers at Collaborate this year about the RAC Attack curriculum &#8211; including an in-depth conversation with <a href="http://arup.blogspot.com/">Arup Nanda</a>.</p>
<p>As a result of these conversations, I did some careful thinking and I&#8217;ve been very busy for the past few weeks since Collaborate.  And I now have three major announcements about RAC Attack:</p>
<p><span id="more-1219"></span></p>
<ol>
<li>Last week, I published an <a href="http://www.ardentperf.com/downloads/">updated RAC Attack Lab Handbook</a>.  The handbook is now <strong>completely self-contained for home use</strong>.  I&#8217;ve added significant new content:
<ul>
<li>Detailed hardware requirements (taking into consideration optimal I/O and memory configuration)</li>
<li>How to install and set up the VMware environment (including networking and storage)</li>
<li>How to create the RAC Attack Classroom DVD (this is actually scripted now)</li>
</ul>
</li>
<p>This means that you can now (finally) set up RAC Attack on your own computer &#8211; without needing any external assistance!</p>
<li>Although I created most of the labs for RAC Attack, some labs were also contributed by <a href="http://www.dannorris.com/">Dan Norris</a> and Parto Jalili.  Last week I also obtained their permission to move all RAC Attack content under a Creative Commons Attribution-ShareAlike license &#8211; which you will see in the latest handbook.  The RAC Attack lab handbook has now been <strong>donated to the Oracle professional community</strong> with a solid copyleft license.
<ul>
<li>New DBAs are invited to use this material at home.</li>
<li>Experienced DBAs are invited to make corrections and enhancements to the textbook.  Also you can add your own labs which work on the RAC Attack &#8220;platform&#8221;.</li>
<li>Instructors are invited to use this material partially or completely in their own classes, as they wish.  You can even run your own class based solely on this material &#8211; without needing anyone&#8217;s permission.  Also, you can charge for these classes &#8211; there isn&#8217;t a &#8220;non-commercial&#8221; limitation on the terms of our license.</li>
</ul>
</li>
<p>On a related note, all code related to RAC Attack (including the jumpstart scripts and supplemental DVD build script) is now under the GPL and has been published on <a href="https://github.com/ardentperf/racattack">github</a> for easy downloading, copying and branching.</p>
<li>Finally&#8230; how are you supposed to make corrections, enhancements and additions if you&#8217;ve only got a PDF version?  Right now the textbook sources are in OpenOffice format (with a master document and about 50 lab documents).  Although this is very well suited to print media, it&#8217;s difficult to coordinate with very many authors or editors.  I&#8217;ve decided that the best way to truly encourage collaboration with other teachers &amp; instructors is to use the best platform on the web for collaborative open-content textbook projects: <strong>the RAC Attack lab textbook has a new home on wikibooks</strong>.
<p><a href="http://en.wikibooks.org/wiki/RAC_Attack_-_Oracle_Cluster_Database_at_Home">http://en.wikibooks.org/wiki/RAC_Attack_-_Oracle_Cluster_Database_at_Home</a></p>
<p>I&#8217;m just getting started now &#8211; but I&#8217;m hoping to have all existing RAC Attack content on this wiki textbook by the end of the month.  Please have a look at the project page and consider getting involved!</p></li>
</ol>
<h4>Feedback</h4>
<p>I consider these to be some major changes.  I&#8217;ve put some careful thought into them, and I&#8217;m excited about the new direction for this class.  Let me know what you think.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ardentperf.com/2011/05/17/rac-attack-oracle-cluster-database-at-home/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>BYO Oracle RAC on EC2</title>
		<link>http://www.ardentperf.com/2011/03/04/byo-oracle-rac-on-ec2/</link>
		<comments>http://www.ardentperf.com/2011/03/04/byo-oracle-rac-on-ec2/#comments</comments>
		<pubDate>Fri, 04 Mar 2011 16:11:24 +0000</pubDate>
		<dc:creator>Jeremy Schneider</dc:creator>
				<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://www.ardentperf.com/?p=1204</guid>
		<description><![CDATA[Over the past year or so I&#8217;ve had a number of conversations about running Oracle RAC on Amazon&#8217;s EC2 cloud platform. Chet Justice had suggested a long time ago that I try it, but I never quite found the time. Last fall at the Oak Table Symposium in Michigan, Jeremiah Wilton told me he hadn&#8217;t [...]]]></description>
				<content:encoded><![CDATA[<p>Over the past year or so I&#8217;ve had a number of conversations about running Oracle RAC on Amazon&#8217;s EC2 cloud platform.  <a href="http://www.oraclenerd.com/">Chet Justice</a> had suggested a long time ago that I try it, but I never quite found the time.  Last fall at the Oak Table Symposium in Michigan, <a href="http://www.bluegecko.net/category/oradeblog/">Jeremiah Wilton</a> told me he hadn&#8217;t yet done it and I spent the last night scheming with <a href="http://orajourn.blogspot.com/">Charles Schultz</a> about getting started.  But it wasn&#8217;t until this week that I finally found the time to try.  (To be fair, I&#8217;ve been quite busy lately &#8212; our first kid was born on February 2nd, a beautiful little girl!)</p>
<h4>Cheap Lab Environment</h4>
<p>RAC on EC2 would be useful in a few ways.  (1) If you&#8217;ve never done much with RAC, then you can have a sandbox to play in &#8211; try the installer, try some parallel queries, try anything else.  (2) You can set up a cluster with any version, any patch level &#8211; and see if your PL/SQL or query works.  (3) You could give each individual student in a class their own cluster.</p>
<p>And of course, the cost is very reasonable:</p>
<table>
<tr>
<td>2 nodes, 1 SCSI target</td>
<td>$3.08 for 6 hours</td>
</tr>
<tr>
<td>4 nodes, 2 SCSI targets</td>
<td>$4.68 for 6 hours</td>
</tr>
<tr>
<td>20 nodes, 5 SCSI targets</td>
<td>$14.37 for 6 hours</td>
</tr>
</table>
<p>These numbers are based on small instances.  In my observation so far the network connection is the bottleneck &#8212; and quite likely the 20 node cluster wouldn&#8217;t actually work. It would be about $25/hr to run a 10 node cluster with 4 SCSI targets on EC2 &#8220;cluster compute&#8221; nodes (w/10G network)&#8230; probably a little steep but maybe someday I&#8217;ll try running a benchmark for an hour or two just to see how fast it is.  :)</p>
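<p>To put those figures on a per-hour footing, here is a minimal sketch of the arithmetic.  It uses only the dollar amounts quoted above; the helper name <code>hourly_cost</code> is hypothetical and is not part of the ec2-rac-builder scripts.</p>

```shell
#!/bin/sh
# Sketch only: hourly_cost is a hypothetical helper name.
# Divides a total dollar figure by a number of hours, to three decimals.
hourly_cost() {
    awk -v total="$1" -v hours="$2" 'BEGIN { printf "%.3f\n", total / hours }'
}

# Figures from the table above (each quoted for 6 hours on small instances).
hourly_cost 3.08 6     # 2 nodes, 1 SCSI target
hourly_cost 4.68 6     # 4 nodes, 2 SCSI targets
hourly_cost 14.37 6    # 20 nodes, 5 SCSI targets

# The roughly $25/hr "cluster compute" configuration over the same 6 hours.
awk 'BEGIN { printf "%.2f\n", 25 * 6 }'
```

<p>That works out to roughly $0.51, $0.78 and $2.40 per hour for the three small-instance clusters, against about $150 for six hours of the cluster compute configuration.</p>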
<h4>Progress</h4>
<p>I&#8217;m convinced that this will work. After a few days of working on it in between diapers, I haven&#8217;t quite gotten all the way &#8211; but I&#8217;m very close.  I&#8217;ve got shared iSCSI-based block storage for ASM and two private networks which I think will support VIPs and multicast.  The only two pre-checks that fail are swap space and an optional flag to NTP.  The installer completes.  I just need to iron the kinks out of the post-install/root script execution.</p>
<p>I was initially trying to install 11.2.0.2 but I wimped out and reverted to 11.2.0.1 after wrestling with it for a while.  I only have so much time in between diapers&#8230;  :)</p>
<p>Everything is scripted.  When you&#8217;re paying by the hour, you don&#8217;t want to waste time building the environment!  Also, you don&#8217;t want to leave the environment running for days after you&#8217;ve set it up.  Much better if you can start the RAC environment on-demand when you want it, and shut it down if you&#8217;re not using it.</p>
<p>Here&#8217;s what I&#8217;ve got so far:<br />
<span id="more-1204"></span></p>
<pre><code>jeremy@joshua:~/rac-ec2$ ./run-cluster 3 2
EC2-RAC-BUILDER Copyright 2010 Jeremy Schneider

    This program comes with ABSOLUTELY NO WARRANTY.
    This is free software, and you are welcome to redistribute it
    under certain conditions.  See included gpl.txt for details.

Creating cluster: ID #3
1 storage node(s)
2 compute node(s)

time      storage             network             compute
-----     --------            --------            --------
18:21:42  starting            starting            starting
18:22:18  ec2 load instance   ec2 load instance   ec2 load instance
18:23:44  boot O.S.
18:23:50  setup iscsi
18:23:55                      boot O.S.
18:24:10  finished
18:24:36                                          wait strg/ntwrk
18:25:27                      configure switch
18:25:37                      reboot O.S.
18:27:22                      finished
18:27:28                                          boot O.S.
18:27:48                                          configure O.S.
18:33:17                                          finished

username root/oracle
password ec2rac
hostname ec2-50-16-23-197.compute-1.amazonaws.com</code></pre>
<p>This part is working pretty well.  Right now I&#8217;m working on the script to actually install the oracle bits.  Like I said &#8211; I&#8217;m still working out the kinks with the root script&#8230; but here&#8217;s what it currently looks like:</p>
<pre><code>jeremy@joshua:~/rac-ec2$ ./install-oracle 3 2 &gt;/tmp/ora-inst.out
19:57:19  downloading grid software from Oracle support
19:58:16  extracting grid software (/tmp/oracle/grid)
19:59:10  running cluvfy
20:00:33  installing grid infrastructure
20:28:11  running root scripts
20:30:38  downloading database software from Oracle support
20:36:16  extracting database software (/tmp/oracle/database)
20:38:02  finished!</code></pre>
<h4>Try It Yourself</h4>
<p>If you&#8217;re interested in running RAC on EC2 then I&#8217;d love your help with this.  You can <a href="https://github.com/ardentperf/ec2-rac-builder">download these programs</a> and try them out for yourself.  And let me know if you finish the install, and what steps you took.  I&#8217;d really love to have a public toolkit that anyone can download, which will build RAC environments in EC2 on demand.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ardentperf.com/2011/03/04/byo-oracle-rac-on-ec2/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Binding DBLINK to a Network Interface</title>
		<link>http://www.ardentperf.com/2010/11/27/binding-dblink-to-a-network-interface/</link>
		<comments>http://www.ardentperf.com/2010/11/27/binding-dblink-to-a-network-interface/#comments</comments>
		<pubDate>Sat, 27 Nov 2010 21:45:25 +0000</pubDate>
		<dc:creator>Jeremy Schneider</dc:creator>
				<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://www.ardentperf.com/?p=1176</guid>
		<description><![CDATA[Just thought I&#8217;d do a quick post on this one; came out of a conversation about a month or two ago. We had a single-instance database running on a failover cluster (RHCS). A database link existed for a related database and the connection had to pass through a firewall. The problem was the firewall: it [...]]]></description>
				<content:encoded><![CDATA[<p>Just thought I&#8217;d do a quick post on this one; came out of a conversation about a month or two ago.</p>
<p>We had a single-instance database running on a failover cluster (RHCS).  A database link existed for a related database and the connection had to pass through a firewall.  The problem was the firewall: it had a rule which only allowed connections from the VIP.</p>
<p>The database server has two IP addresses &#8211; a system IP and a VIP.  Is there any way to bind the dblink to one specific interface?  (Note: we would still like the system IP to be used for other traffic.)</p>
<p>I couldn&#8217;t think of a way for Oracle to do that.  But we did find a workaround, of sorts (though not perfect) &#8211; by using the operating system <strong>route</strong> command.</p>
<p>Sometimes there&#8217;s a misconception that the purpose of OS routing is just to set the default gateway.  For example, in the Windows networking GUI that function is visible and it&#8217;s rare to need much else.  However the command-line OS route command is in fact <strong>much</strong> more powerful (on all platforms).</p>
<p>Here&#8217;s a detailed manual page for the Linux version of route:<br />
<a href="http://www.linuxmanpages.com/man8/route.8.php">http://www.linuxmanpages.com/man8/route.8.php</a></p>
<p>First, in addition to setting the default gateway, the route command can also determine which physical interface to use.  Second, in addition to configuration for a subnet, the route command can also do configuration for only one specific host.  So the syntax we&#8217;d want is something like this:</p>
<pre><code>route add -host [targetIP] dev [interface/ethX]</code></pre>
<p>Of course it&#8217;s important to check that there&#8217;s nothing else connecting from the database cluster to the target host IP.  Also, you&#8217;d want to link this command with the cluster VIP resource so that it follows the VIP around the cluster.</p>
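<p>As a rough sketch of how this could be tied to the VIP starting and stopping: the helper below only assembles the command from the syntax shown above, and prints it unless told to run it.  The function name <code>host_route</code>, the <code>DRY_RUN</code> variable, and the example address and interface name are all hypothetical, not taken from the actual cluster.</p>

```shell
#!/bin/sh
# Sketch only: host_route and DRY_RUN are hypothetical names.
# Builds the host-specific route command from the post:
#   route add -host [targetIP] dev [interface/ethX]
# With DRY_RUN=1 (the default here) the command is printed, not executed.
host_route() {
    action="$1"   # "add" when the VIP comes up, "del" when it goes away
    target="$2"   # IP address of the remote database host
    iface="$3"    # interface carrying the VIP, e.g. eth0:1 (assumed name)

    case "$action" in
        add|del) ;;
        *) return 2 ;;   # unknown action
    esac
    if [ -z "$target" ] || [ -z "$iface" ]; then
        return 2         # both target and interface are required
    fi

    cmd="route $action -host $target dev $iface"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"
    else
        $cmd             # needs root privileges
    fi
}

# Example (prints the command only): 192.0.2.10 is a documentation address.
host_route add 192.0.2.10 eth0:1
```

<p>The idea would be to call <code>host_route add</code> from whatever starts the VIP resource and <code>host_route del</code> from whatever stops it, with <code>DRY_RUN=0</code> once the printed command looks right.  On systems that use iproute2 instead of the older route command, something like <code>ip route add [targetIP]/32 dev [interface] src [VIP]</code> should express the same thing while naming the source address explicitly, though I have not tested that variant in this setup.</p>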
<p>But it does seem to be one possible solution to the dilemma.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.ardentperf.com/2010/11/27/binding-dblink-to-a-network-interface/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>