The 833, What Is It?

Here’s a story that I’d like to share, about the infamous SQL Server 833. A while ago I was on a performance-tuning call at a client site; they’d been experiencing event 833 from some of their production SQL Servers, and they wanted to know if this was a SQL Server problem or a problem with the disk farm.

To back up just a little: an event 833 (or error 833, as it’s sometimes called) is a message from SQL Server reporting that I/O requests it has issued against a database file have taken longer than 15 seconds to complete. An 833 message reads something like “SQL Server has encountered 49 occurrence(s) of I/O requests taking longer than 15 seconds to complete on file […file full path…] in database […database name…]…”.

This message is often interpreted as a disk I/O issue, and in many cases the 833 can be resolved by optimizing disk operations or simply replacing older, slower equipment with newer, faster disks. But this case didn’t resolve quite so easily. You see, the disk farm attached to the SQL Servers in question was composed of flash drives: a flash SAN.
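
If you’re chasing a suspected storage latency problem like this one, a useful first check is the I/O stall time SQL Server itself accumulates for every database file. Below is a minimal T-SQL sketch (assuming SQL Server 2005 or later, where these DMVs exist); the averages are crude, but files with consistently high stall numbers tend to line up with the files named in the 833 messages.

-- Cumulative I/O stalls per database file (since the last SQL Server restart).
-- A rough sketch; thresholds and interpretation depend on your workload.
SELECT DB_NAME(vfs.database_id) AS database_name,
       mf.physical_name,
       vfs.num_of_reads,
       vfs.num_of_writes,
       vfs.io_stall_read_ms  / NULLIF(vfs.num_of_reads, 0)  AS avg_read_stall_ms,
       vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_stall_ms
FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
JOIN sys.master_files AS mf
  ON mf.database_id = vfs.database_id
 AND mf.file_id = vfs.file_id
ORDER BY avg_write_stall_ms DESC;

If these averages stay low while the 833s keep appearing, that’s a hint the wait is happening somewhere other than the disks themselves.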

Let’s complicate this situation:

[System diagram: SAN storage server environment]

The production SQL Servers that were getting the 833s are on the far right of this system diagram. The 833s were coming from different database data files and log files, sometimes during heavy transaction activity and other times during very light transaction activity; in other words, the 833s were sporadic and unpredictable.

Given all the “upstream” data stores, processes, and source data implications (no one on staff seemed to really understand how the production system worked), it’s not surprising that there would be some sort of issue in the transaction stream. The stream starts at point-of-sale terminals and data aggregators on the far left and proceeds through load balancers and “soft routers” (this is a distributed environment) and on to error-handling routines before finally arriving at a place where the SQL Servers can begin to do their job.  The good news is that this system is able to handle many millions of transactions a day with only the occasional 833 hiccup. The 833 “wait” always resolves itself within a minute or two, but by then, the upstream congestion is significant.

The client set up a test environment with the intent of replicating the 833 situation to determine root cause. There was only one problem with the test setup…it bore little if any resemblance to production. And, while I was on-site, we were never allowed to use advanced analysis tools on the production systems.

Spoiler alert: the exact root cause of the 833s was never determined…

So now the challenge to you, dear reader: given what you know so far, what would you surmise might be the cause of the 833s which are being thrown by the SQL Servers?  Remember, that SAN is full of flash drives…

 

Scheduled Job Failed to Run…How to Troubleshoot SQL Server Error Messages

Troubleshooting error messages can be a little tricky, especially if you’re new to the task. Here’s a good example of a job that failed without a whole lot of information as to why it failed. This was taken from the Windows Event Log/Application Log:

Log Name:      Application
Source:        SQLISPackage
Date:          7/12/2014 10:45:45 PM
Event ID:      12291
Task Category: None
Level:         Error
Keywords:      Classic
User:
Computer:
Description:   Package “Dynamics DB Integrity Check” failed.

Step 1: This error message points to a package (Source: SQLISPackage) named “Dynamics DB Integrity Check”, so it’s logical to start with the SQL Server Agent scheduled jobs in SQL Server Management Studio. Sure enough, there’s a scheduled job with a matching name, Dynamics DB Integrity Check.Subplan, in the job list, and it’s an active job.
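
If you’d rather confirm this from a query window than from the GUI, the job catalog in msdb can tell you the same thing; a minimal sketch, using the job name from this example:

-- Confirm the job named in the error message exists and check whether it's enabled (1 = enabled).
SELECT name, enabled, date_created, date_modified
FROM msdb.dbo.sysjobs
WHERE name LIKE N'Dynamics DB Integrity Check%';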

[Screenshot: SQL Server Agent job list in Management Studio, showing the Dynamics DB Integrity Check.Subplan job]

(The jobs with the tiny red downward-pointing arrows as part of the icon have been disabled.)

Step 2: open up the job history for the job and see if there’s any additional information:

[Screenshot: Job History viewer for the Dynamics DB Integrity Check.Subplan job]

As it turns out, there isn’t any additional information, neither in the main job history entry nor in the subplan step.
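
The same history can also be pulled straight from msdb, which is sometimes easier to scan than the dialog; a hedged sketch (the message column is where any useful detail would appear, and run_status 0 means the step failed):

-- Recent history for the job, newest first.
SELECT j.name, h.step_id, h.step_name, h.run_status, h.run_date, h.run_time, h.message
FROM msdb.dbo.sysjobhistory AS h
JOIN msdb.dbo.sysjobs AS j ON j.job_id = h.job_id
WHERE j.name LIKE N'Dynamics DB Integrity Check%'
ORDER BY h.run_date DESC, h.run_time DESC;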

[Screenshot: Job History detail for the subplan step]

The clues are in the error message’s Source (SQLISPackage) and its Description, which starts with “Package…”. The job name is also a clue: a two-part name like this usually indicates that the job was created by a Maintenance Plan.

Step 3: Confirm these suspicions by opening the job itself and viewing its contents (you’ll have to open it for editing in order to see the internal configuration of the job, but don’t make any changes, and always cancel out when you’re done, never save *).
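
A lower-risk alternative to opening the job for editing is to read the step definitions out of msdb, which can’t accidentally change anything; a minimal sketch, again assuming the job name from this example:

-- Show each job step's subsystem and command text without touching the job definition.
SELECT j.name AS job_name, s.step_id, s.step_name, s.subsystem, s.command
FROM msdb.dbo.sysjobsteps AS s
JOIN msdb.dbo.sysjobs AS j ON j.job_id = s.job_id
WHERE j.name LIKE N'Dynamics DB Integrity Check%'
ORDER BY s.step_id;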

[Screenshot: the job step definition, showing the Maintenance Plan subplan it runs]

Maintenance Plans are SQL Server Integration Services packages. When you save a plan that you’ve just created, SQL Server creates and enables the scheduled jobs that run the various tasks making up the maintenance plan.

Step 4: go to the Maintenance Plans and check the plan named in the job step…but look!  There IS NO MAINTENANCE PLAN!

[Screenshot: Object Explorer showing no maintenance plans under Legacy > Database Maintenance Plans]

If there were maintenance plans on this SQL Server, they would be listed under Legacy > Database Maintenance Plans (this is a SQL Server 2005 instance). Obviously, there are none. Just as obviously, at one time there was one.
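
You can double-check from a query window as well. A hedged sketch, assuming the standard msdb catalog objects on a 2005-era instance (the first table holds legacy SQL 2000-style plans, the second the newer SSIS-based plans):

-- Legacy (SQL 2000-style) database maintenance plans, if any.
SELECT plan_id, plan_name FROM msdb.dbo.sysdbmaintplans;

-- SSIS-based maintenance plans (SQL Server 2005 and later), if any.
SELECT id, name FROM msdb.dbo.sysmaintplan_plans;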

Reason for the Error Message: the scheduled job is calling a non-existent maintenance plan.

Lesson Learned: when a Maintenance Plan is deleted from a SQL Server, the scheduled jobs that ran the pieces of the Plan are not also deleted. You have to remove (or disable) them separately.

Step 5: Remove or disable the scheduled job. For the sake of caution, we’re going to disable it. Highlight the job name, right-click, and choose “Disable”. The result looks like this:

[Screenshot: the job list with the Dynamics DB Integrity Check.Subplan job now disabled]

And those are the steps to troubleshooting an error message about a job failing to run.
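
If you prefer T-SQL over the GUI for that last step, here’s a minimal sketch using the documented msdb procedure; set @enabled back to 1 to re-enable the job later, or use sp_delete_job once you’re certain it’s safe to drop:

-- Disable the orphaned job rather than deleting it, so it can still be reviewed later.
EXEC msdb.dbo.sp_update_job
    @job_name = N'Dynamics DB Integrity Check.Subplan',
    @enabled  = 0;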

* Save v. cancel: when investigating settings or parameters for live jobs or other database properties, it’s always safest to cancel out rather than save when you’re done observing. A “save” could cause changes if you inadvertently modify something in the settings, whereas a “cancel” will simply leave the settings as they were. “Save” when you intend to make changes; “cancel” when you don’t.

 

Windows Azure SQL Database (WASD)

Tom LaRock, aka the SQL Rockstar, recently blogged about the Windows Azure SQL Database (WASD).

He pointed out that, as good as Azure is, it does lack a few features of the ground-bound version of SQL Server.

He lists seven tasks that you won’t have to bother with, or won’t be able to do. To read Tom’s blog post click on Top 7 Things You Want To Do In Windows Azure SQL Database, But Can’t.

 

Database Data Modeling: Can You Afford Pre-built Data Models?

I just read a blog post about “Data Models: Build or Buy”, by Jason Haley …

He has some interesting points.  If I could recap for brevity and clarity …

  • Good data models that accurately reflect business requirements are critical to the success of a database project, whether transactional or data warehouse, web-based retail or BI;
  • Pre-built data models are available, often from software vendors;
  • Pre-built data models are like off-the-rack suits; they may not fit very well, and will need some alterations to better fit the buyer;
  • Altering a pre-built data model is the realm of a small group of highly talented and (usually) expensive people called (surprise!) data modelers and data integrators;
  • Since budgets are tight, it’s hard to justify hiring one of these highly skilled professionals as a full-time employee, and once the project is complete, what do you do with this skill set? And the person?
  • The risks associated with poorly-modeled databases and bad integration are substantial, and include but are not limited to incorrect information and difficulty of use;

He is absolutely right on all counts.  If you’re getting started on a project and you think you’ll need modifications to an existing database or even an entirely new database, and you don’t want to deal with the problems inherent in an “off-the-rack” pre-built model, investigate adding a database data modeling consultant to your team.  These people are usually happy to work on contract for a set period of time, so when the project is over you don’t have the excess headcount.  But you will have had the benefit of this person’s skill set.  It’s a win-win for everyone.

Come talk to us … the Folks at Mount Vernon Data Systems, where we do it all … database design, support & protection … at www.MountVernonDataSystems.com. We’re affordable, and we’re good.

 

SQL Backups, Use the Maintenance Wizard or No?

Yesterday I was reading an entry in one of the LinkedIn forums that I belong to. It went something like this:

“In a simple scheduled backup maintenance plan the Server runs the job but doesn’t create a backup? what could be the reason? In the history of the backup it shows the job ran successfully?”

Obviously, this person had created the SQL Server backup plan using the SQL Maintenance Wizard, which, if you’ve never used it before, walks you through a plethora of maintenance options that you can select from, and then generates an SSIS-like visual plan.  You then schedule the plan to run whenever needed.

The subsequent forum conversation ran the gamut from how to troubleshoot a SQL Maintenance Plan (there really isn’t any good way without a LOT of additional work creating error-checking routines in the plan itself) to my philosophy, which echoed several individuals who posted before me:

“Take direct control; script out the backups and run them as SQL Agent scheduled jobs. This way, if they fail, you have a record of what happened and why.”

I’ve heard from overworked DBAs that they use the SQL Maintenance Plans because they don’t have the time to write the code and schedule the jobs to run under the SQL Agent’s control. My experience is that 1) the Maintenance Wizard tempts the user into selecting too many options (do you really need to reorganize all the indexes in the database before backing up?), and 2) you don’t have control over what happens and why. With no good troubleshooting tools, when something goes wrong, the time the DBA spends finding the root cause and fixing it will more than wipe out any time saved by using the Maintenance Wizard.
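
For illustration, here’s a minimal sketch of the “script it yourself” approach: one full backup statement that could be dropped into a SQL Agent job step, plus a quick check against the backup history. The database name and path are placeholders, and in practice you’d add your own error handling, retention, and verification logic:

-- Minimal scripted full backup; CHECKSUM has SQL Server validate page checksums as it reads.
BACKUP DATABASE [YourDatabase]
TO DISK = N'X:\SQLBackups\YourDatabase_Full.bak'
WITH CHECKSUM, INIT, STATS = 10;

-- Confirm the backup landed in msdb's history (type 'D' = full backup).
SELECT TOP (5) database_name, type, backup_start_date, backup_finish_date
FROM msdb.dbo.backupset
WHERE database_name = N'YourDatabase'
ORDER BY backup_finish_date DESC;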

 

Ten Best Cities for Finding a Tech Job

Baseline Magazine, in their Briefing newsletter, has listed the ten best cities in the US for finding a tech job. If you’re looking for a new challenge in your professional life, it might be worthwhile to begin tracking the job listings and newspapers for the areas that coincide with the industry type you’re most interested in.

The cities, in order from east to west across the country, with major industry verticals listed:

  • Boston, MA:  finance, insurance, pharmaceutical & biotech; startup capability in conjunction with the MIT-Harvard research environment.
  • New York, NY:  “Silicon Alley”, thanks to Mayor Bloomberg’s recent investment of $9.2 billion; financial services, insurance and media.
  • Washington, D.C.:  government/military/homeland security in both the public sector and the IT contractor community; research universities, think tanks, PACs and nonprofits.
  • Raleigh, NC:  Research Triangle Park – biotech, pharmaceutical, medical devices, and the software to power them all.
  • Charlotte, NC:  health care, energy; project managers, business analysts, developers attuned to these industries.
  • St. Louis, MO:  “Wall Street of the Midwest”; financial developers, project managers and .NET/Java developers.
  • Minneapolis, MN:  a cold-weather tech hub for help-desk staff, project managers and developers, in the finance, health care and manufacturing sectors.
  • Denver, CO:  a startup-friendly destination for a plethora of industry verticals.
  • Los Angeles, CA:  Media, entertainment, banking, aerospace and health care.
  • San Francisco, CA:  cloud-focused jobs – consumer apps, analysts and database, .NET and Java developers.

For the full story, go to http://tinyurl.com/afctspk

 

Cloud Computing A-Z

Had to share this finding — SearchCloudComputing.com’s “Cloud computing from A to Z”.  I’m not saying that this is the definitive dictionary of cloud computing terms, and it’s a little dated, but it’s a good resource.  Check it out at http://bit.ly/cPRoTA

Over time I’ll be adding references to cloud computing resources to this blog category. Even though most of us use some sort of cloud service, adopting a new way of doing things can be trying. It’s my goal to make this process a little less nerve-wracking by providing you with resources that you can use to better evaluate your options.

Wishing you sunny skies!

 

How Do You Determine What Operating System & Version of SQL Server You’ve Got Under Management?

As a new DBA, or if you’re consulting and you need to quickly determine what environments you’re working in, one of the first things you’re going to do is to document (that’s right, you’re going to record this somewhere! it’s one of those “best practices” things…) each server under management.

So how do you do that?

Windows Version: the how-to will vary from opsys to opsys, but essentially, from the start menu, select RUN and type in winver.  For newer versions of the opsys, just type winver into the search box directly above the Start button. The pop-up box shows you just about everything that you might want to know about the operating system except for one very important thing — is this a 32-bit system, or a 64-bit system?

32-Bit or 64-Bit: you’ll have to dig a little deeper, by going to the System report. From the start menu, select RUN and type in system, or type system into the search box just above the Start button. If you have to select your computer by name, you’ll see an option to View System Information; that’s what you want. Hint: if it doesn’t say x64 or 64 Processor or something like that, you’re dealing with a 32-bit system. Hint-hint: a 64-bit SQL Server instance requires a 64-bit operating system, so on a 32-bit machine the SQL Server instance(s) will be 32-bit as well (a 64-bit opsys, on the other hand, can host either).

SQL Server, how to tell which version you’re dealing with: Open Management Studio (I’m going to pretend that everyone is now on SQL Server 2005, although I, myself, support one instance of 2000), connect to a SQL Server instance, open a query window, and type in one of these two queries:

select @@version  -- the old standby, and yes, it'll work on SQL Server 2000, also

The output looks something like this:

[Screenshot: output of select @@version]

Or, you can run this query:
 SELECT SERVERPROPERTY('productversion') AS ProductVersion,
        SERVERPROPERTY('productlevel') AS ProductLevel,
        SERVERPROPERTY('edition') AS ProductEdition

[Screenshot: output of the SERVERPROPERTY query]

By specifying various server properties and naming the output columns, you can produce something that looks more like a real report.

And there you have it!  Yes, there’s lots more to find out about your servers under management, but this is a place to start.

 

Do You Design Databases as Part of Your Job?

If you design databases as part of your job, you’re not alone. Recently, SQL Server Magazine ran an “insta-poll” asking “Do you consider database design to be part of your job function?” The results from the 102 responses:

  • Yes, I’m primarily a database designer: 1%
  • Yes, I consider myself to be a database designer/developer: 52%
  • Yes, I consider myself to be a database designer/BI pro: 8%
  • Yes, I consider myself to be a database designer/DBA: 28%
  • No, I don’t consider database design to be part of my job: 11%

Now, despite the fact that with any opt-in poll there’s a high probability of skew due to differential interest in the topic at hand, and the count of persons polled was really very low (some would say statistically insignificant), 89% of respondents claim that database design is part of their job function.  That’s impressive.

So, if you’re a DBA or a database developer and you’re expected to “do” design as part of your job, do you know what the downside of inadequate design could be? Let me count the ways…. lost performance for the database group and lost revenue for the company are just a starting point.

If you check the stats in the list, Dev is the component with the highest level of design overlap, but not necessarily because it’s creative.  Developers in the pre-database world built their own file structures to hold the input data needed by the programs they wrote, and to store the output data.   That mentality has (gawd help us) been carried forward into today.

There are a lot of managers who just don’t get (or who don’t want to get, because it’ll cost more) that developers may not have the skill set to create efficient, effective, and long-lasting (extensible) database schemas. It’s easier (and cheaper in the short run) to just whip the Dev donkey to get the job done.  Developers want to do it right and, left to their own devices, they want to do a good job — and they will.  But if you have a supervisor or manager who’s not invested in anything except making the numbers on the bottom line, you, Mr. or Ms. Developer, are not even going to get a chance to do it right.

In addition to the Developer, any SQL Server DBA who’s been around for any length of time will be responsible for some database design… thus the 28% in their category.  And any DBA who values performance will monitor and sign off on schema changes to the databases under his/her control.

Bad or inadequate design is arguably the #1 reason for poor performance, not only with SQL Server, but with any DBMS platform.  But once the design is put into production, rectifying it may not be possible, so the DBA now has to tinker with index management, I/O throughput acceleration, adjusting and balancing memory resources, query optimization, etc.

That’s my opinion.  What’s yours?

 

50 Reasons to Virtualize Your SQL Server

The whole SQL Server world seems like it’s moving towards virtualization, and you’re wondering “why would anyone want to do that?”

Change is hard; change is uncomfortable. It’s a whole lot easier just doing what you’ve always done, but, dear reader, I promise you, if you hesitate for too long, you’re going to get trampled by the stampeding herd. The early adopters[1] jumped aboard the virtualization bandwagon years ago; the mainstream majority is now in the process of embracing the technology. Wait much longer, and you’ll be accused of being a laggard, and we wouldn’t want that, would we? Especially if hesitating were to compromise database performance, increase exposure to disastrous downtime, and/or cost more.

There are a lot of reasons to move SQL Server (and all your servers) to a virtualized environment; here are 50 that I can think of…

  1. It’s free if you pick the right virtualization product.
  2. You can have quicker access to your servers because the virtualization software has a central manager that makes it so.
  3. Deploying servers is faster when they’re virtual servers.
  4. Quicker CD/DVD mount.
  5. You can quickly allocate additional RAM or disk storage to a VM (virtual machine).
  6. You can move virtual servers from one physical host to another.
  7. You can restore from image backups that are routinely taken as part of the VM environment.
  8. You can deploy applications to production more quickly, and with more flexibility, on a VM.
  9. You can increase your hardware utilization by 50% or more if you virtualize.
  10. Your hardware and software capital costs can be decreased by 40% or more if you virtualize.
  11. You can reduce operating costs by 50% or more with virtualization.
  12. You can be green: virtualizing can save energy, which maps to a smaller carbon footprint.
  13. You can reduce your TCO (total cost of ownership) by virtualizing.
  14. You can re-direct the capital that you saved when you virtualized the servers to other projects.
  15. Virtualization allows you to redirect unspent capital to other projects.
  16. When you virtualize you get to play with a SAN (hopefully).
  17. Instead of “100 servers, 100 databases” you can say “1 server, 100 databases” — of course, you may still have 100 VMs…
  18. You can move database “guests” from hardware to hardware, quickly, without crashing the system, and with minimum disruption to online customers.
  19. You’re justified in kicking other “guest” systems off your host (SQL Server needs resources for good performance) and moving them over to their own VM.
  20. You get to use cool new acronyms (like LUN or HBA) when you use a SAN.
  21. You get to help design multi-pathing networks for your SAN, unless you’re the SAN administrator, in which case you get to design it yourself.
  22. You get to whiteboard the network (or forever be lost in the nests of wiring!)
  23. When you virtualize you don’t have to be apologetic about asking others for their use patterns, scheduled jobs, anti-virus applications, patch deployment schedules, etc.
  24. You get to compare SAN specs at the next geek-fest (“My SAN is bigger than your SAN…”)
  25. You can use cool tools like SQLIO to test the SAN speed.
  26. You may get SAN through-put bragging rights.
  27. You get to play peek and poke trying to figure out why the page file looks normal but SQL Server performance is in the toilet. (Don’t worry, there’s an app for that!)
  28. You get to monitor how long SQL Server backup jobs take as a measure of SAN performance (“the canary in the coal mine”).
  29. You get to be best friends with Windows PerfMon and three of the best counters for monitoring a virtualized SQL Server (see the query sketch after this list for reading the SQL Server-side counter with T-SQL):
    • Physical Disk: Avg. Disk sec/Read and Avg. Disk sec/Write, which measure SAN/storage latency
    • System: Processor Queue Length, which measures CPU contention
    • SQL Server: Buffer Manager: Page Life Expectancy, which measures memory pressure on the buffer pool
  30. You may be able to change jobs, from database administrator to SAN administrator.
  31. You may get to replace ordinary database backups with SAN storage snapshot backups — they run SO much faster than regular backups!
  32. You get to brag about how expensive your SAN was and how cool you were when arguing to get it in the first place.
  33. You get to practice for the eventual shift to Cloud computing!
  34. You can give developers their own sandbox(es) to play in by simply spinning up one or more VMs.
  35. You can build a world-class test environment in less than a day.
  36. You can run out of SQL Server licenses faster than you ever thought possible!
  37. You can take an occasional day off from work — the whole weekend, maybe?
  38. You can train and mentor your “grasshoppers” by giving them their own sandbox, so they won’t bring production crashing down.
  39. You can experiment with and compare different server configurations without having to buy more hardware.
  40. You can feel cool and ahead of the crowd — well, ahead of the laggards, anyway.
  41. You can reduce face time with difficult end-users — just send over a new image and they’re up, up and away!
  42. You can reduce face time with geeks — when they have a problem you can spin up an image and they can test to their heart’s content!
  43. You have bragging rights for exploiting the latest cost containment solution — aren’t you clever?
  44. You can provide red meat for your geeks; happy talent = retained talent.
  45. The business continuity program gets a lift from new knowledge applied to minimize your company’s IT risk exposure — that makes your boss happy.
  46. You can manage your servers and the VM cluster from your Smartphone.
  47. Clients who require compliance can actually “inspect” your system without risk of harm to your production servers.
  48. You can be the first on the block to adopt server virtualization (even now).
  49. You can show your peers and colleagues how to scale back on data center space and do more with less.
  50. You can quickly clone a very complex configuration that would have taken weeks to set up if you weren’t using virtualization.
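
And the query sketch promised in item 29: the Windows-side counters still need PerfMon, but the SQL Server counters are exposed through a DMV, so Page Life Expectancy can be read from a query window. A minimal sketch (the object_name prefix differs between default and named instances, hence the LIKE):

-- Page Life Expectancy, in seconds, straight from the DMV that mirrors the PerfMon counters.
SELECT object_name, counter_name, cntr_value AS page_life_expectancy_seconds
FROM sys.dm_os_performance_counters
WHERE counter_name = N'Page life expectancy'
  AND object_name LIKE N'%Buffer Manager%';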

[1] Crossing the Chasm (1991, revised 1999), by Geoffrey A. Moore