
The use of any single benchmark is inherently misleading because few production systems handle a single type of workload. Transactions and queries may vary widely from application to application, and the overall processing mix is likely to include batch as well as other types of system operations.

Thus, any cost comparison between different systems should be based on quantified workloads that correspond to the current and future requirements of the business.

Database Query Volumes

Particular attention should be paid to query volumes. Although the effects of transaction and batch workloads are well documented for IS environments, large volumes of user-initiated database queries are a more recent phenomenon. Their effects are still poorly understood in most organizations.

Businesses that move to client/server computing commonly experience annual increases in query volumes of 30 to 40 percent, and these increases may continue for at least five years. Although most transactions can be measured in kilobytes, queries easily run to megabytes of data. Long, sequential queries generate particularly heavy loading. Thus, once client/server computing comes into large-scale use within an organization, heavier demands are placed on processor, database, and storage capacity.
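To see what such growth rates imply for capacity planning, it helps to compound them over the planning horizon. The short Python sketch below is illustrative only; the growth rates and five-year horizon come from the figures above, and the result is simple arithmetic rather than a measurement.

```python
# Illustrative only: compound the annual query-volume growth rates cited above
# over a five-year horizon to see the cumulative effect on capacity demand.

def compounded_growth(annual_rate: float, years: int) -> float:
    """Return the multiple of today's query volume after `years` of growth."""
    return (1 + annual_rate) ** years

for rate in (0.30, 0.40):  # 30 and 40 percent per year, as cited above
    multiple = compounded_growth(rate, years=5)
    print(f"{rate:.0%} annual growth for 5 years -> {multiple:.1f}x today's query volume")
# Prints roughly 3.7x for 30 percent growth and 5.4x for 40 percent growth.
```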

In many organizations, queries can rapidly dominate computing workloads, with costs exceeding those for supporting online transaction processing applications. IS costs in this type of situation normally go up, not down. It is better to plan for this growth ahead of time.

SERVICE LEVELS

Service levels include response time, availability, hours of operation, and disaster recovery coverage. They are not addressed by generic performance indicators such as MIPS or TPC metrics and are not always visible in workload calculations. However, service levels have a major impact on costs.

Many businesses do not factor in service-level requirements when evaluating different types of systems. These requirements need to be explicitly quantified, and the costs for each platform under evaluation should be based on configurations that will meet the required levels.

Response Time

Response time is the time it takes a system to respond to a user-initiated request for an application or data resource. Decreasing response time generally requires additional investments in processors, storage, I/O, or communications capacity, or (more likely) all of these.

In traditional mainframe-based online transaction processing applications, response time equates to the time required to perform a transaction or display data on a terminal. For more complex IS environments, response time is more likely to be the time required to process a database query, locate and retrieve a file, and deliver a document in electronic or hard-copy form. Delivering fast response time according to these criteria is both more difficult and more expensive than it is for traditional mainframe applications.
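One way to make this concrete is to treat response time as the sum of the stages a request passes through. The Python sketch below is a simple illustration; the stage names and timings are assumptions, not figures from the text.

```python
# Illustrative sketch: in a complex environment, response time is the sum of
# several stages rather than a single host transaction time. Stage names and
# timings below are assumptions for illustration, not measured values.

def end_to_end_response_ms(stages_ms: dict) -> float:
    """Sum per-stage latencies (in milliseconds) into an end-to-end response time."""
    return sum(stages_ms.values())

sample_request = {
    "parse and queue request": 50.0,
    "database query execution": 900.0,
    "file location and retrieval": 400.0,
    "format and deliver document": 650.0,
}
total = end_to_end_response_ms(sample_request)
print(f"end-to-end response: {total / 1000:.1f} seconds")
# Each stage is a separate tuning and investment target, not a single host
# transaction time, which is why response-time objectives cost more to meet here.
```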

Availability

Availability is the absence of outages. Standard performance benchmarks, and even detailed measurements of system performance based on specific workloads, provide no direct insight into availability levels. Such benchmarks do not indicate how prone a system will be to outages, nor what it will cost to prevent outages.

Even in a relatively protected data center environment, outages have a wide range of common causes, including bugs in system and applications software, hardware and network failures, and operator errors. When computing resources are moved closer to end users, user error also becomes a major source of disruptions.

Production environments contain large numbers of interdependent hardware and software components, any of which represents a potential point of failure. Even the most reliable system experiences some failures.
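A rough illustration of why this matters: if component failures are assumed to be independent (a simplification), the availability of a service that depends on a chain of components is approximately the product of the individual availabilities. The component names and figures in the Python sketch below are assumptions chosen for illustration.

```python
# Illustrative sketch, assuming independent component failures (a simplification):
# the availability of a service that depends on a chain of components is roughly
# the product of the individual availabilities. Names and figures are assumptions.

from math import prod

component_availability = {
    "server hardware": 0.999,
    "operating system": 0.998,
    "database": 0.998,
    "network": 0.997,
    "application software": 0.995,
}

end_to_end = prod(component_availability.values())
downtime_hours_per_year = (1 - end_to_end) * 24 * 365

print(f"end-to-end availability: {end_to_end:.3%}")
print(f"expected downtime: about {downtime_hours_per_year:.0f} hours per year")
# Five individually reliable components still yield roughly 113 hours of
# downtime per year under these assumptions.
```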

Thus, maintaining high availability levels may require specialized equipment and software, along with procedures that mask the effects of outages from users and allow service to be resumed as rapidly as possible, with minimal disruption to applications and minimal loss of data. The less reliable the core system, the more such measures will be necessary.

Availability can be realized at several levels, and measures such as subsystem duplexing and resilient system designs carry cost premiums. For example, moving from the ability to restart a system within 30 minutes to the ability to restart within a few minutes can increase costs by orders of magnitude.

Hours of Operation

Running multiple shifts or otherwise extending the hours of operation increases staffing requirements. Even if automated operations tools are used, it will usually be necessary to maintain personnel on-site to deal with emergencies.

Disaster Recovery

Disaster recovery coverage requires specialized facilities and procedures to allow service to be resumed for critical applications and data in the event of a catastrophic outage. Depending on the level of coverage, standby processor and storage capacity may be necessary or an external service may be used. Costs can be substantial, even for a relatively small IS installation.

SOFTWARE LOADING

The cost of any system depends greatly on the type of software it runs. In this respect, again, there is no such thing as a generic configuration or cost for any platform. Apart from licenses and software maintenance or support fees, software selections have major implications for system capacity. It is possible for two systems running similar workloads, but equipped with different sets of applications and systems software, to have radically different costs.

For example, large, highly integrated applications can consume substantially more computing resources than a comparable set of individual applications. Complex linkages between and within applications can generate a great deal of overhead.

Similarly, certain types of development tools, databases, file systems, and operating systems also generate higher levels of processor, storage, and I/O consumption.

Exhibit 2 contains a representative list of the resource management tools required to cover most or all of the functions necessary to ensure the integrity of the computing installation. If these tools are not in place, organizations are likely to run considerable business risks and incur excessive costs. Use of effective management tools is important with any type of workload; it is obligatory when high levels of availability and data integrity are required. In a mainframe-class installation, tools can consume up to 30 percent of total system capacity, and license fees can easily run into hundreds of thousands of dollars.

Exhibit 2. A Representative List of Resource Management Tools

System Level
  System Management/Administration

High Availability
  Power/Environmental Monitoring
  Disk Mirroring/RAID
  Fallover/Restart
  Disaster Recovery Planning

Performance Management
  Performance Monitoring/Diagnostics
  Performance Tuning/Management
  Capacity Planning
  Applications Optimization

Network Management
  Operations Management/Control
  Change Management
  Configuration Management
  Problem Management
  Resource Monitoring/Accounting
  Software Distribution/License Control

Storage Management
  Host Backup/Restore
  Hierarchical Storage Management
  Disk Management
  Disk Defragmentation
  Tape Management
  Tape Automation
  Volume/File Management

Operations
  Print/Output Management
  Job Rescheduling/Queuing/Restart
  Resource Allocation
  Workload Management
  Load Balancing
  Console Management
  Automated Operations

Configuration/Event Management
  Configuration Management
  Change/Installation Management
  Fault Reporting/Management
  Problem Tracking/Resolution

Data Management
  Database Administration
  Security

Administrative
  Resource Accounting/Chargeback
  Data Center Reporting
  Statistical Analysis/Report Generation
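The capacity figure cited before Exhibit 2 translates into straightforward arithmetic: if management tools can consume up to 30 percent of total system capacity, the capacity that must be installed to leave a given amount for applications rises accordingly. In the Python sketch below, the application requirement is an assumed figure used only for illustration.

```python
# Rough arithmetic for the tool overhead cited above: if resource management
# tools can consume up to 30 percent of total system capacity, the capacity
# that must be installed to leave a given amount for applications rises
# accordingly. The application requirement is an assumed figure for illustration.

def installed_capacity(app_requirement: float, tool_overhead: float) -> float:
    """Capacity to install so that `app_requirement` remains after tool overhead."""
    return app_requirement / (1 - tool_overhead)

app_requirement = 400.0  # assumed application requirement, in arbitrary capacity units
for overhead in (0.10, 0.20, 0.30):
    needed = installed_capacity(app_requirement, overhead)
    print(f"{overhead:.0%} tool overhead -> install {needed:.0f} units")
# At 30 percent overhead, roughly 571 units must be installed to deliver 400
# units to applications, before any license fees are counted.
```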

EFFICIENCY OF IS RESOURCE USE

Capacity Utilization

Most computing systems operate at less than maximum capacity most of the time. However, allowance must be made for loading during peak periods. Margins are usually built into capacity planning to prevent performance degradation or data loss when hardware and software facilities begin to be pushed to their limits. If the system is properly managed, unused capacity can be minimized.

When planning costs, IS managers must distinguish between the theoretical capacity of a system and the capacity actually used. Failure to make this distinction is one of the more frequent causes of cost overruns among users moving to new systems. Additional capacity may be needed to handle peak workloads. Properly managed disk storage subsystems, for example, maintain high levels of occupancy (85 percent and over is the norm in efficient installations), so there is a close relationship between the data volumes used by applications and the disk capacity actually installed. Inactive data is dumped to tape more frequently, which reduces disk capacity requirements and the corresponding hardware costs. If a system operates less efficiently, capacity requirements, and hence costs, can be substantially higher even if workloads are the same.
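The following Python sketch illustrates the distinction in simple terms: processor capacity is sized for the peak workload plus a safety margin, and installed disk capacity follows from active data volumes and the occupancy level the installation actually achieves. All figures are assumptions for illustration.

```python
# Illustrative sketch of the gap between used and installed capacity. The peak
# workload, headroom margin, data volume, and occupancy targets are assumptions.

def processor_to_install(peak_workload: float, headroom: float) -> float:
    """Installed processor capacity = peak-period workload plus a safety margin."""
    return peak_workload * (1 + headroom)

def disk_to_install(active_data_gb: float, occupancy: float) -> float:
    """Installed disk capacity = active data divided by the occupancy level achieved."""
    return active_data_gb / occupancy

print(f"processor: {processor_to_install(800.0, headroom=0.25):.0f} units for an 800-unit peak")

active_data_gb = 850.0
for occupancy in (0.85, 0.60):  # well-managed versus loosely managed installation
    print(f"at {occupancy:.0%} occupancy: {disk_to_install(active_data_gb, occupancy):.0f} GB of disk")
# The same 850 GB of active data needs about 1000 GB of disk at 85 percent
# occupancy but roughly 1417 GB at 60 percent, so a less efficiently run
# installation pays for more hardware to carry an identical workload.
```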

Consolidation, Rationalization, and Automation

Properly applied, the principles of consolidation, rationalization, and automation almost invariably reduce IS costs. Conversely, an organization characterized by diseconomies of scale, unnecessary overlaps and duplications of IS resources, and a prevalence of manual operating procedures will experience significantly higher IS costs than one that is efficiently managed.

For example, in many organizations numerous applications perform more or less the same function, each with few users relative to the CPU and storage capacity it consumes and the license fees it incurs. Proliferation of databases, networks, and other facilities, along with underutilized operating systems and subsystems, also increases IS costs unnecessarily.

Requirements for hardware capacity can also be inflated by software versions that contain aged and inefficiently structured code. System loading will be significantly less if these older versions are reengineered or replaced with more efficient alternatives or if system, database, and application tuning procedures are used.

Automation tools can reduce staffing levels, usually by eliminating manual tasks. Properly used, these tools also deliver higher levels of CPU capacity utilization and disk occupancy than would be possible with more labor-intensive scheduling and tuning techniques. A 1993 study commissioned by the U.S. Department of Defense compared key cost items for more efficient best-practice data centers with industry averages. Its results, summarized in Exhibit 3, are consistent with the findings of similar benchmarking studies worldwide.

