Techniques for HA IT Management

  1. 7. Techniques That Address Multiple Availability Requirements
    1. Redundancy
      1. Hardware Redundancy Examples
      2. Software Redundancy Examples
      3. Environmental Redundancy Example
      4. Critical Success Factors
    2. Backup of Critical Resources
      1. Methods of Backup
      2. Hardware Backup Examples
      3. Software Backup Examples
      4. IT Operations Backup Examples
      5. Critical Success Factors
        1. Currency of backup
        2. Automated updating of backups
        3. Isolation of backup from primary
        4. Backup and restore procedure review and testing
        5. Generations of backups
        6. Integrity verification
    3. Clustering
      1. Comparing Clustering and Redundancy
      2. Hardware and Software Clustering Examples
      3. IT Operations Clustering Examples
      4. Environmental Clustering Examples
      5. Critical Success Factors
        1. Automatic load-sharing
        2. Physical separation of clustered components
    4. Fault Tolerance
      1. Hardware Fault Tolerance Examples
      2. Software Fault Tolerance Examples
      3. Environmental Fault Tolerance Examples
      4. Critical Success Factors
    5. Isolation or Partitioning
      1. Hardware Isolation Examples
      2. Software Isolation Examples
      3. Other Benefits of Isolation
        1. Minimize risk of changes
        2. Reduce resource contention
        3. Maximize resources
        4. Simpler systems management procedures
      4. Critical Success Factors
    6. Automated Operations
      1. Console and Network Operations Examples
      2. Workload Management Examples
      3. System Resource Monitoring Examples
      4. Problem Management Applications
      5. Distribution of Resources Example
      6. Backup and Restore Examples
      7. Critical Success Factors
    7. Access Security Mechanisms
      1. Steps to Secure Access
        1. Step 1: Identify the person requesting access
        2. Step 2: Verify the identity
        3. Step 3: Control access
        4. Step 4: Monitor all activities
      2. Types of Security
        1. Physical security
        2. Network security
        3. Application security
        4. Computer resource security
      3. Password Management
        1. Step 1: Enforce password selection guidelines
        2. Step 2: Expire passwords regularly
        3. Step 3: Expire assigned passwords on first use
        4. Step 4: Disable user accounts after successive invalid password attempts
        5. Step 5: Educate users on how to protect their password information
      4. Critical Success Factors
    8. Standardization
      1. Hardware Standardization Examples
      2. Software Standardization Examples
      3. Network Standardization Examples
      4. Processes and Procedures Standardization Examples
      5. Naming Standardization Examples
      6. Critical Success Factors
      7. Transitioning to Standardization
    9. Summary
  2. 8. Special Techniques for System Reliability
    1. The Use of Reliable Components
      1. Techniques for Maximizing Hardware Component Reliability
        1. Choose components with low failure rates
        2. Choose components that have high MTBF
        3. Purchase from reputable suppliers
        4. Use technical specifications as a gauge
        5. Choose products with fewer parts or greater integration
        6. Avoid newly developed products whenever possible
        7. Follow maintenance schedules diligently
      2. Techniques for Maximizing Software Component Reliability
        1. Avoid using "Version 1" and "Beta" software
        2. Don't use shareware or freeware
        3. Buy industry-standard software from reliable vendors
        4. Prior to installation, test for viruses
        5. Provide menus and other ways to control user inputs
        6. Reuse bug-free components or modules
        7. Test programs thoroughly
        8. Run "beta tests" with a controlled set of users
        9. Install the latest application software fixes judiciously
        10. Install the latest device drivers when available
        11. Upgrade to newer operating systems with caution
        12. Minimize the use of system utilities
      3. Personnel-Related Techniques for Maximizing Reliability
        1. Ensure high-quality user training
        2. Ensure quality training of support staff members
        3. Be wary of contractual hires
      4. Environment-Related Techniques for Maximizing Reliability
        1. Install Automatic Voltage Regulators (AVRs)
        2. Use adequate air-conditioning equipment
      5. Some Reliability Indicators for Suppliers
        1. Time in business
        2. Quality certification
        3. Industry awards
        4. Peer recommendation
        5. Warranty and support
    2. Programming to Minimize Failures
      1. Correctness
        1. Ensure user requirements are adequately determined
        2. Prototype the application prior to detailed coding
        3. Revalidate user requirements midway through the project
        4. Beta test prior to wide-scale deployment
      2. Robustness
        1. Test against out-of-bounds values
        2. Trap errors and prevent them from propagating
        3. Anticipate external changes
      3. Extensibility
        1. User changes
        2. System platform changes
        3. Regulatory changes
        4. Budgetary changes
        5. Business volume changes
        6. Business demand changes
        7. Generous database field sizes
        8. Design with overcapacity
        9. Place constant values in a look-up table
      4. Reusability
    3. Implement Environmental Independence Measures
      1. Use Power Generators
      2. Use Independent Air-Conditioning Units
      3. Use Fire Protection Systems
      4. Use Raised Flooring
      5. Install Equipment Wheel Locks
      6. Locate Computer Room on the Second Floor
    4. Utilize Fault Avoidance Measures
      1. Analyzing Problem Trends and Statistics
      2. Use of Advanced Hardware Technologies
      3. Use of Software Maintenance Tools
    5. Summary
  3. 9. Special Techniques for System Recoverability
    1. Automatic Fault Recognition
      1. Parity Checking Memory
      2. ECC Memory
      3. Data Validation Routines
    2. Fast Recovery Techniques
    3. Minimizing Use of Volatile Storage Media
      1. Regular Database Updates to Central Storage
      2. Automatic File-Save Features
    4. Summary
  4. 10. Special Techniques for System Serviceability
    1. Online System Redefinition
      1. Add or Remove I/O Devices
      2. Selectively Power Down Subsystems
      3. Commit or Reject Changes
    2. Informative Error Messages
      1. Use Standard Corporate Terminology
      2. Adopt Terms Already Used by Common Applications
      3. Tell What, Why, Impact, and How
      4. Implement Context-Sensitive Help
      5. Give Options for Viewing More Detailed Error Information
      6. Make Error Information Available After the Error Has Been Cleared
    3. Complete Documentation
      1. Have a Manual of Operations On Hand
      2. Write Basic Problem Isolation and Recovery Guides
      3. Provide System Configuration Diagrams
      4. Label Resources
      5. Provide a Complete Technical Library
    4. Installation of Latest Fixes and Patches
    5. Summary
  5. 11. Special Techniques for System Manageability
    1. Use Manageable Components
      1. Simple Network Management Protocol (SNMP)
      2. Common Management Information Protocol (CMIP)
      3. Desktop Management Interface (DMI)
      4. Common Information Model (CIM)
      5. Wired for Management (WfM)
    2. Management Applications
      1. Systems Management Issues
        1. Deployment
        2. Operations
        3. Security
      2. Automated Systems Management Capabilities
      3. System Management Applications and Frameworks
        1. Unicenter TNG (Computer Associates)
        2. Tivoli (IBM)
    3. Educate IS Personnel on Systems Management Disciplines
      1. Business Value of the Information System
      2. Value of Systems Management Disciplines
      3. Principles of Management
      4. Basic Numerical Analysis Skills
    4. Summary
  6. 12. All Together Now
    1. The Value of Systems Management Disciplines
    2. Which One First?
    3. Analyze Outages
    4. Identify Single Points of Failure
    5. Exploit What You Have
    6. An Implementation Strategy
    7. Summary
Original source: https://www.cnblogs.com/dhcn/p/10236030.html