Remote Management

"It's late on Sunday evening and you're about to settle down to watch the big film when you receive a message that the system has gone down." It's a common style of catch line for vendors of remote access solutions, who present their box as a means of avoiding a late-night journey to site to fix a problem.

The right technology is fundamental.

Like many things, there is a significant element of truth in this, but the reality is that it has never been harder to manage system problems effectively. The right technology is fundamental, but many other factors are just as important. System faults and hardware failures are ultimately inevitable, and dealing with them has to be planned for before they happen. Whilst the threat of global terrorism may be somewhat hyped, real events such as the recent oil terminal fire near Hemel Hempstead show that full disaster recovery is an issue that does have to be properly considered.

Compliance with legislation.

Legislation such as Sarbanes-Oxley in the USA and Basel II within Europe lays down standards for corporate governance that mean system integrity, data protection and failure recovery procedures are now obligatory rather than merely common sense and desirable. However, changes in technology and in the way that computing is used within organisations mean that, for an IT or system manager, this presents significant challenges to both budgets and resources.

Cost of on site and fast response maintenance.

When computing was largely mainframe based, management of support and backup was much more straightforward. Most of the equipment was extremely expensive compared to labour costs. Hardware such as dumb terminals was often replicated across a single site in large numbers. This made the cost of on-site, fast-response maintenance a sensible percentage of the overall annual cost of running the systems. The advent of minicomputers created new applications and even took over the role of mainframes in some instances, but the systems were still generally centralised and based on common hardware within an individual user environment.

The pace of change within the PC and peripheral industries.

Today's systems are much more varied and physically distributed. As market prices have tumbled, many larger companies no longer consider PCs and peripherals to be major capital purchase items, even if there is a corporate policy on servers. This has resulted in local purchasers buying a wide variety of brands and in small subnets being installed across an organisation. The pace of change within the PC and peripheral industries has meant that there is little chance of maintaining a simple stock of appropriate spares for a large organisation.

Value of remote management.

The old established concept of maintenance being a fixed percentage of the hardware value can no longer be easily applied. Equipment is simpler, and the relatively high cost of specialist technicians compared to hardware values often makes it impossible to justify sending an engineer to site for a fix. If it's impractical to send out engineers and users are unable to carry out extensive diagnostics for themselves, then the value of remote management becomes clear. Once the fault or cause of a problem is established, sending a pre-configured or plug-and-play device in the post may sometimes be the best option.

Management and diagnostic equipment

Just as it is harder to justify the cost of engineers travelling to site, so the cost of management and diagnostic equipment is something that many users are reluctant to allocate already stretched budgets to. Totally ignoring this aspect of management will ultimately come back to bite. It is universally agreed that downtime costs money. Inadequate equipment and resources to monitor systems, diagnose faults and implement backup and recovery options will ultimately mean longer downtime and far greater cost. In the real world, however, the equipment has to be low cost and, where possible, perform other value-added functions to justify its existence. The sheer expense of a dedicated workstation-based management system, such as a full HP OpenView implementation, is well beyond the needs and budgets of most users.

Justifying their initial cost

However, the need to detect and deal with a system lock-up, such as the infamous "Blue Screen of Death" associated with Microsoft-based systems, is real and brings us back to the argument of the film on Sunday night. KVM technology has long been used as a cost-effective means of managing and operating servers. New products such as the Black Box Wizard IP provide a means to access these through the Internet. A user with a broadband connection can now see a server from their own home as if sitting in front of the local keyboard, saving that late-night trip to the computer room. KVM solutions with integrated IP access are also now available; the Black Box ServSwitch CX is a good example. This means that such products are not just expensive test equipment waiting for a crisis but working systems that justify their initial cost.

Look at each aspect individually

Unfortunately, a single box isn't the whole answer. Planning and integrating system control and recovery should be a design philosophy from the outset. The solution should deliver a means to access systems remotely and diagnose the problem. It should also enable reconfiguration, or a fix that allows failed hardware to be taken out of service while it is repaired and brought back on line once reinstalled. The solution needs to be flexible and modular, allowing implementation in stages or integration within an existing network environment.

Whilst the requirements should be considered as a whole, it is convenient to look at each aspect individually.

Uninterruptible Power Supplies

We have all heard the example of a cleaner disconnecting a vital system to plug in a vacuum cleaner. Both central and remote servers should be fed through a UPS (Uninterruptible Power Supply). These should have a secure mains connection that cannot suffer random disconnection. Sometimes cycling the power is the only means to reboot locked-up equipment, so individual feeds need to be capable of being switched or cycled. Power strips such as the ServPower freelancer from Black Box are now available with built-in intelligence and a network connection. Remote access via dial-up or the Internet is a further possibility. Single devices can be reset or a whole bank of equipment power cycled.
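As a rough illustration of resetting a single device or a whole bank, the sketch below models a switched power strip in memory. The PowerStrip class, the outlet names and the power_cycle helper are all hypothetical and do not represent the interface of any real product; a real strip would be driven through whatever management interface its vendor provides.

```python
from dataclasses import dataclass, field


@dataclass
class PowerStrip:
    """Hypothetical in-memory stand-in for a network-managed power strip."""
    outlets: dict = field(default_factory=dict)  # outlet name -> True (on) / False (off)
    log: list = field(default_factory=list)      # record of every switching action

    def switch(self, outlet, on):
        if outlet not in self.outlets:
            raise KeyError(f"unknown outlet: {outlet}")
        self.outlets[outlet] = on
        self.log.append((outlet, "on" if on else "off"))


def power_cycle(strip, outlets, wait=lambda: None):
    """Switch the listed outlets off, pause once, then switch them back on in order."""
    for outlet in outlets:
        strip.switch(outlet, False)
    wait()  # assumption: in real use this would be a delay such as time.sleep(10)
    for outlet in outlets:
        strip.switch(outlet, True)


strip = PowerStrip(outlets={"server-1": True, "router-1": True})
power_cycle(strip, ["server-1"])         # reset a single locked-up device
power_cycle(strip, list(strip.outlets))  # power cycle the whole bank
```

Keeping a log of each switching action is a design choice made for this sketch, so that a remote operator can confirm afterwards what was actually cycled.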

Console Servers

Whilst KVM technology with remote access will enable management connection to equipment other than just servers, another option is to employ dedicated Console Servers. These provide a common access method to the management or configuration ports on equipment such as communications systems, switches and routers. Access can be through the network or via an out-of-band connection such as a modem or a dedicated DSL link. One of the basic concepts of any management system is that its access must be independent of the network or system you are trying to fix.
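The independence principle can be expressed as a simple check: a management path is only useful if it does not rely on the component that has failed. The following is a minimal sketch; the path names and their dependencies are invented for illustration.

```python
def usable_paths(paths, failed):
    """Return the names of access paths that do not rely on any failed component."""
    return [name for name, depends_on in paths.items()
            if not set(depends_on) & set(failed)]


# Hypothetical access paths and the components each one depends on.
paths = {
    "in-band LAN": ["core switch", "firewall"],
    "dial-up modem": ["phone line"],
    "dedicated DSL": ["DSL link"],
}

print(usable_paths(paths, failed=["firewall"]))
# ['dial-up modem', 'dedicated DSL']
```

With the firewall down, only the two out-of-band routes remain, which is why at least one such route is worth planning for.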

Use a Black Box ServSensor

Environmental monitoring of parameters such as temperature, humidity and unauthorised access is especially important at remote locations, where it can warn of the possibility of trouble before a disaster arises. Systems such as the Black Box ServSensor can be used not only to monitor changes but also to turn on equipment such as extra fans in the event of overheating. Alternatively, this could be controlled as a management response through power strip management, depending on the requirements of a given system. This is an example of where the whole arena needs to be considered, not just a single test function.
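To illustrate how a temperature reading might drive an extra fan, here is a minimal sketch of threshold control with hysteresis. The function names and the 30 °C and 26 °C thresholds are assumptions chosen for the example, not values taken from any particular sensor product.

```python
def fan_should_run(temp_c, fan_on, on_above=30.0, off_below=26.0):
    """Decide the fan state from the latest reading, with hysteresis.

    The fan starts above `on_above` and only stops again below `off_below`,
    so a reading hovering around one threshold does not toggle it repeatedly.
    """
    if temp_c > on_above:
        return True
    if temp_c < off_below:
        return False
    return fan_on  # between the thresholds: keep the current state


def replay(readings):
    """Apply the rule to a series of readings, starting with the fan off."""
    state, history = False, []
    for temp_c in readings:
        state = fan_should_run(temp_c, state)
        history.append(state)
    return history


print(replay([24.0, 29.0, 31.0, 28.0, 25.0]))
# [False, False, True, True, False]
```

The gap between the two thresholds is the design choice that matters: without it, a room sitting at almost exactly the trigger temperature would switch the fan on and off continually.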

A double-edged sword of easy remote access

The double-edged sword of easy remote access is that it can leave a back door open to the heart of the IT system. All of the devices mentioned as solutions here have high levels of intrinsic security, and many can link to other devices such as RAS servers or similar security functions. Firewalls and intrusion detection devices also need to be part of the overall management solution. Apart from anything else, if the firewall is the problem, how do you access the system to check whether this is the case?

Virtual Media Support

A software problem will often require a new patch to be loaded on to a remote computer. A newer technology called Virtual Media Support can transfer files to a remote system as if they were being loaded locally. Combined with remote access KVM technology such as the Black Box ServSwitch CX KVM switch, it provides a means to change and upgrade systems without security risks.

Multiple disc based data storage systems

Data storage is a further area that requires consideration for diagnostic access. It has long been established that data backups need to be regular and systematic, with the data held off site. Large drive arrays such as RAID systems can now carry on operating without loss of data in the event of a drive failure. However, unless a defective drive is detected and replaced, a whole system may be wiped out if a second disc goes down. Multiple disc based data storage systems with inbuilt protection and recovery methods are now entirely practical, with storage costs at an all-time low per megabyte.
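The danger of an undetected drive failure can be made concrete with a small sketch. It assumes an array that tolerates the loss of one drive, as with a mirrored or single-parity layout; the function name and status labels are illustrative only.

```python
def array_status(failed_drives, tolerated_failures=1):
    """Classify a redundant array by how many of its drives have failed.

    `tolerated_failures` is how many drives the array can lose without
    losing data (assumed to be 1 by default in this sketch).
    """
    if failed_drives == 0:
        return "healthy"
    if failed_drives <= tolerated_failures:
        return "degraded"  # still serving data, but with reduced or no redundancy
    return "failed"        # more drives lost than the array can tolerate


for failed in (0, 1, 2):
    print(failed, array_status(failed))
# 0 healthy
# 1 degraded
# 2 failed
```

The "degraded" state is the one that needs to raise an alarm: the array still appears to work normally, so nothing prompts anyone to replace the drive unless the state is actively reported.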

Alarms forwarded to defined addresses

Many of these systems can provide alarm outputs, eliminating the need for constant attention. The new Mutiny system from Black Box is an example of a low-cost hardware unit that will monitor the status of all devices within a network and forward alarms to defined addresses as required. Alarms can be prioritised to ensure that the most urgent and significant issues are highlighted. They can be sent as emails, or as text messages where SMS messaging is available.
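Prioritising and forwarding alarms can be sketched in a few lines. The severity names, the Alarm record and the routing rule below (critical alarms go by SMS where available, everything else by email) are assumptions made for illustration and do not describe how any particular monitoring product behaves.

```python
from dataclasses import dataclass

# Assumed severity scale: a lower number means more urgent.
PRIORITY = {"critical": 0, "warning": 1, "info": 2}


@dataclass
class Alarm:
    device: str
    severity: str
    message: str


def route(alarms, sms_available=True):
    """Return (channel, alarm) pairs with the most urgent alarms first."""
    ordered = sorted(alarms, key=lambda alarm: PRIORITY[alarm.severity])
    routed = []
    for alarm in ordered:
        urgent = alarm.severity == "critical"
        channel = "sms" if urgent and sms_available else "email"
        routed.append((channel, alarm))
    return routed


alarms = [
    Alarm("printer-3", "info", "toner low"),
    Alarm("raid-1", "critical", "drive failed, array degraded"),
    Alarm("ups-2", "warning", "running on battery"),
]
for channel, alarm in route(alarms):
    print(channel, alarm.device, alarm.message)
```

Falling back to email when SMS is unavailable means that an urgent alarm is still delivered by some route rather than being dropped.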

The bottom line

The bottom line is that there is equipment out there that will help manage modern complex distributed systems. It can be implemented as building blocks as required, added to existing systems and won't break the budget. It may require some thought up front, but the day-to-day management is starting to get a little easier for the first time in a long while.

