Terry L. Rodgers

(Accepting In-Person & Virtual Presentation Requests)
599 Union St S
Concord, NC 28025
United States
704-942-1185
Region: IV
Honorarium: None

Terry has over 40 years of progressive experience in Critical Facilities operations and management including strategic planning, critical infrastructure design, operations, and commissioning; business protection and recovery; preventive and predictive maintenance; technical training and professional training development.

Terry is an ASHRAE Distinguished Lecturer and a voting member of ASHRAE TC9.9 “Mission Critical Facilities, Data Centers, Technology Spaces, & Electronic Equipment”, ASHRAE SSPC 90.4 “Energy Standard for Data Centers”, ASHRAE SPC-127 “Method of Testing for Rating Computer Room Air Conditioners”, and GPC-1.6P “Commissioning Process for Data Centers”. He is on the Board of Directors of the 7x24 Exchange Carolinas Chapter. He is an active participant in the LLNL Energy Efficient High Performance Computing Working Group. He is on the Board of Directors of the Building Commissioning Association’s Southeast Region and has authored or co-authored books, whitepapers, and presentations on Critical Facilities, facilities management, and formal commissioning. He has developed and taught multiple training classes in the design, construction, operation, and maintenance of critical facilities including commercial nuclear power plants, aerospace facilities, and large data centers.

Terry has performed site reliability assessments for more than 50 sites in over 20 countries across 5 continents. Terry writes a bi-monthly column in Mission Critical magazine called Sustainable Operations. Terry is currently the VP and National Leader for the Commissioning and Building Analytics business unit at Jones Lang LaSalle (JLL).

Topics
Implementing Computerized Maintenance Management Systems (CMMS)
Too often site maintenance programs are implemented independently of the facility construction effort, and in many cases almost as an afterthought. The best approach is to incorporate the design, development, and implementation of the maintenance program into the design, construction, and commissioning of the physical facility. This includes establishing the objectives, requirements, strategies, and performance criteria of the maintenance program in the same site programming effort and Owner’s Project Requirements (OPR) document used to capture and define the same aspects of the physical facility. Similarly, the development of the Basis-of-Design (BOD) document should incorporate the “means and methods” that the maintenance program will employ to meet the OPR maintenance objectives.
Embedding O&M Best Practices in the Owners Project Requirements
Most people take “design” to mean the engineering and design of the physical facility: walls, roof, utilities, systems, equipment, etc. But why stop there? The programming phase should also include defining the Facilities Management (FM) strategies, resources, programs, and deliverables that will be required to properly operate and maintain (O&M) the facility over its long-term lifespan. This is true for three main reasons. First, these Facilities Management resources, programs, and services need to be fully functional on “Day-1”, so their design, development, and deployment need to occur concurrently with the design, construction, and commissioning of the physical facility. Second, many decisions and strategies regarding how a facility will be staffed, operated, and maintained can directly influence how the physical facility is best designed. Third, the design and construction of the physical facility offer many opportunities for synergies in developing and delivering on the Facilities Management requirements.
Facilities Management Staffing & Training
When most people think about the “design” for a facility, they limit this to the engineering and design of the physical facility. But why stop there? The design phase should also include defining the Facilities Management (FM) organization, staffing, and training that will be required to properly operate and maintain (O&M) the facility over its long-term lifespan. As with the infrastructure, you start with the requirements agreed to during the programming phase. Armed with this framework, the owner can solicit proposals, negotiate agreements, and award Facilities Management/O&M contracts and SLAs (service level agreements). The goal should be to get these FM/O&M “stakeholders” involved in the construction project so they can weigh in on the design, support the supervision and oversight of construction and commissioning, and be fully deployed and functional on the “go live”, Day-1 date.
Service Level Agreements
Service Level Agreements, or SLAs, are used extensively in the critical facilities industry. They vary greatly and are used internally within corporations, between landlords and tenants (especially within colocation facilities, or “colos”), and between owners and outside service providers. When written and implemented correctly they are excellent tools for establishing performance standards and measurements to keep critical operations on course. When written poorly they can waste valuable resources. And if not implemented properly they can result in a false sense of security and ugly outcomes.
Site Assessments
Site assessments are an excellent means for gaining an objective appraisal of a facility’s true strengths and weaknesses and can reap a substantial return on investment. Not all site assessments are alike, nor should they be. For example, assessing a previously unknown site as part of a due diligence effort varies greatly from a reliability assessment of a known site. The first order of business is to define the purpose and scope of what to assess. This can have a direct impact on what expertise is required to perform a quality assessment. Assessments can target specific or general aspects of the site. A key criterion to ensure a quality assessment is to assemble a team of professionals with the correct expertise and experience needed for a critical facility.
Commissioning of Facilities Management Processes
Commissioning buildings and facilities has become an accepted practice. Many projects encompass full-service commissioning beginning at the design phase or earlier and continuing through formal acceptance and testing, including closeout documentation and operating staff training. But like the rest of the capital project, commissioning ends when the facility gets accepted by the owner and the project is deemed complete. There is a new trend, however, to extend the concept of “Third-Party Commissioning” to include the commissioning of the facilities management organization and associated operations and maintenance programs. This is not to be confused with what is typically referred to as “continuous commissioning”. Operational commissioning, or OPx, applies the commissioning process to the programs and staff required to manage and execute the day-to-day, week-to-week, and year-to-year activities necessary to ensure sustained operations for the life of the facility. It involves engaging third-party validators to oversee and manage the design, development, implementation, documentation, and staff training on the site-specific facilities management processes and programs.
Existing Building Commissioning
Formal commissioning is now an accepted practice in many construction projects. The commissioning process is described in ASHRAE Guideline 0 “The Commissioning Process”. With owners and developers now recognizing the value of employing formal commissioning to new construction, they are looking to apply these same processes to existing facilities. Is it reasonable to expect the same processes included in commissioning new construction projects to work equally well when commissioning existing facilities? The broad consensus is no, and that’s why ASHRAE is developing a new guideline for commissioning of existing facilities. This document will be called ASHRAE Guideline 0.2 “The Commissioning Process for Existing Building Systems and Assemblies”. The Existing Building Commissioning Process (EBCxP) is fundamentally different from new construction commissioning. With EBCx, the commissioning team is tasked with assessing existing building systems and conditions and comparing them and their capabilities to satisfy the owner’s current needs and requirements, which can differ from the original design.
Performance Based Training Programs
An absolute best practice for critical facilities is a site-specific, performance-based training program. A performance-based training program is more than a set of classes; it is a structured curriculum based on proven instructional strategies and techniques that develop students’ skills and knowledge. It focuses on the specific skills and knowledge the students need to fulfill the duties and responsibilities associated with their job descriptions, using instructional techniques that combine academics, demonstration, and on-the-job supervision. And as with most “best practices”, it requires pre-planning, programming, design and development, implementation, and a continuous improvement process to keep it current and optimized.
Emergency Preparedness for Critical Facilities
Emergency preparedness is being ready for sudden, unexpected events. The key is to expect the unexpected and to focus on the symptoms rather than the causes. The first task should be to identify the potential hazards that a site is reasonably expected to be exposed to, based upon geographical location. There are good reasons that site selection and due diligence are given the importance they are. Next is to categorize the hazards, or “events”, into two categories: events that typically occur with 24 hours or longer advance notice and those that could have less than an hour of advance notice. Hurricanes can be predicted days before they reach a given location, whereas tornadoes may give little to no advance notice whatsoever. Most disasters that occur with 24 hours or more of advance warning also tend to affect a large region. These events can cause widespread destruction and disruption of critical off-site services, such as the electric power grid, municipal water systems, communication systems, and roads and bridges, that can take days or weeks (or in extreme cases, months) to restore. On the other hand, most disasters that occur with little or no warning typically affect a much smaller area. Tornadoes, severe thunderstorms, wind, hail, localized flooding, and hazardous gas or material spills are good examples. As destructive as these events may be, the relatively small area affected allows for a much quicker response and recovery of affected critical services, typically within 24 to 48 hours. However, some events such as a major earthquake have the potential to occur without notice and affect wide areas.
Enterprise Migration to Colocation Data Centers
Many colocation companies build new or expand existing data centers on speculation that demand for their services will continue to grow. Rather than wait a year or more to construct and commission new space, move-in can start as soon as an appropriate site is identified. The corporations and their enterprises can focus on their core business and leave all that facilities management work to the colocation’s staff to worry about. Or can they? Is the impact of an IT outage any less when it occurs in a colocation site than in an owned site? Is the loss of reputation any less? The loss of revenue? So, the downside to moving an enterprise to a colocation facility is the loss of direct control over the quality of design, construction, operations, and maintenance of the data center. The corporate need for reliability and availability is the same, but the means to achieve them are different, and this drives a new set of best practices based on oversight and continuous due diligence. Not all colocation sites are the same. Colocation companies offer a range of capabilities and services. The sites vary in size, technology, staffing, and most every other characteristic. One aspect that they pretty much all have in common, though, is that their corporate bottom line depends on maximizing the rent collected while minimizing expenses. This can present some fundamental conflicts between their goals and objectives and their clients’ expectations for sustained operations. This means that corporations that migrate to a colocation strategy also need to migrate their critical facility focus from an execution perspective to an audit and oversight perspective.
Document Control and Retention Programs
There are industries where typical sites have accurate and comprehensive documentation. Examples include nuclear sites, airline operations and maintenance facilities, pharmaceutical plants, etc. In each case, the site must have a formal document control and retention program. Most facilities approach document management in a less structured, informal manner. The result, in many cases, is that on-site staff struggle to retrieve requested documents; and even then, they are not totally confident that the information is accurate and up-to-date. Competent staff acting responsibly can make human errors due to inaccurate or out-of-date documentation. The key to success is to have a process that keeps critical site documentation up-to-date, organized, and readily accessible to authorized personnel. Equally important is the removal of out-of-date and inaccurate documents from “as-built” status, to be either destroyed or archived as appropriate. This requires a formal document control and retention program supported by corporate policies, program management and oversight, procedures, resources, and trained staff. A formal document management program needs to answer the basic questions of who, what, where, when, and how. Who will be responsible for administering the program and who will have access to each document? What documents need to be controlled? Where will the documents be located (both hardcopy and softcopy)? When do documents have to be created or received, when should they be reviewed or updated, and when do they get destroyed or archived? And how will they be organized, categorized, and kept updated?
Stranding and Recovering Capacity (and Assets)
We often measure data centers and other critical facilities in terms of capacity. Capacity is used to measure power, cooling, and of course space. Saying a facility has 50,000 square feet of computer room space doesn’t really describe its actual capacity. Instead the description needs to include how much load can be supported, such as 50,000 square feet at 100 watts per square foot (or equivalently 50,000 sf and 5 megawatts of critical load). Implied in this description is that there is sufficient HVAC capacity to cool the heat generated by the critical load. The point is that the “rated” capacity of a critical facility assumes there is a match between the physical size, available power, and respective cooling parameters. Otherwise, the actual capacity of the site is limited by the most restrictive parameter (space, power, or cooling). This balancing act is an inherent challenge that each A&E firm meets when they design and engineer a new facility or a modification of an existing facility. One reason this balancing act to match space, power, and cooling capacities is so important is to avoid what has come to be called “stranded” capacity. Stranded capacity is installed capacity that cannot be used to support critical load.
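The arithmetic in this abstract can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the 4,000 kW cooling figure is an assumed value chosen to show stranding, not a number from the presentation. The 50,000 sf and 100 W/sf figures are the ones quoted above.

```python
def rated_critical_load_kw(area_sf, density_w_per_sf):
    """Critical load a room can host at a given design density, in kW."""
    return area_sf * density_w_per_sf / 1000


def usable_capacity_kw(space_kw, power_kw, cooling_kw):
    """Actual capacity is set by the most restrictive parameter."""
    return min(space_kw, power_kw, cooling_kw)


def stranded_capacity_kw(space_kw, power_kw, cooling_kw):
    """Installed capacity in each category that cannot support critical load."""
    usable = usable_capacity_kw(space_kw, power_kw, cooling_kw)
    return {
        "space": space_kw - usable,
        "power": power_kw - usable,
        "cooling": cooling_kw - usable,
    }


# Figures from the text: 50,000 sf at 100 W/sf is 5,000 kW (5 MW).
space_kw = rated_critical_load_kw(50_000, 100)

# Assumed for illustration: power matches the rating, cooling falls short.
stranded = stranded_capacity_kw(space_kw, power_kw=5_000, cooling_kw=4_000)
print(space_kw, stranded)
```

With these assumed inputs, cooling is the limiting parameter, so 1,000 kW of installed power and the equivalent floor space are stranded until cooling is added.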
Optimizing Staff Resources
The staff entrusted to operate and maintain facilities are valuable assets. The best facilities rely not only on people, but on teams with clear direction and objectives, properly provisioned with the tools and resources required. In short, the best-run facilities have established a culture of excellence. Staff take ownership in their work, support their team, and are proud of their accomplishments, large and small, immediate and long-term. Even the best staff will succumb in the face of overwhelming tasks and lack of provisions. The key is to establish clear minimum standards, match the resources to the tasks, and provide leadership and supervision to hold staff accountable for the outcome. A good starting point is clear and reasonable standards. These should stipulate what is considered acceptable performance and become the measuring stick for management to use to judge if the site and staff are performing adequately. There should be a standard for how anomalies and emergencies are handled. Resources and tools are more than staff and wrenches. They include robust programs such as Computerized Maintenance Management Systems (CMMS), a document management and control program, detailed operating procedures, and vendor and contractor management protocols. And just as importantly, the staff need formal, site-specific training and hands-on drills and exercises to ensure they are proficient in using these programs and responding to expected and unexpected events.
Managing Building Monitoring & Control Systems
Perhaps the most critical equipment and systems in any critical facility are the monitoring and control systems. This presentation is on industry “best practices” for managing the operations and maintenance of these critical systems. Most of these systems have extensive capabilities that remain unused unless the site staff has in-house, system-specific expertise (i.e., system administration, programming, and certified-technician trained staff). This is due in great part to how the systems were originally specified in the construction project contract documents, how well the installation, programming, and staff training were executed, and what kind of post-installation vendor support services are retained. For sites that lean toward outsourcing the bulk of their monitoring and controls maintenance, it is still advisable to have at least one in-house expert who can manage the contractor and administer the SLA. The monitoring and control systems should be included in the overall facility maintenance program, and periodic routine tasks should be scheduled and tracked just as other critical system maintenance is. As with other critical infrastructure, monitoring and control system operations and maintenance tasks should be supported by clear and concise standard operating procedures (SOPs). These should include prerequisites such as backing up processes and software prior to making changes (as a contingency back-out plan), authorizations and approvals required before executing risky procedures, and testing of program and software modifications and revisions. The SOPs should also include emergency operating procedures such as how the staff should respond if the front-end fails, if critical field panels fail, if the controls network fails, and perhaps most importantly how the site can be operated manually if the critical controls fail or must be taken off-line for replacement or other reasons.
Operating & Maintenance Costs and Determining Equipment Lifespan
ASHRAE and various other sources have researched and published equipment lifespan tables that give good advice on expected lifespans for typical equipment used in facilities. There are multitudes of competing, conflicting, and in many cases ambiguous parameters that influence decisions regarding when the best time is to replace equipment. Ultimately, the exercise is a financial analysis. Often the responsibility for performing these analyses falls upon a facilities manager instead of a finance wizard. Consideration should be given to capital costs to purchase and replace assets, overall operating costs, expected maintenance costs, and the intangible costs of risk of mission impact. There are the costs for new training of operating staff on the new system technologies, writing new procedures, and establishing new maintenance agreements, spare parts inventory, etc. Understanding the utility rate structure and comparing it to the projected energy profile of the various solutions becomes necessary. There are other financial considerations that need to be taken into account such as depreciation, salvage (or resale) value, corporate tax implications, etc. Maintenance costs are another important consideration not only in comparing the total cost of ownership of various solutions, but also in the timing. This entire exercise needs to happen concurrently with an Equipment Operating Condition Assessment program, so expected remaining lifespan can be quantified. An equipment condition assessment program should gather as much data as possible regarding equipment operating condition. To add even more complexity, the predicted costs should be based on an economic analysis. A quick and easy technique is the “Simple Payback” method. Unfortunately, this ignores the cost of borrowing money, interest rates, inflation, changes in utility rates, depreciation, taxes, tax credits, and a whole host of other financial considerations.
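To illustrate why “Simple Payback” can mislead, the Python sketch below compares it with a basic net present value (NPV) calculation that accounts for the time value of money. The function names and all dollar figures are hypothetical, chosen only for illustration; NPV is shown as one common alternative and is not necessarily the method the presentation recommends. Inflation, taxes, depreciation, and utility rate changes are still left out here.

```python
def simple_payback_years(capital_cost, annual_savings):
    """Years to recover the capital cost, ignoring the time value of money."""
    if annual_savings <= 0:
        raise ValueError("annual_savings must be positive")
    return capital_cost / annual_savings


def net_present_value(capital_cost, annual_savings, discount_rate, years):
    """NPV of equal year-end savings, discounted at a fixed annual rate."""
    discounted = sum(
        annual_savings / (1 + discount_rate) ** year
        for year in range(1, years + 1)
    )
    return discounted - capital_cost


# Hypothetical replacement project: $100,000 cost, $25,000 saved per year.
cost, savings = 100_000, 25_000

print(simple_payback_years(cost, savings))        # 4.0 years
print(net_present_value(cost, savings, 0.08, 4))  # negative at an 8% rate
```

Under these assumed numbers the project “pays back” in four years by the simple method, yet it has not recovered its cost after four years once an 8% cost of money is applied.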
The Value of Good Specifications
“A picture is worth a thousand words.” There is a lot of truth in this old saying, and therefore, when handed a set of construction documents the drawings are typically the first part to be scrutinized. Reading a set of drawings provides a quick understanding of a construction project. But the construction drawings provide only a partial and incomplete picture of the overall project scope and must be viewed in the context of the requirements embedded in the written specifications. What’s fascinating, and quite frankly disappointing, is how often the written specifications get glossed over, at best, if not completely ignored by project stakeholders as they focus on the drawings during design reviews, estimating, and proposal efforts. Considering that in most cases, the specifications take precedence over the drawings in case of conflicts, it is even more surprising how little attention specifications often get. On the other hand, it is understandable. Written specifications are about as dry and unexciting prose as can be written and for large projects can be hundreds or even over a thousand pages in length. In today’s hectic environments where time is of the essence, and deadlines must be met, it is easy to rely on the drawings and assume the specifications will reflect “industry standards”. And this becomes the root-cause of many construction related issues and conflicts as the project progresses.
Managing Critical Operations Via Facility Management Firms
As facility infrastructure matured and evolved over the last few decades, it has progressively become more complex and specialized. For many businesses and organizations, it makes not just common sense, but financial sense to procure rather than hire the specialized facilities management expertise and staff, and to allow the corporation to focus on the core business. But just as with any outsourced function, it is paramount that owners remain diligent and cognizant of how, and how well, their operations are supported. Many if not most critical facilities today have outsourced facilities operations and management functions in lieu of relying on direct employees. These firms have substantial portfolios of sites under contract. In most cases, the firms and their assigned staff become almost indistinguishable from their clients to the outside public. They permanently reside at the client’s site, have client email addresses, and represent the client in many day-to-day meetings and decision making processes. In some cases they have even taken over various procurement functions and manage capital projects. There is still some debate in the industry as to which model is the “best practice”; to manage critical facilities in-house with direct employees or to outsource to facility management firms. Either model can be successful or either can fail. The key to success lies in the execution by the owner to ensure actual performance meets or exceeds expectations. The best tool for measuring the performance of a facility management firm is a site-specific audit that is comprehensive of the contracted and assigned duties and responsibilities.
Integrating IT and Facilities Management Through Work-Flow Processes
Integrating the realms of IT and facilities remains a challenge to many organizations. In organizations where IT and facilities operate as separate entities, the success of the overall site is in many ways determined by the level and effectiveness of the communications and collaboration between these two departments. The inherent challenges (and respective failures) seem to occur at the boundaries, or hand-offs between the two. One proven solution is for the organization to establish a comprehensive work-flow management process. Many organizations use these processes within their respective silo to manage the work they are responsible for. Facility management departments have work order systems, computerized maintenance management systems (CMMS), and other means to assign, track, and close work efforts. Many IT departments also use work order systems that manage the procurement, deployment, and startup of IT equipment. The key to success is to integrate these systems together to develop a single process that combines management of both facility management and IT.
Commissioning Data Centers
ASHRAE Guideline 0 “The Commissioning Process” is widely recognized as the best guide for the commissioning of new construction projects. This guideline is generic in that it does not differentiate between the types of facilities being constructed and is equally applicable to data centers as to commercial office buildings. Data centers have their nuances and challenges that require different strategies and processes than other facilities. In general, it is best to structure testing to start with the small and simple and progress to the large and complex. The commissioning agent should manage the acceptance testing process and associated schedule to ensure that components and equipment are tested prior to testing systems, and that systems are tested and proven prior to proceeding to integrated testing. Acceptance testing of data centers differs somewhat from commercial office buildings and other non-critical facilities in that data centers have a much higher requirement for reliability and availability due to the requirement for continuous operations. Continuous operations require designs and installations that are fault tolerant and allow for concurrent maintainability. This equates to a much higher degree of redundancy, fail-over and backup systems, and more complex automation and associated sequences-of-operations. This obviously means the commissioning agent must have a different mindset than one who is not experienced with commissioning critical facilities. Closeout documentation and staff training are also discussed. So, in short, the commissioning process starts at the very onset of a new construction initiative and ends when the site transitions into operational status. The commissioning agent should be the first one in, and the last one out.
Characteristics of Cultures of Excellence
Over the last few years I have performed in-depth reliability assessments of over 45 sites in 20 countries across 5 continents. What I have seen is a broad cross-section of compliance ranging from marginal to awesome. What I have also noticed is that in almost every case my first impressions based on a familiarization tour and initial staff interviews pan out to be accurate in the long run. There are obvious telltale signs that quickly reveal what the culture is for any given site. General housekeeping and cleanliness, organization, institutional knowledge, and availability of accurate site-specific documentation are just a few aspects that are indicative of how well the site is managed. What is consistent is that in all cases, there is a very high standard of what is considered acceptable, and expectations that all staff will not only comply, but will collectively enforce compliance by others. Insufficient staffing and/or resources inevitably result in a reactionary culture where staff must constantly prioritize tasks and activities and compromise on performance. At first it is the superficial tasks that get deferred (housekeeping, storage and inventory control, document management, non-critical preventive maintenance, etc.), but eventually the standards aren’t met, morale degrades, and pride and ownership dissipate.
The Value of Institutional Knowledge
A frequently undervalued resource in a mission critical operation is what is commonly referred to as “institutional knowledge”. This is a term describing the staff’s intangible understanding and proficiency gained through site-specific experience. In general, it cannot be taught or acquired from outside. It is acquired through the staff’s history with the site and accumulated knowledge gained through experiences and associated lessons learned. One example of where the value of institutional knowledge becomes obvious is when management decides to outsource facilities staff and associated duties and responsibilities. In most cases, many of the existing employees are retained by the new firm to prevent the loss of institutional knowledge. First, retaining institutional knowledge requires the organization to develop a culture that values, encourages, and rewards knowledge sharing. The organization should provide cross-training and mentoring and avoid developing single “specialists” who are the only employees capable of (or at least allowed to perform) a critical task or activity. The culture should also encourage and reward staff who take the initiative and expend the effort to learn the details and intricacies associated with the site systems, equipment, and controls. Second, the organization must implement and maintain formal document management systems that capture, compile, organize, and archive critical information and documents and make them accessible to all authorized staff. This can create a formal means to transfer institutional knowledge from employees’ memories to something that the organization owns and manages. Third, and perhaps most importantly, the organization must provide an environment and working conditions that result in keeping the best staff and minimizing attrition. This not only means good pay and benefits, but a work environment where staff like what they do, where they do it, and who they do it for. They should feel ownership and pride in their work and in the site.
Designing for the Worst Case
Advance consideration and planning during the project programming and engineering phase to clearly define the entire realm of operating conditions will result in the best design. Not only should safety margins be included in sizing equipment, systems, and sequences-of-operations to meet maximum loads, but similar safety margins should be included to accommodate smaller than anticipated day-1 loads. It is my experience that far fewer facilities ever reach their ultimate maximum loads than see significantly smaller day-1 loads than were predicted and programmed for. In some cases, the worst case is the day-1 low load condition, which requires the over-sized supporting critical infrastructure to operate outside of its normal operating range. There is perhaps no industry where this is truer than the data center and datacom industry. Data centers are typically designed for ever-growing load profiles. Rated design loads can be an order of magnitude or more greater than those expected on day-1. Further compounding this dilemma is the need for redundant infrastructure. Whereas most electrical systems can be “turned-down” and operate at very low loads (though at reduced efficiencies), mechanical systems are less accommodating, and frequently cycling large equipment on and off can become physically stressful for it. And of course, the resulting energy efficiency on day-1 can be horrendous.
The Value of Corporate Standards
Most companies and corporations have corporate standards that address fiduciary and fiscal duties, responsibilities, and protocols, etc. They have standards addressing the execution and governance of their core business and how the company interacts with regulators, labor unions, and other companies. They have standards for contracting and procurement practices, etc. as well as for human resources with regard to hiring and retention, workplace behaviors, even dress codes. But surprisingly it is often the case that they have no corresponding standards for the design, construction, operations, or maintenance of their critical facilities, or that the standards that exist are either inadequate or unenforced. Standards are more than directions for managing businesses and processes. They establish the minimum acceptable performance that must be achieved to ensure the respective activity meets the fundamental needs of the company to succeed. Standards also have value in that they not only set minimum performance levels, but also standardize the associated processes. But this can be a double-edged sword. In today’s day and age, where change is the new constant, corporate standards must also evolve to remain appropriate.
Differentiating Critical vs Non-Critical
It’s a simple question: how do you define “mission critical”? But the answer can be a bit complicated. As facility owners, operators, designers, and engineers, we need to consider what aspects of our businesses, enterprises, and facilities are truly critical to the missions we are tasked to perform. Most data centers are considered “mission critical”. But some enterprises now have such inherent reliability in their enterprise software and site failover capability that an entire data center can fail, and the “mission” is essentially unaffected. So, are their data centers still truly “mission critical”? On the other hand, typical office buildings are generally considered non-critical. But what if the people working in a particular office environment are performing business functions that when interrupted have an impact on the corporate “mission”? Is this building still non-critical? Let’s complicate the discussion further. Many work functions associated with typical office environments can be interrupted for a brief time without a noticeable impact to a corporation’s bottom-line. The same holds true for facility support systems. Now consider an extended outage lasting hours or days. Can the business continue to support its mission when these ancillary “office” functions are halted? It is probable that eventually the loss of these functions will result in a significant business impact (or why else would they even exist?). The answer requires a thorough understanding of not only what the mission is, but also of which elements are essential to support the mission, how their loss impacts the mission, and how long it takes for the impact to occur.
Integrated Training Program Development
Most if not all facility managers agree that staff require training if they are expected to perform their duties and assignments competently. This is even more true for critical facilities where there are often significant site nuances due to complex system topologies, multiple layers of redundancy, and equally complex sequences-of-operations in the building automation systems. Couple this with a low tolerance for human error and the need for training becomes obvious. It is no coincidence that most of the best run sites also have excellent training programs. Developing, implementing and maintaining a training program requires first and foremost a corporate commitment to ensure the training initiative receives the requisite resources, attention, and level-of-effort required. It also requires leadership with a good understanding of how a training program differs from what often is a set of disjointed training classes. The best training programs are structured and customized to match the needs of the students. In critical facilities the “students” are the employees and staff assigned to perform the day-to-day operations and maintenance of the critical infrastructure. The fundamental goal of this training is to produce staff who can perform their assigned duties consistently, safely, and in a quality manner. Unlike typical academics, this training needs to be what is called performance-based training.
Commissioning Team Roles & Responsibilities
Commissioning is a quality control process performed by a team with representatives from all the major entities involved in a construction or upgrade project. The process is led by the Commissioning Agent, but many, if not most of the activities are performed by the other team members. For the process to work correctly, it is imperative that each participant understand his or her specific role and purview. Formal, full life-cycle commissioning begins during the initial programming phase and continues through the design, construction, startup and acceptance testing, and extends into the initial operation of the facility or infrastructure and includes delivering comprehensive site as-built and record documents as well as operations staff training. The commissioning team should have representatives from the owner, commissioning agent, architect and engineering firms, the owner’s operating staff, the general contractor, and the major subcontractors, vendors, and suppliers. Some of these team members need to be onboard at project inception including the owner’s critical stakeholders, the commissioning agent, and the architects and engineers. Other team members may join the commissioning team later as the project progresses from design to construction including the general contractor, subcontractors, vendors and suppliers. The earlier the operating staff can be assigned and brought onboard the better. Perhaps the most important role of all is that of the site owner. The more engaged and active the owner remains in the commissioning process the better. For commissioning to work effectively, the owner must understand the commissioning process, what it is as well as what it is not, and demonstrably support the commissioning agent. The owner must also ensure the roles, responsibilities, activities, and deliverables of all project participants are clearly defined and specified in binding contracts and agreements. 
After all, the purpose of commissioning is at least in part to ensure the owner gets what was paid for.
Design Reviews Save Money and Improve Quality
There are few things more frustrating over the course of a construction project than to realize that the built facility does not meet the fundamental requirements set forth at the onset of the project. The obvious solution is to avoid these situations by “doing it right the first time”, and that’s where thorough, comprehensive, and integrated design reviews demonstrate their value. Most design reviews (other than peer reviews) are performed by members of the project team. Again, these design reviews are not all the same. Each entity has a different focus on what to review and more importantly, what to look out for. In many cases the owner may not have experienced staff in the design, engineering, and construction industry; and even when they do they have a different perspective than the general contractor, commissioning agent, and their facilities management partners. Owners tend to focus on the building aesthetics, occupant needs, work flow, lighting, and other aspects that have a direct impact on how the facility will be used. The facility management partner’s perspective is more focused on how the facility will be operated and maintained. They focus on the core infrastructure systems including mechanical, electrical and plumbing (MEP) systems, life safety and fire suppression systems, controls, audio-visual and physical security systems, etc. A good practice that should be considered is to bring on a qualified general contractor (GC) during the design phase to participate in design reviews for overall constructability. An alternative is to hire a GC specifically for “pre-construction” services in lieu of hiring them for the entire project prior to having a bid-set of construction documents. The commissioning agent (CxA) should also perform a design review focused on “commissionability”. The commissioning agent should ensure the design includes the appropriate quality control aspects. Specifications are boring, tedious, and extensive. 
They also usually take precedence over the drawings in the case of a conflict. Any design review that ignores or just glosses over the project specifications is an incomplete review.
Reliability vs Complexity
If you look at the evolution of critical facilities, you will see a fairly consistent increase in complexity as the requirements and expectations of sustaining continuous operations became more and more demanding. In response to the ever-increasing industry expectations of 7x24xforever continuous operations, the critical facility design firms looked to innovation to come up with better and more resilient designs. More research was performed, and the results determined that most of the unanticipated outages and mission impacts were not due to equipment or system failures. In most cases the root cause wasn’t even directly related to the infrastructure at all. It was due to “human activities”, with the preponderance being directly related to “human error”. What had occurred was that the engineering and design community, in their quest for ever more reliable infrastructure and topologies, had introduced so much complexity that they exceeded the capabilities of most site operations and maintenance staff to understand, manage, and operate the sites when the inevitable anomalies and failure scenarios materialized. In essence, the required facilities management processes did not keep pace with the increasing site complexities. The Achilles Heel is that when the infrastructure fails to perform as expected, the emergency responders require higher technical expertise to the point of becoming specialists, and very few sites have the necessary specialists on staff. Even the manufacturers, vendors, and local technical representatives are being challenged in keeping fully abreast and competent in understanding and delivering optimized system performance for their products. There is still much wisdom in the well-known phrase “keep it simple”.
Equipment Startup and Checkout
Commissioning is a process intended to verify and validate that a project delivers what is required and expected. As with most processes it is paramount that each step, phase or activity is begun and completed sequentially to keep the process flowing smoothly. At a high level, this is obvious. You shouldn’t start the design phase until the programming and requirements definition phase is complete and you shouldn’t start construction until the design is complete, etc. This strategy holds true for acceptance testing as well. Acceptance testing should progress from factory testing, to equipment receipt and installation and continue through startup, followed by performance testing, and culminate in integrated systems testing (IST). Performance testing should proceed from verifying simple components before testing assembled equipment, and testing equipment before proceeding to systems testing, and not attempting to test system interfaces until all respective systems have been validated. When commissioning is executed consistent with these strategies, each activity builds increasing confidence that the ensuing activities will be successful and keep the identified discrepancies to a manageable level. The converse is also true. When commissioning and/or acceptance testing fails to follow these simple strategies, the result is often adverse impacts on schedule, budgets, and quality. Unfortunately, this happens far too often even on projects where the participants are seasoned veterans of major construction projects where formal commissioning is the norm. What causes the commissioning process to get compromised? There is no one answer. In many instances, it is a combination of competing influences and poor assumptions that result in decisions to eliminate steps or to attempt to schedule actions that should be sequential into concurrent/parallel paths. Common sense would find that as scheduled activities slip the overall project timeline should also slip. 
But in many cases the project team begins trying to “compress” the schedule rather than delay project completion. The first things to go are the “contingency” times. After that the project team begins looking for what activities can be performed in parallel vs. in series. This often results in incomplete equipment startup and checkout with equipment and systems being declared ready for functional testing when in fact they are not ready, and in some instances, aren’t even safe for operation.
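The sequencing rule described above can be sketched as a simple gate: a stage may start only when every earlier stage is complete. This is an illustrative sketch only; the stage names follow the order given in the text, while the function name and the example status are hypothetical.

```python
# Acceptance testing stages in the order described in the text.
STAGES = [
    "factory testing",
    "receipt and installation",
    "startup",
    "performance testing",
    "integrated systems testing",
]

def blocking_stages(stage, completed):
    """Return the earlier stages not yet complete; an empty list means OK to start."""
    idx = STAGES.index(stage)
    return [s for s in STAGES[:idx] if s not in completed]

# Hypothetical project status: startup and checkout has not been finished.
completed = {"factory testing", "receipt and installation"}

blockers = blocking_stages("performance testing", completed)
if blockers:
    print("Not ready for performance testing; incomplete:", ", ".join(blockers))
else:
    print("Ready for performance testing")
```

A schedule that runs these stages in parallel effectively bypasses this gate, which is how equipment ends up declared ready for functional testing when it is not.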
Integrating Facility Staff and Building Management Systems (BMS)
Most critical facilities demand continuous operations for extended periods typically measured in years if not decades. One characteristic of many successful critical facility management organizations is the interdependence and integration of the functions and capabilities of the facilities staff and the site building management system (BMS) and other associated monitoring & control systems. One of the most important functions of the facilities staff is to monitor and verify that the BMS and related control systems perform correctly and to initiate manual intervention when control systems fail. BMSs and their site-specific deployments vary greatly across facilities in general. Some are designed for basic monitoring and control with only the most critical parameters being monitored/controlled such as equipment on/off status and “summary” alarm monitoring. At the other extreme, a BMS can acquire massive amounts of data from connected equipment and sensors to the point where operating staff can be affected by “information overload”. The best practice isn’t one vs the other, but a well thought out strategy that matches the site staff resources and capabilities with the deployed system. Likewise, the BMS can provide some oversight of the overall performance of the site staff. An obvious example would be when the BMS alarms when site staff initiate erroneous actions such as operating the wrong breaker or valve. Facilities should consider and strategize how to best deploy and manage their facilities staff in conjunction with the site monitoring and control systems. Both are indispensable in meeting the challenge of providing continuous operations. In some cases, the staff supplement the capabilities of the BMS and vice versa, and in other aspects they provide oversight and performance measurement of the other’s effectiveness including intervening when the human or BMS actions fail.
The Value of Factory Witness Testing
The manufacture of most equipment includes various levels of quality control. Most large infrastructure equipment such as chillers, generators, and UPS systems are subjected to some manner of operational testing prior to shipment. Even so, it is not uncommon for manufacturing defects, assembly errors, and other discrepancies to slip through. Specifying more stringent and project specific inspections, certifications and testing is referred to as Factory Acceptance Testing. When the Owner also requires these tests and inspections be witnessed by their project team this becomes known as Factory Witness Testing (FWT). There are many benefits to requiring FWTs but like pretty much everything else, there are associated costs. The decision to include FWTs in the purchase of equipment should be based on what value-add the FWT provides, the risks of not doing FWTs, and the costs (in both time and money) of performing the FWTs. In general, requiring FWTs for simple, mass produced and standardized products with short or no lead times would be overkill. On the other hand, complex, customized and long lead time equipment can be good candidates for FWTs. The value of FWTs depends upon how well the FWT is planned, specified, and executed. It depends on who attends, what capabilities the testing facility has, and the stringency of the tests performed. In large part the success is dependent upon how well the project team’s expectations are communicated to the supplier, manufacturer, and the testing staff to ensure everyone agrees with the testing goals and objectives, acceptance criteria and/or certifications, and durations.
Site-Specific Industry Best Practices
The Mission Critical industry is well known for advocating the use of industry “best practices”. Most owners and operators will claim they use or otherwise comply with these practices. On the other hand, there is no formal, published set of best practices that are universally recognized within the industry. This is further complicated by the introduction of new technologies and capabilities available for use in this constantly evolving industry, and the resulting impact this can have on how critical facilities are designed, built, and operated. So, how are best practices determined? In many cases owners will rely on engineering firms, professional societies, facilities management firms and consultants to inform them of what the current industry best practice is. These entities will often cite independent, 3rd-party sources such as ASHRAE, IEEE, TIA/Telcordia, and others as the “recognized” authority on best practices. This oversimplifies the process and can result in decisions that may not be in the owner’s best interest. The reality is that industry best practices vary based upon site-specific needs and conditions including the site’s mission, location, staff, budget, and preferences.
True Innovation - Solid State Switching
It is extremely rare to watch a truly innovative product revolutionize an industry. The development of the Atom Switch and associated product line based on solid-state switching may very well be one of these rare events. The Atom Switch is a circuit breaker that uses semiconductors to interrupt and switch the flow of current. The big deal is that the solid-state switch with integral high-speed current sensing can detect a fault within a sub-cycle and open approximately 16,000 times faster than a mechanical breaker. The result is the elimination of associated low-impedance arc-flash hazards which typically cause the highest destruction. So what else is special about the Atom Power solution? Well, how about the following capabilities (and remember we are comparing this to a typical electrical distribution panel with “dumb” breakers):
Individually programmable and adjustable so you can change the time-current curve and breaker settings. An entire facility coordination study can be done on a laptop and then downloaded through the Atom OS Network, and all the breaker settings and configurations get implemented without leaving the office.
Can be commanded open or closed either local-manually, remote-manually, or automatically. Combining the operation of a pair of Atom Switches allows these “breaker pairs” to function as automatic transfer switches (ATSs) with inherent arc-flash mitigation.
Can be customized to perform soft-starting/soft-loading.
Can perform motor overload protection.
Incorporates high-speed data sensing which can be monitored, alarmed, trended, etc.
Mitigates arc-flash hazards.
The Inefficient Pursuit of Energy Efficiency
Thirty years ago, when I first started working in data centers, energy efficiency wasn’t a concern. It was maybe 10 years ago when the CFOs began to realize the data center electric bills were becoming on par with, and even surpassing, the costs of IT equipment. And in fairly short order the industry became energy conscious. Facilities started measuring their energy use and efficiency and the term “PUE” (power usage effectiveness) was coined. The industry-wide pursuit of efficiency was on! At first it was easy to find opportunities to improve energy efficiency. There was low-hanging fruit everywhere. Today it is safe to say most of the low and even middle-hanging fruit has been picked. Where in the not so distant past we sought out products and designs promising significant improvements in efficiency, we are now pursuing the remaining smaller opportunities. This continuing pursuit of the most efficient product and facility has in some (maybe many) instances resulted in increased complexities that add serious risk to overall reliability, and ironically even to being able to “optimize” operations to achieve the promised efficiencies. The continuing pursuit of energy efficiency improvements is and should remain an inherent goal for most if not all facilities. This is even more true for data centers where even very small improvements in efficiencies can result in large energy savings due to the enormous IT loads coupled with 7x24xforever operations. But realizing these improvements requires products, designs, and infrastructure that can be consistently deployed as intended, can be understood by reasonably experienced and trained staff, and do not induce operational risk due to overly complex solutions. They need to be intuitive, user-friendly, and fail-safe.
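To see why even a very small efficiency improvement matters at data center scale, consider a rough calculation. PUE is the ratio of total facility energy to IT equipment energy. The sketch below assumes a constant IT load running all year; the 10 MW load and the PUE values are hypothetical figures chosen only for illustration.

```python
HOURS_PER_YEAR = 8760  # continuous 7x24 operation, non-leap year

def pue(total_facility_kw, it_kw):
    """Power usage effectiveness: total facility power divided by IT power."""
    return total_facility_kw / it_kw

def annual_savings_kwh(it_kw, pue_before, pue_after):
    """Energy saved per year by a PUE improvement at a constant IT load."""
    return it_kw * (pue_before - pue_after) * HOURS_PER_YEAR

# Hypothetical example: 10 MW of IT load, PUE improved from 1.40 to 1.38.
savings = annual_savings_kwh(it_kw=10_000, pue_before=1.40, pue_after=1.38)
print(f"Annual savings: about {savings:,.0f} kWh")
```

Under these assumptions a PUE change of only 0.02 is worth roughly 1.75 million kWh per year, which is why the smaller remaining opportunities are still pursued.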
Integrity (How to Respond When Things Go Wrong)
Integrity is often used to describe people and their traits. Some descriptions of integrity are exhibiting conduct that conforms to accepted standards of right and wrong, remaining truthful even when it means taking responsibility for failures, and being faithful to high moral standards. A prerequisite to integrity is to have established norms, standards, behaviors, etc. that everyone agrees are generally appropriate and applicable to all situations. Integrity is also a term that can be used when describing physical assets, processes, structures, and even organizations. When assessing a building, a dam, a bridge or structure we evaluate the physical aspect of integrity. When attributing integrity to processes and organizations, a different set of examples and characteristics come to mind. Processes that have integrity must be thorough and complete, include checks and balances, and be repeatable, verifiable, and supported by clearly defined procedures. They need to include an enforcement aspect and be founded on acknowledged best practices. But more importantly, they need to articulate their purpose and goals, what defines acceptable compliance, and a means for being audited or otherwise validated. Integrity can also be used when describing companies. A company with integrity is one that has both employees who act with integrity, and processes with integrity. It takes both. A company that employs only honest and truthful staff with high moral values but lacks clarity on overarching principles, and deploys ambiguous processes, may produce inconsistent and non-compliant products and results. And likewise, a company that relies on untrained or even worse, unscrupulous staff, will also lack integrity regardless of how structured and complete their processes are. Integrity is demonstrated by how we respond to, and how we behave when we make mistakes or encounter the unexpected. 
The key to integrity is to be objective and truthful about failures that occur, i.e., accept the failure for what it is, and respond appropriately based on the overarching universal principles of right and wrong.
Effective Communication
Between edge-computing and the Internet of Things (IoT) we are continuing to bring more technology, automation, and information to the individual everywhere. Information overload is not only an ever-present danger, it is the new norm. It’s like everyone and everything wants your attention all the time. The same can be true for managing facilities. The amount of data generated, and information communicated on a continuous basis is astounding. The documentation aspect of managing facilities has evolved as well. Even the site-specific documentation is evolving in previously unexpected ways. A bar code and a tablet can put all the relevant documentation and information in a technician’s hands for the respective equipment he needs to address. Access to necessary information is less of a problem now. The new problem is identifying the necessary information from all the extraneous data that inundates us. The key is to communicate effectively. Effective communication occurs when the message sent is the same as the message received and in a timely manner. Research has categorized various barriers to effective communication. A more recent barrier is related to “technological multi-tasking and absorbency”. This phenomenon occurs when someone is in a state of cognitive multi-tasking as they receive a near constant barrage of unrelated reminders, alerts, messages and information vying for attention. The problem with multi-tasking is that none of the tasks get 100% attention. Multi-tasking coupled with information overload is a recipe for human error. Work environments, especially for critical facilities, need to be conducive to effective communication. In many cases, this needs to be a situational awareness issue. When a critical or important message or conversation needs to occur, unrelated sources of distraction need to be eliminated or mitigated. 
It should be clear to all involved that there is a distinct difference between informal and social conversations vs formal or official communications. It becomes increasingly valuable to solicit clear feedback from your audience on what message, directive, or other communication was received to ensure it matches with the message that was sent. This is the key to effective communication.
ASHRAE Standards Development Process
Industry relies on industry standards to establish the minimum requirements necessary to design, build and operate facilities. Some of these standards get adopted by federal, state and local governments through legislation as enforceable codes. So where do these standards come from and how do they get developed? Naturally, first you need a standard for developing standards. This is the job of the American National Standards Institute, or ANSI. To many people’s surprise, ANSI does not develop standards. Instead, ANSI facilitates the development of standards by accrediting the processes used by organizations to develop standards related to their respective purviews and community needs. Once accredited by ANSI, these organizations become “standards development organizations”, or SDOs. The role of ANSI is to ensure standards are developed in an open and objective manner that “ensures that all interested and affected parties have an opportunity to participate in a standard’s development” and that the development process is open, balanced, based on consensus and follows due process. This presentation will describe the ANSI standard development process in general, how this process is used by ASHRAE, and will use the recent development of ASHRAE Std 90.4 Energy Standard for Data Centers as a case study. It culminates with a discussion on how individuals can participate in the development of standards via the public review and comment process.