A data center is a facility that centralizes an organization’s IT operations and equipment, and where the organization stores, manages, and disseminates its data. Data centers house a network’s most critical systems and are vital to the continuity of daily operations. Consequently, the security and reliability of data centers and their information are a top priority for organizations.

Although every data center design is unique, data centers can generally be classified as either internet-facing or enterprise (internal). Internet-facing data centers usually support relatively few applications, which are typically browser-based, and serve many users, most of them unknown. In contrast, enterprise data centers serve fewer users but host more applications, ranging from off-the-shelf to custom-built.

The main purpose of a data center is to run core business or mission-critical applications, store operational data, and provide Disaster Recovery (DR) facilities. Typical applications are enterprise software systems such as Enterprise Resource Planning (ERP) and Customer Relationship Management (CRM) services. Data center architectures and requirements can differ significantly. For example, a data center built for a cloud service provider like Amazon EC2 satisfies facility, infrastructure, and security requirements that differ significantly from those of a completely private data center, such as one built for the Pentagon that is dedicated to securing classified data.

Why data centers are business-critical

Most data center deployments are carried out for the following reasons:
Availability: Maximizing the availability of IT services to the organization.
Business continuity: The redundancy, monitoring, and infrastructure provided by most data centers mean that the potential for business interruption is very low.
Lower total cost of ownership: Where an organization has several silos of data, it can combine resources and reduce the number of separate data servers required. Staff overhead is reduced as administrative operations are simplified, and energy and floor space costs also fall.
Agility: Centralizing IT infrastructure within a data center creates greater agility since new deployments do not have to be rolled out to multiple physical locations.

An effective data center operation is achieved through a balanced investment in the facility and the equipment it houses. The elements of a data center break down as follows:

Facility – the location and the white space, or usable space, available for IT equipment. Providing round-the-clock access to information makes data centers some of the most energy-consuming facilities in the world. Great emphasis is placed on design that optimizes white space and on environmental control that keeps equipment within the manufacturer-specified temperature and humidity ranges.

Support infrastructure – equipment that contributes to securely sustaining the highest possible level of availability.
Some components of the support infrastructure include:
•Uninterruptible Power Supplies (UPS) – battery banks, generators and redundant power sources.
•Environmental Control – computer room air conditioners (CRAC), heating, ventilation, and air conditioning (HVAC) systems, and exhaust systems.
•Physical Security Systems – biometrics and video surveillance systems.

IT equipment – actual equipment for IT operations and storage of the organization’s data. This includes servers, storage hardware, cables and racks, as well as a variety of information security elements, such as firewalls.

Operations staff – to monitor operations and maintain IT and infrastructural equipment around the clock.

Data centers have evolved significantly in recent years, adopting technologies such as virtualization to optimize resource utilization and increase IT flexibility. As enterprise IT needs continue to evolve toward on-demand services, many organizations are moving toward cloud-based services and infrastructure. Attention has also turned to reducing the enormous energy consumption of data centers by incorporating more efficient technologies and practices in data center management. Data centers built along these lines are often called green data centers.

Data Center Solutions

So, what exactly do data centers offer, and what makes them preferable to doing everything in-house? Below is a list of some of the services offered:
•Colocation Services
A colocation data center sits at the opposite end of the spectrum from the in-house data center. Colocation facilities are run by third-party organizations and are multi-tenant, meaning that multiple businesses of any size or industry may house their equipment within the data center. Customers can select from a variety of solutions to accommodate the specific requirements of their business.

•Cloud Services such as IaaS, PaaS, and SaaS

•Disaster Recovery
This is a flexible and scalable service that enables the recovery or continuation of vital technology infrastructure and systems following a natural or human-induced disaster. Disaster recovery focuses on the IT or technology systems supporting critical business functions, as opposed to business continuity, which involves keeping all essential aspects of a business functioning despite significant disruptive events. Disaster recovery is therefore a subset of business continuity.

•Managed Services
A managed service provider is a company or individual that remotely manages a customer’s IT infrastructure. Many small and medium-sized businesses use a managed service provider for web servers and website management.

Each managed service provider tends to specialize in a different area of IT. Some are experts in managed colocation services, whereas others focus more on business connectivity and cloud services.

•High Availability
High availability in the data center refers to systems and components that are continuously operational for a long time. It typically means the systems have been thoroughly tested, are regularly maintained and have redundant components installed to ensure continuous operation.
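The effect of redundant components on availability can be sketched with a little arithmetic. The example below is a minimal illustration, assuming that components fail independently and that the service stays up as long as at least one redundant copy is running. The 99% figure and the function names are illustrative assumptions, not measurements from any particular data center.

```python
HOURS_PER_YEAR = 24 * 365


def parallel_availability(component_availability: float, copies: int) -> float:
    """Availability of `copies` redundant components.

    Assumes independent failures and that a single working copy
    is enough to keep the service running.
    """
    if not 0.0 <= component_availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("at least one component is required")
    # The service is down only when every copy is down at the same time.
    return 1.0 - (1.0 - component_availability) ** copies


def expected_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per (non-leap) year."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    for copies in (1, 2, 3):
        a = parallel_availability(0.99, copies)
        print(f"copies={copies} availability={a:.6f} "
              f"downtime={expected_downtime_hours(a):.2f} h/year")
```

Under these assumptions, a second copy of a 99%-available component raises the combined availability to 99.99%. Real components often share failure causes such as power or cooling, so the independence assumption is optimistic.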

These services and many more are managed by professional IT administrators who stay up to date with the latest technology and ultimately improve a data center’s overall efficiency.

Data center administrators have a long to-do list when it comes to infrastructure monitoring. Server and equipment monitoring, and in some cases mainframe monitoring, is often difficult to juggle, especially in a large data center. But monitoring is an essential task. With the right data, you can increase security and scalability, automate efficiently, and better align resources with capacity needs.

Instead of scrambling to fix a problem after it occurs, data center admins should strive to be proactive, anticipating issues before end users even notice. But that can be difficult to do without the right data center monitoring tools and strategy. Trends such as mobility, virtualization adoption, new and increasing compliance and governance requirements, and the need to modernize existing infrastructure further complicate the management of the IT environment.

Making information readily available across the enterprise requires an enterprise infrastructure that is managed as an integrated whole. The go-to ITSM tool (for example, Opsmart or Remedy) should provide a flexible, scalable, and open solution. The tool should also align ITSM resources to support customer business objectives. These services create a centralized operations and support center, giving the customer a cohesive approach that integrates process optimization, systems development and support, and network and service desk management across the business enterprise.

Moving Forward

With the cloudification of data centers, they are no longer reserved for any specific type of organization; they’ve become accessible to almost anyone. Even where in-house IT operations are feasible, for small and large companies alike, wise IT administrators are choosing to outsource some portions of IT management. In the long run, it saves time, money, and manpower, and it’s much less of a headache.

Service Desk Operations

Operations staff have direct responsibility for the availability of computer services, and they also have direct contact with users and decision makers. One part of operations, the service desk (also known as the “help desk”), is the single point of contact for users to report incidents. Without the service desk, users would contact support staff directly, with no structure or prioritization. As a result, a high-priority incident may be ignored while the staff handles a low-priority incident.
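The prioritization a service desk provides can be sketched as a priority queue. The example below is a minimal illustration, not how any particular ITSM tool works: the `Ticket` and `ServiceDeskQueue` names, the ticket summaries, and the convention that 1 is the highest priority are all assumptions made for this sketch.

```python
import heapq
import itertools
from dataclasses import dataclass, field
from typing import Optional


@dataclass(order=True)
class Ticket:
    priority: int                        # 1 = most urgent (assumed convention)
    sequence: int                        # arrival order, used to break ties
    summary: str = field(compare=False)  # not used when ordering tickets


class ServiceDeskQueue:
    """Single point of contact: every incident enters one ordered queue."""

    def __init__(self) -> None:
        self._heap: list = []
        self._counter = itertools.count()

    def report(self, summary: str, priority: int) -> Ticket:
        """Register a new incident reported by a user."""
        ticket = Ticket(priority, next(self._counter), summary)
        heapq.heappush(self._heap, ticket)
        return ticket

    def next_ticket(self) -> Optional[Ticket]:
        """Return the most urgent open ticket, oldest first within a priority."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)


if __name__ == "__main__":
    desk = ServiceDeskQueue()
    desk.report("Password reset", priority=4)
    desk.report("Payment gateway down", priority=1)
    desk.report("Printer jam", priority=4)
    while (ticket := desk.next_ticket()) is not None:
        print(ticket.priority, ticket.summary)
```

Because every report goes through one queue, the payment gateway outage is handled first even though it was reported after the password reset.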

The IT Infrastructure Library (ITIL) is a set of best practices for effective IT service management (ITSM) that is followed by companies of all sizes.
ITIL enables businesses to handle IT issues and service requests efficiently by assigning clear roles and responsibilities. It helps individuals and organizations to realize business growth and transformation.

ITIL Service Lifecycle

The ITIL service lifecycle consists of five stages. Each stage has a set of ITIL processes, and it is important to understand the purpose of each process before implementing it. However, companies may selectively implement only the processes they need.

• Service strategy – Service strategy involves a clear understanding of customer and market needs. It advocates a long-term, market-driven approach to delivering IT support. Service strategy includes strategy management for IT services, service portfolio management, demand management, financial management for IT services, and business relationship management, which focuses on customer satisfaction.

• Service design – This is a holistic approach to designing a support service. The right service design approach translates to higher customer satisfaction and usability. ITIL service design processes include service level management, service catalog management, availability management and IT service continuity management.

• Service Transition – This stage ensures that changes in the service lifecycle are handled with minimum risk and impact, so that there is minimal or no downtime. Processes here include change management, release management, configuration management (which maintains the configuration management database), and knowledge management.

• Service Operation – ITIL service operation ensures seamless service in day-to-day business activities. Actual delivery and consumption of services happen during this stage, which has a direct impact on the productivity of end users. Typical processes include incident management, request fulfillment, and problem management.

• Continual Service Improvement – CSI aims for process review and improvement throughout the service lifecycle. It applies to all service stages, including strategy, design, transition and operation. Metrics definition and performance review happen at this stage, which helps companies revisit existing processes.

Incident Management

ITIL defines an incident as an unplanned interruption to, or reduction in the quality of, an IT service. The service level agreement (SLA) defines the agreed-upon service level between the provider and the customer. Incident management focuses solely on handling incidents as they occur, and escalating them to the next level where needed, in order to restore defined service levels. Incident management does not deal with root cause analysis or problem resolution. The main goal is to take user incidents from a reported stage to a closed stage.

The figure below depicts a sample incident management flow.

Alerts and incident calls may be related when an incident also triggers alerts on the monitoring tool. However, an alert may also occur on the tool independently, in which case it follows the flow 1 – 3 – 4 – 5 and ends with closure of the ticket raised for that alert.

The other case is when the client or user calls about a service disruption or a service enhancement request. In such cases, the flow 1 or 2 – 3 – 4 – 5 – 6 is followed. Here too, the ticket is closed once the issue is resolved.

Operational incident management requires several key pieces:
1.A service level agreement between the provider and the customer that defines incident priorities, escalation paths, and response/resolution time frames
2.Incident models, or templates, that allow incidents to be resolved efficiently
3.Categorization of incident types for better data gathering and problem management
4.Agreement on incident statuses, categories, and priorities
5.Establishment of a major incident response process
6.Agreement on incident management role assignment
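The SLA terms in the first item of the list above can be sketched as a small lookup table. This is a minimal illustration under stated assumptions: the priority names (P1 to P3), the time frames, and the escalation paths are hypothetical values chosen for the example, not figures from any real agreement.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Tuple


@dataclass(frozen=True)
class SlaTarget:
    response: timedelta                # time allowed for a first response
    resolution: timedelta              # time allowed to restore service
    escalation_path: Tuple[str, ...]   # who is involved, in order


# Hypothetical values for illustration; a real SLA is negotiated
# between the provider and the customer.
SLA_TARGETS = {
    "P1": SlaTarget(timedelta(minutes=15), timedelta(hours=4),
                    ("service desk", "on-call engineer", "incident manager")),
    "P2": SlaTarget(timedelta(hours=1), timedelta(hours=8),
                    ("service desk", "on-call engineer")),
    "P3": SlaTarget(timedelta(hours=4), timedelta(days=3),
                    ("service desk",)),
}


def deadlines(priority: str, reported_at: datetime) -> Tuple[datetime, datetime]:
    """Return (response_due, resolution_due) for an incident."""
    target = SLA_TARGETS[priority]
    return reported_at + target.response, reported_at + target.resolution


def is_resolution_breached(priority: str, reported_at: datetime,
                           now: datetime) -> bool:
    """True once the resolution deadline has passed."""
    _, resolution_due = deadlines(priority, reported_at)
    return now > resolution_due
```

For example, with these assumed targets a P1 incident reported at 09:00 needs a first response by 09:15 and a resolution by 13:00 on the same day.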

Incident statuses

Incident statuses mirror the incident process and include:
•New
•Assigned
•In progress
•On hold or pending
•Resolved
•Closed

The new status indicates that the service desk has received the incident but has not assigned it to an agent.
The assigned status means that an incident has been assigned to an individual service desk agent.
The in-progress status indicates that an incident has been assigned to an agent but has not been resolved. The agent is actively working with the user to diagnose and resolve the incident.
The on-hold status indicates that the incident requires some information or response from the user or from a third party. The incident is placed “on hold” so that SLA response deadlines are not exceeded while waiting for a response from the user or vendor.
The resolved status means that the service desk has confirmed that the incident is resolved and that the user’s service has been restored to the SLA levels.
The closed status indicates that the incident is resolved and that no further actions can be taken.
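The statuses described above can be modeled as a small state machine. The sketch below is one possible reading, not a transition table prescribed by ITIL: the `ALLOWED` mapping, including the choice to let a resolved incident be reopened, is an assumption made for illustration, as are the class and method names.

```python
from enum import Enum


class Status(Enum):
    NEW = "New"
    ASSIGNED = "Assigned"
    IN_PROGRESS = "In progress"
    ON_HOLD = "On hold"
    RESOLVED = "Resolved"
    CLOSED = "Closed"


# Assumed transitions for this sketch; a real service desk tool
# may allow a different set.
ALLOWED = {
    Status.NEW: {Status.ASSIGNED},
    Status.ASSIGNED: {Status.IN_PROGRESS},
    Status.IN_PROGRESS: {Status.ON_HOLD, Status.RESOLVED},
    Status.ON_HOLD: {Status.IN_PROGRESS},
    Status.RESOLVED: {Status.CLOSED, Status.IN_PROGRESS},  # reopen if needed
    Status.CLOSED: set(),  # no further actions can be taken
}


class Incident:
    def __init__(self, summary: str) -> None:
        self.summary = summary
        self.status = Status.NEW
        self.history = [Status.NEW]

    def move_to(self, new_status: Status) -> None:
        """Change status, rejecting transitions the table does not allow."""
        if new_status not in ALLOWED[self.status]:
            raise ValueError(
                f"cannot move from {self.status.value} to {new_status.value}"
            )
        self.status = new_status
        self.history.append(new_status)
```

Encoding the transitions in one table makes rules such as "a closed incident cannot change" explicit and easy to adjust.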

Incident management follows incidents through the service desk to track trends in incident categories and time in each status. The final component of incident management is the evaluation of the data gathered. Incident data guides organizations to make decisions that improve the quality of service delivered and decrease the overall volume of incidents reported. Incident management is just one process in the service operation framework.
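Time spent in each status can be derived from a ticket's status history. The sketch below is a minimal illustration, assuming each ticket carries a chronologically sorted list of (timestamp, status) events and that time spent on hold does not count against the SLA. The function names and sample timestamps are made up for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def time_in_status(events):
    """Total time spent in each status.

    `events` is a list of (timestamp, status) pairs sorted by time.
    Each status lasts until the next event; the final status has no
    duration because nothing follows it.
    """
    totals = defaultdict(timedelta)
    for (start, status), (end, _next_status) in zip(events, events[1:]):
        totals[status] += end - start
    return dict(totals)


def sla_elapsed(events, paused=("On hold",)):
    """Time that counts against the SLA, excluding paused statuses."""
    totals = time_in_status(events)
    counted = (d for status, d in totals.items() if status not in paused)
    return sum(counted, timedelta())


if __name__ == "__main__":
    day = datetime(2024, 1, 1)
    history = [
        (day.replace(hour=9, minute=0), "New"),
        (day.replace(hour=9, minute=10), "Assigned"),
        (day.replace(hour=9, minute=30), "In progress"),
        (day.replace(hour=10, minute=0), "On hold"),
        (day.replace(hour=12, minute=0), "In progress"),
        (day.replace(hour=12, minute=45), "Resolved"),
    ]
    for status, duration in time_in_status(history).items():
        print(status, duration)
    print("Counts against SLA:", sla_elapsed(history))
```

Aggregating these durations across many tickets, grouped by incident category, is one way to surface the trends that incident data is meant to reveal.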

Sources:
Work Experience, Online Learning