Principal Responsibilities
- Implement, support, and manage the enterprise monitoring tools.
- Provide subject matter expertise in the area of Systems, Application, Storage and Network performance monitoring and management.
- Provide a top-down monitoring strategy for enterprise service visibility that covers cloud and on-prem environments.
- Monitor infrastructure performance and usage to determine when server / VM capacity (CPU, memory, disk, etc.) needs to be increased to meet operational needs.
- Build and maintain real-time and historical performance and availability reports for servers and applications.
- Work with multiple teams and stakeholders to determine the best monitoring approach for systems and applications.
- Contribute to discussions on redesigning and implementing the monitoring environment to modernize it in line with the latest trends.
- Coordinate between support and development teams to ensure effective delivery of monitoring for all services.
- Stay up to date with the latest monitoring technology and trends.
- Define, implement, and maintain system configuration standards.
- Provide on-call nighttime and weekend maintenance and production support, as needed.
- Perform other duties and responsibilities as assigned.
- Participate in outage RCAs and look for and implement ways to reduce future incidents through improved monitoring / optics.
- Perform health checks, QA checks, and configuration checks to ensure the resiliency of the environment.
- Maintain and develop PowerCLI scripts where necessary.
Technical Responsibilities
vROPs
- Configuration : Supermetrics, Views, Dashboards, Reports, Symptoms, Alerts, Policies
- Upgrades : vROPs and Management Packs
- Certificate Replacement
- Work with VMware GS on Service Requests (Troubleshooting / Problem Solving)
- vROPs CA
vRLI
- Configuration : Queries, Dashboards, Alerts
- Upgrades : vRLI and Content Packs
- Certificate Replacement
- Work with VMware GS on Service Requests (Troubleshooting / Problem Solving)
Education and Qualifications / Skills and Competencies :
Bachelor's degree in Computer Science, Engineering, Networking and Security or a related field of study and 3+ years of experience required. The company will also accept a Master's degree and 2 years of experience.
Work Experience :
- Experience must involve 4 years of development (DevOps) experience and 3 years in a monitoring / operational role (vROPs / vRLI / vRNI, Prometheus, Grafana, Netbrain, etc.).
- Designing and maintaining critical system and security infrastructure.
- Datacenter system and security event monitoring and troubleshooting.
- Public cloud system design, implementation, and support.
- VMware vROPs automation and orchestration experience desired.