Service Desk

Monitoring Engineer

We are looking for a Monitoring Engineer with experience working for a Managed Service Provider and/or a Technical Support Team.

Job Description

Title: Monitoring Engineer (Business Continuity)

Location: Newport or London

Reporting to: Operations Manager

Position Description

The Proactive Monitoring and Business Continuity Team is part of the overall Service Desk team and is primarily responsible for addressing automated alerts relating to the health of customer and internal environments, maintaining uptime and resolving issues before they have any impact. This work is broadly split between proactive alerts (storage, compute, connectivity), Business Continuity (backups, replication, DR), and outage management (network devices going offline or servers down).

Alerts are raised as automated tickets, generated via API calls from the group Remote Monitoring & Management (RMM) platform. Remediation is carried out directly by the engineers, including any customer communications that are required (change requests, general updates, etc.).

For the Business Continuity role, the engineer will be primarily responsible for maintaining the health of customer and group backups (re-running or addressing failed jobs), performing restores, and creating/updating backup schedules.

Key Responsibilities

To manage all alert tickets, ensuring issues are prioritised by age and severity, with particular focus on backup-related alerts:

  • General failed jobs
  • Licensing issues
  • Repository Issues (failing to sync, file corruption)
  • Failed Virtual Exports (e.g. storage related problems)
  • Replication Issues with DR environments

Other:

  • Work with the SysGroup Projects Team in creating new backup schedules as part of onboarding
  • Perform restores upon request, or as part of backup testing
  • Arrange and plan periodic DR tests with customers (powering on servers in DR platforms, triggering failover)
  • Liaise with 3rd Party providers (e.g. Veeam) as part of issue resolution
  • Identify optimisations to the automated alerting criteria to reduce “false positives”
  • Proactively analyse alerting trends via ticket review or report generation, to assist with capacity planning for clients and to highlight recurring issues that need focus
  • Respond to incoming customer telephone calls and support tickets during periods when the group Service Desk is unavailable (i.e. weekend and overnight periods)

Skills & Experience

The ideal candidate will have previous experience working for a Managed Service Provider and/or a Technical Support Team. Proven experience and familiarity are expected in one or more of the following (with working knowledge of as many as possible being advantageous):

Strong experience with Backup Technologies (Veeam, Quest Rapid Recovery, ArcServe Backup Builder)

  • Experience in creating backup schedules, performing restores, and diagnosing and resolving backup failures as part of break-fix support.

Experience with RMM or other monitoring platforms (SolarWinds, Zabbix, Kaseya, Comodo)

  • Familiarity with identifying alerts, navigating the systems, generating reports, and using the ‘remote control’ functionality to access devices.

IT Service Management Processes

  • Understanding of industry-standard terms (e.g. differences between P1, P2 or P3 priorities), exposure to Root Cause Analysis (RCA) and incident management, experience using a ticketing system to log and manage issues.

Network Storage Management (Nimble, NetApp, Compellent)

  • Familiarity with navigating various SANs, reclaiming space, and allocating more storage

Firewall Management (e.g. WatchGuard, SonicWall, Fortinet)

  • Basic understanding of networking, firmware upgrades, creating rules for port numbers/destination addresses, NAT rules, HTTP/S web blockers, and site-to-site VPN creation (Phases one and two).

Router Configurations

  • Setting up PPPoE, firmware upgrades, basic understanding of ADSL, leased lines and MPLS (and the differences between them), general operational knowledge (e.g. knowing the differences between bridge and router mode).

While not specifically required for the Monitoring role, experience in the following areas is desirable:

Office 365 Administration

  • Forcing AD syncs, converting user mailboxes to shared, migrating Exchange/Hybrid mailboxes into O365, creating/administering distribution groups, understanding of licensing, running PowerShell for tasks not possible in the web UI (e.g. calendar permissions).

SharePoint Administration

  • Creating SharePoint sites, changing security permissions, applying best practices, syncing SharePoint sites with File Explorer (OneDrive).

MS SQL

  • General administration and tasks typically associated with DBA duties (local backups, integrity checks, diagnosing errors, upgrades, etc.).

Though not essential, the successful candidate may hold one or more qualifications in appropriate technologies from one of the following organisations:

  • CompTIA
  • Microsoft (Azure / Office 365)
  • AWS
  • Cisco
  • WatchGuard


Interested? Apply now!

Please send your CV and a covering letter to our People & Culture team.
