Effective Data Services relies on us discovering and addressing issues proactively, before our clients become aware of them. This requires both data experience and the use of monitoring and alerting.
The following are some of the typical events we monitor:
- Data has not been updated, even though no error occurred.
- Data integration is taking progressively longer and needs to be optimised before the business is impacted.
- Poorly written SQL by a self-service user is compromising performance.
- Cloud costs are growing and the solution needs to be optimised.
- An ETL trigger has been disabled by mistake.
- A process has failed to complete and prior alerts have been overlooked.
- Resource consumption is excessive and the service is at risk of being compromised.
- An error has occurred and the ETL needs to be recovered from the current state.
- A reporting dataset has failed to refresh on schedule.
- A report is not being used and can be decommissioned.
- Proactive maintenance of the service is not running.
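
Two of the events above, data that silently stops updating and an integration that gradually slows down, lend themselves to simple automated checks. The following is a minimal sketch in Python, not a description of any particular monitoring product. The function names, the 24-hour freshness window, and the 1.5x slowdown tolerance are all illustrative assumptions; in practice the timestamps and run durations would come from your own load audit tables or pipeline run history.

```python
from datetime import datetime, timedelta, timezone
from statistics import mean


def is_stale(last_updated, max_age, now=None):
    """Return True when the data is older than the allowed age.

    Catches the case where no error was raised but the data
    simply stopped updating. All datetimes must be timezone-aware.
    """
    now = now or datetime.now(timezone.utc)
    return now - last_updated > max_age


def is_slowing(previous_minutes, latest_minutes, tolerance=1.5):
    """Return True when the latest run took more than `tolerance`
    times the mean of previous runs (hypothetical threshold)."""
    if not previous_minutes:
        # No history yet, so there is nothing to compare against.
        return False
    return latest_minutes > tolerance * mean(previous_minutes)


if __name__ == "__main__":
    # Illustrative values only; real inputs would be read from run logs.
    last_load = datetime.now(timezone.utc) - timedelta(hours=30)
    if is_stale(last_load, max_age=timedelta(hours=24)):
        print("ALERT: data has not been updated in the last 24 hours")
    if is_slowing([10, 12, 11], latest_minutes=20):
        print("ALERT: integration is running slower than usual")
```

Checks like these are typically scheduled independently of the pipeline they watch, so that a disabled trigger or a job that never starts is still detected.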
