Our colleagues in the K8S team launched the OVH Managed Kubernetes solution last week, in which they manage the Kubernetes master components and spawn your nodes on top of our Public Cloud solution. I will not describe the details of how it works here, since there are already many blog posts about it (here and here, to get you started). In the Prescience team, we have used Kubernetes for more than a year now. Our cluster includes 40 nodes, running on top of PCI. We continuously run about 800 pods, and generate a lot of metrics as a result. Today, we’ll look at how (...)
At the OVH Observability (formerly Metrics) team, we collect, process and analyse most of OVH’s monitoring data. It represents about 500M unique metrics, pushing data points at a steady rate of 5M per second. This data can be classified in two ways: host or application monitoring. Host monitoring is mostly based on hardware counters (CPU, memory, network, disk…) while application monitoring is based on the service and its scalability (requests, processing, business logic…). We provide this service for internal teams, who enjoy the same experience as our customers. Basically, our Observability service is SaaS with a compatibility layer (supporting InfluxDB, (...)
At the Metrics team, we have been working on time series for several years. In our experience, the data analytics capabilities of a Time Series Database (TSDB) platform are a key factor in creating value from your metrics, and these capabilities are mostly defined by the query languages the platform supports. TSL stands for Time Series Language. In a few words, TSL is an abstraction, in the form of an HTTP proxy, that generates queries for different TSDB backends. It currently supports Warp 10's WarpScript and Prometheus' PromQL query languages, but we aim to extend support to other major TSDBs. (...)
OVH relies extensively on metrics to effectively monitor its entire stack. Whether they are low-level or business-centric, they allow teams to gain insight into how our services are operating on a daily basis. The need to store millions of datapoints per second led us to create a dedicated team to build and operate a product that can handle that load: Metrics Data Platform. By relying on Apache HBase, Apache Kafka and Warp 10, we succeeded in creating a fully distributed platform that handles all our metrics... and yours! After building the platform to deal with all those metrics, our next challenge was to build one of the most needed features for Metrics: Alerting.
The “Tech a Break” show is back for a new season. In this episode, our host Rémy Vandepoel welcomes Steven Le Roux, Technical Director at OVH, in charge of Data Platforms solutions. Through demos and use cases, Steven explains how to use the Logs and Metrics services for Observability, in other words how to structure your data so that you can be proactive and anticipate. Contents: 0:32 / Observability, the concept 1:16 / Logs 3:18 / Dashboard demos 4:12 / Metrics 5:40 / PAD demo 7:30 / Customer cases 9:08 / Next features