It is better to just log the error once if /proc/self/io cannot be opened,
without exposing additional counters.
The error message should contain directions on how to fix the error.
Updates https://github.com/VictoriaMetrics/metrics/issues/42
* linux process metrics: add error handling for metrics from `/proc/self/io`
* linux process metrics: address review feedback
* linux process metrics: restore error handling for self/stat metrics
* linux process metrics: implement self/io error check with `init` function
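The behaviour described above (check `/proc/self/io` once and log a single actionable error) can be sketched roughly as follows. This is only an illustration: `checkProcIO` and `procIOAvailable` are hypothetical names, the hint text is made up, and the sketch uses `sync.Once` instead of an `init` function so that the path can be passed in as a parameter.

```go
package main

import (
	"fmt"
	"log"
	"os"
	"sync"
)

var (
	ioCheckOnce sync.Once // guards the one-time check
	ioAvailable bool      // result of the one-time check
)

// checkProcIO tries to open path and returns an error that tells the
// reader what is affected. The hint wording is hypothetical.
func checkProcIO(path string) error {
	f, err := os.Open(path)
	if err != nil {
		return fmt.Errorf("cannot open %q: %w; process_io_* metrics will be skipped; "+
			"verify that the process is allowed to read this file", path, err)
	}
	return f.Close()
}

// procIOAvailable runs the check on the first call only, so a failure
// is logged once instead of on every metrics scrape.
func procIOAvailable(path string) bool {
	ioCheckOnce.Do(func() {
		if err := checkProcIO(path); err != nil {
			log.Printf("ERROR: %s", err)
			return
		}
		ioAvailable = true
	})
	return ioAvailable
}

func main() {
	// Repeated calls reuse the cached result; at most one error line is logged.
	for i := 0; i < 3; i++ {
		fmt.Println(procIOAvailable("/proc/self/io"))
	}
}
```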
This is needed to support the `net/url.URL.Redacted()` method used in push.go.
This method was added in Go1.15 - see https://tip.golang.org/doc/go1.15
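For reference, a small sketch of what `Redacted()` does: it formats the URL like `String()`, but replaces any password with `xxxxx`. The helper name `redactURL` and the example URL are assumptions made for illustration only.

```go
package main

import (
	"fmt"
	"net/url"
)

// redactURL parses rawURL and returns it with the password, if any,
// replaced by "xxxxx". Requires Go 1.15 or newer.
func redactURL(rawURL string) (string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return "", err
	}
	return u.Redacted(), nil
}

func main() {
	s, err := redactURL("https://user:secret@example.com/api/v1/import")
	if err != nil {
		panic(err)
	}
	fmt.Println(s) // https://user:xxxxx@example.com/api/v1/import
}
```

This makes it safe to include the push URL in log messages.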
This should guarantee that metrics are pushed regularly with the provided interval.
If the remote storage cannot keep up with push frequency, then timeout errors will be logged.
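One way to get this behaviour is to give every push its own deadline, so that a slow remote storage cannot delay the next tick. The sketch below assumes the deadline equals the push interval; `runPushLoop` and `demoPushCount` are hypothetical names and do not mirror the code in push.go.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"
)

// runPushLoop calls push once per interval until ctx is cancelled.
// Each push gets a deadline equal to interval (an assumption of this
// sketch), so a stalled push ends with a logged timeout error.
func runPushLoop(ctx context.Context, interval time.Duration, push func(context.Context) error) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
		}
		// select picks randomly when both cases are ready, so re-check.
		if ctx.Err() != nil {
			return
		}
		pushCtx, cancel := context.WithTimeout(ctx, interval)
		err := push(pushCtx)
		cancel()
		if err != nil {
			log.Printf("ERROR: cannot push metrics: %s", err)
		}
	}
}

// demoPushCount runs the loop with a fake push function that stops
// the loop after three pushes and returns the number of pushes made.
func demoPushCount() int {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	n := 0
	runPushLoop(ctx, 5*time.Millisecond, func(context.Context) error {
		n++
		if n == 3 {
			cancel()
		}
		return nil
	})
	return n
}

func main() {
	fmt.Println(demoPushCount()) // 3
}
```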
This reverts commit 55d5027c97e2484d3e2b63ec6eb54845a8f69227.
Reason for revert: the process_resident_memory_anonymous_bytes and process_resident_memory_pagecache_bytes metrics have been replaced
by a more complete set of metrics in commit 5a49bb8e88e070e43cbffaa68776259f11f6c053:
* process_resident_memory_anon_bytes
* process_resident_memory_file_bytes
* process_resident_memory_shared_bytes
Metrics are:
process_io_read_bytes_total - the number of bytes read via io syscalls such as read and pread
process_io_written_bytes_total - the number of bytes written via io syscalls such as write and pwrite
process_io_read_syscalls_total - the number of read syscalls such as read and pread
process_io_write_syscalls_total - the number of write syscalls such as write and pwrite
process_io_storage_read_bytes_total - the number of bytes read from storage layer
process_io_storage_written_bytes_total - the number of bytes written to storage layer
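A minimal sketch of how these metrics could be derived from the contents of `/proc/self/io`. The field-to-metric mapping below follows the descriptions above and the field meanings in proc(5); treat it as an assumption rather than a copy of the library code. `parseProcIO` is a hypothetical helper.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// fieldToMetric maps /proc/self/io field names to metric names.
// The mapping is an assumption based on proc(5).
var fieldToMetric = map[string]string{
	"rchar":       "process_io_read_bytes_total",
	"wchar":       "process_io_written_bytes_total",
	"syscr":       "process_io_read_syscalls_total",
	"syscw":       "process_io_write_syscalls_total",
	"read_bytes":  "process_io_storage_read_bytes_total",
	"write_bytes": "process_io_storage_written_bytes_total",
}

// parseProcIO parses "name: value" lines and returns the known
// counters keyed by metric name. Unknown fields are ignored.
func parseProcIO(data string) (map[string]uint64, error) {
	metrics := make(map[string]uint64)
	sc := bufio.NewScanner(strings.NewReader(data))
	for sc.Scan() {
		parts := strings.SplitN(sc.Text(), ":", 2)
		if len(parts) != 2 {
			continue
		}
		metric, ok := fieldToMetric[strings.TrimSpace(parts[0])]
		if !ok {
			continue
		}
		v, err := strconv.ParseUint(strings.TrimSpace(parts[1]), 10, 64)
		if err != nil {
			return nil, fmt.Errorf("cannot parse %q: %w", sc.Text(), err)
		}
		metrics[metric] = v
	}
	return metrics, sc.Err()
}

func main() {
	sample := "rchar: 100\nwchar: 200\nsyscr: 3\nsyscw: 4\nread_bytes: 0\nwrite_bytes: 4096\ncancelled_write_bytes: 0\n"
	m, err := parseProcIO(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(m["process_io_storage_written_bytes_total"]) // 4096
}
```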
Log-based histograms provide lower estimation error for the same number of buckets compared to log-linear histograms.
For example, the current Histogram implementation splits each decimal range (10^n .. 10^(n+1)] into 18 buckets.
These buckets have the following bounds:
- for log-linear histogram: (1 .. 1.5], (1.5 .. 2], (2 .. 2.5], ... (9.5 .. 10]
- for log-based histogram: (1 .. 1.136], (1.136 .. 1.292], ... (8.799 .. 10]
The maximum estimated error for log-linear histogram is reached in the first bucket of each decimal range and equals 1.5-1=0.5, or 50%.
The maximum estimated error for log-based histogram is constant across buckets and equals 1.136-1=0.136, or 13.6%.
This means that log-based histogram improves histogram accuracy by up to 50%/13.6% ≈ 3.7 times when using the same number of buckets.
Further reading - https://linuxczar.net/blog/2020/08/13/histogram-error/
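The numbers above can be reproduced with a short sketch. It builds both sets of bucket bounds for the range (1 .. 10] and computes the worst-case relative bucket width, taken here as (upper-lower)/lower. All function names are hypothetical and the code is independent of the Histogram implementation.

```go
package main

import (
	"fmt"
	"math"
)

// bucketsPerDecimal is the number of buckets per decimal range.
const bucketsPerDecimal = 18

// logBounds returns the 19 edges that split (1 .. 10] into 18 buckets
// with a constant upper/lower ratio of 10^(1/18).
func logBounds() []float64 {
	b := make([]float64, bucketsPerDecimal+1)
	for i := range b {
		b[i] = math.Pow(10, float64(i)/bucketsPerDecimal)
	}
	return b
}

// linearBounds returns the 19 edges 1, 1.5, 2, ... 10.
func linearBounds() []float64 {
	b := make([]float64, bucketsPerDecimal+1)
	for i := range b {
		b[i] = 1 + 0.5*float64(i)
	}
	return b
}

// maxRelativeError returns the largest (upper-lower)/lower over all buckets.
func maxRelativeError(bounds []float64) float64 {
	maxErr := 0.0
	for i := 1; i < len(bounds); i++ {
		if e := (bounds[i] - bounds[i-1]) / bounds[i-1]; e > maxErr {
			maxErr = e
		}
	}
	return maxErr
}

func main() {
	fmt.Printf("log-linear: %.3f\n", maxRelativeError(linearBounds())) // log-linear: 0.500
	fmt.Printf("log-based:  %.3f\n", maxRelativeError(logBounds()))    // log-based:  0.136
}
```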
The change covers two things:
1. Remove the summary's per-quantile metrics from the Set.a metrics
list.
2. Register the summary metric and its per-quantile metrics in one step.
This prevents registry corruption when Unregister is called in the
middle of the metric registration process.
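The idea can be illustrated with a toy registry that adds a summary and its per-quantile metrics under a single lock acquisition, and removes them the same way. The `set` type and its methods are hypothetical stand-ins, not the library's `Set`.

```go
package main

import (
	"fmt"
	"sync"
)

// set is a toy registry used only for illustration.
type set struct {
	mu       sync.Mutex
	names    []string            // ordered metric list, a stand-in for Set.a
	children map[string][]string // summary name -> per-quantile metric names
}

func newSet() *set {
	return &set{children: make(map[string][]string)}
}

// registerSummary adds the summary and all of its per-quantile metrics
// while holding the lock once, so no other call sees a partial state.
func (s *set) registerSummary(name string, quantiles []float64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	kids := make([]string, 0, len(quantiles))
	for _, q := range quantiles {
		kids = append(kids, fmt.Sprintf("%s{quantile=%q}", name, fmt.Sprint(q)))
	}
	s.children[name] = kids
	s.names = append(s.names, name)
	s.names = append(s.names, kids...)
}

// unregisterSummary removes the summary together with its per-quantile
// metrics from the ordered list. It reports whether the summary existed.
func (s *set) unregisterSummary(name string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()
	kids, ok := s.children[name]
	if !ok {
		return false
	}
	drop := map[string]bool{name: true}
	for _, k := range kids {
		drop[k] = true
	}
	kept := s.names[:0]
	for _, n := range s.names {
		if !drop[n] {
			kept = append(kept, n)
		}
	}
	s.names = kept
	delete(s.children, name)
	return true
}

// list returns a copy of the ordered metric list.
func (s *set) list() []string {
	s.mu.Lock()
	defer s.mu.Unlock()
	return append([]string(nil), s.names...)
}

func main() {
	s := newSet()
	s.registerSummary("request_duration_seconds", []float64{0.5, 0.99})
	fmt.Println(len(s.list())) // 3
	s.unregisterSummary("request_duration_seconds")
	fmt.Println(len(s.list())) // 0
}
```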