thingsboard

Commit Graph

Author	SHA1	Message	Date
Oleksii Kuripko	bec05fab53	hold incident resolution timer while services are still failing The resolution countdown now only starts once every failing service has recovered. While any service is still firing the timer is cancelled, so the incident cannot auto-resolve and spawn a new incident for the same ongoing failure between alerts. High latency is a warning signal (no explicit recovery event) and therefore does not block resolution — only FAILING services do.	2 months ago
Oleksii Kuripko	27c2844fec	linkify URL for 'request: Connect to <URL> failed' error variant Spring wraps Apache HttpClient connect failures as 'I/O error on POST request: Connect to <URL> failed: <reason>'. Same treatment as the RestClient variant: the URL is embedded in the 'request' word and the 'Connect to ... failed' restatement is dropped from the visible text.	2 months ago
Oleksii Kuripko	43e87d4588	preserve failure count on service recovery in incident header Recovered services are now shown as ':large_green_circle: <name> (<count>)' with the last failure count that was observed before the service recovered, matching the red-circle format.	2 months ago
Oleksii Kuripko	c87b20e0c2	linkify the URL in HTTP failure messages as a Slack hyperlink Transforms 'request for "<URL>"' into Slack mrkdwn <URL\|request>, so the failure line renders the word 'request' as a clickable link instead of showing the raw URL in quotes.	2 months ago
Oleksii Kuripko	28de53425b	Incident feature for tb-monitoring Adds Slack-API-based incident grouping for the monitoring microservice: alerts that fire within resolution_timeout_s are threaded under a single "Incident" message whose header tracks affected services in real time, and the incident auto-resolves after a quiet period with a final summary. Highlights - Dual Slack modes: when bot_token + channel_id are set, alerts go through chat.postMessage / chat.update with threaded replies; otherwise the existing webhook path is used unchanged. - Incident header shows 🔴 failing / :large_yellow_circle: high latency / :large_green_circle: recovered services with live failure counts and elapsed duration (updated every minute). - Notifications carry structured affected-service data (AffectedService { name, status, failureCount }) so the incident layer no longer parses formatted alert text with regex. - IncidentManager is decoupled from Slack via a small IncidentTransport interface; SlackIncidentTransport adapts SlackApiClient. - PE/other service-key types can plug in a friendly name via the ShortNameProvider interface; TransportInfo implements it. - Config lives under monitoring.notifications.incident.* (enabled, resolution_timeout_s, tag_channel). Slack bot config stays under monitoring.notifications.slack.{bot_token,channel_id}. YAML defaults are authoritative; Spring @Value no longer carries a conflicting fallback. - Concurrency: state and transport I/O run under the manager's monitor. Slack client has explicit 5s call timeouts (set on SlackConfig) so the hold time is bounded. Slack client is closed on PreDestroy. - HTTP failure text is sanitised: HTML response bodies are stripped so Nginx-style error pages don't flood alerts. - BaseMonitoringService splits login / WS connect / WS subscribe into distinct MonitoredServiceKey entries, uses catch(Exception) instead of catch(Throwable), and wraps WsClient in try-with-resources. - Unit tests cover incident lifecycle, status transitions, duration formatting, HTML body stripping, and the ShortNameProvider dispatch.	2 months ago
Viacheslav Klimov	295e7d68c9	Update license headers	5 months ago
Viacheslav Klimov	5b5b4dff6b	Update license header	5 months ago
Sergey Matvienko	f2538117ce	refactoring	6 months ago
Sergey Matvienko	9850e7a466	monitoring: show dashboard link on startup notification. Initial delay set for services to avoid spikes	6 months ago
Sergey Matvienko	4a6177d85e	monitoring service improvements: provisioning dashboard, make monitoring asset and dashboard public, log dashboard url, notify on startup and shutdown, shutdown thread pools on PreDestroy, removed file based logback in favour of stdout, logback severity down to INFO, JAVA_OPTS adjusted	6 months ago
IrynaMatveieva	558a13b5b0	refactored cf output	7 months ago
ViacheslavKlimov	779e2461d8	Fix CoAP monitoring	1 year ago
ViacheslavKlimov	8899a750b1	Fix LwM2M client for monitoring	1 year ago
ViacheslavKlimov	fd66c5f177	CF monitoring fixes	1 year ago
ViacheslavKlimov	602d60281c	Fixes for monitoring	1 year ago
ViacheslavKlimov	32212b9c51	Monitoring: automatic rule chain update	1 year ago
Sergey Matvienko	99d2d1e033	Monitoring COAP Leshan Dependency Upgrade	1 year ago
ViacheslavKlimov	d6a4d454fd	Monitoring: minor refactoring	1 year ago
ViacheslavKlimov	a8a0083bb2	Monitoring: trim device name	1 year ago
ViacheslavKlimov	82d4cb5381	Add monitoring for calculated fields	1 year ago
ViacheslavKlimov	0a14ce3f12	Add EDQS monitoring	1 year ago
nick	0e819850c8	lwm2m: fix bug monitoring	1 year ago
Dmytro Skarzhynets	400e74b00d	Save attributes strategies: BE initial implementation	1 year ago
Igor Kulikov	5cf26d4851	Update license header	1 year ago
Dmytro Skarzhynets	12a8c070a6	Change wording from persistence to processing on backend	1 year ago
Dmytro Skarzhynets	71d43f3af2	Save time series strategies: update configurationVersion for save time series node in rule chain JSONs	1 year ago
Dmytro Skarzhynets	345e423973	Save time series strategies: update time series node config in rule chain JSONs; add null-check for persistence settings	1 year ago
Andrii Shvaika	d92681cf71	Refactoring from multiple fields into one settings object	2 years ago
YevhenBondarenko	637fe2a258	Used debugFailures and debugAll params instead of DebugStrategies	2 years ago
ShvaykaD	54a7400611	Fixed stored rule chain json. Added ignore unknown properties for rule node class	2 years ago
Dmytro Skarzhynets	bbe328d158	Implemented safe scheduled thread pool	2 years ago
ViacheslavKlimov	6efb2ab81f	Add WS subscribe latency monitoring	2 years ago
Andrii Landiak	5ca6ad03e3	CE: optimize java imports	2 years ago
Kulikov	b7891dfdd9	fix_bug: lwm2m monitoring (#11025)	2 years ago
nick	2eee2bc441	fix_bug: monitoring - resources	2 years ago
Kulikov	6c499fd342	LwM2M: default object version (#10731) * lwm2m: object-id-version default 1.0 * lwm2m: delete translate * lwm2m: delete translate3 * lwm2m: object id ver comment4 * lwm2m: object id ver comment5 * lwm2m: object id ver comment6	2 years ago
nick	49320e5651	fix bug: monitoring_lwm2m	2 years ago
ViacheslavKlimov	0e6c8dae06	Increase lwm2m client registration lifetime for monitoring	2 years ago
Ivan Raznatovskyi	09fc025e12	Initial commit from another fork	2 years ago
Igor Kulikov	c5a72ed8df	Update license header to 2024 year.	2 years ago
ViacheslavKlimov	07d3992da4	Monitoring: Main queue by default	2 years ago
ViacheslavKlimov	0812afd0da	Monitoring: ability to specify used queue	2 years ago
ViacheslavKlimov	e9a7bc440e	Monitoring alerts formatting improvements	2 years ago
ViacheslavKlimov	04cab4d5d1	Monitoring service: notification message prefix; minor improvements	2 years ago
ViacheslavKlimov	98a87bfd33	Monitoring: support for dynamic change of load-balancers list	3 years ago
ViacheslavKlimov	24241010b7	Root Rule Chain for monitoring tenant	3 years ago
ViacheslavKlimov	e8ba1e17eb	Monitoring for IPs associated with the domain	3 years ago
ViacheslavKlimov	1b9dfa17c2	Refactoring for latencies tracking	3 years ago
YevhenBondarenko	db6f310284	migration to spring boot 3.1	3 years ago
ViacheslavKlimov	4ae401c54b	Monitoring service refactoring (for compatibility with PE integrations monitoring)	3 years ago
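
Commit c87b20e0c2 describes rewriting 'request for "<URL>"' into the Slack mrkdwn link <URL|request>. The following is a minimal sketch of that rewrite, assuming a single regex is sufficient. The class name SlackLinkifySketch, the method linkify, and the sample message are hypothetical and are not taken from the repository.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; not the code from the commit.
final class SlackLinkifySketch {

    // Matches: request for "<URL>". Group 1 captures the URL between the quotes.
    private static final Pattern REQUEST_FOR_URL =
            Pattern.compile("request for \"([^\"]+)\"");

    // Replaces each match with a Slack mrkdwn link of the form <URL|request>,
    // so the word "request" becomes the clickable link text.
    static String linkify(String message) {
        Matcher matcher = REQUEST_FOR_URL.matcher(message);
        return matcher.replaceAll("<$1|request>");
    }

    public static void main(String[] args) {
        // Sample message is an assumption made for illustration only.
        String raw = "I/O error on GET request for \"http://localhost:8080/api\": Connection refused";
        System.out.println(linkify(raw));
    }
}
```

Messages that do not contain the quoted-URL pattern pass through unchanged, which keeps the transformation safe to apply to every failure line.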


74 Commits (bec05fab535229e712ee92d13a2c7c6ddd8fe650)
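
Commit bec05fab53 states that the resolution countdown starts only once every failing service has recovered, that any new failure cancels it, and that high latency does not block resolution. The sketch below models just that rule with a boolean flag instead of a real scheduled timer. The class ResolutionHoldSketch and its method names are hypothetical, and leaving the countdown untouched on a high-latency alert is an assumption about behaviour the commit message does not spell out.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; the actual IncidentManager is not shown in this log.
final class ResolutionHoldSketch {

    // Services currently in the FAILING state.
    private final Set<String> failing = new HashSet<>();

    // Stands in for a scheduled resolution task: true means "countdown running".
    private boolean countdownRunning;

    // A failure always cancels the countdown, so the incident cannot auto-resolve.
    synchronized void onFailure(String service) {
        failing.add(service);
        countdownRunning = false;
    }

    // The countdown starts only when the last failing service recovers.
    synchronized void onRecovery(String service) {
        if (failing.remove(service) && failing.isEmpty()) {
            countdownRunning = true;
        }
    }

    // High latency is a warning without a recovery event; assumed to change nothing here.
    synchronized void onHighLatency(String service) {
        // intentionally no state change
    }

    synchronized boolean isCountdownRunning() {
        return countdownRunning;
    }

    public static void main(String[] args) {
        ResolutionHoldSketch incident = new ResolutionHoldSketch();
        incident.onFailure("mqtt");
        incident.onFailure("http");
        incident.onRecovery("mqtt");
        System.out.println(incident.isCountdownRunning()); // http is still failing
        incident.onRecovery("http");
        System.out.println(incident.isCountdownRunning()); // all services recovered
    }
}
```

A production version would presumably schedule and cancel a task on an executor where this sketch flips the flag; the flag keeps the rule easy to test without timing dependencies.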