The Alert24 server agent — the same lightweight script you install for CPU, memory, disk, and service-status monitoring — can also search your logs. Matching happens entirely on your server. Only match counts and up to 5 sample lines per search are sent to Alert24. Your raw logs never leave the host: no log shipping, no central log store, and no per-GB ingestion pricing.
When a rule fires, the matched sample lines are attached to the incident, so on-call sees what's actually in the log — not just "error count exceeded."
Plan requirement: Server agents are included on every plan (3 on the free plan, 5 per unit on paid plans). Log search monitoring requires a paid plan.
Agent update required: Log search is part of newer agent builds. If log_searches is ignored, update the agent first (see Troubleshooting).
How it works
- You define one or more log searches in the agent config (a glob path or system-log source plus a regex pattern).
- Each interval, the agent reads only the new lines added since the previous run, matches them against your patterns, and counts the hits.
- The agent sends the counts (and up to 5 sample matched lines) to Alert24.
- Alert rules evaluate those metrics and open or resolve incidents automatically.
The agent reads incrementally and handles log rotation, so it never re-alerts on historical logs. On first run it establishes a baseline at the current end of each file — it does not alert on lines written before the search existed.
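The incremental read described above can be sketched roughly as follows. This is an illustration only, not the agent's actual implementation: FileCursor, read_new_lines, and count_matches are hypothetical names, rotation is detected with a simple inode and size comparison (which assumes a platform where st_ino is meaningful), and only the file source is covered.

```python
import os
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class FileCursor:
    """Hypothetical per-file state: which file was read and how far."""
    inode: int
    offset: int


def read_new_lines(path: str, cursor: Optional[FileCursor]):
    """Return (complete lines added since the previous run, updated cursor)."""
    stat = os.stat(path)
    if cursor is None:
        # First run: baseline at the current end of the file, report nothing.
        return [], FileCursor(stat.st_ino, stat.st_size)
    # A different inode or a shrunken file is treated as a rotation.
    rotated = stat.st_ino != cursor.inode or stat.st_size < cursor.offset
    start = 0 if rotated else cursor.offset
    with open(path, "rb") as f:
        f.seek(start)
        data = f.read()
    # Keep only complete lines; a partial last line is picked up next run.
    end = data.rfind(b"\n") + 1
    lines = data[:end].decode("utf-8", errors="replace").splitlines()
    return lines, FileCursor(stat.st_ino, start + end)


def count_matches(lines, pattern, max_samples=5):
    """Count matching lines and keep at most max_samples of them."""
    regex = re.compile(pattern)
    matched = [line for line in lines if regex.search(line)]
    return len(matched), matched[:max_samples]
```

Calling read_new_lines with cursor=None returns no lines, which mirrors the baseline behavior: lines written before the search existed are never counted.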
Configuring log_searches
Add a log_searches array to the agent's JSON config. Each entry needs a unique name, a source, and source-specific fields.
"log_searches": [
{ "name": "app_errors", "source": "file", "path": "/var/log/myapp/*.log", "pattern": "ERROR|FATAL" },
{ "name": "nginx_5xx", "source": "file", "path": "/var/log/nginx/access.log", "pattern": "\" 5\\d\\d " },
{ "name": "ssh_failures", "source": "journald", "unit": "sshd", "pattern": "Failed password" },
{ "name": "failed_logins", "source": "windows_event", "channel": "Security", "event_ids": [4625] },
{ "name": "system_errors", "source": "windows_event", "channel": "System", "min_level": "error" }
]
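Before deploying, a config like the one above could be sanity-checked with a short script. The sketch below is not an Alert24 tool: check_log_searches is a hypothetical helper, and the required-field sets are simply transcribed from the field tables in this article.

```python
import json

# Required fields per source, taken from the field tables in this article.
REQUIRED_FIELDS = {
    "file": {"path", "pattern"},
    "journald": {"unit", "pattern"},
    "windows_event": {"channel"},
}


def check_log_searches(config_text: str) -> list:
    """Return a list of readable problems; an empty list means none found."""
    config = json.loads(config_text)
    searches = config.get("log_searches") if isinstance(config, dict) else None
    if not isinstance(searches, list):
        return ["log_searches must be a top-level array"]
    problems = []
    seen_names = set()
    for index, entry in enumerate(searches):
        name = entry.get("name")
        if not name:
            problems.append(f"entry {index}: missing name")
        elif name in seen_names:
            problems.append(f"entry {index}: duplicate name {name!r}")
        seen_names.add(name)
        source = entry.get("source")
        if source not in REQUIRED_FIELDS:
            problems.append(f"entry {index}: unknown source {source!r}")
            continue
        for field in sorted(REQUIRED_FIELDS[source] - set(entry)):
            problems.append(f"entry {index}: source {source!r} requires {field!r}")
    return problems
```

Because the text is parsed with json.loads, a pattern with unescaped backslashes or quotes fails at this step rather than after the agent has been restarted.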
Build your config interactively → Use the Log Search Config Builder to pick a source, test your pattern against sample lines, add alert rules, and copy ready-to-paste JSON — all in your browser.
Source: file (Linux, macOS, Windows)
Searches plain-text log files.
| Field | Required | Description |
|---|---|---|
| name | Yes | Unique identifier referenced by alert rules. |
| source | Yes | "file". |
| path | Yes | File path or glob (e.g. /var/log/myapp/*.log). All matches are read. |
| pattern | Yes | Regex evaluated against each new line. Counts the lines that match. |
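Because pattern is a regex inside a JSON string, double quotes and backslashes must be escaped. For example, to match an nginx access-log 5xx status, the regex is a double quote, a space, 5\d\d, and a trailing space. In JSON the quote becomes \" and each backslash is doubled, so the value is written "\" 5\\d\\d ".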
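One way to avoid hand-escaping is to let a JSON library produce the string. The sketch below uses Python's json and re modules; the two access-log lines are made up for illustration, and Python's regex flavor is assumed to be close enough to the agent's for a pattern this simple.

```python
import json
import re

# The regex as the matcher should see it: a double quote, a space, a 5,
# two digits, and a trailing space.
pattern = r'" 5\d\d '

# json.dumps applies the JSON escaping: " becomes \" and \ becomes \\.
encoded_pattern = json.dumps(pattern)
print(encoded_pattern)  # prints "\" 5\\d\\d "

# Parsing the JSON restores the original regex.
decoded = json.loads(encoded_pattern)

# Made-up access-log lines, for illustration only.
bad_gateway = '203.0.113.7 - - [10/Oct/2024:13:55:36 +0000] "GET /api HTTP/1.1" 502 157'
ok = '203.0.113.7 - - [10/Oct/2024:13:55:37 +0000] "GET /api HTTP/1.1" 200 512'

matches_5xx = re.search(decoded, bad_gateway) is not None
matches_200 = re.search(decoded, ok) is not None
print(matches_5xx, matches_200)  # prints True False
```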
Source: journald (Linux)
Searches the systemd journal — no file logging configuration required.
| Field | Required | Description |
|---|---|---|
| source | Yes | "journald". |
| unit | Yes | The systemd unit to read (e.g. sshd, nginx, myapp). |
| pattern | Yes | Regex evaluated against each new journal entry's message. |
Source: windows_event (Windows)
Searches the Windows Event Log. Reads the System, Application, and Security channels.
| Field | Required | Description |
|---|---|---|
| source | Yes | "windows_event". |
| channel | Yes | System, Application, or Security. |
| event_ids | Optional | Array of event IDs to match (e.g. [4625] for failed logins). |
| min_level | Optional | Minimum level: information, warning, error, or critical. |
| pattern | Optional | Regex evaluated against the event message for finer matching. |
Provide event_ids, min_level, and/or pattern. An event must satisfy all of the criteria you specify. Common examples:
- Failed logins: channel: "Security", event_ids: [4625]
- System errors: channel: "System", min_level: "error"
- Application crashes: channel: "Application", event_ids: [1000]
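The "all specified criteria must match" rule can be pictured as a chain of checks in which an omitted field places no constraint. The sketch below is illustrative only: event_matches and the event dictionary shape (channel, event_id, level, message) are hypothetical, and the level ordering is assumed to follow the order in which the levels are listed in the table.

```python
import re

# Assumed ordering, lowest to highest, as listed in the min_level row.
LEVEL_ORDER = ["information", "warning", "error", "critical"]


def event_matches(event: dict, search: dict) -> bool:
    """True when the event satisfies every criterion the search specifies."""
    if event["channel"] != search["channel"]:
        return False
    if "event_ids" in search and event["event_id"] not in search["event_ids"]:
        return False
    if "min_level" in search:
        required = LEVEL_ORDER.index(search["min_level"])
        if LEVEL_ORDER.index(event["level"]) < required:
            return False
    if "pattern" in search and not re.search(search["pattern"], event["message"]):
        return False
    return True
```

With this reading, adding a pattern to a search that already has event_ids narrows the result set; it never widens it.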
Alert rules
Rules turn search metrics into incidents. A rule can live in the agent config or be set in the Alert24 UI. Each rule references a log_search by name, supports a sustained-duration threshold and a severity, and auto-resolves its incident when the condition clears.
{ "metric": "log_error_rate", "log_search": "nginx_5xx", "operator": "gt", "threshold": 2, "duration_seconds": 300, "severity": "high" }
This fires a high-severity incident when more than 2% of lines match the nginx_5xx search for 5 minutes straight.
Rule fields
| Field | Description |
|---|---|
| metric | One of the four metric types below. |
| log_search | The name of the log search this rule evaluates. |
| operator | Comparison operator, typically gt (greater than) or lt (less than). |
| threshold | The value to compare against (count, percentage, or lines-per-minute). |
| duration_seconds | How long the condition must hold before firing; filters out brief blips. |
| severity | Incident severity, e.g. low, medium, high, critical. |
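To make the interaction between operator, threshold, and duration_seconds concrete, here is a minimal sketch of sustained-threshold evaluation. It is an assumption about the general technique, not a description of how Alert24 evaluates rules internally: SustainedRule and observe are hypothetical names, and timestamps are plain seconds.

```python
import operator
from typing import Optional

OPERATORS = {"gt": operator.gt, "lt": operator.lt}


class SustainedRule:
    """Fires once the condition has held for duration_seconds; clears when it stops."""

    def __init__(self, rule: dict):
        self.compare = OPERATORS[rule["operator"]]
        self.threshold = rule["threshold"]
        self.duration = rule["duration_seconds"]
        self.breach_started: Optional[float] = None
        self.firing = False

    def observe(self, value: float, now: float) -> bool:
        """Record one metric sample and return whether the rule is firing."""
        if self.compare(value, self.threshold):
            if self.breach_started is None:
                self.breach_started = now
            if now - self.breach_started >= self.duration:
                self.firing = True
        else:
            # Condition cleared: reset the timer and auto-resolve.
            self.breach_started = None
            self.firing = False
        return self.firing


rule = SustainedRule({"metric": "log_error_rate", "log_search": "nginx_5xx",
                      "operator": "gt", "threshold": 2,
                      "duration_seconds": 300, "severity": "high"})
# One sample per minute at a 3% error rate: fires only after 300 seconds.
history = [rule.observe(3.0, now=t) for t in range(0, 361, 60)]
```

A single sample at or below the threshold resets the timer, which is why a longer duration_seconds filters out short spikes.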
Metric types
| Metric | Fires when… | Threshold means… | Example |
|---|---|---|---|
| log_match_count | Match count crosses a threshold. | Number of matching lines. | App logs ERROR\|FATAL more than 20 times in the interval. |
| log_match_rate | Matches per minute crosses a threshold. | Matches per minute. | More than 5 failed SSH logins per minute. |
| log_error_rate | Matches as a percent of total lines crosses a threshold. | Percentage (e.g. 2 = 2%). | More than 2% of nginx requests are 5xx. |
| log_volume | Total lines per minute is too high, or too low (use lt). | Lines per minute. | A retry storm writes 50,000 lines/minute, or a chatty app goes silent. |
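The four metrics can all be derived from two counters per interval. The sketch below shows one plausible derivation under stated assumptions: log_metrics is a hypothetical function, the interval length is passed in explicitly, and "total lines" is taken to mean every new line read for that search during the interval.

```python
def log_metrics(match_count: int, total_lines: int, interval_seconds: float) -> dict:
    """Derive the four metric values from one interval's counters."""
    minutes = interval_seconds / 60
    return {
        # Raw number of matching lines in the interval.
        "log_match_count": match_count,
        # Matching lines per minute.
        "log_match_rate": match_count / minutes,
        # Matches as a percentage of all lines read (0 when nothing was read).
        "log_error_rate": 100 * match_count / total_lines if total_lines else 0.0,
        # All lines per minute, matching or not.
        "log_volume": total_lines / minutes,
    }


# 30 matches out of 1,200 new lines in a 60-second interval.
example = log_metrics(match_count=30, total_lines=1200, interval_seconds=60)
```

In this example the error rate works out to 2.5%, which would satisfy a log_error_rate rule with operator gt and threshold 2.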
Error spike — log_match_count
{ "metric": "log_match_count", "log_search": "app_errors", "operator": "gt", "threshold": 20, "duration_seconds": 300, "severity": "high" }
Error rate — log_error_rate
{ "metric": "log_error_rate", "log_search": "nginx_5xx", "operator": "gt", "threshold": 2, "duration_seconds": 300, "severity": "high" }
Log volume / flood — log_volume + gt
{ "metric": "log_volume", "log_search": "app_errors", "operator": "gt", "threshold": 5000, "duration_seconds": 120, "severity": "medium" }
Log silence (deadman) — log_volume + lt
{ "metric": "log_volume", "log_search": "app_heartbeat_lines", "operator": "lt", "threshold": 1, "duration_seconds": 180, "severity": "high" }
For a silence alert, point a search at a log line your app emits continuously under normal operation, then alert when log_volume drops below a floor. If the app stops logging, it has likely stopped working — even though uptime and ping checks against the host still pass.
Sample lines in the incident
Each interval the agent includes up to 5 sample matched lines per search. When a rule opens an incident, those samples are attached, so the on-call responder sees the actual offending log lines without SSH-ing into the box. To limit exposure of sensitive data, keep patterns specific; only matching lines are ever sampled, and only up to five of them.
Troubleshooting
log_searches is ignored / no log metrics appear. Your agent most likely predates log search. Re-run the installer to update to the latest build, then restart the agent. Also confirm that the config file contains a top-level log_searches array (not nested under another key).
No alerts on existing errors right after setup. Expected. On first run the agent baselines at the current end of each file/journal and only evaluates new lines from that point forward. It never alerts on historical logs. Trigger a fresh log line to verify.
Log rotation. The agent detects rotation (truncation or inode change) and continues from the new file without missing lines or re-reading the rotated-out file. No configuration is needed.
Regex isn't matching. Remember the pattern is a regex inside JSON — escape backslashes (\\d, not \d) and quotes (\"). Test your pattern against a sample line before deploying.
Too many false positives. Increase duration_seconds so the condition must be sustained, raise the threshold, or make the pattern more specific.
No matching files for a glob. If a path glob matches no files, that search reports zero matches rather than erroring. Verify the path and the agent user's read permissions on the log files.
Related
- Heartbeat Monitoring — dead-man's switch for cron jobs and scheduled tasks
- Check Configuration — intervals, thresholds, and service linking
- Incident Management — triage, escalation, and resolution