Diffstat (limited to 'libnetdata/log/log2journal.md')
-rw-r--r--  libnetdata/log/log2journal.md  518
1 files changed, 518 insertions, 0 deletions
diff --git a/libnetdata/log/log2journal.md b/libnetdata/log/log2journal.md
new file mode 100644
index 0000000000..01d2b80f4e
--- /dev/null
+++ b/libnetdata/log/log2journal.md
@@ -0,0 +1,518 @@
+# log2journal
+
+`log2journal` and `systemd-cat-native` can be used to convert a structured log file, such as the ones generated by web servers, into `systemd-journal` entries.
+
+By combining these tools with the usual UNIX shell utilities, you can create advanced log processing pipelines that send any kind of structured text log to systemd-journald. This is a simple, yet powerful and efficient, way to handle log processing.
+
+The process involves the usual piping of shell commands, to fetch and process the log files in real time.
+
+The overall process looks like this:
+
+```bash
+tail -F /var/log/nginx/*.log |        # outputs log lines
+ log2journal 'PATTERN' |              # outputs Journal Export Format
+ sed -u -e SEARCH-REPLACE-RULES |     # optional rewriting rules
+ systemd-cat-native                   # send to local/remote journald
+```
+
+Let's see the steps:
+
+1. `tail -F /var/log/nginx/*.log`<br/>this command will tail all `*.log` files in `/var/log/nginx/`. We use `-F` instead of `-f` to ensure that files will still be tailed after log rotation.
+2. `log2journal` is a Netdata program. It reads log entries and extracts fields, according to the PCRE2 pattern it is given. It can also apply some basic operations on the fields, like injecting new fields or duplicating existing ones. The output of `log2journal` is in systemd Journal Export Format, and it looks like this:
+ ```bash
+ KEY1=VALUE1 # << start of the first log line
+ KEY2=VALUE2
+ # << log lines separator
+ KEY1=VALUE1 # << start of the second log line
+ KEY2=VALUE2
+ ```
+3. `sed` is an optional step and just an example. Any kind of processing can be applied at this stage, in case we want to alter the fields in some way. For example, we may want to set the PRIORITY field of systemd-journal, so that Netdata dashboards and `journalctl` color the internal server errors. Or we may want to anonymize the logs, to remove sensitive information from them (a sketch of this follows right after this list). Or we may want to remove the variable parts of the requests, to make them uniform. We will see below how such processing can be done.
+4. `systemd-cat-native` is a Netdata program. It can send the logs to a local `systemd-journald` (journal namespaces supported), or to a remote `systemd-journal-remote`.
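+
+As an illustration of the optional `sed` step, here is a minimal sketch that anonymizes the logs by masking the last octet of client IP addresses before they reach the journal. It assumes the extraction pattern produces a field named `NGINX_REMOTE_ADDR`, like the one used in the real-life example below; adapt the field name and the rule to your own logs.
+
+```bash
+# example only: mask the last octet of IPv4 client addresses
+# (assumes the log2journal pattern extracts a NGINX_REMOTE_ADDR field)
+sed -u -E 's|^(NGINX_REMOTE_ADDR=[0-9]+\.[0-9]+\.[0-9]+)\.[0-9]+$|\1.0|'
+```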
+
+## Real-life example
+
+We have an nginx server logging in this format:
+
+```bash
+ log_format access '$remote_addr - $remote_user [$time_local] '
+ '"$request" $status $body_bytes_sent '
+ '$request_length $request_time '
+ '"$http_referer" "$http_user_agent"';
+```
+
+First, let's find the right pattern for `log2journal`. We ask ChatGPT:
+
+```
+My nginx log uses this log format:
+
+log_format access '$remote_addr - $remote_user [$time_local] '
+ '"$request" $status $body_bytes_sent '
+ '$request_length $request_time '
+ '"$http_referer" "$http_user_agent"';
+
+I want to use `log2journal` to convert this log for systemd-journal.
+`log2journal` accepts a PCRE2 regular expression, using the named groups
+in the pattern as the journal fields to extract from the logs.
+
+Prefix all PCRE2 group names with `NGINX_` and use capital characters only.
+
+For the $request, use the field `MESSAGE` (without NGINX_ prefix), so that
+it will appear in systemd journals as the message of the log.
+
+Please give me the PCRE2 pattern.
+```
+
+ChatGPT replies with this:
+
+```regexp
+^(?<NGINX_REMOTE_ADDR>[^ ]+) - (?<NGINX_REMOTE_USER>[^ ]+) \[(?<NGINX_TIME_LOCAL>[^\]]+)\] "(?<MESSAGE>[^"]+)" (?<NGINX_STATUS>\d+) (?<NGINX_BODY_BYTES_SENT>\d+) (?<NGINX_REQUEST_LENGTH>\d+) (?<NGINX_REQUEST_TIME>[\d.]+) "(?<NGINX_HTTP_REFERER>[^"]*)" "(?<NGINX_HTTP_USER_AGENT>[^"]*)"
+```
+
+Let's test it with a sample line (instead of `tail`):
+
+```bash
+# echo '1.2.3.4 - - [19/Nov/2023:00:24:43 +0000] "GET /index.html HTTP/1.1" 200 4172 104 0.001 "-" "Go-http-client/1.1"' | log2journal '^(?<NGINX_REMOTE_ADDR>[^ ]+) - (?<NGINX_REMOTE_USER>[^ ]+) \[(?<NGINX_TIME_LOCAL>[^\]]+)\] "(?<MESSAGE>[^"]+)" (?<NGINX_STATUS>\d+) (?<NGINX_BODY_BYTES_SENT>\d+) (?<NGINX_REQUEST_LENGTH>\d+) (?<NGINX_REQUEST_TIME>[\d.]+) "(?<NGINX_HTTP_REFERER>[^"]*)" "(?<NGINX_HTTP_USER_AGENT>[^"]*)"'
+MESSAGE=GET /index.html HTTP/1.1
+NGINX_BODY_BYTES_SENT=4172
+NGINX_HTTP_REFERER=-
+NGINX_HTTP_USER_AGENT=Go-http-client/1.1
+NGINX_REMOTE_ADDR=1.2.3.4
+NGINX_REMOTE_USER=-
+NGINX_REQUEST_LENGTH=104
+NGINX_REQUEST_TIME=0.001
+NGINX_STATUS=200
+NGINX_TIME_LOCAL=19/Nov/2023:00:24:43 +0000
+
+```
+
+As you can see, it extracted all the fields.
+
+The `MESSAGE`, however, contains 3 fields itself: the request method, the URL and the protocol version. Let's ask ChatGPT to extract these too:
+
+```
+I see that the MESSAGE has 3 key items in it. The request method (GET, POST,
+etc), the URL and HTTP protocol version.
+
+I want to keep the MESSAGE as it is, with all the information in it, but also
+extract the 3 items from it as separate fields.
+
+Can this be done?
+```
+
+ChatGPT responded with this:
+
+```regexp
+^(?<NGINX_REMOTE_ADDR>[^ ]+) - (?<NGINX_REMOTE_USER>[^ ]+) \[(?<NGINX_TIME_LOCAL>[^\]]+)\] "(?<MESSAGE>(?<NGINX_METHOD>[A-Z]+) (?<NGINX_URL>[^ ]+) HTTP/(?<NGINX_HTTP_VERSION>[^"]+))" (?<NGINX_STATUS>\d+) (?<NGINX_BODY_BYTES_SENT>\d+) (?<NGINX_REQUEST_LENGTH>\d+) (?<NGINX_REQUEST_TIME>[\d.]+) "(?<NGINX_HTTP_REFERER>[^"]*)" "(?<NGINX_HTTP_USER_AGENT>[^"]*)"
+```
+
+Let's test this too:
+
+```bash
+# echo '1.2.3.4 - - [19/Nov/2023:00:24:43 +0000] "GET /index.html HTTP/1.1" 200 4172 104 0.001 "-" "Go-http-client/1.1"' | log2journal '^(?<NGINX_REMOTE_ADDR>[^ ]+) - (?<NGINX_REMOTE_USER>[^ ]+) \[(?<NGINX_TIME_LOCAL>[^\]]+)\] "(?<MESSAGE>(?<NGINX_METHOD>[A-Z]+) (?<NGINX_URL>[^ ]+) HTTP/(?<NGINX_HTTP_VERSION>[^"]+))" (?<NGINX_STATUS>\d+) (?<NGINX_BODY_BYTES_SENT>\d+) (?<NGINX_REQUEST_LENGTH>\d+) (?<NGINX_REQUEST_TIME>[\d.]+) "(?<NGINX_HTTP_REFERER>[^"]*)" "(?<NGINX_HTTP_USER_AGENT>[^"]*)"'
+MESSAGE=GET /index.html HTTP/1.1 # <<<<<<<<< MESSAGE
+NGINX_BODY_BYTES_SENT=4172
+NGINX_HTTP_REFERER=-
+NGINX_HTTP_USER_AGENT=Go-http-client/1.1
+NGINX_HTTP_VERSION=1.1 # <<<<<<<<< VERSION
+NGINX_METHOD=GET # <<<<<<<<< METHOD
+NGINX_REMOTE_ADDR=1.2.3.4
+NGINX_REMOTE_USER=-
+NGINX_REQUEST_LENGTH=104
+NGINX_REQUEST_TIME=0.001
+NGINX_STATUS=200
+NGINX_TIME_LOCAL=19/Nov/2023:00:24:43 +0000
+NGINX_URL=/index.html # <<<<<<<<< URL
+
+```
+
+Ideally, we would want the 5xx errors to be red in our `journalctl` output. To achieve that we need to add a PRIORITY field to set the log level. Log priorities are numeric and follow the `syslog` priorities. Checking `/usr/include/sys/syslog.h` we can see these:
+
+```c
+#define LOG_EMERG 0 /* system is unusable */
+#define LOG_ALERT 1 /* action must be taken immediately */
+#define LOG_CRIT 2 /* critical conditions */
+#define LOG_ERR 3 /* error conditions */
+#define LOG_WARNING 4 /* warning conditions */
+#define LOG_NOTICE 5 /* normal but significant condition */
+#define LOG_INFO 6 /* informational */
+#define LOG_DEBUG 7 /* debug-level messages */
+```
+
+Avoid setting the priority to 0 (`LOG_EMERG`), because such messages are also broadcast to all terminals (the journal uses `wall` to let users know of such events). A good priority for errors is 3 (red in `journalctl`), or 4 (yellow in `journalctl`).
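+
+Once a PRIORITY field is set, `journalctl` can filter on it like any other field. For example, assuming the `SYSLOG_IDENTIFIER=nginx` field we inject further below, queries like these should show only the error entries:
+
+```bash
+# entries of priority "err" (3) or more severe
+journalctl -p err SYSLOG_IDENTIFIER=nginx
+
+# or match the PRIORITY field exactly
+journalctl PRIORITY=3 SYSLOG_IDENTIFIER=nginx
+```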
+
+To set the PRIORITY field in the output, we can use the `NGINX_STATUS` field. We need a copy of it, which we will alter later.
+
+We can instruct `log2journal` to duplicate `NGINX_STATUS`, like this: `log2journal --duplicate=STATUS2PRIORITY=NGINX_STATUS`. Let's try it:
+
+```bash
+# echo '1.2.3.4 - - [19/Nov/2023:00:24:43 +0000] "GET /index.html HTTP/1.1" 200 4172 104 0.001 "-" "Go-http-client/1.1"' | log2journal '^(?<NGINX_REMOTE_ADDR>[^ ]+) - (?<NGINX_REMOTE_USER>[^ ]+) \[(?<NGINX_TIME_LOCAL>[^\]]+)\] "(?<MESSAGE>(?<NGINX_METHOD>[A-Z]+) (?<NGINX_URL>[^ ]+) HTTP/(?<NGINX_HTTP_VERSION>[^"]+))" (?<NGINX_STATUS>\d+) (?<NGINX_BODY_BYTES_SENT>\d+) (?<NGINX_REQUEST_LENGTH>\d+) (?<NGINX_REQUEST_TIME>[\d.]+) "(?<NGINX_HTTP_REFERER>[^"]*)" "(?<NGINX_HTTP_USER_AGENT>[^"]*)"' --duplicate=STATUS2PRIORITY=NGINX_STATUS
+MESSAGE=GET /index.html HTTP/1.1
+NGINX_BODY_BYTES_SENT=4172
+NGINX_HTTP_REFERER=-
+NGINX_HTTP_USER_AGENT=Go-http-client/1.1
+NGINX_HTTP_VERSION=1.1
+NGINX_METHOD=GET
+NGINX_REMOTE_ADDR=1.2.3.4
+NGINX_REMOTE_USER=-
+NGINX_REQUEST_LENGTH=104
+NGINX_REQUEST_TIME=0.001
+NGINX_STATUS=200
+STATUS2PRIORITY=200 # <<<<<<<<< STATUS2PRIORITY IS HERE
+NGINX_TIME_LOCAL=19/Nov/2023:00:24:43 +0000
+NGINX_URL=/index.html
+
+```
+
+Now that we have the `STATUS2PRIORITY` field equal to the `NGINX_STATUS`, we can use a `sed` command to change it to the `PRIORITY` field we want. The `sed` command could be:
+
+```bash
+sed -u -e 's|STATUS2PRIORITY=5.*|PRIORITY=3|' -e 's|STATUS2PRIORITY=.*|PRIORITY=6|'
+```
+
+We use `-u` so that `sed` processes its input and output unbuffered, keeping the pipeline real-time.
+
+This command first changes all 5xx `STATUS2PRIORITY` fields to `PRIORITY=3` (error) and then changes all the rest to `PRIORITY=6` (info). Once the first expression has rewritten a line, the `STATUS2PRIORITY` key is gone, so the second expression cannot match that line again. Let's see the whole pipeline:
+
+```bash
+# echo '1.2.3.4 - - [19/Nov/2023:00:24:43 +0000] "GET /index.html HTTP/1.1" 200 4172 104 0.001 "-" "Go-http-client/1.1"' | log2journal '^(?<NGINX_REMOTE_ADDR>[^ ]+) - (?<NGINX_REMOTE_USER>[^ ]+) \[(?<NGINX_TIME_LOCAL>[^\]]+)\] "(?<MESSAGE>(?<NGINX_METHOD>[A-Z]+) (?<NGINX_URL>[^ ]+) HTTP/(?<NGINX_HTTP_VERSION>[^"]+))" (?<NGINX_STATUS>\d+) (?<NGINX_BODY_BYTES_SENT>\d+) (?<NGINX_REQUEST_LENGTH>\d+) (?<NGINX_REQUEST_TIME>[\d.]+) "(?<NGINX_HTTP_REFERER>[^"]*)" "(?<NGINX_HTTP_USER_AGENT>[^"]*)"' --duplicate=STATUS2PRIORITY=NGINX_STATUS | sed -u -e 's|STATUS2PRIORITY=5.*|PRIORITY=3|' -e 's|STATUS2PRIORITY=.*|PRIORITY=6|'
+MESSAGE=GET /index.html HTTP/1.1
+NGINX_BODY_BYTES_SENT=4172
+NGINX_HTTP_REFERER=-
+NGINX_HTTP_USER_AGENT=Go-http-client/1.1
+NGINX_HTTP_VERSION=1.1
+NGINX_METHOD=GET
+NGINX_REMOTE_ADDR=1.2.3.4
+NGINX_REMOTE_USER=-
+NGINX_REQUEST_LENGTH=104
+NGINX_REQUEST_TIME=0.001
+NGINX_STATUS=200
+PRIORITY=6 # <<<<<<<<< PRIORITY IS HERE
+NGINX_TIME_LOCAL=19/Nov/2023:00:24:43 +0000
+NGINX_URL=/index.html
+
+```
+
+Similarly, we could duplicate `NGINX_URL` to `NGINX_ENDPOINT` and then process it with `sed` to remove any query string, or to replace IDs in the URL path with constant names, giving us uniform endpoints regardless of their parameters.
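+
+A minimal sketch of such a rule set is shown below. It assumes we add `--duplicate=NGINX_ENDPOINT=NGINX_URL` to the `log2journal` command, and that the variable parts of our URLs are purely numeric IDs; adjust the expressions to match your own URL scheme.
+
+```bash
+# example only: normalize a duplicated NGINX_ENDPOINT field
+# (created with --duplicate=NGINX_ENDPOINT=NGINX_URL)
+sed -u -E \
+  -e 's|^(NGINX_ENDPOINT=[^?]*)\?.*$|\1|' \
+  -e '/^NGINX_ENDPOINT=/ s|/[0-9]+|/ID|g'
+```
+
+The first expression drops the query string from `NGINX_ENDPOINT`, and the second replaces numeric path segments with the constant `ID`, only on `NGINX_ENDPOINT` lines.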
+
+To complete the example, we can also inject a `SYSLOG_IDENTIFIER` with `log2journal`, using `--inject=SYSLOG_IDENTIFIER=nginx`, like this:
+
+```bash
+# echo '1.2.3.4 - - [19/Nov/2023:00:24:43 +0000] "GET /index.html HTTP/1.1" 200 4172 104 0.001 "-" "Go-http-client/1.1"' | log2journal '^(?<NGINX_REMOTE_ADDR>[^ ]+) - (?<NGINX_REMOTE_USER>[^ ]+) \[(?<NGINX_TIME_LOCAL>[^\]]+)\] "(?<MESSAGE>(?<NGINX_METHOD>[A-Z]+) (?<NGINX_URL>[^ ]+) HTTP/(?<NGINX_HTTP_VERSION>[^"]+))" (?<NGINX_STATUS>\d+) (?<NGINX_BODY_BYTES_SENT>\d+) (?<NGINX_REQUEST_LENGTH>\d+) (?<NGINX_REQUEST_TIME>[\d.]+) "(?<NGINX_HTTP_REFERER>[^"]*)" "(?<NGINX_HTTP_USER_AGENT>[^"]*)"' --duplicate=STATUS2PRIORITY=NGINX_STATUS --inject=SYSLOG_IDENTIFIER=nginx | sed -u -e 's|STATUS2PRIORITY=5.*|PRIORITY=3|' -e 's|STATUS2PRIORITY=.*|PRIORITY=6|'
+MESSAGE=GET /index.html HTTP/1.1
+NGINX_BODY_BYTES_SENT=4172
+NGINX_HTTP_REFERER=-
+NGINX_HTTP_USER_AGENT=Go-http-client/1.1
+NGINX_HTTP_VERSION=1.1
+NGINX_METHOD=GET
+NGINX_REMOTE_ADDR=1.2.3.4
+NGINX_REMOTE_USER=-
+NGINX_REQUEST_LENGTH=104
+NGINX_REQUEST_TIME=0.001
+NGINX_STATUS=200
+PRIORITY=6
+NGINX_TIME_LOCAL=19/Nov/2023:00:24:43 +0000
+NGINX_URL=/index.html
+SYSLOG_IDENTIFIER=nginx # <<<<<<<<< THIS HAS BEEN ADDED
+
+```
+
+Now the message is ready to be sent to systemd-journal. For this we use `systemd-cat-native`. This command can send such messages to a journal running on localhost, to a local journal namespace, or to a `systemd-journal-remote` running on another server. By just appending `| systemd-cat-native` to the command, the message will be sent to the local journal.
+
+
+```bash
+# echo '1.2.3.4 - - [19/Nov/2023:00:24:43 +0000] "GET /index.html HTTP/1.1" 200 4172 104 0.001 "-" "Go-http-client/1.1"' | log2journal '^(?<NGINX_REMOTE_ADDR>[^ ]+) - (?<NGINX_REMOTE_USER>[^ ]+) \[(?<NGINX_TIME_LOCAL>[^\]]+)\] "(?<MESSAGE>(?<NGINX_METHOD>[A-Z]+) (?<NGINX_URL>[^ ]+) HTTP/(?<NGINX_HTTP_VERSION>[^"]+))" (?<NGINX_STATUS>\d+) (?<NGINX_BODY_BYTES_SENT>\d+) (?<NGINX_REQUEST_LENGTH>\d+) (?<NGINX_REQUEST_TIME>[\d.]+) "(?<NGINX_HTTP_REFERER>[^"]*)" "(?<NGINX_HTTP_USER_AGENT>[^"]*)"' --duplicate=STATUS2PRIORITY=NGINX_STATUS --inject=SYSLOG_IDENTIFIER=nginx | sed -u -e 's|STATUS2PRIORITY=5.*|PRIORITY=3|' -e 's|STATUS2PRIORITY=.*|PRIORITY=6|' | systemd-cat-native
+# no output
+
+# let's find the message
+# journalctl -o verbose SYSLOG_IDENTIFIER=nginx
+Sun 2023-11-19 04:34:06.583912 EET [s=1eb59e7934984104ab3b61f5d9648057;i=115b6d4;b=7282d89d2e6e4299969a6030302ff3e4;m=69b419673;t=60a783417ac72;x=2cec5dde8bf01ee7]
+ PRIORITY=6
+ _UID=0
+ _GID=0
+ _BOOT_ID=7282d89d2e6e4299969a6030302ff3e4
+ _MACHINE_ID=6b72c55db4f9411dbbb80b70537bf3a8
+ _HOSTNAME=costa-xps9500
+ _RUNTIME_SCOPE=system
+ _TRANSPORT=journal
+ _CAP_EFFECTIVE=1ffffffffff
+ _AUDIT_LOGINUID=1000
+ _AUDIT_SESSION=1
+ _SYSTEMD_CGROUP=/user.slice/user-1000.slice/user@1000.service/app.slice/app-org.gnome.Terminal.slice/vte-spawn-59780d3d-a3ff-4a82-a6fe-8d17d2261106.scope
+ _SYSTEMD_OWNER_UID=1000
+ _SYSTEMD_UNIT=user@1000.service
+ _SYSTEMD_USER_UNIT=vte-spawn-59780d3d-a3ff-4a82-a6fe-8d17d2261106.scope
+ _SYSTEMD_SLICE=user-1000.slice
+ _SYSTEMD_USER_SLICE=app-org.gnome.Terminal.slice
+ _SYSTEMD_INVOCATION_ID=6195d8c4c6654481ac9a30e9a8622ba1
+ _COMM=systemd-cat-nat
+ MESSAGE=GET /index.html HTTP/1.1 # <<<<<<<<< CHECK
+ NGINX_BODY_BYTES_SENT=4172 # <<<<<<<<< CHECK
+ NGINX_HTTP_REFERER=- # <<<<<<<<< CHECK
+ NGINX_HTTP_USER_AGENT=Go-http-client/1.1 # <<<<<<<<< CHECK
+ NGINX_HTTP_VERSION=1.1 # <<<<<<<<< CHECK
+ NGINX_METHOD=GET # <<<<<<<<< CHECK
+ NGINX_REMOTE_ADDR=1.2.3.4 # <<<<<<<<< CHECK
+ NGINX_REMOTE_USER=- # <<<<<<<<< CHECK
+ NGINX_REQUEST_LENGTH=104 # <<<<<<<<< CHECK
+ NGINX_REQUEST_TIME=0.001 # <<<<<<<<< CHECK
+ NGINX_STATUS=200 # <<<<<<<<< CHECK
+ NGINX_TIME_LOCAL=19/Nov/2023:00:24:43 +0000 # <<<<<<<<< CHECK
+ NGINX_URL=/index.html # <<<<<<<<< CHECK
+ SYSLOG_IDENTIFIER=nginx # <<<<<<<<< CHECK
+ _PID=354312
+ _SOURCE_REALTIME_TIMESTAMP=1700361246583912
+
+```
+
+So, the log line, with all its fields parsed, ended up in systemd-journal.
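+
+If the logs should end up somewhere other than the default local journal, `systemd-cat-native` also accepts a destination option (its full help is at the end of this page). The namespace name and the remote URL below are just placeholders for this sketch:
+
+```bash
+# send to a local journal namespace (the namespace must already be configured and running)
+tail -F /var/log/nginx/*.log | log2journal 'PATTERN' | systemd-cat-native --namespace=netdata
+
+# or send to a remote systemd-journal-remote listener
+tail -F /var/log/nginx/*.log | log2journal 'PATTERN' | systemd-cat-native --url=https://logs.example.com:19532
+```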
+
+The complete example would look like the following script.
+Running this script with the parameter `test` will produce output on the terminal for you to inspect.
+Unmatched log entries are added to the journal with `PRIORITY=1` (`LOG_ALERT`), so that you can spot them.
+
+We also use the `--filename-key` option of `log2journal`, which detects the filename when `tail` switches output
+between files, and adds the field `NGINX_LOG_FILE` with the filename each log line comes from.
+
+Finally, the script also adds the field `NGINX_STATUS_FAMILY`, taking values like `2xx`, `3xx`, etc., so that
+it is easy to find all the logs of a specific status family.
+
+```bash
+#!/usr/bin/env bash
+
+test=0
+last=0
+send_or_show='./systemd-cat-native'
+[ "${1}" = "test" ] && test=1 && last=100 && send_or_show=cat
+
+pattern='(?x) # Enable PCRE2 extended mode
+^
+(?<NGINX_REMOTE_ADDR>[^ ]+) \s - \s # NGINX_REMOTE_ADDR
+(?<NGINX_REMOTE_USER>[^ ]+) \s # NGINX_REMOTE_USER
+\[
+ (?<NGINX_TIME_LOCAL>[^\]]+) # NGINX_TIME_LOCAL
+\]
+\s+ "
+(?<MESSAGE> # MESSAGE
+ (?<NGINX_METHOD>[A-Z]+) \s+ # NGINX_METHOD
+ (?<NGINX_URL>[^ ]+) \s+ # NGINX_URL
+ HTTP/(?<NGINX_HTTP_VERSION>[^"]+) # NGINX_HTTP_VERSION
+)
+" \s+
+(?<NGINX_STATUS>\d+) \s+ # NGINX_STATUS
+(?<NGINX_BODY_BYTES_SENT>\d+) \s+ # NGINX_BODY_BYTES_SENT
+(?<NGINX_REQUEST_LENGTH>\d+) \s+ # NGINX_REQUEST_LENGTH
+(?<NGINX_REQUEST_TIME>[\d.]+) \s+ # NGINX_REQUEST_TIME
+"(?<NGINX_HTTP_REFERER>[^"]*)" \s+ # NGINX_HTTP_REFERER
+"(?<NGINX_HTTP_USER_AGENT>[^"]*)" # NGINX_HTTP_USER_AGENT
+'
+
+tail -n $last -F /var/log/nginx/*access.log |\
+ log2journal "${pattern}" \
+ --filename-key=NGINX_LOG_FILE \
+ --duplicate=STATUS2PRIORITY=NGINX_STATUS \
+ --duplicate=STATUS_FAMILY=NGINX_STATUS \
+ --inject=SYSLOG_IDENTIFIER=nginx \
+ --unmatched-key=MESSAGE \
+ --inject-unmatched=PRIORITY=1 \
+ | sed -u \
+ -e 's|^STATUS2PRIORITY=5.*$|PRIORITY=3|' \
+ -e 's|^STATUS2PRIORITY=.*$|PRIORITY=6|' \
+ -e 's|^STATUS_FAMILY=\([0-9]\).*$|NGINX_STATUS_FAMILY=\1xx|' \
+ -e 's|^STATUS_FAMILY=.*$|NGINX_STATUS_FAMILY=UNKNOWN|' \
+ | $send_or_show
+```
+
+
+## `log2journal` options
+
+```
+
+Netdata log2journal v1.43.0-337-g116dc1bc3
+
+Convert structured log input to systemd Journal Export Format.
+
+Using PCRE2 patterns, extract the fields from structured logs on the standard
+input, and generate output according to systemd Journal Export Format
+
+Usage: ./log2journal [OPTIONS] PATTERN
+
+Options:
+
+ --filename-key=KEY
+ Add a field with KEY as the key and the current filename as value.
+ Automatically detects filenames when piped after 'tail -F',
+ and tail matches multiple filenames.
+ To inject the filename when tailing a single file, use --inject.
+
+ --unmatched-key=KEY
+ Include unmatched log entries in the output with KEY as the field name.
+ Use this to include unmatched entries to the output stream.
+ Usually it should be set to --unmatched-key=MESSAGE so that the
+ unmatched entry will appear as the log message in the journals.
+ Use --inject-unmatched to inject additional fields to unmatched lines.
+
+ --duplicate=TARGET=KEY1[,KEY2[,KEY3[,...]]]
+ Create a new key called TARGET, duplicating the values of the keys
+ given. Useful for further processing. When multiple keys are given,
+ their values are separated by comma.
+ Up to 2048 duplications can be given on the command line, and up to
+ 10 keys per duplication command are allowed.
+
+ --inject=LINE
+ Inject constant fields to the output (both matched and unmatched logs).
+ --inject entries are added to unmatched lines too, when their key is
+ not used in --inject-unmatched (--inject-unmatched overrides --inject).
+ Up to 2048 fields can be injected.
+
+ --inject-unmatched=LINE
+ Inject lines into the output for each unmatched log entry.
+ Usually, --inject-unmatched=PRIORITY=3 is needed to mark the unmatched
+ lines as errors, so that they can easily be spotted in the journals.
+ Up to 2048 such lines can be injected.
+
+ -h, --help
+ Display this help and exit.
+
+ PATTERN
+ PATTERN should be a valid PCRE2 regular expression.
+ RE2 regular expressions (like the ones usually used in Go applications),
+ are usually valid PCRE2 patterns too.
+ Regular expressions without named groups are ignored.
+
+The maximum line length accepted is 1048576 characters.
+The maximum number of fields in the PCRE2 pattern is 8192.
+
+JOURNAL FIELDS RULES (enforced by systemd-journald)
+
+ - field names can be up to 64 characters
+ - the only allowed field characters are A-Z, 0-9 and underscore
+ - the first character of fields cannot be a digit
+ - protected journal fields start with underscore:
+ * they are accepted by systemd-journal-remote
+ * they are NOT accepted by a local systemd-journald
+
+ For best results, always include these fields:
+
+ MESSAGE=TEXT
+ The MESSAGE is the body of the log entry.
+ This field is what we usually see in our logs.
+
+ PRIORITY=NUMBER
+ PRIORITY sets the severity of the log entry.
+ 0=emerg, 1=alert, 2=crit, 3=err, 4=warn, 5=notice, 6=info, 7=debug
+ - Emergency events (0) are usually broadcast to all terminals.
+ - Emergency, alert, critical, and error (0-3) are usually colored red.
+ - Warning (4) entries are usually colored yellow.
+ - Notice (5) entries are usually bold or have a brighter white color.
+ - Info (6) entries are the default.
+ - Debug (7) entries are usually grayed or dimmed.
+
+ SYSLOG_IDENTIFIER=NAME
+ SYSLOG_IDENTIFIER sets the name of application.
+ Use something descriptive, like: SYSLOG_IDENTIFIER=nginx-logs
+
+You can find the most common fields at 'man systemd.journal-fields'.
+
+```
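+
+As a quick illustration of `--duplicate` with multiple source keys (their values get joined with commas, as described above), here is a small sketch; the field names are made up for the example:
+
+```bash
+# example only: combine two extracted fields into one REQUEST_SUMMARY field
+echo 'GET /index.html HTTP/1.1' | \
+  log2journal '^(?<METHOD>[A-Z]+) (?<URL>[^ ]+) (?<VERSION>.+)$' \
+    --duplicate=REQUEST_SUMMARY=METHOD,URL
+```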
+
+## `systemd-cat-native` options
+
+```
+
+Netdata systemd-cat-native v1.43.0-319-g4ada93a6e
+
+This program reads from its standard input, lines in the format:
+
+KEY1=VALUE1\n
+KEY2=VALUE2\n
+KEYN=VALUEN\n
+\n
+
+and sends them to systemd-journal.
+
+ - Binary journal fields are not accepted at its input
+ - Binary journal fields can be generated after newline processing
+ - Messages have to be separated by an empty line
+ - Keys starting with underscore are not accepted (by journald)
+ - Other rules imposed by systemd-journald also apply (enforced by journald)
+
+Usage:
+
+ ./systemd-cat-native
+ [--newline=STRING]
+ [--log-as-netdata|-N]
+ [--namespace=NAMESPACE] [--socket=PATH]
+ [--url=URL [--key=FILENAME] [--cert=FILENAME] [--trust=FILENAME|all]]
+
+The program has the following modes of logging:
+
+ * Log to a local systemd-journald or stderr
+
+ This is the default mode. If systemd-journald is available, logs will be
+ sent to systemd, otherwise logs will be printed on stderr, using logfmt
+ formatting. Options --socket and --namespace are available to configure
+ the journal destination:
+
+ --socket=PATH
+ The path of a systemd-journald UNIX socket.
+ The program will use the default systemd-journald socket when this
+ option is not used.
+
+ --namespace=NAMESPACE
+ The name of a configured and running systemd-journald namespace.
+ The program will produce the socket path based on its internal
+ defaults, to send the messages to the systemd journal namespace.
+
+ * Log as Netdata, enabled with --log-as-netdata or -N
+
+ In this mode the program uses environment variables set by Netdata for
+ the log destination. Only log fields defined by Netdata are accepted.
+ If the environment variables expected by Netdata are not found, it
+ falls back to stderr logging in logfmt format.
+
+ * Log to a systemd-journal-remote TCP socket, enabled with --url=URL
+
+ In this mode, the program will directly send logs to a remote systemd
+ journal (systemd-journal-remote is expected at the destination).
+ This mode is available even when the local system does not support
+ systemd, or is not Linux at all, allowing a remote Linux systemd
+ journald to become the logs database of the local system.
+
+ --url=URL
+ The destination systemd-journal-remote address and port, similarly
+ to what /etc/systemd/journal-upload.conf accepts.
+ Usually it is in the form: https://ip.address:19532
+ Both http and https URLs are accepted. When using https, the
+ following additional options are accepted:
+
+ --key=FILENAME
+ The filename of the private key of the server.
+ The default is: /etc/ssl/private/journal-upload.pem
+
+ --cert=FILENAME
+ The filename of the public key of the server.
+ The default is: /etc/ssl/certs/journal-upload.pem
+
+ --trust=FILENAME | all
+ The filename of the trusted CA public key.
+ The default is: /etc/ssl/ca/trusted.pem
+ The keyword 'all' can be used to trust all CAs.
+
+ NEWLINES PROCESSING
+ systemd-journal log entries may have newlines in them. However, the
+ Journal Export Format uses binary formatted data to achieve this,
+ making it hard for text processing.
+
+ To overcome this limitation, this program allows single-line text
+ formatted values at its input, to be binary formatted multi-line Journal
+ Export Format at its output.
+
+ To achieve that, it allows replacing a given string with a newline.
+ The parameter --newline=STRING sets the string to be replaced
+ with newlines.
+
+ For example by setting --newline='{NEWLINE}', the program will replace
+ all occurrences of {NEWLINE} with the newline character, within each
+ VALUE of the KEY=VALUE lines. Once this is done, the program will
+ switch the field to the binary Journal Export Format before sending the
+ log event to systemd-journal.
+
+```
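+
+For example, to carry a multi-line `MESSAGE` through a line-oriented pipeline and have the newlines restored in the journal, a sketch like this should work (the `{NEWLINE}` token and the sample fields are arbitrary choices for the example):
+
+```bash
+# example only: a two-line MESSAGE, flattened with a {NEWLINE} token,
+# restored to real newlines by systemd-cat-native before reaching the journal
+printf 'MESSAGE=first line{NEWLINE}second line\nPRIORITY=6\nSYSLOG_IDENTIFIER=example\n\n' | \
+  systemd-cat-native --newline='{NEWLINE}'
+```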