Update README.md (#4414)

* Update README.md

Added everything that was missing from https://github.com/netdata/netdata/wiki/External-Plugins
Made minor corrections.
This updated README.md will replace External-Plugins in the generated documentation.

* Update README.md

Resolved @ktsaou comments
author: Chris Akritidis <43294513+cakrit@users.noreply.github.com> 2018-10-16 23:03:26 +0200
committer: Costa Tsaousis <costa@tsaousis.gr> 2018-10-17 00:03:26 +0300
commit: 753350082b25cff1aa2ccc898d304d63d2cd02a0 (patch)
tree: 2684e7c054d9d9388c5a934ca399d710f01dc9d3 /collectors/plugins.d
parent: abe5147869fbf3d09e0c6755db769850e8e72d11 (diff)
1 files changed, 145 insertions, 8 deletions
diff --git a/collectors/plugins.d/README.md b/collectors/plugins.d/README.md
index a3ed8c5d27..130833aac1 100644
--- a/collectors/plugins.d/README.md
+++ b/collectors/plugins.d/README.md
@@ -3,6 +3,17 @@
 `plugins.d` is the netdata internal plugin that collects metrics
 from external processes, thus allowing netdata to use **external plugins**.
 
+netdata supports plugins written in **any language**. The only requirement is that a plugin can print data to its standard output.
+
+Plugins can be written in the appropriate language for their job. For example:
+
+- You can collect data from JMX, using a java application
+- You can collect data from a REST API, using a node.js application
+- You can collect data from a system command, using a shell script
+- etc.
+
+Many of these languages run their code efficiently, but require a lot of resources to initialize. netdata recommends that plugins be **initialized once and run forever** (until stopped by netdata). This way, the expensive part of their execution, the initialization, is paid only once.
+
 ## Provided External Plugins
 
 plugin|language|O/S|description
@@ -14,6 +25,8 @@ plugin|language|O/S|description
 [node.d.plugin](../node.d.plugin/)|`node.js`|all|a **plugin orchestrator** for data collection modules written in `node.js`.
 [python.d.plugin](../python.d.plugin/)|`python`|all|a **plugin orchestrator** for data collection modules written in `python` v2 or v3 (both are supported).
 
+Plugin orchestrators may also be described as **modular plugins**. They are modular since they accept custom made modules to be included. Writing modules for these plugins is easier than accessing the native netdata API directly. You will find modules already available for each orchestrator under the directory of the particular modular plugin (e.g. under python.d.plugin for the python orchestrator). 
+Each of these modular plugins has its own methods for defining modules. Please check the examples and their documentation.
 
 ## Motivation
 
@@ -32,19 +45,22 @@ Each of the external plugins is expected to run forever.
 Netdata will start it when it starts and stop it when it exits.
 
 If the external plugin exits or crashes, netdata will log an error.
-If the external plugin exits or crashes without pushing metrics to netdata,
-netdata will not start it again.
+If the external plugin exits or crashes without pushing metrics to netdata, netdata will not start it again.
+- Plugins that exit with any value other than zero will be disabled. Plugins that exit with zero will be restarted after some time.
+- Plugins may also be disabled by netdata if they output things that netdata does not understand.
 
 The `stdout` of external plugins is connected to netdata to receive metrics,
 with the API defined below.
 
 The `stderr` of external plugins is connected to netdata `error.log`.
 
+Plugins can create any number of charts with any number of dimensions each. Each chart can have its own characteristics independently of the others generated by the same plugin. For example, one chart may have an update frequency of 1 second, another may have 5 seconds and a third may have 10 seconds.
+
 ## Configuration
 
-This plugin is configured via `netdata.conf`, section `[plugins]`.
-At this section there a list of all the plugins found at the system it runs
-with a boolean setting to enable them or not. 
+Netdata supplies the environment variables `NETDATA_USER_CONFIG_DIR` (user-supplied configuration files) and `NETDATA_STOCK_CONFIG_DIR` (netdata-supplied configuration files) to identify the directories where configuration files are stored. It is up to the plugin to read the configuration it needs.
+
+The `[plugins]` section of `netdata.conf` contains a list of all the plugins found on the system where netdata runs, with a boolean setting to enable or disable each of them.
 
 Example:
 
@@ -82,6 +98,20 @@ For example, for `apps.plugin` the following section is available:
 - `command options` allows giving additional command line options to the plugin.
 
 
+Netdata provides the environment variable `NETDATA_UPDATE_EVERY` to external plugins, in seconds (the default is 1). This is the **minimum update frequency** for all charts. A plugin that updates its values more frequently than this is just wasting resources.
+
+Netdata will call the plugin with just one command line parameter: the number of seconds the user requested this plugin to update its data (the default is also 1).
+
+Other than the above, the plugin configuration is up to the plugin.
+
+Keep in mind that the user may use the netdata configuration to override chart and dimension parameters. This is transparent to the plugin.
+
+### Autoconfiguration
+
+Plugins should attempt to autoconfigure themselves when possible.
+
+For example, if your plugin wants to monitor `squid`, you can search for it on port `3128` or `8080`. If either attempt succeeds, you can proceed. If both fail, you can output an error (on stderr) saying that you cannot find `squid` running and giving instructions about the plugin configuration. Then you can stop (exit with a non-zero value), so that netdata will not attempt to start the plugin again.
+
 ## External Plugins API
 
 Any program that can print a few values to its standard output can become a netdata external plugin.
@@ -129,7 +159,7 @@ variable|description
 `NETDATA_UPDATE_EVERY`|The minimum number of seconds between chart refreshes. This is like the **internal clock** of netdata (it is user configurable, defaulting to `1`). There is no meaning for a plugin to update its values more frequently than this number of seconds.
 
 
-### the output of the plugin
+### The output of the plugin
 
 The plugin should output instructions for netdata to its output (`stdout`). Since this uses pipes, please make sure you flush stdout after every iteration.
 
@@ -203,7 +233,7 @@ the template is:
 
   - `plugin` and `module`
 
-    both are just names that are used to let the user the plugin and its module that generated the chart. If `plugin` is unset or empty, netdata will automatically set the filename of the plugin that generated the chart. `module` has not default.
+    both are just names that are used to let the user identify the plugin and the module that generated the chart. If `plugin` is unset or empty, netdata will automatically set the filename of the plugin that generated the chart. `module` has no default.
 
 
 #### DIMENSION
@@ -289,7 +319,7 @@ The `value` is floating point (netdata used `long double`).
 
 Variables are transferred to upstream netdata servers (streaming and database replication).
 
-## data collection
+## Data collection
 
 data collection is defined as a series of `BEGIN` -> `SET` -> `END` lines
 
@@ -345,3 +375,110 @@ follow these guidelines), will be disabled by netdata.
 
 netdata will collect any **signed** value in the 64bit range:
 `-9.223.372.036.854.775.808` to `+9.223.372.036.854.775.807`
+
+If a value is not collected, leave it empty, like this:
+
+`SET id = `
+
+or do not output the line at all.
+
+## Modular Plugins
+
+1. **python**, use `python.d.plugin`, there are many examples in the [python.d directory](https://github.com/netdata/netdata/tree/master/python.d)
+
+   python is ideal for netdata plugins. It is a simple yet powerful way to collect data and has a very small memory footprint, although it is not the most CPU-efficient option.
+
+2. **node.js**, use `node.d.plugin`, there are a few examples in the [node.d directory](https://github.com/netdata/netdata/tree/master/node.d)
+
+   node.js is the fastest scripting language for collecting data. If your plugin needs to do a lot of work, compute values, etc, node.js is probably the best choice before moving to compiled code. Keep in mind though that node.js is not memory efficient; it will probably need more RAM compared to python.
+
+3. **BASH**, use `charts.d.plugin`, there are many examples in the [charts.d directory](https://github.com/netdata/netdata/tree/master/charts.d)
+
+   BASH is the simplest scripting language for collecting values. It is the least efficient though in terms of CPU resources. You can use it to collect data quickly, but extensive use of it might use a lot of system resources.
+
+4. **C**
+
+   Of course, C is the most efficient way of collecting data. This is why netdata itself is written in C.
+
+5. **Nim**, there is an unofficial [nim plugin helper](https://github.com/FedericoCeratto/nim-netdata-plugin)
+
+---
+
+## Writing Plugins Properly
+
+There are a few rules for writing plugins properly:
+
+1. Respect system resources
+
+   Pay special attention to efficiency:
+
+      - Initialize everything once, at the beginning. The cost of initialization is not a concern here, because your plugin will most probably be started once and run forever. So, do whatever heavy operation is needed at the beginning, just once.
+      - Do the absolute minimum while iterating to collect values repeatedly.
+      - If you need to connect to another server to collect values, avoid re-connects if possible. Connect just once, with keep-alive (for HTTP) enabled and collect values using the same connection.
+      - Avoid any CPU or memory heavy operation while collecting data. If you control memory allocation, avoid any memory allocation while iterating to collect values.
+      - Avoid running external commands when possible. If you are writing shell scripts, especially avoid pipes (each pipe is another fork, a very expensive operation).
+
+2. The best way to iterate at a constant pace is this pseudo code:
+
+```js
+   var update_every = argv[1] * 1000; /* seconds * 1000 = milliseconds */
+
+   readConfiguration();
+   
+   if(!verifyWeCanCollectValues()) {
+      print "DISABLE";
+      exit(1);
+   }
+
+   createCharts(); /* print CHART and DIMENSION statements */
+
+   var loops = 0;
+   var last_run = 0;
+   var next_run = 0;
+   var dt_since_last_run = 0;
+   var now = 0;
+
+   FOREVER {
+       /* find the current time in milliseconds */
+       now = currentTimeStampInMilliseconds();
+
+       /*
+        * find the time of the next loop
+        * this makes sure we are always aligned
+        * with the netdata daemon
+        */
+       next_run = now - (now % update_every) + update_every;
+
+       /*
+        * wait until it is time
+        * it is important to do it in a loop
+        * since many wait functions can be interrupted
+        */
+       while( now < next_run ) {
+           sleepMilliseconds(next_run - now);
+           now = currentTimeStampInMilliseconds();
+       }
+       
+       /* calculate the time passed since the last run */
+       if ( loops > 0 )
+           dt_since_last_run = (now - last_run) * 1000; /* in microseconds */
+
+       /* prepare for the next loop */
+       last_run = now;
+       loops++;
+
+       /* do your magic here to collect values */
+       collectValues();
+
+       /* send the collected data to netdata */
+       printValues(dt_since_last_run); /* print BEGIN, SET, END statements */
+   }
+```
+
+   Using the above procedure, your plugin will be synchronized to start data collection on steps of `update_every`. There will be no need to keep track of latencies in data collection.
+
+   Netdata interpolates values to second boundaries, so even if your plugin is not perfectly aligned it does not matter. Netdata will find out. When your plugin works in increments of `update_every`, there will be no gaps in the charts due to the possible cumulative micro-delays in data collection. Gaps will only appear if the data collection is really delayed.
+
+3. If you are not sure your plugin is free of memory leaks, exit (with zero) every hour. Netdata will restart your process.
+
+4. If possible, try to autodetect if your plugin should be enabled, without any configuration.
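
The pacing pseudo code added by this patch could be sketched in Python roughly as follows. This is a minimal illustration, not part of the commit: the names `next_run_ms`, `format_update`, `run` and the chart id `example.random` are hypothetical, and it assumes that `BEGIN` accepts the chart id followed by an optional number of microseconds since the last run (the exact syntax is defined in the API section of the README, which this diff does not show). The `CHART` and `DIMENSION` lines are expected to have been printed before the loop starts and are omitted here.

```python
import sys
import time


def next_run_ms(now_ms, update_every_ms):
    """Return the next timestamp (in ms) that is a multiple of update_every_ms."""
    return now_ms - (now_ms % update_every_ms) + update_every_ms


def format_update(chart_id, values, dt_us=0):
    """Build one BEGIN/SET/END block as text.

    values maps a dimension id to an integer, or to None when the value
    was not collected (the SET line is then left empty, as the README allows).
    Assumption: BEGIN takes an optional microseconds argument.
    """
    begin = f"BEGIN {chart_id} {dt_us}" if dt_us > 0 else f"BEGIN {chart_id}"
    lines = [begin]
    for dim_id, value in values.items():
        lines.append(f"SET {dim_id} = {'' if value is None else value}")
    lines.append("END")
    return "\n".join(lines) + "\n"


def run(update_every_s, collect, chart_id="example.random", iterations=None,
        out=sys.stdout, clock=time.time, sleep=time.sleep):
    """Collect and print values on boundaries aligned to update_every_s.

    iterations=None loops forever; clock and sleep are injectable so the
    loop can be exercised without real waiting.
    """
    update_every_ms = update_every_s * 1000
    last_run = 0
    loops = 0
    while iterations is None or loops < iterations:
        now = int(clock() * 1000)
        next_run = next_run_ms(now, update_every_ms)
        # Sleep in a loop, since a sleep call may return early.
        while now < next_run:
            sleep((next_run - now) / 1000.0)
            now = int(clock() * 1000)
        # Time since the previous run, in microseconds (0 on the first loop).
        dt_us = (now - last_run) * 1000 if loops > 0 else 0
        last_run = now
        loops += 1
        out.write(format_update(chart_id, collect(), dt_us))
        out.flush()  # stdout is a pipe, so flush after every iteration
```

A plugin would call something like `run(int(sys.argv[1]), my_collect)`, where `my_collect` is a hypothetical function returning a dictionary of dimension ids and values. Injecting `clock` and `sleep` is a design choice made here only to keep the loop testable.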