[[:en:documentation:pandorafms:introduction:01_introduction|Pandora FMS]] is a complex distributed application with several key elements, any of which can become a bottleneck if it is not sized and configured correctly. The purpose of this chapter is to help carry out a capacity study, **to analyze the //scalability// of Pandora FMS according to a specific set of parameters**. This study will help to find out the requirements that the installation should meet to be able to support a given capacity.

Load tests are also used to observe the maximum capacity per server. In the current architecture model ([[:en:documentation:pandorafms:technical_reference:10_versions|version 3.0 or later]]), with "N" independent servers and a **[[:en:documentation:pandorafms:command_center:01_introduction|Command Center (Metaconsole)]]** installed, //scalability// tends to be of linear order, while //scalability// based on centralized models is of exponential order.
* Homogeneous systems with a series of characterizations grouped into technologies / policies.
* Highly variable intervals between the different modules and events to be monitored.
* Large amount of asynchronous information (events, log items).
* Lots of process status information with very little probability of change.
* Little performance information compared to the total.
* 90% of the monitoring is performed at the endpoints.
<WRAP center round tip 90%>
A Pandora FMS installation with a Linux server installed "by default" on a powerful machine cannot go beyond 5 to 6 packets per second; on a powerful, well optimized and tuned machine it can reach 30 to 40 packets per second. **This also depends a lot on the number of modules in each agent**.
</WRAP>
Obviously, the more threads you have, the more checks you can execute. If you add up all the threads that Pandora FMS executes, they should not reach the range of 30 to 40. You should not use more than 10 threads here, although this depends a lot on the type of hardware and the Linux version you are using.
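As a reference for where these limits are adjusted, the sketch below simply lists the thread-related lines of the server configuration file and restarts the service afterwards. The exact parameter names (for example network_threads) are assumptions to be verified against your own pandora_server.conf, not values taken from this section.

<code bash>
# Hedged sketch: review the thread-related tuning parameters before a test.
# The pattern assumes parameters ending in "threads" (e.g. network_threads);
# confirm the real names in the comments of your pandora_server.conf.
grep -E '^[a-z_]*threads' /etc/pandora/pandora_server.conf

# Apply any change by restarting the server (assuming a systemd installation).
systemctl restart pandora_server
</code>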
Now you must "create" a fictitious number of ping type modules to test with. It is assumed that a total of 3000 ping modules will be tested. To do this, it is best to pick a system on the network capable of answering all the pings (any Linux server can handle the task).
Using the Pandora FMS CSV importer, create a file with the following format:
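The exact column layout expected by the importer should be taken from the console's import dialog; as a hedged illustration only, a loop such as the one below can generate the 3000 test entries, where the name;IP;group layout and the target address are placeholder assumptions, not the authoritative CSV format.

<code bash>
# Hedged sketch: generate 3000 test entries pointing at a single target host.
# The "name;ip;group" layout and 192.168.50.1 are placeholders, NOT the
# authoritative import format; adapt them to what the import dialog expects.
TARGET_IP="192.168.50.1"
for i in $(seq 1 3000); do
  echo "ping_test_${i};${TARGET_IP};Tests"
done > ping_modules.csv
</code>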
==== SNMP Server ====
This is the SNMP network server. It is assumed that the server is already up, running and configured. Some key parameters for its operation:
It defines the number of SNMP requests that the system will make in each execution. Bear in mind that the server groups them by destination IP address, so this block size is only indicative. It should not be too large (30 to 40 maximum). When an element of the block fails, an internal counter makes the PFMS server retry it.

Obviously, the more threads you have, the more checks you can execute. If you add up all the threads that Pandora FMS executes, they should not reach the range of 30 to 40. You should not use more than 10 threads here, although this depends a lot on the type of hardware and the Linux version you are using.
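The section above describes these settings without naming them; as a hedged aid, the line below simply prints the block-size and SNMP-related entries of the server configuration so they can be reviewed before the test. The grep pattern (block_size, snmp…) is an assumption about the parameter names and must be checked against your own pandora_server.conf.

<code bash>
# Hedged sketch: show block-size and SNMP related settings before a load test.
# "block_size" and the "snmp" prefix are assumed parameter names; confirm them
# in the comments of your own pandora_server.conf.
grep -E '^(block_size|snmp)' /etc/pandora/pandora_server.conf
</code>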
The fastest way to test is with an SNMP device, applying to all of its interfaces all the standard "basic" monitoring modules. This is done with the SNMP Explorer (Agent → Administration mode → SNMP Explorer). Identify the interfaces and apply all the metrics to each interface. On a 24-port switch, this generates 650 modules.
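Before launching the SNMP Explorer against the test device, it may be worth confirming that it answers SNMP and counting its interfaces. The sketch below uses the standard net-snmp tools and the IF-MIB interface description column; the community string and IP address are placeholders.

<code bash>
# Hedged sketch: confirm the test switch answers SNMP and count its interfaces.
# "public" and 192.168.50.2 are placeholder values for this example.
snmpwalk -v2c -c public 192.168.50.2 1.3.6.1.2.1.2.2.1.2 | wc -l
</code>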
==== Events ====
In a similar way as with SNMP, the [[:en:documentation:pandorafms:management_and_operation:02_events|events]] of the PFMS system will be evaluated in two scenarios:
- Normal event reception rate. This has already been tested in the data server, since an event is generated at each state change.
- Event generation storm. To do this, the generation of events will be forced via CLI, using the following command (with an existing group called "Tests"):
<file>
pandora_manage /etc/pandora/pandora_server.conf --create_event "Event test" system Tests
</file>

This command, used in a loop like the one used to generate traps, can be used to generate dozens of events per second. It can be parallelized in a script with several instances to achieve a higher number of insertions. This is useful to simulate the behavior of the system during an event storm, so the system can be tested before, during and after the event storm.
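As a minimal sketch of that loop, assuming the pandora_manage CLI shown above is in the PATH and the "Tests" group exists, something like the following can be run; several copies can be launched in parallel to raise the insertion rate.

<code bash>
# Minimal sketch: generate a burst of test events through the CLI.
# Assumes pandora_manage is in the PATH and the "Tests" group already exists.
for i in $(seq 1 100); do
  pandora_manage /etc/pandora/pandora_server.conf \
    --create_event "Event storm test ${i}" system Tests
done
</code>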
<wrap #ks3_7 />
For this, another server independent from Pandora FMS will be used, running the WEB monitoring functionality. In a user session, perform the following tasks in this order and measure how long they take to be processed:
- Login to the console.
- Display a report (in HTML). This report should contain a couple of graphs and a couple of modules with SUM or AVERAGE type reports. The interval of each item should be one week or five days.
- Display of a combined graph (24 hours).
- Generation of a PDF report (another, different report).
This test is performed with at least three different users. The task can be parallelized so that it runs every minute: with five tasks (each one with its own user), you would be simulating the navigation of five simultaneous users. Once the environment is established, take into account the following:
- The average speed of each module is relevant when it comes to identifying "bottlenecks" related to other parallel activities, such as the execution of the maintenance //script//, et cetera.
- The CPU and memory impact on the server will be measured for each concurrent session.
- The impact of each simulated user session will be measured with respect to the average time of the rest of the sessions. That is, estimate how many seconds of delay each additional simultaneous session adds.
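As a rough complement to the WEB monitoring session above (not a replacement for it), the response time of individual console pages can also be sampled from the independent test server with curl. The console URL below is a placeholder, and this unauthenticated request only gives a first approximation of page latency.

<code bash>
# Hedged sketch: sample how long a console page takes to answer.
# The URL is a placeholder; an authenticated WEB monitoring session as
# described above is still needed for realistic figures.
for i in $(seq 1 5); do
  curl -s -o /dev/null -w "attempt ${i}: %{time_total}s\n" \
    "http://pandora.example.com/pandora_console/"
done
</code>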