====================
Website checker demo
====================
.. image:: https://github.com/vladan/aiven-site-checker/workflows/unittests/badge.svg
   :target: https://github.com/vladan/aiven-site-checker/actions?query=workflow%3Aunittests+branch%3Amaster

.. image:: https://github.com/vladan/aiven-site-checker/workflows/documentation/badge.svg
   :target: https://vladan.github.io/aiven-site-checker/

CHWEB is a website checking tool.
It sends HTTP requests to sites to check their status and availability and,
if the ``regex`` param is specified, matches it against the response body.
The result of each status check is sent to a Kafka topic. When read by the
consumer, the status check is written to a PostgreSQL database.

The project is in its very early stages and is meant to demo
`aiven <https://aiven.io>`_'s platform, using their
`kafka <https://aiven.io/kafka>`_ and
`postgresql <https://aiven.io/postgresql>`_ services.
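
The check described above can be sketched roughly as follows. This is only an
illustration, not the project's actual code: the function name
``evaluate_response`` and the record fields ``status`` and ``regex_matches``
are assumptions made for this example, and the HTTP request itself is left out
so the snippet runs without network access.

.. code-block:: python

    import re
    from typing import Optional


    def evaluate_response(url: str, status_code: int, body: str,
                          regex: Optional[str] = None) -> dict:
        """Build a status-check record from an already fetched response.

        Hypothetical helper; the field names are illustrative only.
        """
        record = {"url": url, "status": status_code}
        if regex is not None:
            # re.search scans the whole body; True means the pattern was found.
            record["regex_matches"] = re.search(regex, body) is not None
        return record


    ok = evaluate_response("https://example.com", 200, "Example Domain", regex="Domain")
    miss = evaluate_response("https://example.com", 200, "Example Domain", regex="aaaaaaaa")
    plain = evaluate_response("https://example.com/404", 404, "")

A record like this is the kind of payload that would then be serialized and
sent to the Kafka topic.
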
Quickstart with docker-compose
==============================

The services can be run with docker-compose. Before starting them, change the
values of ``KAFKA_SERVERS``, ``POSTGRES_HOST`` and ``POSTGRES_PASS`` so that
the configuration is applied properly. You also need to download the kafka
certificates and put them in the folder you're running docker-compose from,
and create the PostgreSQL cert by copying it from aiven's console and saving
it to ``pgcert.pem``. After all this is done, simply run::

    docker-compose up

No docker?
----------

If you haven't got docker, or don't want to use it, you can install this
package and run it manually.

Install the latest dev version from the github repository::

    pip install git+https://github.com/vladan/aiven-site-checker.git

Run the website status collector in one terminal::

    chweb_collect -c config.yaml

and the consumer in another::

    chweb_consume -c config.yaml

Config file
===========

There's an example config file in the top-level dir of the repository, named
``config.yaml``, that you can use as a reference. The main sections are
explained below.

Sites
-----
.. highlight:: yaml

You can specify the sites you want checked in the yaml config file. They are
stored under the ``sites`` key as a list of objects. ``url`` and
``check_interval`` are mandatory fields; ``regex`` is an optional field that
can be freely omitted. When it is present, the body of the response is checked
against the regular expression::

    - url: "https://example.com"
      regex: "domain" # a regex matching the body of the response
      check_interval: 5
    - url: "https://example.com"
      regex: "aaaaaaaa" # a regex not matching the body of the response
      check_interval: 8
    - url: "https://example.com/404"
      check_interval: 13

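
The mandatory and optional fields described above could be checked with
something like the following sketch. It works on the already parsed config (a
plain ``dict``); the helper name ``validate_site`` and the error message are
hypothetical and not taken from the package.

.. code-block:: python

    def validate_site(site: dict) -> dict:
        """Check one entry of the ``sites`` list (hypothetical helper)."""
        # ``url`` and ``check_interval`` are mandatory, ``regex`` is optional.
        missing = [key for key in ("url", "check_interval") if key not in site]
        if missing:
            raise ValueError(
                "site entry is missing mandatory field(s): " + ", ".join(missing)
            )
        return {
            "url": site["url"],
            "check_interval": int(site["check_interval"]),
            "regex": site.get("regex"),  # None when the field is omitted
        }


    # The parsed equivalent of a small ``sites`` section.
    config = {
        "sites": [
            {"url": "https://example.com", "regex": "domain", "check_interval": 5},
            {"url": "https://example.com/404", "check_interval": 13},
        ]
    }
    sites = [validate_site(site) for site in config["sites"]]
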
Logging configuration
---------------------
The ``logging`` section must be present. A simple example of a console logger
as seen in ``config.yaml``::

    logging:
      version: 1
      formatters:
        standard:
          format: "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
        error:
          format: "%(levelname)s <PID %(process)d:%(processName)s> %(name)s.%(funcName)s(): %(message)s"
      handlers:
        console:
          class: logging.StreamHandler
          level: DEBUG
          formatter: standard
          stream: ext://sys.stdout
      root:
        level: DEBUG
        handlers: [console]
        propagate: yes

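
The keys in this section follow the dictionary schema of Python's
``logging.config.dictConfig``. Assuming the section is handed to that function
once the yaml is parsed (an assumption, the README does not say how it is
consumed), the effect can be sketched like this, with the section written as a
Python ``dict`` and trimmed to one formatter:

.. code-block:: python

    import logging
    import logging.config

    # The parsed equivalent of the ``logging`` section above (trimmed).
    LOGGING = {
        "version": 1,
        "formatters": {
            "standard": {
                "format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
            }
        },
        "handlers": {
            "console": {
                "class": "logging.StreamHandler",
                "level": "DEBUG",
                "formatter": "standard",
                "stream": "ext://sys.stdout",  # resolved to sys.stdout by dictConfig
            }
        },
        "root": {"level": "DEBUG", "handlers": ["console"]},
    }

    logging.config.dictConfig(LOGGING)
    # "chweb.demo" is an arbitrary logger name used only for this example.
    logging.getLogger("chweb.demo").debug("logging is configured")
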
Postgres
--------

Postgres settings with aiven's certs, passwords, etc. You'll get them once you
set up the postgres service on https://console.aiven.io/. My postgres section
in the config looks like this::

    postgres:
      dbhost: "pg-2e0f365c-vladanovic-4654.aivencloud.com"
      dbport: 23700
      dbname: "defaultdb"
      dbuser: "avnadmin"
      dbpass: "..."
      dbcert: "./certs/pg.pem"

* ``dbhost``, ``dbport``, ``dbuser`` and ``dbpass`` are straightforward.
* ``dbname`` is set to ``defaultdb``, the database present by default when the
  service is created. You can create another database if that floats your boat.
* ``dbcert`` is the cert which opens in a modal popup in the console. You need
  to copy it manually to a file, and state the path to that file in the config.
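
To show how these fields fit together, here is a sketch that turns the section
into a libpq-style connection URI. It assumes that ``dbcert`` is the CA
certificate used to verify the server (hence ``sslmode=verify-ca`` and
``sslrootcert``); the README does not state this, and the package may connect
differently. The helper name ``postgres_dsn`` and the sample values are made
up for the example.

.. code-block:: python

    from urllib.parse import quote, urlencode


    def postgres_dsn(pg: dict) -> str:
        """Build a connection URI from a ``postgres`` section (hypothetical helper)."""
        # Assumption: ``dbcert`` is a CA certificate for verifying the server.
        query = urlencode({"sslmode": "verify-ca", "sslrootcert": pg["dbcert"]})
        # Percent-encode user and password so characters like "@" or "/" are safe.
        user = quote(pg["dbuser"], safe="")
        password = quote(pg["dbpass"], safe="")
        return (
            f"postgresql://{user}:{password}"
            f"@{pg['dbhost']}:{pg['dbport']}/{pg['dbname']}?{query}"
        )


    dsn = postgres_dsn({
        "dbhost": "db.example.invalid",  # placeholder host
        "dbport": 5432,
        "dbname": "defaultdb",
        "dbuser": "avnadmin",
        "dbpass": "p@ss/word",  # placeholder password
        "dbcert": "./certs/pg.pem",
    })
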
Kafka
-----

Kafka is also a service easily provisioned through aiven's console. After it's
set up you get a config section similar to this one::

    kafka:
      servers:
        - "kafka-f7ae38e-vladanovic-4654.aivencloud.com:23702"
      topic: "sitestats"
      cafile: "./certs/ca.pem"
      cert: "./certs/service.cert"
      key: "./certs/service.key"
      passwd: "..."

* ``servers`` is a list because that's how the library is initialized, which
  makes sense if you have multiple brokers.
* ``topic`` is the kafka topic messages are sent to. You need to define it in
  aiven's console as well.
* ``cafile``, ``cert`` and ``key`` are the ssl certificates you get when aiven's
  kafka service is ready.
* ``passwd`` is your aiven-provided password.
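
As an illustration of how such a section maps onto a client, the sketch below
translates it into keyword arguments using the option names of kafka-python's
``KafkaProducer`` (``bootstrap_servers``, ``security_protocol``,
``ssl_cafile``, ``ssl_certfile``, ``ssl_keyfile``). Whether the package uses
that library is an assumption, and ``passwd`` is left out because the README
does not say what it protects. The helper name ``kafka_client_kwargs`` and the
broker address are hypothetical.

.. code-block:: python

    def kafka_client_kwargs(kafka: dict) -> dict:
        """Map a ``kafka`` section to SSL client options (hypothetical helper)."""
        return {
            # A list, so several brokers can be given.
            "bootstrap_servers": list(kafka["servers"]),
            "security_protocol": "SSL",
            "ssl_cafile": kafka["cafile"],
            "ssl_certfile": kafka["cert"],
            "ssl_keyfile": kafka["key"],
        }


    kwargs = kafka_client_kwargs({
        "servers": ["kafka.example.invalid:9092"],  # placeholder broker
        "topic": "sitestats",
        "cafile": "./certs/ca.pem",
        "cert": "./certs/service.cert",
        "key": "./certs/service.key",
    })

The ``topic`` is not part of the connection options; it is passed when a
message is sent or when the consumer subscribes.
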