

Custom web archives

This feature must allow Sympa users to store archives somewhere other than the classical web archives based on MHonArc.

The development was requested and is paid for by Bendlin GmbH.

The storage backend was first supposed to be Amazon SimpleDB.

It now seems that it should be either Postgres or MySQL. In any case, as long as a Perl DBD driver exists for the chosen RDBMS, transferring data from Sympa to this RDBMS should not be a problem.

Field name   | Data type    | Field description
message_key  | int(11)      | arbitrary numerical primary key
message_id   | varchar(150) | the SMTP message identifier
group_id     | varchar(150) | the name of the mailing list to which the message was sent
timestamp    | int(11)      | the epoch date when the message was sent
subject      | varchar(100) | the content of the SMTP Subject field in the original message
message      | text         | the full SMTP message, including all the headers
uid          | ?            | ?

The message ID is not a good primary key: if a message is sent to two lists, both archived copies carry the same message ID and the primary key would collide. For this reason, we add an arbitrary numerical primary key, message_key.
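The table and the key choice above can be sketched as follows. This is only an illustration, not the project's code: it is written in Python and uses the in-memory SQLite database from the standard library as a stand-in for Postgres or MySQL. The table name custom_archive and the helper store_message are hypothetical, the column types are adapted to SQLite, and the still undefined uid column is left out.

```python
import sqlite3
import time
from email import message_from_string
from email.utils import parsedate_to_datetime

# Hypothetical table name; columns follow the field table (uid omitted, undefined).
SCHEMA = """
CREATE TABLE custom_archive (
    message_key INTEGER PRIMARY KEY AUTOINCREMENT,  -- arbitrary numerical key
    message_id  VARCHAR(150) NOT NULL,              -- SMTP Message-ID, may repeat
    group_id    VARCHAR(150) NOT NULL,              -- name of the mailing list
    timestamp   INTEGER NOT NULL,                   -- epoch date of sending
    subject     VARCHAR(100),
    message     TEXT NOT NULL                       -- full message with headers
)
"""


def store_message(conn, list_name, raw_message):
    """Insert one archived message and return its generated message_key."""
    msg = message_from_string(raw_message)
    date_header = msg.get("Date")
    if date_header:
        sent = int(parsedate_to_datetime(date_header).timestamp())
    else:
        # Assumption: fall back to the current time when Date is missing.
        sent = int(time.time())
    cur = conn.execute(
        "INSERT INTO custom_archive "
        "(message_id, group_id, timestamp, subject, message) "
        "VALUES (?, ?, ?, ?, ?)",
        (
            msg.get("Message-ID", ""),
            list_name,
            sent,
            (msg.get("Subject") or "")[:100],  # fit the varchar(100) column
            raw_message,
        ),
    )
    conn.commit()
    return cur.lastrowid


RAW = (
    "Message-ID: <abc@example.org>\n"
    "Date: Wed, 18 Feb 2009 11:51:00 +0000\n"
    "Subject: Hello\n"
    "\n"
    "Body text\n"
)

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
# The same message sent to two lists: same message_id, two distinct keys.
key_a = store_message(conn, "list-a", RAW)
key_b = store_message(conn, "list-b", RAW)
```

Because message_key is generated by the database, the two copies are stored side by side even though their message_id values are identical. With message_id as the primary key, the second insert would be rejected.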

A parameter in sympa.conf activates or deactivates this feature: custom_archive.

We add a test to archived.pl: when a message is archived and the custom_archive parameter is set to on, an external script is called instead of the MHonArc-based code, and this script stores the message in the chosen database.
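The test described above could look roughly like the sketch below. It is written in Python for illustration only (archived.pl itself is Perl), and several points are assumptions rather than decisions recorded on this page: the function names are hypothetical, the parameter is treated as off when absent, and the message is handed to the external script on its standard input.

```python
import subprocess


def archive_message(raw_message, conf, run_custom, run_mhonarc):
    """Route a message to the custom script or to the default MHonArc code."""
    # Assumption: the feature is off unless custom_archive is exactly "on".
    if conf.get("custom_archive", "off") == "on":
        return run_custom(raw_message)
    return run_mhonarc(raw_message)


def call_external_script(script_path):
    """Build a handler that pipes the raw message to an external script."""
    def handler(raw_message):
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(
            [script_path], input=raw_message.encode("utf-8"), check=True
        )
        return "custom"
    return handler


# Usage with stand-in handlers, so that nothing external is executed here.
calls = []


def fake_custom(raw_message):
    calls.append(("custom", raw_message))
    return "custom"


def fake_mhonarc(raw_message):
    calls.append(("mhonarc", raw_message))
    return "mhonarc"


result_on = archive_message(
    "msg 1", {"custom_archive": "on"}, fake_custom, fake_mhonarc
)
result_off = archive_message(
    "msg 2", {"custom_archive": "off"}, fake_custom, fake_mhonarc
)
```

In a real setup, call_external_script would be given the location of the custom archiving script and its result passed as run_custom. The point of the sketch is that the custom script replaces the MHonArc branch instead of running in addition to it.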

This script cannot be handled the way we envisioned hooks: it changes Sympa's behaviour instead of only adding new operations to an existing sub.

A few details about this script:

  • Content. The script contains only the code specific to storing archives in an RDBMS. It relies on the Sympa scripts and libraries, so it contains the path to the Sympa bin directory. The database connection details, however, must be defined by the end user by editing the script: we don't think the connection parameters for the archive database belong in one of the Sympa configuration files. Consequently, the distributed script must be considered an example script. It can remain in the sources or be installed in one of the Sympa directories. Either way, it must be copied to a location that will not be overwritten during the next Sympa upgrade.
  • Distribution. In the Sympa sources, you will find the script in the <sympa_home>/bin/etc/scripts/ directory under the name custom_archiving.pl.
  • Name and path. We have three solutions for the name and path of the script:
    1. We define it once and for all. It cannot be changed without changing the code of archived.pl. It could be etc/custom_bin.
    2. We let people customize the path only. It could be a parameter like static_content_path, which could later be used for hooks and whose purpose would be to host custom code for a Sympa installation. I propose the name custom_code_path.
    3. We let people customize the script name. In this case, it must be a robot parameter. I propose the name custom_archiving_script_path for this parameter.

The latter two options can be used independently of one another: you can enforce a specific path and let people customize the script name, or the other way round.
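One possible way to combine options 2 and 3 is sketched below in Python, for illustration only. It assumes that custom_code_path holds a directory, that custom_archiving_script_path holds a script name relative to that directory, and that the defaults are the etc/custom_bin directory from option 1 and the custom_archiving.pl name used in the sources. None of this is settled on this page, and resolve_script is a hypothetical helper.

```python
import posixpath

# Assumed defaults, taken from option 1 and from the distributed script name.
DEFAULT_CODE_PATH = "etc/custom_bin"
DEFAULT_SCRIPT_NAME = "custom_archiving.pl"


def resolve_script(site_conf, robot_conf):
    """Return the script location from the site-wide and per-robot settings."""
    directory = site_conf.get("custom_code_path", DEFAULT_CODE_PATH)
    name = robot_conf.get("custom_archiving_script_path", DEFAULT_SCRIPT_NAME)
    # Note: posixpath.join ignores the directory if the name is absolute.
    return posixpath.join(directory, name)


# Neither parameter set: both defaults apply.
default_location = resolve_script({}, {})
# Only the path is customized (option 2).
custom_dir = resolve_script({"custom_code_path": "/home/sympa/custom"}, {})
# Only the script name is customized, per robot (option 3).
custom_name = resolve_script({}, {"custom_archiving_script_path": "store_pg.pl"})
```

Keeping the two lookups separate is what lets an administrator fix one setting and leave the other open to customization.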

  • Last modified: 2009/02/18 11:51
  • by david.verdin@cru.fr