

Custom web archives

This feature must allow Sympa users to store archives somewhere other than the classical web archives based on MHonArc.

The development was requested and paid for by Bendlin GmbH.

The target storage was first supposed to be Amazon SimpleDB.

It now seems that it should be either Postgres or MySQL. In any case, as long as a Perl DBD driver exists for the chosen RDBMS, transferring data from Sympa to this RDBMS should not be a problem.

^ Field name ^ Data type ^ Field description ^
| message_key | int(11) | an arbitrary numerical primary key (see the box below) |
| message_id | varchar(150) | the SMTP message identifier |
| group_id | varchar(150) | the name of the mailing list to which the message was sent |
| timestamp | int(11) | the epoch date when the message was sent |
| subject | varchar(300) | the content of the SMTP Subject field in the original message |
| message | text | the full SMTP message, including all the headers |
| uid | ? | ? |

<box red|Primary key specification> The SMTP Message-ID header does not seem to be a good primary key: if a message is sent to two lists, the same value is stored twice and the primary keys collide. For this reason, we propose to add an arbitrary numerical primary key. Searching by Message-ID would still be possible. </box>
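
The primary key proposal can be illustrated with a small sketch. It is written in Python against SQLite only so that it runs without a database server; the real storage would be Postgres or MySQL, accessed from Perl. The table name `archive`, the index, and the helper functions are assumptions made for illustration, and the `uid` column is left out because its type and meaning are not yet defined.

```python
import sqlite3

# Hypothetical schema mirroring the field list above.
# SQLite stands in for Postgres or MySQL; the table name is an assumption.
SCHEMA = """
CREATE TABLE archive (
    message_key INTEGER PRIMARY KEY AUTOINCREMENT,  -- arbitrary numerical key
    message_id  VARCHAR(150) NOT NULL,              -- SMTP Message-ID header
    group_id    VARCHAR(150) NOT NULL,              -- mailing list name
    timestamp   INTEGER NOT NULL,                   -- epoch date of the message
    subject     VARCHAR(300),                       -- SMTP Subject header
    message     TEXT NOT NULL                       -- full message with headers
);
-- Non-unique index: the same Message-ID may appear once per list.
CREATE INDEX archive_message_id ON archive (message_id);
"""


def open_archive(path=":memory:"):
    """Open the database and create the (hypothetical) archive table."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def store(conn, message_id, group_id, timestamp, subject, message):
    """Insert one message and return the generated message_key."""
    cur = conn.execute(
        "INSERT INTO archive (message_id, group_id, timestamp, subject, message) "
        "VALUES (?, ?, ?, ?, ?)",
        (message_id, group_id, timestamp, subject, message),
    )
    conn.commit()
    return cur.lastrowid


def find_by_message_id(conn, message_id):
    """Return (message_key, group_id) for every copy of a given Message-ID."""
    rows = conn.execute(
        "SELECT message_key, group_id FROM archive "
        "WHERE message_id = ? ORDER BY message_key",
        (message_id,),
    )
    return rows.fetchall()
```

Storing the same message for two lists yields two rows with distinct `message_key` values, and `find_by_message_id` returns both of them.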

A parameter in sympa.conf, custom_archiver, enables or disables this feature.

We add a test to archived.pl. When a message is archived and the custom_archiver parameter is set to on, an external script is called instead of the code using MHonArc. This script stores the message in the chosen database.

This script cannot be handled in the way we envisioned for hooks, because it changes Sympa's behaviour rather than only adding new operations to an existing sub.
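
The test added to archived.pl could look roughly like the sketch below. It is written in Python for illustration only, since archived.pl itself is Perl. The function names, the default command names, and the simplified "parameter value" parsing of the configuration are assumptions, not the actual Sympa code.

```python
def parse_conf(text):
    """Parse simplified 'parameter value' lines, skipping blanks and # comments."""
    conf = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        conf[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return conf


def choose_archiver(conf, mhonarc_cmd="mhonarc", script_path="db_archiver.pl"):
    """Return the command that should archive a message.

    The external script replaces MHonArc entirely when custom_archiver
    is 'on'; it is not run in addition to it.
    """
    if conf.get("custom_archiver", "off") == "on":
        return [script_path]
    return [mhonarc_cmd]
```

With `custom_archiver on` in the configuration, `choose_archiver` returns the script; in every other case it falls back to MHonArc, so existing installations keep their current behaviour.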

A few details about this script:

  • Content: The script contains only the code specific to archive storage in an RDBMS. It relies on the Sympa scripts and libraries, so it contains the path to the Sympa bin directory. The database connection details, however, must be set by the end user by editing the script: we do not think it is relevant to include the connection parameters for the archive database in one of the Sympa configuration files. Consequently, the distributed script must be considered an example script. It can remain in the sources or be installed in one of the Sympa directories. Either way, it must be copied to a place that will not be overwritten during the next Sympa upgrade.
  • Distribution. In the Sympa sources, you will find the script in the <sympa_home>/bin/etc/scripts/ directory under the name db_archiver.pl.
  • Name and path. We use a parameter named archiver_path that replaces the existing mhonarc parameter. This parameter can be defined at the list or virtual host level. You can use any name or path for the script, and you can have as many versions of the script as you want, even one for each list if you like.
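
To make the role of the script more concrete, here is a minimal sketch of what such an archiver does. It is written in Python with SQLite so that it runs without a database server, whereas the distributed db_archiver.pl is a Perl script targeting Postgres or MySQL. The `DB_PATH` constant, the `archive` table, the function names, and the way the message and list name reach the script are all assumptions.

```python
import sqlite3
import time
from email import message_from_string
from email.utils import parsedate_to_datetime

# Connection details are kept in the script itself and edited by the
# site administrator. This value is a placeholder, not a Sympa setting.
DB_PATH = ":memory:"

# Hypothetical table; the uid column is omitted because it is undefined.
CREATE_TABLE = """
CREATE TABLE IF NOT EXISTS archive (
    message_key INTEGER PRIMARY KEY AUTOINCREMENT,
    message_id  VARCHAR(150),
    group_id    VARCHAR(150),
    timestamp   INTEGER,
    subject     VARCHAR(300),
    message     TEXT
)
"""


def extract_fields(raw_message, list_name):
    """Build one table row from the raw message and the list name."""
    msg = message_from_string(raw_message)
    if msg["Date"] is not None:
        timestamp = int(parsedate_to_datetime(msg["Date"]).timestamp())
    else:
        # Assumption: fall back to the archiving time when Date is missing.
        timestamp = int(time.time())
    return {
        "message_id": msg["Message-ID"],
        "group_id": list_name,
        "timestamp": timestamp,
        "subject": msg["Subject"],
        "message": raw_message,
    }


def archive_message(conn, raw_message, list_name):
    """Store the message and return its generated message_key."""
    conn.execute(CREATE_TABLE)
    cur = conn.execute(
        "INSERT INTO archive (message_id, group_id, timestamp, subject, message) "
        "VALUES (:message_id, :group_id, :timestamp, :subject, :message)",
        extract_fields(raw_message, list_name),
    )
    conn.commit()
    return cur.lastrowid
```

A caller would open the connection with `sqlite3.connect(DB_PATH)` and pass each raw message together with its list name to `archive_message`. Keeping the connection settings at the top of the script matches the choice described above: each copy of the script can point to a different database without touching the Sympa configuration files.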

dev/custom_archiving.1234975719.txt.gz · Last modified: 2009/02/18 17:48 by david.verdin@cru.fr