Custom web archives
This feature must allow Sympa users to store archives in a different place than the classical web archives based on MHonArc.
The development was requested and is paid for by Bendlin GmbH.
Storage database
It was first supposed to be Amazon SimpleDB.
It now seems that it should be either Postgres or MySQL. In any case, as long as a Perl DBD driver exists for the chosen RDBMS, transferring data from Sympa to it should not be a problem.
Storage structure
Field name | Data type | Field description |
---|---|---|
message_key | int(11) | the arbitrary numerical primary key (see the primary key specification below) |
message_id | varchar(150) | the SMTP message identifier |
group_id | varchar(150) | the name of the mailing list to which the message was sent |
timestamp | int(11) | the epoch date when the message was sent |
subject | varchar(300) | the content of the SMTP Subject field in the original message |
message | text | the full SMTP message, including all the headers |
uid | ? | ? |
<box red|Primary key specification>
The SMTP header Message-ID: does not seem to be a good primary key. If a message is sent to two lists, both copies carry the same Message-ID: and we could have a primary key collision. For this reason, we propose to add an arbitrary numerical primary key. Searching by Message-ID: would still be possible.
</box>
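As a rough illustration of the table and key choice described above, the following Python sketch uses the standard-library sqlite3 module in place of Postgres or MySQL (an assumption made only so that the example runs without a database server). The table name custom_archive, the index name and the sample values are hypothetical; the column names come from the table above, and the uid column is left out because its type is still undefined.

```python
import sqlite3

# In-memory SQLite stands in for the Postgres/MySQL target (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE custom_archive (              -- hypothetical table name
        message_key INTEGER PRIMARY KEY,       -- arbitrary numerical key
        message_id  VARCHAR(150) NOT NULL,     -- Message-ID: header value
        group_id    VARCHAR(150) NOT NULL,     -- name of the mailing list
        timestamp   INTEGER NOT NULL,          -- epoch seconds
        subject     VARCHAR(300),
        message     TEXT NOT NULL              -- full message with headers
    )
    """
)
# A non-unique index keeps searches by Message-ID possible and cheap.
conn.execute("CREATE INDEX idx_message_id ON custom_archive (message_id)")

raw = "Message-ID: <abc@example.org>\nSubject: Hello\n\nBody\n"

# The same message sent to two lists: one Message-ID, two rows, no collision.
for list_name in ("list-a", "list-b"):
    conn.execute(
        "INSERT INTO custom_archive "
        "(message_id, group_id, timestamp, subject, message) "
        "VALUES (?, ?, ?, ?, ?)",
        ("<abc@example.org>", list_name, 1200000000, "Hello", raw),
    )
conn.commit()

# Searching by Message-ID returns one row per list.
rows = conn.execute(
    "SELECT message_key, group_id FROM custom_archive "
    "WHERE message_id = ? ORDER BY message_key",
    ("<abc@example.org>",),
).fetchall()
print(rows)
```

Had message_id been declared as the primary key, the second insert would have been rejected, which is the collision the box describes.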
Customization mechanism
A parameter in sympa.conf, custom_archiver, activates or deactivates this feature.
We add a test to archived.pl. When archiving a message, if the custom_archiver parameter is set to on, an external script is called, instead of the code using MHonArc, to store the message in the chosen database.
This script cannot be handled the way we had planned for hooks, because it changes Sympa's behaviour rather than only adding new operations to an existing sub.
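The test added to archived.pl can be pictured with the sketch below. It is written in Python purely for illustration (archived.pl itself is Perl), and everything except the custom_archiver parameter name and its on value is an assumption: the function names, the dictionary standing in for the configuration, and the idea of passing the raw message to the script on standard input are all hypothetical.

```python
import subprocess
import sys


def archive_with_mhonarc(raw_message: bytes) -> str:
    # Placeholder for the existing MHonArc-based code path.
    return "mhonarc"


def archive_with_custom_script(raw_message: bytes, script_cmd: list) -> str:
    # Hand the full message to the external script on stdin (assumed interface).
    subprocess.run(script_cmd, input=raw_message, check=True)
    return "custom"


def archive_message(raw_message: bytes, conf: dict, script_cmd: list) -> str:
    # The external script replaces the MHonArc path; it is not run in addition.
    if conf.get("custom_archiver", "off") == "on":
        return archive_with_custom_script(raw_message, script_cmd)
    return archive_with_mhonarc(raw_message)


# Stand-in for the real storage script: a child process that only reads stdin.
demo_cmd = [sys.executable, "-c", "import sys; sys.stdin.buffer.read()"]
msg = b"Subject: test\n\nbody\n"

print(archive_message(msg, {"custom_archiver": "on"}, demo_cmd))
print(archive_message(msg, {"custom_archiver": "off"}, demo_cmd))
```

The point of the sketch is the either/or branch: unlike a hook, the external script takes over the archiving step entirely when the parameter is on.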
A few details about this script:
- Content: The script contains only the code specific to storing archives in an RDBMS. It relies on the Sympa scripts and libraries to work, so it contains the path to the Sympa bin directory. However, the details of the database connection have to be defined by the end user by editing the script. We do not think it is relevant to include the connection parameters for the archive database in one of the Sympa configuration files. Consequently, the distributed script must be considered an example script. It can remain in the sources or be installed in one of the Sympa directories. Either way, it must be copied to a place that will not be overwritten during the next Sympa upgrade.
- Distribution: In the Sympa sources, you will find the script in the <sympa_home>/bin/etc/scripts/ directory under the name db_archiver.pl.
- Name and path: We use a parameter named archiver_path that replaces the existing mhonarc parameter. This parameter can be defined at the list or virtual host level. You can use any script name or path for the script. You can have as many versions of the script as you want, one for each list if you like.
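To make the "Content" item above more concrete, here is a minimal sketch of the kind of work the storage script has to do before writing to the database: turning a raw message into the column values of the storage table. It is written in Python with the standard email package, whereas the distributed script is Perl and relies on the Sympa libraries; the function name message_to_row and the sample message are hypothetical, and the database connection is deliberately omitted since it is left to the end user.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime


def message_to_row(raw_message: str, list_name: str) -> dict:
    """Build the column values for one archived message."""
    msg = message_from_string(raw_message)
    # Convert the Date: header to an epoch value for the timestamp column.
    sent = parsedate_to_datetime(msg["Date"])
    return {
        "message_id": msg["Message-ID"],
        "group_id": list_name,
        "timestamp": int(sent.timestamp()),
        "subject": msg["Subject"],
        "message": raw_message,  # full message, headers included
    }


raw = (
    "Message-ID: <abc@example.org>\n"
    "Date: Thu, 01 Jan 2009 00:00:00 +0000\n"
    "Subject: Hello list\n"
    "\n"
    "Body text\n"
)
row = message_to_row(raw, "list-a")
print(row["message_id"], row["timestamp"], row["subject"])
```

A real script would also have to cope with messages that lack a Date: or Message-ID: header, which this sketch does not attempt.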