Regarding the current list server:

  * PPro 200 w/ 128MB RAM, 9GB, 18GB SCSI discs

Paraphrasing Carl on the reasons it's getting overwhelmed:

  * mailman stores its data in huge mbox files
  * mbox files must be locked for each read & write
  * large # of requests for the LKML archives: spiders, etc.

Mailman also requires that each email to be "threaded" by its pipermail
module be submitted one at a time through the global mailman wrapper
script. There is no well-defined separation between the archive function
of the list and the distribution function of the list. The libraries may
support it with some hacking, but it's not there yet.

Paraphrasing Carl's solutions:

  * rewrite mailman to use an SQL backend
  * hardware upgrades: memory, extra hard drives, etc.
  * new box for TCLUG

SQL is overkill. It is also slower than standard file transfers. The
Linux kernel is much better at pushing out static pages than dynamic
content anyway. Use it to its advantage. That's the reason mailman is
web-ifying the list archive anyway.

Regarding spiders, you can configure Apache to send the "No Archive"
header when fulfilling requests for *.mbox files. Allow the spiders to
hit the individual HTML files generated. This is how most people would
prefer to view web-based archives anyway. Provide links on the site to
download the full mbox files. (I'd be interested in seeing statistics on
how many mbox files are searched as opposed to simply being downloaded.)

Store the list in separate mbox files, split by size or date. gzip
them. mbox files don't need to be accessed directly, ever. Let users
download them to their mail directories and browse them there. Otherwise
make them use the web front-end.

As for webifying the archive: drop pipermail. Don't even bother with
it. Use mhonarc. With this very nice and configurable tool you can
offload the html-izing of the archive onto another machine. All you need
is the mbox, perl, and mhonarc.
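
The suggestion above to store the list in separate mbox files by date and
gzip them could be sketched roughly as follows in Python. This is only an
illustration, not something mailman or mhonarc provides: the function
names month_key and split_mbox_by_month and the
<prefix>-YYYY-MM.mbox.gz naming are made up for the example, and it
groups messages by their Date header, which can be missing or wrong on
real list traffic.

```python
import gzip
import mailbox
from collections import Counter
from contextlib import ExitStack
from email.utils import parsedate_to_datetime


def month_key(msg):
    """Return 'YYYY-MM' from the Date header, or 'undated' if unusable."""
    header = msg["Date"]
    if header is None:
        return "undated"
    try:
        return parsedate_to_datetime(str(header)).strftime("%Y-%m")
    except (TypeError, ValueError):
        return "undated"


def split_mbox_by_month(src_path, out_prefix):
    """Copy each message in src_path into <out_prefix>-YYYY-MM.mbox.gz.

    Returns a Counter mapping month -> number of messages written.
    Existing output files with the same names are overwritten.
    """
    counts = Counter()
    box = mailbox.mbox(src_path, create=False)
    with ExitStack() as stack:
        stack.callback(box.close)
        outputs = {}  # month -> open gzip file
        for key in box.iterkeys():
            month = month_key(box.get_message(key))
            if month not in outputs:
                path = f"{out_prefix}-{month}.mbox.gz"
                outputs[month] = stack.enter_context(gzip.open(path, "wb"))
            # from_=True keeps the "From " separator line of the message.
            raw = box.get_bytes(key, from_=True)
            if not raw.endswith(b"\n"):
                raw += b"\n"
            # mbox expects a blank line between consecutive messages.
            outputs[month].write(raw + b"\n")
            counts[month] += 1
    return counts
```

Run it against a copy of the archive, for example
split_mbox_by_month("lkml.mbox", "lkml") (hypothetical file names), and
publish the resulting .gz files as plain static downloads. The source
mbox is only read, never modified.
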
Use rsync or ftp to push the html-ized archive to your website. Let the
distribution of the lists remain separate from the web-ifying of the
list.

Don't bother pushing out "ancient emails" through an exploder list if
you fall behind. Simply provide the mbox files to the subscribers and
say:

  "Look. The email server was down for a while and we missed about
  30,000 posts from LKML. Instead of trying to re-inject them, we're
  making the mbox file available for download at <URI>. The web
  archives will be updated every 3 hours and can be found at <URI>. If
  you need help with integrating these mboxes into your own email
  client/folders, take the mbox and dump it on your spool with
  cat file >> <spoolfile>. A more elegant solution is..."

Really. Until mailman cleanly separates the task of distribution from
the task of webifying the archive, so that the latter can be offloaded
(hmm...separate mailman from pipermail, perhaps), it'll continue to be a
bad solution for high-volume lists. It's much better to go with a simple
email list server, such as SmartList, and use the power of multiple
tools, such as mhonarc, perl, and apache.

DISCLAIMER

This post is simply a continuation of the discussion about the TCLUG
email list and its use of mailman, the python email listserver /
archiver. It is not intended as a criticism of the services that Real
Time is providing us for Free! I just honestly disagree with the
implementation being used. Regardless, I appreciate RTE's commitment to
TCLUG!

-- 
Chad Walstrom <chewie at wookimus.net>     | a.k.a. ^chewie
http://www.wookimus.net/                   | s.k.a. gunnarr
Get my public key, ICQ#, etc. $(mailx -s 'get info' chewie at wookimus.net)