Opened 16 years ago

Closed 12 years ago

#26 closed defect (fixed)

Better HTML sanitization for the mail viewer

Reported by: rjl Owned by: mortonda@…
Priority: high Milestone: 1.0.3
Component: PHP scripts Version: 1.0.0 RC5
Severity: critical Keywords: html images sanitization mail viewer
Cc:

Description

A more thorough HTML-sanitizing routine should be used to make sure that web bugs don't slip through to the mail viewer. We can't trust MIME-type declarations, either, apparently--some parts advertised as "text/plain" in fact contain HTML that needs to be sanitized. Likewise, tag elements that support URLs as arguments (e.g. "background=...", "link=...") need to be stripped out.

Attachments (1)

sanitize.patch (2.8 KB) - added by mortonda@… 12 years ago.
a basic implementation of htmltidy

Download all attachments as: .zip

Change History (9)

comment:1 Changed 16 years ago by dmorton

  • Priority changed from normal to high
  • Severity changed from normal to critical

this can cause major formatting errors, but more importantly, it could potentially introduce some security problems with a certain browser that has so many security holes it couldn't even float on dry land.

comment:2 Changed 14 years ago by dmorton

  • Milestone 1.0.0 RC6 deleted

comment:3 Changed 13 years ago by rjl

  • patch set to 0

As David suggests in a comment to #456, we would probably be better off

using a third-party HTML filter like [http://htmlpurifier.org/ HTML Purifier].

comment:4 Changed 13 years ago by anonymous

  • Milestone set to 1.0.3

Changed 12 years ago by mortonda@…

a basic implementation of htmltidy

comment:5 Changed 12 years ago by mortonda@…

  • Owner changed from rjl to mortonda@…
  • Status changed from new to accepted

The patch I just uploaded is a basic implementation, but some discussion and testing is probably needed.

First, I disabled its cache for now, until we determine if and where that cache should be set up. Maybe that should be a maia_config.php option.

Also, I set it up to disallow all URL's. This effectively blocks all images and also prevents clicking through a bit of spam.

On first glance, it seems to do a nice job, and I don't see any major performance problems on an old 1Ghz athlon even with the cache disabled, keeping in mind that this is not one of the more frequently hit sections of code.

comment:6 Changed 12 years ago by mortonda@…

heh, I meant html purifier, not html tidy.

comment:7 Changed 12 years ago by mortonda@…

  • Resolution set to fixed
  • Status changed from accepted to testing

committed in [1307], [1311], and [1312]

It will function ok without the extra config item, and while it supposedly speeds up HTMLPurifier to have a cache, I don't know if it is enough to worry about.

comment:8 Changed 12 years ago by mortonda@…

  • Status changed from testing to closed
Note: See TracTickets for help on using tickets.