Opened 14 years ago

#295 new enhancement

Rework stats tables to avoid the need to store totals

Reported by: rjl Owned by: rjl
Priority: normal Milestone: 1.1.0
Component: SQL scripts Version: 1.0.0
Severity: normal Keywords: stats average totals maia_stats maia_stats_history
Cc:

Description

In the maia_stats and maia_stats_history tables, we currently store item counts as well as a "total" column that holds a running sum of all the items of a given type processed to date. Since this is an ever-growing figure, the total column is doomed to eventually overflow whatever storage container we use for it (currently BIGINT and UNSIGNED BIGINT). It would be best to avoid storing the total figure at all, particularly since we can always compute it by other means.

Specifically, I propose that instead of storing totals we store averages. Typically we think of averages as:

avg = sum / count

but we also know that

sum = avg * count

so if we only store averages and counts, we can always obtain the sum figure when we need it. When a new item gets processed, the average gets updated like so:

newavg = ((oldavg * count) + newitem) / (count + 1)

By computing the sum instead of storing it, we avoid having to deal with an ever-growing figure (and hence the use of disgustingly large data types like BIGINTs). Instead of:

total_*_size

we would store:

average_*_size

The same principle can be applied to other stats for which we store totals, particularly when we're only storing the totals in order to compute averages.

Change History (0)

Note: See TracTickets for help on using tickets.