Context and Problem Statement
Currently there is a Scheduler that consumes tasks off a queue in
the database. This allows multiple job executors to run in parallel,
racing for the next job to execute. It is meant for executing tasks
immediately, as long as there are enough resources.
What is missing is a component that maintains periodic tasks. It is needed for housekeeping tasks that run regularly and clean up stale or unused data. Later, users should be able to create periodic tasks, for example to read e-mails from an inbox or to be notified of due items.
The problem is, again, that it must work with multiple job executor
instances running at the same time. This is the same pattern as with
the Scheduler: it must be ensured that only one executor works on a
given task at a time. Multiple job executors must not schedule a
periodic task more than once. If a periodic task takes longer than the
time between runs, it must wait for the next interval.
1. Adding a nextrun field to the current job table
2. Creating a separate table for periodic tasks
The second option is chosen: a separate table for periodic tasks.
For internal housekeeping tasks, it may suffice to reuse the existing
job queue by adding more fields such that a job may be considered
periodic. But this conflates what the Scheduler is doing now
(executing tasks as soon as possible while being bound to some
resource limits) with a completely different concern.
There will be a new PeriodicScheduler that works on a new table in
the database that represents periodic tasks. This table will share
fields with the job table to be able to create jobs from it.
This new component only takes care of periodically submitting jobs
to the job queue, such that the Scheduler will eventually pick them up
and run them. If a task cannot run (for example due to resource
limitations), the periodic scheduler can't do anything but wait and try
again.

-- the table name "periodic_task" is an assumption
CREATE TABLE "periodic_task" (
  "id" varchar(254) not null primary key,
  "enabled" boolean not null,
  "task" varchar(254) not null,
  "group_" varchar(254) not null,
  "args" text not null,
  "subject" varchar(254) not null,
  "submitter" varchar(254) not null,
  "priority" int not null,
  "worker" varchar(254),
  "marked" timestamp,
  "timer" varchar(254) not null,
  "nextrun" timestamp not null,
  "created" timestamp not null
);
Preparing for other features, at some point periodic tasks will be
created by users. It should be possible to disable/enable them, which
is what the enabled field is for. The next 6 properties (task, group_,
args, subject, submitter, priority) are needed to insert jobs into the
job table. The worker field (and marked) are used to mark a periodic
job as "being worked on by a job executor".
The timer is the schedule, which is a calendar event string. This
string is parsed by a library. The nextrun field stores the timestamp
of the next time the task needs to be executed. This is needed to
query this table for the next (earliest) task.
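The roles of the enabled, nextrun, worker and marked fields can be illustrated with a small sketch. It uses Python's sqlite3 module with an in-memory database; the table name periodic_task, the helper names next_task, try_mark and unmark, the task ids and the ISO-8601 text timestamps are assumptions made for this example and are not taken from the actual implementation.

```python
import sqlite3

def next_task(conn):
    """Return (id, nextrun) of the earliest enabled task, or None if there is none."""
    return conn.execute(
        "SELECT id, nextrun FROM periodic_task "
        "WHERE enabled = 1 ORDER BY nextrun ASC LIMIT 1"
    ).fetchone()

def try_mark(conn, task_id, worker, now):
    """Claim a task with a conditional UPDATE; only one executor can succeed."""
    cur = conn.execute(
        "UPDATE periodic_task SET worker = ?, marked = ? "
        "WHERE id = ? AND worker IS NULL",
        (worker, now, task_id),
    )
    conn.commit()
    return cur.rowcount == 1

def unmark(conn, task_id):
    """Release a task so that it can be claimed again."""
    conn.execute(
        "UPDATE periodic_task SET worker = NULL, marked = NULL WHERE id = ?",
        (task_id,),
    )
    conn.commit()

# Reduced, hypothetical version of the table with only the fields used here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE periodic_task (id TEXT PRIMARY KEY, enabled INTEGER NOT NULL, "
    "worker TEXT, marked TEXT, nextrun TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO periodic_task (id, enabled, nextrun) VALUES (?, ?, ?)",
    [
        ("cleanup-invites", 1, "2020-03-09T00:00:00"),
        ("cleanup-jobs", 1, "2020-03-08T18:00:00"),
        ("read-inbox", 0, "2020-03-08T12:00:00"),  # disabled, must be ignored
    ],
)

now = "2020-03-08T18:30:00"
task_id, nextrun = next_task(conn)
triggered = nextrun <= now  # equally formatted ISO strings compare chronologically
first = try_mark(conn, task_id, "executor-1", now)
second = try_mark(conn, task_id, "executor-2", now)
print(task_id, triggered, first, second)  # cleanup-jobs True True False
```

The conditional UPDATE is one possible way to make sure that two job executors racing for the same periodic task cannot both win; the second caller sees that no row was changed and backs off.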
The PeriodicScheduler works roughly like this:

On startup:
- Remove stale worker values. If the process has been killed, there may be marked tasks which must be cleared now.

Main-Loop:
0. Cancel current scheduled notify (see 4. below)
1. Get next (= earliest & enabled) periodic job
2. If none: stop
3. If triggered (= nextrun <= 'now'):
   - Mark periodic task. On fail: goto 1.
   - Submit new job into the job queue:
     - Check for non-final jobs of that name. This is required to not
       run the same periodic task multiple times concurrently.
       - if exist: goto 4.
       - if not exist: submit job
   - Unmark periodic task
4. If future:
   - Schedule notify: notify self to run again the next time the task schedule triggers
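The steps above can be sketched in a few lines of Python. This is a single-process, in-memory illustration and not the actual implementation: the class PeriodicSchedulerSketch, the callbacks has_non_final_job and submit_job, and the fixed interval that stands in for the calendar event in timer are all assumptions. The sketch also assumes that nextrun is advanced past "now" before a job is submitted, that tasks marked by another worker are skipped when looking for the next task, and that a task is always unmarked, even when a non-final job already exists.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class PeriodicTask:
    id: str
    enabled: bool
    nextrun: datetime
    interval: timedelta            # assumption: stand-in for the `timer` schedule
    worker: Optional[str] = None

class PeriodicSchedulerSketch:
    def __init__(self, name, tasks, has_non_final_job, submit_job):
        self.name = name
        self.tasks = tasks
        self.has_non_final_job = has_non_final_job  # hypothetical job-queue lookup
        self.submit_job = submit_job                # hypothetical job-queue insert
        self.wake_at = None                         # the "scheduled notify"

    def clear_stale_marks(self):
        """Startup: drop marks left behind by a killed process of this name."""
        for task in self.tasks:
            if task.worker == self.name:
                task.worker = None

    def _mark(self, task):
        if task.worker is not None:
            return False
        task.worker = self.name
        return True

    def run_once(self, now):
        self.wake_at = None                                 # 0. cancel notify
        while True:
            ready = [t for t in self.tasks if t.enabled and t.worker is None]
            if not ready:
                return                                      # 2. none: stop
            task = min(ready, key=lambda t: t.nextrun)      # 1. earliest & enabled
            if task.nextrun > now:
                self.wake_at = task.nextrun                 # 4. future: notify self
                return
            if not self._mark(task):                        # 3. triggered: mark
                continue                                    #    on fail: goto 1
            try:
                while task.nextrun <= now:                  # assumption: skip missed runs
                    task.nextrun += task.interval
                if not self.has_non_final_job(task.id):
                    self.submit_job(task.id)
            finally:
                task.worker = None                          # unmark

queue = []  # ids of submitted jobs that have not finished yet
now = datetime(2020, 3, 8, 12, 0)
tasks = [
    PeriodicTask("cleanup", True, now - timedelta(minutes=5), timedelta(hours=1)),
    PeriodicTask("read-inbox", False, now - timedelta(hours=1), timedelta(minutes=10)),
]
sched = PeriodicSchedulerSketch("executor-1", tasks, lambda name: name in queue, queue.append)
sched.clear_stale_marks()
sched.run_once(now)
print(queue, sched.wake_at)  # ['cleanup'] 2020-03-08 12:55:00
```

Because nextrun is moved into the future before the loop continues, the next iteration lands in step 4 on its own. A second call to run_once while the "cleanup" job is still in the queue submits nothing and only reschedules the wake-up.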