Red Engine: modern scheduling framework for Python applications (red-engine.readthedocs.io)
213 points by jonbaer on July 3, 2022 | hide | past | favorite | 53 comments



This fits my use case so perfectly! I have a very small internal app taking care of organizing seminar talks, calendars, email announcements, recordings of the talks, and signups. It is a single Python file of less than 1500 lines and an SQLite database. This library is perfect for taking care of scheduled events. Everything else I have found is a ridiculously over-complicated solution.


Yeah, I'll definitely try this for little server-based tasks


Can you share a repo for your application?


I am a research scientist who knows how to write numerical code, but I do not trust myself to write secure web software, which is why I regrettably keep the app in question internal and the source unreleased.

To give a more useful answer though: it just uses CherryPy as a web framework, the Zoom Python bindings, and SQLite. Nothing sophisticated, just a CRUD app that occasionally needs to download a large file and transcode it in the background (which is where this scheduler will be used).


I assume that “internal” means it’s just that and not accessible.


Not to be confused with REDengine by CD Projekt Red, used to make the Witcher and Cyberpunk 2077 games [0]. They're moving to Unreal Engine 5 though.

[0] https://witcher.fandom.com/wiki/REDengine


Yeah, when I read the title I was like: "wtf, CDPR released Python version of their engine?"


I too first thought this had something to do with RED Engine 4 from CDPR.

Unfortunately CDPR's latest RED Engine 4 will not be released as open source in the foreseeable future; it's basically dead and locked away, and will probably never see the light of day. Maybe it will be open sourced, but in something like 20 years' time.


How does it handle state and restarts? What happens if a job is scheduled to run "before 10am", then the entire server restarts at 9:55am, will it try to run that same job again when it boots back up?


It will run it if the scheduler is called before 10am according to the docs - at runtime the conditions must be met.


Right but what if it already ran? Should the jobs be written such that they are tolerant to re-runs?


I think that regardless of the scheduling tool, it is very worthwhile putting in the effort to make your tasks idempotent. I've had tons of cases when I had to repeatedly rerun failed tasks "in anger", and was always grateful to know I made it safe to do so.
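To make that concrete, here is a minimal sketch of an idempotent task (the function name and file layout are hypothetical, nothing here is Red Engine API): the output path is derived from the input, finished work is detected and skipped, and the result is published atomically so a crash mid-run never leaves a partial file behind.

```python
from pathlib import Path

def process_report(day: str, outdir: Path) -> Path:
    """Idempotent task sketch: safe to re-run any number of times."""
    out = outdir / f"report-{day}.txt"
    if out.exists():                  # already done: a re-run is a no-op
        return out
    tmp = out.with_suffix(".tmp")
    tmp.write_text(f"report for {day}\n")   # the real work goes here
    tmp.rename(out)                   # atomic publish: no partial outputs
    return out
```

Re-running `process_report("2022-07-03", outdir)` after a crash or a scheduler restart just returns the existing file instead of redoing the work.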


If it took less than 5 minutes to boot then I can't see why it wouldn't work.

How would that work for other schedulers? Also, if a server reboots that's quite bad all round anyway. Hopefully you'd be notified directly.


I think on the contrary, it's important to build software that's resilient to random reboots: you can't control things like losing power or your CPU failing, so the less toil you have to manage around that the better.


Other schedulers have a durable database of attempted runs. This doesn't seem to have anything like that.


From its docs, I understand that it has a task status logging layer[1] which can persist to SQL/Mongo (using another framework called Red Bird[2]). I haven't tried them out though.

[1]: https://red-engine.readthedocs.io/en/stable/tutorial/basic.h...

[2]: https://red-bird.readthedocs.io/en/latest/


Would you have a recommendation for an easy to use Python scheduler with such a feature for a personal project?


I think you can create ephemeral timers with Systemd if you're on Linux.


APScheduler


It looks pretty good. Thanks!


Seconding this. I would love a middle tier job scheduler / manager for Python that has persistence. I feel like there's a missing middle between cron+scripts and the enterprise grade tooling built for ETL tasks.


https://www.prefect.io/ is also pretty easy.


Celery Beat is pretty good


Looks nice but very limited.

I don’t see any mention of serialization, so I guess this is single-server and memory-only. I also couldn’t find any mention of error handling or retries.


What makes this true?

“Red Engine is not meant to be the scheduler for enterprise pipelines, unlike Airflow, but it is fantastic to power your Python applications.”


It sounds like the author doesn't think enterprise apps can be Python applications


Using Python decorators is a very strange choice for constructing computational graphs. Then again, I don't really use Python for this kind of thing, I roll my own solutions (of just functions being composed together).

One big issue I have with the proposed approach is that it's very difficult for me to see at a glance the actual compute graph. I suppose you can build some tools to visualize it from the DSL in the decorator call, but I'd much rather be able to see this directly in code, with no weird magic, so that I can very easily interpret and update it if need be.
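For comparison, a hand-rolled graph of that kind can be as simple as plain function composition (all names here are made up for illustration); the structure of the pipeline is then visible directly in the code, with no decorator DSL to decode:

```python
def extract() -> list[int]:
    # pretend this pulls rows from somewhere
    return [1, 2, 3]

def transform(rows: list[int]) -> list[int]:
    return [r * 2 for r in rows]

def load(rows: list[int]) -> int:
    # pretend this writes somewhere; return a summary for inspection
    return sum(rows)

def pipeline() -> int:
    # The whole compute graph, readable top to bottom:
    return load(transform(extract()))
```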


  from redengine import RedEngine
  
  app = RedEngine()
  
  @app.task('daily')
  def do_things():
      ...
  
  if __name__ == "__main__":
      app.run()
> We initialized the RedEngine application, created one task which runs every 10 seconds and then we started the app.

https://red-engine.readthedocs.io/en/stable/tutorial/quick_s...

Might wanna fix that.


If you're going to build a framework like this, please please put your consistency behavior and retry behavior at the top of your docs.


Yes, 100% this. Too many frameworks claim they are "easy", only for you to find out they left out things that older frameworks solved long ago.

While we are on this: do you know of any task scheduler framework that is similar to Celery, but has better guarantees around task execution than what acks_late=True gives you?

I always find myself building a system that stores the really important tasks in Postgres so that I can recover from anything in the broker or Celery crashing. What I use Celery for is just scheduling these tasks, by creating a Celery job with the Postgres job ID as the parameter.

Then, to detect if something went horribly wrong, I have a sweeping job that checks whether any job in Postgres has not run in Celery for some reason. If that is the case, we just re-queue the job.


If your tasks are idempotent, Dramatiq is intended for your case.

https://dramatiq.io/


After reading the linked article in more detail, I might have misinterpreted things a bit, but my concerns are still related :)


This is cool. You could provide resilience to Red Engine by giving it a backend using Flyte.org. Check out this example of building a new API on top of Flyte with a similar feel: https://unionml.readthedocs.io/en/latest/index.html#quicksta....

Thus users could continue using RED, and if they want to scale to multiple machines or want resilience, you could allow them to switch out the backend to Flyte.

Disclosure: I am a maintainer of Flyte. This is just a suggestion. Great work!


> Red Engine provides more features than Crontab and APScheduler and it is much easier to use than Airflow.

Correct me if I'm wrong, but the framework more powerful than Crontab and easier to use than Airflow is Celery. But Celery is not even mentioned here, why?


I've never considered using Celery on its own - it's part of my typical web app stack. Do you mind sharing some examples of how you're using Celery?


Off topic but,

Many scheduling systems I've worked with have this weird tendency to run once immediately when deployed, and then again every time they're actually scheduled to run.

I’ve had this happen with Kubernetes, Scheduled Queries in GCP BigQuery, and a few other systems.

That seems like absolute madness. Why would anything do that?


I imagine it's to ensure the job runs ASAP in case the scheduler crashed, the power failed, or whatever. Systemd has a similar feature for timers, Persistent=, but it's configurable.
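For reference, a minimal systemd timer unit showing that option (unit and task names are illustrative). With `Persistent=true`, a run that was missed while the machine was off fires once the next time the timer starts, rather than on every deployment:

```ini
# mytask.timer -- pairs with a mytask.service that runs the actual job
[Unit]
Description=Run mytask daily

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target
```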


Airflow does this. Madness!


As cool as it is, scheduled jobs (and jobs in general) should really be isolated from the rest of your system and from each other to limit the blast radius. Jobs are notorious for crashing and backing up, so you really don't want that impacting other systems on the same machine.

Clouds have managed systems for scheduled jobs now (AWS Eventbridge, GCP Cloud Scheduler) which handle this for you.


I'm using APScheduler for most of my scheduling tasks.

Does Red Engine integrate with asyncio? When searching the docs for this keyword no hits showed up.


Going to shamelessly plug Temporal’s Python SDK which was designed for asyncio.

https://github.com/temporalio/sdk-python

Disclaimer: I work for Temporal


> Clean: Scheduling is just plain English

Ugh, no thanks.

First of all, English itself is not clean; it's a messy amalgamation of special cases and inconsistent spelling rules.

Second, it isn't actually English anyway. It might look like English, but it's actually a DSL that happens to correspond to English a lot of the time. English text is meant to be interpreted by humans, who understand context & connotation, and can resolve ambiguities by making educated guesses or discussing the text with other humans. But your English-like DSL can only be interpreted by a computer program, which cannot (and arguably should not) do such things. Ergo, the benefits of using natural language are lost, and you are left with the same strict interpretation rules as any other programming language, but without any of the syntactic rigor that would normally help you construct programs/expressions that are both syntactically correct and also do what you intended them to do. Finally, the passing similarity to another language is a newbie trap and it makes teaching more difficult. See also: SQL, Python.

Worse still, it's represented in code as a string literal. It cannot be reasonably syntax-highlighted or otherwise statically analyzed, nor can it be easily constructed dynamically if needed, nor can scheduling primitives be combined or composed. It is the worst of all worlds, and you have no way to check if your program is valid other than to run it and see if it crashes. And you have to re-learn operator precedence / associativity rules, because they probably won't be identical to the rules in Python itself.

I'm sorry if there is a really high quality scheduling engine underneath this DSL, but I absolutely would never want to use something like this in production code.

(I'm sure you can guess how I feel about BDD frameworks and "expect.foo.to.be.equal.to" style test APIs).


    @app.task('daily & is foo', execution="process")
followed by

    @app.task("after task 'do_daily'")
Yeah, I did a hard turn towards "nope" right there.

Similar but not quite Python combined with similar but not quite English does not make a tasty dish. It makes yet another pointless one-off thing to learn and struggle with.


These in particular remind me of "Baba Is You", which is probably not a good thing for an API.


For some reason they chose to put the hardest-to-understand concept on the front page. 'is foo' shows how to define and use a custom condition; the conditions in the rest of the documentation are easy to understand and make sense to me.

The only place I could see myself struggling with this would be piecing together a large call graph in my head. I can understand why the author says that Airflow is better suited for that case, because you get a visualization.


Just got hard AppleScript PTSD.


It's also just annoyingly redundant, and breaks searching, completion, etc.:

    @app.cond('is foo')
    def is_foo():
IMO

    @app.cond
    def is_foo():
'Losing' the ability to omit the underscore so it reads a bit like English is well worth it. Plus then I can jump from its use back to this definition without any DSL-aware tooling (which probably doesn't exist).


The mistake of trying to make programming languages look like English looks like it will be repeated forever. English is not a good language for expressing things specifically to computers or other humans for that matter.


100%. Also, why re-invent Python's already pretty decent datetime/timedelta?

And why not a proper Python DSL?

  from redengine import minute, hour
  
  @app.run_every(hour + 20*minute)
  def do_first(): [...]
  
  @app.run_after(do_first)
  def do_second1(): [...]
  
  @app.run_after(do_first)
  def do_second2(): [...]
  
  @app.run_after(do_second1, do_second2)
  def do_last(): [...]
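Such an API is straightforward to back with plain objects. A toy sketch (entirely hypothetical, not Red Engine's actual implementation): the decorators record a dependency graph, and a depth-first walk runs each task after its prerequisites.

```python
class App:
    """Toy scheduler: decorators build the graph, run_once() walks it."""

    def __init__(self):
        self.deps = {}                    # task -> prerequisite tasks

    def run_every(self, interval):
        # interval is ignored in this sketch; a real scheduler would use it
        def register(fn):
            self.deps.setdefault(fn, [])
            return fn
        return register

    def run_after(self, *prereqs):
        def register(fn):
            self.deps[fn] = list(prereqs)
            return fn
        return register

    def run_once(self):
        """Run every task after its prerequisites; return the run order."""
        done, order = set(), []
        def visit(task):
            if task in done:
                return
            done.add(task)
            for dep in self.deps.get(task, []):
                visit(dep)
            task()
            order.append(task.__name__)
        for task in list(self.deps):
            visit(task)
        return order
```

Because dependencies are ordinary function references, the editor can jump to them, and combining them needs no string parsing at all.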


It’s plain English like SQL is plain English. It uses English words that kind of make sense to a novice, but that doesn’t stop it being a new programming language.


> See also: SQL, Python

I do not see what is wrong with SQL and I definitely do not see what is wrong with Python.


You nailed it.


Is this meant for use instead of something like Luigi?



