October 5, 2022


Building a Distributed Task Queue in Python – Java Code Geeks



Why not just use Celery/RQ/Huey/TaskTiger?

WakaTime has been using Celery for almost 10 years now. During that time I ran into many critical bugs, some still open years after they were introduced. Celery used to be pretty good, but feature bloat has made the project difficult to maintain. Also, in my opinion, splitting the code across three separate GitHub repositories makes the codebase hard to read.

The main reason, however, is that Celery delayed tasks do not scale.

If you use Celery delayed tasks, then as your website grows you'll eventually start seeing this error message:

QoS: Disabled: prefetch_count exceeds 65535

When this happens, the worker stops processing all tasks, not just the delayed ones! As WakaTime grew, we started to encounter this error more often.

I tried RQ, Huey, and TaskTiger, but they lacked features and processed tasks more slowly than Celery. A distributed task queue is indispensable for a website like WakaTime and I was tired of running into errors. For this reason, I decided to create the simplest distributed task queue possible, while still providing all the features required by WakaTime.

Introducing WakaQ

WakaQ is a new distributed task queue for Python. Use it to run code in the background so your website stays snappy and your users stay happy.

WakaQ is simple

It's only 1,264 lines of code!

$ find . -name '*.py' -not -path "./migrations*" -not -path "./venv*" | xargs wc -l | grep " total" | awk '{print $1}' | numfmt --grouping
1,264

It took just one week from the first line of code to completely replacing Celery in WakaTime. That says something about its simplicity.

Each queue is implemented using a Redis list. Delayed tasks get their own queues implemented using Redis sorted sets. Broadcast tasks share a single Redis Pub/Sub queue.
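
To make the data layout above concrete, here is a minimal sketch of a list-backed queue with a sorted-set-backed delay queue. It is not WakaQ's actual code: the key names (`queue:<name>`, `queue:<name>:eta`), the JSON payload shape, and every function name are assumptions made for illustration. The `FakeRedis` class is a small in-memory stand-in that mimics only the redis-py methods used here (`lpush`, `rpop`, `zadd`, `zrangebyscore`, `zrem`), so the example runs without a server; a real `redis.Redis` client should be usable in its place.

```python
import json


class FakeRedis:
    """In-memory stand-in for the few redis-py methods this sketch uses."""

    def __init__(self):
        self.lists = {}
        self.zsets = {}

    def lpush(self, name, *values):
        lst = self.lists.setdefault(name, [])
        for value in values:
            lst.insert(0, value)
        return len(lst)

    def rpop(self, name):
        lst = self.lists.get(name)
        return lst.pop() if lst else None

    def zadd(self, name, mapping):
        self.zsets.setdefault(name, {}).update(mapping)

    def zrangebyscore(self, name, min, max):
        items = sorted(self.zsets.get(name, {}).items(), key=lambda kv: kv[1])
        return [member for member, score in items if min <= score <= max]

    def zrem(self, name, *values):
        zset = self.zsets.get(name, {})
        return sum(1 for v in values if zset.pop(v, None) is not None)


def enqueue(r, queue, task_name, args):
    """Push a task onto the head of the list; workers pop from the tail (FIFO)."""
    r.lpush(f"queue:{queue}", json.dumps({"name": task_name, "args": args}))


def enqueue_delayed(r, queue, task_name, args, eta):
    """Store a task in a sorted set, scored by the Unix time it becomes due."""
    payload = json.dumps({"name": task_name, "args": args})
    r.zadd(f"queue:{queue}:eta", {payload: eta})


def promote_due(r, queue, now):
    """Move every delayed task whose eta has passed onto the ready list."""
    moved = 0
    for payload in r.zrangebyscore(f"queue:{queue}:eta", 0, now):
        # zrem reports a removal only to the caller that actually removed the
        # member, so two workers racing here will not both enqueue the task.
        if r.zrem(f"queue:{queue}:eta", payload):
            r.lpush(f"queue:{queue}", payload)
            moved += 1
    return moved


def dequeue(r, queue):
    """Pop the oldest ready task, or return None if the list is empty."""
    payload = r.rpop(f"queue:{queue}")
    return json.loads(payload) if payload is not None else None


if __name__ == "__main__":
    r = FakeRedis()
    enqueue(r, "default", "send_email", [1])
    enqueue_delayed(r, "default", "send_report", [2], eta=100.0)
    print(dequeue(r, "default"))
    print(promote_due(r, "default", now=150.0))
    print(dequeue(r, "default"))
```

Two simplifications are worth noting. Sorted-set members are unique, so two delayed tasks with identical payloads would collapse into one unless each payload carries a unique ID. The remove-then-push step is also not atomic, so a worker that crashes between the two calls would drop the task.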

WakaQ has all the features you need

  • Queue priorities
  • Delayed tasks (run a task after a given time delay)
  • Scheduled periodic cron jobs
  • Broadcast tasks (run a task on all workers)
  • Soft and hard task timeouts
  • Optionally retry tasks on soft timeout
  • Combats memory leaks by restarting workers when max_mem_percent is reached
  • Super minimal and maintainable

Features considered out of scope are rate limiting, exclusive locking, storing task results, and task chaining. These are easy to add in your app’s task code, and you probably want to implement them specifically for your app’s needs.
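
As one illustration of adding such a feature in task code, the sketch below wraps a task function in an exclusive lock built on Redis `SET` with the `NX` and `EX` options. It is not part of WakaQ: the decorator name `exclusive`, the lock key, and the task function are hypothetical. `FakeRedis` is an in-memory stand-in for the three redis-py calls used (`set`, `get`, `delete`) and ignores the expiry. With a real client, the sketch assumes it was created with `decode_responses=True` so that `get` returns a `str`.

```python
import functools
import uuid


class FakeRedis:
    """In-memory stand-in for the redis-py calls used below (TTL is ignored)."""

    def __init__(self):
        self.data = {}

    def set(self, name, value, nx=False, ex=None):
        if nx and name in self.data:
            return None  # redis-py returns None when NX prevents the write
        self.data[name] = value
        return True

    def get(self, name):
        return self.data.get(name)

    def delete(self, *names):
        return sum(1 for n in names if self.data.pop(n, None) is not None)


def exclusive(r, lock_name, ttl_seconds=300):
    """Run the wrapped function only if no other caller holds `lock_name`."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            token = uuid.uuid4().hex
            # NX: set only if the key is absent. EX: expire the key so a
            # crashed worker cannot hold the lock forever.
            if not r.set(lock_name, token, nx=True, ex=ttl_seconds):
                return None  # someone else holds the lock; skip this run
            try:
                return func(*args, **kwargs)
            finally:
                # Release only our own lock. This get-then-delete pair is not
                # atomic; a stricter version would do it in a Lua script.
                if r.get(lock_name) == token:
                    r.delete(lock_name)

        return wrapper

    return decorator


if __name__ == "__main__":
    r = FakeRedis()

    @exclusive(r, "lock:rebuild_dashboard")
    def rebuild_dashboard(user_id):
        return f"rebuilt dashboard for user {user_id}"

    print(rebuild_dashboard(42))
```

Returning `None` when the lock is taken is a design choice for tasks that are safe to skip; a task that must eventually run would instead re-enqueue itself or raise so the queue can retry it.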

WakaQ is ready to use

WakaQ is still a new project, so use it at your own risk. It currently powers all background tasks in production for the WakaTime website, including but not limited to:

  • sending code statistics email reports
  • renewing our LetsEncrypt SSL certificates
  • pre-caching dashboards, repo badges, and embeddable charts
  • anything else that shouldn't hold up web requests

It’s released under a BSD license, so you can use it in open source and closed source projects. If you find bugs, please open an issue, but think twice before asking for new features 🙂

Happy coding!


