Python – Celery difference between concurrency, workers and autoscaling

Tags: celery, concurrency, python

In my /etc/default/celeryd config file, I've set:

CELERYD_NODES="agent1 agent2 agent3 agent4 agent5 agent6 agent7 agent8"
CELERYD_OPTS="--autoscale=10,3 --concurrency=5"

I understand that the daemon spawns 8 Celery workers, but I'm not entirely sure what autoscale and concurrency do together. I thought concurrency specified the maximum number of threads a worker can use, and autoscale let the worker scale its child workers up and down as needed.

The tasks have a largish payload (roughly 20-50 kB each) and there are about 2-3 million of them, but each task runs in less than a second. I'm seeing memory usage spike because the broker distributes the tasks to every worker, replicating the payload multiple times.

I think the issue is in the config: the combination of workers + concurrency + autoscaling seems excessive, and I would like a better understanding of what these three options do.

Best Answer

Let's distinguish between workers and worker processes. When you start a Celery worker, it spawns a number of child processes. How many depends on options such as --concurrency and --autoscale: --concurrency sets a fixed pool size, while --autoscale=10,3 lets the pool grow to at most 10 processes and shrink to no fewer than 3. The default is to spawn as many processes as there are cores on the machine. There is no point in running more than one worker on a particular machine unless you want to do routing.

I would suggest running only 1 worker per machine with the default number of processes. This will reduce memory usage by eliminating the duplication of data between workers.
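To see why the setup in the question is heavy, here is a rough back-of-the-envelope sketch. It assumes each of the eight nodes scales its pool independently within the --autoscale bounds (max 10, min 3). The helper name pool_process_range is made up for illustration and is not part of Celery.

```python
def pool_process_range(nodes, autoscale_max, autoscale_min):
    """Return (fewest, most) pool processes across all worker nodes.

    Assumes every node scales independently within the same bounds.
    """
    return nodes * autoscale_min, nodes * autoscale_max


# Values taken from the config in the question:
# 8 names in CELERYD_NODES and --autoscale=10,3 (max 10, min 3).
low, high = pool_process_range(nodes=8, autoscale_max=10, autoscale_min=3)
print(f"between {low} and {high} pool processes")
# between 24 and 80 pool processes
```

With a single node and the default pool size, the count would instead be roughly the number of cores on the machine.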

If you still have memory issues then save the data to a store and pass only an id to the workers.
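A minimal sketch of that idea, with no Celery dependency: PAYLOAD_STORE, save_payload and process are hypothetical names, and the in-memory dict only stands in for a real shared store such as a database or cache. The point is that the message carries a short key instead of the 20-50 kB payload.

```python
import uuid

# Hypothetical in-memory store; a real deployment would need something
# shared between machines, such as a database or a cache.
PAYLOAD_STORE = {}


def save_payload(payload):
    """Store the payload and return a small key that refers to it."""
    key = str(uuid.uuid4())
    PAYLOAD_STORE[key] = payload
    return key


def process(key):
    """Stand-in for the task body: fetch the payload by key, then use it."""
    payload = PAYLOAD_STORE.pop(key)  # remove it once it has been consumed
    return len(payload)


payload = b"x" * 30_000       # roughly the payload size from the question
key = save_payload(payload)   # only this short string would go in the message
print(len(key), process(key))
# 36 30000
```

In a real setup, process would be the task and key its only argument, so whatever the broker hands out to workers stays small.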

Related Solutions

Python – Difference between staticmethod and classmethod

Maybe a bit of example code will help. Notice the difference in the call signatures of foo, class_foo and static_foo:

class A(object):
    def foo(self, x):
        print(f"executing foo({self}, {x})")

    @classmethod
    def class_foo(cls, x):
        print(f"executing class_foo({cls}, {x})")

    @staticmethod
    def static_foo(x):
        print(f"executing static_foo({x})")

a = A()

Below is the usual way an object instance calls a method. The object instance, a, is implicitly passed as the first argument.

a.foo(1)
# executing foo(<__main__.A object at 0xb7dbef0c>, 1)

With classmethods, the class of the object instance is implicitly passed as the first argument instead of self.

a.class_foo(1)
# executing class_foo(<class '__main__.A'>, 1)

You can also call class_foo using the class. In fact, if you define something to be a classmethod, it is probably because you intend to call it from the class rather than from a class instance. A.foo(1) would have raised a TypeError, but A.class_foo(1) works just fine:

A.class_foo(1)
# executing class_foo(<class '__main__.A'>, 1)

One use people have found for class methods is to create inheritable alternative constructors.
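As a small illustration of such an alternative constructor (Point, LabeledPoint and from_string are names invented for this sketch):

```python
class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    @classmethod
    def from_string(cls, text):
        """Alternative constructor: build an instance from text like "1,2"."""
        x, y = (int(part) for part in text.split(","))
        # cls is whichever class the method was called on, so subclasses
        # get instances of themselves rather than of Point.
        return cls(x, y)


class LabeledPoint(Point):
    label = "unnamed"


p = LabeledPoint.from_string("3,4")
print(type(p).__name__, p.x, p.y)
# LabeledPoint 3 4
```

Had from_string been a staticmethod that returned Point(x, y), the subclass call would have produced a plain Point instead.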

With staticmethods, neither self (the object instance) nor cls (the class) is implicitly passed as the first argument. They behave like plain functions except that you can call them from an instance or the class:

a.static_foo(1)
# executing static_foo(1)

A.static_foo('hi')
# executing static_foo(hi)

Staticmethods are used to attach functions to a class when they have some logical connection with it.

foo is just a function, but when you call a.foo you don't just get the function, you get a "partially applied" version of the function with the object instance a bound as the first argument to the function. foo expects 2 arguments, while a.foo only expects 1 argument.

a is bound to foo. That is what is meant by the term "bound" below:

print(a.foo)
# <bound method A.foo of <__main__.A object at 0xb7d52f0c>>
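The "partially applied" analogy can be checked directly. This sketch uses a cut-down version of A whose foo returns its arguments instead of printing, and compares the bound method with a functools.partial built by hand:

```python
from functools import partial


class A:
    def foo(self, x):
        return (self, x)


a = A()

# Looking up foo on the instance yields a bound method that remembers a.
bound = a.foo
# Roughly the same thing built by hand: the plain function with a pre-filled.
by_hand = partial(A.foo, a)

print(bound(1) == by_hand(1))   # True
print(bound.__self__ is a)      # True
print(bound.__func__ is A.foo)  # True
```

The bound method is not literally a partial object, but it behaves the same way for this purpose: the instance is stored in __self__ and supplied as the first argument on each call.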

With a.class_foo, a is not bound to class_foo, rather the class A is bound to class_foo.

print(a.class_foo)
# <bound method A.class_foo of <class '__main__.A'>>

With a staticmethod, even though it is defined inside the class, a.static_foo just returns a good ol' function with no arguments bound. static_foo expects 1 argument, and a.static_foo expects 1 argument too.

print(a.static_foo)
# <function A.static_foo at 0xb7d479cc>

And of course the same thing happens when you call static_foo with the class A instead.

print(A.static_foo)
# <function A.static_foo at 0xb7d479cc>

Python – the difference between Python’s list methods append and extend

append: Appends object at the end.

x = [1, 2, 3]
x.append([4, 5])
print(x)

gives you: [1, 2, 3, [4, 5]]

extend: Extends list by appending elements from the iterable.

x = [1, 2, 3]
x.extend([4, 5])
print(x)

gives you: [1, 2, 3, 4, 5]
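Since extend accepts any iterable, not just a list, a short sketch with a string, a tuple and a range makes the contrast with append clearer:

```python
x = [1, 2, 3]
x.extend("ab")          # a string is an iterable of characters
x.extend((4, 5))        # so is a tuple
x.extend(range(6, 8))   # and a range
print(x)
# [1, 2, 3, 'a', 'b', 4, 5, 6, 7]

y = [1, 2, 3]
y.append("ab")          # append adds the string as a single element
print(y)
# [1, 2, 3, 'ab']
```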
