Woozy Song
2024-05-24 02:34:48 UTC
Supposedly atd will not start another job until the load factor falls
below a limit. Different documentation gives the default as 0.8 or 1.5.
Now I launch a job that uses 4 cores on a 6-core CPU. If I run the top
command, I see four processes, each running at close to 100%.
If I submit another job 10 seconds later, it starts anyway, thereby
overloading the CPU. The documentation suggests setting the load limit
higher than n-1 for n CPU cores, but I think that is intended for
single-threaded jobs. I have tried altering the load limit in the
atd.service file to all sorts of values, but the second job keeps
starting while the first is flogging the CPU. I check with
'ps -ef|grep atd' that atd is running with the desired load limit. I am
aware that the load factor is an average; you can see it change slowly
in top/htop/glances. So I also increased the delay between jobs to 30
seconds, but still nothing works. It looks like I have to specify a time
like 'now+60 minutes' when I submit, which requires guessing how long
the first job will run. I know I could install a proper job scheduler
such as Some Grid Engine, but that is more work.
This is on Debian 11, by the way.
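
To put rough numbers on how slowly that average moves, here is a minimal
sketch. It assumes the 1-minute load figure behaves like an exponentially
damped average with a 60-second time constant (which is how Linux is
usually described as computing it) and that the machine was idle before
the first job started. The function name load_after and its parameters
are made up for illustration; nothing here is taken from atd itself.

```python
import math


def load_after(seconds, runnable=4, start=0.0, time_constant=60.0):
    """Approximate the 1-minute load average after `seconds` of constant demand.

    Assumption: an exponentially damped average. The real kernel samples
    only every few seconds, so this is a smooth approximation, not an
    exact reproduction of what top reports.
    """
    decay = math.exp(-seconds / time_constant)
    # The old value fades out while the number of runnable processes fades in.
    return start * decay + runnable * (1.0 - decay)


if __name__ == "__main__":
    # Four busy processes on a previously idle machine.
    for t in (10, 30, 60, 120):
        print(f"{t:4d} s: load ~ {load_after(t):.2f}")
```

Under those assumptions the average would be only about 0.6 ten seconds
after the first job starts, and about 1.6 after 30 seconds, so a check
made that early may barely register the first job.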