On Wed, Nov 23, 2022 at 08:21:57AM +0000, haifeng.xu wrote:
> When changing 'cpuset.mems' under some cgroup, the system will hang
> for a long time. From the dmesg, many processes or threads are
> stuck in fork/exit. The reason is shown as follows.
>
> thread A:
> cpuset_write_resmask /* takes cpuset_rwsem */
> ...
> update_tasks_nodemask
> mpol_rebind_mm /* waits mmap_lock */
>
> thread B:
> worker_thread
> ...
> cpuset_migrate_mm_workfn
> do_migrate_pages /* takes mmap_lock */
>
> thread C:
> cgroup_procs_write /* takes cgroup_mutex and cgroup_threadgroup_rwsem */
> ...
> cpuset_can_attach
> percpu_down_write /* waits cpuset_rwsem */
>
> Once the nodemasks of the cpuset are updated, thread A wakes up thread B
> to migrate the mm. But when thread A iterates through all tasks,
> including child threads and the group leader, it has to wait for the
> mmap_lock, which has been taken by thread B. Unfortunately, thread C
> wants to migrate tasks into the cgroup at this moment, so it must wait
> for thread A to release cpuset_rwsem. If thread B spends much time
> migrating the mm, the fork/exit paths which acquire
> cgroup_threadgroup_rwsem also need to wait for a long time.
>
> There is no need to migrate the mm of child threads, which is
> shared with the group leader.

This is only a problem in cgroup1 and cgroup1 doesn't require the threads of
a given task to be in the same cgroup. I don't think you can optimize it
this way.