• Dmitry Monakhov's avatar
    bfq: fix blkio cgroup leakage v4 · 2de791ab
    Dmitry Monakhov authored
    Changes from v1:
        - update commit description with proper ref-accounting justification
    
    commit db37a34c ("block, bfq: get a ref to a group when adding it to a service tree")
    introduce leak forbfq_group and blkcg_gq objects because of get/put
    imbalance.
    In fact whole idea of original commit is wrong because bfq_group entity
    can not dissapear under us because it is referenced by child bfq_queue's
    entities from here:
     -> bfq_init_entity()
        ->bfqg_and_blkg_get(bfqg);
        ->entity->parent = bfqg->my_entity
    
     -> bfq_put_queue(bfqq)
        FINAL_PUT
        ->bfqg_and_blkg_put(bfqq_group(bfqq))
        ->kmem_cache_free(bfq_pool, bfqq);
    
    So parent entity can not disappear while child entity is in tree,
    and child entities already has proper protection.
    This patch revert commit db37a34c ("block, bfq: get a ref to a group when adding it to a service tree")
    
    bfq_group leak trace caused by bad commit:
    -> blkg_alloc
       -> bfq_pq_alloc
         -> bfqg_get (+1)
    ->bfq_activate_bfqq
      ->bfq_activate_requeue_entity
        -> __bfq_activate_entity
           ->bfq_get_entity
             ->bfqg_and_blkg_get (+1)  <==== : Note1
    ->bfq_del_bfqq_busy
      ->bfq_deactivate_entity+0x53/0xc0 [bfq]
        ->__bfq_deactivate_entity+0x1b8/0x210 [bfq]
          -> bfq_forget_entity(is_in_service = true)
    	 entity->on_st_or_in_serv = false   <=== :Note2
    	 if (is_in_service)
    	     return;  ==> do not touch reference
    -> blkcg_css_offline
     -> blkcg_destroy_blkgs
      -> blkg_destroy
       -> bfq_pd_offline
        -> __bfq_deactivate_entity
             if (!entity->on_st_or_in_serv) /* true, because (Note2)
    		return false;
     -> bfq_pd_free
        -> bfqg_put() (-1, byt bfqg->ref == 2) because of (Note2)
    So bfq_group and blkcg_gq  will leak forever, see test-case below.
    
    ##TESTCASE_BEGIN:
    #!/bin/bash
    
    max_iters=${1:-100}
    #prep cgroup mounts
    mount -t tmpfs cgroup_root /sys/fs/cgroup
    mkdir /sys/fs/cgroup/blkio
    mount -t cgroup -o blkio none /sys/fs/cgroup/blkio
    
    # Prepare blkdev
    grep blkio /proc/cgroups
    truncate -s 1M img
    losetup /dev/loop0 img
    echo bfq > /sys/block/loop0/queue/scheduler
    
    grep blkio /proc/cgroups
    for ((i=0;i<max_iters;i++))
    do
        mkdir -p /sys/fs/cgroup/blkio/a
        echo 0 > /sys/fs/cgroup/blkio/a/cgroup.procs
        dd if=/dev/loop0 bs=4k count=1 of=/dev/null iflag=direct 2> /dev/null
        echo 0 > /sys/fs/cgroup/blkio/cgroup.procs
        rmdir /sys/fs/cgroup/blkio/a
        grep blkio /proc/cgroups
    done
    ##TESTCASE_END:
    
    Fixes: db37a34c
    
     ("block, bfq: get a ref to a group when adding it to a service tree")
    Tested-by: default avatarOleksandr Natalenko <oleksandr@natalenko.name>
    Signed-off-by: default avatarDmitry Monakhov <dmtrmonakhov@yandex-team.ru>
    Signed-off-by: default avatarJens Axboe <axboe@kernel.dk>
    2de791ab