[kernel] Reading the source with a question in mind: when does setreuid update the saved set-user-ID (SUID)?

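Preface

While writing the "Changing a process's user ID and group ID" part of the "Process creation" section of my earlier post "[apue] 进程控制那些事儿" (notes on process control), I noticed that when setreuid changes the real user ID (RUID) or the effective user ID (EUID), the saved set-user-ID (SUID) only seems to follow the EUID. It does not follow the RUID, which is how I had read the man page (man setreuid):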

 If the real user ID is set (i.e., ruid is not -1) or the effective user ID is set to a value not equal to the
 previous real user ID, the saved set-user-ID will be set to the new effective user ID.

Here are the measured results:

Call (as root)        RUID    EUID    SUID
setreuid(bar, foo)    bar     foo     foo
setreuid(foo, bar)    foo     bar     bar
setreuid(-1, foo)     root    foo     foo
setreuid(bar, -1)     bar     root    root
setreuid(bar, bar)    bar     bar     bar
setreuid(foo, foo)    foo     foo     foo

Note the fourth case in particular, setreuid(bar, -1): RUID changes to bar, yet SUID stays root.

To answer this question, I looked up the Linux source for the kernel version this system runs:

> uname -a
Linux goodcitizen.bcc-gzhxy.baidu.com 3.10.0-1160.80.1.el7.x86_64 #1 SMP Tue Nov 8 15:48:59 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

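This is 3.10. I recommended CodeBrowser before; recently I found another excellent source browser, bootlin [1].

It lets you pick the version of the code base, and it is more capable than CodeBrowser. Its identifier search, for example, is accurate and separates definitions, declarations, members and references, which fixes CodeBrowser's main pain point of failing to find code. Highly recommended.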

Reproducing the problem

Before digging into the source, let's recap how the question arose. To save readers time, here is the earlier demo (setuid):

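#define _GNU_SOURCE     /* getresuid() is a GNU extension */
#include "../apue.h"
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/file.h>
#include <sys/stat.h>
#include <unistd.h>

/* Print the RUID, EUID and SUID of the current process. */
void print_ids (void)
{
    uid_t ruid = 0;
    uid_t euid = 0;
    uid_t suid = 0;
    int ret = getresuid (&ruid, &euid, &suid);
    if (ret == 0)
        printf ("%d: ruid %d, euid %d, suid %d\n",
                (int)getpid(), (int)ruid, (int)euid, (int)suid);
    else
        err_sys ("getresuid");
}

int main (int argc, char *argv[])
{
    if (argc == 2)
    {
        /* one argument: setuid(uid) */
        char* uid=argv[1];
        int ret = setuid((uid_t)atol(uid));
        if (ret != 0)
            err_sys ("setuid");
        print_ids();
    }
    else if (argc == 3)
    {
        /* two arguments: setreuid(ruid, euid); "-1" means "leave unchanged" */
        char* ruid=argv[1];
        char* euid=argv[2];
        int ret = setreuid((uid_t)atol(ruid), (uid_t)atol(euid));
        if (ret != 0)
            err_sys ("setreuid");
        print_ids();
    }
    else if (argc > 1)
    {
        /* three or more arguments: seteuid(uid), only argv[1] is used */
        char* uid=argv[1];
        int ret = seteuid((uid_t)atol(uid));
        if (ret != 0)
            err_sys ("seteuid");
        print_ids();
    }
    else
    {
        /* no arguments: just print the current IDs */
        print_ids();
    }
    return 0;
}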

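A brief explanation:

  • print_ids prints the three user IDs of the current process: RUID / EUID / SUID. The getresuid call it relies on is not part of POSIX (Linux and some BSDs provide it); it is what exposes the current SUID
  • ./setuid 123: calls setuid and prints the resulting IDs
  • ./setuid 123 456: calls setreuid and prints the resulting IDs
  • ./setuid 123 456 789: calls seteuid and prints the resulting IDs. Only the first argument is used; the other two are placeholders

Passing different arguments to the setuid program exercises different interfaces. Only setreuid matters here, so the tests always pass two arguments. Here is the test script (setreuid.sh):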

#!/bin/sh
groupadd test
echo "create group ok"
useradd -g test foo
useradd -g test bar
foo_uid=$(id -u foo)
bar_uid=$(id -u bar)
echo "create user ok"
echo " foo: ${foo_uid}"
echo " bar: ${bar_uid}"
cd /tmp
#chown bar:test ./setuid
echo "test foo"
./setuid
#chmod u+s ./setuid
#echo "test set-uid bar"
#su foo -c ./setuid
echo "test setreuid(bar, foo)"
./setuid ${bar_uid} ${foo_uid}
echo "test setreuid(foo, bar)"
./setuid ${foo_uid} ${bar_uid}
echo "test setreuid(-1, foo)"
./setuid -1 ${foo_uid}
echo "test setreuid(bar, -1)"
./setuid ${bar_uid} -1
echo "test setreuid(bar, bar)"
./setuid ${bar_uid} ${bar_uid}
echo "test setreuid(foo, foo)"
./setuid ${foo_uid} ${foo_uid}
userdel bar
userdel foo
echo "remove user ok"
rm -rf /home/bar
rm -rf /home/foo
echo "remove user home ok"
groupdel test
echo "delete group ok"

The script creates the test accounts, runs setuid for six cases, and must be started as the superuser:

> sudo sh setreuid.sh
create group ok
create user ok
 foo: 1003
 bar: 1004
test foo
27253: ruid 0, euid 0, suid 0
test setreuid(bar, foo)
27254: ruid 1004, euid 1003, suid 1003
test setreuid(foo, bar)
27255: ruid 1003, euid 1004, suid 1004
test setreuid(-1, foo)
27256: ruid 0, euid 1003, suid 1003
test setreuid(bar, -1)
27257: ruid 1004, euid 0, suid 0
test setreuid(bar, bar)
27258: ruid 1004, euid 1004, suid 1004
test setreuid(foo, foo)
27259: ruid 1003, euid 1003, suid 1003
remove user ok
remove user home ok

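The behavior matches the table above.

Source analysis

Searching for setreuid in kernel 3.10.0 finds nothing, probably because the macros the kernel wraps around system call definitions hide the name from the search. Searching for setuid works, and the two live in the same file (kernel/sys.c):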

/*
 * Unprivileged users may change the real uid to the effective uid
 * or vice versa. (BSD-style)
 *
 * If you set the real uid at all, or set the effective uid to a value not
 * equal to the real uid, then the saved uid is set to the new effective uid.
 *
 * This makes it possible for a setuid program to completely drop its
 * privileges, which is often a useful assertion to make when you are doing
 * a security audit over a program.
 *
 * The general idea is that a program which uses just setreuid() will be
 * 100% compatible with BSD. A program which uses just setuid() will be
 * 100% compatible with POSIX with saved IDs. 
 */
SYSCALL_DEFINE2(setreuid, uid_t, ruid, uid_t, euid)
{
    struct user_namespace *ns = current_user_ns();
    const struct cred *old;
    struct cred *new;
    int retval;
    kuid_t kruid, keuid;

    kruid = make_kuid(ns, ruid);
    keuid = make_kuid(ns, euid);

    if ((ruid != (uid_t) -1) && !uid_valid(kruid))
        return -EINVAL;
    if ((euid != (uid_t) -1) && !uid_valid(keuid))
        return -EINVAL;

    new = prepare_creds();
    if (!new)
        return -ENOMEM;
    old = current_cred();

    retval = -EPERM;
    if (ruid != (uid_t) -1) {
        new->uid = kruid;
        if (!uid_eq(old->uid, kruid) &&
            !uid_eq(old->euid, kruid) &&
            !nsown_capable(CAP_SETUID))
            goto error;
    }

    if (euid != (uid_t) -1) {
        new->euid = keuid;
        if (!uid_eq(old->uid, keuid) &&
            !uid_eq(old->euid, keuid) &&
            !uid_eq(old->suid, keuid) &&
            !nsown_capable(CAP_SETUID))
            goto error;
    }

    if (!uid_eq(new->uid, old->uid)) {
        retval = set_user(new);
        if (retval < 0)
            goto error;
    }
    if (ruid != (uid_t) -1 ||
        (euid != (uid_t) -1 && !uid_eq(keuid, old->uid)))
        new->suid = new->euid;
    new->fsuid = new->euid;

    retval = security_task_fix_setuid(new, old, LSM_SETID_RE);
    if (retval < 0)
        goto error;

    return commit_creds(new);

error:
    abort_creds(new);
    return retval;
}

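The code is short, so nothing was cut. Its skeleton is:

    ...
    new = prepare_creds();
    ...
    old = current_cred();
    ...
    return commit_creds(new);
error:
    abort_creds(new);

prepare_creds returns new, a copy of the current credentials that is to become the new set; current_cred returns old, the credentials currently in effect. After new has been modified, it is committed on success (commit_creds) and replaces the old credentials; on failure it is discarded (abort_creds) and the old credentials stay as they were. The focus now shifts to how new is modified: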

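    retval = -EPERM;
    if (ruid != (uid_t) -1) {
        new->uid = kruid;
        if (!uid_eq(old->uid, kruid) &&
            !uid_eq(old->euid, kruid) &&
            !nsown_capable(CAP_SETUID))
            goto error;
    }

Start with the ruid argument. If it is given (not -1), it is stored in new->uid, but the call is only allowed when at least one of the following holds:

  • ruid == old->uid
  • ruid == old->euid
  • the caller has the CAP_SETUID capability (in practice, superuser privilege)

If none of them holds, the call fails with EPERM. For comparison, here is the handling of the euid argument: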

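    if (euid != (uid_t) -1) {
        new->euid = keuid;
        if (!uid_eq(old->uid, keuid) &&
            !uid_eq(old->euid, keuid) &&
            !uid_eq(old->suid, keuid) &&
            !nsown_capable(CAP_SETUID))
            goto error;
    }

It is almost the same as for ruid. Again at least one of these must hold:

  • euid == old->uid
  • euid == old->euid
  • euid == old->suid
  • the caller has the CAP_SETUID capability (in practice, superuser privilege)

There is one extra rule: euid may also be set to old->suid. Finally, here is the rule for updating SUID: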

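    if (ruid != (uid_t) -1 ||
        (euid != (uid_t) -1 && !uid_eq(keuid, old->uid)))
        new->suid = new->euid;
    new->fsuid = new->euid;

The third line states unambiguously that new->suid is always set to new->euid, and that this happens when either of the following conditions holds (the fourth line sits outside the if and always runs):

  • Condition I: the ruid argument is given (not -1)
  • Condition II: the euid argument is given and differs from the old RUID

So the ruid argument only decides whether SUID is copied from the new EUID. Even when RUID does not actually change, a valid ruid argument is enough to trigger the copy. Now reread the man page: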

 If the real user ID is set (i.e., ruid is not -1) or the effective user ID is set to a value not equal to the
 previous real user ID, the saved set-user-ID will be set to the new effective user ID.

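This is practically a literal translation of the code: the ruid argument being set, the euid argument differing from the previous RUID, and SUID being copied from the new EUID all match exactly. I had misread it as SUID being copied from RUID. Careless of me!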
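To double-check this reading, the logic can be condensed into a small userspace model. This is only a sketch under stated assumptions: struct ids, NO_CHANGE, model_setreuid and the privileged flag are names made up for illustration, with privileged standing in for nsown_capable(CAP_SETUID). The model ignores uid validation, set_user, fsuid and the LSM hook.

```c
#include <stdbool.h>
#include <sys/types.h>

#define NO_CHANGE ((uid_t)-1)   /* the "-1" argument of setreuid */

/* Hypothetical userspace stand-in for the IDs kept in struct cred. */
struct ids {
    uid_t ruid, euid, suid;
};

/*
 * Model of setreuid(ruid, euid) as read from the 3.10 source.
 * `privileged` stands in for nsown_capable(CAP_SETUID) (an assumption).
 * Returns 0 and updates *cur on success; returns -1 (think EPERM) and
 * leaves *cur untouched otherwise.
 */
int model_setreuid(struct ids *cur, uid_t ruid, uid_t euid, bool privileged)
{
    struct ids next = *cur;     /* work on a copy, like prepare_creds() */

    if (ruid != NO_CHANGE) {
        next.ruid = ruid;
        if (ruid != cur->ruid && ruid != cur->euid && !privileged)
            return -1;
    }
    if (euid != NO_CHANGE) {
        next.euid = euid;
        if (euid != cur->ruid && euid != cur->euid &&
            euid != cur->suid && !privileged)
            return -1;
    }
    /* Condition I: ruid given. Condition II: euid given and != old RUID. */
    if (ruid != NO_CHANGE || (euid != NO_CHANGE && euid != cur->ruid))
        next.suid = next.euid;

    *cur = next;                /* publish the copy, like commit_creds() */
    return 0;
}
```

Starting from {root, root, root}, model_setreuid(&ids, bar, NO_CHANGE, true) yields {bar, root, root}, which matches the setreuid(bar, -1) case measured at the beginning of the post.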

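Verifying the explanation

With the SUID rule understood, the earlier table can be read in more detail:

Call (as root)        RUID     EUID     SUID    SUID copy triggered by
setreuid(bar, foo)    bar *    foo *    foo     condition I
setreuid(foo, bar)    foo *    bar *    bar     condition I
setreuid(-1, foo)     root     foo *    foo     condition II
setreuid(bar, -1)     bar *    root     root    condition I
setreuid(bar, bar)    bar *    bar *    bar     condition I
setreuid(foo, foo)    foo *    foo *    foo     condition I

An asterisk in the RUID column means the ruid argument was given; an asterisk in the EUID column means the euid argument was given and differs from the old RUID (!= old.uid). The two conditions are joined by a short-circuit OR, so the second is not evaluated once the first holds; the last column therefore records which condition actually triggered the SUID copy. Most cases hit condition I (ruid given), which leaves condition II barely tested, so a new set of cases is needed.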

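Consider this scenario: make the demo set-user-ID root and start it as the ordinary user foo. The process then has RUID = foo and EUID = SUID = root. With ruid left at -1, condition I cannot fire, so whether SUID is copied depends solely on condition II, that is, on whether the euid argument differs from the old RUID (foo). Here is a test (setreuid-setroot.sh):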

#!/bin/sh
groupadd test
echo "create group ok"
useradd -g test foo
useradd -g test bar
foo_uid=$(id -u foo)
bar_uid=$(id -u bar)
root_uid=0
echo "create user ok"
echo " foo: ${foo_uid}"
echo " bar: ${bar_uid}"
echo " root: ${root_uid}"
cd /tmp
chown root:test ./setuid
echo "test foo"
su foo -c ./setuid
chmod u+s ./setuid
echo "test set-uid root"
su foo -c ./setuid
echo "test setreuid(-1, foo)"
su foo -c "./setuid -1 ${foo_uid}"
echo "test setreuid(-1, bar)"
su foo -c "./setuid -1 ${bar_uid}"
echo "test setreuid(foo, foo)"
su foo -c "./setuid ${foo_uid} ${foo_uid}"
echo "test setreuid(root, foo)"
su foo -c "./setuid ${root_uid} ${foo_uid}"
userdel foo
userdel bar
echo "remove user ok"
rm -rf /home/foo
rm -rf /home/bar
echo "remove user home ok"
groupdel test
echo "delete group ok"

The key difference from the earlier script is that the test program is started as user foo (su foo -c) and is set-user-ID root. It exercises the following four setreuid cases:

> sudo sh setreuid-setroot.sh
create group ok
create user ok
 foo: 1003
 bar: 1004
 root: 0
test foo
29332: ruid 1003, euid 1003, suid 1003
test set-uid root
29345: ruid 1003, euid 0, suid 0
test setreuid(-1, foo)
29357: ruid 1003, euid 1003, suid 0
test setreuid(-1, bar)
29369: ruid 1003, euid 1004, suid 1004
test setreuid(foo, foo)
29382: ruid 1003, euid 1003, suid 1003
test setreuid(root, foo)
29396: ruid 0, euid 1003, suid 1003
remove user ok
remove user home ok
delete group ok

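For clarity, here are the results as a table:

Call (started by foo, set-user-ID root)    RUID      EUID     SUID    SUID copy triggered by
after startup                              foo       root     root    n/a
setreuid(-1, foo)                          foo       foo      root    not triggered
setreuid(-1, bar)                          foo       bar *    bar     condition II
setreuid(foo, foo)                         foo *     foo      foo     condition I
setreuid(root, foo)                        root *    foo      foo     condition I

Asterisks mean the same as before. Row by row, counting the header as row 1:

  • Row 3: only euid is set, to foo. ruid == -1, so condition I does not fire; euid == old.uid, so condition II does not fire either, and SUID stays root
  • Row 4: only euid is set, to bar. ruid == -1, so condition I does not fire; euid != old.uid, so condition II fires and SUID is copied from the new EUID: bar
  • Row 5: ruid and euid are both set to foo. ruid != -1, so condition I fires; euid == old.uid, so condition II would not fire; SUID is copied from the new EUID: foo
  • Row 6: ruid is set to root and euid to foo. ruid != -1, so condition I fires; euid == old.uid, so condition II would not fire; SUID is copied from the new EUID: foo

This matches the source logic, and it also resolves the puzzle from the earlier post "[apue] 进程控制那些事儿", where SUID stayed root after setreuid(-1, foo) in this scenario.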

Look again at row 3 of the table above, the state after setreuid(-1, foo): RUID = EUID = foo, SUID = root. Calling setreuid(foo, -1) at this point should update SUID to foo. Let's try it:

    uid_t ruid = 0;
    uid_t euid = 0;
    uid_t suid = 0;
    int ret = getresuid (&ruid, &euid, &suid);
    if (ret == 0)
    {
        printf ("%d: ruid %d, euid %d, suid %d\n", getpid(), ruid, euid, suid);
#ifdef TEST_UPDATE_RUID
        if (ruid == euid && euid != suid)
        {
            printf ("all uid same except suid %d, try to update ruid\n", ruid);
            ret = setreuid (ruid, -1);
            if (ret != 0)
                err_sys ("setreuid");
            else
            {
                getresuid (&ruid, &euid, &suid);
                printf ("%d: ruid %d, euid %d, suid %d\n", getpid(), ruid, euid, suid);
            }
        }
#endif
    }
    else
        err_sys ("getresuid");

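When print_ids detects RUID = EUID != SUID, it sets RUID again, passing a ruid argument equal to the current RUID. With TEST_UPDATE_RUID defined at build time, rerun the script: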

> sudo sh setreuid-setroot.sh
...
test set-uid root
29511: ruid 1003, euid 0, suid 0
test setreuid(-1, foo)
29523: ruid 1003, euid 1003, suid 0
all uid same except suid 1003, try to update ruid
29523: ruid 1003, euid 1003, suid 1003
...

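SUID did change! This case makes the point even better: RUID and EUID are the same before and after the setreuid call, yet SUID changes simply because the ruid argument was given. Interesting. The table below summarizes the sequence:

Call (started by foo, set-user-ID root)    RUID     EUID    SUID    SUID copy triggered by
after startup                              foo      root    root    n/a
setreuid(-1, foo)                          foo      foo     root    not triggered
setreuid(foo, -1)                          foo *    foo     foo     condition I

Unlike the earlier tables, all setreuid calls here happen in the same process.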
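The observation above can be turned into a tiny helper. The sketch below assumes the Linux behavior read from the 3.10 source (condition I fires whenever ruid is given); sync_suid_to_euid is a name invented here, and other systems may treat this call differently.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Passing the current RUID as the ruid argument changes neither RUID nor
 * EUID, but on Linux it triggers condition I, so SUID is overwritten with
 * the current EUID. Returns 0 on success, -1 on error (errno is set).
 */
int sync_suid_to_euid(void)
{
    return setreuid(getuid(), (uid_t)-1);
}
```

In a set-user-ID root program that has already switched EUID to the ordinary user, this call discards the saved root ID, so the process can no longer switch back.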

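Looking for the rationale

Knowing what happens is only the first step; it is worth knowing why as well. The purpose of copying EUID into SUID was already discussed in "[apue] 进程控制那些事儿". The question here is when the copy happens. Intuitively, the following condition would seem more natural:

ruid != old.uid || euid != old.euid

To see why the kernel does not use it, work through all the setreuid cases again under this hypothetical condition: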

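Call (as root)        RUID    EUID    SUID
setreuid(bar, foo)    bar     foo     foo
setreuid(foo, bar)    foo     bar     bar
setreuid(-1, foo)     root    foo     foo
setreuid(bar, -1)     bar     root    root
setreuid(bar, bar)    bar     bar     bar
setreuid(foo, foo)    foo     foo     foo

Call (started by foo, set-user-ID root)    RUID    EUID    SUID
after startup                              foo     root    root
setreuid(-1, foo)                          foo     foo     foo     <- changed (was root)
setreuid(-1, bar)                          foo     bar     bar
setreuid(foo, foo)                         foo     foo     foo
setreuid(root, foo)                        root    foo     foo

Only one result changes (the row marked "<- changed"): a set-user-ID root process started by an ordinary user that calls setreuid(-1, foo). With the real rule SUID stays put; with the hypothetical rule SUID follows EUID, which turns the process into an ordinary one for good, with no way to become privileged again. And this call form looks very familiar: it is exactly what seteuid does! seteuid changes EUID and does not want SUID to change. So it all makes sense: setreuid is designed this way so that a seteuid-style call can switch away from the privileged identity and still switch back to it later.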
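The comparison can also be written down as two predicates. This is a sketch, not kernel code: the function names and NO_CHANGE are invented here, and the "intuitive" rule is my reading of the hypothetical condition, taking a -1 argument to mean that the corresponding ID does not change.

```c
#include <stdbool.h>
#include <sys/types.h>

#define NO_CHANGE ((uid_t)-1)   /* the "-1" argument of setreuid */

/* Rule in the 3.10 source: condition I or condition II. */
bool kernel_copies_suid(uid_t old_ruid, uid_t ruid_arg, uid_t euid_arg)
{
    return ruid_arg != NO_CHANGE ||
           (euid_arg != NO_CHANGE && euid_arg != old_ruid);
}

/* Hypothetical rule: copy only when RUID or EUID actually changes. */
bool intuitive_copies_suid(uid_t old_ruid, uid_t old_euid,
                           uid_t ruid_arg, uid_t euid_arg)
{
    bool ruid_changes = ruid_arg != NO_CHANGE && ruid_arg != old_ruid;
    bool euid_changes = euid_arg != NO_CHANGE && euid_arg != old_euid;
    return ruid_changes || euid_changes;
}
```

For the set-user-ID root process (old RUID = foo, old EUID = root) calling setreuid(-1, foo), the kernel rule returns false while the hypothetical rule returns true; for the other nine cases in the two tables both rules agree.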

Closing remarks

The man page has this to say about seteuid:

 Under libc4, libc5 and glibc 2.0 seteuid(euid) is equivalent to setreuid(-1, euid) and hence may change the
 saved set-user-ID. Under glibc 2.1 and later it is equivalent to setresuid(-1, euid, -1) and hence does not
 change the saved set-user-ID. Analogous remarks hold for setegid(), with the difference that the change in
 implementation from setregid(-1, egid) to setresgid(-1, egid, -1) occurred in glibc 2.2 or 2.3 (depending on
 the hardware architecture).

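In short, whether seteuid is equivalent to setreuid(-1, euid) or to setresuid(-1, euid, -1) depends on the glibc version. The former updates SUID according to the rules discussed above; the latter never changes SUID when EUID changes. I previously compared setuid / setreuid / seteuid and recommended seteuid. Where seteuid is just a thin wrapper around setreuid, the difference between them is smaller than it looks, and seteuid is merely more convenient to write.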
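In practice this is what makes the "drop and regain" pattern work in a set-user-ID root program. The helpers below are a minimal sketch: drop_privileges_temporarily and restore_privileges are names invented here, and the pattern relies on SUID staying root across the first seteuid call, as described above.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Switch the effective UID to the real UID and remember the previous EUID
 * in *saved_euid. Returns 0 on success, -1 on error (errno is set).
 */
int drop_privileges_temporarily(uid_t *saved_euid)
{
    *saved_euid = geteuid();
    return seteuid(getuid());
}

/*
 * Switch the effective UID back to the remembered value. For an
 * unprivileged caller this only succeeds while that value is still the
 * RUID, EUID or SUID of the process.
 */
int restore_privileges(uid_t saved_euid)
{
    return seteuid(saved_euid);
}
```

A set-user-ID root program would call drop_privileges_temporarily right after startup, do its ordinary work as the invoking user, and call restore_privileges only around the few operations that really need root.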

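Postscript

One more plug for bootlin to close the post:

Besides the kernel, it hosts freebsd, glibc, qemu, dpdk, grub, llvm, busybox and other code bases. Click a symbol to jump to it and use the browser's back button to walk back up the call chain. Very handy.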

References

[1]. https://elixir.bootlin.com

Author: goodcitizen
Original post: https://www.cnblogs.com/goodcitizen/p/18111594/when_setreuid_updates_suid
