线程基础

# 线程

# 1、线程概述

与进程（process）类似，线程（thread）是允许应用程序并发执行多个任务的一种机制。一个进程可以包含多个线程。同一个程序中的所有线程均会独立执行相同程序，且共享同一份全局内存区域，其中包括初始化数据段、未初始化数据段，以及堆内存段。（传统意义上的 UNIX 进程只是多线程程序的一个特例，该进程只包含一个线程）
进程是 CPU 分配资源的最小单位，线程是操作系统调度执行的最小单位。
线程是轻量级的进程（LWP：Light Weight Process），在 Linux 环境下线程的本质仍是进程。
查看指定进程的 LWP 号：ps –Lf pid

# 2、线程与进程的区别

进程间的信息难以共享。由于除去只读代码段外，父子进程并未共享内存，因此必须采用一些进程间通信方式，在进程间进行信息交换。
调用 fork() 来创建进程的代价相对较高，即便利用写时复制技术，仍然需要复制诸如内存页表和文件描述符表之类的多种进程属性，这意味着 fork() 调用在时间上的开销依然不菲。
线程之间能够方便、快速地共享信息。只需将数据复制到共享（全局或堆）变量中即可。
创建线程比创建进程通常要快 10 倍甚至更多。线程间是共享虚拟地址空间的，无需采用写时复制来复制内存，也无需复制页表。

# 3、线程和进程虚拟地址空间

线程与进程的虚拟地址空间

线程是共享虚拟地址空间的，只不过线程在栈空间，和代码段（.text）都有自己的一块区域，每个线程执行自己的那块区域的代码
main线程被称为主线程

# 4、线程之间共享和非共享资源

# 4.1 共享资源

进程 ID 和父进程 ID
进程组 ID 和会话 ID
用户 ID 和用户组 ID
文件描述符表
信号处置（注册的信号处理）
文件系统的相关信息：文件权限掩码（umask）、当前工作目录
虚拟地址空间（除栈、.text）

仔细一看上面处理最后一条，都是内核里的

# 4.2 非共享资源

线程 ID
信号掩码（阻塞信号集）
线程特有数据
error 变量
实时调度策略和优先级
栈，本地变量和函数的调用链接信

# 5、NPTL

当 Linux 最初开发时，在内核中并不能真正支持线程。但是它的确可以通过 clone() 系统调用将进程作为可调度的实体。这个调用创建了调用进程（calling process）的一个拷贝，这个拷贝与调用进程共享相同的地址空间。LinuxThreads 项目使用这个调用来完成在用户空间模拟对线程的支持。不幸的是，这种方法有一些缺点，尤其是在信号处理、调度和进程间同步等方面都存在问题。另外，这个线程模型也不符合 POSIX 的要求。
要改进 LinuxThreads，需要内核的支持，并且重写线程库。有两个相互竞争的项目开始来满足这些要求。一个包括 IBM 的开发人员的团队开展了 NGPT（Next-Generation POSIX Threads）项目。同时，Red Hat 的一些开发人员开展了 NPTL 项目。NGPT 在 2003 年中期被放弃了，把这个领域完全留给了 NPTL。
NPTL，或称为 Native POSIX Thread Library，是 Linux 线程的一个新实现，它克服了 LinuxThreads 的缺点，同时也符合 POSIX 的需求。与 LinuxThreads 相比，它在性能和稳定性方面都提供了重大的改进。
查看当前 pthread 库版本：getconf GNU_LIBPTHREAD_VERSION

# 6、线程操作

int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine) (void *), void *arg);

pthread_t pthread_self(void);

int pthread_equal(pthread_t t1, pthread_t t2);

void pthread_exit(void *retval);

int pthread_join(pthread_t thread, void **retval);

int pthread_detach(pthread_t thread);

int pthread_cancel(pthread_t thread);

# 6.1 创建线程

# 6.1.1 pthread_create函数

//一般情况下,main函数所在的线程我们称之为主线程（main线程），其余创建的线程称之为子线程。
//程序中默认只有一个进程，fork()函数调用，2个进程（父与子）
//程序中默认只有一个线程，pthread_create()函数调用，产生2个线程（主线程和线程）。
#include <pthread.h>
int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
void *(*start_routine) (void *), void *arg);
    - 功能：创建一个子线程
    - 参数：
        - thread：传出参数，线程创建成功后，子线程的线程ID被写到该变量中。
        - attr : 设置线程的属性，一般使用默认值，NULL
        - start_routine : 函数指针，这个函数是子线程需要处理的逻辑代码
        - arg : 给第三个参数使用，传参
    - 返回值：
        成功：0
        失败：返回错误号。这个错误号和之前errno不太一样。
        获取错误号的信息：  char * strerror(int errnum);

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16

# 6.1.2 代码

//pthread_create.c
#include <stdio.h>
#include <pthread.h>
#include <string.h>  //strerror()
#include <unistd.h>

void *callback(void *arg)
{
    printf("child thread...\n");
    // 因为 arg 是空指针，所以先给强转为int型指针，再解引用
    printf("arg value: %d\n", *(int *)arg);
    return NULL;
}

int main()
{

    pthread_t tid;

    int num = 10;

    // 创建一个子线程
    // 因为我们要传入的是空指针，所以我们要把num强转为 （void*）
    int ret = pthread_create(&tid, NULL, callback, (void *)&num);
    //创建失败
    if (ret != 0)
    {
        char *errstr = strerror(ret);
        printf("error : %s\n", errstr);
    }

    for (int i = 0; i < 5; i++)
    {
        printf("%d\n", i);
    }
    // sleep(1);
    return 0; // exit(0);
}

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38

创建线程

注意：我们创建线程使用的是第三方的库，所以我们在编译时要指定使用的库
- 上面两种写法都可以，在末尾加 -lpthread 或 -pthread
另外我们这里发现，我们只输出了主线程代码应该输出的，子线程的输出并没有执行，这是因为我们主线程执行后直接就退出了，同时把虚拟地址空间都释放掉了，但是线程是共享虚拟地址空间的，所以子线程就没有执行。
- 解决方法：主线程执行完后休眠，让子线程有时间执行（上面代码sleep(1)取消注释）

# 6.2 终止线程

# 6.2.1 终止线程

#include <pthread.h>
void pthread_exit(void *retval);
    功能：终止一个线程，在哪个线程中调用，就表示终止哪个线程
    参数：
        retval：需要传递一个指针，作为一个返回值，可以在pthread_join中

1
2
3
4
5

# 6.2.2 获取线程id

pthread_t pthread_self(void);
    功能：获取当前的线程的线程ID

1
2

# 6.2.3 比较线程的id是否相等

int pthread_equal(pthread_t t1,pthread_t t2);
    功能：比较两个线程ID是否相等
    返回值： 如果两个线程 ID t1 和 t2 相等，则返回一个非零值；否则返回 0
    //不同的操作系统，pthread_t类型的实现不一样，有的是无符号的长整形，有的是使用结构体去实现的，所以我们用pthread_equal来判断兼容性更强。

1
2
3
4

# 6.2.4 代码

//pthread_exit.c
#include <pthread.h>
#include <stdio.h>
#include <string.h> //strerror

void *callback(void *arg)
{
    printf("child thread id : %ld\n", pthread_self());
    // 在子线程中return NULL其实就相当于 pthread_exit(NULL);
    return NULL;
}

int main()
{

    // 创建一个子线程
    pthread_t tid;
    int ret = pthread_create(&tid, NULL, callback, NULL);

    if (ret != 0)
    {
        char *errstr = strerror(ret);
        printf("error : %s\n", errstr);
    }

    // 主线程
    for (int i = 0; i < 5; i++)
    {
        printf("%d\n", i);
    }

    printf("tid : %ld, main thread id : %ld\n", tid, pthread_self());

    // 让主线程退出，当主线程退出时，不会影响其他正常运行的线程
    pthread_exit(NULL);

    printf("main thread exit\n");
    return 0; // exit(0)
}

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39

pthread_exit

使用pthread 线程的函数，在编译时要指定引用的库 -pthread
🟥：可以看出，pthread_create 的返回值为子线程的id
🍃：没有打印 main thread exit，声明主线程在执行到 pthread_exit(NULL) ，后面的语句就没有执行了，（没有执行return 0也就没有释放虚拟地址空间）
（这里看不出来），和父子进程是交替执行的一样，子线程和主线程也是交替执行的

# 6.3 回收子线程

# 6.3.1 pthread_join

#include <pthread.h>
int pthread_join(pthread_t thread, void **retval);
    - 功能：和一个已经终止的线程进行连接
            回收子线程的资源
            这个函数是❗❗阻塞函数❗❗，调用一次只能回收一个子线程
            一般在主线程中使用
    - 参数：
        - thread：需要回收的子线程的ID
        - retval: 接收子线程退出时的返回值
    - 返回值：
        0 : 成功
        非0 : 失败，返回的错误号

1
2
3
4
5
6
7
8
9
10
11
12

# 6.3.2 代码

//pthread_join.c
#include <pthread.h>
#include <stdio.h>
#include <string.h> //strerror
#include <unistd.h> //sleep

void *callback(void *arg)
{
    printf("child thread id : %ld\n", pthread_self());
    sleep(3);
    // return NULL;
    int value = 10;
    // 因为pthread_exit(void *retval)出入的是void型指针，所以我们要进行强转
    pthread_exit((void *)&value); // return (void)*&value; 相等
}

int main()
{

    // 创建一个子线程
    pthread_t tid;
    int ret = pthread_create(&tid, NULL, callback, NULL);

    if (ret != 0)
    {
        char *errstr = strerror(ret);
        printf("error : %s\n", errstr);
    }

    // 主线程
    for (int i = 0; i < 5; i++)
    {
        printf("%d\n", i);
    }

    printf("tid : %ld, main thread id : %ld\n", tid, pthread_self());

    // 主线程调用 pthread_join() 回收子线程的资源
    int *thread_retval;
    // 第二个参数应该是一个二级指针，所以我们要定义一个指针，然后传入该指针的地址
    ret = pthread_join(tid, (void **)&thread_retval);
    if (ret != 0)
    {
        char *errstr = strerror(ret);
        printf("error : %s\n", errstr);
    }

    printf("exit data : %d\n", *thread_retval);

    printf("回收子线程资源成功！\n");

    // 让主线程退出，当主线程退出时，不会影响其他正常运行的线程
    pthread_exit(NULL);

    return 0;
}

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56

pthread_join(1)

执行后，发现我们打印出来的子线程退出时的返回值并不是 10，这是因为我们上面代码里的子进程执行结束后返回的 value是局部变量，（是存放在子线程对应栈空间里的）在子线程执行结束后，子线程的这块栈空间被释放掉了，所以我们打印出来的值不是 10。
所以说，子线程要返回值的话，不要返回局部变量，要返回全局变量。
🟥和下面打印的语句是在3s后输出的，证明了pthread_join是阻塞的，需要等回收完子线程的资源才能继续往下执行

# 6.3.3 思考

为什么int pthread_join(pthread_t thread, void **retval)第二个参数要传入的是二级指针？
答：（以上面代码为例）
- 函数callback返回的是一个指针，要想接收这个返回值需要一个指针类型。所以定义了 int *thread_retval去接收返回的指针。
- 而pthread_join() 函数在完成子线程资源回收的同时要把子线程返回值赋给第二个参数（修改了thread_retval的值），用二级指针是为了修改主线程中定义的一级指针 thread_retval，不然修改的只是传入的副本。

# 6.4 线程分离

# 6.4.1 pthread_detach

#include <pthread.h>
int pthread_detach(pthread_t thread);
    - 功能：分离一个线程。被分离的线程在终止的时候，会自动释放资源返回给系统。
      1.不能多次分离，会产生不可预料的行为。
      2.不能去连接(join)一个已经分离的线程，会报错。
    - 参数：需要分离的线程的ID
    - 返回值：
        成功：0
        失败：返回错误号

1
2
3
4
5
6
7
8
9

# 6.4.2 代码测试

//pthread_detach.c

#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

void *callback(void *arg)
{
    printf("child thread id : %ld\n", pthread_self());
    return 0;
}

int main()
{

    // 创建线程
    pthread_t tid;
    int ret = pthread_create(&tid, NULL, callback, NULL);

    if (ret != 0)
    {
        char *errstr = strerror(ret);
        printf("error1 : %s\n", errstr);
    }

    // 输出主线程和子线程的id
    printf("tid : %ld, main thread id : %ld\n", tid, pthread_self());

    // 设置子线程分离,子线程分离后，子线程结束时对应的资源就不需要主线程释放
    ret = pthread_detach(tid);
    if (ret != 0)
    {
        char *errstr = strerror(ret);
        printf("error2 : %s\n", errstr);
    }

    // 设置分离后，对分离的子线程进行连接 pthread_join()
    ret = pthread_join(tid, NULL);
    if (ret != 0)
    {
        char *errstr = strerror(ret);
        printf("error3 : %s\n", errstr);
    }

    pthread_exit(NULL);

    return 0;
}

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49

pthread_detach

🟥可以看出，我们在进行线程分离后不能再调用 pthread_join()函数去连接，因为线程分离后被分离的线程在终止的时候，会自动释放资源返回给系统，不需要使用join去连接回收子线程的资源

# 6.5 线程取消

# 6.5.1 pthread_cancel

#include <pthread.h>
int pthread_cancel(pthread_t thread);
    - 功能：取消线程（让线程终止）
        取消某个线程，可以终止某个线程的运行，
        但是并不是立马终止，而是当子线程执行到一个取消点，线程才会终止。
      取消点：系统规定好的一些系统调用，我们可以粗略的理解为从用户区到内核区的切换这个位置称之为取消点。

1
2
3
4
5
6

# 6.5.2 代码测试

//pthread_cancel.c
#include <stdio.h>
#include <pthread.h>
#include <string.h>
#include <unistd.h>

void *callback(void *arg)
{
    printf("chid thread id : %ld\n", pthread_self());
    for (int i = 0; i < 5; i++)
    {
        printf("child : %d\n", i);
    }
    return NULL;
}

int main()
{

    // 创建一个子线程
    pthread_t tid;
    int ret = pthread_create(&tid, NULL, callback, NULL);
    if (ret != 0)
    {
        char *errstr = strerror(ret);
        printf("error1 : %s\n", errstr);
    }
    // 取消线程
    pthread_cancel(tid);

    for (int i = 0; i < 5; i++)
    {
        printf("%d\n", i);
    }
    // 输出主线程和子线程的id
    printf("tid : %ld, main thread id : %ld\n", tid, pthread_self());

    pthread_exit(NULL);
    return 0;
}

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40

pthread_cancel

printf() 不是一个取消点，但是printf 会调用 write在stdout里面写数据，而write是要进行一个用户态到内核态的切换的
🟧子线程先执行完了，再执行到的线程取消，所以全部都输出了出来
🟨子线程还没执行for循环，主线程就中写到了 pthread_cancel,又因为printf() 就是取消点，所以子线程没有输出 0~4

# 7、线程属性

# 7.1 查看线程属性函数

man pthread_attr_然后按两次 table键

查看线程属性函数

# 7.2 常用函数

线程属性类型 pthread_attr_t

int pthread_attr_init(pthread_attr_t *attr);
    - 初始化线程属性变量
int pthread_attr_destroy(pthread_attr_t *attr);
    - 释放线程属性的资源
        
int pthread_attr_getdetachstate(const pthread_attr_t *attr, int *detachstate);
    - 获取线程分离的状态属性

int pthread_attr_setdetachstate(pthread_attr_t *attr, int detachstate);
     - 设置线程分离的状态属性
     - 参数：
         - detachstate：
             PTHREAD_CREATE_DETACHED 使用 attr 创建的线程将是独立的，设置线程分离的意思。
             PTHREAD_CREATE_JOINABLE（默认是这个） 使用 attr 创建的线程是可以被join的。

1
2
3
4
5
6
7
8
9
10
11
12
13
14

# 7.3 代码

//pthread_attr.c

#include <stdio.h>
#include <pthread.h>
#include <string.h>
#include <unistd.h>

void *callback(void *arg)
{
    printf("chid thread id : %ld\n", pthread_self());
    return NULL;
}

int main()
{

    // 创建一个线程属性变量
    pthread_attr_t attr;
    // 初始化属性变量
    pthread_attr_init(&attr);
    // 设置属性
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);

    // 创建一个子线程
    pthread_t tid;
    int ret = pthread_create(&tid, &attr, callback, NULL);
    if (ret != 0)
    {
        char *errstr = strerror(ret);
        printf("error1 : %s\n", errstr);
    }

    // 获取线程的栈的大小
    size_t size;
    pthread_attr_getstacksize(&attr, &size);
    printf("thread stack size : %ld\n", size);

    // 输出主线程和子线程的id
    printf("tid : %ld, main thread id : %ld\n", tid, pthread_self());

    // 释放线程属性资源
    pthread_attr_destroy(&attr);
    pthread_exit(NULL);
    return 0;
}

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46

pthread_attr

一般系统默认为每个线程分配8MB的栈空间
- 即 8 MB = 8388608 B
一般不需要自己来管理线程堆栈，Linux默认为每个线程分配了足够的堆栈空间（一般是8MB），可以用ulimit -s 命令查看或修改这个默认值；

编辑

上次更新: 2023/09/13, 12:29:52

← 守护进程线程同步→