FFmpeg Notes

Basic Concepts

https://www.ruanyifeng.com/blog/2020/01/ffmpeg.html

Bitrate

Bit rate, measured in bps (bits per second).
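
A rough size estimate is file size ≈ bitrate × duration ÷ 8. For example, the 2719 kb/s file shown further below runs 58:05 (about 3485 s), so roughly 2719 kbit/s × 3485 s ÷ 8 ≈ 1.2 GB.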

muxer

Muxing: packing video, audio, subtitle, and other streams into a container format.
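
For example (a sketch with placeholder filenames), muxing an existing video file and audio file into one MKV without re-encoding:

ffmpeg -i video.mp4 -i audio.m4a -map 0:v:0 -map 1:a:0 -c copy output.mkv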

demuxer

Demuxing: pulling the individual streams back out of a container format.
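
For example (placeholder filenames again), extracting the first audio stream without re-encoding; pick an output extension that matches the audio codec:

ffmpeg -i input.mkv -map 0:a:0 -c copy audio.m4a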

Container

A video file is really a container that holds the video and audio, and possibly subtitles and other content.

List the container formats supported by ffmpeg:

ffmpeg -formats
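
Because the container is only a wrapper, it is often possible to switch containers without re-encoding, as long as the target container supports the streams' codecs (placeholder filenames):

ffmpeg -i input.mkv -c copy output.mp4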

Codec

Video and audio must be encoded before they can be saved to a file. Different codecs achieve different compression ratios, which leads to differences in file size and quality. H.264, for example, is a video codec.

List the codecs supported by ffmpeg:

ffmpeg -codecs
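
To transcode with a specific codec, name it explicitly (a sketch; the CRF value is just an example):

ffmpeg -i input.mp4 -c:v libx264 -crf 23 -c:a copy output.mp4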

Encoder

An encoder is a library that implements a particular codec; a format can only be encoded or decoded once a corresponding encoder/decoder is installed. libx264, for example, is an H.264 encoder.

List the encoders available in ffmpeg:

ffmpeg -encoders
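
To show the options of one particular encoder:

ffmpeg -h encoder=libx264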

Viewing a File's Metadata

ffmpeg -i input.mp4 -hide_banner

A container holds multiple streams: video, audio, subtitles, and so on.

The example below contains one video stream, one audio stream, one subtitle stream, and three attachment streams (font files).

Input #0, matroska,webm, from 'F:\Why.Poverty.1of8.Poor.Us.-.An.Animated.History.of.Poverty.1080p.WEB-DL.AVC.AAC-Conan06.mkv':
  Metadata:
    title           : Poor Us: An Animated History of Poverty
    encoder         : libebml v1.3.0 + libmatroska v1.4.0
    creation_time   : 2013-01-26T09:49:37.000000Z
  Duration: 00:58:05.04, start: 0.000000, bitrate: 2719 kb/s
  Stream #0:0(eng): Video: h264 (High), yuv420p(progressive), 1920x1080, SAR 1:1 DAR 16:9, 25 fps, 25 tbr, 1k tbn, 50 tbc (default)
    Metadata:
      title           : Episode 1 of 8
  Stream #0:1(eng): Audio: aac (LC), 44100 Hz, stereo, fltp (default)
    Metadata:
      title           : Conan06
  Stream #0:2(chi): Subtitle: ass (default)
    Metadata:
      title           : by Conan06
  Stream #0:3: Attachment: ttf
    Metadata:
      filename        : 方正准园.ttf
      mimetype        : application/x-truetype-font
  Stream #0:4: Attachment: ttf
    Metadata:
      filename        : 方正兰亭特黑长.TTF
      mimetype        : application/x-truetype-font
  Stream #0:5: Attachment: ttf
    Metadata:
      filename        : 张海山锐线体简.ttf
      mimetype        : application/x-truetype-font

The next example contains one video stream and two audio streams (i.e., two audio tracks):

Input #0, matroska,webm, from 'D:\IronMan.mkv':
  Metadata:
    encoder         : libebml v1.3.0 + libmatroska v1.4.0
    creation_time   : 2014-08-26T16:56:08.000000Z
  Duration: 02:06:01.81, start: 0.000000, bitrate: 2048 kb/s
  Stream #0:0: Video: h264 (High), yuv420p(progressive), 1280x720 [SAR 1:1 DAR 16:9], 23.98 fps, 23.98 tbr, 1k tbn, 47.95 tbc (default)
  Stream #0:1(eng): Audio: aac (HE-AAC), 48000 Hz, 5.1, fltp (default)
    Metadata:
      title           : 英语
  Stream #0:2(chi): Audio: aac (HE-AAC), 48000 Hz, 5.1, fltp
    Metadata:
      title           : 国语

The audio side of a container consists of zero or more tracks, and each track consists of zero or more channels.
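
For example, to keep the video and only the second audio track (the Mandarin track in the example above), copy just those streams (a sketch, assuming the stream indexes shown):

ffmpeg -i input.mkv -map 0:0 -map 0:2 -c copy output.mkv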

Tutorial

http://dranger.com/ffmpeg/ffmpeg.html

VS2017 + FFmpeg 4.4.1

Some of the functions used in the tutorial have been deprecated in FFmpeg 4.4.1, so the code below differs slightly from the original.
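
Roughly, these are the replacements involved (not an exhaustive list):

av_register_all()                       -> no longer needed (deprecated since 4.0)
AVStream::codec                         -> AVStream::codecpar + avcodec_parameters_to_context()
avcodec_decode_video2() / audio4()      -> avcodec_send_packet() + avcodec_receive_frame()
avpicture_get_size() / avpicture_fill() -> av_image_get_buffer_size() / av_image_fill_arrays()
av_free_packet()                        -> av_packet_unref()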

Screencaps

Goal: load a video file and save its first few frames as .ppm files (PPM files can be opened with Photoshop).
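
Roughly the same thing can be done from the command line (a sketch; it writes frame1.ppm, frame2.ppm, ...):

ffmpeg -i input.mp4 -frames:v 5 frame%d.ppm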

#include <iostream>
#include <fstream>
#include <sstream>

extern "C" {
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libswscale/swscale.h>
#include <libavutil/imgutils.h>
}

// Write one frame of packed RGB24 pixels to a binary PPM (P6) image file
void saveFrame(AVFrame *frame, int width, int height, int frameIndex)
{
    std::stringstream filename;
    filename << "frame" << frameIndex << ".ppm";
    std::ofstream file(filename.str(), std::ios_base::binary);

    // header
    file << "P6\n" << width << " " << height << "\n255\n";

    // pixel data
    for (int i = 0; i < height; ++i) {
        file.write((const char*)(frame->data[0] + i * frame->linesize[0]), width * 3);
    }
}

// Making Screencaps
void testffmpeg1()
{
    // Deprecated since FFmpeg 4.0; no longer needs to be called
    //av_register_all();

    int ret(0);
    AVFormatContext *formatCtx(nullptr);
    AVCodecContext *codecCtx(nullptr);
    AVFrame *frame(nullptr);
    AVFrame *frameRGB(nullptr);
    uint8_t *buffer(nullptr);

    try {
        // Open the input file
        std::string filename = "D:/Pictures/新建文件夹/11/0012.mp4";
        ret = avformat_open_input(&formatCtx, filename.c_str(), nullptr, nullptr);
        if (ret < 0) {
            std::cout << "error: avformat_open_input() failed" << std::endl;
            throw -1;
        }

        // Retrieve stream information
        ret = avformat_find_stream_info(formatCtx, nullptr);
        if (ret < 0) {
            std::cout << "error: avformat_find_stream_info() failed" << std::endl;
            throw -1;
        }

        //av_dump_format(formatCtx, 0, filename.c_str(), 0);

        // Find the video stream
        int videoStream(-1);
        for (uint32_t i = 0; i < formatCtx->nb_streams; ++i) {
            if (formatCtx->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) {
                videoStream = i;
                break;
            }
        }

        if (videoStream == -1) {
            std::cout << "error: Didn't find a video stream";
            throw -1;
        }

        // Find a decoder for the video stream
        AVCodec *codec = avcodec_find_decoder(formatCtx->streams[videoStream]->codecpar->codec_id);
        if (!codec) {
            std::cout << "error: Unsupported codec" << std::endl;
            throw -1;
        }

        // Allocate a decoder context
        codecCtx = avcodec_alloc_context3(codec);
        if (!codecCtx) {
            std::cout << "error: avcodec_alloc_context3() failed" << std::endl;
            throw -1;
        }

        ret = avcodec_parameters_to_context(codecCtx, formatCtx->streams[videoStream]->codecpar);
        if (ret < 0) {
            std::cout << "error: avcodec_parameters_to_context() failed" << std::endl;
            throw -1;
        }

        // Initialize the decoder context
        ret = avcodec_open2(codecCtx, codec, nullptr);
        if (ret < 0) {
            std::cout << "error: avcodec_open2() failed" << std::endl;
            throw -1;
        }

        // RGB frame used to save screencaps to disk
        frameRGB = av_frame_alloc();
        if (!frameRGB) {
            std::cout << "error: av_frame_alloc() failed" << std::endl;
            throw -1;
        }

        // Compute the buffer size needed for the RGB frame
        int numBytes = av_image_get_buffer_size(AV_PIX_FMT_RGB24, codecCtx->width, codecCtx->height, 1);
        if (numBytes < 0) {
            std::cout << "error: av_image_get_buffer_size() failed" << std::endl;
            throw -1;
        }

        // Allocate the buffer
        buffer = (uint8_t*)av_malloc(numBytes);
        if (!buffer) {
            std::cout << "error: av_malloc() failed" << std::endl;
            throw -1;
        }

        // Attach the buffer to frameRGB
        ret = av_image_fill_arrays(frameRGB->data, frameRGB->linesize,
            buffer, AV_PIX_FMT_RGB24, codecCtx->width, codecCtx->height, 1);
        if (ret < 0) {
            std::cout << "error: av_image_fill_arrays() failed" << std::endl;
            throw -1;
        }

        // Converter from the decoded frame's pixel format to RGB24
        SwsContext *swsCtx = sws_getContext(codecCtx->width, codecCtx->height, codecCtx->pix_fmt,
            codecCtx->width, codecCtx->height, AV_PIX_FMT_RGB24, SWS_BILINEAR, nullptr, nullptr, nullptr);

        // Decoded video frame
        frame = av_frame_alloc();
        if (!frame) {
            std::cout << "error: av_frame_alloc() failed" << std::endl;
            throw -1;
        }

        // Read packets from the file
        AVPacket packet;
        int i(0);
        while (av_read_frame(formatCtx, &packet) >= 0) {
            if (packet.stream_index == videoStream) {
                // Decode the video packet
                ret = avcodec_send_packet(codecCtx, &packet);
                if (ret < 0) {
                    std::cout << "error: avcodec_send_packet() failed" << std::endl;
                    throw -1;
                }
                ret = avcodec_receive_frame(codecCtx, frame);
                if (ret == 0) {
                    // Got a decoded frame
                    // Convert it to RGB24
                    sws_scale(swsCtx, frame->data, frame->linesize, 0, codecCtx->height,
                        frameRGB->data, frameRGB->linesize);

                    // Save the first 5 frames to disk
                    if (i < 5) {
                        saveFrame(frameRGB, codecCtx->width, codecCtx->height, i);
                        ++i;
                    }
                    else {
                        break;
                    }
                }
            }
            av_packet_unref(&packet);
        }
    }
    catch (...) {
    }

    av_free(buffer);
    av_frame_free(&frameRGB);
    av_frame_free(&frame);
    avcodec_free_context(&codecCtx);
    avformat_close_input(&formatCtx);
}

Outputting to the Screen

Goal: load a video file and display it in a window via SDL.

The project's linker subsystem must be set to windowed: /SUBSYSTEM:WINDOWS.
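
Note that this uses the SDL 1.2 API (SDL_SetVideoMode, SDL_Overlay, SDL_CreateYUVOverlay); SDL 2 removed these functions.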

#include <iostream>
#include <fstream>
#include <SDL/SDL.h>
#include <SDL/SDL_thread.h>

extern "C" {
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#include <libswscale/swscale.h>
#include <libavutil/imgutils.h>
}

// Outputting to the Screen
void testffmpeg2()
{
    // Initialize SDL
    if (SDL_Init(SDL_INIT_VIDEO | SDL_INIT_AUDIO | SDL_INIT_TIMER) < 0) {
        std::cout << "error: Could not initialize SDL - " << SDL_GetError() << std::endl;
        return;
    }

    int ret(0);
    AVFormatContext *formatCtx(nullptr);
    AVCodecContext *codecCtx(nullptr);
    AVFrame *frame(nullptr);
    SDL_Overlay *sdlOverlay(nullptr);

    try {
        // Open the input file
        std::string filename = "D:/Pictures/新建文件夹/11/0012.mp4";
        ret = avformat_open_input(&formatCtx, filename.c_str(), nullptr, nullptr);
        if (ret < 0) {
            std::cout << "error: avformat_open_input() failed" << std::endl;
            throw -1;
        }

        // Retrieve stream information
        ret = avformat_find_stream_info(formatCtx, nullptr);
        if (ret < 0) {
            std::cout << "error: avformat_find_stream_info() failed" << std::endl;
            throw -1;
        }

        // Find the video stream
        int videoStream(-1);
        for (uint32_t i = 0; i < formatCtx->nb_streams; ++i) {
            if (formatCtx->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_VIDEO) {
                videoStream = i;
                break;
            }
        }

        if (videoStream == -1) {
            std::cout << "error: Didn't find a video stream";
            throw -1;
        }

        // Find a decoder for the video stream
        AVCodec *codec = avcodec_find_decoder(formatCtx->streams[videoStream]->codecpar->codec_id);
        if (!codec) {
            std::cout << "error: Unsupported codec" << std::endl;
            throw -1;
        }

        // Allocate a decoder context
        codecCtx = avcodec_alloc_context3(codec);
        if (!codecCtx) {
            std::cout << "error: avcodec_alloc_context3() failed" << std::endl;
            throw -1;
        }

        ret = avcodec_parameters_to_context(codecCtx, formatCtx->streams[videoStream]->codecpar);
        if (ret < 0) {
            std::cout << "error: avcodec_parameters_to_context() failed" << std::endl;
            throw -1;
        }

        // Initialize the decoder context
        ret = avcodec_open2(codecCtx, codec, nullptr);
        if (ret < 0) {
            std::cout << "error: avcodec_open2() failed" << std::endl;
            throw -1;
        }

        // Create the SDL surface (window)
        SDL_Surface *screen = SDL_SetVideoMode(codecCtx->width, codecCtx->height, 0, 0);
        if (!screen) {
            std::cout << "error: SDL: could not set video mode" << std::endl;
            throw -1;
        }

        // YV12 SDL_Overlay
        sdlOverlay = SDL_CreateYUVOverlay(codecCtx->width, codecCtx->height, SDL_YV12_OVERLAY, screen);

        // Converter from the decoded frame's pixel format to YUV420P for display
        SwsContext *swsCtx = sws_getContext(codecCtx->width, codecCtx->height, codecCtx->pix_fmt,
            codecCtx->width, codecCtx->height, AV_PIX_FMT_YUV420P, SWS_BILINEAR, nullptr, nullptr, nullptr);

        // Decoded video frame
        frame = av_frame_alloc();
        if (!frame) {
            std::cout << "error: av_frame_alloc() failed" << std::endl;
            throw -1;
        }

        // Read packets from the file
        AVPacket packet;
        while (av_read_frame(formatCtx, &packet) >= 0) {
            if (packet.stream_index == videoStream) {
                // Decode the video packet
                ret = avcodec_send_packet(codecCtx, &packet);
                if (ret < 0) {
                    std::cout << "error: avcodec_send_packet() failed" << std::endl;
                    throw -1;
                }
                ret = avcodec_receive_frame(codecCtx, frame);
                if (ret == 0) {
                    SDL_LockYUVOverlay(sdlOverlay);

                    AVFrame frameDst;
                    // YV12 and YUV420P differ only in the order of the U and V planes
                    // https://blog.csdn.net/dss875914213/article/details/120836765
                    frameDst.data[0] = sdlOverlay->pixels[0];
                    frameDst.data[1] = sdlOverlay->pixels[2];
                    frameDst.data[2] = sdlOverlay->pixels[1];
                    frameDst.linesize[0] = sdlOverlay->pitches[0];
                    frameDst.linesize[1] = sdlOverlay->pitches[2];
                    frameDst.linesize[2] = sdlOverlay->pitches[1];

                    // Got a decoded frame
                    // Convert it to YV12
                    sws_scale(swsCtx, frame->data, frame->linesize, 0, codecCtx->height,
                        frameDst.data, frameDst.linesize);
                    av_frame_unref(frame);

                    SDL_UnlockYUVOverlay(sdlOverlay);

                    SDL_Rect rect = { 0, 0, (Uint16)codecCtx->width, (Uint16)codecCtx->height };
                    SDL_DisplayYUVOverlay(sdlOverlay, &rect);
                }
            }
            av_packet_unref(&packet);

            SDL_Event event;
            SDL_PollEvent(&event);
            switch (event.type) {
            case SDL_QUIT:
                SDL_Quit();
                throw -1;
                break;
            default:
                break;
            }
        }
    }
    catch (...) {
    }

    SDL_FreeYUVOverlay(sdlOverlay);
    av_frame_free(&frame);
    avcodec_free_context(&codecCtx);
    avformat_close_input(&formatCtx);
}

Playing Audio

Goal: extend the previous program to also play the audio.
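
The main loop pushes audio packets into a shared queue; SDL's audio thread pulls them out in audioCallback(), and audioDecodeFrame() decodes and resamples them into the buffer that SDL plays from.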

#include <cstring>
#include <mutex>
#include <condition_variable>
#include <queue>

extern "C" {
#include <libswresample/swresample.h>
#include <libavutil/channel_layout.h>
}

#define MAX_AUDIO_FRAME_SIZE 192000

std::mutex mutex;
std::condition_variable waitCondition;
std::queue<AVPacket> packets;

void audioCallback(void *userdata, Uint8 *stream, int len);

// Playing Sound
void testffmpeg3()
{
    // Initialize SDL
    if (SDL_Init(SDL_INIT_VIDEO | SDL_INIT_AUDIO | SDL_INIT_TIMER) < 0) {
        std::cout << "error: Could not initialize SDL - " << SDL_GetError() << std::endl;
        return;
    }

    int ret(0);
    AVFormatContext *formatCtx(nullptr);
    AVCodecContext *videoCodecCtx(nullptr);
    AVCodecContext *audioCodecCtx(nullptr);
    AVFrame *frame(nullptr);
    SDL_Overlay *sdlOverlay(nullptr);

    try {
        // Open the input file
        std::string filename = "D:/Pictures/新建文件夹/11/0012.mp4";
        ret = avformat_open_input(&formatCtx, filename.c_str(), nullptr, nullptr);
        if (ret < 0) {
            std::cout << "error: avformat_open_input() failed" << std::endl;
            throw - 1;
        }

        // Retrieve stream information
        ret = avformat_find_stream_info(formatCtx, nullptr);
        if (ret < 0) {
            std::cout << "error: avformat_find_stream_info() failed" << std::endl;
            throw - 1;
        }

        // Find the video and audio streams
        int videoStream(-1);
        int audioStream(-1);
        for (uint32_t i = 0; i < formatCtx->nb_streams; ++i) {
            if (formatCtx->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_VIDEO && videoStream < 0) {
                videoStream = i;
            }
            else if (formatCtx->streams[i]->codecpar->codec_type == AVMEDIA_TYPE_AUDIO && audioStream < 0) {
                audioStream = i;
            }
        }

        if (videoStream == -1) {
            std::cout << "error: Didn't find a video stream";
            throw - 1;
        }

        if (audioStream == -1) {
            std::cout << "error: Didn't find a audio stream";
            throw - 1;
        }

        //////////////////////////////////////////////////////////////////////////
        // Find a decoder for the video stream
        AVCodec *videoCodec = avcodec_find_decoder(formatCtx->streams[videoStream]->codecpar->codec_id);
        if (!videoCodec) {
            std::cout << "error: Unsupported codec" << std::endl;
            throw - 1;
        }

        // Allocate the video decoder context
        videoCodecCtx = avcodec_alloc_context3(videoCodec);
        if (!videoCodecCtx) {
            std::cout << "error: avcodec_alloc_context3() failed" << std::endl;
            throw - 1;
        }

        ret = avcodec_parameters_to_context(videoCodecCtx, formatCtx->streams[videoStream]->codecpar);
        if (ret < 0) {
            std::cout << "error: avcodec_parameters_to_context() failed" << std::endl;
            throw - 1;
        }

        // Initialize the video decoder context
        ret = avcodec_open2(videoCodecCtx, videoCodec, nullptr);
        if (ret < 0) {
            std::cout << "error: avcodec_open2() failed" << std::endl;
            throw - 1;
        }
        //////////////////////////////////////////////////////////////////////////

        //////////////////////////////////////////////////////////////////////////
        // Find a decoder for the audio stream
        AVCodec *audioCodec = avcodec_find_decoder(formatCtx->streams[audioStream]->codecpar->codec_id);
        if (!audioCodec) {
            std::cout << "error: Unsupported codec" << std::endl;
            throw - 1;
        }

        // Allocate the audio decoder context
        audioCodecCtx = avcodec_alloc_context3(audioCodec);
        if (!audioCodecCtx) {
            std::cout << "error: avcodec_alloc_context3() failed" << std::endl;
            throw - 1;
        }

        ret = avcodec_parameters_to_context(audioCodecCtx, formatCtx->streams[audioStream]->codecpar);
        if (ret < 0) {
            std::cout << "error: avcodec_parameters_to_context() failed" << std::endl;
            throw - 1;
        }

        // Initialize the audio decoder context
        ret = avcodec_open2(audioCodecCtx, audioCodec, nullptr);
        if (ret < 0) {
            std::cout << "error: avcodec_open2() failed" << std::endl;
            throw - 1;
        }
        //////////////////////////////////////////////////////////////////////////

        // Create the SDL surface (window)
        SDL_Surface *screen = SDL_SetVideoMode(videoCodecCtx->width, videoCodecCtx->height, 0, 0);
        if (!screen) {
            std::cout << "error: SDL: could not set video mode" << std::endl;
            throw - 1;
        }

        // YV12 SDL_Overlay
        sdlOverlay = SDL_CreateYUVOverlay(videoCodecCtx->width, videoCodecCtx->height, SDL_YV12_OVERLAY, screen);

        // Initialize SDL audio
        SDL_AudioSpec audioSpec;
        audioSpec.freq = audioCodecCtx->sample_rate;
        audioSpec.format = AUDIO_S16SYS; 
        audioSpec.channels = audioCodecCtx->channels;
        audioSpec.silence = 0;
        audioSpec.samples = 1024;
        audioSpec.callback = audioCallback;
        audioSpec.userdata = audioCodecCtx;
        if (SDL_OpenAudio(&audioSpec, nullptr) != 0) {
            std::cout << "error: SDL_OpenAudio() failed, " << SDL_GetError() << std::endl;
            throw - 1;
        }

        SDL_PauseAudio(0);

        // Converter from the decoded frame's pixel format to YUV420P for display
        SwsContext *swsCtx = sws_getContext(videoCodecCtx->width, videoCodecCtx->height, videoCodecCtx->pix_fmt,
            videoCodecCtx->width, videoCodecCtx->height, AV_PIX_FMT_YUV420P, SWS_BILINEAR, nullptr, nullptr, nullptr);

        // Decoded video frame
        frame = av_frame_alloc();
        if (!frame) {
            std::cout << "error: av_frame_alloc() failed" << std::endl;
            throw - 1;
        }

        // Read packets from the file
        AVPacket packet;
        while (av_read_frame(formatCtx, &packet) >= 0) {
            if (packet.stream_index == videoStream) {
                // Decode the video packet
                ret = avcodec_send_packet(videoCodecCtx, &packet);
                if (ret < 0) {
                    std::cout << "error: avcodec_send_packet() failed" << std::endl;
                    throw - 1;
                }
                ret = avcodec_receive_frame(videoCodecCtx, frame);
                if (ret == 0) {
                    SDL_LockYUVOverlay(sdlOverlay);

                    AVFrame frameDst;
                    // YV12 and YUV420P differ only in the order of the U and V planes
                    // https://blog.csdn.net/dss875914213/article/details/120836765
                    frameDst.data[0] = sdlOverlay->pixels[0];
                    frameDst.data[1] = sdlOverlay->pixels[2];
                    frameDst.data[2] = sdlOverlay->pixels[1];
                    frameDst.linesize[0] = sdlOverlay->pitches[0];
                    frameDst.linesize[1] = sdlOverlay->pitches[2];
                    frameDst.linesize[2] = sdlOverlay->pitches[1];

                    // Got a decoded frame
                    // Convert it to YV12
                    sws_scale(swsCtx, frame->data, frame->linesize, 0, videoCodecCtx->height,
                        frameDst.data, frameDst.linesize);
                    av_frame_unref(frame);

                    SDL_UnlockYUVOverlay(sdlOverlay);

                    SDL_Rect rect = { 0, 0, (Uint16)videoCodecCtx->width, (Uint16)videoCodecCtx->height };
                    SDL_DisplayYUVOverlay(sdlOverlay, &rect);

                }
            }
            else if (packet.stream_index == audioStream) {
                AVPacket copyPacket;
                av_packet_ref(&copyPacket, &packet);
                std::unique_lock<std::mutex> lock(mutex);
                packets.push(copyPacket);
                waitCondition.notify_all();
            }

            av_packet_unref(&packet);

            SDL_Event event;
            SDL_PollEvent(&event);
            switch (event.type) {
            case SDL_QUIT:
                SDL_Quit();
                throw - 1;
                break;
            default:
                break;
            }

            //std::this_thread::sleep_for(std::chrono::milliseconds(5));
        }
    }
    catch (...) {
    }

    SDL_FreeYUVOverlay(sdlOverlay);
    SDL_Quit();
    av_frame_free(&frame);
    avcodec_free_context(&audioCodecCtx);
    avcodec_free_context(&videoCodecCtx);
    avformat_close_input(&formatCtx);
}

// Pop one audio packet from the queue, decode it, and resample it into audioBuf
// as interleaved S16 with the source's channel count and sample rate (matching
// the SDL_AudioSpec above). Returns the number of bytes written, or -1 on failure.
int audioDecodeFrame(AVCodecContext *codecCtx, uint8_t *audioBuf, int bufSize)
{
    int dataSize(-1);

    std::unique_lock<std::mutex> lock(mutex);
    while (packets.empty()) {
        waitCondition.wait(lock);
    }

    AVPacket packet = packets.front();
    packets.pop();
    lock.unlock();

    avcodec_send_packet(codecCtx, &packet);
    av_packet_unref(&packet);

    // Only the sample format changes (to S16); channel layout and sample rate stay the same
    int64_t layout = codecCtx->channel_layout ? codecCtx->channel_layout
                                              : av_get_default_channel_layout(codecCtx->channels);
    SwrContext *swrContext = swr_alloc_set_opts(nullptr, layout, AV_SAMPLE_FMT_S16, codecCtx->sample_rate,
        layout, codecCtx->sample_fmt, codecCtx->sample_rate, 0, nullptr);
    swr_init(swrContext);

    AVFrame *frame = av_frame_alloc();
    if (avcodec_receive_frame(codecCtx, frame) == 0) {
        int outSamples = swr_convert(swrContext, &audioBuf, frame->nb_samples, (const uint8_t**)frame->data, frame->nb_samples);
        if (outSamples > 0) {
            dataSize = av_samples_get_buffer_size(nullptr, codecCtx->channels, outSamples, AV_SAMPLE_FMT_S16, 1);
        }
    }

    av_frame_free(&frame);
    swr_free(&swrContext);
    return dataSize;
}

// SDL audio callback: fill 'stream' with 'len' bytes of decoded audio,
// topping up a static buffer from audioDecodeFrame() as needed
void audioCallback(void *userdata, Uint8 *stream, int len)
{
    AVCodecContext *codecCtx = (AVCodecContext*)userdata;
    int len1, audioSize;

    static uint8_t audioBuf[(MAX_AUDIO_FRAME_SIZE * 3) / 2];
    static uint32_t audioBufSize = 0; 
    static uint32_t audioBufIndex = 0;

    while (len > 0) {
        if (audioBufIndex >= audioBufSize) {
            audioSize = audioDecodeFrame(codecCtx, audioBuf, sizeof(audioBuf));
            if (audioSize < 0) {
                // if error, output silence
                audioBufSize = 1024;
                std::memset(audioBuf, 0, audioBufSize);
            }
            else {
                audioBufSize = audioSize;
            }
            audioBufIndex = 0;
        }

        len1 = audioBufSize - audioBufIndex;
        if (len1 > len)
            len1 = len;
        std::memcpy(stream, (uint8_t*)audioBuf + audioBufIndex, len1);
        len -= len1;
        stream += len1;
        audioBufIndex += len1;
    }
}

-- To be continued --