开发者
鲲鹏数学库(KML)天元直接法求解器(DSS)开发实践

鲲鹏数学库(KML)天元直接法求解器(DSS)开发实践

HPC

发表于 2026/05/09

0

1. 开发流程总览

使用 KML 直接法求解器的标准流程:

步骤操作
1环境准备与数据结构定义
2创建矩阵与向量对象
3求解器初始化与配置
4执行分析、分解与求解
5结果处理与查询
6资源释放

2. 分步开发指南

2.1 环境准备与数据结构定义

引入头文件:

#include "kml_solver.h"

稀疏矩阵 CSR 格式

示例采用 CSR(压缩稀疏行) 格式存储稀疏矩阵:

int n = 8;                 // 矩阵阶数
int ia[9] = {0, 2, 4, 6, 7, 8, 10, 12, 14};
int ja[14] = {0, 7, 1, 6, 2, 5, 3, 4, 2, 5, 1, 6, 0, 7};
double a[14] = {1.0, 2.0, -2.0, 3.0, 3.0, 4.0, -4.0, 5.0, 4.0, -6.0, 3.0, 7.0, 2.0, 8.0};

字段说明

数组长度说明
ian+1行偏移,第 i 行首个非零元在 ja/a 中的起始索引
jannz列索引,每个非零元的列号
annz非零元数值

右手边向量

int nrhs = 1;              // 右手边项数量
double b[8] = {3.0, 1.0, 7.0, -4.0, 5.0, -2.0, 10.0, 10.0};
double x[8] = {0};         // 解向量缓冲区
实践建议:准备数据前明确矩阵规模(n)、非零元数量(nnz)及右端项个数(nrhs)。

2.2 创建矩阵与向量对象

通过 KmlSolverMatrixStore(描述数据存储)和 KmlSolverMatrixOption(描述矩阵属性),调用 KmlSolverMatrixCreate 封装对象。

创建矩阵 A

KmlSolverMatrixStore storeA;
storeA.indexType = KMLSS_INDEX_INT32;     // 索引类型
storeA.valueType = KMLSS_VALUE_FP64;     // 数值类型
storeA.nRow = n;
storeA.nCol = n;
storeA.format = KMLSS_MATRIX_STORE_CSR;  // 稀疏格式
storeA.csr.rowOffset = ia;
storeA.csr.colIndex = ja;
storeA.csr.value = a;

KmlSolverMatrixOption optA;
optA.fieldMask = KMLSS_MATRIX_OPTION_TYPE;
optA.type = KMLSS_MATRIX_GEN;            // 一般矩阵

KmlSolverMatrix *A;
int ierr = KmlSolverMatrixCreate(&A, &storeA, &optA);
if (ierr != KMLSS_NO_ERROR) {
    printf("ERROR when create A: %d\n", ierr);
    return 1;
}

创建向量 b 和 x

// 向量 b
KmlSolverMatrixStore storeB;
storeB.indexType = KMLSS_INDEX_INT32;
storeB.valueType = KMLSS_VALUE_FP64;
storeB.nRow = n;
storeB.nCol = nrhs;
storeB.format = KMLSS_MATRIX_STORE_DENSE_COL_MAJOR;
storeB.dense.value = b;
storeB.dense.ld = n;                     // 主维度

KmlSolverMatrixOption optB;
optB.fieldMask = KMLSS_MATRIX_OPTION_TYPE;
optB.type = KMLSS_MATRIX_GEN;

KmlSolverMatrix *B;
ierr = KmlSolverMatrixCreate(&B, &storeB, &optB);

// 向量 x(方式相同)
KmlSolverMatrixStore storeX;
storeX.indexType = KMLSS_INDEX_INT32;
storeX.valueType = KMLSS_VALUE_FP64;
storeX.nRow = n;
storeX.nCol = nrhs;
storeX.format = KMLSS_MATRIX_STORE_DENSE_COL_MAJOR;
storeX.dense.value = x;
storeX.dense.ld = n;

KmlSolverMatrixOption optX;
optX.fieldMask = KMLSS_MATRIX_OPTION_TYPE;
optX.type = KMLSS_MATRIX_GEN;

KmlSolverMatrix *X;
ierr = KmlSolverMatrixCreate(&X, &storeX, &optX);

关键参数说明

参数可选值说明
indexTypeKMLSS_INDEX_INT32 / KMLSS_INDEX_INT64索引数据类型
valueTypeKMLSS_VALUE_FP32 / KMLSS_VALUE_FP64数值数据类型
formatKMLSS_MATRIX_STORE_CSR / KMLSS_MATRIX_STORE_DENSE_COL_MAJOR存储格式
typeKMLSS_MATRIX_GEN / KMLSS_MATRIX_SPD矩阵类型(一般/对称正定)
实践要点:对称正定矩阵可将 type 设为 KMLSS_MATRIX_SPD 以加速求解。

2.3 求解器初始化与配置

KmlDssInitOption opt;
opt.fieldMask = KMLDSS_INIT_OPTION_BWR_MODE | KMLDSS_INIT_OPTION_NTHREADS;
opt.bwrMode = KMLDSS_BWR_OFF;     // 关闭二进制可重现
opt.nThreads = 32;                // 并行线程数

KmlDssSolver *solver;
ierr = KmlDssInit(&solver, &opt);
if (ierr != KMLSS_NO_ERROR) {
    printf("ERROR in KmlDssInit: %d\n", ierr);
    return ierr;
}

初始化选项说明

选项可选值说明
bwrModeKMLDSS_BWR_OFF / KMLDSS_BWR_FIXED_THREADS二进制可重现控制选项
nThreads并行线程数
实践建议:fieldMask 决定哪些配置项生效,通过 位或(\|) 组合nThreads 根据 CPU 核心数与实际情况调整

2.4 分析、分解与求解

阶段一:分析(Analyze)

仅依赖矩阵的稀疏结构。

KmlDssAnalyzeOption optAnalyze;
optAnalyze.fieldMask = KMLDSS_ANALYZE_OPTION_MATCHING_TYPE |
                       KMLDSS_ANALYZE_OPTION_RDR_TYPE |
                       KMLDSS_ANALYZE_OPTION_NTHREADS_RDR;
optAnalyze.matchingType = KMLDSS_MATCHING_OFF;
optAnalyze.rdrType = KMLDSS_RDR_KRDR;       // 重排序算法
optAnalyze.nThreadsRdr = 8;                 // 重排序并行度

ierr = KmlDssAnalyze(solver, A, &optAnalyze);

分析选项说明

选项可选值说明
matchingTypeKMLDSS_MATCHING_OFF / KMLDSS_MATCHING_MC64 / KMLDSS_MATCHING_SMC64匹配策略,可改善数值稳定性
rdrTypeKMLDSS_RDR_KRDR / KMLDSS_RDR_CUSTOM重排序算法,影响 fill-in 和分解性能
nThreadsRdr重排序阶段的并行线程数

阶段二:分解(Factorize)

执行数值分解,依赖分析阶段的结果。

KmlDssFactorizeOption optFact;
optFact.fieldMask = KMLDSS_FACTORIZE_OPTION_PERTURBATION_THRESHOLD;
optFact.perturbationThreshold = 1e-8;       // 扰动阈值

ierr = KmlDssFactorize(solver, A, &optFact);

分析选项说明

选项可选值说明
perturbationThreshold扰动阈值,若对角线存在零元可提高数值稳定性

阶段三:求解(Solve)

使用分解结果前代/回代,多右端项时可重复调用。

KmlDssSolveOption optSolve;
optSolve.fieldMask = KMLDSS_SOLVE_OPTION_SOLVE_STAGE | KMLDSS_SOLVE_OPTION_REFINE_METHOD;
optSolve.stage = KMLDSS_SOLVE_ALL;          // 完整求解
optSolve.refineMethod = KMLDSS_REFINE_OFF;  // 关闭迭代精化

ierr = KmlDssSolve(solver, B, X, &optSolve);

求解选项说明

选项可选值说明
stageKMLDSS_SOLVE_FORWARD / KMLDSS_SOLVE_BACKWARD / KMLDSS_SOLVE_ALL求解阶段
refineMethodKMLDSS_REFINE_OFF / KMLDSS_REFINE_CLASSICAL迭代修正方法
调优关键:rdrType 直接影响 fill-in,也可尝试选择开源重排算法如perturbationThreshold 是解决数值稳定性问题的首选参数多右端项时,仅重复调用 KmlDssSolve,复用分解结果

2.5 结果输出与信息查询

输出解向量

printf("Result of first factorize and solve:\n");
for (int i = 0; i < n; i++) {
    printf("%lf ", x[i]);
}
printf("\n");

运行结果

Result of first factorize and solve:
1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 

查询求解器信息

KmlDssInfo info;
info.fieldMask = KMLDSS_INFO_PEAK_MEM;      // 查询峰值内存
ierr = KmlDssQuery(solver, &info);
printf("Peak memory is %ld Byte\n", info.peakMem);

可查询信息类型

选项说明
peakMem峰值内存使用量(字节)

2.6 资源释放

按"先求解器,后矩阵/向量"的顺序释放:

KmlDssClean(&solver);
KmlSolverMatrixDestroy(&A);
KmlSolverMatrixDestroy(&B);
KmlSolverMatrixDestroy(&X);

3. 进阶实践

3.1 数值复用(结构不变、数值变)

示例演示了高效复用流程——跳过初始化与分析,仅重新分解和求解:

// 更新矩阵 A 的数值
double a1[14] = {2.0, 3.0, -3.0, 4.0, 4.0, 5.0, -5.0, 6.0,
                 5.0, -7.0, 4.0, 8.0, 3.0, 9.0};
KmlSolverMatrixSetValue(A, a1);

// 仅重新分解
ierr = KmlDssFactorize(solver, A, &optFact);

// 更新右端项 B 的数值并求解
double b1[8] = {1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0};
KmlSolverMatrixSetValue(B, b1);
ierr = KmlDssSolve(solver, B, X, &optSolve);

运行结果

Result of second factorize and solve:
0.666667 -0.100000 0.226415 -0.200000 0.166667 0.018868 0.175000 -0.111111 
关键技巧:当求解一系列非零模式相同、数值不同的系统时,该机制可大幅节省分析阶段耗时。

3.2 多右端项求解

对于多个右端项(nrhs > 1),只需:

  1. 创建包含多列的稠密矩阵 B 和 X
  2. 首次调用 KmlDssSolve 时设置 stage = KMLDSS_SOLVE_ALL
  3. 分解结果自动复用

4. 错误排除

错误现象可能原因检查点
ERROR when create ACSR 数组配置错误ia[0] 是否为 0;ia[n] 是否等于 nnz;列索引是否越界
ERROR in KmlDssAnalyze矩阵结构异常是否有全零行/列;索引是否升序排列
ERROR in KmlDssFactorize矩阵数值问题或内存不足矩阵是否奇异;调整 perturbationThreshold;检查可用内存
计算结果不正确输入数据问题CSR 列索引是否行内升序;索引基是否为 0-based;向量 ld 设置是否正确

错误处理模板

ierr = KmlDssAnalyze(solver, A, &optAnalyze);
if (ierr != KMLSS_NO_ERROR) {
    printf("ERROR in KmlDssAnalyze: %d\n", ierr);
    // 根据错误码执行相应处理
    return ierr;
}

常见错误码

错误码含义
KMLSS_NO_ERROR成功
KMLSS_INTERNAL_ERROR内部错误
KMLSS_NULL_ARGUMENT参数无效

5. 完整代码及编译运行方式

完整代码

#include <stdio.h>
#include <stdlib.h>
#include "kml_solver.h"

int main()
{
    int ierr;
    int n = 8;
    int nrhs = 1;
    // Create matrix A
    int ia[9] = {0, 2, 4, 6, 7, 8, 10, 12, 14};
    int ja[14] = {0, 7, 1, 6, 2, 5, 3, 4, 2, 5, 1, 6, 0, 7};
    double a[14] = {1.0, 2.0, -2.0, 3.0, 3.0, 4.0, -4.0, 5.0, 4.0, -6.0, 3.0, 7.0, 2.0, 8.0};
    KmlSolverMatrixStore storeA;
    storeA.indexType = KMLSS_INDEX_INT32;
    storeA.valueType = KMLSS_VALUE_FP64;
    storeA.nRow = n;
    storeA.nCol = n;
    storeA.format = KMLSS_MATRIX_STORE_CSR;
    storeA.csr.rowOffset = ia;
    storeA.csr.colIndex = ja;
    storeA.csr.value = a;
    KmlSolverMatrixOption optA;
    optA.fieldMask = KMLSS_MATRIX_OPTION_TYPE;
    optA.type = KMLSS_MATRIX_GEN;
    KmlSolverMatrix *A;
    ierr = KmlSolverMatrixCreate(&A, &storeA, &optA);
    if (ierr != KMLSS_NO_ERROR) {
        printf("ERROR when create A: %d\n", ierr);
        return 1;
    }
    // Create vector b
    double b[8] = {3.0, 1.0, 7.0, -4.0, 5.0, -2.0, 10.0, 10.0};
    KmlSolverMatrixStore storeB;
    storeB.indexType = KMLSS_INDEX_INT32;
    storeB.valueType = KMLSS_VALUE_FP64;
    storeB.nRow = n;
    storeB.nCol = nrhs;
    storeB.format = KMLSS_MATRIX_STORE_DENSE_COL_MAJOR;
    storeB.dense.value = b;
    storeB.dense.ld = n;
    KmlSolverMatrixOption optB;
    optB.fieldMask = KMLSS_MATRIX_OPTION_TYPE;
    optB.type = KMLSS_MATRIX_GEN;
    KmlSolverMatrix *B;
    ierr = KmlSolverMatrixCreate(&B, &storeB, &optB);
    if (ierr != KMLSS_NO_ERROR) {
        printf("ERROR when create b: %d\n", ierr);
        return 1;
    }
    // Create vector x
    double x[8] = {0};
    KmlSolverMatrixStore storeX;
    storeX.indexType = KMLSS_INDEX_INT32;
    storeX.valueType = KMLSS_VALUE_FP64;
    storeX.nRow = n;
    storeX.nCol = nrhs;
    storeX.format = KMLSS_MATRIX_STORE_DENSE_COL_MAJOR;
    storeX.dense.value = x;
    storeX.dense.ld = n;
    KmlSolverMatrixOption optX;
    optX.fieldMask = KMLSS_MATRIX_OPTION_TYPE;
    optX.type = KMLSS_MATRIX_GEN;
    KmlSolverMatrix *X;
    ierr = KmlSolverMatrixCreate(&X, &storeX, &optX);
    if (ierr != KMLSS_NO_ERROR) {
        printf("ERROR when create x: %d\n", ierr);
        return 1;
    }
    // Init solver
    KmlDssInitOption opt;
    opt.fieldMask = KMLDSS_INIT_OPTION_BWR_MODE | KMLDSS_INIT_OPTION_NTHREADS;
    opt.bwrMode = KMLDSS_BWR_OFF;
    opt.nThreads = 32;
    KmlDssSolver *solver;
    ierr = KmlDssInit(&solver, &opt);
    if (ierr != KMLSS_NO_ERROR) {
        printf("ERROR in KmlDssInit: %d\n", ierr);
        return ierr;
    }
    // Analyze
    KmlDssAnalyzeOption optAnalyze;
    optAnalyze.fieldMask = KMLDSS_ANALYZE_OPTION_MATCHING_TYPE |
                        KMLDSS_ANALYZE_OPTION_RDR_TYPE |
                        KMLDSS_ANALYZE_OPTION_NTHREADS_RDR;
    optAnalyze.matchingType = KMLDSS_MATCHING_OFF;
    optAnalyze.rdrType = KMLDSS_RDR_KRDR;
    optAnalyze.nThreadsRdr = 8;
    ierr = KmlDssAnalyze(solver, A, &optAnalyze);
    if (ierr != KMLSS_NO_ERROR) {
        printf("ERROR in KmlDssAnalyze: %d\n", ierr);
        return ierr;
    }
    // Factorize
    KmlDssFactorizeOption optFact;
    optFact.fieldMask = KMLDSS_FACTORIZE_OPTION_PERTURBATION_THRESHOLD;
    optFact.perturbationThreshold = 1e-8;
    ierr = KmlDssFactorize(solver, A, &optFact);
    if (ierr != KMLSS_NO_ERROR) {
        printf("ERROR in KmlDssFactorize: %d\n", ierr);
        return ierr;
    }
    // Solve
    KmlDssSolveOption optSolve;
    optSolve.fieldMask = KMLDSS_SOLVE_OPTION_SOLVE_STAGE | KMLDSS_SOLVE_OPTION_REFINE_METHOD;
    optSolve.stage = KMLDSS_SOLVE_ALL;
    optSolve.refineMethod = KMLDSS_REFINE_OFF;
    ierr = KmlDssSolve(solver, B, X, &optSolve);
    if (ierr != KMLSS_NO_ERROR) {
        printf("ERROR in KmlDssSolve: %d\n", ierr);
        return ierr;
    }
    // Output result x
    printf("Result of first factorize and solve:\n");
    for (int i = 0; i < n; i++) {
        printf("%lf ", x[i]);
    }
    printf("\n");
    // Set new values of A
    double a1[14] = {2.0, 3.0, -3.0, 4.0, 4.0, 5.0, -5.0, 6.0, 5.0, -7.0, 4.0, 8.0, 3.0, 9.0};
    KmlSolverMatrixSetValue(A, a1);
    // Factorize with new values
    ierr = KmlDssFactorize(solver, A, &optFact);
    if (ierr != KMLSS_NO_ERROR) {
        printf("ERROR in KmlDssFactorize: %d\n", ierr);
        return ierr;
    }
    // Set new values of B
    double b1[8] = {1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0};
    KmlSolverMatrixSetValue(B, b1);
    // Solve with new values
    ierr = KmlDssSolve(solver, B, X, &optSolve);
    if (ierr != KMLSS_NO_ERROR) {
        printf("ERROR in KmlDssSolve: %d\n", ierr);
        return ierr;
    }
    // Output new result x
    printf("Result of second factorize and solve:\n");
    for (int i = 0; i < n; i++) {
        printf("%lf ", x[i]);
    }
    printf("\n");
    // Query
    KmlDssInfo info;
    info.fieldMask = KMLDSS_INFO_PEAK_MEM;
    ierr = KmlDssQuery(solver, &info);
    if (ierr != KMLSS_NO_ERROR) {
        printf("ERROR in KmlDssQuery: %d\n", ierr);
        return ierr;
    }
    printf("Peak memory is %ld Byte\n", info.peakMem);
    // Destroy
    KmlDssClean(&solver);
    KmlSolverMatrixDestroy(&A);
    KmlSolverMatrixDestroy(&B);
    KmlSolverMatrixDestroy(&X);
    return 0;
}

编译运行

# 编译
gcc demo.c -lksolver -lklapack -lkblas -o tydss_demo
# 运行
./tydss_demo


本页内容