鲲鹏数学库(KML)天元直接法求解器(DSS)开发实践
发表于 2026/05/09
0
1. 开发流程总览
使用 KML 直接法求解器的标准流程:
| 步骤 | 操作 |
|---|---|
| 1 | 环境准备与数据结构定义 |
| 2 | 创建矩阵与向量对象 |
| 3 | 求解器初始化与配置 |
| 4 | 执行分析、分解与求解 |
| 5 | 结果处理与查询 |
| 6 | 资源释放 |
2. 分步开发指南
2.1 环境准备与数据结构定义
引入头文件:
#include "kml_solver.h"稀疏矩阵 CSR 格式
示例采用 CSR(压缩稀疏行) 格式存储稀疏矩阵:
int n = 8; // 矩阵阶数
int ia[9] = {0, 2, 4, 6, 7, 8, 10, 12, 14};
int ja[14] = {0, 7, 1, 6, 2, 5, 3, 4, 2, 5, 1, 6, 0, 7};
double a[14] = {1.0, 2.0, -2.0, 3.0, 3.0, 4.0, -4.0, 5.0, 4.0, -6.0, 3.0, 7.0, 2.0, 8.0};
字段说明:
| 数组 | 长度 | 说明 |
|---|---|---|
ia | n+1 | 行偏移,第 i 行首个非零元在 ja/a 中的起始索引 |
ja | nnz | 列索引,每个非零元的列号 |
a | nnz | 非零元数值 |
右手边向量
int nrhs = 1; // 右手边项数量
double b[8] = {3.0, 1.0, 7.0, -4.0, 5.0, -2.0, 10.0, 10.0};
double x[8] = {0}; // 解向量缓冲区实践建议:准备数据前明确矩阵规模(n)、非零元数量(nnz)及右端项个数(nrhs)。
2.2 创建矩阵与向量对象
通过 KmlSolverMatrixStore(描述数据存储)和 KmlSolverMatrixOption(描述矩阵属性),调用 KmlSolverMatrixCreate 封装对象。
创建矩阵 A
KmlSolverMatrixStore storeA;
storeA.indexType = KMLSS_INDEX_INT32; // 索引类型
storeA.valueType = KMLSS_VALUE_FP64; // 数值类型
storeA.nRow = n;
storeA.nCol = n;
storeA.format = KMLSS_MATRIX_STORE_CSR; // 稀疏格式
storeA.csr.rowOffset = ia;
storeA.csr.colIndex = ja;
storeA.csr.value = a;
KmlSolverMatrixOption optA;
optA.fieldMask = KMLSS_MATRIX_OPTION_TYPE;
optA.type = KMLSS_MATRIX_GEN; // 一般矩阵
KmlSolverMatrix *A;
int ierr = KmlSolverMatrixCreate(&A, &storeA, &optA);
if (ierr != KMLSS_NO_ERROR) {
printf("ERROR when create A: %d\n", ierr);
return 1;
}创建向量 b 和 x
// 向量 b
KmlSolverMatrixStore storeB;
storeB.indexType = KMLSS_INDEX_INT32;
storeB.valueType = KMLSS_VALUE_FP64;
storeB.nRow = n;
storeB.nCol = nrhs;
storeB.format = KMLSS_MATRIX_STORE_DENSE_COL_MAJOR;
storeB.dense.value = b;
storeB.dense.ld = n; // 主维度
KmlSolverMatrixOption optB;
optB.fieldMask = KMLSS_MATRIX_OPTION_TYPE;
optB.type = KMLSS_MATRIX_GEN;
KmlSolverMatrix *B;
ierr = KmlSolverMatrixCreate(&B, &storeB, &optB);
// 向量 x(方式相同)
KmlSolverMatrixStore storeX;
storeX.indexType = KMLSS_INDEX_INT32;
storeX.valueType = KMLSS_VALUE_FP64;
storeX.nRow = n;
storeX.nCol = nrhs;
storeX.format = KMLSS_MATRIX_STORE_DENSE_COL_MAJOR;
storeX.dense.value = x;
storeX.dense.ld = n;
KmlSolverMatrixOption optX;
optX.fieldMask = KMLSS_MATRIX_OPTION_TYPE;
optX.type = KMLSS_MATRIX_GEN;
KmlSolverMatrix *X;
ierr = KmlSolverMatrixCreate(&X, &storeX, &optX);关键参数说明:
| 参数 | 可选值 | 说明 |
|---|---|---|
indexType | KMLSS_INDEX_INT32 / KMLSS_INDEX_INT64 | 索引数据类型 |
valueType | KMLSS_VALUE_FP32 / KMLSS_VALUE_FP64 | 数值数据类型 |
format | KMLSS_MATRIX_STORE_CSR / KMLSS_MATRIX_STORE_DENSE_COL_MAJOR | 存储格式 |
type | KMLSS_MATRIX_GEN / KMLSS_MATRIX_SPD | 矩阵类型(一般/对称正定) |
实践要点:对称正定矩阵可将 type 设为 KMLSS_MATRIX_SPD 以加速求解。
2.3 求解器初始化与配置
KmlDssInitOption opt;
opt.fieldMask = KMLDSS_INIT_OPTION_BWR_MODE | KMLDSS_INIT_OPTION_NTHREADS;
opt.bwrMode = KMLDSS_BWR_OFF; // 关闭二进制可重现
opt.nThreads = 32; // 并行线程数
KmlDssSolver *solver;
ierr = KmlDssInit(&solver, &opt);
if (ierr != KMLSS_NO_ERROR) {
printf("ERROR in KmlDssInit: %d\n", ierr);
return ierr;
}
初始化选项说明:
| 选项 | 可选值 | 说明 |
|---|---|---|
bwrMode | KMLDSS_BWR_OFF / KMLDSS_BWR_FIXED_THREADS | 二进制可重现控制选项 |
nThreads | 并行线程数 |
实践建议:fieldMask 决定哪些配置项生效,通过 位或(\|) 组合nThreads 根据 CPU 核心数与实际情况调整
2.4 分析、分解与求解
阶段一:分析(Analyze)
仅依赖矩阵的稀疏结构。
KmlDssAnalyzeOption optAnalyze;
optAnalyze.fieldMask = KMLDSS_ANALYZE_OPTION_MATCHING_TYPE |
KMLDSS_ANALYZE_OPTION_RDR_TYPE |
KMLDSS_ANALYZE_OPTION_NTHREADS_RDR;
optAnalyze.matchingType = KMLDSS_MATCHING_OFF;
optAnalyze.rdrType = KMLDSS_RDR_KRDR; // 重排序算法
optAnalyze.nThreadsRdr = 8; // 重排序并行度
ierr = KmlDssAnalyze(solver, A, &optAnalyze);
分析选项说明:
| 选项 | 可选值 | 说明 |
|---|---|---|
matchingType | KMLDSS_MATCHING_OFF / KMLDSS_MATCHING_MC64 / KMLDSS_MATCHING_SMC64 | 匹配策略,可改善数值稳定性 |
rdrType | KMLDSS_RDR_KRDR / KMLDSS_RDR_CUSTOM | 重排序算法,影响 fill-in 和分解性能 |
nThreadsRdr | 重排序阶段的并行线程数 |
阶段二:分解(Factorize)
执行数值分解,依赖分析阶段的结果。
KmlDssFactorizeOption optFact;
optFact.fieldMask = KMLDSS_FACTORIZE_OPTION_PERTURBATION_THRESHOLD;
optFact.perturbationThreshold = 1e-8; // 扰动阈值
ierr = KmlDssFactorize(solver, A, &optFact);
分析选项说明:
| 选项 | 可选值 | 说明 |
|---|---|---|
perturbationThreshold | 扰动阈值,若对角线存在零元可提高数值稳定性 |
阶段三:求解(Solve)
使用分解结果前代/回代,多右端项时可重复调用。
KmlDssSolveOption optSolve;
optSolve.fieldMask = KMLDSS_SOLVE_OPTION_SOLVE_STAGE | KMLDSS_SOLVE_OPTION_REFINE_METHOD;
optSolve.stage = KMLDSS_SOLVE_ALL; // 完整求解
optSolve.refineMethod = KMLDSS_REFINE_OFF; // 关闭迭代精化
ierr = KmlDssSolve(solver, B, X, &optSolve);求解选项说明:
| 选项 | 可选值 | 说明 |
|---|---|---|
stage | KMLDSS_SOLVE_FORWARD / KMLDSS_SOLVE_BACKWARD / KMLDSS_SOLVE_ALL | 求解阶段 |
refineMethod | KMLDSS_REFINE_OFF / KMLDSS_REFINE_CLASSICAL | 迭代修正方法 |
调优关键:rdrType 直接影响 fill-in,也可尝试选择开源重排算法如perturbationThreshold 是解决数值稳定性问题的首选参数多右端项时,仅重复调用 KmlDssSolve,复用分解结果
2.5 结果输出与信息查询
输出解向量
printf("Result of first factorize and solve:\n");
for (int i = 0; i < n; i++) {
printf("%lf ", x[i]);
}
printf("\n");运行结果:
Result of first factorize and solve:
1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 查询求解器信息
KmlDssInfo info;
info.fieldMask = KMLDSS_INFO_PEAK_MEM; // 查询峰值内存
ierr = KmlDssQuery(solver, &info);
printf("Peak memory is %ld Byte\n", info.peakMem);可查询信息类型:
| 选项 | 说明 |
|---|---|
peakMem | 峰值内存使用量(字节) |
2.6 资源释放
按"先求解器,后矩阵/向量"的顺序释放:
KmlDssClean(&solver);
KmlSolverMatrixDestroy(&A);
KmlSolverMatrixDestroy(&B);
KmlSolverMatrixDestroy(&X);3. 进阶实践
3.1 数值复用(结构不变、数值变)
示例演示了高效复用流程——跳过初始化与分析,仅重新分解和求解:
// 更新矩阵 A 的数值
double a1[14] = {2.0, 3.0, -3.0, 4.0, 4.0, 5.0, -5.0, 6.0,
5.0, -7.0, 4.0, 8.0, 3.0, 9.0};
KmlSolverMatrixSetValue(A, a1);
// 仅重新分解
ierr = KmlDssFactorize(solver, A, &optFact);
// 更新右端项 B 的数值并求解
double b1[8] = {1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0};
KmlSolverMatrixSetValue(B, b1);
ierr = KmlDssSolve(solver, B, X, &optSolve);运行结果:
Result of second factorize and solve:
0.666667 -0.100000 0.226415 -0.200000 0.166667 0.018868 0.175000 -0.111111 关键技巧:当求解一系列非零模式相同、数值不同的系统时,该机制可大幅节省分析阶段耗时。
3.2 多右端项求解
对于多个右端项(nrhs > 1),只需:
- 创建包含多列的稠密矩阵 B 和 X
- 首次调用
KmlDssSolve时设置stage = KMLDSS_SOLVE_ALL - 分解结果自动复用
4. 错误排除
| 错误现象 | 可能原因 | 检查点 |
|---|---|---|
ERROR when create A | CSR 数组配置错误 | ia[0] 是否为 0;ia[n] 是否等于 nnz;列索引是否越界 |
ERROR in KmlDssAnalyze | 矩阵结构异常 | 是否有全零行/列;索引是否升序排列 |
ERROR in KmlDssFactorize | 矩阵数值问题或内存不足 | 矩阵是否奇异;调整 perturbationThreshold;检查可用内存 |
| 计算结果不正确 | 输入数据问题 | CSR 列索引是否行内升序;索引基是否为 0-based;向量 ld 设置是否正确 |
错误处理模板
ierr = KmlDssAnalyze(solver, A, &optAnalyze);
if (ierr != KMLSS_NO_ERROR) {
printf("ERROR in KmlDssAnalyze: %d\n", ierr);
// 根据错误码执行相应处理
return ierr;
}常见错误码
| 错误码 | 含义 |
|---|---|
KMLSS_NO_ERROR | 成功 |
KMLSS_INTERNAL_ERROR | 内部错误 |
KMLSS_NULL_ARGUMENT | 参数无效 |
5. 完整代码及编译运行方式
完整代码
#include <stdio.h>
#include <stdlib.h>
#include "kml_solver.h"
int main()
{
int ierr;
int n = 8;
int nrhs = 1;
// Create matrix A
int ia[9] = {0, 2, 4, 6, 7, 8, 10, 12, 14};
int ja[14] = {0, 7, 1, 6, 2, 5, 3, 4, 2, 5, 1, 6, 0, 7};
double a[14] = {1.0, 2.0, -2.0, 3.0, 3.0, 4.0, -4.0, 5.0, 4.0, -6.0, 3.0, 7.0, 2.0, 8.0};
KmlSolverMatrixStore storeA;
storeA.indexType = KMLSS_INDEX_INT32;
storeA.valueType = KMLSS_VALUE_FP64;
storeA.nRow = n;
storeA.nCol = n;
storeA.format = KMLSS_MATRIX_STORE_CSR;
storeA.csr.rowOffset = ia;
storeA.csr.colIndex = ja;
storeA.csr.value = a;
KmlSolverMatrixOption optA;
optA.fieldMask = KMLSS_MATRIX_OPTION_TYPE;
optA.type = KMLSS_MATRIX_GEN;
KmlSolverMatrix *A;
ierr = KmlSolverMatrixCreate(&A, &storeA, &optA);
if (ierr != KMLSS_NO_ERROR) {
printf("ERROR when create A: %d\n", ierr);
return 1;
}
// Create vector b
double b[8] = {3.0, 1.0, 7.0, -4.0, 5.0, -2.0, 10.0, 10.0};
KmlSolverMatrixStore storeB;
storeB.indexType = KMLSS_INDEX_INT32;
storeB.valueType = KMLSS_VALUE_FP64;
storeB.nRow = n;
storeB.nCol = nrhs;
storeB.format = KMLSS_MATRIX_STORE_DENSE_COL_MAJOR;
storeB.dense.value = b;
storeB.dense.ld = n;
KmlSolverMatrixOption optB;
optB.fieldMask = KMLSS_MATRIX_OPTION_TYPE;
optB.type = KMLSS_MATRIX_GEN;
KmlSolverMatrix *B;
ierr = KmlSolverMatrixCreate(&B, &storeB, &optB);
if (ierr != KMLSS_NO_ERROR) {
printf("ERROR when create b: %d\n", ierr);
return 1;
}
// Create vector x
double x[8] = {0};
KmlSolverMatrixStore storeX;
storeX.indexType = KMLSS_INDEX_INT32;
storeX.valueType = KMLSS_VALUE_FP64;
storeX.nRow = n;
storeX.nCol = nrhs;
storeX.format = KMLSS_MATRIX_STORE_DENSE_COL_MAJOR;
storeX.dense.value = x;
storeX.dense.ld = n;
KmlSolverMatrixOption optX;
optX.fieldMask = KMLSS_MATRIX_OPTION_TYPE;
optX.type = KMLSS_MATRIX_GEN;
KmlSolverMatrix *X;
ierr = KmlSolverMatrixCreate(&X, &storeX, &optX);
if (ierr != KMLSS_NO_ERROR) {
printf("ERROR when create x: %d\n", ierr);
return 1;
}
// Init solver
KmlDssInitOption opt;
opt.fieldMask = KMLDSS_INIT_OPTION_BWR_MODE | KMLDSS_INIT_OPTION_NTHREADS;
opt.bwrMode = KMLDSS_BWR_OFF;
opt.nThreads = 32;
KmlDssSolver *solver;
ierr = KmlDssInit(&solver, &opt);
if (ierr != KMLSS_NO_ERROR) {
printf("ERROR in KmlDssInit: %d\n", ierr);
return ierr;
}
// Analyze
KmlDssAnalyzeOption optAnalyze;
optAnalyze.fieldMask = KMLDSS_ANALYZE_OPTION_MATCHING_TYPE |
KMLDSS_ANALYZE_OPTION_RDR_TYPE |
KMLDSS_ANALYZE_OPTION_NTHREADS_RDR;
optAnalyze.matchingType = KMLDSS_MATCHING_OFF;
optAnalyze.rdrType = KMLDSS_RDR_KRDR;
optAnalyze.nThreadsRdr = 8;
ierr = KmlDssAnalyze(solver, A, &optAnalyze);
if (ierr != KMLSS_NO_ERROR) {
printf("ERROR in KmlDssAnalyze: %d\n", ierr);
return ierr;
}
// Factorize
KmlDssFactorizeOption optFact;
optFact.fieldMask = KMLDSS_FACTORIZE_OPTION_PERTURBATION_THRESHOLD;
optFact.perturbationThreshold = 1e-8;
ierr = KmlDssFactorize(solver, A, &optFact);
if (ierr != KMLSS_NO_ERROR) {
printf("ERROR in KmlDssFactorize: %d\n", ierr);
return ierr;
}
// Solve
KmlDssSolveOption optSolve;
optSolve.fieldMask = KMLDSS_SOLVE_OPTION_SOLVE_STAGE | KMLDSS_SOLVE_OPTION_REFINE_METHOD;
optSolve.stage = KMLDSS_SOLVE_ALL;
optSolve.refineMethod = KMLDSS_REFINE_OFF;
ierr = KmlDssSolve(solver, B, X, &optSolve);
if (ierr != KMLSS_NO_ERROR) {
printf("ERROR in KmlDssSolve: %d\n", ierr);
return ierr;
}
// Output result x
printf("Result of first factorize and solve:\n");
for (int i = 0; i < n; i++) {
printf("%lf ", x[i]);
}
printf("\n");
// Set new values of A
double a1[14] = {2.0, 3.0, -3.0, 4.0, 4.0, 5.0, -5.0, 6.0, 5.0, -7.0, 4.0, 8.0, 3.0, 9.0};
KmlSolverMatrixSetValue(A, a1);
// Factorize with new values
ierr = KmlDssFactorize(solver, A, &optFact);
if (ierr != KMLSS_NO_ERROR) {
printf("ERROR in KmlDssFactorize: %d\n", ierr);
return ierr;
}
// Set new values of B
double b1[8] = {1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0};
KmlSolverMatrixSetValue(B, b1);
// Solve with new values
ierr = KmlDssSolve(solver, B, X, &optSolve);
if (ierr != KMLSS_NO_ERROR) {
printf("ERROR in KmlDssSolve: %d\n", ierr);
return ierr;
}
// Output new result x
printf("Result of second factorize and solve:\n");
for (int i = 0; i < n; i++) {
printf("%lf ", x[i]);
}
printf("\n");
// Query
KmlDssInfo info;
info.fieldMask = KMLDSS_INFO_PEAK_MEM;
ierr = KmlDssQuery(solver, &info);
if (ierr != KMLSS_NO_ERROR) {
printf("ERROR in KmlDssQuery: %d\n", ierr);
return ierr;
}
printf("Peak memory is %ld Byte\n", info.peakMem);
// Destroy
KmlDssClean(&solver);
KmlSolverMatrixDestroy(&A);
KmlSolverMatrixDestroy(&B);
KmlSolverMatrixDestroy(&X);
return 0;
}编译运行
# 编译
gcc demo.c -lksolver -lklapack -lkblas -o tydss_demo
# 运行
./tydss_demo

