Introduction to KNewPfordelta
Latest Updates
- [2025.09.30]: KNewPfordelta is released on the GitCode platform. It supports block-based processing, exception handling, and SIMD acceleration.
Project Introduction
KNewPfordelta (an improved PForDelta algorithm) is an integer compression algorithm that Kunpeng has optimized from the open-source PForDelta algorithm. It is designed for efficient compression and fast decompression of inverted indexes. It is widely used in search engines, recommendation systems, and other scenarios that require fast processing of large-scale ordered integer sequences, such as document ID lists, term frequencies, and term positions. KNewPfordelta combines block-based processing, exception handling, and SIMD acceleration to balance storage cost against query performance in inverted index compression. It is especially suitable for large-scale data systems with strict real-time requirements, and its decompression efficiency makes it a common choice for high-performance systems such as search engines and recommendation engines.
The KNewPfordelta decompression algorithm for inverted indexes is applicable to the recall phase of recommendation systems.
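To make the terms block-based processing and exception handling more concrete, the following minimal C sketch illustrates the general PForDelta idea: a sorted ID list is turned into gaps, and a bit width is chosen so that only a few large gaps become exceptions. This is not KNewPfordelta code. The function names (`delta_encode`, `delta_decode`, `bits_needed`, `choose_width`) and the exception budget parameter are hypothetical, and the library may select widths and store exceptions differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Replace sorted IDs with the gaps between neighbours (first gap is relative to 0). */
void delta_encode(const uint32_t *ids, uint32_t *gaps, size_t n)
{
    uint32_t prev = 0;
    for (size_t i = 0; i < n; i++) {
        assert(ids[i] >= prev);          /* input must be sorted ascending */
        gaps[i] = ids[i] - prev;
        prev = ids[i];
    }
}

/* Inverse of delta_encode: a running sum restores the original IDs. */
void delta_decode(const uint32_t *gaps, uint32_t *ids, size_t n)
{
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += gaps[i];
        ids[i] = acc;
    }
}

/* Number of bits needed to represent v (0 for v == 0). */
unsigned bits_needed(uint32_t v)
{
    unsigned b = 0;
    while (v != 0) {
        b++;
        v >>= 1;
    }
    return b;
}

/*
 * Pick the smallest width b such that at most max_exceptions gaps need
 * more than b bits. Those gaps would be stored separately as exceptions.
 */
unsigned choose_width(const uint32_t *gaps, size_t n,
                      size_t max_exceptions, size_t *num_exceptions)
{
    size_t hist[33] = {0};               /* hist[k] = gaps needing exactly k bits */
    for (size_t i = 0; i < n; i++)
        hist[bits_needed(gaps[i])]++;

    size_t over = 0;                     /* gaps needing more than b bits */
    unsigned b = 32;
    while (b > 0 && over + hist[b] <= max_exceptions) {
        over += hist[b];
        b--;
    }
    *num_exceptions = over;
    return b;
}
```

For the IDs 5, 8, 9, 12, 1000012, 1000013 the gaps are 5, 3, 1, 3, 1000000, 1. With a budget of one exception, `choose_width` returns 3 bits and reports one exception (the gap 1000000). With a budget of zero it returns 20 bits, which shows why treating rare large gaps as exceptions saves space.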
Directory Structure
The directory structure of KNewPfordelta is as follows:
knewpfordelta
├─ LICENSE
├─ Makefile // Compilation file
├─ README.md
├─ coding_policy.c // Operator implementation file, which implements the compression and decompression functionalities of KNewPfordelta.
├─ coding_policy.h // Operator header file defining the APIs for KNewPfordelta compression and decompression functions
├─ coding_policy_helper.h // Helper header file for coding policies
├─ howtouse.c // Sample program demonstrating how to use the KNewPfordelta library
├─ pack.c // Bit-packing file that packs integer data using a specified bit width
├─ pack.h // Bit-packing header file that defines the bit-packing API
├─ pfordelta.c // PForDelta algorithm file that implements the compression and decompression functionalities of PForDelta
├─ pfordelta.h // PForDelta algorithm header file that defines the compression and decompression APIs of PForDelta
├─ unpack.c // Bit-unpacking file that unpacks integer data packed using a specified bit width
├─ unpack.h // Bit-unpacking header file that defines the bit-unpacking API
├─ test/
- coding_policy.h: header file containing interface declarations for KNewPfordelta
- Makefile: compilation file for functional and performance testing
- test: test file directory
- gencover.sh: file for generating visualized code coverage reports
- gen_data: directory for generating inverted index test datasets
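The `pack.c` and `unpack.c` entries above describe bit packing with a specified bit width. As a rough illustration of that technique only, the scalar C sketch below stores the low `width` bits of each value back to back in 32-bit words and reads them out again. The names `sketch_packed_words`, `sketch_pack`, and `sketch_unpack` are hypothetical. The actual API is defined in `pack.h` and `unpack.h` and may use a different signature or a SIMD-friendly layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of 32-bit words needed to hold n values of `width` bits each. */
size_t sketch_packed_words(size_t n, unsigned width)
{
    return (n * width + 31) / 32;
}

/* Pack the low `width` bits of each input value contiguously into out.
 * out must provide sketch_packed_words(n, width) words. */
void sketch_pack(const uint32_t *in, size_t n, unsigned width, uint32_t *out)
{
    assert(width <= 32);
    memset(out, 0, sketch_packed_words(n, width) * sizeof(uint32_t));
    if (width == 0)
        return;                              /* nothing to store */

    uint64_t mask = (1ULL << width) - 1;
    size_t bitpos = 0;                       /* absolute bit offset in out */
    for (size_t i = 0; i < n; i++) {
        size_t w = bitpos / 32;
        unsigned off = (unsigned)(bitpos % 32);
        uint64_t v = ((uint64_t)in[i] & mask) << off;
        out[w] |= (uint32_t)v;
        if (off + width > 32)                /* value straddles two words */
            out[w + 1] |= (uint32_t)(v >> 32);
        bitpos += width;
    }
}

/* Inverse of sketch_pack: extract n values of `width` bits each. */
void sketch_unpack(const uint32_t *in, size_t n, unsigned width, uint32_t *out)
{
    assert(width <= 32);
    if (width == 0) {                        /* all values were zero */
        memset(out, 0, n * sizeof(uint32_t));
        return;
    }

    uint64_t mask = (1ULL << width) - 1;
    size_t bitpos = 0;
    for (size_t i = 0; i < n; i++) {
        size_t w = bitpos / 32;
        unsigned off = (unsigned)(bitpos % 32);
        uint64_t v = in[w] >> off;
        if (off + width > 32)                /* pull the high part from the next word */
            v |= (uint64_t)in[w + 1] << (32 - off);
        out[i] = (uint32_t)(v & mask);
        bitpos += width;
    }
}
```

With this layout, eleven 3-bit values occupy 33 bits, so they fit in two 32-bit words instead of eleven. A production implementation would typically unroll one routine per bit width rather than loop generically.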
Release Notes
For details about the version updates of the KNewPfordelta algorithm, see Release Notes.
Compatibility Information
Documents
| Resource Type | Resource Name | Resource Description |
|---|---|---|
| Document | | Provides the basic information and feature updates of each released version of KNewPfordelta. |
| Document | | Provides guidance for deploying the KNewPfordelta decompression algorithm and verifying its decompression performance and functions. |
| Document | | Provides detailed guidance on how to compile and install the KNewPfordelta source code. |
| Document | | Provides definitions and descriptions of KNewPfordelta APIs. |
| Document | | Provides practice cases of KNewPfordelta. |
| Document | | Describes the optimization principles and features of KNewPfordelta. |
Disclaimer
To KNewPfordelta Users
This project is intended solely for debugging and development. You are responsible for any risks and should carefully review the following information:
- Data processing and deletion: Users are responsible for managing and deleting any data generated while using this tool. Users are advised to delete such data promptly after use to prevent information leakage.
- Data confidentiality and transmission: Users understand and agree not to share or transmit any data generated by this tool. Neither the tool nor its developers are responsible for any information leaks, data breaches, or other negative consequences.
- User input security: Users are responsible for the security of any commands they enter and for any risks or losses resulting from improper input. The tool and its developers are not liable for issues caused by incorrect command usage.
Disclaimer scope: This disclaimer applies to all individuals and entities using this tool. By using the tool, you acknowledge and accept this statement and assume all risks and responsibilities arising from its use. If you do not agree, please stop using the tool immediately.
Before using this tool, please read and understand the preceding disclaimer. If you have any questions, contact the developer.
To Data Owners
If you do not want your model or dataset to be mentioned in this project, or if you wish to update its description, please submit an issue on GitCode. We will delete or update your description according to your request. Thank you for your understanding and contribution to this project.
License
KNewPfordelta is licensed under the 3-Clause BSD License, which allows modification and redistribution of derivative works as open source. For details, see LICENSE.
The documents of this project are licensed under CC-BY 4.0. For details, see LICENSE.
Contribution Statement
We welcome your contributions to the community. If you have questions or suggestions, or want to request a feature or report a bug, submit an issue. For details, see Contribution Guideline. You are also welcome to share insights in the Discussions. Thank you for your support.
Acknowledgments
KNewPfordelta is developed by the following Huawei department:
- Kunpeng Computing BoostKit Development Dept
Thank you to everyone in the community for your PRs. We warmly welcome contributions to KNewPfordelta!