System-Level Optimization

jemalloc (Jason Evans malloc) is a high-performance, general-purpose memory allocator. In high-concurrency inference scenarios, running TensorFlow Serving with jemalloc manages memory more efficiently, reduces lock contention, and mitigates fragmentation. The result is lower variance in memory usage and higher throughput and stability for inference requests.

  1. Obtain the jemalloc source archive and decompress it.
    wget https://github.com/jemalloc/jemalloc/archive/refs/tags/5.3.0.tar.gz --no-check-certificate
    tar zxvf 5.3.0.tar.gz
    
  2. Go to the extracted source directory.
    cd jemalloc-5.3.0/
    
  3. Compile and install jemalloc.
    ./autogen.sh
    ./configure
    make -j$(nproc)
    make install
    
  4. Verify the installation.
    ls -l /usr/local/lib/libjemalloc*
    

    The installation is successful if the jemalloc library files, such as libjemalloc.so, are listed.

  5. Enable jemalloc. Set the LD_PRELOAD environment variable to load jemalloc in place of the default allocator, and use the MALLOC_CONF environment variable to configure the allocator's behavior. The following commands enable jemalloc with the configuration recommended for the Kunpeng platform.
    export LD_PRELOAD="/usr/local/lib/libjemalloc.so"
    export MALLOC_CONF="background_thread:true,metadata_thp:auto,dirty_decay_ms:20000,muzzy_decay_ms:20000"
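
Exporting LD_PRELOAD affects every program started from that shell afterward. As a minimal sketch, assuming bash and the default /usr/local/lib install prefix used above, the settings can instead be scoped to a single command, and a running process can be checked for a jemalloc mapping. The helper names run_with_jemalloc and has_jemalloc_mapping, and the JEMALLOC_LIB and JEMALLOC_CONF variables, are hypothetical and not part of jemalloc or TensorFlow Serving.

```shell
#!/usr/bin/env bash
# Sketch only: helper and variable names are illustrative assumptions.

# Library path from the install steps above; override if your prefix differs.
JEMALLOC_LIB="${JEMALLOC_LIB:-/usr/local/lib/libjemalloc.so}"
# Same option string as in step 5.
JEMALLOC_CONF="${JEMALLOC_CONF:-background_thread:true,metadata_thp:auto,dirty_decay_ms:20000,muzzy_decay_ms:20000}"

# run_with_jemalloc CMD [ARGS...]
# Runs CMD with jemalloc preloaded for that command only. If the library
# is missing, warns on stderr and runs CMD with the default allocator.
run_with_jemalloc() {
    if [ -e "$JEMALLOC_LIB" ]; then
        LD_PRELOAD="$JEMALLOC_LIB" MALLOC_CONF="$JEMALLOC_CONF" "$@"
    else
        echo "warning: $JEMALLOC_LIB not found; running without jemalloc" >&2
        "$@"
    fi
}

# has_jemalloc_mapping
# Reads a /proc/<pid>/maps listing on stdin and succeeds if it contains
# a libjemalloc shared object, i.e. the process actually loaded jemalloc.
has_jemalloc_mapping() {
    grep -q 'libjemalloc\.so'
}
```

For example, pass the TensorFlow Serving launch command to run_with_jemalloc, then run has_jemalloc_mapping < /proc/<pid>/maps with the PID of the serving process. An exit status of 0 indicates that the preload took effect.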