Overview
Tuning Strategy
Precision problems caused by application code are difficult to locate by inspection alone, so use the precision analysis tool to locate them. You are also advised to check whether the code differs from earlier versions.
Major Code Differences
Case 1: In WRF, variables exceed the preset value range.

The code comments indicate that the values of i_cos and i_tau range from 1 to 10.
When cosz(i) and cld_alb are 1, i_cos and i_tau evaluate to 11, which exceeds the preset value range. This problem has been resolved in WRF 4.1.2, which ensures that i_cos and i_tau do not exceed 10. The modified code is as follows:
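The WRF listing itself is not reproduced in this text. As an illustration only, the following Python sketch shows the general pattern of clamping a lookup-table index to its documented range. The binning formula int(x * 10) + 1 is an assumption chosen because it yields 11 for an input of 1, matching the behavior described above; it is not taken from the WRF source, and the function names are hypothetical.

```python
I_MIN = 1
I_MAX = 10  # upper bound of the index range stated in the code comments


def table_index_unclamped(x: float) -> int:
    """Assumed binning (not the WRF formula): map x in [0, 1] to a 1-based bin."""
    return int(x * 10.0) + 1


def table_index_clamped(x: float) -> int:
    """Same binning, but the result is forced into I_MIN..I_MAX."""
    return min(max(table_index_unclamped(x), I_MIN), I_MAX)


print(table_index_unclamped(1.0))  # 11, one past the end of a 10-entry table
print(table_index_clamped(1.0))    # 10, stays inside the table
```

Clamping at the upper bound treats an input of exactly 1 as belonging to the last bin, which is one common way to handle the closed end of an interval.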

Case 2: In WRF, array out-of-bounds access occurs.
The terrain data used by the WRF Preprocessing System (WPS) contains 33 terrain types.
num_land_cat = 33
In the MPTABLE.TBL configuration file of the WRF code, only 27 terrain types are defined, and the related variables (such as SLA) have only 27 elements. Accessing an element beyond the 27th is an out-of-bounds array access. The value read is undefined, so the SLA values obtained with different compilers differ.

To fix this problem in phys/module_sf_noahmpdrv.F, you are advised to check the value of IVGTYP. When IVGTYP is 31, 32, or 33, use ISURBAN_TABLE (that is, 13) as the subscript of SLA instead. The modified code is as follows:
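The Fortran change is not reproduced in this text. The following Python sketch only illustrates the remapping rule described above: categories 31 to 33 are redirected to the urban entry before the 27-element table is indexed. The table contents are placeholders rather than real SLA data, and sla_for is a hypothetical helper.

```python
N_TABLE = 27        # number of entries the table defines
ISURBAN_TABLE = 13  # urban category index given in the text

# Placeholder values only; real SLA values come from MPTABLE.TBL.
SLA_TABLE = [float(k) for k in range(1, N_TABLE + 1)]


def sla_for(ivgtyp: int) -> float:
    """Return the SLA entry for a 1-based category, remapping 31..33 to urban."""
    if ivgtyp in (31, 32, 33):
        ivgtyp = ISURBAN_TABLE
    if not 1 <= ivgtyp <= N_TABLE:
        # Fail loudly instead of silently reading past the end of the table.
        raise IndexError(f"category {ivgtyp} is outside 1..{N_TABLE}")
    return SLA_TABLE[ivgtyp - 1]


print(sla_for(32) == sla_for(ISURBAN_TABLE))  # True
```

The explicit range check is an addition for the sketch. In Fortran, a comparable safeguard during debugging is to build with the compiler's run-time bounds checking enabled.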

Case 3: In WRF, the calculation results vary according to the number of processes.
The following figure shows the code of SUBROUTINE toposhad_init in module_radiation_driver.F. The value of ht_loc depends on the ranges its, ite, jts, and jte, which define the area (tile) allocated to each process. This area changes with the number of processes: the more processes there are, the smaller the area allocated to each one. As a result, the value of ht_loc varies with the number of processes. In the subsequent calculation in SUBROUTINE toposhad_init, ht_loc affects the shadowmask result. Different shadowmask values can produce different values of shadow (0 or 1) in SUBROUTINE TOPO_RAD_ADJ in module_surface_driver.F. The SWDOWN_teradj results then differ, which in turn changes the SWDOWN, GSW, and SWNORM results calculated in SUBROUTINE TOPO_RAD_ADJ_DRVR.
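To make the mechanism concrete, the following Python sketch shows how a quantity computed only from the points inside a tile changes when the domain is split differently. The tile-local maximum used here is a stand-in chosen for illustration; it is not the actual ht_loc calculation in toposhad_init, and all names and height values are hypothetical.

```python
def tile_bounds(n: int, ntiles: int):
    """Split indices 0..n-1 into ntiles contiguous (start, end) ranges, end exclusive."""
    base, extra = divmod(n, ntiles)
    bounds, start = [], 0
    for t in range(ntiles):
        size = base + (1 if t < extra else 0)
        bounds.append((start, start + size))
        start += size
    return bounds


def local_max_per_point(ht, ntiles):
    """For each point, the maximum height visible inside its own tile only."""
    out = [0.0] * len(ht)
    for lo, hi in tile_bounds(len(ht), ntiles):
        tile_max = max(ht[lo:hi])
        for i in range(lo, hi):
            out[i] = tile_max
    return out


ht = [100.0, 300.0, 200.0, 900.0, 150.0, 400.0, 250.0, 50.0]  # made-up heights
with_2_tiles = local_max_per_point(ht, 2)
with_4_tiles = local_max_per_point(ht, 4)
print(with_2_tiles)
print(with_4_tiles)
print(with_2_tiles == with_4_tiles)  # False: the result depends on the split
```

Any downstream threshold test on such a tile-local value, like the 0 or 1 shadow flag described above, can therefore flip when the number of processes changes.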

Case 4: In GRAPES, uninitialized arrays take different values on different platforms.
The one-dimensional integer array KUO is passed to the SHALCV function without being initialized. If KUO is initialized to 0, which is the expected default, the physical process calculations are performed. On the x86 platform, however, the KUO array holds random values. As a result, many of the physical process calculations are not performed, although the results happen to be close to the actual values. On the Kunpeng platform, half of the array is initialized to 0 and the other half holds random values. Only half of the calculations are therefore handled correctly, which causes the calculation results to differ from the actual values.

To fix this, explicitly initialize KUO to a value other than 0. The results are then highly consistent with those on x86.
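The following Python sketch illustrates why the outcome depends on what the array happens to contain. It assumes, based on the description above, that a column is processed only where its KUO entry is 0. The function name and array contents are hypothetical, and Python lists stand in for the Fortran array.

```python
def columns_computed(kuo):
    """Count the columns a scheme would process if it runs only where kuo == 0."""
    return sum(1 for k in kuo if k == 0)


n = 8
kuo_all_zero = [0] * n                        # array that happens to be zero-filled
kuo_half = [0] * (n // 2) + [7] * (n // 2)    # half zero, half arbitrary values
kuo_explicit = [1] * n                        # explicitly initialized to a non-zero value

print(columns_computed(kuo_all_zero))  # 8
print(columns_computed(kuo_half))      # 4
print(columns_computed(kuo_explicit))  # 0
```

With explicit initialization the count no longer depends on whatever the memory contained beforehand, so every platform takes the same code path.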

Case 5: In GRAPES, file re-reading causes result differences.
The CoLM/odata/srfdata-g1 file exists in the computing test case directory of GRAPES. If the file does not exist, it is regenerated by calculation and saved in the CoLM/odata/ directory when the calculation is complete. If the file is not deleted before the calculation, GRAPES reads the existing srfdata-g1 file as input, which causes different results. Because the CoLM/odata/ directory is often a symbolic link, the srfdata-g1 file found there is usually the result of an earlier calculation rather than a newly generated one. You can run mv CoLM/odata/srfdata-g1 CoLM/odata/srfdata-g1-old or a similar command before running the computing test case to move the existing srfdata-g1 file out of the way.
To fix this, delete or rename CoLM/odata/srfdata-g1 in the computing test case directory before the calculation.
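For job scripts that drive the run from Python, the same step can be sketched as follows. The helper name set_aside_stale_file is hypothetical, and the demonstration uses a temporary directory in place of a real test case directory.

```python
import os
import tempfile
from pathlib import Path


def set_aside_stale_file(case_dir, rel_path="CoLM/odata/srfdata-g1"):
    """Rename an existing srfdata-g1 to srfdata-g1-old; return the new path or None."""
    target = Path(case_dir) / rel_path
    if not target.exists():
        return None  # nothing to do, the file will be regenerated
    backup = target.with_name(target.name + "-old")
    os.replace(target, backup)  # overwrites an older backup if one exists
    return backup


# Demonstration in a throwaway directory (not a real GRAPES case).
with tempfile.TemporaryDirectory() as case_dir:
    odata = Path(case_dir) / "CoLM" / "odata"
    odata.mkdir(parents=True)
    (odata / "srfdata-g1").write_text("stale data from an earlier run")

    first = set_aside_stale_file(case_dir)
    moved_name = first.name
    original_gone = not (odata / "srfdata-g1").exists()
    second = set_aside_stale_file(case_dir)  # file is already gone

print(moved_name, original_gone, second)
```

Renaming rather than deleting keeps the earlier file available for comparison with the newly generated one.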