Feature #35721

Speed up get_profiles and get_temperature by numba

Added by Xin Zhang 9 months ago. Updated 9 months ago.

Status: Resolved
Priority: Normal
Assignee:
Target version: -
Start date: 03/19/2022
Due date:
% Done: 100%

Description

Issue

The for loop in both functions (get_profiles and get_temperature) slows the script down considerably. I have sped it up with numba.

Solution

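from numba import jit

# Module-level function (not a method), compiled by numba in nopython mode.
@jit(nopython=True)
def calc_field(weights, spatial_index_lat, spatial_index_lon,
               surface_pressure_external, surface_pressure, field,
               result_field, result_pressure, hya, hyb):
    # Loop over all pixels; result_field and result_pressure are filled in place.
    for pix_idx in range(weights.shape[1]):
        # Use the external surface pressure if given, else accumulate it below.
        if surface_pressure_external is not None:
            ps = surface_pressure_external.flat[pix_idx]
        else:
            ps = 0.0
        # Weighted sum over the four grid points stored for this pixel.
        for i in range(4):
            lat_idx = spatial_index_lat[i, pix_idx]
            lon_idx = spatial_index_lon[i, pix_idx]
            if surface_pressure_external is None:
                ps += weights[i, pix_idx] * surface_pressure[lat_idx, lon_idx]
            result_field[:, pix_idx] += (weights[i, pix_idx] *
                                         field[:, lat_idx, lon_idx])

        # Pressure on each level from the hya/hyb coefficients and surface pressure.
        result_pressure[:, pix_idx] = hya + ps * hyb

    return result_field, result_pressure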

def get_temperature(self, surface_pressure_external=None):
    ...........
    result_field, result_pressure = calc_field(self.weights, self.spatial_index_lat, self.spatial_index_lon,
                                               surface_pressure_external, surface_pressure, field,
                                               result_field, result_pressure, hyam, hybm)

def get_profiles(self, surface_pressure_external=None):
    ...........
    result_field, result_pressure = calc_field(self.weights, self.spatial_index_lat, self.spatial_index_lon,
                                               surface_pressure_external, surface_pressure, field,
                                               result_field, result_pressure, hyai, hybi)

Note that I did not use jitclass, which requires type specifications for all attributes of the class. Instead, I placed the jitted function outside the class and call it directly from the methods.

Test

The execution time of the loop in get_profiles decreased from ~470 seconds to ~2.5 seconds, i.e. it runs roughly 190 times faster.

In my case, the maximum differences between the current slow loop and the numba version are 3.5527137e-15 for no2_vmr and 9.1552734e-05 for temperature.
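For anyone who wants to repeat this kind of check, a maximum absolute difference can be computed as in the minimal sketch below. The helper name max_abs_difference and the sample values are made up for illustration; they are not taken from PyCAMA or from the test above.

```python
def max_abs_difference(reference, candidate):
    """Return the largest element-wise absolute difference.

    Hypothetical helper: both inputs are flat, equal-length sequences of floats.
    """
    if len(reference) != len(candidate):
        raise ValueError("inputs must have the same length")
    return max(abs(r - c) for r, c in zip(reference, candidate))


# Made-up values standing in for the output of the slow loop and of the
# numba version; the last element differs by exactly 2**-14.
loop_result = [250.0, 260.5, 271.25]
numba_result = [250.0, 260.5, 271.25 + 2.0 ** -14]

diff = max_abs_difference(loop_result, numba_result)
print(diff)
```

For NumPy arrays a and b of the same shape, np.max(np.abs(a - b)) gives the same quantity.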

History

#1 Updated by Maarten Sneep 9 months ago

Hi, if you don't add watchers or assign the issue, it won't get picked up; I only saw this by chance. I'll see whether I can include this in a way that remains compatible with an installation without numba.
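One common way to keep numba optional is to fall back to a do-nothing decorator when the import fails. The sketch below is only an illustration of that pattern and not necessarily how PyCAMA handles it; HAVE_NUMBA, weighted_sum and Interpolator are hypothetical names. It also shows the jitted function living at module level and being called from a method, as in the proposal above.

```python
try:
    from numba import jit  # optional dependency
    HAVE_NUMBA = True
except ImportError:
    HAVE_NUMBA = False

    def jit(*args, **kwargs):
        """No-op stand-in for numba.jit when numba is not installed."""
        # Bare use: @jit
        if len(args) == 1 and callable(args[0]) and not kwargs:
            return args[0]

        # Parameterised use: @jit(nopython=True)
        def decorator(func):
            return func
        return decorator


@jit(nopython=True)
def weighted_sum(weights, values):
    # Hypothetical kernel: plain loop that numba compiles when available
    # and that runs as ordinary Python otherwise.
    total = 0.0
    for i in range(len(weights)):
        total += weights[i] * values[i]
    return total


class Interpolator:
    """Hypothetical class; the compiled function stays outside of it."""

    def __init__(self, weights):
        self.weights = weights

    def apply(self, values):
        # The method only forwards its data to the module-level function.
        return weighted_sum(self.weights, values)


result = Interpolator((0.25, 0.25, 0.25, 0.25)).apply((1.0, 2.0, 3.0, 4.0))
print(result)
```

With this layout the calling code is identical in both cases; only the speed differs.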

#2 Updated by Xin Zhang 9 months ago

  • Assignee set to Maarten Sneep

#3 Updated by Maarten Sneep 9 months ago

  • Status changed from New to Resolved
  • % Done changed from 0 to 100

The new script will be included in the next PyCAMA (intermediate) release.
