Utility Functions and Tools

This page documents additional utilities and tools that are part of the lpjg2nc2 package.

NaN Value Analysis

The count_nans module provides tools for analyzing the sparsity of data in NetCDF files, which is particularly useful for land-only data on global grids.

from lpjg2nc.count_nans import analyze_netcdf, print_short_summary

# Analyze a NetCDF file for NaN values
stats = analyze_netcdf('/path/to/file.nc', verbose=True, return_stats=True)

# Print a concise summary
print_short_summary(stats)

The analysis provides:

- Percentage of valid data points
- Percentage of NaN values
- Per-variable statistics
- Warnings for highly sparse datasets (more than 95% NaN)
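To make these statistics concrete, here is a minimal sketch of how they could be computed for in-memory arrays. It is not the count_nans implementation: the function name nan_summary, the returned dictionary keys, and the sparse_threshold parameter are assumptions made for illustration, and the sketch needs only NumPy rather than a NetCDF reader.

```python
import numpy as np


def nan_summary(variables, sparse_threshold=95.0):
    """Return overall and per-variable NaN percentages for a dict of arrays.

    Hypothetical helper for illustration; not part of lpjg2nc.
    """
    per_variable = {}
    total_points = 0
    total_nans = 0
    for name, values in variables.items():
        data = np.asarray(values, dtype=float)
        n_nan = int(np.isnan(data).sum())
        per_variable[name] = 100.0 * n_nan / data.size if data.size else 0.0
        total_points += data.size
        total_nans += n_nan
    nan_pct = 100.0 * total_nans / total_points if total_points else 0.0
    return {
        "nan_percent": nan_pct,
        "valid_percent": 100.0 - nan_pct,
        "per_variable": per_variable,
        # Flag datasets whose NaN share exceeds the (assumed) threshold
        "highly_sparse": bool(nan_pct > sparse_threshold),
    }


# A 2x4 grid in which only two cells hold land data
example = {"vegc": [[np.nan, 1.5, np.nan, np.nan],
                    [np.nan, np.nan, 2.0, np.nan]]}
summary = nan_summary(example)
print(summary["nan_percent"])  # 75.0
```

A land-only variable on a global grid typically has a high NaN share because ocean cells carry no data, which is why a sparsity warning is useful.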

CDO Grid Remapping

The cdo_interpolation module offers functionality to remap output to regular global grids using Climate Data Operators (CDO).

from lpjg2nc.cdo_interpolation import remap_to_regular_grid

# Remap a NetCDF file to a 1-degree global grid
remapped_file = remap_to_regular_grid(
    input_file='/path/to/input.nc',
    grid_param='360x180',  # Can also use resolution: 1.0
    verbose=True
)

Key features:

- Support for both resolution-based (e.g., 1.0) and grid dimension-based (e.g., 360x180) specifications
- Automatic grid file generation
- Nearest-neighbor interpolation for unstructured grids
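The two forms of grid specification can be illustrated with a short sketch that turns either one into grid dimensions and then into a CDO grid description. The helper names parse_grid_param and cdo_grid_description are hypothetical, and the choice of cell centres starting at -180 degrees longitude and -90 degrees latitude is an assumption; the package may lay out its grid file differently.

```python
def parse_grid_param(grid_param):
    """Return (nlon, nlat) for a '360x180' string or a resolution in degrees.

    Hypothetical helper for illustration; not part of lpjg2nc.
    """
    text = str(grid_param).strip().lower()
    if "x" in text:
        nlon_text, nlat_text = text.split("x", 1)
        return int(nlon_text), int(nlat_text)
    resolution = float(text)
    if resolution <= 0:
        raise ValueError("resolution must be positive")
    return round(360.0 / resolution), round(180.0 / resolution)


def cdo_grid_description(nlon, nlat):
    """Build the text of a CDO 'lonlat' grid description for a global grid."""
    xinc = 360.0 / nlon
    yinc = 180.0 / nlat
    lines = [
        "gridtype = lonlat",
        f"xsize    = {nlon}",
        f"ysize    = {nlat}",
        # Assumed layout: first cell centre half a cell in from the edge
        f"xfirst   = {-180.0 + xinc / 2}",
        f"xinc     = {xinc}",
        f"yfirst   = {-90.0 + yinc / 2}",
        f"yinc     = {yinc}",
    ]
    return "\n".join(lines) + "\n"


nlon, nlat = parse_grid_param("360x180")
print(cdo_grid_description(nlon, nlat))
```

A file with this content can be handed to CDO's nearest-neighbor operator, for example cdo remapnn,grid.txt input.nc output.nc.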

File Structure Detection

The file_parser module provides tools for automatically detecting the structure of LPJ-GUESS output files.

from lpjg2nc.file_parser import detect_file_structure

# Detect the structure of an LPJ-GUESS output file
structure = detect_file_structure('/path/to/output.out')

# Access the detected structure
columns = structure['columns']
has_time_index = structure['has_time_index']
is_yearly = structure['is_yearly']

The detected structure includes:

- Column names and their meanings
- Whether the file has a time index
- Whether the data is yearly or contains sub-annual values
- Data types for each column
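As a rough illustration of this kind of detection, the sketch below classifies a file from its header line alone. It assumes the first line is a whitespace-separated header, that a Year column marks a time index, and that monthly files carry Jan to Dec columns. The function name sketch_detect_structure is hypothetical, and the real detect_file_structure may use different rules.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def sketch_detect_structure(header_line):
    """Guess the file structure from a whitespace-separated header line.

    Hypothetical heuristic for illustration; not the lpjg2nc implementation.
    """
    columns = header_line.split()
    lowered = [name.lower() for name in columns]
    # Assumption: a 'Year' column indicates a time index
    has_time_index = "year" in lowered
    # Assumption: twelve month columns indicate sub-annual (monthly) data
    has_months = {month.lower() for month in MONTHS}.issubset(lowered)
    return {
        "columns": columns,
        "has_time_index": has_time_index,
        "is_yearly": not has_months,
    }


monthly = sketch_detect_structure(
    "Lon Lat Year Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec"
)
print(monthly["is_yearly"])  # False
```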

Grid Utilities

The grid_utils module provides functions for working with spatial grids and coordinates.

from lpjg2nc.grid_utils import read_grid_information, match_coordinates_to_grid

# Read grid information from a grids.nc file
grid_info = read_grid_information('/path/to/base_dir/')

# Match coordinates to a grid (lat and lon are supplied by the caller)
grid_index = match_coordinates_to_grid(lat, lon, grid_info)

These utilities help convert between irregular coordinate points and structured grids, which is essential for creating properly formatted NetCDF files.
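The idea of mapping an irregular point onto a structured grid can be sketched as a nearest-point search. This is not the grid_utils implementation: the function name nearest_grid_index and its arguments are assumptions, and the distance measure is a simple flat approximation with wrapped longitudes, not a true great-circle distance.

```python
import numpy as np


def nearest_grid_index(lat, lon, grid_lats, grid_lons):
    """Return the index of the grid point closest to (lat, lon).

    Hypothetical helper for illustration; not part of lpjg2nc.
    """
    grid_lats = np.asarray(grid_lats, dtype=float)
    grid_lons = np.asarray(grid_lons, dtype=float)
    # Wrap longitude differences into [-180, 180) so 359 and -1 coincide
    dlon = (grid_lons - lon + 180.0) % 360.0 - 180.0
    dlat = grid_lats - lat
    # Scale longitude by cos(latitude) because meridians converge poleward
    dist2 = dlat**2 + (dlon * np.cos(np.radians(lat)))**2
    return int(np.argmin(dist2))


grid_lats = [50.0, 50.0, 51.0, 51.0]
grid_lons = [0.0, 359.0, 0.0, 359.0]
index = nearest_grid_index(50.9, -0.8, grid_lats, grid_lons)
print(index)  # 3
```

The longitude wrapping matters when the model output uses a -180 to 180 convention while the grid file uses 0 to 360, or the other way round.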

Command-line Tools

In addition to its default invocation, the main lpjg2nc.py script supports several command-line options:

  1. Test-run with ifs_input files:

    ./lpjg2nc.py -p /path/to/data/ --test ifs_input
    
  2. Run only for a specific file pattern:

    ./lpjg2nc.py -p /path/to/data/ --pattern vegc.out
    
  3. Run with custom parallelization:

    ./lpjg2nc.py -p /path/to/data/ -j 16 --inner-jobs 32