API Reference

This section provides an overview of the API of ParallelKDEpy as well as other tools available in the package. For more detailed information, please refer to the documentation of ParallelKDE.jl.

Grids

The package exposes the grid objects available in ParallelKDE.jl for use in Python. A grid defines the points on which the kernel density estimation is performed.

class parallelkdepy.Grid(ranges: Sequence[tuple] = [], *, device: str = 'cpu', b32: bool | None = None, grid_jl=None)

Bases: object

Higher-level grid implementation, intended for use in place of a raw meshgrid.

bounds() → list[tuple]: List of tuples of bounds for each dimension of the grid.

property device: Device type, e.g., ‘cpu’ or ‘cuda’.

fftgrid() → Grid: Returns a grid of frequency components.

property grid_jl: Underlying Julia grid object.

initial_bandwidth() → list: List of the minimum bandwidths that the grid can support, one per dimension.

lower_bounds() → list: List of lower bounds for each dimension of the grid.

property shape: Shape of the grid.

step() → list: List of step sizes for each dimension of the grid.

to_meshgrid() → tuple[ndarray, ...]: Mesh grid coordinates, one array per dimension.

upper_bounds() → list: List of upper bounds for each dimension of the grid.
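To make the quantities returned by these methods concrete, here is a NumPy-only sketch that does not call ParallelKDEpy at all. It assumes each entry of `ranges` is a hypothetical `(lower, upper, n_points)` tuple and that mesh coordinates use matrix-style (`"ij"`) indexing; neither is confirmed by this page, so check the ParallelKDE.jl documentation for the actual conventions.

```python
import numpy as np

# Assumption: one (lower, upper, n_points) tuple per dimension.
ranges = [(-5.0, 5.0, 101), (0.0, 2.0, 21)]

# One evenly spaced coordinate axis per dimension.
axes = [np.linspace(lo, hi, n) for lo, hi, n in ranges]

# Analogues of bounds(), lower_bounds(), upper_bounds(), step() and shape.
bounds = [(lo, hi) for lo, hi, _ in ranges]
lower_bounds = [lo for lo, _ in bounds]
upper_bounds = [hi for _, hi in bounds]
step = [(hi - lo) / (n - 1) for lo, hi, n in ranges]
shape = tuple(n for _, _, n in ranges)

# Analogue of to_meshgrid(): one coordinate array per dimension,
# each with the full grid shape (assumed "ij" indexing).
mesh = tuple(np.meshgrid(*axes, indexing="ij"))
```

Each array in `mesh` has the same shape as the grid, so `mesh[d][i, j]` gives the coordinate along dimension `d` at grid point `(i, j)`.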

Estimation

DensityEstimation is the main class for performing kernel density estimation in ParallelKDEpy. It provides methods for estimating densities on various grids and with different parameters.

The actual density estimation takes place when the estimate_density method is called. It takes the name of an estimator as a string, plus keyword arguments corresponding to that estimator's parameters. To use the "threaded" method for the estimators that support it, set the environment variable JULIA_NUM_THREADS to the desired number of threads before importing ParallelKDEpy. This can be done in Python as follows:

import os
os.environ["JULIA_NUM_THREADS"] = "4"  # Set to the desired number of threads

or in the shell before running your script:

export JULIA_NUM_THREADS=4  # Set to the desired number of threads
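Because the variable only takes effect if it is set before the package is imported, a small guard can catch ordering mistakes. The helper below is a hypothetical sketch, not part of ParallelKDEpy; it assumes that the presence of `parallelkdepy` in `sys.modules` is a reasonable proxy for the Julia session having already started.

```python
import os
import sys


def set_julia_threads(n_threads: int) -> bool:
    """Set JULIA_NUM_THREADS unless parallelkdepy was already imported.

    Returns True if the variable was set, False if it was too late.
    """
    if "parallelkdepy" in sys.modules:
        # Assumption: Julia has already started, so the setting would be ignored.
        return False
    os.environ["JULIA_NUM_THREADS"] = str(n_threads)
    return True


# Call this at the top of the script, before `import parallelkdepy`.
applied = set_julia_threads(4)
```

If `applied` is `False`, the import order in the script needs to change for the thread count to take effect.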

Note

Available estimators and their parameters are described in the [ParallelKDE.jl documentation].

class parallelkdepy.DensityEstimation(data: ndarray, *, grid: Grid | bool = False, dims: Sequence | None = None, grid_bounds: Sequence | None = None, grid_padding: Sequence | None = None, device: str = 'cpu')

Bases: object

Main API object for density estimation.

property data: Numpy array of data points for density estimation.

property density: Numpy array representing the estimated density.

property device: Device type, e.g., ‘cpu’ or ‘cuda’.

estimate_density(estimation: str, **kwargs) → None: Executes the density estimation algorithm on the data.

generate_grid(dims: Sequence | None = None, grid_bounds: Sequence | None = None, grid_padding: Sequence | None = None, overwrite: bool = False) → Grid

Generates a grid based on the data and specified parameters.

Returns: A Grid object representing the generated grid.
Return type: Grid

get_density(**kwargs) → ndarray: Returns the estimated density as a Numpy array.

property grid: Grid used for density estimation, if any.
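The members above suggest a three-step workflow: construct the object, run an estimator, read back the density. The function below is a minimal sketch of that flow using only the signatures listed in this section. Two points are assumptions: that `grid=True` asks the object to generate a grid from the data, and that the estimator name and its parameters are supplied by the caller from the ParallelKDE.jl documentation (none are named here). The helper `estimate_on_auto_grid` itself is hypothetical.

```python
def estimate_on_auto_grid(data, estimator_name, **estimator_params):
    """Run one estimation and return the density as a NumPy array.

    `estimator_name` and `estimator_params` must be taken from the
    ParallelKDE.jl documentation; this sketch does not assume any.
    """
    # Imported lazily so that JULIA_NUM_THREADS can still be set beforehand.
    import parallelkdepy as pkde

    # Assumption: grid=True requests a grid generated from the data.
    kde = pkde.DensityEstimation(data, grid=True, device="cpu")

    # Dispatch to the chosen estimator with its keyword parameters.
    kde.estimate_density(estimator_name, **estimator_params)

    # Retrieve the result on the grid held by `kde.grid`.
    return kde.get_density()
```

A pre-built Grid instance can be passed as `grid=` instead, which keeps the evaluation points under the caller's control.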

Dirac sequences

For convenience, initialize_dirac_sequence generates the Dirac sequences corresponding to a dataset on a given Grid instance.

parallelkdepy.initialize_dirac_sequence(data: ndarray, grid: Grid, *, bootstrap_indices: ndarray | None = None, device: str = 'cpu', method: str | None = None) → ndarray

Initialize a Dirac sequence on the given grid.

Parameters:

data (np.ndarray) – Data points to initialize the Dirac sequence.
grid (Grid) – The grid on which to initialize the Dirac sequence.
bootstrap_indices (Optional[np.ndarray], optional) – Numpy array of bootstrap indices, by default None. If provided, the shape should be (n_bootstraps, n_samples).
device (str, optional) – Device to store the array, e.g., ‘cpu’ or ‘cuda’, by default ‘cpu’.
method (Optional[str], optional) – Method to use for initialization, e.g., ‘serial’ or ‘parallel’. The signature default is None; the documented default behavior is ‘serial’.

Returns:

Numpy array representing the initialized Dirac sequence.

Return type:

np.ndarray
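As a rough illustration of the idea, the NumPy sketch below builds one discretized Dirac sequence per bootstrap resample in 1D. It is not the library's implementation. It assumes nearest-grid-point binning and normalization to unit mass, and it uses 0-based bootstrap indices; whether the wrapper expects 0-based or 1-based indices is not stated on this page. The helper `dirac_sequence_1d` is hypothetical.

```python
import numpy as np


def dirac_sequence_1d(data, lower, upper, n_points):
    """Assign each sample to its nearest grid point and normalize to unit mass."""
    coords = np.linspace(lower, upper, n_points)
    step = coords[1] - coords[0]
    # Index of the nearest grid point for every sample.
    idx = np.rint((data - lower) / step).astype(int)
    # Samples falling outside the grid are dropped.
    inside = (idx >= 0) & (idx < n_points)
    counts = np.bincount(idx[inside], minlength=n_points)
    # Dividing by n * step makes the sequence integrate to (at most) one.
    return counts / (data.size * step)


rng = np.random.default_rng(0)
data = rng.normal(size=500)

# Bootstrap indices with the documented shape (n_bootstraps, n_samples).
n_bootstraps = 3
bootstrap_indices = rng.integers(0, data.size, size=(n_bootstraps, data.size))

# One Dirac sequence per bootstrap resample, stacked along the first axis.
sequences = np.stack(
    [dirac_sequence_1d(data[row], -5.0, 5.0, 101) for row in bootstrap_indices]
)
```

Here `sequences` has one row per bootstrap resample and one column per grid point.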

Complete list of modules

parallelkdepy.wrapper

High-level API: Functions and objects that wrap Julia calls.

parallelkdepy.core

Low-level plumbing: Manage Julia session and interfacing between Python and Julia.