Step 4: Generating LCI samples

Introduction

This step calculates the actual LCI arrays. It successively instantiates one brightway2 MonteCarloLCA object per activity in the LCI database and, for each, calculates LCI results for as many iterations as the corresponding presamples package contains.

Dependent sampling of the LCI is ensured by the use of presamples packages, which inject the same values for all technosphere (A) and biosphere (B) matrix elements into the calculation of every activity.
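To illustrate what dependent sampling means, here is a minimal sketch in plain NumPy. It does not use brightway2 or presamples: the two-activity system, its matrices and the helper name `lci_samples` are hypothetical. The point is only that, in a given iteration, every activity is solved against the same sampled A and B.

```python
import numpy as np

rng = np.random.default_rng(42)
n_iterations = 5

# Pre-sampled matrices shared by all activities (a stand-in for a presamples package).
# A_samples[k] is the technosphere matrix of iteration k, B_samples[k] the biosphere matrix.
A_samples = np.empty((n_iterations, 2, 2))
A_samples[:] = np.eye(2)
A_samples[:, 0, 1] = -rng.uniform(0.1, 0.5, n_iterations)  # activity 1 consumes product 0
B_samples = rng.uniform(1.0, 2.0, (n_iterations, 3, 2))    # 3 elementary flows

def lci_samples(activity_index):
    """Return an array of shape (n_flows, n_iterations) for one activity."""
    demand = np.zeros(2)
    demand[activity_index] = 1.0
    # Iteration k always uses A_samples[k] and B_samples[k], whatever the activity.
    columns = [B @ np.linalg.solve(A, demand) for A, B in zip(A_samples, B_samples)]
    return np.column_stack(columns)

lci_0 = lci_samples(0)
lci_1 = lci_samples(1)
```

Column k of `lci_0` and column k of `lci_1` come from the same realization of A and B, so results for different activities can be compared or combined column by column.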

The resulting LCI arrays are stored in the LCI subdirectory of the result_dir directory. Each samples_batch has its own subdirectory.

Warning

Because LCI databases contain a large number of LCI datasets, this step takes a very long time (on the order of days). bw2preagg implements three strategies to help reduce this time:

1- samples_batch: Allows calculating multiple sets of dependently sampled arrays, each with a smaller number of iterations (one batch). These “batches” can then be concatenated into arrays with the required number of iterations.

2- parallel_jobs: Allows the parallel calculation of LCI arrays on multiple CPUs of a given computer (default=1).

3- slices: Useful on computer clusters. Allows further splitting of the activity list into smaller slices that are sent to different jobs.
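As a minimal sketch of how batches could be combined, the NumPy example below concatenates per-batch arrays for one activity along the iteration axis. The array contents and shapes are hypothetical, and it assumes that rows are elementary flows and columns are iterations; it does not show how bw2preagg itself stores or loads the arrays.

```python
import numpy as np

# Hypothetical per-batch LCI arrays for one activity:
# rows are elementary flows, columns are Monte Carlo iterations.
n_flows = 4
batch_arrays = [
    np.full((n_flows, 3), fill_value=batch_id, dtype=float)  # 3 iterations per batch
    for batch_id in range(3)                                  # samples_batch = 0, 1, 2
]

# Concatenate along the iteration axis to obtain one array with 9 iterations.
combined = np.concatenate(batch_arrays, axis=1)
print(combined.shape)  # (4, 9)
```

For the combined arrays to remain dependently sampled, the batches would need to be concatenated in the same order for every activity.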

Several functions are of interest, all of which are imported into the namespace with from bw2preagg import *. calculate_lci_array calculates the actual LCI arrays. It is rarely invoked directly by a user; it is normally called from set_up_lci_calculations, which gathers the information needed to run calculate_lci_array for a specified set of activities. set_up_lci_calculations is itself rarely invoked directly by a user; it is normally called from the top-level dispatch_lci_calculators function, which splits the LCI calculation across machines (slices) and across CPUs (parallel_jobs).

Technical reference

dispatch_lci_calculators

The top-level function is dispatch_lci_calculators. It is typically the only one a user will interact with.

dispatch_lci_calculators verifies that all required data (project, presamples, common files, etc.) are available, splits the work first across slices (to run on multiple computers in a cluster) and then across CPUs (using multiprocessing), and invokes set_up_lci_calculations for each resulting subset of activities.
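The two-level split can be sketched as follows. This is an illustration only: the helper `split_evenly`, the activity codes, the contiguous partitioning scheme and the zero-based slice_id are all assumptions, since the actual partitioning used by bw2preagg is not described here.

```python
def split_evenly(items, n_parts):
    """Split a list into n_parts contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        size = base + (1 if i < extra else 0)
        chunks.append(items[start:start + size])
        start += size
    return chunks

activity_codes = [f"act_{i}" for i in range(10)]  # hypothetical activity codes

# First level: one slice per cluster job (number_of_slices = 3).
slices = split_evenly(activity_codes, n_parts=3)
my_slice = slices[1]  # the job with slice_id = 1 (assumed zero-based)

# Second level: one chunk per worker process on this machine (parallel_jobs = 2).
worker_chunks = split_evenly(my_slice, n_parts=2)
```

Each entry of `worker_chunks` plays the role of the activity_list that a single worker would receive.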

bw2preagg.lci.dispatch_lci_calculators(project_name, database_name, result_dir, samples_batch=0, parallel_jobs=1, slice_id=None, number_of_slices=None)

Dispatches LCI array calculations to distinct processes (multiprocessing)

If number_of_slices and slice_id are not None, only the subset of database activities belonging to that slice is processed.

The number of iterations is determined by the number of columns in the presample packages referenced in the corresponding campaign.

Parameters:
  • project_name (str) – Name of the brightway2 project where the database is imported
  • database_name (str) – Name of the LCI database
  • result_dir (str) – Path to directory where results are stored
  • samples_batch (int, default=0) – Integer id for sample batch. Used for campaign names and for generating a seed for the RNG. The maximum value is 14.
  • parallel_jobs (int, default=1) – Number of parallel jobs to run using multiprocessing
  • slice_id (int, default=None) – ID of slice. Useful when calculations are split across many computers or jobs on a computer cluster. If None, LCI arrays are generated for all activities in the database.
  • number_of_slices (int, default=None) – Number of slices over which the calculations are split. Useful when calculations are split across many computers or jobs on a computer cluster. If None, LCI arrays are generated for all activities in the database.
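A possible way to prepare a call for one cluster job is sketched below. The helper names `build_dispatch_kwargs` and `run`, and the project, database and directory names, are hypothetical. The checks reflect the parameter descriptions above (maximum samples_batch of 14) plus the assumption that slice_id and number_of_slices are always given together.

```python
def build_dispatch_kwargs(project_name, database_name, result_dir,
                          samples_batch=0, parallel_jobs=1,
                          slice_id=None, number_of_slices=None):
    """Check the arguments and return them as a dict of keyword arguments."""
    if samples_batch > 14:
        raise ValueError("samples_batch cannot exceed 14")
    # Assumption: the two slice arguments are either both set or both None.
    if (slice_id is None) != (number_of_slices is None):
        raise ValueError("slice_id and number_of_slices must be given together")
    return dict(
        project_name=project_name,
        database_name=database_name,
        result_dir=result_dir,
        samples_batch=samples_batch,
        parallel_jobs=parallel_jobs,
        slice_id=slice_id,
        number_of_slices=number_of_slices,
    )

def run(kwargs):
    """Invoke the dispatcher; requires bw2preagg and a prepared brightway2 project."""
    # Imported lazily so the checks above can be used without bw2preagg installed.
    from bw2preagg.lci import dispatch_lci_calculators
    dispatch_lci_calculators(**kwargs)

# Hypothetical job: third batch, 4 worker processes, first of 10 slices.
kwargs = build_dispatch_kwargs(
    "my_project", "my_database", "results",
    samples_batch=2, parallel_jobs=4, slice_id=0, number_of_slices=10,
)
```

Calling `run(kwargs)` on each cluster job, with a different slice_id per job, would then cover the whole database.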

set_up_lci_calculations

The set_up_lci_calculations function then gathers the necessary information for the specified subset of activities and invokes the subsequent calculate_lci_array for each activity for which LCI arrays need to be calculated.

bw2preagg.lci.set_up_lci_calculations(activity_list, result_dir, worker_id, database_name, samples_batch, project_name)

Dispatch LCI calculation for a list of activities

Parameters:
  • activity_list (list) – List of codes of the activities for which LCI arrays should be calculated
  • result_dir (str) – Path to directory where results are stored
  • worker_id (int) – Identifier of the worker when using multiprocessing. Used only for error messages.
  • database_name (str) – Name of the LCI database
  • samples_batch (int) – Integer id for sample batch. Used for campaign names and for generating a seed for the RNG. The maximum value is 14.
  • project_name (str) – Name of the brightway2 project where the database is imported
Return type: None

calculate_lci_array

calculate_lci_array is the function where the actual LCI calculation occurs, using the brightway2 MonteCarloLCA class.

bw2preagg.lci.calculate_lci_array(database_name, act_code, presamples_paths, g_dimensions, total_iterations, g_samples_dir)

Return LCI array using specified presamples for given activity

Typically invoked from generate_LCI_samples.set_up_lci_calculations

Parameters:
  • database_name (str) – Name of the LCI database
  • act_code (str) – Code of the activity
  • presamples_paths (list) – List of paths to presamples packages
  • g_dimensions (int) – Number of rows in biosphere matrix
  • total_iterations (int) – Number of iterations (i.e. number of columns in presample arrays)
  • g_samples_dir (str) – Path to directory where LCI arrays will be saved
Return type: None
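The role of g_dimensions, total_iterations and g_samples_dir can be illustrated with a minimal sketch that fills an array one column per iteration and saves it to disk. This is not the bw2preagg implementation: `fill_and_save` and `column_fn` are hypothetical, a random-number generator stands in for the MonteCarloLCA calculation, the file name is an assumption, and unlike calculate_lci_array the sketch returns the path for convenience.

```python
import os
import tempfile
import numpy as np

def fill_and_save(column_fn, g_dimensions, total_iterations, out_dir, act_code):
    """Fill a (g_dimensions, total_iterations) array column by column and save it."""
    lci_array = np.empty((g_dimensions, total_iterations))
    for iteration in range(total_iterations):
        # In the real calculation, this column would be the inventory of one iteration.
        lci_array[:, iteration] = column_fn(iteration)
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, f"{act_code}.npy")  # hypothetical file name
    np.save(path, lci_array)
    return path

rng = np.random.default_rng(0)
out_dir = tempfile.mkdtemp()
saved_path = fill_and_save(lambda i: rng.random(6), 6, 10, out_dir, "some_activity_code")
loaded = np.load(saved_path)
```

The loaded array has one row per elementary flow and one column per iteration.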