pain001.csv package

Submodules

pain001.csv.load_csv_data module

pain001.csv.load_csv_data.load_csv_data(file_path: str) -> list[dict[str, Any]]

Load CSV data from a file.

Parameters:

file_path (str) – The path to the CSV file.

Returns:

A list of dictionaries containing the CSV data.

Return type:

list[dict[str, Any]]

Raises:
  • FileNotFoundError – If the file does not exist.

  • IOError – If there is an issue reading the file.

  • UnicodeDecodeError – If there is an issue decoding the file’s content.

  • ValueError – If the CSV file is empty.

Note

For large files, consider using load_csv_data_streaming() to reduce memory footprint.
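The documented exceptions overlap in Python's class hierarchy, so the order of `except` clauses matters when handling them. The following is a minimal sketch, not part of pain001: the helper name `safe_load` is hypothetical, and the loader is passed in as a parameter so the snippet runs without the package installed. In real use you would pass `load_csv_data` as the loader.

```python
from typing import Any, Callable, Optional

Rows = list[dict[str, Any]]


def safe_load(loader: Callable[[str], Rows], file_path: str) -> Optional[Rows]:
    """Call a loader and map the documented exceptions to None plus a message.

    `safe_load` is a hypothetical helper for illustration only.
    """
    try:
        return loader(file_path)
    except FileNotFoundError:
        # Subclass of OSError, so it must be caught before OSError.
        print(f"{file_path}: file not found")
    except UnicodeDecodeError:
        # Subclass of ValueError, so it must be caught before ValueError.
        print(f"{file_path}: could not decode file content")
    except ValueError as exc:
        # Documented as raised when the CSV file is empty.
        print(f"{file_path}: {exc}")
    except OSError as exc:
        # IOError is an alias of OSError in Python 3.
        print(f"{file_path}: read error: {exc}")
    return None
```

Catching the more specific classes first keeps a missing file from being reported as a generic read error, and a decoding failure from being reported as an empty file.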

pain001.csv.load_csv_data.load_csv_data_streaming(file_path: str, chunk_size: int = 1000) -> Generator[list[dict[str, Any]], None, None]

Load CSV data from a file in chunks for memory-efficient processing.

This function yields chunks of CSV data instead of loading the entire file into memory, making it suitable for large files.

Parameters:
  • file_path (str) – The path to the CSV file.

  • chunk_size (int) – Number of rows to yield per chunk. Default is 1000.

Yields:

list – A list of dictionaries containing up to chunk_size rows of CSV data.

Raises:
  • FileNotFoundError – If the file does not exist.

  • IOError – If there is an issue reading the file.

  • UnicodeDecodeError – If there is an issue decoding the file’s content.

  • ValueError – If the CSV file is empty.

Example

>>> for chunk in load_csv_data_streaming('large_file.csv', chunk_size=500):
...     # Process chunk
...     process_payment_batch(chunk)

Performance:
  • Memory usage: ~90% lower than with load_csv_data() for large files (10,000+ rows)

  • Enables processing of files larger than available RAM

  • Slightly slower than load_csv_data() due to yielding overhead
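To illustrate the chunked-reading technique described above, here is a minimal standard-library sketch. It is not the pain001 implementation: the function name `iter_csv_chunks`, the UTF-8 encoding, and the exact error messages are assumptions made for the example.

```python
import csv
from itertools import islice
from typing import Any, Iterator


def iter_csv_chunks(
    file_path: str, chunk_size: int = 1000
) -> Iterator[list[dict[str, Any]]]:
    """Yield lists of at most chunk_size row dictionaries from a CSV file.

    Hypothetical stand-in for illustration; assumes a UTF-8 file with a header row.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    with open(file_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        yielded = False
        while True:
            # islice pulls at most chunk_size rows, so only one chunk
            # is held in memory at a time.
            chunk = list(islice(reader, chunk_size))
            if not chunk:
                break
            yielded = True
            yield chunk
        if not yielded:
            raise ValueError(f"{file_path} contains no data rows")
```

Because this is a generator, nothing is opened or validated until iteration starts, so errors surface on the first `next()` call rather than when the function is called. The last chunk may hold fewer than `chunk_size` rows.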

pain001.csv.validate_csv_data module

pain001.csv.validate_csv_data.validate_csv_data(data: list[dict[str, Any]]) -> bool

Validate the CSV data before processing it.

Parameters:

data (list) – A list of dictionaries containing the CSV data.

Returns:

True if the data is valid, False otherwise.

Return type:

bool
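Since the function reports invalid data through its return value rather than by raising, callers need to check the result before processing. The sketch below shows one way to gate processing on it. The helper `process_if_valid` is hypothetical, and the validator is passed in so the snippet runs without pain001 installed; in real use you would pass `validate_csv_data`.

```python
from typing import Any, Callable

Rows = list[dict[str, Any]]


def process_if_valid(
    rows: Rows,
    validator: Callable[[Rows], bool],
    handler: Callable[[Rows], None],
) -> bool:
    """Run handler on rows only when validator accepts them.

    Hypothetical helper for illustration; returns whether rows were processed.
    """
    if not validator(rows):
        # Validation failed: skip processing and tell the caller.
        return False
    handler(rows)
    return True
```

The same gate can be applied per chunk when the rows come from a streaming loader, so that one invalid chunk does not require reloading the whole file.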

Module contents

CSV operations module for pain001.