pain001.csv package
Submodules
pain001.csv.load_csv_data module
- pain001.csv.load_csv_data.load_csv_data(file_path: str) → list[dict[str, Any]]
Load CSV data from a file.
- Parameters:
file_path (str) – The path to the CSV file.
- Returns:
A list of dictionaries containing the CSV data.
- Return type:
list[dict[str, Any]]
- Raises:
FileNotFoundError – If the file does not exist.
IOError – If there is an issue reading the file.
UnicodeDecodeError – If there is an issue decoding the file’s content.
ValueError – If the CSV file is empty.
Note
For large files, consider using load_csv_data_streaming() to reduce memory footprint.
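The contract documented above (one dictionary per row, with ValueError for an empty file) can be illustrated with a stand-in built on the standard library's csv.DictReader. This is only a sketch, not pain001's actual implementation: the name load_rows_sketch, the sample column names, the UTF-8 encoding, and the reading of "empty" as "no data rows" are all assumptions made for illustration.

```python
import csv
import os
import tempfile
from typing import Any


def load_rows_sketch(file_path: str) -> list[dict[str, Any]]:
    """Hypothetical stand-in mirroring the documented contract of load_csv_data()."""
    # open() raises FileNotFoundError for a missing path; bad bytes surface
    # as UnicodeDecodeError while the file is being read.
    with open(file_path, newline="", encoding="utf-8") as handle:
        rows = list(csv.DictReader(handle))
    # Assumption: "empty" means the file has no data rows.
    if not rows:
        raise ValueError(f"CSV file is empty: {file_path}")
    return rows


# Write a tiny sample file so the sketch runs end to end.
with tempfile.NamedTemporaryFile(
    "w", suffix=".csv", delete=False, newline="", encoding="utf-8"
) as tmp:
    tmp.write("id,amount\n1,10.00\n2,25.50\n")
    sample_path = tmp.name

try:
    rows = load_rows_sketch(sample_path)
finally:
    os.remove(sample_path)
```

Each row comes back as a dictionary keyed by the header line, with every value left as a string, which is how csv.DictReader behaves; any type conversion would be a separate step.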
- pain001.csv.load_csv_data.load_csv_data_streaming(file_path: str, chunk_size: int = 1000) → Generator[list[dict[str, Any]], None, None]
Load CSV data from a file in chunks for memory-efficient processing.
This function yields chunks of CSV data instead of loading the entire file into memory, making it suitable for large files.
- Parameters:
file_path (str) – The path to the CSV file.
chunk_size (int) – Number of rows to yield per chunk. Default is 1000.
- Yields:
list – A list of dictionaries containing up to chunk_size rows of CSV data.
- Raises:
FileNotFoundError – If the file does not exist.
IOError – If there is an issue reading the file.
UnicodeDecodeError – If there is an issue decoding the file’s content.
ValueError – If the CSV file is empty.
Example
>>> for chunk in load_csv_data_streaming('large_file.csv', chunk_size=500):
...     # Process chunk
...     process_payment_batch(chunk)
- Performance:
Memory usage: roughly 90% lower than load_csv_data() for large files (10,000+ rows)
Enables processing of files larger than available RAM
Slightly slower than load_csv_data() due to yielding overhead
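The memory saving described above comes from holding only one chunk in memory at a time. A minimal sketch of that technique follows, using csv.DictReader with itertools.islice. It is not the library's actual implementation: the name iter_chunks_sketch, the in-memory sample data, and the chunk_size check are hypothetical.

```python
import csv
import io
from itertools import islice
from typing import Any, Generator, TextIO


def iter_chunks_sketch(
    handle: TextIO, chunk_size: int = 1000
) -> Generator[list[dict[str, Any]], None, None]:
    """Hypothetical chunked reader: yields lists of at most chunk_size rows."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    reader = csv.DictReader(handle)
    while True:
        # islice pulls at most chunk_size rows from the reader, so only
        # one chunk is materialised at any moment.
        chunk = list(islice(reader, chunk_size))
        if not chunk:
            break
        yield chunk


# Five data rows read in chunks of two: the final chunk is shorter.
sample = io.StringIO("id,amount\n1,10\n2,20\n3,30\n4,40\n5,50\n")
chunk_sizes = [len(chunk) for chunk in iter_chunks_sketch(sample, chunk_size=2)]
```

In this sketch the last chunk holds whatever rows remain, so callers should not assume every chunk has exactly chunk_size entries.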
pain001.csv.validate_csv_data module
Module contents
CSV operations module for pain001.