Salurefunctions helpers

SalureFunctions

This module provides the SalureFunctions class, a collection of utility functions for data processing and error handling.

The class contains the following functions:

  1. applymap(key: pd.Series, mapping: dict, default=None): Maps a dataframe column to new values according to the specified mapping. It takes three parameters: key is the input column to map, mapping is the dictionary used for the lookup, and default is the fallback value for keys that are not in the mapping dictionary. If default is not specified, the function returns the key itself for unmapped values.
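The behaviour described for applymap can be sketched in plain pandas. This is only an illustration built from the description above, not the library's implementation: the function name applymap_sketch and the sample department data are hypothetical.

```python
import pandas as pd


def applymap_sketch(key: pd.Series, mapping: dict, default=None) -> pd.Series:
    """Hypothetical stand-in that mimics the documented applymap behaviour."""
    # Without an explicit default, fall back to the original key value.
    fallback = key if default is None else default
    # Keep the mapped value where the key exists in the mapping,
    # otherwise use the fallback.
    return key.map(mapping).where(key.isin(list(mapping)), fallback)


# Made-up sample data for demonstration
departments = pd.Series(["HR", "IT", "OPS"])
codes = {"HR": "Human Resources", "IT": "Information Technology"}

kept = applymap_sketch(departments, codes)
filled = applymap_sketch(departments, codes, default="Unknown")
print(kept.tolist())    # 'OPS' has no mapping, so the key itself is kept
print(filled.tolist())  # 'OPS' has no mapping, so the default is used
```

Whether the real function treats missing values or non-string keys the same way is not stated in this documentation, so check the source before relying on those cases.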

  2. detect_changes_between_dataframes(df_old: pd.DataFrame, df_actual: pd.DataFrame, check_columns: list, unique_key: str, keep_old_values: Union[str, bool] = False, detect_column_changes = False): Compares two dataframes (old and actual) and detects changes between them. df_old and df_actual are the old and actual dataframes, check_columns is the list of columns to check for changes, unique_key is the unique key column used for grouping data, keep_old_values is an optional parameter that determines how old values are handled, and detect_column_changes is an optional boolean that enables detection of column changes between the dataframes. The function returns a dataframe with the new columns change_type and changed_fields, indicating the type of change and the changed fields respectively.

Here's an example of how to use the detect_changes_between_dataframes method. Let's say we have two dataframes df_old and df_actual representing data from two different days, and we want to detect any changes between them.

python
import pandas as pd
from salure_functions import SalureFunctions

# Sample data for demonstration
data_old = {
    'id': [1, 2, 3],
    'name': ['Alice', 'Bob', 'Cathy'],
    'age': [25, 30, 22],
}
data_actual = {
    'id': [1, 2, 3, 4],
    'name': ['Alice', 'Bob', 'Cathy', 'David'],
    'age': [25, 31, 22, 28],
}
df_old = pd.DataFrame(data_old)
df_actual = pd.DataFrame(data_actual)

# Detect changes between the two dataframes
check_columns = ['name', 'age']
unique_key = 'id'
keep_old_values = 'dict'
result = SalureFunctions.detect_changes_between_dataframes(df_old, df_actual, check_columns, unique_key, keep_old_values)
print(result)

In this example, we have two dataframes df_old and df_actual. We want to compare the columns 'name' and 'age' using the 'id' column as the unique identifier. The keep_old_values parameter is set to 'dict' which will store the changed fields and their old values in a dictionary.

The output of this example would look like:

python
   id   name  age  flag_old  freq change_type        changes
1   2    Bob   31         0     2      edited  {'age': '30'}
3   4  David   28         0     1         new             {}

This shows that the 'age' of Bob (id=2) has changed from 30 to 31, and David (id=4) is a new entry.
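To make the comparison logic more concrete, here is a rough sketch of the same idea using only pandas. It is not how SalureFunctions implements the method: it swaps in a left merge with pandas' indicator option, covers only the 'new' and 'edited' cases shown in the example, and the 'unchanged' label and the classify helper are assumptions made for illustration.

```python
import pandas as pd

# Same sample data as in the example above
df_old = pd.DataFrame({'id': [1, 2, 3],
                       'name': ['Alice', 'Bob', 'Cathy'],
                       'age': [25, 30, 22]})
df_actual = pd.DataFrame({'id': [1, 2, 3, 4],
                          'name': ['Alice', 'Bob', 'Cathy', 'David'],
                          'age': [25, 31, 22, 28]})
check_columns = ['name', 'age']

# Align the actual rows with their old counterparts on the unique key.
# The _merge column records whether a row also exists in df_old.
merged = df_actual.merge(df_old, on='id', how='left',
                         suffixes=('', '_old'), indicator=True)


def classify(row):
    """Hypothetical helper: label one merged row."""
    if row['_merge'] == 'left_only':
        return 'new'  # the key does not occur in df_old
    changed = any(row[col] != row[f'{col}_old'] for col in check_columns)
    return 'edited' if changed else 'unchanged'


merged['change_type'] = merged.apply(classify, axis=1)

# Keep only the rows that actually changed
changes = merged.loc[merged['change_type'] != 'unchanged',
                     ['id', 'name', 'age', 'change_type']]
print(changes)
```

The sketch ignores rows that exist only in df_old and does not record the old values, which the keep_old_values parameter handles in the real method.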

  3. scheduler_error_handling(): [Documentation missing. Please provide more information about the function.]

  4. convert_empty_columns_type(): [Documentation missing. Please provide more information about the function.]

  5. dfdate_to_datetime(): [Documentation missing. Please provide more information about the function.]

  6. send_error_to_slack(): [Documentation missing. Please provide more information about the function.]

  7. gen_dict_extract(): [Documentation missing. Please provide more information about the function.]

  8. archive_old_files(): [Documentation missing. Please provide more information about the function.]

  9. df_to_xlsx(): [Documentation missing. Please provide more information about the function.]

  10. zip_files(): [Documentation missing. Please provide more information about the function.]

  11. intervalmatch_dates(): [Documentation missing. Please provide more information about the function.]