SalureFunctions
The SalureFunctions class contains various utility functions for data processing and error handling in a data manipulation context. The class contains the following functions:
applymap(key: pd.Series, mapping: dict, default=None):
Maps a given column of a dataframe to new values according to the specified mapping. This function takes three parameters: key, the input column to which the mapping is applied; mapping, the dictionary used to look up the new values; and default, the fallback value used when a key is not found in the mapping dictionary. If default is not specified, the function returns the original key.
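The lookup-with-fallback behaviour described for applymap can be illustrated with plain pandas. The following is only a minimal sketch under the assumption that the description above is complete; applymap_sketch is a hypothetical helper written for this example and is not the library's implementation.

```python
import pandas as pd


def applymap_sketch(key: pd.Series, mapping: dict, default=None) -> pd.Series:
    """Hypothetical stand-in that mimics the documented behaviour of applymap."""
    if default is None:
        # No default given: keep the original key when it is not in the mapping.
        return key.map(lambda k: mapping.get(k, k))
    # Default given: use it for every key that is not in the mapping.
    return key.map(lambda k: mapping.get(k, default))


departments = pd.Series(["HR", "IT", "OPS"])
mapping = {"HR": "Human Resources", "IT": "Information Technology"}

kept = applymap_sketch(departments, mapping)
filled = applymap_sketch(departments, mapping, default="Unknown")
print(kept.tolist())
print(filled.tolist())
```

Without a default, the unmapped value "OPS" passes through unchanged; with default="Unknown", it is replaced by the fallback.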
detect_changes_between_dataframes(df_old: pd.DataFrame, df_actual: pd.DataFrame, check_columns: list, unique_key: str, keep_old_values: Union[str, bool] = False, detect_column_changes = False):
Compares two dataframes (old and actual) and detects changes between them. The parameters are df_old and df_actual, the old and actual dataframes respectively; check_columns, a list of columns to be checked for changes; unique_key, the unique key column used for grouping data; keep_old_values, an optional parameter that determines how old values are handled; and detect_column_changes, an optional boolean parameter to detect column changes between the dataframes. The function returns a dataframe with new columns change_type and changed_fields, indicating the type of change and the changed fields respectively.
Here's an example of how to use the detect_changes_between_dataframes method. Let's say we have two dataframes, df_old and df_actual, representing data from two different days, and we want to detect any changes between them.
```python
import pandas as pd
from salure_functions import SalureFunctions

# Sample data for demonstration
data_old = {
    'id': [1, 2, 3],
    'name': ['Alice', 'Bob', 'Cathy'],
    'age': [25, 30, 22]
}
data_actual = {
    'id': [1, 2, 3, 4],
    'name': ['Alice', 'Bob', 'Cathy', 'David'],
    'age': [25, 31, 22, 28]
}
df_old = pd.DataFrame(data_old)
df_actual = pd.DataFrame(data_actual)

# Detect changes between the two dataframes
check_columns = ['name', 'age']
unique_key = 'id'
keep_old_values = 'dict'
result = SalureFunctions.detect_changes_between_dataframes(df_old, df_actual, check_columns, unique_key, keep_old_values)
print(result)
```
In this example, we have two dataframes, df_old and df_actual. We want to compare the columns 'name' and 'age' using the 'id' column as the unique identifier. The keep_old_values parameter is set to 'dict', which stores the changed fields and their old values in a dictionary.
The output of this example would look like:
```
   id   name  age  flag_old  freq change_type        changes
1   2    Bob   31         0     2      edited  {'age': '30'}
3   4  David   28         0     1         new             {}
```
This shows that the 'age' of Bob (id=2) has changed from 30 to 31, and David (id=4) is a new entry.
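To make the comparison logic more concrete, here is a minimal sketch of one way such change detection could be done with an outer merge in plain pandas. It is not the library's implementation: detect_changes_sketch is a hypothetical helper, the 'deleted' label is an assumption (the example output above only shows 'edited' and 'new'), and the helper flags changes only, without the flag_old, freq, or old-value columns.

```python
import pandas as pd


def detect_changes_sketch(df_old, df_actual, check_columns, unique_key):
    """Hypothetical helper: flag new, edited and deleted rows by unique key."""
    merged = df_actual.merge(
        df_old[[unique_key] + check_columns],
        on=unique_key,
        how="outer",
        suffixes=("", "_old"),
        indicator=True,  # adds a _merge column: left_only, right_only or both
    )
    rows = []
    for _, row in merged.iterrows():
        if row["_merge"] == "left_only":
            # Key exists only in df_actual.
            rows.append((row[unique_key], "new", []))
        elif row["_merge"] == "right_only":
            # Key exists only in df_old ('deleted' is an assumed label).
            rows.append((row[unique_key], "deleted", []))
        else:
            # Key exists in both: list the checked columns whose values differ.
            # Note: missing values (NaN) always compare as different here.
            changed = [c for c in check_columns if row[c] != row[f"{c}_old"]]
            if changed:
                rows.append((row[unique_key], "edited", changed))
    return pd.DataFrame(rows, columns=[unique_key, "change_type", "changed_fields"])


df_old = pd.DataFrame({"id": [1, 2, 3],
                       "name": ["Alice", "Bob", "Cathy"],
                       "age": [25, 30, 22]})
df_actual = pd.DataFrame({"id": [1, 2, 3, 4],
                          "name": ["Alice", "Bob", "Cathy", "David"],
                          "age": [25, 31, 22, 28]})

result = detect_changes_sketch(df_old, df_actual, ["name", "age"], "id")
print(result)
```

With the sample data from the example above, the sketch reports id 2 as edited (with 'age' as the changed field) and id 4 as new, which matches the rows in the documented output.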
scheduler_error_handling():
[Documentation missing. Please provide more information about the function.]
convert_empty_columns_type():
[Documentation missing. Please provide more information about the function.]
dfdate_to_datetime():
[Documentation missing. Please provide more information about the function.]
send_error_to_slack():
[Documentation missing. Please provide more information about the function.]
gen_dict_extract():
[Documentation missing. Please provide more information about the function.]
archive_old_files():
[Documentation missing. Please provide more information about the function.]
df_to_xlsx():
[Documentation missing. Please provide more information about the function.]
zip_files():
[Documentation missing. Please provide more information about the function.]
intervalmatch_dates():
[Documentation missing. Please provide more information about the function.]