Overview
The .chem.mutate() method is a custom pandas accessor that enables conditional assignment of values to a column in a DataFrame using a query string. It allows for clean, chainable DataFrame transformations and always returns a modified copy of the DataFrame.
This is particularly useful when working with chemistry-related tabular data and transformations, but it can be applied more broadly.
For a concrete example of usage, refer to https://pychemist.com/mutate/.
Accessor Registration
This method is registered using the pandas API extensions system. This means you can access the method via:

    df.chem.mutate(...)

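As a rough illustration of how such an accessor is registered, the sketch below uses `pd.api.extensions.register_dataframe_accessor`. The accessor name `chem_sketch`, the class name, and the `n_rows` method are hypothetical and chosen so they do not clash with the real `.chem` accessor; only the `self._obj` attribute name follows the Internals section of this page.

```python
import pandas as pd


# Hypothetical accessor name, used only for this illustration.
@pd.api.extensions.register_dataframe_accessor("chem_sketch")
class ChemSketchAccessor:
    def __init__(self, pandas_obj: pd.DataFrame) -> None:
        # pandas passes in the DataFrame the accessor is attached to.
        self._obj = pandas_obj

    def n_rows(self) -> int:
        # Trivial placeholder method; a real accessor would expose mutate().
        return len(self._obj)


df = pd.DataFrame({"mass": [5.0, 12.0, 20.0]})
print(df.chem_sketch.n_rows())  # 3
```

Once the class is registered, every DataFrame exposes the accessor under the registered name, which is how `df.chem` becomes available.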
Method Signature

    df.chem.mutate(query_str, column, value, other=None)

Parameters
| Parameter | Type | Description |
|---|---|---|
| query_str | str | A pandas query string used to select rows that satisfy the condition. |
| column | str | The column to update or create. |
| value | scalar or array-like | The value(s) assigned to rows that meet the query condition. |
| other | scalar or array-like, optional | The value(s) assigned to rows not meeting the condition. If None, rows not matching the condition are left unchanged. |
Returns
- A modified copy of the original `pd.DataFrame`.
This method does not modify the DataFrame in-place.
Behavior
- The method evaluates `query_str` on the DataFrame.
- For rows matching the query, the specified column is set to `value`.
- For rows not matching the query, the column is set to `other` only if `other` is provided.
- The column is created if it does not already exist.
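The rules above can be reproduced with plain pandas, which may help when reasoning about results. This is a sketch of equivalent behavior, not the accessor's own code. The sample data and the `size_class` column are made up for illustration. In plain pandas, rows of a newly created column that receive no value end up as NaN; whether `.chem.mutate()` does the same is not stated here.

```python
import pandas as pd

df = pd.DataFrame({"mass": [5.0, 12.0, 20.0], "label": ["?", "?", "?"]})

out = df.copy()                      # work on a copy; df stays untouched
mask = out.eval("mass > 10")         # boolean Series: True where the query matches

# Equivalent of other=None: only matching rows of an existing column change.
out.loc[mask, "label"] = "heavy"

# Equivalent of passing both value and other, on a column that did not exist.
out.loc[mask, "size_class"] = "large"
out.loc[~mask, "size_class"] = "small"

print(out)
```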
Notes
- This method is part of a custom accessor named `.chem`.
- The original DataFrame is left unchanged.
- `value` and `other` can be scalars or array-like; array-like values must match the number of rows being assigned.
- You can use it in method chains, e.g.:

    df2 = df.chem.mutate(query_str="mass > 10", column="label", value="heavy", other="light")

Or simply:

    df2 = df.chem.mutate("mass > 10", "label", "heavy", "light")

Example
See https://pychemist.com/mutate/ for a practical example using .chem.mutate().
Common Use Cases
- Labeling chemical species by some threshold: `df.chem.mutate("concentration > 1.0", "status", "high", "low")`
- Creating a new boolean flag: `df.chem.mutate("pH < 7", "is_acidic", True, other=False)`
Error Handling
- Any syntax error in `query_str` will raise a pandas exception at runtime.
- If `value` or `other` is array-like but does not match the shape of the selected rows, a `ValueError` will be raised.
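Both failure modes can be observed with plain pandas calls. The sketch below assumes that `.chem.mutate()` lets the underlying pandas errors propagate, which this page implies but does not spell out. The `capture` helper and the sample data are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"mass": [5.0, 12.0, 20.0], "label": ["?", "?", "?"]})


def capture(fn):
    """Run fn and return the exception it raises, or None if it succeeds."""
    try:
        fn()
    except Exception as exc:
        return exc
    return None


# A malformed query string fails when pandas tries to parse it.
query_error = capture(lambda: df.query("mass >"))


def assign_wrong_length():
    out = df.copy()
    mask = out.eval("mass > 10")              # selects 2 rows
    out.loc[mask, "label"] = ["a", "b", "c"]  # 3 values for 2 rows


# A length mismatch between the selected rows and the values fails on assignment.
length_error = capture(assign_wrong_length)

print(type(query_error).__name__, type(length_error).__name__)
```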
Internals
Internally, the method:
- Copies the DataFrame (`df = self._obj.copy()`),
- Applies the query string using `df.query(query_str)`,
- Locates matching and non-matching row indices,
- Assigns `value` and optionally `other` to the specified column,
- Returns the modified DataFrame.
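The steps above could be sketched as a standalone function. This is a minimal reconstruction from the list, not the library's actual source. The name `mutate_sketch` is hypothetical, and the sketch assumes the DataFrame has unique index labels so that label-based selection is unambiguous.

```python
import pandas as pd


def mutate_sketch(df, query_str, column, value, other=None):
    """Illustrative re-implementation of the listed steps (not the real code)."""
    out = df.copy()                                   # 1. copy the DataFrame
    matched = out.query(query_str).index              # 2. apply the query string
    unmatched = out.index[~out.index.isin(matched)]   # 3. non-matching labels, in order
    out.loc[matched, column] = value                  # 4. assign value ...
    if other is not None:
        out.loc[unmatched, column] = other            #    ... and optionally other
    return out                                        # 5. return the modified copy


df = pd.DataFrame({"mass": [5.0, 12.0, 20.0]})
df2 = mutate_sketch(df, "mass > 10", "label", "heavy", "light")
print(df2)
```

Because the function works on a copy, `df` keeps its original columns and only `df2` gains the `label` column.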
