Overview
The .chem.mutate() method belongs to a custom pandas accessor and conditionally assigns values to a DataFrame column using a query string. It allows for clean, chainable DataFrame transformations and always returns a modified copy of the DataFrame.
This is particularly useful when working with chemistry-related tabular data and transformations, but it can be applied more broadly.
For a concrete example of usage, refer to https://pychemist.com/mutate/.
Accessor Registration
This method is registered using the pandas API extensions system. This means you can access the method via:
    df.chem.mutate(...)
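As a minimal sketch of how this kind of registration works in general, the snippet below uses pandas' `register_dataframe_accessor` decorator. The accessor name `demo` and the method `n_rows` are hypothetical and chosen only for illustration; they are not part of the `.chem` accessor.

```python
import pandas as pd


# Hypothetical accessor name "demo", used only to illustrate the mechanism.
@pd.api.extensions.register_dataframe_accessor("demo")
class DemoAccessor:
    def __init__(self, pandas_obj: pd.DataFrame) -> None:
        # pandas passes in the DataFrame the accessor was called on.
        self._obj = pandas_obj

    def n_rows(self) -> int:
        """Return the number of rows in the wrapped DataFrame."""
        return len(self._obj)


df = pd.DataFrame({"mass": [5.0, 12.0, 20.0]})
rows = df.demo.n_rows()
```

Once the class is registered, every DataFrame exposes the accessor under the chosen name, which is the same mechanism that makes `df.chem` available.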
Method Signature
    df.chem.mutate(query_str, column, value, other=None)
Parameters
Parameter | Type | Description |
---|---|---|
query_str | str | A pandas query string used to select rows that satisfy the condition. |
column | str | The column to update or create. |
value | scalar or array-like | The value(s) assigned to rows that meet the query condition. |
other | scalar or array-like, optional | The value(s) assigned to rows not meeting the condition. If None, rows not matching the condition are left unchanged. |
Returns
- A modified copy of the original pd.DataFrame.
This method does not modify the DataFrame in place.
Behavior
- The method evaluates the query_str on the DataFrame.
- For rows matching the query, the specified column is set to value.
- For rows not matching the query, the column is set to other only if other is provided.
- The column is created if it does not already exist.
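The rules above can be illustrated with plain pandas, without the accessor. This is only a sketch of the described behavior, not the library's code; the sample data and the variable names `with_other` and `without_other` are assumptions made for the example.

```python
import pandas as pd

df = pd.DataFrame({"mass": [5.0, 12.0, 20.0], "label": ["?", "?", "?"]})

# Boolean mask marking the rows that satisfy the query string.
mask = df.eval("mass > 10")

# Case 1: `other` is provided, so every row receives a value.
with_other = df.copy()
with_other.loc[mask, "label"] = "heavy"
with_other.loc[~mask, "label"] = "light"

# Case 2: `other` is omitted, so non-matching rows keep their current value.
without_other = df.copy()
without_other.loc[mask, "label"] = "heavy"
```

In both cases the original `df` keeps its `"?"` labels, because the assignments are made on copies.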
Notes
- This method is part of a custom accessor named .chem.
- The original DataFrame is left unchanged.
- value and other can be scalars or array-like, but must match the number of rows being assigned.
- You can use it in method chains, e.g.:
    df2 = df.chem.mutate(query_str="mass > 10", column="label", value="heavy", other="light")
Or simply:
    df2 = df.chem.mutate("mass > 10", "label", "heavy", "light")
Example
See https://pychemist.com/mutate/ for a practical example using .chem.mutate().
Common Use Cases
- Labeling chemical species by some threshold: df.chem.mutate("concentration > 1.0", "status", "high", "low")
- Creating a new boolean flag: df.chem.mutate("pH < 7", "is_acidic", True, other=False)
Error Handling
- Any syntax errors in query_str will raise a pandas exception at runtime.
- If value or other is array-like but does not match the shape of the selected rows, a ValueError will be raised.
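Both failure modes can be reproduced with the underlying pandas calls alone. The sketch below assumes the method relies on `DataFrame.query` and `.loc` assignment, as the Internals section describes; the sample data and the names `query_error` and `length_error` are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"mass": [5.0, 12.0, 20.0], "label": ["?", "?", "?"]})

# A malformed query string fails when pandas parses the expression.
query_error = None
try:
    df.query("mass >")
except Exception as exc:  # the exact exception type depends on the failure
    query_error = exc

# Two rows match "mass > 10", but three values are supplied.
mask = df.eval("mass > 10")
tmp = df.copy()
length_error = None
try:
    tmp.loc[mask, "label"] = ["a", "b", "c"]
except ValueError as exc:
    length_error = exc
```

Catching these exceptions around a call is one way to validate user-supplied query strings or value arrays before continuing a method chain.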
Internals
Internally, the method:
- Copies the DataFrame (df = self._obj.copy()),
- Applies the query string using df.query(query_str),
- Locates matching and non-matching row indices,
- Assigns value and optionally other to the specified column,
- Returns the modified DataFrame.
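The steps above can be sketched as a small accessor. This is an illustrative reconstruction under stated assumptions, not the library's source: the accessor name `chem_sketch` is hypothetical (chosen so it cannot clash with the real `.chem`), the index is assumed to hold unique labels, and a missing column is created explicitly with object dtype to keep the example simple.

```python
from typing import Any

import pandas as pd


# Hypothetical accessor name; the real accessor is registered as ".chem".
@pd.api.extensions.register_dataframe_accessor("chem_sketch")
class ChemSketchAccessor:
    def __init__(self, pandas_obj: pd.DataFrame) -> None:
        self._obj = pandas_obj

    def mutate(self, query_str: str, column: str, value: Any, other: Any = None) -> pd.DataFrame:
        # Step 1: work on a copy so the caller's DataFrame is untouched.
        df = self._obj.copy()
        # Sketch-only choice: create a missing column up front as object dtype.
        if column not in df.columns:
            df[column] = pd.Series([None] * len(df), index=df.index, dtype=object)
        # Step 2: index labels of the rows that satisfy the query string.
        matching = df.query(query_str).index
        # Step 3: the remaining labels, kept in their original order
        # (assumes the index labels are unique).
        non_matching = df.index[~df.index.isin(matching)]
        # Step 4: assign `value`, and `other` only when it was provided.
        df.loc[matching, column] = value
        if other is not None:
            df.loc[non_matching, column] = other
        # Step 5: return the modified copy.
        return df


df = pd.DataFrame({"mass": [5.0, 12.0, 20.0]})
df2 = df.chem_sketch.mutate("mass > 10", "label", "heavy", "light")
df3 = df.chem_sketch.mutate("mass > 10", "is_heavy", True)
```

Here `df2` gains a fully populated `label` column, while `df3` leaves the non-matching row of `is_heavy` empty because `other` was not given; `df` itself is unchanged in both cases.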