<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Manuals</title>
	<atom:link href="https://pychemist.com/category/manuals/feed/" rel="self" type="application/rss+xml" />
	<link>https://pychemist.com</link>
	<description>Pychemist</description>
	<lastBuildDate>Tue, 29 Jul 2025 10:12:24 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.8.2</generator>

<image>
	<url>https://pychemist.com/wp-content/uploads/2025/07/cropped-mini-logo1-01-32x32.png</url>
	<title>Manuals</title>
	<link>https://pychemist.com</link>
	<width>32</width>
	<height>32</height>
</image> 
	<item>
		<title>Manual: .chem.lead() Method for Creating Lead Variables</title>
		<link>https://pychemist.com/manual-chem-lead/</link>
					<comments>https://pychemist.com/manual-chem-lead/#respond</comments>
		
		<dc:creator><![CDATA[Pychemist]]></dc:creator>
		<pubDate>Tue, 29 Jul 2025 07:41:24 +0000</pubDate>
				<category><![CDATA[Manuals]]></category>
		<guid isPermaLink="false">https://pychemist.com/?p=265</guid>

					<description><![CDATA[Overview The .chem.lead() method is a custom pandas accessor that creates lead versions of one or more variables in a [&#8230;]]]></description>
										<content:encoded><![CDATA[
<h2 class="wp-block-heading">Overview</h2>



<p>The <code>.chem.lead()</code> method is a custom pandas accessor that creates <strong>lead</strong> versions of one or more variables in a DataFrame. It works by shifting the time key of the specified columns <strong>backward</strong> by the chosen number of periods, so that each row picks up the value observed in a future period, and then merging the result back onto the original DataFrame.</p>



<p>This is especially useful for panel or time-series data where each row represents an observation at a time point for a particular unit (e.g., experiment, company, individual).</p>



<p>For a concrete example of usage, refer to <strong><a href="https://pychemist.com/creating-lag-and-lead-variables/">https://pychemist.com/creating-lag-and-lead-variables/</a></strong>.</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Accessor Registration</h2>



<p>This method is registered with pandas under the accessor name <code>.chem</code>.</p>



<p>Access the method like this:</p>



<pre class="urvanov-syntax-highlighter-plain-tag">df.chem.lead(...)</pre>
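


<p>As a rough illustration of how an accessor of this kind can be registered, the sketch below uses pandas' <code>pd.api.extensions.register_dataframe_accessor</code>. The accessor name <code>chem_demo</code>, the class, and its <code>n_rows</code> method are hypothetical names chosen for this sketch; they are not part of the real <code>.chem</code> accessor.</p>



<pre class="urvanov-syntax-highlighter-plain-tag">import pandas as pd


# "chem_demo" is a hypothetical accessor name used only in this sketch,
# chosen so that it cannot clash with the real .chem accessor.
@pd.api.extensions.register_dataframe_accessor("chem_demo")
class ChemDemoAccessor:
    def __init__(self, pandas_obj):
        # pandas passes in the DataFrame the accessor was called on
        self._obj = pandas_obj

    def n_rows(self):
        """Return the number of rows of the wrapped DataFrame."""
        return len(self._obj)


df = pd.DataFrame({"mass": [10, 15, 20]})
print(df.chem_demo.n_rows())  # 3</pre>



<p>Once the decorated class has been defined or imported, every DataFrame exposes the accessor as an attribute, which is what makes calls of the form <code>df.chem.lead(...)</code> possible.</p>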



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Method Signature</h2>



<pre class="urvanov-syntax-highlighter-plain-tag">df.chem.lead(variables, identifier, time, shift=1, *, replace=False)</pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Parameters</h2>



<figure class="wp-block-table"><table class="has-fixed-layout"><thead><tr><th>Parameter</th><th>Type</th><th>Description</th></tr></thead><tbody><tr><td><strong>variables</strong></td><td><code>str</code> or <code>list of str</code></td><td>Column(s) for which to create lead versions.</td></tr><tr><td><strong>identifier</strong></td><td><code>str</code></td><td>The name of the column identifying individual units (e.g., subject ID or group).</td></tr><tr><td><strong>time</strong></td><td><code>str</code></td><td>The name of the time column. Used to shift values within groups.</td></tr><tr><td><strong>shift</strong></td><td><code>int</code>, default <code>1</code></td><td>Number of time periods to shift. Must be a <strong>positive</strong> integer.</td></tr><tr><td><strong>replace</strong></td><td><code>bool</code>, default <code>False</code></td><td>If <code>True</code>, overwrites existing lead columns. If <code>False</code>, raises an error if there&#8217;s a naming conflict.</td></tr></tbody></table></figure>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Returns</h2>



<ul class="wp-block-list">
<li>A modified copy of the original <code>pd.DataFrame</code>, with lead versions of the specified variables added as new columns.</li>
</ul>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Behavior</h2>



<ol class="wp-block-list">
<li>Validates parameter types and column existence.</li>



<li>Creates a shifted version of the selected variables by <strong>subtracting</strong> the <code>shift</code> from the <code>time</code> column.</li>



<li>Merges this lead DataFrame back into the original, using <code>identifier</code> and <code>time</code> as keys.</li>



<li>New columns are suffixed with:
<ul class="wp-block-list">
<li><code>_lead</code> for <code>shift=1</code></li>



<li><code>_leadN</code> for <code>shift=N</code> (e.g., <code>_lead3</code> for <code>shift=3</code>)</li>
</ul>
</li>



<li>If <code>replace=False</code> and a target lead column already exists, a <code>ValueError</code> is raised.</li>
</ol>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Notes</h2>



<ul class="wp-block-list">
<li>The original DataFrame remains unchanged.</li>



<li>Supports multiple variables and vectorized group-wise operations.</li>



<li>Useful for forecasting models or previewing future values.</li>



<li>For backward-looking operations, see <code>.chem.lag()</code>.</li>
</ul>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Example</h2>



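<pre class="urvanov-syntax-highlighter-plain-tag">import pandas as pd

# Assumes the package that registers the .chem accessor has already been imported.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "time": [1, 2, 3, 1, 2, 3],
    "mass": [10, 15, 20, 5, 10, 15]
})

# One-period lead of "mass" within each id
df2 = df.chem.lead(variables="mass", identifier="id", time="time", shift=1)</pre>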



<p>This will produce a new column called <code>mass_lead</code> with the mass value from the <strong>next</strong> time step (grouped by <code>id</code>).</p>



<p>Or more concisely:</p>



<pre class="urvanov-syntax-highlighter-plain-tag">df2 = df.chem.lead("mass", "id", "time")</pre>
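


<p>For reference, the values one would expect in <code>mass_lead</code> for this example can be reproduced with plain pandas. This is only a stand-in, not the accessor itself: it takes the next row within each <code>id</code>, which coincides with a one-period lead here only because the <code>time</code> values are consecutive and sorted.</p>



<pre class="urvanov-syntax-highlighter-plain-tag">import pandas as pd

df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "time": [1, 2, 3, 1, 2, 3],
    "mass": [10, 15, 20, 5, 10, 15],
})

# Within each id, take the mass from the following row.
# The last row of each id has no successor and becomes NaN.
expected = df.sort_values(["id", "time"]).assign(
    mass_lead=lambda d: d.groupby("id")["mass"].shift(-1)
)
print(expected["mass_lead"].tolist())  # [15.0, 20.0, nan, 10.0, 15.0, nan]</pre>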



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Common Use Cases</h2>



<ul class="wp-block-list">
<li>Creating future-looking predictors in modeling.</li>



<li>Forecast validation (comparing current and future state).</li>



<li>Detecting upcoming transitions or changes in sequence.</li>
</ul>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Error Handling</h2>



<ul class="wp-block-list">
<li><strong>TypeError</strong> if <code>variables</code> is not a string or list of strings.</li>



<li><strong>TypeError</strong> if <code>replace</code> is not a boolean.</li>



<li><strong>TypeError</strong> if <code>shift</code> is not a positive integer.</li>



<li><strong>ValueError</strong> if a specified column does not exist.</li>



<li><strong>ValueError</strong> if a lead column already exists and <code>replace=False</code>.</li>
</ul>
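


<p>The checks listed above could be arranged roughly as follows. The function <code>validate_lead_args</code> and its error messages are hypothetical; only the exception types follow the list, and the order of the checks is an assumption.</p>



<pre class="urvanov-syntax-highlighter-plain-tag">import pandas as pd


def validate_lead_args(df, variables, identifier, time, shift=1, replace=False):
    """Hypothetical validator mirroring the error list above."""
    if isinstance(variables, str):
        variables = [variables]
    if not isinstance(variables, list) or not all(isinstance(v, str) for v in variables):
        raise TypeError("variables must be a string or a list of strings")
    if not isinstance(replace, bool):
        raise TypeError("replace must be a boolean")
    # bool is a subclass of int, so it is rejected explicitly;
    # max(shift, 1) != shift holds exactly when shift is zero or negative.
    if isinstance(shift, bool) or not isinstance(shift, int) or max(shift, 1) != shift:
        raise TypeError("shift must be a positive integer")
    missing = [c for c in variables + [identifier, time] if c not in df.columns]
    if missing:
        raise ValueError(f"columns not found: {missing}")
    suffix = "_lead" if shift == 1 else f"_lead{shift}"
    clashes = [v + suffix for v in variables if v + suffix in df.columns]
    if clashes and not replace:
        raise ValueError(f"lead columns already exist: {clashes}")
    return variables


df = pd.DataFrame({"id": [1, 1], "time": [1, 2], "mass": [10, 15]})
print(validate_lead_args(df, "mass", "id", "time"))  # ['mass']

try:
    validate_lead_args(df, "mass", "id", "time", shift=0)
except TypeError as exc:
    print("TypeError:", exc)</pre>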



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Internals</h2>



<p>The method:</p>



<ol class="wp-block-list">
<li>Copies the relevant subset of the DataFrame.</li>



<li>Shifts the <code>time</code> column <strong>backward</strong> (<code>df[time] - shift</code>) to align future values.</li>



<li>Applies suffixes such as <code>_lead</code> or <code>_leadN</code>.</li>



<li>Merges the lead values back into the original DataFrame using <code>pd.merge(...)</code>.</li>
</ol>
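


<p>The four steps above can be sketched in plain pandas. This is a minimal illustration under stated assumptions, not the library's actual code: the function name <code>lead_sketch</code> is hypothetical, a left merge is assumed, and the validation and <code>replace</code> handling are omitted.</p>



<pre class="urvanov-syntax-highlighter-plain-tag">import pandas as pd


def lead_sketch(df, variables, identifier, time, shift=1):
    """Hypothetical stand-in for the steps listed above."""
    if isinstance(variables, str):
        variables = [variables]
    suffix = "_lead" if shift == 1 else f"_lead{shift}"

    # 1. Copy the relevant subset of the DataFrame.
    shifted = df[[identifier, time] + variables].copy()
    # 2. Move each time stamp back by `shift`, so that the value
    #    observed at t + shift is keyed to t.
    shifted[time] = shifted[time] - shift
    # 3. Apply the suffix to the value columns.
    shifted = shifted.rename(columns={v: v + suffix for v in variables})
    # 4. Left-merge onto the original (assumed); rows without a
    #    matching future observation receive NaN.
    return pd.merge(df, shifted, on=[identifier, time], how="left")


df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "time": [1, 2, 3, 1, 2, 3],
    "mass": [10, 15, 20, 5, 10, 15],
})
out = lead_sketch(df, "mass", "id", "time")
print(out["mass_lead"].tolist())  # [15.0, 20.0, nan, 10.0, 15.0, nan]</pre>



<p>Because the merge returns a new object, <code>df</code> itself is left untouched, which is consistent with the note that the original DataFrame remains unchanged.</p>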



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">See Also</h2>



<ul class="wp-block-list">
<li><code>.chem.lag()</code> – for creating lagged (past) variables</li>



<li><code>.chem.mutate()</code> – for conditional column assignment</li>



<li><code>pd.DataFrame.shift()</code> – basic shifting</li>
</ul>



<hr class="wp-block-separator has-alpha-channel-opacity"/>
]]></content:encoded>
					
					<wfw:commentRss>https://pychemist.com/manual-chem-lead/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Manual: .chem.lag() Method for Creating Lagged Variables</title>
		<link>https://pychemist.com/manual-chem-lag/</link>
					<comments>https://pychemist.com/manual-chem-lag/#respond</comments>
		
		<dc:creator><![CDATA[Pychemist]]></dc:creator>
		<pubDate>Tue, 29 Jul 2025 07:35:39 +0000</pubDate>
				<category><![CDATA[Manuals]]></category>
		<guid isPermaLink="false">https://pychemist.com/?p=262</guid>

					<description><![CDATA[Overview The .chem.lag() method is a custom pandas accessor that creates lagged versions of one or more variables in a [&#8230;]]]></description>
										<content:encoded><![CDATA[
<h2 class="wp-block-heading">Overview</h2>



<p>The <code>.chem.lag()</code> method is a custom pandas accessor that creates lagged versions of one or more variables in a DataFrame. It operates by shifting the specified columns by a given number of time periods and merging the result back onto the original DataFrame.</p>



<p>This is especially useful for time-series panel data where each row belongs to a unique unit (e.g., company, experiment, patient) over time.</p>



<p>For a concrete example of usage, refer to <strong><a href="https://pychemist.com/creating-lag-and-lead-variables/">https://pychemist.com/creating-lag-and-lead-variables/</a></strong>.</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Accessor Registration</h2>



<p>This method is registered with pandas under the accessor name <code>.chem</code>.</p>



<p>Access the method like this:</p>



<pre class="urvanov-syntax-highlighter-plain-tag">df.chem.lag(...)</pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Method Signature</h2>



<pre class="urvanov-syntax-highlighter-plain-tag">df.chem.lag(variables, identifier, time, shift=1, *, replace=False)</pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Parameters</h2>



<figure class="wp-block-table"><table class="has-fixed-layout"><thead><tr><th>Parameter</th><th>Type</th><th>Description</th></tr></thead><tbody><tr><td><strong>variables</strong></td><td><code>str</code> or <code>list of str</code></td><td>Column(s) for which to create lagged versions.</td></tr><tr><td><strong>identifier</strong></td><td><code>str</code></td><td>The name of the column identifying individual units (e.g., subject ID or group).</td></tr><tr><td><strong>time</strong></td><td><code>str</code></td><td>The name of the time column. Used to shift values within groups.</td></tr><tr><td><strong>shift</strong></td><td><code>int</code>, default <code>1</code></td><td>Number of time periods to shift. Positive values create lags. Negative values are not allowed.</td></tr><tr><td><strong>replace</strong></td><td><code>bool</code>, default <code>False</code></td><td>If <code>True</code>, overwrites existing lagged columns. If <code>False</code>, raises an error if there&#8217;s a naming conflict.</td></tr></tbody></table></figure>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Returns</h2>



<ul class="wp-block-list">
<li>A modified copy of the original <code>pd.DataFrame</code>, with lagged versions of the specified variables added as new columns.</li>
</ul>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Behavior</h2>



<ol class="wp-block-list">
<li>Verifies input types and column existence.</li>



<li>Constructs a lagged version of the selected variables by shifting the <code>time</code> column forward by the given <code>shift</code> amount.</li>



<li>Merges this lagged DataFrame back onto the original, based on the <code>identifier</code> and <code>time</code>.</li>



<li>New columns are suffixed with:
<ul class="wp-block-list">
<li><code>_lag</code> for <code>shift=1</code></li>



<li><code>_lagN</code> for <code>shift=N</code> (e.g., <code>_lag3</code> for <code>shift=3</code>)</li>
</ul>
</li>



<li>If <code>replace=False</code> and any of the output columns already exist, the method raises a <code>ValueError</code>.</li>
</ol>
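


<p>One consequence of keying the lag on <code>time</code> rather than on row order shows up when a unit has gaps in its time series. The sketch below reproduces steps 2 and 3 in plain pandas (assuming a left merge, which the steps do not state) and contrasts the result with a row-based <code>groupby().shift()</code>. It illustrates the described behavior and is not the accessor's own code.</p>



<pre class="urvanov-syntax-highlighter-plain-tag">import pandas as pd

# Unit 1 has no observation at time 2.
df = pd.DataFrame({
    "id": [1, 1, 1],
    "time": [1, 3, 4],
    "mass": [10, 20, 25],
})

# Time-keyed lag: move the time stamps forward by one, then merge.
shifted = df[["id", "time", "mass"]].copy()
shifted["time"] = shifted["time"] + 1
shifted = shifted.rename(columns={"mass": "mass_lag"})
by_time = pd.merge(df, shifted, on=["id", "time"], how="left")

# Row-based lag: previous row within each id, whatever its time stamp.
by_row = df.groupby("id")["mass"].shift(1)

print(by_time["mass_lag"].tolist())  # [nan, nan, 20.0]
print(by_row.tolist())               # [nan, 10.0, 20.0]</pre>



<p>At <code>time=3</code> the time-keyed lag is missing because there is no observation at <code>time=2</code>, whereas the row-based shift silently carries over the value from <code>time=1</code>.</p>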



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Notes</h2>



<ul class="wp-block-list">
<li>The original DataFrame remains unchanged.</li>



<li>Supports multiple variables and vectorized operations.</li>



<li>Designed for panel or longitudinal data.</li>



<li>For negative shifts (lead variables), refer to <code>.chem.lead()</code>.</li>
</ul>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Example</h2>



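<pre class="urvanov-syntax-highlighter-plain-tag">import pandas as pd

# Assumes the package that registers the .chem accessor has already been imported.
df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "time": [1, 2, 3, 1, 2, 3],
    "mass": [10, 15, 20, 5, 10, 15]
})

# One-period lag of "mass" within each id
df_lagged = df.chem.lag(variables="mass", identifier="id", time="time", shift=1)</pre>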



<p>This will produce a new column called <code>mass_lag</code> with the mass value from the previous time step (by <code>id</code>).</p>



<p>Or more concisely:</p>



<pre class="urvanov-syntax-highlighter-plain-tag">df_lagged = df.chem.lag("mass", "id", "time")</pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Common Use Cases</h2>



<ul class="wp-block-list">
<li>Creating lagged predictors in time-series regression.</li>



<li>Modeling delayed effects in experiments.</li>



<li>Comparing values across sequential periods.</li>
</ul>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Error Handling</h2>



<ul class="wp-block-list">
<li><strong>TypeError</strong> if <code>variables</code> is not a string or list of strings.</li>



<li><strong>TypeError</strong> if <code>replace</code> is not a boolean.</li>



<li><strong>TypeError</strong> if <code>shift</code> is not a positive integer.</li>



<li><strong>ValueError</strong> if any specified variable is missing from the DataFrame.</li>



<li><strong>ValueError</strong> if a lagged column already exists and <code>replace=False</code>.</li>
</ul>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Internals</h2>



<p>The method:</p>



<ol class="wp-block-list">
<li>Creates a shifted copy of the target columns using <code>df[time] + shift</code>.</li>



<li>Applies suffixes like <code>_lag</code>, <code>_lag2</code>, etc.</li>



<li>Merges the lagged columns back onto the original DataFrame using <code>pd.merge(...)</code>.</li>
</ol>
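


<p>The steps above can be sketched in plain pandas, here with two variables and <code>shift=2</code> to show the <code>_lag2</code> suffix. The function name <code>lag_sketch</code> and the <code>temp</code> column are hypothetical, a left merge is assumed, and validation and <code>replace</code> handling are left out, so this is an illustration and not the library's code.</p>



<pre class="urvanov-syntax-highlighter-plain-tag">import pandas as pd


def lag_sketch(df, variables, identifier, time, shift=1):
    """Hypothetical illustration of the steps listed above."""
    if isinstance(variables, str):
        variables = [variables]
    suffix = "_lag" if shift == 1 else f"_lag{shift}"

    # 1. Shifted copy: a value observed at t is keyed to t + shift.
    shifted = df[[identifier, time] + variables].copy()
    shifted[time] = shifted[time] + shift
    # 2. Suffix the value columns (_lag, _lag2, ...).
    shifted = shifted.rename(columns={v: v + suffix for v in variables})
    # 3. Left-merge (assumed) so that every original row is kept.
    return pd.merge(df, shifted, on=[identifier, time], how="left")


df = pd.DataFrame({
    "id": [1, 1, 1, 2, 2, 2],
    "time": [1, 2, 3, 1, 2, 3],
    "mass": [10, 15, 20, 5, 10, 15],
    "temp": [20, 21, 22, 30, 31, 32],
})
out = lag_sketch(df, ["mass", "temp"], "id", "time", shift=2)
print(out.columns.tolist())
# ['id', 'time', 'mass', 'temp', 'mass_lag2', 'temp_lag2']
print(out["mass_lag2"].tolist())  # [nan, nan, 10.0, nan, nan, 5.0]</pre>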



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">See Also</h2>



<ul class="wp-block-list">
<li><code>.chem.mutate()</code> – for conditional column assignment</li>



<li><code>pd.DataFrame.shift()</code> – basic shifting</li>
</ul>



<hr class="wp-block-separator has-alpha-channel-opacity"/>
]]></content:encoded>
					
					<wfw:commentRss>https://pychemist.com/manual-chem-lag/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
		<item>
		<title>Manual: .chem.mutate() Method for Conditional DataFrame Updates</title>
		<link>https://pychemist.com/manual-chem-mutate/</link>
					<comments>https://pychemist.com/manual-chem-mutate/#respond</comments>
		
		<dc:creator><![CDATA[Pychemist]]></dc:creator>
		<pubDate>Mon, 28 Jul 2025 19:13:44 +0000</pubDate>
				<category><![CDATA[Manuals]]></category>
		<guid isPermaLink="false">https://pychemist.com/?p=259</guid>

					<description><![CDATA[Overview The .chem.mutate() method is a custom pandas accessor that enables conditional assignment of values to a column in a [&#8230;]]]></description>
										<content:encoded><![CDATA[
<h2 class="wp-block-heading">Overview</h2>



<p>The <code>.chem.mutate()</code> method is a custom pandas accessor that enables conditional assignment of values to a column in a DataFrame using a query string. It allows for clean, chainable DataFrame transformations and always returns a modified copy of the DataFrame.</p>



<p>This is particularly useful when working with chemistry-related tabular data and transformations, but it can be applied more broadly.</p>



<p>For a concrete example of usage, refer to <strong><a href="https://pychemist.com/mutate/">https://pychemist.com/mutate/</a></strong>.</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Accessor Registration</h2>



<p>This method is registered using the pandas API extensions system. This means you can access the method via:</p>



<pre class="urvanov-syntax-highlighter-plain-tag">df.chem.mutate(...)</pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Method Signature</h2>



<pre class="urvanov-syntax-highlighter-plain-tag">df.chem.mutate(query_str, column, value, other=None)</pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Parameters</h2>



<figure class="wp-block-table"><table class="has-fixed-layout"><thead><tr><th>Parameter</th><th>Type</th><th>Description</th></tr></thead><tbody><tr><td><strong>query_str</strong></td><td><code>str</code></td><td>A <a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.query.html">pandas query string</a> used to select rows that satisfy the condition.</td></tr><tr><td><strong>column</strong></td><td><code>str</code></td><td>The column to update or create.</td></tr><tr><td><strong>value</strong></td><td><code>scalar</code> or <code>array-like</code></td><td>The value(s) assigned to rows that meet the query condition.</td></tr><tr><td><strong>other</strong></td><td><code>scalar</code> or <code>array-like</code>, optional</td><td>The value(s) assigned to rows <strong>not</strong> meeting the condition. If <code>None</code>, rows not matching the condition are left unchanged.</td></tr></tbody></table></figure>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Returns</h2>



<ul class="wp-block-list">
<li>A modified copy of the original <code>pd.DataFrame</code>.</li>
</ul>



<p>This method does <strong>not</strong> modify the DataFrame in-place.</p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Behavior</h2>



<ol class="wp-block-list">
<li>The method evaluates the <code>query_str</code> on the DataFrame.</li>



<li>For rows <strong>matching the query</strong>, the specified column is set to <code>value</code>.</li>



<li>For rows <strong>not matching the query</strong>, the column is set to <code>other</code> <strong>only if <code>other</code> is provided</strong>.</li>



<li>The column is created if it does not already exist.</li>
</ol>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Notes</h2>



<ul class="wp-block-list">
<li>This method is part of a custom accessor named <code>.chem</code>.</li>



<li>The original DataFrame is left unchanged.</li>



<li>You can use it in method chains, e.g.: <code>df2 = df.chem.mutate(query_str="mass &gt; 10", column="label", value="heavy", other="light")</code></li>



<li><code>value</code> and <code>other</code> can be scalars or array-like, but must match the number of rows being assigned.</li>
</ul>



<p>Or simply:</p>



<pre class="urvanov-syntax-highlighter-plain-tag">df2 = df.chem.mutate("mass &gt; 10", "label", "heavy", "light")</pre>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Example</h2>



<p><em>See <a href="https://pychemist.com/mutate/">https://pychemist.com/mutate/</a> for a practical example using <code>.chem.mutate()</code>.</em></p>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Common Use Cases</h2>



<ul class="wp-block-list">
<li>Labeling chemical species by some threshold: <code>df.chem.mutate("concentration &gt; 1.0", "status", "high", "low")</code></li>



<li>Creating a new boolean flag: <code>df.chem.mutate("pH &lt; 7", "is_acidic", True, other=False)</code></li>
</ul>



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Error Handling</h2>



<ul class="wp-block-list">
<li>A syntax error in <code>query_str</code> raises an exception when pandas evaluates the query at runtime.</li>



<li>If <code>value</code> or <code>other</code> are array-like but do not match the shape of the selected rows, a <code>ValueError</code> will be raised.</li>
</ul>
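


<p>Both failure modes can be reproduced with plain pandas, without the accessor. The sketch below assumes that <code>.chem.mutate()</code> passes the query to <code>DataFrame.query</code> and assigns through <code>.loc</code>; the DataFrame and column names are made up for the illustration.</p>



<pre class="urvanov-syntax-highlighter-plain-tag">import pandas as pd

df = pd.DataFrame({"mass": [5, 10, 20], "label": ["a", "b", "c"]})

# A malformed query (the comparison has no right-hand side) fails
# as soon as pandas tries to evaluate it.
query_failed = False
try:
    df.query("mass ==")
except Exception as exc:
    query_failed = True
    print("query failed with", type(exc).__name__)

# Two rows match the condition, but three values are supplied.
rows = df.query("mass != 5").index
length_error = None
try:
    df.loc[rows, "label"] = ["x", "y", "z"]
except ValueError as exc:
    length_error = exc
    print("ValueError:", exc)</pre>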



<hr class="wp-block-separator has-alpha-channel-opacity"/>



<h2 class="wp-block-heading">Internals</h2>



<p>Internally, the method:</p>



<ol class="wp-block-list">
<li>Copies the DataFrame (<code>df = self._obj.copy()</code>),</li>



<li>Applies the query string using <code>df.query(query_str)</code>,</li>



<li>Locates matching and non-matching row indices,</li>



<li>Assigns <code>value</code> and optionally <code>other</code> to the specified column,</li>



<li>Returns the modified DataFrame.</li>
</ol>
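


<p>The steps above can be sketched as a standalone function. This is a minimal illustration under stated assumptions, not the library's code: <code>mutate_sketch</code> is a hypothetical name, the DataFrame is passed in as <code>obj</code> instead of being read from <code>self._obj</code>, and a unique index is assumed so that rows can be addressed by label.</p>



<pre class="urvanov-syntax-highlighter-plain-tag">import pandas as pd


def mutate_sketch(obj, query_str, column, value, other=None):
    """Hypothetical stand-in for the steps listed above."""
    df = obj.copy()                            # 1. work on a copy
    matched = df.query(query_str).index        # 2. rows satisfying the query
    unmatched = df.index.difference(matched)   # 3. all remaining rows
    if column not in df.columns:
        df[column] = pd.NA                     # create the column if missing
    df.loc[matched, column] = value            # 4. assign to matching rows
    if other is not None:
        df.loc[unmatched, column] = other      #    and optionally to the rest
    return df                                  # 5. return the modified copy


df = pd.DataFrame({
    "name": ["NaCl", "H2O", "Fe"],
    "phase": ["solid", "liquid", "solid"],
})
df2 = mutate_sketch(df, "phase == 'solid'", "handling", "weigh", other="pipette")
print(df2["handling"].tolist())   # ['weigh', 'pipette', 'weigh']
print("handling" in df.columns)   # False, the original is unchanged</pre>



<p>Leaving out <code>other</code> keeps the non-matching rows as they were, which for a newly created column means missing values.</p>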



<hr class="wp-block-separator has-alpha-channel-opacity"/>
]]></content:encoded>
					
					<wfw:commentRss>https://pychemist.com/manual-chem-mutate/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
			</item>
	</channel>
</rss>
