Explore Module

The explore module provides comprehensive visualization tools for coal plant network analysis.

class retire.explore.explore.Explore(G: Graph, raw_df: DataFrame)[source]

Bases: object

Visualization and exploration class for coal plant network analysis.

The Explore class provides comprehensive visualization tools for analyzing coal plant networks and retirement strategies. It generates various plots including network graphs, heatmaps, geographic maps, and interactive visualizations to understand plant relationships and retirement patterns.

Parameters:
  • G (networkx.Graph) – Network graph of coal plants with nodes representing plant clusters and edges representing similarity relationships.

  • raw_df (pandas.DataFrame) – Raw dataset containing coal plant characteristics, retirement status, and contextual vulnerability information.

G

The network graph used for analysis and visualization.

Type:

networkx.Graph

raw_df

The raw coal plant dataset.

Type:

pandas.DataFrame

Examples

>>> from retire import Retire
>>> from retire.explore import Explore
>>> retire_obj = Retire()
>>> explore = Explore(retire_obj.graph, retire_obj.raw_df)
>>> fig, ax = explore.drawGraph(col='ret_STATUS')
>>> fig, ax = explore.drawMap()

__init__(G: Graph, raw_df: DataFrame)[source]

Initialize the Explore visualization object.

Parameters:
  • G (networkx.Graph) – Network graph of coal plants with nodes and edges representing plant relationships based on similarity metrics.

  • raw_df (pandas.DataFrame) – Raw dataset containing coal plant information including characteristics, retirement status, and contextual factors.

drawGraph(col: str = None, pos: Dict[str, ndarray] = None, title: str = None, size: tuple = (8, 6), show_colorbar: bool = False, color_method: str = 'average', show_node_labels: bool = False)[source]

Visualize a THEMA-generated NetworkX graph with optional node coloring.

Parameters:
  • col (str, optional) – Column in raw data to use for node coloring when color_method='average'. If None, no coloring is applied.

  • pos (dict, optional) – Node positions for layout. If None, spring layout is used.

  • title (str, optional) – Title for the plot. If None, no title is displayed.

  • size (tuple, default=(8, 6)) – Figure size in inches.

  • show_colorbar (bool, default=False) – Whether to display a colorbar (only used if col is provided).

  • color_method ({"average", "community"}, default="average") – Method for coloring nodes: by attribute average or by community.

  • show_node_labels (bool, default=False) – Whether to display node labels.

Returns:

  • fig (matplotlib.figure.Figure) – The created matplotlib figure.

  • ax (matplotlib.axes.Axes) – The created matplotlib axes.

drawComponent(component: int, col: str = 'ret_STATUS', pos: Dict[str, ndarray] = None, title=None, size: tuple = (8, 6), show_colorbar: bool = False, color_method: str = 'average', show_node_labels: bool = False)[source]

Draws a specific connected component of the graph.

Parameters:
  • component (int) – Index of the connected component to draw.

  • col (str, default='ret_STATUS') – Column in raw data to use for node coloring when color_method='average'. If None, no coloring is applied.

  • pos (dict, optional) – Node positions for layout. If None, spring layout is used.

  • title (str, optional) – Title for the plot. If None, no title is displayed.

  • size (tuple, default=(8, 6)) – Figure size in inches.

  • show_colorbar (bool, default=False) – Whether to display a colorbar (only used if col is provided).

  • color_method ({"average", "community"}, default="average") – Method for coloring nodes: by attribute average or by community.

  • show_node_labels (bool, default=False) – Whether to display node labels.

Returns:

  • fig (matplotlib.figure.Figure) – The created matplotlib figure.

  • ax (matplotlib.axes.Axes) – The created matplotlib axes.

drawPathDistance(component: int, targets: Dict[str, float], distances_dict: Dict[str, float], title='', seed=5, show_colorbar=True, size_by_degree=True, scaled_legend=False, vmax=2.5, figsize=(10, 4))[source]

Visualize shortest path distances to target nodes in a network component.

Creates a network visualization showing the shortest path distances from all nodes to a set of target nodes within a specified connected component. Nodes are colored by their distance to the nearest target, with target nodes specially highlighted.

Parameters:
  • component (int) – Index of the connected component to visualize (0-based).

  • targets (Dict[str, float]) – Dictionary of target node identifiers (keys) and their associated values. Target nodes are highlighted with ‘T’ labels.

  • distances_dict (Dict[str, float]) – Dictionary mapping node identifiers to their shortest path distance to the nearest target node.

  • title (str, default="") – Title for the plot.

  • seed (int, default=5) – Random seed for spring layout positioning to ensure reproducible layouts.

  • show_colorbar (bool, default=True) – Whether to display a colorbar indicating distance values.

  • size_by_degree (bool, default=True) – If True, node sizes are scaled by their degree; otherwise uses fixed size.

  • scaled_legend (bool, default=False) – If True, color normalization uses fixed range [0, vmax]; otherwise uses min and max of distances_dict.

  • vmax (float, default=2.5) – Maximum value for color normalization when scaled_legend is True.

  • figsize (tuple, default=(10, 4)) – Figure size in inches (width, height).

Returns:

  • fig (matplotlib.figure.Figure) – The created matplotlib figure.

  • ax (matplotlib.axes.Axes) – The created matplotlib axes.

Examples

>>> # Visualize distances to high-retirement nodes in component 0
>>> targets = explore.get_target_nodes(0, threshold=0.6)
>>> distances = explore.get_shortest_distances_to_targets(0, targets)
>>> fig, ax = explore.drawPathDistance(
...     component=0, targets=targets, distances_dict=distances,
...     title="Distance to High-Retirement Nodes"
... )
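The two color-normalization modes selected by scaled_legend can be sketched in plain Python. This is a minimal illustration of the behavior described in the parameter list, not the library's implementation; the helper names color_limits and normalize, and the choice to skip infinite distances, are assumptions.

```python
import math


def color_limits(distances_dict, scaled_legend=False, vmax=2.5):
    """Return the (low, high) range used to normalize node colors."""
    if scaled_legend:
        # Fixed range: plots of different components share one color scale.
        return 0.0, float(vmax)
    # Data-driven range; infinite distances are skipped (an assumption).
    finite = [d for d in distances_dict.values() if math.isfinite(d)]
    return min(finite), max(finite)


def normalize(value, low, high):
    """Map a distance onto [0, 1], clipping values outside the range."""
    if high == low:
        return 0.0
    return min(max((value - low) / (high - low), 0.0), 1.0)


distances = {"a": 0.0, "b": 1.0, "c": 2.0, "d": float("inf")}
data_range = color_limits(distances)                       # (0.0, 2.0)
fixed_range = color_limits(distances, scaled_legend=True)  # (0.0, 2.5)
```

A fixed range is the safer choice when several components are plotted side by side, since the same color then means the same distance in every panel.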
drawHeatMap(config: Dict)[source]

Generates and displays a heatmap visualization of grouped and normalized data with annotated values and category boxes.

Parameters:

config (dict) –

Configuration dictionary containing:
  • "aggregations": dict, aggregation functions to apply when grouping the data.

  • "derived_columns": list of dict, each with the keys:
    • "name": str, name of the derived column.

    • "formula": callable, function to compute the derived column.

    • "input": str, either "raw" or "group" to specify the source DataFrame.

  • "renaming": dict, mapping of column names for renaming after aggregation.

  • "categories": dict, mapping of category labels to lists of column names to group and box in the heatmap.

Returns:

  • fig (matplotlib.figure.Figure) – The matplotlib Figure object containing the heatmap.

  • ax (matplotlib.axes.Axes) – The matplotlib Axes object containing the heatmap.

Notes

  • The function normalizes the selected columns using StandardScaler before plotting.

  • Annotates each heatmap cell with the original (unnormalized) value, formatted for readability.

  • Draws boxes around columns belonging to the same category and labels them.

  • Customizes plot appearance using rcParams and seaborn styling.
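A configuration for drawHeatMap with the four documented keys might look like the sketch below. Every column name is hypothetical, and the exact forms accepted for "aggregations" values and for the "formula" callable are assumptions; check them against the columns and behavior of your own installation.

```python
# Hypothetical column names; replace them with columns present in raw_df.
heatmap_config = {
    # Column -> aggregation applied within each group (assumed pandas-style names).
    "aggregations": {
        "Total Nameplate Capacity (MW)": "sum",
        "Age": "mean",
        "Plant Count": "sum",
    },
    "derived_columns": [
        {
            "name": "Capacity per Plant (MW)",
            # Assumed to receive the DataFrame selected by "input".
            "formula": lambda df: df["Total Nameplate Capacity (MW)"] / df["Plant Count"],
            "input": "group",
        },
    ],
    "renaming": {"Age": "Mean Age (years)"},
    # Assumed to refer to column names after renaming.
    "categories": {
        "Size": ["Total Nameplate Capacity (MW)", "Capacity per Plant (MW)"],
        "Age": ["Mean Age (years)"],
    },
}

# fig, ax = explore.drawHeatMap(heatmap_config)  # requires an Explore instance
```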

drawDotPlot(clean_df, config, connected_lines=False)[source]

Draws a dot plot visualizing feature values across groups, with dot color representing normalized values and dot size representing standard deviation.

Parameters:
  • clean_df (pandas.DataFrame) – The cleaned DataFrame containing the data to plot. Must include columns for group assignment and the specified features.

  • config (dict) –

    Configuration dictionary with the following keys:
    • "features": list of str, feature column names to plot.

    • "feature_labels": dict, optional, mapping of feature names to display labels.

    • "color_map": str, optional, name of the matplotlib colormap to use (default: "coolwarm").

    • "dot_size_range": tuple, optional, (min_size, max_size) for dot sizes (default: (10, 650)).

    • "normalize_feature": callable, optional, function to normalize feature values (default: identity).

  • connected_lines (bool, optional) – If True, draws faint lines connecting dots of the same feature across groups (default: False).

Returns:

  • fig (matplotlib.figure.Figure) – The matplotlib Figure object containing the plot.

  • ax (matplotlib.axes.Axes) – The matplotlib Axes object of the plot.

Notes

  • Each dot represents a feature value for a group.

  • Dot color encodes the normalized feature value.

  • Dot size encodes the standard deviation of the feature within the group.

  • A legend for dot sizes (standard deviation) and a colorbar for normalized values are included.
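The config keys listed for drawDotPlot can be assembled as in the following sketch. The feature names are hypothetical, and the optional keys are simply spelled out with their documented defaults; how normalize_feature is called internally is not stated here, so the identity function is the only value shown.

```python
# Hypothetical feature names; use columns that exist in clean_df.
dotplot_config = {
    "features": ["Age", "Total Nameplate Capacity (MW)"],
    "feature_labels": {"Total Nameplate Capacity (MW)": "Capacity (MW)"},
    "color_map": "coolwarm",         # documented default
    "dot_size_range": (10, 650),     # documented default (min_size, max_size)
    "normalize_feature": lambda values: values,  # identity, the documented default
}

# fig, ax = explore.drawDotPlot(clean_df, dotplot_config, connected_lines=True)
```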

drawBar(title=None)[source]

Generate stacked bar chart showing plant counts by proximity group.

Parameters:

title (str, optional) – Chart title to display at the top.

Returns:

  • fig (plotly.graph_objects.Figure) – The Plotly figure object.

  • ax (None) – Always None, for API consistency.

drawSankey(title=None)[source]

drawMap()[source]

Create an interactive geographic map of US coal plants.

Generates a Plotly scatter_geo map showing all coal plants in the dataset with markers colored by retirement status and sized by nameplate capacity. Includes detailed hover information about plant characteristics, retirement plans, and contextual factors.

Returns:

  • fig (plotly.graph_objects.Figure) – Interactive Plotly figure with geographic scatter plot.

  • ax (None) – Always None, maintained for API consistency with matplotlib methods.

Examples

>>> from retire import Retire, Explore
>>> retire_obj = Retire()
>>> explore = Explore(retire_obj.graph, retire_obj.raw_df)
>>> fig, _ = explore.drawMap()
>>> fig.show()  # Display interactive map in browser/notebook

drawComponentsMap()[source]

drawGraph_helper(G: Graph, col: str = None, pos: Dict[str, ndarray] = None, title: str = None, size: tuple = (8, 6), show_colorbar: bool = False, color_method: str = 'average', show_node_labels: bool = False)[source]

Helper function for visualizing a NetworkX graph with optional node coloring and labeling.

This function is used internally by both drawGraph and drawComponent to handle the core plotting logic, including node coloring, sizing, layout, and optional colorbar and labels.

Parameters:
  • G (nx.Graph) – The NetworkX graph to visualize.

  • col (str, optional) – Column in raw data to use for node coloring when color_method='average'. If None, no coloring is applied.

  • pos (dict, optional) – Node positions for layout. If None, spring layout is used.

  • title (str, optional) – Title for the plot. If None, no title is displayed.

  • size (tuple, default=(8, 6)) – Figure size in inches.

  • show_colorbar (bool, default=False) – Whether to display a colorbar (only used if col is provided).

  • color_method ({"average", "community"}, default="average") – Method for coloring nodes: by attribute average or by community.

  • show_node_labels (bool, default=False) – Whether to display node labels.

Returns:

  • fig (matplotlib.figure.Figure) – The created matplotlib figure.

  • ax (matplotlib.axes.Axes) – The created matplotlib axes.

get_target_nodes(component, col='Percent Capacity Retiring', threshold=0.5)[source]

Identify and return nodes within a specified connected component whose average attribute value exceeds a given threshold.

Parameters:
  • component (int) – Index of the connected component to analyze.

  • col (str, optional) – Name of the attribute in the DataFrame to evaluate for each node. Defaults to "Percent Capacity Retiring".

  • threshold (float, optional) – Minimum average attribute value required for a node to be included in the result. Defaults to 0.5.

Returns:

A dictionary mapping node identifiers to their average attribute values for nodes exceeding the threshold.

Return type:

dict

Notes
  • Each node is expected to have a “membership” attribute, which is a list of indices referencing rows in self.raw_df.

  • If a node has no “membership” attribute or it is empty, its attribute value is considered 0.0.
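The selection rule described above (average a column over each node's membership rows, then keep nodes above the threshold) can be sketched without the library. The function name target_nodes and the plain dict and list inputs are stand-ins for the graph and raw_df; this is an illustration, not the method's actual code.

```python
def target_nodes(nodes, values, threshold=0.5):
    """Return {node: average value} for nodes whose average exceeds threshold.

    nodes  : dict mapping node id -> attribute dict (may hold "membership")
    values : sequence of per-row values, indexed by the membership entries
    """
    result = {}
    for node, attrs in nodes.items():
        members = attrs.get("membership") or []
        # Missing or empty membership counts as 0.0, as stated in the Notes.
        avg = sum(values[i] for i in members) / len(members) if members else 0.0
        if avg > threshold:
            result[node] = avg
    return result


nodes = {"a": {"membership": [0, 1]}, "b": {"membership": [2]}, "c": {}}
values = [1.0, 0.5, 0.25]  # stand-in for one column of raw_df
print(target_nodes(nodes, values))  # {'a': 0.75}
```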

get_shortest_distances_to_targets(component, targets)[source]

Compute the shortest distances from each node in a connected component to the nearest target node.

Parameters:
  • component (int) – The index of the connected component within the graph self.G to analyze.

  • targets (dict) – A dictionary containing target nodes as keys. The function will compute, for each node in the specified component, the shortest path distance to the nearest node present in this dictionary.

Returns:

distances – A dictionary mapping each node in the specified component to the shortest distance to any target node. If a node is not connected to any target, its distance is set to float('inf').

Return type:

dict
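One way to compute nearest-target distances is a multi-source breadth-first search, sketched below on a plain adjacency dict. It assumes unweighted edges and counts hops; the actual method may use edge weights, and the name distances_to_targets is hypothetical.

```python
from collections import deque


def distances_to_targets(adjacency, targets):
    """Hop count from every node to its nearest target (inf if unreachable)."""
    dist = {node: float("inf") for node in adjacency}
    queue = deque()
    for t in targets:
        if t in dist:
            dist[t] = 0
            queue.append(t)
    # Multi-source breadth-first search: all targets start at distance 0.
    while queue:
        node = queue.popleft()
        for nbr in adjacency[node]:
            if dist[nbr] == float("inf"):
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


# A single connected component: a - b - c
adjacency = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(distances_to_targets(adjacency, {"a": 0.75}))  # {'a': 0, 'b': 1, 'c': 2}
```

Only the keys of targets matter here, which matches the parameter description: the values carried over from get_target_nodes are ignored when measuring distance.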

generate_THEMAGrah_labels(G: Graph, col: str = 'ret_STATUS', color_method: str = 'average')[source]

Assign colors to nodes based on either:
  • The average value of col for the data points in each node (color_method="average"), or

  • Community membership via label propagation (color_method="community").

Returns:

  • color_dict (dict) – Node -> color value (float for average, int for community ID).

  • labels_dict (dict) – Node -> label (usually the node name).

assign_group_ids_to_rawdf()[source]

Annotate the raw DataFrame with a ‘Group’ column indicating the connected component (group) each plant belongs to.

Returns:

A copy of the raw DataFrame with an added ‘Group’ column.

Return type:

pd.DataFrame
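The idea of labelling each plant with its connected component can be sketched as follows. The inputs are a plain adjacency dict and a node-to-row-indices mapping standing in for the graph's "membership" attribute; the numbering of components (order of discovery) and the function name group_ids are assumptions, and the library may number groups differently.

```python
def group_ids(adjacency, membership):
    """Map each row index to the index of its node's connected component."""
    seen, groups = set(), {}
    component = 0
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first traversal collects one connected component.
        stack = [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            for row in membership.get(node, []):
                groups[row] = component
            for nbr in adjacency[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        component += 1
    return groups


adjacency = {"a": ["b"], "b": ["a"], "c": []}
membership = {"a": [0, 1], "b": [1, 2], "c": [3]}
print(group_ids(adjacency, membership))  # {0: 0, 1: 0, 2: 0, 3: 1}
```

The resulting mapping plays the role of the 'Group' column: each row index of the DataFrame receives one component label.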

assign_group_ids_to_cleandf(df)[source]

Annotate the provided DataFrame (df) with a ‘Group’ column indicating the connected component (group) each plant belongs to.

Returns:

A copy of the provided DataFrame with an added ‘Group’ column.

Return type:

pd.DataFrame

Utility Functions

retire.explore.utils.prepare_bar_annotations(df, label_map, buffer=2)[source]

Create Plotly-compatible annotations for each bar category.

retire.explore.utils.get_key_nodes(percent_retiring_dict, threshold=0.5)[source]

Return nodes with retirement percent above a threshold.

retire.explore.utils.compute_retirement_by_node(G, df, col='Percent Capacity Retiring')[source]

For each node in the graph, calculate the average retirement percent from the associated plants (via membership indices).

retire.explore.utils.process_group_sankey_df(G: Graph, df: DataFrame, group_num: int, threshold: float = 0.5, return_all: bool = False) → DataFrame[source]

Builds Sankey-ready data for one group.

retire.explore.utils.build_group_sankey(G: Graph, df: DataFrame, group_range: range = range(0, 8), return_all: bool = False) → DataFrame[source]

Runs process_group_sankey_df across multiple groups and combines output.

retire.explore.utils.reduce_opacity(color: str, opacity: float) → str[source]

retire.explore.utils.create_color_mapping(sources, targets, color_scale, target_color_mapping, opacity=0.7)[source]

retire.explore.utils.build_index_mappings(sources, targets)[source]

Visualization Methods

Network Visualizations

Methods for visualizing the THEMA graph structure and component relationships:

Explore.drawGraph(col: str = None, pos: Dict[str, ndarray] = None, title: str = None, size: tuple = (8, 6), show_colorbar: bool = False, color_method: str = 'average', show_node_labels: bool = False)[source]

Visualize a THEMA-generated NetworkX graph with optional node coloring.

Parameters:
  • col (str, optional) – Column in raw data to use for node coloring when color_method='average'. If None, no coloring is applied.

  • pos (dict, optional) – Node positions for layout. If None, spring layout is used.

  • title (str, optional) – Title for the plot. If None, no title is displayed.

  • size (tuple, default=(8, 6)) – Figure size in inches.

  • show_colorbar (bool, default=False) – Whether to display a colorbar (only used if col is provided).

  • color_method ({"average", "community"}, default="average") – Method for coloring nodes: by attribute average or by community.

  • show_node_labels (bool, default=False) – Whether to display node labels.

Returns:

  • fig (matplotlib.figure.Figure) – The created matplotlib figure.

  • ax (matplotlib.axes.Axes) – The created matplotlib axes.

Explore.drawComponent(component: int, col: str = 'ret_STATUS', pos: Dict[str, ndarray] = None, title=None, size: tuple = (8, 6), show_colorbar: bool = False, color_method: str = 'average', show_node_labels: bool = False)[source]

Draws a specific connected component of the graph.

Parameters:
  • component (int) – Index of the connected component to draw.

  • col (str, default='ret_STATUS') – Column in raw data to use for node coloring when color_method='average'. If None, no coloring is applied.

  • pos (dict, optional) – Node positions for layout. If None, spring layout is used.

  • title (str, optional) – Title for the plot. If None, no title is displayed.

  • size (tuple, default=(8, 6)) – Figure size in inches.

  • show_colorbar (bool, default=False) – Whether to display a colorbar (only used if col is provided).

  • color_method ({"average", "community"}, default="average") – Method for coloring nodes: by attribute average or by community.

  • show_node_labels (bool, default=False) – Whether to display node labels.

Returns:

  • fig (matplotlib.figure.Figure) – The created matplotlib figure.

  • ax (matplotlib.axes.Axes) – The created matplotlib axes.

Explore.drawPathDistance(component: int, targets: Dict[str, float], distances_dict: Dict[str, float], title='', seed=5, show_colorbar=True, size_by_degree=True, scaled_legend=False, vmax=2.5, figsize=(10, 4))[source]

Visualize shortest path distances to target nodes in a network component.

Creates a network visualization showing the shortest path distances from all nodes to a set of target nodes within a specified connected component. Nodes are colored by their distance to the nearest target, with target nodes specially highlighted.

Parameters:
  • component (int) – Index of the connected component to visualize (0-based).

  • targets (Dict[str, float]) – Dictionary of target node identifiers (keys) and their associated values. Target nodes are highlighted with ‘T’ labels.

  • distances_dict (Dict[str, float]) – Dictionary mapping node identifiers to their shortest path distance to the nearest target node.

  • title (str, default="") – Title for the plot.

  • seed (int, default=5) – Random seed for spring layout positioning to ensure reproducible layouts.

  • show_colorbar (bool, default=True) – Whether to display a colorbar indicating distance values.

  • size_by_degree (bool, default=True) – If True, node sizes are scaled by their degree; otherwise uses fixed size.

  • scaled_legend (bool, default=False) – If True, color normalization uses fixed range [0, vmax]; otherwise uses min and max of distances_dict.

  • vmax (float, default=2.5) – Maximum value for color normalization when scaled_legend is True.

  • figsize (tuple, default=(10, 4)) – Figure size in inches (width, height).

Returns:

  • fig (matplotlib.figure.Figure) – The created matplotlib figure.

  • ax (matplotlib.axes.Axes) – The created matplotlib axes.

Examples

>>> # Visualize distances to high-retirement nodes in component 0
>>> targets = explore.get_target_nodes(0, threshold=0.6)
>>> distances = explore.get_shortest_distances_to_targets(0, targets)
>>> fig, ax = explore.drawPathDistance(
...     component=0, targets=targets, distances_dict=distances,
...     title="Distance to High-Retirement Nodes"
... )
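
The interaction between scaled_legend and vmax can be sketched in a few lines. This is only an illustration of the rule described in the parameter list, not the library's code; the helper name color_limits is hypothetical.

```python
def color_limits(distances, scaled_legend=False, vmax=2.5):
    """Return the (low, high) pair used to normalize node colors.

    Hypothetical helper that mirrors the documented rule only.
    """
    if scaled_legend:
        # Fixed range, so colors are comparable across several plots.
        return 0.0, vmax
    # Data-driven range taken from the supplied distances.
    values = list(distances.values())
    return min(values), max(values)


distances = {"a": 0.0, "b": 1.2, "c": 3.1}
data_range = color_limits(distances)
fixed_range = color_limits(distances, scaled_legend=True)
print(data_range, fixed_range)
```

With scaled_legend=True, any distance above vmax falls outside the fixed range, which is the trade-off for a legend that stays comparable between figures.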

Statistical Visualizations

Methods for creating statistical visualizations of coal plant data:

Explore.drawHeatMap(config: Dict)[source]

Generates and displays a heatmap visualization of grouped and normalized data with annotated values and category boxes.

Parameters:

config (dict) –

Configuration dictionary containing:
  • "aggregations": dict

    Aggregation functions to apply when grouping the data.

  • "derived_columns": list of dict
    Each dict should have:
    • "name": str, name of the derived column.

    • "formula": callable, function to compute the derived column.

    • "input": str, either "raw" or "group" to specify the source DataFrame.

  • "renaming": dict

    Mapping of column names for renaming after aggregation.

  • "categories": dict

    Mapping of category labels to lists of column names to group and box in the heatmap.

Returns:

  • fig (matplotlib.figure.Figure) – The matplotlib Figure object containing the heatmap.

  • ax (matplotlib.axes.Axes) – The matplotlib Axes object containing the heatmap.

Notes

  • The function normalizes the selected columns using StandardScaler before plotting.

  • Annotates each heatmap cell with the original (unnormalized) value, formatted for readability.

  • Draws boxes around columns belonging to the same category and labels them.

  • Customizes plot appearance using rcParams and seaborn styling.
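
A configuration with the four documented keys might look like the sketch below. All column names are hypothetical placeholders. The pandas-style column-to-function form of "aggregations", the assumption that each formula receives the source DataFrame, and the use of post-renaming names in "categories" are guesses that the description above does not confirm.

```python
heatmap_config = {
    # Assumed form: column name -> aggregation function name.
    "aggregations": {
        "Total Nameplate Capacity (MW)": "sum",  # hypothetical column
        "Retiring Capacity (MW)": "sum",         # hypothetical column
        "Age": "mean",                           # hypothetical column
    },
    "derived_columns": [
        {
            "name": "Retiring Share",            # hypothetical derived column
            # Assumed to receive the DataFrame selected by "input".
            "formula": lambda df: (
                df["Retiring Capacity (MW)"] / df["Total Nameplate Capacity (MW)"]
            ),
            "input": "group",
        },
    ],
    "renaming": {"Age": "Mean Plant Age (years)"},
    # Assumed to refer to column names after renaming.
    "categories": {
        "Fleet": ["Total Nameplate Capacity (MW)", "Mean Plant Age (years)"],
        "Retirement": ["Retiring Share"],
    },
}

# The formula only uses key lookup and division, so a plain dict of
# numbers is enough to check it without a DataFrame.
row = {"Retiring Capacity (MW)": 50.0, "Total Nameplate Capacity (MW)": 200.0}
share = heatmap_config["derived_columns"][0]["formula"](row)
print(share)
```

The dictionary would then be passed as explore.drawHeatMap(heatmap_config).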

Explore.drawDotPlot(clean_df, config, connected_lines=False)[source]

Draws a dot plot visualizing feature values across groups, with dot color representing normalized values and dot size representing standard deviation.

Parameters:
  • clean_df (pandas.DataFrame) – The cleaned DataFrame containing the data to plot. Must include columns for group assignment and the specified features.

  • config (dict) –

    Configuration dictionary with the following keys:
    • "features": list of str, feature column names to plot.

    • "feature_labels": dict, optional, mapping of feature names to display labels.

    • "color_map": str, optional, name of the matplotlib colormap to use (default: "coolwarm").

    • "dot_size_range": tuple, optional, (min_size, max_size) for dot sizes (default: (10, 650)).

    • "normalize_feature": callable, optional, function to normalize feature values (default: identity).

  • connected_lines (bool, optional) – If True, draws faint lines connecting dots of the same feature across groups (default: False).

Returns:

  • fig (matplotlib.figure.Figure) – The matplotlib Figure object containing the plot.

  • ax (matplotlib.axes.Axes) – The matplotlib Axes object of the plot.

Notes

  • Each dot represents a feature value for a group.

  • Dot color encodes the normalized feature value.

  • Dot size encodes the standard deviation of the feature within the group.

  • A legend for dot sizes (standard deviation) and a colorbar for normalized values are included.
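
As a rough illustration, a dot plot configuration could be assembled as follows. The feature names are hypothetical, and the sketch assumes that "normalize_feature" is called with a sequence of values for one feature, which is not stated in the description above. The min_max helper is an example normalizer, not part of the library.

```python
def min_max(values):
    """Scale a sequence of numbers to [0, 1]; constant input maps to 0.0."""
    values = list(values)
    lo, hi = min(values), max(values)
    if hi == lo:
        # Avoid dividing by zero when every value is identical.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


dotplot_config = {
    "features": ["Age", "Total Nameplate Capacity (MW)"],  # hypothetical columns
    "feature_labels": {"Age": "Plant age (years)"},
    "color_map": "coolwarm",
    "dot_size_range": (10, 650),
    "normalize_feature": min_max,  # assumed to receive one feature's values
}

scaled = dotplot_config["normalize_feature"]([10.0, 20.0, 40.0])
print(scaled)
```

A call would then take the form explore.drawDotPlot(clean_df, dotplot_config, connected_lines=True), where clean_df already carries the group assignment column.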

Flow and Distribution Charts

Methods for visualizing distributions and flows between categories:

Explore.drawBar(title=None)[source]

Generate a stacked bar chart showing plant counts by proximity group.

Parameters:

title (str, optional) – Chart title to display at the top.

Returns:

  • fig (plotly.graph_objects.Figure) – The Plotly figure object.

  • ax (None) – Always None, for API consistency.

Explore.drawSankey(title=None)[source]

Geographic Visualizations

Methods for visualizing coal plants on geographic maps:

Explore.drawMap()[source]

Create an interactive geographic map of US coal plants.

Generates a Plotly scatter_geo map showing all coal plants in the dataset with markers colored by retirement status and sized by nameplate capacity. Includes detailed hover information about plant characteristics, retirement plans, and contextual factors.

Returns:

  • fig (plotly.graph_objects.Figure) – Interactive Plotly figure with geographic scatter plot.

  • ax (None) – Always None, maintained for API consistency with matplotlib methods.

Examples

>>> from retire import Retire, Explore
>>> retire_obj = Retire()
>>> explore = Explore(retire_obj.graph, retire_obj.raw_df)
>>> fig, _ = explore.drawMap()
>>> fig.show()  # Display interactive map in browser/notebook

Explore.drawComponentsMap()[source]

Analysis Methods

Graph Analysis

Methods for analyzing the network structure and identifying patterns:

Explore.get_target_nodes(component, col='Percent Capacity Retiring', threshold=0.5)[source]

Identify and return nodes within a specified connected component whose average attribute value exceeds a given threshold.

Parameters:

  • component (int) – Index of the connected component to analyze.

  • col (str, optional) – Name of the attribute in the DataFrame to evaluate for each node. Defaults to "Percent Capacity Retiring".

  • threshold (float, optional) – Minimum average attribute value required for a node to be included in the result. Defaults to 0.5.

Returns:

dict: A dictionary mapping node identifiers to their average attribute values for nodes exceeding the threshold.

Notes:
  • Each node is expected to have a “membership” attribute, which is a list of indices referencing rows in self.raw_df.

  • If a node has no “membership” attribute or it is empty, its attribute value is considered 0.0.
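
The selection rule can be sketched without the library. The example below assumes a plain mapping from node to row indices in place of the graph's "membership" attribute and a list in place of the DataFrame column; the function name target_nodes is hypothetical.

```python
def target_nodes(memberships, values, threshold=0.5):
    """Return {node: mean value} for nodes whose mean exceeds the threshold.

    memberships: {node: list of row indices}; values: list indexed by row.
    """
    result = {}
    for node, rows in memberships.items():
        # An empty membership counts as 0.0, as described in the Notes.
        mean = sum(values[i] for i in rows) / len(rows) if rows else 0.0
        if mean > threshold:
            result[node] = mean
    return result


values = [1.0, 0.5, 0.0, 0.75]  # stand-in for one DataFrame column
memberships = {"n0": [0, 1], "n1": [2], "n2": [], "n3": [0, 3]}
targets = target_nodes(memberships, values, threshold=0.6)
print(targets)
```

The comparison is strict here because the description says the average must exceed the threshold; whether the library treats equality the same way is not confirmed.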

Explore.get_shortest_distances_to_targets(component, targets)[source]

Compute the shortest distances from each node in a connected component to the nearest target node.

Parameters:
  • component (int) – The index of the connected component within the graph self.G to analyze.

  • targets (dict) – A dictionary containing target nodes as keys. The function will compute, for each node in the specified component, the shortest path distance to the nearest node present in this dictionary.

Returns:

distances – A dictionary mapping each node in the specified component to the shortest distance to any target node. If a node is not connected to any target, its distance will be set to float('inf').

Return type:

dict
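
One common way to compute nearest-target distances is a multi-source Dijkstra search, sketched below on a plain adjacency mapping. This shows the general technique only. Whether the library uses edge weights or this algorithm is not stated here, and the function name distances_to_targets is hypothetical.

```python
import heapq


def distances_to_targets(adjacency, targets):
    """Shortest distance from every node to its nearest target.

    adjacency: {node: {neighbor: edge weight}}; targets: iterable of nodes.
    """
    dist = {node: float("inf") for node in adjacency}
    heap = []
    for t in targets:
        if t in dist:
            # Every target starts at distance zero (multi-source search).
            dist[t] = 0.0
            heap.append((0.0, t))
    heapq.heapify(heap)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nbr, weight in adjacency[node].items():
            candidate = d + weight
            if candidate < dist[nbr]:
                dist[nbr] = candidate
                heapq.heappush(heap, (candidate, nbr))
    return dist


adjacency = {
    "a": {"b": 1.0},
    "b": {"a": 1.0, "c": 0.5},
    "c": {"b": 0.5},
    "d": {},  # no path to any target
}
distances = distances_to_targets(adjacency, {"a": 0.9})
print(distances)
```

Only the keys of targets are used, which matches the description of targets as a dictionary with target nodes as keys.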

Data Processing

Methods for processing and enriching the underlying datasets:

Explore.generate_THEMAGrah_labels(G: Graph, col: str = 'ret_STATUS', color_method: str = 'average')[source]

Assign colors to nodes based on either:

  • The average value of col for data points in each node (color_method="average"), or

  • Community membership via label propagation (color_method="community").

Returns:

  • color_dict (dict) – Node -> color value (float for average, int for community ID).

  • labels_dict (dict) – Node -> label (usually the node name).

Explore.assign_group_ids_to_rawdf()[source]

Annotate the raw DataFrame with a ‘Group’ column indicating the connected component (group) each plant belongs to.

Returns:

A copy of the raw DataFrame with an added ‘Group’ column.

Return type:

pd.DataFrame
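
The idea of labelling rows by connected component can be sketched with a simple graph traversal. The example assumes plain dictionaries in place of the graph and its "membership" attribute and numbers components in iteration order. The library's actual ordering and its handling of rows that belong to no node are not documented here, and the function name group_ids is hypothetical.

```python
def group_ids(adjacency, memberships, n_rows):
    """Label each row with the index of the component that contains it.

    adjacency: {node: list of neighbors}; memberships: {node: row indices}.
    Rows that belong to no node keep the label None.
    """
    groups = [None] * n_rows
    seen = set()
    component = 0
    for start in adjacency:
        if start in seen:
            continue
        # Depth-first traversal of one connected component.
        stack = [start]
        seen.add(start)
        while stack:
            node = stack.pop()
            for row in memberships.get(node, []):
                groups[row] = component
            for nbr in adjacency[node]:
                if nbr not in seen:
                    seen.add(nbr)
                    stack.append(nbr)
        component += 1
    return groups


adjacency = {"n0": ["n1"], "n1": ["n0"], "n2": []}
memberships = {"n0": [0, 1], "n1": [1, 2], "n2": [3]}
groups = group_ids(adjacency, memberships, n_rows=5)
print(groups)
```

In a DataFrame setting, the resulting list would be assigned to a 'Group' column on a copy of the data.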

Explore.assign_group_ids_to_cleandf(df)[source]

Annotate the given DataFrame (df) with a ‘Group’ column indicating the connected component (group) each plant belongs to.

Returns:

A copy of the given DataFrame with an added ‘Group’ column.

Return type:

pd.DataFrame