spacekit.analyzer.explore
- class spacekit.analyzer.explore.ImagePreviews(X, labels, name='ImagePreviews', **log_kws)
Bases: object
Base parent class for rendering and displaying images as plots.
- class spacekit.analyzer.explore.SVMPreviews(X, labels=None, names=None, ndims=3, channels=3, w=128, h=128, figsize=(10, 10), **log_kws)
Bases: ImagePreviews
ImagePreviews subclass for previewing SVM images. Primarily used to compare original images with augmented versions.
- Parameters:
ImagePlots (class) – spacekit.analyzer.explore.ImagePreviews parent class
Instantiates an SVMPreviews class object.
- Parameters:
X (ndarray) – n-dimensional array of image pixel values
labels (ndarray, optional) – target class labels for each image
ndims (int, optional) – number of dimensions (frames) per image, by default 3
channels (int, optional) – channels per image frame (rgb color is 3, gray/bw is 1), by default 3
w (int, optional) – width of images, by default 128
h (int, optional) – height of images, by default 128
- class spacekit.analyzer.explore.DataPlots(df, width=1300, height=700, show=False, save_html=None, telescope=None, name='DataPlots', **log_kws)
Bases: object
Base class for drawing exploratory data analysis plots from a dataframe.
- bar_plots(X, Y, feature, y_err=[None, None], width=700, height=500, cmap=['dodgerblue', 'fuchsia'])
Draws a bar plot for a feature, grouped by the group attribute.
- Parameters:
X (array-like) – X-axis values
Y (array-like) – Y-axis values
feature (str) – Feature name
y_err (list, optional) – Y-axis error values, by default [None, None]
width (int, optional) – Width of the plot, by default 700
height (int, optional) – Height of the plot, by default 500
cmap (list, optional) – List of colors for the plot, by default [“dodgerblue”, “fuchsia”]
- Returns:
Plotly Figure object representing the bar plot
- Return type:
go.Figure
- box_plots(cols=None, outliers=True)
Generates multi-trace box plots for each feature in cols param, with or without outliers
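- Parameters:
cols (list, optional) – features (dataframe columns) to draw box plots for, by default None
outliers (bool, optional) – whether to include outliers in the plots, by default True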
- Returns:
dictionary of plotly box plot figures for each feature in cols parameter
- Return type:
dict
- det_keys()
Creates a list of detectors based on self.telescope
- Returns:
list of detector keys for the specified telescope
- Return type:
list
- feature_stats_by_target(feature)
Calculates statistical info (mean and standard deviation) for a feature within each target class.
- Parameters:
feature (str) – dataframe column to get statistical calculations on
- Returns:
list of means and list of standard deviations for a feature, subdivided for each target class.
- Return type:
nested lists
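As a rough illustration of the calculation described above, the following sketch computes per-class means and standard deviations with pandas. It is not spacekit's implementation: the helper name `stats_by_target`, the column names `label` and `n_exposures`, and the sample values are all hypothetical, and whether the library uses the sample or the population standard deviation is not stated in this reference.

```python
import pandas as pd


def stats_by_target(df, feature, target="label"):
    """Return ([means], [stds]) for `feature`, one entry per target class.

    Hypothetical helper for illustration only; not part of spacekit.
    """
    grouped = df.groupby(target)[feature]
    means = grouped.mean().tolist()
    stds = grouped.std().tolist()  # pandas default: sample std (ddof=1)
    return means, stds


# Made-up data: two target classes with two observations each
df = pd.DataFrame(
    {"label": [0, 0, 1, 1], "n_exposures": [2.0, 4.0, 10.0, 14.0]}
)
means, stds = stats_by_target(df, "n_exposures")
```

The two lists are ordered by target class, which matches the "nested lists" return shape described above and makes them convenient to pass on as bar heights and error bars.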
- feature_subset()
Create a set of groups from a categorical feature (dataframe column). Used for plotting multiple traces on a figure
- Returns:
self.categories attribute containing key-value pairs: groups of observations (values) for each category (keys)
- Return type:
dictionary
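The grouping described above can be sketched with a plain pandas `groupby`. This is a minimal illustration under stated assumptions rather than the library's own code: the helper name `subset_by_category`, the column names `instr` and `mem_bin`, and the sample rows are hypothetical.

```python
import pandas as pd


def subset_by_category(df, column):
    """Map each unique value of `column` to the rows that carry it.

    Hypothetical helper for illustration only; not part of spacekit.
    """
    return {key: group for key, group in df.groupby(column)}


# Made-up observations with a categorical column
df = pd.DataFrame({"instr": ["acs", "wfc3", "acs"], "mem_bin": [0, 2, 1]})
categories = subset_by_category(df, "instr")
```

Each value in the resulting dictionary is a sub-dataframe, so one trace per category can be drawn by iterating over `categories.items()`.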
- group_keys()
Generates numerically ordered key-value pairs for each unique value of self.group found in the dataframe
- Returns:
enumerated dictionary of unique values for each group
- Return type:
dict
- grouped_barplot(target='label', cmap=None)
Draws a grouped bar plot for a target column, grouped by the group attribute.
- instr_keys()
Generates a list of instruments based on self.telescope
- Returns:
list of instrument keys for the specified telescope
- Return type:
list
- kde_plots(cols, norm=False, targets=False, hist=True, curve=True, binsize=0.2, width=700, height=500, cmap=['#F66095', '#2BCDC1'])
Generates KDE plots for specified columns in the dataframe.
- Parameters:
cols (list of str) – List of column names to generate KDE plots for
norm (bool, optional) – Whether to normalize the data, by default False
targets (bool, optional) – Whether to group data by target classes, by default False
hist (bool, optional) – Whether to show histogram, by default True
curve (bool, optional) – Whether to show KDE curve, by default True
binsize (float, optional) – Bin size for the histogram, by default 0.2
width (int, optional) – Width of the plot, by default 700
height (int, optional) – Height of the plot, by default 500
cmap (list, optional) – List of colors for the plot, by default [“#F66095”, “#2BCDC1”]
- Returns:
Plotly Figure object representing the KDE plot
- Return type:
go.Figure
- make_box_figs(vars: list)
Generates single-trace box plots, one plot for each var, where vars is a list of columns in df.
- make_feature_scatter_figs(xaxis_name, yaxis_name)
Generates scatterplots for two features in the dataframe, grouped by the group attribute.
- make_subplots(figtype, xtitle, ytitle, data1, data2, name1, name2)
Generates figure with multiple subplots for two sets of data using previously generated figures.
- Parameters:
figtype (str) – type of figure being generated (used for saving html file)
xtitle (str) – title for the x-axis
ytitle (str) – title for the y-axis
data1 (go.Figure) – figure object for the first set of data
data2 (go.Figure) – figure object for the second set of data
name1 (str) – name for the first subplot
name2 (str) – name for the second subplot
- Returns:
figure object containing the subplots
- Return type:
go.Figure
- make_target_scatter(target=None)
Generates target vs feature scatterplot for a given target (by default self.target) for each feature in self.feature_list.
- make_target_scatter_figs(xaxis_name, yaxis_name, marker_size=15, cmap=['cyan', 'fuchsia'], categories=None, target=None)
Generates scatterplots for two features in the dataframe, grouped by target classes.
- Parameters:
xaxis_name (str) – column name in dataframe to plot on x-axis
yaxis_name (str) – column name in dataframe to plot on y-axis
marker_size (int, optional) – marker size for scatter plot points, by default 15
cmap (list, optional) – list of colors for different target classes, by default [“cyan”, “fuchsia”]
categories (dict, optional) – dictionary of categories to group data by, by default None
target (str, optional) – name of target column in dataframe, by default None
- Returns:
list of scatterplot figures for each category
- Return type:
list
- map_data()
Instantiates data_map as a dictionary of grouped dataframes and color maps for each category in the categories attribute.
- map_df_by_group()
Instantiates group_dict as a dictionary of grouped dataframes and color map
- remove_outliers(y_data)
Removes outliers from a given pandas Series using the IQR method.
- Parameters:
y_data (pd.Series) – The data from which to remove outliers.
- Returns:
The data with outliers removed via IQR filtering.
- Return type:
pd.Series
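For readers unfamiliar with IQR filtering, here is a minimal sketch of the general technique on a pandas Series. It is an assumption-laden illustration, not the library's source: the helper name `iqr_filter` is hypothetical, and the fence multiplier `k=1.5` is the conventional Tukey value, which this reference does not confirm spacekit uses.

```python
import pandas as pd


def iqr_filter(y_data, k=1.5):
    """Keep values within [Q1 - k*IQR, Q3 + k*IQR].

    Hypothetical helper for illustration only; k=1.5 is an assumed default.
    """
    q1 = y_data.quantile(0.25)
    q3 = y_data.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return y_data[(y_data >= lower) & (y_data <= upper)]


# Made-up data with one obvious outlier
y = pd.Series([1.0, 2.0, 3.0, 4.0, 100.0])
filtered = iqr_filter(y)
```

In this sample the quartiles are 2.0 and 4.0, so the fences sit at -1.0 and 7.0 and only the value 100.0 is dropped. The filtered Series keeps the original index labels of the surviving values.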