🔍 Data Exploration

FlowyML Notebook profiles every DataFrame automatically: each one gets a multi-tab visual profile with no extra code.


Rich Context Display

When you return a DataFrame (Pandas or Polars), FlowyML Notebook automatically generates a Rich Context Display with 10 interactive tabs:

| Tab | What It Shows |
| --- | --- |
| 📊 Table | Sortable, paginated data view |
| 📈 Stats | Column-level statistics with bento-grid summary |
| 💹 Charts | Auto-generated histograms and distribution plots |
| 🔗 Correlations | Pearson correlation heatmap |
| 🛡 Quality | Missing values, duplicates, data integrity |
| 💡 Insights | Outlier detection, scaling, target suggestions |
| ↔ Compare | Side-by-side DataFrame comparison |
| 🤖 AI | AI-powered data analysis |
| 🔧 SmartPrep | Actionable preprocessing suggestions with code |
| 🧠 Algorithms | ML algorithm recommendations with pipeline code |

Stats View

A compact overview of the key dataset metrics (total rows, column count, missing values, and memory usage), plus per-column statistics with type detection.

Figure: Stats View. Bento-grid summary deck with per-column mean, std, min/max, quartiles, skew, and kurtosis.
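For readers who want to reproduce similar numbers by hand, the following is a minimal pandas sketch of the metrics listed above. It is not FlowyML Notebook's implementation; the sample data and the `overview` and `stats` names are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical sample data; column names are borrowed from the captions on this page.
df = pd.DataFrame({
    "learning_rate": [0.001, 0.01, 0.1, 0.01, 0.05],
    "accuracy": [0.81, 0.88, 0.79, 0.90, None],
    "model": ["rf", "xgb", "rf", "svm", "xgb"],
})

# Dataset-level summary: rows, columns, missing cells, memory footprint.
overview = {
    "rows": len(df),
    "columns": df.shape[1],
    "missing": int(df.isna().sum().sum()),
    "memory_bytes": int(df.memory_usage(deep=True).sum()),
}

# Per-column statistics for the numeric columns only.
numeric = df.select_dtypes(include="number")
stats = numeric.describe().T       # count, mean, std, min, quartiles, max
stats["skew"] = numeric.skew()     # sample skewness
stats["kurtosis"] = numeric.kurt() # sample excess kurtosis

print(overview)
print(stats)
```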

Charts View

Auto-generated visualizations for every column. Numeric columns get histograms with µ/σ annotations. Categorical columns get horizontal bar charts with value counts.

Figure: Charts View. Interactive charts: learning_rate, accuracy, and f1_score distributions, model counts, and status breakdown.
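The data behind charts of this kind can be approximated with NumPy and pandas. The sketch below only computes the values a histogram or bar chart would display. The bin count of 5 and the sample data are assumptions, and the notebook's own binning may differ.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    "accuracy": [0.81, 0.88, 0.79, 0.90, 0.85, 0.87],
    "status": ["done", "done", "failed", "done", "running", "done"],
})

# Numeric column: bin counts plus the mean and sample standard deviation
# that a µ/σ annotation would show.
values = df["accuracy"].dropna()
counts, bin_edges = np.histogram(values, bins=5)
mu, sigma = values.mean(), values.std()

# Categorical column: value counts, largest first, as input for a horizontal bar chart.
status_counts = df["status"].value_counts()

print(counts, bin_edges)
print(f"mean={mu:.3f}, std={sigma:.3f}")
print(status_counts)
```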

Correlations

Pearson correlation matrix with color-coded heatmap. Instantly spot positive (purple) and negative (red) relationships between features.

Figure: Correlations. Color-coded correlation matrix with scrollable DataFrame output.
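The matrix itself is a standard Pearson correlation, which pandas can compute directly. The sketch below uses hypothetical sample values; only the column names come from this page.

```python
import pandas as pd

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    "learning_rate": [0.001, 0.01, 0.05, 0.1, 0.2],
    "accuracy": [0.90, 0.88, 0.85, 0.80, 0.72],
    "f1_score": [0.89, 0.87, 0.83, 0.79, 0.70],
})

# Pearson correlation between every pair of numeric columns.
corr = df.select_dtypes(include="number").corr(method="pearson")

print(corr.round(2))
```

Values near 1 indicate a strong positive relationship, and values near -1 a strong negative one.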

ML Insights

Automated recommendations for ML preprocessing, with no manual analysis needed:

  • ⚡ Outlier Detection: IQR-based detection with percentage and bounds
  • ⚖ Scaling Recommendations: log transform, no scaling, or normalization suggestions
  • 🎯 Target Variables: automatically identifies classification and regression targets

Figure: ML Insights. training_time_s has 6% outliers, n_estimators needs a log transform, and model and status are identified as targets.
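As a rough illustration of IQR-based outlier detection with percentage and bounds, consider the sketch below. The `iqr_outliers` helper, the conventional multiplier `k = 1.5`, and the sample values are all assumptions. The page does not state which multiplier FlowyML Notebook uses.

```python
import pandas as pd

def iqr_outliers(series: pd.Series, k: float = 1.5) -> dict:
    """Return the IQR fences and the percentage of values outside them.

    Hypothetical helper; k = 1.5 is the conventional choice, assumed here.
    """
    values = series.dropna()
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    outside = (values < lower) | (values > upper)
    return {"lower": lower, "upper": upper, "pct_outliers": 100 * outside.mean()}

# Hypothetical timings with one obviously slow run.
training_time_s = pd.Series([12, 14, 13, 15, 14, 13, 12, 16, 15, 120])
report = iqr_outliers(training_time_s)
print(report)
```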

🔧 SmartPrep Advisor (new in v1.2)

SmartPrep goes beyond insights: it provides actionable preprocessing suggestions as severity-ranked cards, each with ready-to-run Python code.

The SmartPrep tab detects 6 categories of data quality issues:

  • Missing Values: per-column null analysis with an imputation strategy (median for numeric, mode for categorical, drop for >60% missing)
  • Skewed Distributions: skewness detection with log1p or Yeo-Johnson power transform suggestions
  • Outliers: IQR-based detection with clip() code
  • High Cardinality: categorical columns with more than 50 unique values are flagged for frequency encoding
  • Class Imbalance: target variable ratio analysis with SMOTE and class weight suggestions
  • Feature Scaling: cross-feature range analysis (a >100x difference triggers a StandardScaler suggestion)
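To make the thresholds above concrete, here is a minimal sketch of three of the six checks (missing values, high cardinality, and feature scaling). The `suggest_fixes` function is hypothetical and is not SmartPrep's code. In particular, comparing the largest and smallest column ranges is only one assumed reading of "cross-feature range analysis".

```python
import pandas as pd

def suggest_fixes(df: pd.DataFrame) -> list:
    """Hypothetical rule-based advisor using the thresholds quoted above."""
    suggestions = []
    for col in df.columns:
        missing_ratio = df[col].isna().mean()
        is_numeric = pd.api.types.is_numeric_dtype(df[col])

        # Missing values: drop above 60% missing, otherwise impute.
        if missing_ratio > 0.60:
            suggestions.append(f"{col}: drop column ({missing_ratio:.0%} missing)")
        elif missing_ratio > 0:
            strategy = "median" if is_numeric else "mode"
            suggestions.append(f"{col}: impute with {strategy}")

        # High cardinality: more than 50 distinct categories.
        if not is_numeric and df[col].nunique() > 50:
            suggestions.append(f"{col}: frequency-encode (high cardinality)")

    # Feature scaling: compare the value ranges of the numeric columns
    # (assumed interpretation of the >100x rule).
    numeric = df.select_dtypes(include="number")
    ranges = (numeric.max() - numeric.min()).dropna()
    ranges = ranges[ranges > 0]
    if len(ranges) > 1 and ranges.max() / ranges.min() > 100:
        suggestions.append("numeric features: apply StandardScaler (ranges differ by >100x)")
    return suggestions

# Hypothetical sample data for illustration.
df = pd.DataFrame({
    "n_estimators": [10, 100, 1000, 5000, None],
    "learning_rate": [0.01, 0.1, 0.05, 0.2, 0.1],
    "status": ["done", None, "done", "failed", "done"],
    "notes": [None, None, None, None, "ok"],
})
fixes = suggest_fixes(df)
for line in fixes:
    print(line)
```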

Each card includes a "Generate Cell" button that inserts the fix directly into your notebook. Use "Apply All Fixes" to insert all suggestions at once.


🧠 Algorithm Matchmaker (new in v1.2)

Select a target column and get ranked ML algorithm recommendations with reasoning, caveats, and complete pipeline code.

The Matchmaker analyzes:

  • Task type: automatically detects classification (≤20 unique target values), regression (>20), or clustering (no target)
  • Data characteristics: sample size, feature counts, dimensionality, null presence
  • Algorithm fit: scores each algorithm from 0 to 100 based on your specific data profile
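The task-type rule above can be sketched in a few lines. The `detect_task` function and the sample data are hypothetical. Only the 20-unique-value threshold comes from this page, and the real Matchmaker may apply additional checks.

```python
from typing import Optional

import pandas as pd

def detect_task(df: pd.DataFrame, target: Optional[str] = None) -> str:
    """Hypothetical helper applying the threshold quoted above."""
    if target is None:
        return "clustering"        # no target column selected
    if df[target].nunique() <= 20:
        return "classification"    # few distinct target values
    return "regression"            # many distinct target values

# Hypothetical sample data: 3 distinct statuses, 30 distinct accuracy values.
df = pd.DataFrame({
    "status": ["done", "failed", "running"] * 10,
    "accuracy": [i / 100 for i in range(30)],
})

print(detect_task(df, "status"))
print(detect_task(df, "accuracy"))
print(detect_task(df))
```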

Supported algorithms include Random Forest, XGBoost, LightGBM, Logistic/Linear Regression, SVM, KNN, ElasticNet, KMeans, DBSCAN, and Hierarchical Clustering.

Click "Generate Pipeline Cell" to insert a complete sklearn pipeline (train/test split, model fitting, and evaluation metrics) directly into your notebook.
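A generated cell might look roughly like the sketch below, which covers the three parts named above. This is an assumption about its general shape, not actual Matchmaker output. The synthetic data, the choice of Random Forest, and the hyperparameters are placeholders.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary classification data, for illustration only.
rng = np.random.default_rng(0)
X = pd.DataFrame({"f1": rng.normal(size=200), "f2": rng.normal(size=200)})
y = (X["f1"] + X["f2"] > 0).astype(int)

# Train/test split, stratified so both classes appear in each part.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Preprocessing and model combined in a single sklearn Pipeline.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("model", RandomForestClassifier(n_estimators=100, random_state=42)),
])
pipeline.fit(X_train, y_train)

# Evaluation on the held-out test set.
accuracy = accuracy_score(y_test, pipeline.predict(X_test))
print(f"Test accuracy: {accuracy:.3f}")
```

Keeping the scaler inside the pipeline means it is fitted on the training split only, which avoids leaking test-set statistics into the model.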


Built-in Visualization Libraries

In addition to the automatic profiling, FlowyML Notebook includes:

  • Plotly: interactive, web-ready charts
  • Matplotlib / Seaborn: static, publication-quality plots
  • Altair / Vega: declarative statistical visualizations
  • Recharts: built-in chart renderer for DataFrame outputs

Exporting

Every visualization and table can be exported:

  • 📷 Copy as Image: for presentations or documents
  • 💾 Export as CSV/Parquet: for downstream processing
  • 📊 Promote to Dashboard: turn exploration cells into interactive dashboards