Data Exploration
FlowyML Notebook provides a premium data exploration experience: every DataFrame gets automatic, multi-tab visual profiling with zero extra code.
Rich Context Display
When you return a DataFrame (Pandas or Polars), FlowyML Notebook automatically generates a Rich Context Display with 10 interactive tabs:
| Tab | What It Shows |
|---|---|
| | Sortable, paginated data view |
| Stats | Column-level statistics with bento-grid summary |
| Charts | Auto-generated histograms and distribution plots |
| Correlations | Pearson correlation heatmap |
| | Missing values, duplicates, data integrity |
| ML Insights | Outlier detection, scaling, target suggestions |
| | Side-by-side DataFrame comparison |
| | AI-powered data analysis |
| SmartPrep | Actionable preprocessing suggestions with code |
| Algorithm Matchmaker | ML algorithm recommendations with pipeline code |
Stats View
A high-density overview of the most critical metrics: total rows, column count, missing values, and memory impact, plus per-column statistics with type detection.
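For readers who want to reproduce these headline numbers by hand, the following is a minimal pandas sketch. The function names `overview` and `per_column` are hypothetical and are not part of FlowyML Notebook; the product may compute its metrics differently.

```python
import pandas as pd

def overview(df: pd.DataFrame) -> dict:
    """Headline metrics similar to those the Stats view describes."""
    return {
        "rows": len(df),
        "columns": df.shape[1],
        "missing_values": int(df.isna().sum().sum()),
        # deep=True counts the bytes actually held by object (string) columns
        "memory_bytes": int(df.memory_usage(deep=True).sum()),
    }

def per_column(df: pd.DataFrame) -> pd.DataFrame:
    """One row per column: dtype, null count, and distinct non-null values."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "nulls": df.isna().sum(),
        "unique": df.nunique(),
    })

df = pd.DataFrame({
    "age": [34, 41, None, 29],
    "city": ["Lyon", "Oslo", "Lyon", None],
})
summary = overview(df)
columns = per_column(df)
```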
Charts View
Auto-generated visualizations for every column. Numeric columns get histograms with µ/σ annotations. Categorical columns get horizontal bar charts with value counts.
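The data behind such charts can be approximated with NumPy and pandas. This is only a sketch: the helper names are hypothetical, the bin count is arbitrary, and using the sample standard deviation (`ddof=1`) for σ is an assumption, since the page does not say which estimator the Charts view uses.

```python
import numpy as np
import pandas as pd

def numeric_chart_data(s: pd.Series, bins: int = 10) -> dict:
    """Histogram counts and edges plus the mean and std used for annotations."""
    values = s.dropna().to_numpy(dtype=float)
    counts, edges = np.histogram(values, bins=bins)
    return {
        "counts": counts,
        "edges": edges,
        "mean": float(values.mean()),
        "std": float(values.std(ddof=1)),  # sample std; an assumption
    }

def categorical_chart_data(s: pd.Series) -> pd.Series:
    """Value counts, most frequent first, ready for a horizontal bar chart."""
    return s.value_counts()

df = pd.DataFrame({
    "score": [1.0, 2.0, 3.0, 4.0, 5.0, None],
    "plan": ["free", "pro", "free", "free", "team", "pro"],
})
hist = numeric_chart_data(df["score"], bins=4)
bars = categorical_chart_data(df["plan"])
```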
Correlations
Pearson correlation matrix with color-coded heatmap. Instantly spot positive (purple) and negative (red) relationships between features.
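The same matrix can be computed directly in pandas, which may help when checking what the heatmap shows. This sketch uses a toy DataFrame with made-up values and does not reproduce the heatmap coloring.

```python
import pandas as pd

df = pd.DataFrame({
    "x": [1, 2, 3, 4, 5],
    "y": [2, 4, 6, 8, 10],    # rises with x
    "z": [10, 8, 6, 4, 2],    # falls as x rises
    "label": list("abcde"),   # non-numeric, left out of the matrix
})

# Pearson correlation is defined for numeric columns only
corr = df.select_dtypes(include="number").corr(method="pearson")
```

Values near +1 indicate a strong positive relationship and values near -1 a strong negative one.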
ML Insights
Automated recommendations for ML preprocessing, with no manual analysis needed:
- Outlier Detection: IQR-based detection with percentage and bounds
- Scaling Recommendations: Log transform, no scaling, or normalization suggestions
- Target Variables: Automatically identifies classification and regression targets
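As an illustration of the IQR-based outlier check listed above, here is a minimal sketch. The function name `iqr_outliers` is hypothetical, and the 1.5 multiplier is the conventional Tukey fence rather than a value confirmed for FlowyML Notebook.

```python
import pandas as pd

def iqr_outliers(s: pd.Series, k: float = 1.5) -> dict:
    """Report IQR fences plus the count and percentage of values outside them."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    mask = (s < lower) | (s > upper)  # NaN compares False, so nulls are ignored
    return {
        "lower": float(lower),
        "upper": float(upper),
        "count": int(mask.sum()),
        "percent": float(100 * mask.sum() / s.notna().sum()),
    }

values = pd.Series([10, 12, 11, 13, 12, 11, 95])
report = iqr_outliers(values)
```

With this toy series, only the value 95 falls outside the fences.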
SmartPrep Advisor (NEW in v1.2)
Go beyond insights: get actionable preprocessing suggestions with severity-ranked cards and ready-to-run Python code.
The SmartPrep tab detects 6 categories of data quality issues:
- Missing Values: Per-column null analysis with imputation strategy (median for numeric, mode for categorical, drop for >60% missing)
- Skewed Distributions: Skewness detection with `log1p` or Yeo-Johnson power transform suggestions
- Outliers: IQR-based detection with `clip()` code
- High Cardinality: Categorical columns with >50 unique values flagged for frequency encoding
- Class Imbalance: Target variable ratio analysis with SMOTE and class weight suggestions
- Feature Scaling: Cross-feature range analysis (>100x difference triggers `StandardScaler` suggestion)
Each card includes a "Generate Cell" button that inserts the fix directly into your notebook. Use "Apply All Fixes" to insert all suggestions at once.
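The rules listed for the SmartPrep tab can be sketched in plain pandas. Everything here is illustrative: the helper names are hypothetical, the generated cells may look different, and reading the ">100x difference" rule as a ratio between the widest and narrowest numeric column ranges is an assumption.

```python
import numpy as np
import pandas as pd

def imputation_strategy(s: pd.Series):
    """Return 'drop', 'median', 'mode', or None when nothing is missing."""
    missing = s.isna().mean()
    if missing == 0:
        return None
    if missing > 0.60:
        return "drop"
    return "median" if pd.api.types.is_numeric_dtype(s) else "mode"

def needs_scaling(df: pd.DataFrame, ratio: float = 100.0) -> bool:
    """True when numeric column ranges differ by more than `ratio` (assumed rule)."""
    numeric = df.select_dtypes(include="number")
    ranges = numeric.max() - numeric.min()
    ranges = ranges[ranges > 0]
    return bool(len(ranges) >= 2 and ranges.max() / ranges.min() > ratio)

df = pd.DataFrame({
    "income": [30_000, 42_000, None, 51_000, 1_200_000],
    "rooms": [2, 3, 3, 4, 2],
    "segment": ["a", None, "b", "a", "a"],
})
plan = {col: imputation_strategy(df[col]) for col in df.columns}
scale = needs_scaling(df)

# Apply the suggested fixes on a copy
fixed = df.copy()
fixed["income"] = fixed["income"].fillna(fixed["income"].median())
fixed["segment"] = fixed["segment"].fillna(fixed["segment"].mode().iloc[0])

# Clip outliers to the IQR fences, then log-transform the skewed column
q1, q3 = fixed["income"].quantile(0.25), fixed["income"].quantile(0.75)
iqr = q3 - q1
fixed["income_clipped"] = fixed["income"].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
fixed["income_log"] = np.log1p(fixed["income"])
```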
Algorithm Matchmaker (NEW in v1.2)
Select a target column and get ranked ML algorithm recommendations with reasoning, caveats, and complete pipeline code.
The Matchmaker analyzes:
- Task type: Automatically detects classification (≤20 unique values), regression (>20), or clustering (no target)
- Data characteristics: Sample size, feature counts, dimensionality, null presence
- Algorithm fit: Scores each algorithm 0-100 based on your specific data profile
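The task-type rule above can be expressed in a few lines. This is a sketch of the stated thresholds only; the function name `detect_task` is hypothetical, and the Matchmaker may apply further checks (for example on dtype) that this page does not describe.

```python
from typing import Optional

import pandas as pd

def detect_task(df: pd.DataFrame, target: Optional[str], max_classes: int = 20) -> str:
    """Classify the ML task from the number of distinct target values."""
    if target is None:
        return "clustering"  # no target column selected
    n_unique = df[target].nunique()
    return "classification" if n_unique <= max_classes else "regression"

df = pd.DataFrame({
    "churn": [0, 1] * 15,                       # 2 distinct values
    "price": [100.0 + i for i in range(30)],    # 30 distinct values
})
tasks = {
    "churn": detect_task(df, "churn"),
    "price": detect_task(df, "price"),
    "none": detect_task(df, None),
}
```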
Supported algorithms include Random Forest, XGBoost, LightGBM, Logistic/Linear Regression, SVM, KNN, ElasticNet, KMeans, DBSCAN, and Hierarchical Clustering.
Click "Generate Pipeline Cell" to insert a complete sklearn pipeline (train/test split, model fitting, and evaluation metrics) directly into your notebook.
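A generated cell might resemble the scikit-learn sketch below. This is not the Matchmaker's actual output: the synthetic data, the median imputer, the scaler, and the choice of Logistic Regression are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic binary-classification data standing in for a real DataFrame
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({"f1": rng.normal(size=n), "f2": rng.normal(size=n)})
df["target"] = (df["f1"] + df["f2"] > 0).astype(int)

X = df.drop(columns="target")
y = df["target"]

# Hold out 20% of the rows for evaluation, keeping class proportions
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

pipeline = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
    ("model", LogisticRegression(max_iter=1000)),
])
pipeline.fit(X_train, y_train)

accuracy = accuracy_score(y_test, pipeline.predict(X_test))
```

Keeping imputation and scaling inside the `Pipeline` means they are fitted on the training split only, which avoids leaking test data into preprocessing.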
Built-in Visualization Libraries
In addition to the automatic profiling, FlowyML Notebook includes:
- Plotly: Interactive, web-ready charts
- Matplotlib / Seaborn: Static, publication-quality plots
- Altair / Vega: Declarative statistical visualizations
- Recharts: Built-in chart renderer for DataFrame outputs
Exporting
Every visualization and table can be exported:
- Copy as Image: For presentations or documents
- Export as CSV/Parquet: For downstream processing
- Promote to Dashboard: Turn exploration cells into interactive dashboards