Data Sources

HolySheet auto-detects and converts data from multiple Python data formats. You never need to transform data manually; just pass it directly to any block that accepts data.


Supported Formats

List of Dicts

The most common format. Each dict is a row, and its keys are the column names.

list_of_dicts.py
data = [
    {"month": "Jan", "revenue": 124_500, "costs": 78_200},
    {"month": "Feb", "revenue": 138_200, "costs": 82_100},
    {"month": "Mar", "revenue": 152_800, "costs": 85_600},
]

LineChart(title="Revenue Trend", data=data, x="month", y=["revenue", "costs"])

This is the internal format

Internally, HolySheet converts all data to list[dict[str, Any]]. If your data is already in this format, it passes through with minimal overhead (just value sanitization).


Dict of Lists

Column-oriented format where keys are column names and values are lists.

dict_of_lists.py
data = {
    "month": ["Jan", "Feb", "Mar", "Apr"],
    "revenue": [124_500, 138_200, 152_800, 149_300],
    "costs": [78_200, 82_100, 85_600, 83_900],
}

LineChart(title="Revenue Trend", data=data, x="month", y="revenue")

Uniform Length Required

All lists must have the same length. If column lengths differ, a DataConversionError is raised:

# This will raise DataConversionError
data = {
    "month": ["Jan", "Feb", "Mar"],
    "revenue": [100, 200],  # Only 2 values!
}
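
The transposition and the length check described above can be sketched in plain Python. This is a minimal illustration, not HolySheet's actual implementation: the function name `transpose_columns` is hypothetical, and `ValueError` stands in for `DataConversionError` so the snippet runs without the library.

```python
from typing import Any


def transpose_columns(columns: dict[str, list[Any]]) -> list[dict[str, Any]]:
    """Turn {"col": [v1, v2, ...]} into [{"col": v1}, {"col": v2}, ...]."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        # Stand-in for HolySheet's DataConversionError (assumption).
        raise ValueError(f"Column lengths differ: {lengths}")
    names = list(columns)
    # zip(*values) walks the columns in parallel, yielding one tuple per row.
    return [dict(zip(names, row)) for row in zip(*columns.values())]


rows = transpose_columns({
    "month": ["Jan", "Feb"],
    "revenue": [124_500, 138_200],
})
print(rows)
```

Checking lengths before transposing matters because `zip` alone would silently drop the trailing values of the longer columns.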

Pandas DataFrames

pandas_data.py
import pandas as pd

df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr"],
    "revenue": [124_500, 138_200, 152_800, 149_300],
    "costs": [78_200, 82_100, 85_600, 83_900],
})

LineChart(title="Revenue Trend", data=df, x="month", y=["revenue", "costs"])
DataTable(title="Raw Data", data=df)

Optional Dependency

Pandas is an optional dependency. Install with:

pip install holysheet[pandas]

How it works: Internally calls df.to_dict(orient='records') and then sanitizes each value.


Polars DataFrames

polars_data.py
import polars as pl

df = pl.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr"],
    "revenue": [124_500, 138_200, 152_800, 149_300],
    "costs": [78_200, 82_100, 85_600, 83_900],
})

LineChart(title="Revenue Trend", data=df, x="month", y=["revenue", "costs"])
DataTable(title="Raw Data", data=df)

Optional Dependency

Polars is an optional dependency. Install with:

pip install holysheet[polars]

How it works: Internally calls df.to_dicts() and then sanitizes each value.


🧹 Value Sanitization

HolySheet automatically cleans values for safe JSON serialization. This happens transparently on all data formats.

Input Type    | Output         | Example
None          | None           | -
float NaN     | None           | float('nan') → None
float Inf     | None           | float('inf') → None
Decimal       | float          | Decimal("3.14") → 3.14
datetime      | ISO string     | datetime(2024, 1, 15) → "2024-01-15T00:00:00"
date          | ISO string     | date(2024, 1, 15) → "2024-01-15"
bytes         | UTF-8 string   | b"hello" → "hello"
NumPy scalar  | Native Python  | np.float64(3.14) → 3.14
NumPy NaN     | None           | np.nan → None
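
To make the table concrete, here is a rough standard-library sketch of how such a per-value cleaner could look. The function name `sanitize_value` is hypothetical, the NumPy rows are left out to keep the snippet dependency-free, and the handling of non-finite `Decimal` values is an assumption; HolySheet's real sanitizer may differ.

```python
import json
import math
from datetime import date, datetime
from decimal import Decimal
from typing import Any


def sanitize_value(value: Any) -> Any:
    """Map a single value to something json.dumps can serialize."""
    if value is None:
        return None
    if isinstance(value, float):
        # NaN and +/-Inf have no valid JSON representation.
        return value if math.isfinite(value) else None
    if isinstance(value, Decimal):
        # Assumption: non-finite Decimals are treated like non-finite floats.
        return float(value) if value.is_finite() else None
    if isinstance(value, (datetime, date)):
        # datetime is a subclass of date; both expose isoformat().
        return value.isoformat()
    if isinstance(value, bytes):
        return value.decode("utf-8")
    return value


row = {
    "score": float("nan"),
    "price": Decimal("3.14"),
    "when": date(2024, 1, 15),
    "raw": b"hello",
}
clean = {key: sanitize_value(val) for key, val in row.items()}
print(json.dumps(clean))
```

Without a step like this, `json.dumps` would raise on the `Decimal`, `date`, and `bytes` values and would emit a non-standard `NaN` token for the float.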

Automatic NaN Handling

import pandas as pd
import numpy as np

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Carol"],
    "score": [95, np.nan, 87],       # NaN is auto-converted to None
    "date": pd.to_datetime(["2024-01-01", "2024-02-01", "2024-03-01"]),
})

# Just pass it; HolySheet handles the NaN and datetime conversion
DataTable(title="Scores", data=df)

⚙ The to_records() Function

Under the hood, all data conversion goes through holysheet.data.to_records():

to_records_usage.py
from holysheet.data import to_records

# Convert any supported format to list[dict]
records = to_records(my_dataframe)
records = to_records(my_dict_of_lists)
records = to_records(my_list_of_dicts)

# Returns: [{"col1": val1, "col2": val2}, ...]

Conversion Flow

Input Data (any format)
    │
    ├── list[dict] ──→ Clean values ──→ list[dict]
    ├── dict[str, list] ──→ Transpose ──→ Clean values ──→ list[dict]
    ├── pd.DataFrame ──→ .to_dict("records") ──→ Clean values ──→ list[dict]
    └── pl.DataFrame ──→ .to_dicts() ──→ Clean values ──→ list[dict]
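
The routing step in this flow can be approximated as a small dispatcher. The sketch below is an assumption about the general shape, not the library's code: `to_records_sketch` and `FakeFrame` are hypothetical names, frames are detected by duck typing (the real library may use type checks instead), and both the value cleaning and the column-length check are omitted for brevity.

```python
from typing import Any


def to_records_sketch(data: Any) -> list[dict[str, Any]]:
    """Route each supported input shape to a list of row dicts."""
    if isinstance(data, list):
        if not all(isinstance(row, dict) for row in data):
            raise TypeError("Expected every list item to be a dict")
        return [dict(row) for row in data]
    if isinstance(data, dict):
        # Column-oriented input: transpose columns into rows.
        names = list(data)
        return [dict(zip(names, row)) for row in zip(*data.values())]
    if hasattr(data, "to_dicts"):
        # Polars-style frame.
        return data.to_dicts()
    if hasattr(data, "to_dict"):
        # pandas-style frame.
        return data.to_dict(orient="records")
    raise TypeError(f"Unsupported data type: {type(data).__name__}")


class FakeFrame:
    """Stand-in for a Polars DataFrame so the sketch runs without Polars."""

    def to_dicts(self) -> list[dict[str, Any]]:
        return [{"x": 1}, {"x": 2}]


print(to_records_sketch({"x": [1, 2], "y": [3, 4]}))
print(to_records_sketch(FakeFrame()))
```

Every branch ends in the same row-oriented shape, which is why the blocks themselves never need to know which format they were given.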

Error Handling

If data cannot be converted, a DataConversionError is raised:

from holysheet.data import to_records
from holysheet.exceptions import DataConversionError

try:
    # A plain string is not one of the supported data formats
    records = to_records("not valid data")
except DataConversionError as e:
    print(e.message)      # "Unsupported data type: str"
    print(e.source_type)  # "str"
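
For readers curious how an exception can carry both attributes used above, here is a hypothetical stand-in. `DataConversionErrorSketch` and `reject_unsupported` are illustrative names only; the real class lives in `holysheet.exceptions` and may be structured differently.

```python
class DataConversionErrorSketch(Exception):
    """Hypothetical stand-in mirroring the two attributes shown above."""

    def __init__(self, message: str, source_type: str) -> None:
        super().__init__(message)
        self.message = message
        self.source_type = source_type


def reject_unsupported(data: object) -> None:
    """Raise for inputs that are neither a list nor a dict (frames omitted)."""
    if not isinstance(data, (list, dict)):
        name = type(data).__name__
        raise DataConversionErrorSketch(f"Unsupported data type: {name}", name)


try:
    reject_unsupported("not valid data")
except DataConversionErrorSketch as err:
    caught = err  # keep a reference; `err` is cleared when the block exits

print(caught.message)
print(caught.source_type)
```

Keeping the offending type name as its own attribute lets calling code branch on it without parsing the message text.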

💡 Tips

Mixing Data Sources

You can use different data formats across blocks in the same report:

mixed_data.py
import pandas as pd

# Some data as dicts
kpi_data = {"revenue": 2_260_000, "users": 42_000}

# Chart data from pandas
chart_df = pd.read_csv("monthly_revenue.csv")

# Table data as list of dicts
customers = [
    {"name": "Acme Corp", "mrr": "$12,400"},
    {"name": "GlobalTech", "mrr": "$9,800"},
]

report = Report(title="Mixed Sources", theme="dark")
report.add(KPI(label="Revenue", value=f"${kpi_data['revenue']:,.0f}"))
report.add(LineChart(title="Trend", data=chart_df, x="month", y="revenue"))
report.add(DataTable(title="Top Customers", data=customers))

Large DataFrames

Performance Note

All data is embedded in the HTML file as JSON. Very large datasets (100K+ rows) will increase file size and may impact browser performance. Consider:

  • Aggregating data before passing to charts
  • Limiting DataTable rows to a reasonable size
  • Using paginated=True (default) for large tables
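
As a rough illustration of the first suggestion, the sketch below measures how much JSON a set of records produces before and after aggregation. It uses only the standard library and made-up data; the helper names `json_size_bytes` and `sum_by` are hypothetical and not part of HolySheet.

```python
import json
from collections import defaultdict


def json_size_bytes(records: list[dict]) -> int:
    """Size of the records once serialized as compact UTF-8 JSON."""
    return len(json.dumps(records, separators=(",", ":")).encode("utf-8"))


def sum_by(records: list[dict], key: str, value: str) -> list[dict]:
    """Collapse rows sharing the same `key` into one row with summed `value`."""
    totals: dict = defaultdict(float)
    for row in records:
        totals[row[key]] += row[value]
    return [{key: k, value: total} for k, total in totals.items()]


# 100,000 synthetic rows spread over 12 months (made-up data).
raw = [{"month": i % 12 + 1, "revenue": 100.0} for i in range(100_000)]
monthly = sum_by(raw, "month", "revenue")

print(len(raw), json_size_bytes(raw))
print(len(monthly), json_size_bytes(monthly))
```

A chart of monthly totals needs only the twelve aggregated rows, so passing `monthly` instead of `raw` keeps the embedded payload small.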

Data is Serialized at Export Time

Data conversion happens when you call export_html(), export_json(), or export_folder(), not when you create the block. Each block holds a reference to the object you passed in, so any change you make to a DataFrame after adding it to a block will appear in the export:

import pandas as pd

# Assumes `report` is an existing Report instance
df = pd.DataFrame({"x": [1, 2, 3], "y": [10, 20, 30]})
chart = LineChart(title="Chart", data=df, x="x", y="y")
report.add(chart)

# ⚠️ This modification WILL be reflected in the export
# because 'data' holds a reference to df
df["y"] = [100, 200, 300]

report.export_html("report.html")  # Uses the modified data
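
If you would rather have a block keep the data as it was when you created it, one option is to pass a copy. The sketch below shows the underlying reference behaviour with plain Python lists; it does not use HolySheet itself. With pandas, the equivalent would be passing `df.copy()`.

```python
import copy

rows = [{"x": 1, "y": 10}, {"x": 2, "y": 20}]

live = rows                    # same object: later edits are visible
frozen = copy.deepcopy(rows)   # independent snapshot taken now

rows[0]["y"] = 100             # mutate the original after "adding" it

print(live[0]["y"])    # 100
print(frozen[0]["y"])  # 10
```

A deep copy is used because a shallow copy of the list would still share the inner dicts with the original.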