
Bulk Export 780K Swiss Companies for Data Science

VynCo Team · 4 min read · 4/14/2026

The VynCo API holds structured data on every company in the Swiss commercial registry. For data scientists, the challenge is getting that data into pandas, polars, or DuckDB without writing pagination loops. The VynCo export endpoints solve this with three approaches: quick Excel exports, async bulk exports, and streaming NDJSON.

Quick Export: Excel in One Call

For small to medium datasets (up to 10,000 rows), the export_excel() method generates a downloadable Excel file directly.

from vynco import VynCo

client = VynCo(api_key="your-key")

# Export all active companies in Zurich
client.companies.export_excel(
    canton="ZH",
    status="active",
    output_path="zurich_companies.xlsx"
)

This is the fastest way to get structured company data into a spreadsheet for ad-hoc analysis. The export includes all standard fields: UID, name, legal form, address, capital, purpose, status, auditor, and registration date.

Async Bulk Export for Large Datasets

For larger exports (up to the full 780K registry), use the async export pipeline. You create an export job, and the API notifies you when the file is ready.

# Create an async export job
job = client.exports.create(
    format="csv",
    filters={
        "legal_form": "AG",
        "status": "active",
        "capital_min": 100000
    }
)
print(f"Export job {job.id} created, status: {job.status}")

# Poll until the job reaches a terminal state (or use webhooks)
import time
while job.status not in ("completed", "failed"):
    time.sleep(5)
    job = client.exports.get(job.id)
if job.status == "failed":
    raise RuntimeError(f"Export job {job.id} failed")

# Download the result
client.exports.download(job.id, output_path="swiss_ag_100k_plus.csv")
print(f"Downloaded {job.row_count} rows")

The async pipeline supports CSV, JSON, and NDJSON output formats. For datasets over 100K rows, NDJSON is recommended because it streams without loading the entire file into memory.
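Because NDJSON is one JSON object per line, you can process an export of any size with just the standard library, holding only one record in memory at a time. A minimal sketch, using a tiny inline stand-in for a real downloaded export (the field names mirror the standard fields listed above):

```python
import json

# Tiny stand-in for a real export file (the real one comes from
# client.exports.download); three records are enough to show the pattern.
sample = [
    {"name": "Alpha AG", "capital": 2_000_000},
    {"name": "Beta GmbH", "capital": 20_000},
    {"name": "Gamma AG", "capital": 5_000_000},
]
with open("companies_sample.ndjson", "w", encoding="utf-8") as f:
    for record in sample:
        f.write(json.dumps(record) + "\n")

# Stream one record at a time; only a single line is ever in memory.
total = high_capital = 0
with open("companies_sample.ndjson", encoding="utf-8") as f:
    for line in f:
        company = json.loads(line)
        total += 1
        if company.get("capital", 0) >= 1_000_000:
            high_capital += 1

print(f"{high_capital} of {total} companies have capital >= 1M CHF")
# → 2 of 3 companies have capital >= 1M CHF
```

The same loop works unchanged on a 780K-row export, since memory use is bounded by the longest line, not the file size.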

Loading into pandas

Once you have your export, loading it into pandas is straightforward.

import pandas as pd

# From CSV
df = pd.read_csv("swiss_ag_100k_plus.csv")

# From NDJSON (line-delimited JSON)
df = pd.read_json("companies.ndjson", lines=True)

print(f"{len(df)} companies loaded")
print(df.groupby("legal_form")["capital"].describe())

Loading into polars

polars handles large datasets more efficiently than pandas for many operations.

import polars as pl

# polars reads NDJSON natively
df = pl.read_ndjson("companies.ndjson")

# Fast aggregation
print(
    df.group_by("canton")
    .agg([
        pl.len().alias("count"),  # pl.count() is deprecated in recent polars
        pl.col("capital").mean().alias("avg_capital")
    ])
    .sort("count", descending=True)
)

Loading into DuckDB

DuckDB can query NDJSON and CSV files directly without loading them into memory first.

import duckdb

# Query the export file directly
result = duckdb.sql("""
    SELECT canton, legal_form, COUNT(*) as n, AVG(capital) as avg_cap
    FROM 'companies.ndjson'
    WHERE status = 'active' AND capital > 0
    GROUP BY canton, legal_form
    ORDER BY n DESC
    LIMIT 20
""")
print(result.df())

Bulk Profiles Endpoint

If you need full company profiles (including persons, auditors, and SOGC history) rather than flat tabular data, use the bulk profiles endpoint.

uids = ["CHE-100.000.001", "CHE-100.000.002", "CHE-100.000.003"]
profiles = client.companies.bulk_profiles(uids=uids)

for p in profiles:
    print(f"{p.name}: {len(p.persons)} persons, {len(p.sogc_publications)} publications")

This accepts up to 100 UIDs per request and returns complete profiles, saving you from making hundreds of individual API calls.
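With the 100-UID-per-request limit, longer lists need to be split into batches. A minimal sketch: the `chunked` helper is our own, the UID list is synthetic, and the actual `bulk_profiles` call is shown commented out since it needs a live API key:

```python
def chunked(items, size=100):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

# Hypothetical list of 250 UIDs to look up.
uids = [f"CHE-100.{i // 1000:03d}.{i % 1000:03d}" for i in range(250)]

batches = list(chunked(uids))
print([len(b) for b in batches])  # → [100, 100, 50]

# With a live client (see earlier examples), each batch becomes one call:
# profiles = []
# for batch in chunked(uids):
#     profiles.extend(client.companies.bulk_profiles(uids=batch))
```

250 UIDs thus cost three API calls instead of 250 individual lookups.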

Choosing the Right Export Method

Method            | Best For                 | Max Rows     | Format
------------------|--------------------------|--------------|------------------
export_excel()    | Quick ad-hoc analysis    | 10,000       | XLSX
exports.create()  | Large filtered datasets  | 780,000+     | CSV, JSON, NDJSON
bulk_profiles()   | Full profiles for a list | 100 per call | JSON

Tips for Working with Swiss Company Data

  • Filter early: Apply canton, legal form, and status filters in the export request rather than downloading everything and filtering locally
  • Use NDJSON for large exports: It streams line by line, so you never need to hold the entire dataset in memory
  • Capital is in CHF: The capital field is always denominated in Swiss francs
  • UIDs are stable: The CHE-format UID never changes for a given company, making it a reliable join key across datasets
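Because the UID never changes, it is the natural join key when combining a VynCo export with your own data. A short sketch with two hypothetical in-memory datasets (the `revenue_chf` column is invented for illustration):

```python
import pandas as pd

# Two hypothetical datasets keyed by the stable CHE-format UID.
companies = pd.DataFrame({
    "uid": ["CHE-100.000.001", "CHE-100.000.002"],
    "name": ["Alpha AG", "Beta GmbH"],
})
revenues = pd.DataFrame({
    "uid": ["CHE-100.000.001", "CHE-100.000.002"],
    "revenue_chf": [1_200_000, 480_000],
})

# An inner join on uid is safe across snapshots taken at different times,
# because the UID is stable even when a company renames or relocates.
merged = companies.merge(revenues, on="uid", how="inner")
print(merged)
```

The same join works across exports pulled months apart, which is exactly what makes the UID preferable to company names as a key.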

Ready to build your Swiss company dataset? Start your free trial and export your first batch today.