Data Prep and Distribution

A data pipeline for large-scale analytical applications


Robust

Fault-tolerant. Distribute tens of gigabytes of data to 10,000s of tenants every hour.

Modular

Out-of-the-box functionality can be combined with your custom code.

Open

Comprehensive APIs and SDKs. Complete documentation. SQL/JDBC for data access.

Serverless

Focus on your customers, not your servers.

Connect your data sources

GoodData bricks can be scheduled to pull data from your relational databases or from third-party services such as Salesforce or Google Analytics. Connect to full or incremental CSV extracts on S3, WebDAV, or SFTP. Or run custom brick code on our platform.

Your application data

  • JDBC
  • S3
  • blob storage
  • ...and more

Data for Enrichment

  • salesforce
  • Google Analytics
  • custom code
  • ...and more

A downloader brick is described with a simple JSON configuration, for example:
{
    "downloaders": {
        "csv_downloader": {
            "type": "csv",
            "entities": [
                "Orders", "Customers", "Sales", "Merchants"
            ]
        }
    }
}
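To show how such a configuration can be consumed, here is a minimal sketch that parses a downloader config of the shape above and lists the entities each downloader will pull. The helper name and the parsing logic are illustrative, not a GoodData API:

```python
import json

# Same shape as the example configuration above.
CONFIG = """
{
    "downloaders": {
        "csv_downloader": {
            "type": "csv",
            "entities": ["Orders", "Customers", "Sales", "Merchants"]
        }
    }
}
"""

def entities_by_downloader(config_text: str) -> dict[str, list[str]]:
    """Map each configured downloader to the entities it is set to pull."""
    config = json.loads(config_text)
    return {
        name: spec.get("entities", [])
        for name, spec in config.get("downloaders", {}).items()
    }

print(entities_by_downloader(CONFIG))
# {'csv_downloader': ['Orders', 'Customers', 'Sales', 'Merchants']}
```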
Describe your data sources

Distribute data to your customers

Serving analytical workloads generated by thousands of customers from a single warehouse is not feasible. GoodData organizes your customers' analytical data into customer-specific workspaces. This enables per-customer customization and access controls, and lets you manage releases across workspaces with our lifecycle management tooling.

A declarative distribution service efficiently pushes deltas from your retrieved data into thousands of customer workspaces.
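The core of delta-based distribution can be sketched in a few lines: compare the rows previously distributed to a workspace with a fresh extract, and push only new or changed rows. The function name and row shapes are hypothetical, assuming rows keyed by primary key:

```python
def compute_delta(previous: dict, current: dict) -> dict:
    """Return only rows that are new or changed since the last distribution.

    Both arguments map a primary key to the row contents.
    """
    return {
        key: row
        for key, row in current.items()
        if previous.get(key) != row
    }

previous = {1: {"status": "open"}, 2: {"status": "won"}}
current = {1: {"status": "open"}, 2: {"status": "lost"}, 3: {"status": "open"}}
delta = compute_delta(previous, current)
# Only row 2 (changed) and row 3 (new) need to reach the workspace.
```

Pushing deltas rather than full tables is what keeps fan-out to thousands of workspaces tractable.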


Need to do more data preparation?

  • Our data integration tool can run initial deduplication, enrichment with computed fields, and manage how downloaded data is merged into persistent tables.
  • Our platform can orchestrate a flow of SQL transformations to pre-aggregate, de-normalize, process time-series data, build snapshots, consolidate multiple data sources and much more.
  • You can deploy and schedule custom scripts to train and retrain machine learning models or to perform any custom data preparation logic.
-- Transform opportunity fields history: find the first recorded value
-- of each opportunity field, then join back to fetch that value.
SELECT r.*, o.value FROM (
    SELECT MIN(d._VALID_FROM) AS _VALID_FROM, r.record_id, r.field FROM
        (SELECT DISTINCT record_id, field FROM opportunity_fields_history) r
        INNER JOIN (SELECT DISTINCT _VALID_FROM, record_id FROM opportunity_fields_history) d
            ON d.record_id = r.record_id
    GROUP BY r.record_id, r.field
) r
INNER JOIN opportunity_fields_history o
    ON o.record_id = r.record_id AND o.field = r.field AND o._VALID_FROM = r._VALID_FROM;
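The merge step mentioned above can be sketched as an upsert that keeps the most recent version of each record. The function, table shape, and field names (`record_id`, `_VALID_FROM`) are illustrative, echoing the SQL example rather than any platform API:

```python
def merge_batch(table: dict, batch: list[dict], key: str = "record_id") -> dict:
    """Merge a downloaded batch into a persistent table keyed by record id,
    keeping whichever version of each record has the latest _VALID_FROM."""
    merged = dict(table)
    for row in batch:
        existing = merged.get(row[key])
        if existing is None or row["_VALID_FROM"] >= existing["_VALID_FROM"]:
            merged[row[key]] = row
    return merged

table = {"A": {"record_id": "A", "_VALID_FROM": "2021-01-01", "value": 1}}
batch = [
    {"record_id": "A", "_VALID_FROM": "2021-02-01", "value": 2},  # newer: replaces
    {"record_id": "B", "_VALID_FROM": "2021-02-01", "value": 3},  # new record
]
merged = merge_batch(table, batch)
# merged["A"]["value"] == 2, and "B" is now present.
```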

Schedule and monitor

Schedule data ingestion tasks and invoke them via API or from the administration user interface. Set up dependencies between tasks. Get failure alerts via e-mail, or integrate alerts with third-party services such as PagerDuty. Review the task execution history and access log files, including failure details.
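Dependencies between tasks amount to a topological ordering: downstream tasks run only after their upstream tasks succeed. A minimal sketch with Python's standard-library `graphlib`, using hypothetical task names (on the platform you would configure these dependencies through the scheduler, not in code like this):

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks that must finish before it starts.
dependencies = {
    "merge": {"download_orders", "download_customers"},
    "distribute": {"merge"},
}

order = list(TopologicalSorter(dependencies).static_order())
# Downloads come first, then "merge", then "distribute".
```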


See how it works

Get Started
