Overview
The GCS (Google Cloud Storage) source retrieves data files from a GCS bucket. Use it when your data is stored as files in Google Cloud and you want to import them into Reelevant for personalization.
Configuration
Required Fields
| Field | Description |
|---|---|
| projectId | The Google Cloud project ID. |
| clientEmail | The service account email address (e.g. my-sa@project.iam.gserviceaccount.com). |
| privateKey | The service account private key (from the JSON key file). |
| bucket | The GCS bucket name. |
| path | The file path or pattern within the bucket (see Wildcard Paths below). |
Optional Fields
| Field | Description |
|---|---|
| processType | File processing mode when using wildcard paths: last_unprocess (default), all_unprocess, or all. |
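As an illustration, the fields above can be collected into a single configuration object and checked before use. The sketch below is hypothetical: the dict shape, the `validate_config` helper, and the placeholder values are assumptions for illustration, not the format Reelevant stores internally. Only the field names and the `processType` values come from this page.

```python
# Field names taken from the tables above; everything else is illustrative.
REQUIRED_FIELDS = ("projectId", "clientEmail", "privateKey", "bucket", "path")
PROCESS_TYPES = ("last_unprocess", "all_unprocess", "all")


def validate_config(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the config looks complete."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if not config.get(name)
    ]
    # processType is optional and falls back to the documented default.
    process_type = config.get("processType", "last_unprocess")
    if process_type not in PROCESS_TYPES:
        problems.append(f"unknown processType: {process_type}")
    return problems


# Placeholder values only; substitute your own project, bucket, and key.
example_config = {
    "projectId": "my-project",
    "clientEmail": "my-sa@project.iam.gserviceaccount.com",
    "privateKey": "<private_key value from the JSON key file>",
    "bucket": "my-bucket",
    "path": "exports/products_*.csv",
    "processType": "all_unprocess",
}
```

Calling `validate_config(example_config)` returns an empty list, while a config missing `bucket` would report that field by name.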
The minimum required role is Storage Object Viewer (roles/storage.objectViewer) on the target bucket.
The GCS source automatically detects the file format. The following formats are supported:
| Format | Description |
|---|---|
| CSV | Comma-separated values. Delimiter is auto-detected (comma, semicolon, tab, pipe). |
| JSON | Standard JSON files with a root array or object. |
| NDJSON | Newline-delimited JSON (one JSON object per line). |
| XML | XML files — the root element path is auto-detected. |
| Parquet | Apache Parquet columnar format. |
| Avro | Apache Avro serialization format. |
| XLSX | Microsoft Excel files. |
Compressed files (.gz, .zip) are automatically decompressed before parsing.
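To give a sense of what delimiter auto-detection involves for CSV files, here is a minimal sketch using Python's standard `csv.Sniffer`, restricted to the four delimiters listed above. This is not necessarily how Reelevant detects delimiters; the `detect_delimiter` helper is a hypothetical name used only for illustration.

```python
import csv


def detect_delimiter(sample: str) -> str:
    """Guess the delimiter of a CSV sample among comma, semicolon, tab, and pipe."""
    dialect = csv.Sniffer().sniff(sample, delimiters=",;\t|")
    return dialect.delimiter


# A few lines from the start of the file are usually enough for a guess.
semicolon_sample = "id;name;price\n1;apple;1.5\n2;pear;2.0\n"
tab_sample = "id\tname\n1\tapple\n2\tpear\n"
```

If your files use an unusual layout (for example, a single column, or free text containing many separators), checking a sample this way before the first sync can help explain unexpected field splits.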
PGP Decryption
The GCS source supports PGP-encrypted files. When pgpPrivateKey is configured, files are decrypted transparently before decompression and parsing.
| Field | Required | Description |
|---|---|---|
| pgpPrivateKey | Yes | The PGP/GPG private key in armored (ASCII) format. |
| pgpPassphrase | No | The passphrase for the private key, if encrypted. |
Both armored (.asc) and binary (.pgp, .gpg) encrypted files are supported.
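If you are unsure which of the two encodings a file uses, an armored OpenPGP message is plain text that begins with a `-----BEGIN PGP MESSAGE-----` header, while a binary one does not. The check below is a small sketch for inspecting your own files; `is_armored` is a hypothetical helper and does not describe how the GCS source itself distinguishes the two.

```python
ARMOR_HEADER = b"-----BEGIN PGP MESSAGE-----"


def is_armored(data: bytes) -> bool:
    """Return True if the payload starts with the ASCII-armor header of a PGP message."""
    # Leading whitespace or blank lines before the header are tolerated.
    return data.lstrip().startswith(ARMOR_HEADER)
```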
Wildcard Paths
You can use a wildcard (*) in the file path to match multiple files. This is useful when:
- A new file is exported periodically with a different name (e.g. exports/products_20240101.csv, exports/products_20240102.csv)
- Data is split across multiple files in the same directory
Example patterns:
- exports/products_*.csv: matches all CSV files starting with products_ in the exports/ folder
- data/*.json: matches all JSON files in the data/ folder
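To preview which object names a pattern would select, you can reproduce the matching locally. The sketch below assumes that `*` matches any run of characters within a single folder level and does not cross `/`; that reading follows the examples above but is an assumption about the exact matching rules. The function names are hypothetical.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate a path pattern with * wildcards into a compiled regular expression."""
    # Escape the literal parts, then let each * match anything except a slash.
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile("[^/]*".join(literal_parts))


def match_paths(pattern: str, names: list[str]) -> list[str]:
    """Return the object names that match the whole pattern, in their original order."""
    regex = wildcard_to_regex(pattern)
    return [name for name in names if regex.fullmatch(name)]


objects = [
    "exports/products_20240101.csv",
    "exports/products_20240102.csv",
    "exports/orders_20240101.csv",
    "exports/archive/products_20230101.csv",
]
```

With these names, `match_paths("exports/products_*.csv", objects)` keeps only the two `products_` files directly under `exports/`.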
File Processing Modes
When using wildcard paths, you can configure how files are selected via the processType field:
| processType value | Description |
|---|---|
| last_unprocess (default) | Process only the most recent file that hasn't been processed yet. Ideal for full-dataset exports where only the latest file matters. |
| all_unprocess | Process all files that haven't been processed yet. Useful for incremental/delta exports. |
| all | Process all matching files on every sync, regardless of whether they were processed before. Useful for full datasets split across multiple files with the same names. |
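The difference between the three modes can be sketched as a selection function over the files that match the pattern. This is an illustration only: it assumes "most recent" is decided by each file's last-modified time and that processed files are tracked by name, neither of which is specified here. `FileInfo` and `select_files` are hypothetical names.

```python
from typing import NamedTuple


class FileInfo(NamedTuple):
    name: str
    updated: float  # last-modified time as a Unix timestamp (assumed ordering key)


def select_files(process_type: str, matching: list[FileInfo], processed: set[str]) -> list[FileInfo]:
    """Pick the files to process on this sync, oldest first."""
    by_age = sorted(matching, key=lambda f: f.updated)
    if process_type == "all":
        # Every matching file, whether or not it was seen before.
        return by_age
    unprocessed = [f for f in by_age if f.name not in processed]
    if process_type == "all_unprocess":
        return unprocessed
    if process_type == "last_unprocess":
        # Only the newest unprocessed file, or nothing if all were processed.
        return unprocessed[-1:]
    raise ValueError(f"unknown processType: {process_type}")
```

For example, with three matching files of which the oldest is already processed, `last_unprocess` selects only the newest file, `all_unprocess` selects the two remaining files, and `all` selects all three.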
How It Works
- Reelevant authenticates using the provided service account credentials.
- The specified bucket and path are accessed. If a wildcard is used, matching files are listed.
- Files are downloaded, decompressed if needed, and parsed based on the detected format.
- Fields are extracted and made available for mapping.
- On subsequent syncs, files are re-fetched according to the configured processing mode.
Ensure the service account has read access to the specified bucket and objects. The minimum required role is Storage Object Viewer.
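The "decompressed if needed, and parsed" step can be pictured with a small example for gzip-compressed NDJSON. The sketch assumes compression is recognized by the gzip magic bytes at the start of the payload and that the content is UTF-8; both are assumptions for illustration, and the helper names are hypothetical rather than part of the GCS source.

```python
import gzip
import json

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def decompress_if_needed(raw: bytes) -> bytes:
    """Return the payload unchanged unless it looks like gzip, in which case decompress it."""
    return gzip.decompress(raw) if raw[:2] == GZIP_MAGIC else raw


def parse_ndjson(raw: bytes) -> list[dict]:
    """Decode one JSON object per non-empty line, decompressing first when required."""
    text = decompress_if_needed(raw).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]


payload = b'{"id": 1, "name": "apple"}\n{"id": 2, "name": "pear"}\n'
```

Both `parse_ndjson(payload)` and `parse_ndjson(gzip.compress(payload))` yield the same two records, which is the behavior the steps above describe for compressed and uncompressed files.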