Google Cloud Storage (GCS) - Reelevant Documentation

Overview

The GCS (Google Cloud Storage) source retrieves data files from a GCS bucket. Use it when your data is stored as files in Google Cloud and you want to import them into Reelevant for personalization.

Configuration

Required Fields

Field	Description
`projectId`	The Google Cloud project ID.
`clientEmail`	The service account email address (e.g. `my-sa@project.iam.gserviceaccount.com`).
`privateKey`	The service account private key (from the JSON key file).
`bucket`	The GCS bucket name.
`path`	The file path or pattern within the bucket (see Wildcard Paths below).
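
The three credential fields correspond to entries in the JSON key file that Google Cloud generates for a service account (`project_id`, `client_email`, `private_key`). As a minimal sketch, the values could be pulled out of that file as shown here. The helper name `reelevant_fields_from_key_file` and the shape of the returned dictionary are illustrative assumptions, not part of a Reelevant API.

```python
import json


def reelevant_fields_from_key_file(key_file_path, bucket, path):
    """Map a service account JSON key file to the required source fields.

    Hypothetical helper: only the key names read from the file
    (project_id, client_email, private_key) come from Google's key format.
    """
    with open(key_file_path, encoding="utf-8") as fh:
        key = json.load(fh)
    return {
        "projectId": key["project_id"],
        "clientEmail": key["client_email"],
        # json.load has already turned the "\n" escapes into real newlines.
        "privateKey": key["private_key"],
        "bucket": bucket,
        "path": path,
    }
```

Reading the key with a JSON parser, rather than copying it by hand, avoids the common mistake of pasting a private key that still contains literal `\n` escape sequences.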

Optional Fields

Field	Description
`processType`	File processing mode when using wildcard paths: `last_unprocess` (default), `all_unprocess`, or `all`.

The minimum required role is Storage Object Viewer (roles/storage.objectViewer) on the target bucket.

Supported File Formats

The GCS source automatically detects the file format. The following formats are supported:

Format	Description
CSV	Comma-separated values. Delimiter is auto-detected (comma, semicolon, tab, pipe).
JSON	Standard JSON files with a root array or object.
NDJSON	Newline-delimited JSON (one JSON object per line).
XML	XML files — the root element path is auto-detected.
Parquet	Apache Parquet columnar format.
Avro	Apache Avro serialization format.
XLSX	Microsoft Excel files.

Compressed files (.gz, .zip) are automatically decompressed before parsing.
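
To preview how a compressed CSV export might be interpreted before connecting it, the decompression and delimiter detection described above can be approximated locally. This is a sketch using only the Python standard library; it does not reproduce Reelevant's own detection logic, and the function name `read_csv_rows` is hypothetical.

```python
import csv
import gzip
import io

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def read_csv_rows(raw: bytes):
    """Decompress gzip data if needed, then parse CSV with a sniffed delimiter."""
    if raw[:2] == GZIP_MAGIC:
        raw = gzip.decompress(raw)
    text = raw.decode("utf-8")
    # Restrict sniffing to the delimiters listed in the formats table.
    dialect = csv.Sniffer().sniff(text, delimiters=",;\t|")
    return list(csv.DictReader(io.StringIO(text), dialect=dialect))
```

`csv.Sniffer` relies on heuristics and can fail on very small or irregular samples, so a file that parses here is not guaranteed to be detected the same way by the source, and vice versa.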

PGP Decryption

The GCS source supports PGP-encrypted files. When `pgpPrivateKey` is configured, files are decrypted transparently before decompression and parsing. The fields below apply only when PGP decryption is used.

Field	Required	Description
`pgpPrivateKey`	Yes	The PGP/GPG private key in armored (ASCII) format.
`pgpPassphrase`	No	The passphrase for the private key, if encrypted.

Both armored (.asc) and binary (.pgp, .gpg) encrypted files are supported.
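
The difference between the two encodings is visible in the first bytes of the file: an armored OpenPGP message is ASCII text that begins with a `-----BEGIN PGP MESSAGE-----` line, while a binary one starts directly with packet data. A small local check, assuming a hypothetical helper named `is_armored_pgp`, might look like this:

```python
ARMOR_HEADER = b"-----BEGIN PGP MESSAGE-----"


def is_armored_pgp(data: bytes) -> bool:
    """Return True if the data starts with an ASCII-armored PGP message header."""
    # Leading whitespace is skipped because some tools emit a blank first line.
    return data.lstrip().startswith(ARMOR_HEADER)
```

This only inspects the header. It does not verify that the file is a valid PGP message or that it can be decrypted with the configured key.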

See the PGP Decryption guide for details on key generation, supported formats, and error handling.

Wildcard Paths

You can use a wildcard (*) in the file path to match multiple files. This is useful when:

- A new file is exported periodically with a different name (e.g. `exports/products_20240101.csv`, `exports/products_20240102.csv`).
- Data is split across multiple files in the same directory.

Example patterns:

- `exports/products_*.csv`: matches all CSV files starting with `products_` in the `exports/` folder
- `data/*.json`: matches all JSON files in the `data/` folder
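
To check which object names a pattern would select, the matching can be approximated locally with Python's `fnmatch` module. This is a sketch, not Reelevant's matcher: the function name `match_objects` is hypothetical, and `fnmatch` lets `*` match `/` as well, so it may also match objects in nested folders. Whether the GCS source behaves the same way is not stated here.

```python
from fnmatch import fnmatchcase


def match_objects(pattern, object_names):
    """Return the object names that match a wildcard pattern (case-sensitive)."""
    return [name for name in object_names if fnmatchcase(name, pattern)]


names = [
    "exports/products_20240101.csv",
    "exports/products_20240102.csv",
    "exports/orders_20240101.csv",
    "data/catalog.json",
]
matched = match_objects("exports/products_*.csv", names)
```

With the sample names above, `matched` contains the two `products_` files and leaves out the `orders_` export and the JSON file.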

File Processing Modes

When using wildcard paths, you can configure how files are selected via the processType field:

`processType` value	Description
`last_unprocess` (default)	Process only the most recent file that hasn’t been processed yet. Ideal for full-dataset exports where only the latest file matters.
`all_unprocess`	Process all files that haven’t been processed yet. Useful for incremental/delta exports.
`all`	Process all matching files on every sync, regardless of whether they were processed before. Useful when a full dataset is split across several files whose names do not change between exports.
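
The three modes can be pictured as different filters over the list of matching files. The sketch below rests on two assumptions that are not confirmed by this page: "most recent" is decided by the object's last-modified time, and already-processed files are tracked by name. `RemoteFile` and `select_files` are illustrative names.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class RemoteFile:
    name: str
    updated: datetime  # assumed ordering key: last-modified time


def select_files(matches, processed_names, process_type="last_unprocess"):
    """Pick the files to process on one sync for a given processType."""
    ordered = sorted(matches, key=lambda f: f.updated)
    if process_type == "all":
        return ordered  # everything, every sync
    pending = [f for f in ordered if f.name not in processed_names]
    if process_type == "all_unprocess":
        return pending  # every file not seen before
    if process_type == "last_unprocess":
        return pending[-1:]  # newest unseen file, or nothing
    raise ValueError(f"unknown processType: {process_type!r}")
```

Under these assumptions, a daily full export fits `last_unprocess` because older unseen files are skipped, while delta exports need `all_unprocess` so that no intermediate file is lost.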

How It Works

1. Reelevant authenticates using the provided service account credentials.
2. Reelevant accesses the specified bucket and path. If the path contains a wildcard, the matching files are listed.
3. Files are downloaded, decrypted if `pgpPrivateKey` is configured, decompressed if needed, and parsed based on the detected format.
4. Fields are extracted and made available for mapping.
5. On subsequent syncs, files are re-fetched according to the configured processing mode.

Ensure the service account has read access to the specified bucket and objects. The minimum required role is Storage Object Viewer.
