GCS source configuration form

Overview

The GCS (Google Cloud Storage) source retrieves data files from a GCS bucket. Use it when your data is stored as files in Google Cloud and you want to import them into Reelevant for personalization.

Configuration

Required Fields

| Field | Description |
| --- | --- |
| projectId | The Google Cloud project ID. |
| clientEmail | The service account email address (e.g. my-sa@project.iam.gserviceaccount.com). |
| privateKey | The service account private key (from the JSON key file). |
| bucket | The GCS bucket name. |
| path | The file path or pattern within the bucket (see Wildcard Paths below). |

Optional Fields

| Field | Description |
| --- | --- |
| processType | File processing mode when using wildcard paths: last_unprocess (default), all_unprocess, or all. |

The minimum required role is Storage Object Viewer (roles/storage.objectViewer) on the target bucket.
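The three credential fields correspond to entries in the service account JSON key file, which stores them as project_id, client_email, and private_key. As a minimal sketch, the hypothetical helper below (build_gcs_source_config is not part of Reelevant) reads a key file and collects the values to enter in the form. The dictionary shape is an assumption made for illustration; only the field names come from the tables above.

```python
import json

# Allowed processType values, as listed in the Optional Fields table.
PROCESS_TYPES = {"last_unprocess", "all_unprocess", "all"}


def build_gcs_source_config(key_file_text, bucket, path, process_type="last_unprocess"):
    """Map a service account JSON key onto the GCS source fields (hypothetical helper)."""
    key = json.loads(key_file_text)
    if process_type not in PROCESS_TYPES:
        raise ValueError(f"unsupported processType: {process_type!r}")
    return {
        # Key names on the right are the ones used in Google's JSON key files.
        "projectId": key["project_id"],
        "clientEmail": key["client_email"],
        "privateKey": key["private_key"],
        "bucket": bucket,
        "path": path,
        "processType": process_type,
    }
```

Copying the values straight from the key file avoids transcription errors in privateKey, which spans several lines.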

Supported File Formats

The GCS source automatically detects the file format. The following formats are supported:

| Format | Description |
| --- | --- |
| CSV | Comma-separated values. Delimiter is auto-detected (comma, semicolon, tab, pipe). |
| JSON | Standard JSON files with a root array or object. |
| NDJSON | Newline-delimited JSON (one JSON object per line). |
| XML | XML files. The root element path is auto-detected. |
| Parquet | Apache Parquet columnar format. |
| Avro | Apache Avro serialization format. |
| XLSX | Microsoft Excel files. |

Compressed files (.gz, .zip) are automatically decompressed before parsing.
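To check locally how a file is likely to be read, the decompress-then-detect order can be approximated with the Python standard library. This is only an illustration, not Reelevant's actual detection logic: read_csv_rows is a hypothetical helper, it handles only .gz compression, and it assumes UTF-8 text.

```python
import csv
import gzip
import io


def read_csv_rows(raw, filename):
    """Decompress .gz content if needed, then parse CSV with a sniffed delimiter."""
    if filename.endswith(".gz"):
        raw = gzip.decompress(raw)
    text = raw.decode("utf-8")  # assumption: the file is UTF-8 encoded
    # Restrict sniffing to the four delimiters named in the CSV row above.
    dialect = csv.Sniffer().sniff(text[:4096], delimiters=",;\t|")
    return list(csv.DictReader(io.StringIO(text), dialect=dialect))
```

If the sniffer cannot settle on a delimiter it raises csv.Error, which is a useful signal that the file may also be ambiguous to automatic detection.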

PGP Decryption

The GCS source supports PGP-encrypted files. When pgpPrivateKey is configured, files are decrypted transparently before decompression and parsing.
| Field | Required | Description |
| --- | --- | --- |
| pgpPrivateKey | Yes | The PGP/GPG private key in armored (ASCII) format. |
| pgpPassphrase | No | The passphrase for the private key, if encrypted. |

Both armored (.asc) and binary (.pgp, .gpg) encrypted files are supported.
See the PGP Decryption guide for details on key generation, supported formats, and error handling.
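Because pgpPrivateKey must be in armored form, a quick shape check before pasting the key can catch a binary export or a public key supplied by mistake. The sketch below relies only on the armor header and footer lines defined by the OpenPGP standard. The helper name is hypothetical, and the check does not verify that the key is valid or that it matches the encrypted files.

```python
def is_armored_private_key(key_text):
    """Return True if key_text looks like an ASCII-armored PGP private key block."""
    lines = [line.strip() for line in key_text.strip().splitlines()]
    return (
        len(lines) >= 3  # header, at least one body line, footer
        and lines[0] == "-----BEGIN PGP PRIVATE KEY BLOCK-----"
        and lines[-1] == "-----END PGP PRIVATE KEY BLOCK-----"
    )
```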

Wildcard Paths

You can use a wildcard (*) in the file path to match multiple files. This is useful when:
  • A new file is exported periodically with a different name (e.g. exports/products_20240101.csv, exports/products_20240102.csv)
  • Data is split across multiple files in the same directory
Example patterns:
  • exports/products_*.csv — matches all CSV files starting with products_ in the exports/ folder
  • data/*.json — matches all JSON files in the data/ folder
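The example patterns can be tried against a list of object names before saving the source. The sketch below assumes that * matches any run of characters except "/", so a pattern stays within one folder. That reading follows the examples above but is not confirmed here, and both function names are hypothetical.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a path pattern containing * into a compiled regex (assumed semantics)."""
    # Assumption: * matches any characters except "/", i.e. it does not cross folders.
    return re.compile(re.escape(pattern).replace(r"\*", "[^/]*"))


def match_paths(pattern, object_names):
    """Return the object names that match the whole pattern."""
    regex = wildcard_to_regex(pattern)
    return [name for name in object_names if regex.fullmatch(name)]
```

For instance, under this assumption exports/products_*.csv selects exports/products_20240101.csv but not exports/archive/products_1.csv.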

File Processing Modes

When using wildcard paths, you can configure how files are selected via the processType field:
| processType value | Description |
| --- | --- |
| last_unprocess (default) | Process only the most recent file that hasn't been processed yet. Ideal for full-dataset exports where only the latest file matters. |
| all_unprocess | Process all files that haven't been processed yet. Useful for incremental/delta exports. |
| all | Process all matching files on every sync, regardless of whether they were processed before. Useful for full datasets split across multiple files whose names stay the same between syncs. |
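The difference between the three modes is easiest to see as a selection over the matched files. In the sketch below, select_files is a hypothetical function, "most recent" is assumed to mean the latest last-modified timestamp, and already-processed files are assumed to be tracked by name. None of these details are confirmed by this page.

```python
def select_files(matching, processed, process_type="last_unprocess"):
    """Pick which matched files to process on a sync.

    matching: list of (name, updated) tuples, where updated sorts chronologically.
    processed: set of file names handled by earlier syncs.
    """
    if process_type == "all":
        return [name for name, _ in matching]
    # Unprocessed files, oldest first (assumption: ordered by last-modified time).
    pending = sorted(
        ((name, updated) for name, updated in matching if name not in processed),
        key=lambda item: item[1],
    )
    if process_type == "all_unprocess":
        return [name for name, _ in pending]
    if process_type == "last_unprocess":
        return [pending[-1][0]] if pending else []
    raise ValueError(f"unsupported processType: {process_type!r}")
```

With daily exports, last_unprocess skips any intermediate files that accumulated between syncs, while all_unprocess picks up each of them.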

How It Works

  1. Reelevant authenticates using the provided service account credentials.
  2. The specified bucket and path are accessed. If a wildcard is used, matching files are listed.
  3. Files are downloaded, decompressed if needed, and parsed based on the detected format.
  4. Fields are extracted and made available for mapping.
  5. On subsequent syncs, files are re-fetched according to the configured processing mode.
Ensure the service account has read access to the specified bucket and objects. The minimum required role is Storage Object Viewer.