embulk-output-redshift is not compatible with SUPER type #332

@hibira

embulk-output-redshift does not support SUPER type.

Currently, only VARCHAR can be used to send long string data to Redshift.
However, VARCHAR has an upper limit of 65535 bytes, so strings longer than that cannot be migrated.
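
To make the limit concrete: it is counted in bytes, not characters, so multibyte text reaches it sooner than ASCII text. The following is a minimal sketch for checking values before a load. The name `fits_in_limit` is hypothetical, and the 65535 figure is simply the one quoted above, not something read from Redshift.

```python
# Limit quoted in the issue text; an assumption here, not queried from Redshift.
MAX_BYTES = 65535


def fits_in_limit(value: str, max_bytes: int = MAX_BYTES) -> bool:
    """Return True if the UTF-8 encoding of value is at most max_bytes long."""
    return len(value.encode("utf-8")) <= max_bytes


ascii_value = "a" * 65535            # 1 byte per character -> 65535 bytes
multibyte_value = "あ" * 21846       # 3 bytes per character -> 65538 bytes

print(fits_in_limit(ascii_value))      # True
print(fits_in_limit(multibyte_value))  # False
```

A check like this could be run over the input file to find the rows that would hit the constraint.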

By supporting Redshift's SUPER type, longer strings can be migrated.

The following settings work as a workaround, but I would like to see official support for this.

  • In in, the column type (or column_options) must be json. With string instead of json, the file used by COPY was corrupted.
  • In out, column_options must be { type: "SUPER", value_type: json }. With string instead of json, a 65535-byte constraint error occurred.
in:
  type: file
  path_prefix: ./input.csv
  parser:
    charset: UTF-8
    newline: CRLF
    type: csv
    delimiter: ","
    quote: "'"
    escape: "'"
    null_string: "NULL"
    skip_header_lines: 1
    columns:
      - { name: col1, type: long }
      - { name: col2, type: string }
      - { name: col3, type: json }
out:
  type: redshift
  host: xxxxxx
  user: xxxxxx
  password: xxxxxx
  database: dev
  table: sample_table
  aws_access_key_id: xxxxxx
  aws_secret_access_key: xxxxxx
  iam_user_name: xxxxxx
  s3_bucket: xxxxxx
  s3_key_prefix: xxxxxx
  mode: insert
  column_options:
    col1: { type: "INTEGER" }
    col2: { type: "VARCHAR(255)" }
    col3: { type: "SUPER", value_type: json }
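
To try the workaround, a test file with an oversized col3 value is needed. The sketch below builds CSV text intended to match the parser settings above (comma delimiter, `'` as both quote and escape, CRLF newlines, one header line). The helper name `build_sample_csv` and the sample values are hypothetical, and it assumes the parser accepts a doubled `'` inside a quoted field as a literal quote.

```python
import csv
import io
import json


def build_sample_csv(payload_chars: int = 70000) -> str:
    """Return CSV text with one data row whose col3 JSON exceeds 65535 bytes."""
    # Two keys, so the JSON contains a comma and the field gets quoted.
    big_json = json.dumps({"id": 1, "text": "it's " + "x" * payload_chars})
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        delimiter=",",
        quotechar="'",
        doublequote=True,           # a literal ' inside a field is written as ''
        quoting=csv.QUOTE_MINIMAL,
        lineterminator="\r\n",      # corresponds to newline: CRLF
    )
    writer.writerow(["col1", "col2", "col3"])   # header, skipped by skip_header_lines: 1
    writer.writerow([1, "short text", big_json])
    return buf.getvalue()
```

Writing the returned text to `./input.csv` with `open(..., "w", encoding="utf-8", newline="")` keeps the CRLF line endings intact.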
