Numeric connection string parameters may be formatted incorrectly in connection string #934

@BSchamberger

Hi odbc team,

odbc:::build_connection_string creates a connection string from the parameters passed to DBI::dbConnect/odbc:::odbc_connect. If one of the parameters is a large numeric value, it may be converted to scientific notation, potentially yielding an incorrect or invalid connection string.

This happened to me with the DefaultStringColumnLength argument/connection attribute of the Simba Spark ODBC Driver. My data contains a field that may store long JSON strings, so I set DefaultStringColumnLength to a large number to err on the safe side:

DBI::dbConnect(
  odbc::odbc(),
  Driver                    = "Simba Spark ODBC Driver",
  SparkServerType           = 3,
  Host                      = "[MY_HOST]",
  Port                      = 443,
  AuthMech                  = 3,
  UID                       = "token",
  PWD                       = "[MY_TOKEN]",
  ThriftTransport           = 2,
  HTTPPath                  = "[MY_PATH]",
  DefaultStringColumnLength = 100000
)

which leads to the following connection string generated by odbc:::build_connection_string:

[1] "Driver=Simba Spark ODBC Driver;SparkServerType=3;...;DefaultStringColumnLength=1e+05"

The DefaultStringColumnLength=1e+05 part is silently ignored by the driver.
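
The conversion can be reproduced in base R without any driver. This is a minimal sketch that assumes the connection string is assembled with paste(), as the output above suggests. It only illustrates how a double, an integer and a string are rendered as text, not how odbc or the driver handles them.

```r
# A double literal is coerced to text in scientific notation here,
# because "1e+05" is shorter than "100000".
as_double <- paste("DefaultStringColumnLength", 100000, sep = "=")

# An integer literal (note the L suffix) keeps fixed notation.
as_integer <- paste("DefaultStringColumnLength", 100000L, sep = "=")

# A string is passed through unchanged.
as_string <- paste("DefaultStringColumnLength", "100000", sep = "=")

print(as_double)
print(as_integer)
print(as_string)
```

Only the double case produces 1e+05. The other two keep the digits as written.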

The workaround I currently use is to pass the parameter as a string (DefaultStringColumnLength = "100000"). However, it would be nicer if the package handled these cases itself. This could be solved by changing the definition of odbc:::build_connection_string to explicitly avoid scientific notation, using format(args, scientific = FALSE) instead of args directly:

build_connection_string <- function(args = list(), string = NULL) {

  args_string <- paste(names(args), format(args, scientific = FALSE), sep = "=", collapse = ";")

  if (!is.null(string) && !grepl(";$", string) && length(args) > 0) {
    string <- paste0(string, ";")
  }

  paste0(string, args_string)
}

which yields:

[1] "Driver=Simba Spark ODBC Driver;SparkServerType=3;...;DefaultStringColumnLength=100000"
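
A possible variation, sketched under assumptions: format only the numeric arguments, one element at a time, and leave strings and other types to as.character(). The names format_conn_value and build_connection_string_sketch are hypothetical and are not part of odbc. This is not the package's implementation, only an illustration of the idea.

```r
# Hypothetical helper (not part of odbc): render one argument as text.
format_conn_value <- function(x) {
  if (is.numeric(x)) {
    # scientific = FALSE forces fixed notation; trim = TRUE avoids padding.
    format(x, scientific = FALSE, trim = TRUE)
  } else {
    as.character(x)
  }
}

# Hypothetical stand-in for odbc:::build_connection_string.
build_connection_string_sketch <- function(args = list(), string = NULL) {
  # Convert each argument separately so strings are never reformatted.
  values <- vapply(args, format_conn_value, character(1))
  args_string <- paste(names(args), values, sep = "=", collapse = ";")

  # Add a separator if the user-supplied string does not end with one.
  if (!is.null(string) && !grepl(";$", string) && length(args) > 0) {
    string <- paste0(string, ";")
  }

  paste0(string, args_string)
}

result <- build_connection_string_sketch(
  list(Port = 443, DefaultStringColumnLength = 100000),
  string = "Driver=Simba Spark ODBC Driver"
)
print(result)
```

The sketch assumes every argument has length one, which vapply() enforces by raising an error otherwise.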

My environment:

R version 4.5.0 (2025-04-11 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26100)

Matrix products: default
  LAPACK version 3.12.1

locale:
[1] LC_COLLATE=English_United Kingdom.utf8 
[2] LC_CTYPE=English_United Kingdom.utf8   
[3] LC_MONETARY=English_United Kingdom.utf8
[4] LC_NUMERIC=C                           
[5] LC_TIME=English_United Kingdom.utf8    

time zone: Europe/Zurich
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] bit_4.6.0         odbc_1.6.1        glue_1.8.0        blob_1.2.4       
 [5] pkgconfig_2.0.3   bit64_4.6.0-1     lifecycle_1.0.4   cli_3.6.5        
 [9] vctrs_0.6.5       DBI_1.2.3         pkgload_1.4.0     compiler_4.5.0   
[13] rstudioapi_0.17.1 tools_4.5.0       hms_1.1.3         pillar_1.11.0    
[17] Rcpp_1.1.0        rlang_1.1.6      
