Go DuckDB Secrets

#go #duckdb #aws #s3 #data


I didn’t actually set out to learn Go; in fact, this is essentially my Go Hello World!

However, when I asked LLMs how to use secrets with DuckDB in Go, they all returned the legacy API, so I thought I would try to set the record straight.

Background

The DuckDB documentation suggests using the third-party library go-duckdb by Marc Boeker, which plugs DuckDB into Go's database/sql. This is interesting for a couple of reasons:

  1. I didn’t realize that DuckDB does not have first-party clients for all supported languages. The list of Client APIs appears to name a maintainer for each third-party client (a big thank you to Marc and all the other maintainers!)
  2. Go has a standard database interface, database/sql. I haven’t found an equivalent in my Python work, where DuckDB, Snowflake, and InfluxDB each come with their own interface
    1. A unified interface has the obvious benefit of potentially being simpler and cleaner, but I hope there’s still flexibility for interesting features like using secrets :)
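
To make the database/sql point a little more concrete, here is a rough sketch of how a driver hooks into it. The stubDriver type and the name "stubdb" are made up for illustration; go-duckdb does the equivalent registration under the name "duckdb", which is why a blank import of the package is all the walkthrough below needs.

```go
package main

import (
	"database/sql"
	"database/sql/driver"
	"errors"
	"fmt"
)

// stubDriver is a hypothetical driver used only for illustration.
// A real driver such as go-duckdb returns a working connection from Open.
type stubDriver struct{}

// Open satisfies driver.Driver; the stub always refuses to connect.
func (stubDriver) Open(name string) (driver.Conn, error) {
	return nil, errors.New("stubdb: not a real database")
}

// init runs when the package is loaded, which is why a blank import is
// enough to make a driver available to sql.Open.
func init() {
	sql.Register("stubdb", stubDriver{})
}

// isRegistered reports whether a driver name is known to database/sql.
func isRegistered(name string) bool {
	for _, d := range sql.Drivers() {
		if d == name {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println("stubdb registered:", isRegistered("stubdb"))

	// sql.Open only looks up the driver; the connection is made lazily.
	db, err := sql.Open("stubdb", "")
	if err != nil {
		fmt.Println("open failed:", err)
		return
	}
	defer db.Close()

	// Ping forces a real connection attempt, which the stub rejects.
	fmt.Println("ping error:", db.Ping())
}
```

The practical upshot is that everything after sql.Open (Exec, Query, Scan) looks the same regardless of which database sits behind the driver name.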

Using secrets

As mentioned, LLMs as of this writing recommend the Legacy Authentication Scheme for the S3 API. It does still appear to work today, but let’s follow the documentation’s recommendation:

The recommended way to configuration and authentication of S3 endpoints is to use secrets.

and put an example using the preferred S3 API Support on the web.

Code step by step

First, let’s create a go.mod file

module go_duck

go 1.22.8

require github.com/marcboeker/go-duckdb v1.8.2

Now, let’s create go_duck.go and build it up step by step

  1. Package declaration and import
package main

import (
	"database/sql"
	"fmt"
	"log"
	"os"

	// Blank import: registers the "duckdb" driver with database/sql.
	_ "github.com/marcboeker/go-duckdb"
)
  1. Read environment variables with os.Getenv
func main() {
	awsAccessKey := os.Getenv("AWS_ACCESS_KEY_ID")
	awsRegion := os.Getenv("AWS_REGION")
	awsSecretKey := os.Getenv("AWS_SECRET_ACCESS_KEY")

	if awsAccessKey == "" || awsRegion == "" || awsSecretKey == "" {
		log.Fatal("Required AWS environment variables not set. Please set AWS_ACCESS_KEY_ID, AWS_REGION, and AWS_SECRET_ACCESS_KEY")
	}
  1. Use sql.Open to open a connection to DuckDB
	db, err := sql.Open("duckdb", "")
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
  1. Write the query to set the secret (legacy and recommended)

Legacy

	// Legacy way to set the S3 credentials
	secret_query := fmt.Sprintf("SET s3_access_key_id='%s'; SET s3_secret_access_key='%s'; SET s3_region='%s';",
		awsAccessKey, awsSecretKey, awsRegion)

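Recommended (this line is the meat of the post). Use it instead of the legacy snippet, not alongside it: both declare secret_query with :=, so the file will not compile with both in place.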

	// Recommended way to set the S3 credentials
	secret_query := fmt.Sprintf("CREATE SECRET secret1 (TYPE S3, KEY_ID '%s', SECRET '%s', REGION '%s');", awsAccessKey, awsSecretKey, awsRegion)
  1. Run the secret query
	_, err = db.Exec(secret_query)
	if err != nil {
		log.Fatal(err)
	}
  1. Test the secret query by running a query on S3 data

Note: you’ll need to replace <my_bucket> and <my_object> with your own bucket and object names

	type ColumnInfo struct {
		ColumnName string
		ColumnType string
		IsNullable string
		Key        sql.NullString
		Default    sql.NullString
		Extra      sql.NullString
	}

	rows, err := db.Query("describe from read_parquet('s3://<my_bucket>/<my_object>.parquet')")
	if err != nil {
		log.Fatal(err)
	}
	defer rows.Close()
  1. Print the results
	var columns []ColumnInfo

	for rows.Next() {
		var col ColumnInfo
		err := rows.Scan(
			&col.ColumnName,
			&col.ColumnType,
			&col.IsNullable,
			&col.Key,
			&col.Default,
			&col.Extra,
		)
		if err != nil {
			log.Fatal(err)
		}
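		columns = append(columns, col)
	}
	// rows.Next returns false on both end-of-data and error, so check which it was.
	if err := rows.Err(); err != nil {
		log.Fatal(err)
	}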

	for _, col := range columns {
		fmt.Printf("Column: %s, Type: %s, Nullable: %s\n",
			col.ColumnName, col.ColumnType, col.IsNullable)
	}
}
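
One caveat about the Sprintf call in the secret step: it pastes the values straight into the SQL text, so a single quote inside a value would break the statement. Tools that hand out temporary credentials also typically export AWS_SESSION_TOKEN, which DuckDB's S3 secret accepts as SESSION_TOKEN. Below is a minimal sketch that covers both. The helpers sqlQuote and buildS3Secret are hypothetical names of my own, not part of go-duckdb, and the sample values are made up.

```go
package main

import (
	"fmt"
	"strings"
)

// sqlQuote wraps a value in single quotes and doubles any embedded single
// quote, which is the standard SQL way to escape a string literal.
func sqlQuote(s string) string {
	return "'" + strings.ReplaceAll(s, "'", "''") + "'"
}

// buildS3Secret is a hypothetical helper (not part of go-duckdb). It returns a
// CREATE SECRET statement and adds SESSION_TOKEN only when a token is given.
func buildS3Secret(keyID, secret, region, sessionToken string) string {
	parts := []string{
		"TYPE S3",
		"KEY_ID " + sqlQuote(keyID),
		"SECRET " + sqlQuote(secret),
		"REGION " + sqlQuote(region),
	}
	if sessionToken != "" {
		parts = append(parts, "SESSION_TOKEN "+sqlQuote(sessionToken))
	}
	return "CREATE SECRET secret1 (" + strings.Join(parts, ", ") + ");"
}

func main() {
	// Made-up values for illustration; avoid printing real credentials.
	fmt.Println(buildS3Secret("AKIAEXAMPLE", "not'a'real'secret", "us-east-1", ""))
}
```

In the walkthrough, this would stand in for the Sprintf line, along the lines of secret_query := buildS3Secret(awsAccessKey, awsSecretKey, awsRegion, os.Getenv("AWS_SESSION_TOKEN")).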

Build and run

Like I said, I’m a Go noob, so I expect you know more than me at this point. In a terminal in the project directory:

  1. Set up the dependencies
go mod tidy
  1. Build the project

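Note: I am running on an Apple Silicon MacBook Pro, so I am targeting GOOS=darwin and GOARCH=arm64. Set these to match your machine, or leave them off entirely, since go build targets the host platform by default. go-duckdb relies on cgo, so a C toolchain needs to be available for the build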

GOOS=darwin GOARCH=arm64 go build go_duck.go
  1. Set up your environment variables. I use Granted from Common Fate
assume default
  1. Finally, run the binary
./go_duck

Go duck!
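
The Printf output is fine for eyeballing, but if you want to pipe the schema into another tool, JSON may be handier. Here is a small sketch under a few assumptions: the columnJSON shape and its field names are my own choice, the nullToPtr and toJSON helpers are hypothetical, and the sample row in main is made up rather than real DESCRIBE output.

```go
package main

import (
	"database/sql"
	"encoding/json"
	"fmt"
	"log"
)

// ColumnInfo mirrors the struct used in the walkthrough.
type ColumnInfo struct {
	ColumnName string
	ColumnType string
	IsNullable string
	Key        sql.NullString
	Default    sql.NullString
	Extra      sql.NullString
}

// columnJSON is a hypothetical output shape; the field names are arbitrary.
type columnJSON struct {
	Name     string  `json:"name"`
	Type     string  `json:"type"`
	Nullable string  `json:"nullable"`
	Default  *string `json:"default"`
}

// nullToPtr maps a SQL NULL to a nil pointer so it is encoded as JSON null.
func nullToPtr(ns sql.NullString) *string {
	if !ns.Valid {
		return nil
	}
	return &ns.String
}

// toJSON converts the scanned columns into a compact JSON array.
func toJSON(cols []ColumnInfo) (string, error) {
	out := make([]columnJSON, 0, len(cols))
	for _, c := range cols {
		out = append(out, columnJSON{
			Name:     c.ColumnName,
			Type:     c.ColumnType,
			Nullable: c.IsNullable,
			Default:  nullToPtr(c.Default),
		})
	}
	b, err := json.Marshal(out)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	// Made-up sample row standing in for the scanned DESCRIBE results.
	cols := []ColumnInfo{{ColumnName: "id", ColumnType: "BIGINT", IsNullable: "YES"}}
	s, err := toJSON(cols)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(s)
}
```

In go_duck.go, the call to toJSON(columns) would take the place of the final Printf loop.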

Here’s the go_duck.go code in its entirety

Conclusion

Well, I couldn’t find an example of using secrets in DuckDB in Go and now there is one! Hopefully this helps someone and hopefully someday an LLM will learn from this too :)