Quickstart

Let’s get started with Recap.

  1. Install Recap
  2. Read a Schema
  3. Start Recap’s Server
    1. Read Schemas with the Gateway
    2. Store Schemas in the Registry
  4. Next Steps

Install Recap

Start by installing Recap:

pip install 'recap-core[all]'

The [all] extra installs all of Recap’s dependencies, including the optional ones for systems like PostgreSQL, Snowflake, and BigQuery. You might not need all of these dependencies, but installing them makes every integration available.

Read a Schema

Recap has two main commands: ls and schema. The ls command lets you list children of a URL. The URL structure depends on the system type. PostgreSQL URLs look like this:

postgresql://user:pass@host:port/[database]/[schema]/[table]

The “schema” in the URL is PostgreSQL’s database schema, not a table schema. It’s usually public.
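To make the URL layout concrete, here is a minimal sketch that splits a PostgreSQL-style URL into its optional database, schema, and table parts using only Python's standard library. The helper name split_recap_url is hypothetical and is not part of Recap; localhost:5432 stands in for host:port.

```python
from urllib.parse import urlsplit


def split_recap_url(url: str) -> dict:
    """Split a PostgreSQL-style URL into database, schema, and table parts."""
    parts = urlsplit(url)
    # Drop empty segments so a trailing slash does not create a blank name.
    segments = [s for s in parts.path.split("/") if s]
    result = {"scheme": parts.scheme, "host": parts.hostname, "port": parts.port}
    # Each path segment is optional; missing ones are reported as None.
    for i, name in enumerate(("database", "schema", "table")):
        result[name] = segments[i] if i < len(segments) else None
    return result


info = split_recap_url("postgresql://user:pass@localhost:5432/testdb/public/test_types")
print(info["database"], info["schema"], info["table"])
```

A URL that stops at the database, such as postgresql://user:pass@localhost:5432/testdb, yields None for both the schema and the table, which corresponds to the level that ls lists children for.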

Let’s list the schemas for a database called testdb:

recap ls postgresql://user:pass@host:port/testdb
[
  "pg_toast",
  "pg_catalog",
  "public",
  "information_schema"
]

There are four schemas. The pg_toast and pg_catalog schemas are internal to PostgreSQL. The information_schema schema is a standard schema that contains information about the database. The public schema is where our tables are located. Let’s list them:

recap ls postgresql://user:pass@host:port/testdb/public
[
  "test_types"
]

This database only has one table, test_types. Let’s read the schema:

recap schema postgresql://user:pass@host:port/testdb/public/test_types
{
  "type": "struct",
  "fields": [
    {
      "type": "int64",
      "name": "test_bigint",
      "optional": true
    }
  ]
}

This is the test_types table’s schema represented as a Recap schema in JSON. The schema command reads a schema at the supplied path, converts it to a Recap schema, and prints the Recap schema as a JSON object.
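Because the output is plain JSON, any JSON library can consume it. The sketch below parses the output shown above and summarizes each field. It relies only on the keys visible in that output (type, fields, name, optional); the describe_fields helper is hypothetical and not part of Recap.

```python
import json

# The JSON printed by `recap schema` in the example above.
schema_json = """
{
  "type": "struct",
  "fields": [
    {"type": "int64", "name": "test_bigint", "optional": true}
  ]
}
"""


def describe_fields(schema: dict) -> list[str]:
    """Return one 'name: type' line per field of a struct schema."""
    lines = []
    for field in schema.get("fields", []):
        suffix = " (optional)" if field.get("optional", False) else ""
        lines.append(f"{field['name']}: {field['type']}{suffix}")
    return lines


schema = json.loads(schema_json)
for line in describe_fields(schema):
    print(line)
```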

You can also output the schema in Avro, Protobuf, or JSON Schema format using the --output-format switch. Here’s the same schema in JSON Schema:

recap schema postgresql://user:pass@host:port/testdb/public/test_types --output-format=json
{
  "type": "object",
  "properties": {
    "test_bigint": {
      "default": null,
      "type": "integer"
    }
  }
}
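To show how the two outputs relate, here is a toy mapping that reproduces only this example. It is not Recap's converter: the type table covers a single type, and the rule that an optional field gets a null default is an assumption inferred from the output above. The names RECAP_TO_JSON_SCHEMA_TYPE and struct_to_json_schema are hypothetical.

```python
# Toy type table for illustration; a real converter covers many more types.
RECAP_TO_JSON_SCHEMA_TYPE = {"int64": "integer"}


def struct_to_json_schema(struct: dict) -> dict:
    """Map a Recap struct (as printed above) to a JSON Schema object."""
    properties = {}
    for field in struct["fields"]:
        prop = {}
        if field.get("optional"):
            # Assumption: optional fields carry a null default, as in the output above.
            prop["default"] = None
        prop["type"] = RECAP_TO_JSON_SCHEMA_TYPE[field["type"]]
        properties[field["name"]] = prop
    return {"type": "object", "properties": properties}


recap_struct = {
    "type": "struct",
    "fields": [{"type": "int64", "name": "test_bigint", "optional": True}],
}
print(struct_to_json_schema(recap_struct))
```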

Start Recap’s Server

We’ve been using Recap’s CLI to read schemas, but Recap comes with an HTTP/JSON server as well. The server has two parts:

  • A gateway to list and read schemas. You will find this handy if you’re not using Python, or if you want to integrate Recap with other systems.
  • A registry to store and retrieve schemas. This is useful for caching schemas or acting as a repository when using Recap schemas as a source of truth.

Start the server at http://localhost:8000:

recap serve

Read Schemas with the Gateway

The server exposes /gateway/ls and /gateway/schema endpoints that are very similar to the CLI:

curl http://localhost:8000/gateway/ls/postgresql://user:pass@host:port/testdb
["pg_toast","pg_catalog","public","information_schema"]

And much like the CLI, you can read the test_types schema:

curl http://localhost:8000/gateway/schema/postgresql://user:pass@host:port/testdb/public/test_types
{"type":"struct","fields":[{"type":"int64","name":"test_bigint","optional":true}]}

This example includes user:pass in the URL. This works, but is not recommended for security reasons. You should configure Recap’s server to use the RECAP_URLS environment variable instead. See the configuration documentation for more details.

Recap’s HTTP/JSON gateway does not require a database or any persistence. It connects to external systems in real time to fetch schemas.
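The same gateway calls can be made from Python. This is a minimal sketch using only the standard library; it mirrors the curl examples above by appending the system URL to /gateway/ls/ or /gateway/schema/. The helper names gateway_url and gateway_get are hypothetical, and gateway_get needs a running server.

```python
import json
from urllib.request import urlopen

# Address used by `recap serve` in the example above.
GATEWAY = "http://localhost:8000/gateway"


def gateway_url(command: str, system_url: str) -> str:
    """Build a gateway URL the same way the curl examples do."""
    if command not in ("ls", "schema"):
        raise ValueError(f"unknown gateway command: {command}")
    return f"{GATEWAY}/{command}/{system_url}"


def gateway_get(command: str, system_url: str):
    """Call the gateway and decode its JSON response (requires a running server)."""
    with urlopen(gateway_url(command, system_url)) as response:
        return json.load(response)


# Building the URL needs no server; only gateway_get touches the network.
print(gateway_url("ls", "postgresql://user:pass@host:port/testdb"))
```

With the server running, gateway_get("ls", ...) should return the same list that curl printed, and gateway_get("schema", ...) the same struct.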

Store Schemas in the Registry

The server exposes a series of /registry endpoints with GET/PUT/POST methods.

To store a schema in the registry, use the POST method:

curl -X POST \
    -H "Content-Type: application/x-recap+json" \
    -d '{"type":"struct","fields":[{"type":"int64","name":"test_bigint","optional":true}]}' \
    http://localhost:8000/registry/some_schema

And to read the schema, use the GET method on /registry/[schema_name]:

curl http://localhost:8000/registry/some_schema
[{"type":"struct","fields":[{"type":"int64","name":"test_bigint","optional":true}]},4]

The response is a two-element JSON array: the schema, followed by its version number in the registry. The registry increments this number every time you update the schema.
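The two curl calls above translate directly to Python's standard library. The sketch below builds the POST request with the same Content-Type header and unpacks the schema and version from a GET response. All function names are hypothetical; store_schema and read_schema need a running server, while the other two helpers can be tried offline.

```python
import json
from urllib.request import Request, urlopen

# Address used by `recap serve` in the example above.
REGISTRY = "http://localhost:8000/registry"


def build_store_request(name: str, schema: dict) -> Request:
    """Build the POST request that stores a schema under the given name."""
    return Request(
        f"{REGISTRY}/{name}",
        data=json.dumps(schema).encode("utf-8"),
        headers={"Content-Type": "application/x-recap+json"},
        method="POST",
    )


def parse_registry_response(body: str) -> tuple[dict, int]:
    """Split a GET response body into (schema, version)."""
    schema, version = json.loads(body)
    return schema, version


def store_schema(name: str, schema: dict) -> int:
    """Send the POST request and return the HTTP status (requires a running server)."""
    with urlopen(build_store_request(name, schema)) as response:
        return response.status


def read_schema(name: str) -> tuple[dict, int]:
    """Fetch a stored schema and its version (requires a running server)."""
    with urlopen(f"{REGISTRY}/{name}") as response:
        return parse_registry_response(response.read().decode("utf-8"))
```

Keeping request construction and response parsing separate from the network calls makes both easy to check without starting the server.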

Next Steps

You’ve learned how to install Recap, list and read schemas, and use the gateway and registry endpoints of Recap’s server.

Next, you should look at Recap’s integrations page to learn how to use Recap with other systems. If you’re planning on running Recap’s server, check out the gateway and registry documentation. Finally, see Recap’s Python documentation to learn how to use Recap’s Python API.