catalogist

Catalogist 📚 📓 📒 📖 🔖

The easy way to catalog and make your software and (micro)services visible to your organization through an API.

You were a person on a mission: To have a total bird's eye view on your entire software estate. You tried to win the hearts and minds of developers with microservices, and after many battles you are now finally churning out itty-bitty services, but find yourself in a quagmire without the faintest clue about what's going on anymore. Like Fox Mulder, you become disillusioned with what sad excuse of a "truth" is actually out there. 😭😭😭

"Big ball of mud"

From Mario Fusco's Twitter post

Catalogist helps you make sense of that, in a lightweight and developer-friendly way, without having to break the bank to purchase six-figure enterprise architecture software or going all-in on Backstage.

How it works

Simple: Write a bit of metadata description (a manifest file) for every service/software in a standardized format and send it to a central service, making it available to read through an API. With no more that that, we can mitigate the lack of visibility and nomenclature around how we express the attributes of our software or services.

When the manifest reaches the actual database/persistence layer, it is called a record while it's there, laying dormant.

An implementer will interact with Catalogist in one of two typical ways:

Custom software (e.g. your own microservices): Create a manifest file in the root of the application, and make a POST request to the Catalogist service during the CI stage. This is ideal since it enforces an always-up-to-date version of the solution's manifest.
Manually, for example for non-custom (e.g. commercial-off-the-shelf") software: The Catalogist service can be called manually or as an integrated part of a "dashboard" that you build yourself. An operations team could also do infrequent updates based on a ticketing system.

Diagram

As it stands currently, Catalogist is implemented in an AWS-slanted direction. This should be fairly easy to modify so it works with other cloud platforms and with other persistence technologies. If there is sufficient demand, I might add extended support. Or you do it! Just make a PR and I'll see how we can proceed.

On the surface Catalogist is a relatively simple Node.js-based serverless application that exposes an API Gateway with three microservices behind it: an optional authorizer, one for creating a record, and the last one for getting records. Records are persisted in DynamoDB. When deployed, the standard implementation—as provided—results in a complete solution with an (optional) authorizer function, the backend functions, and all required infrastructure resources.

Catalogist diagram

Please see the generated documentation site for more detailed information.

Prerequisites

Amazon Web Services (AWS) account with sufficient permissions so that you can deploy infrastructure. A naive but simple policy would be full rights for CloudWatch, Lambda, API Gateway, DynamoDB, X-Ray, and S3.

Installation

Clone or fork the repo as you normally would. Run npm install --force.

Commands

The below commands are the most critical ones. See package.json for more commands!

npm start: Runs Serverless Framework in offline mode
npm test: Tests code
npm run deploy: Deploys code with Serverless Framework
npm run build: Package and build the code with Serverless Framework
npm run teardown: Removes the deployed stack

Configuration

You will need to configure your own AWS account number in serverless.yml.
You should enter a self-defined API key in serverless.yml under custom.config.apiKey; otherwise a default key will be used (the value is seen in the mentioned location).

Running Catalogist

Run npm start.

Deployment

First make sure that you have a fallback value for your AWS account number in serverless.yml, for example: awsAccountNumber: ${opt:awsAccountNumber, '123412341234'} or that you set the deployment script to use the flag, for example npx sls deploy --awsAccountNumber 123412341234.

Then you can deploy with npm run deploy.

Setting up for CI and automation

In your CI tool, just call the API, passing in your manifest file and your (self-defined) API key:

curl -X POST ${ENDPOINT}/record -d "@manifest.json" -H "Authorization: ${API_KEY}"

Manifest

The manifest file is a simple JSON file or a JSON payload that describes your solution, system, or service.

The below gives an overview of what data can be described. See the example and interface specification further down.

Required top-level keys/fields/properties

`spec`

Fundamental information about your solution. Note that only the repo, name, and description fields are required, all other properties are optional.

Optional top-level keys/fields/properties

`relations`

Named relations this component has to other components.

`support`

Support information for the component.

`slo`

Array of SLO items. An SLO item represents Service Level Objective (SLO) information. Max 20 items allowed.

`api`

Array of API items. An API item represents the name of any API connected to this solution. The value should ideally point to a (local or remote) schema or definition. Max 20 items allowed.

`metadata`

Any optional metadata. Accepts custom-defined keys with string values.

`links`

Array of Link items. A Link item represents a link to external resources. Max 20 items allowed.

Full example

The below gives you an idea of how a "full-scale" manifest might look.

{
  "spec": {
    "repo": "someorg/somerepo",
    "name": "my-api",
    "description": "My API",
    "kind": "api",
    "lifecycleStage": "production",
    "version": "1.0.0",
    "responsible": "Someguy Someguyson",
    "team": "ThatAwesomeTeam",
    "system": "some-system",
    "domain": "some-domain",
    "dataSensitivity": "public",
    "tags": ["typescript", "backend"]
  },
  "relations": ["my-other-service"],
  "support": {
    "resolverGroup": "ThatAwesomeTeam"
  },
  "slo": [
    {
      "description": "Max latency must be 350ms for the 90th percentile",
      "type": "latency",
      "implementation": "(sum:trace.aws.lambda.hits.by_http_status{http.status_class:2xx AND service IN (demoservice-user,demoservice-greet)} by {service}.as_count() - sum:trace.aws.lambda.errors.by_http_status{http.status_class:5xx AND service IN (demoservice-user,demoservice-greet)} by {service}.as_count()) / (sum:trace.aws.lambda.hits{service IN (demoservice-user,demoservice-greet)} by {service}.as_count())",
      "target": "350ms",
      "period": 30
    }
  ],
  "api": [
    {
      "name": "My API",
      "schemaPath": "./api/schema.yml"
    }
  ],
  "metadata": {},
  "links": [
    {
      "url": "https://my-confluence.atlassian.net/wiki/spaces/DEV/pages/123456789/",
      "title": "Confluence documentation",
      "icon": "documentation"
    }
  ]
}

Specification

Please find a more exact description in src/interfaces/Manifest.ts.

Validation and sanitization

Input data is processed when Catalogist attempts to form input data into a Manifest internally. During that step we coerce the input into a new object (stringify, then parse as a new object), drop unknown keys, check the size of the remaining object, and also check for missing information. See src/domain/valueObjects/Manifest.ts.

Because there is a bit of customization allowed, Catalogist will only drop unknown keys from the root object and from within the spec object.

Rules and limits

All POST request input is sanitized.
Regex matching will be used to discard a wide degree of less common characters. Less aggressive replacement is enforced on spec.description and slo[].implementation fields.
Custom key names (in the support and/or metadata fields) may be 50 characters long.
Custom values (in the support and/or metadata fields) may be 500 characters long.
The maximum ingoing payload size must be less than 20000 characters when stringified.
You are allowed to use a maximum of 20 items in the api, slo and links arrays.
You are allowed to use a maximum of 50 items in the relations array.

Note that GET requests will always return an array, even if the result set is empty.

Create a record

This is the most minimal, valid example you can create a record with.

Example request

POST {{BASE_URL}}/record

{
  "spec": {
    "repo": "someorg/somerepo",
    "name": "my-api",
    "description": "My API"
  }
}

Example response

204 No Content

Get exact record

To get an exact record yoy will need to call the API with the repo and service parameters.

Example request

GET {{BASE_URL}}/record?repo=someorg/somerepo&service=my-api

Example response

[
  {
    "spec": {
      "repo": "someorg/somerepo",
      "name": "my-api",
      "description": "My API",
      "kind": "api",
      "lifecycleStage": "somelifecycle",
      "version": "1.0.0",
      "responsible": "Someguy Someguyson",
      "team": "ThatAwesomeTeam",
      "system": "some-system",
      "domain": "some-domain",
      "dataSensitivity": "public",
      "tags": ["typescript", "backend"]
    },
    "relations": ["my-other-service"],
    "support": {
      "resolverGroup": "ThatAwesomeTeam"
    },
    "api": [
      {
        "name": "My API",
        "schemaPath": "./api/schema.yml"
      }
    ],
    "slo": [
      {
        "description": "Max latency must be 350ms for the 90th percentile",
        "type": "latency",
        "implementation": "(sum:trace.aws.lambda.hits.by_http_status{http.status_class:2xx AND service IN (demoservice-user,demoservice-greet)} by {service}.as_count() - sum:trace.aws.lambda.errors.by_http_status{http.status_class:5xx AND service IN (demoservice-user,demoservice-greet)} by {service}.as_count()) / (sum:trace.aws.lambda.hits{service IN (demoservice-user,demoservice-greet)} by {service}.as_count())",
        "target": "350ms",
        "period": 30
      }
    ],
    "links": [
      {
        "url": "https://my-confluence.atlassian.net/wiki/spaces/DEV/pages/123456789/",
        "title": "Confluence documentation",
        "icon": "documentation"
      }
    ],
    "timestamp": 1679155957000
  }
]

Get records in given Git repository

Note that it's possible to get multiple responses in a GET call if you are only providing the repo parameter, as you might have several services referring to the same Git repository.

Example request

GET {{BASE_URL}}/record?repo=someorg/somerepo

Example response

[
  {
    "spec": {
      "repo": "someorg/somerepo",
      "name": "my-api",
      "description": "My API",
      "kind": "api",
      "lifecycleStage": "somelifecycle",
      "version": "1.0.0",
      "responsible": "Someguy Someguyson",
      "team": "ThatAwesomeTeam",
      "system": "some-system",
      "domain": "some-domain",
      "dataSensitivity": "public",
      "tags": ["typescript", "backend"]
    },
    "relations": ["my-other-service"],
    "support": {
      "resolverGroup": "ThatAwesomeTeam"
    },
    "api": [
      {
        "name": "My API",
        "schemaPath": "./api/schema.yml"
      }
    ],
    "slo": [
      {
        "description": "Max latency must be 350ms for the 90th percentile",
        "type": "latency",
        "implementation": "(sum:trace.aws.lambda.hits.by_http_status{http.status_class:2xx AND service IN (demoservice-user,demoservice-greet)} by {service}.as_count() - sum:trace.aws.lambda.errors.by_http_status{http.status_class:5xx AND service IN (demoservice-user,demoservice-greet)} by {service}.as_count()) / (sum:trace.aws.lambda.hits{service IN (demoservice-user,demoservice-greet)} by {service}.as_count())",
        "target": "350ms",
        "period": 30
      }
    ],
    "links": [
      {
        "url": "https://my-confluence.atlassian.net/wiki/spaces/DEV/pages/123456789/",
        "title": "Confluence documentation",
        "icon": "documentation"
      }
    ],
    "timestamp": 1679155957000
  },
  {
    "spec": {
      "repo": "someorg/somerepo",
      "name": "my-other-api",
      "description": "My Other API"
    },
    "timestamp": 1679155958000
  }
]

catalogist

Catalogist 📚 📓 📒 📖 🔖

The easy way to catalog and make your software and (micro)services visible to your organization through an API.

How it works

Diagram

Prerequisites

Installation

Commands

Configuration

Running Catalogist

Deployment

Setting up for CI and automation

Manifest

Required top-level keys/fields/properties

`spec`

Optional top-level keys/fields/properties

`relations`

`support`

`slo`

`api`

`metadata`

`links`

Full example

Specification

Validation and sanitization

Rules and limits

Example API calls

Create a record

Example request

Example response

Get exact record

Example request

Example response

Get records in given Git repository

Example request

Example response

Settings

On This Page