catalogist

Catalogist πŸ“š πŸ““ πŸ“’ πŸ“– πŸ”–

Build Status FOSSA Status Quality Gate Status CodeScene Code Health CodeScene System Mastery codecov Maintainability

You were a person on a mission: To have a total bird's eye view on your entire software estate. You tried to win the hearts and minds of developers with microservices, and after many battles you are now finally churning out itty-bitty services, but find yourself in a quagmire without the faintest clue about what's going on anymore. Like Fox Mulder, you become disillusioned with what sad excuse of a "truth" is actually out there. 😭😭😭

"Big ball of mud"

From Mario Fusco's Twitter post

Catalogist helps you make sense of that, in a lightweight and developer-friendly way, without having to break the bank to purchase six-figure enterprise architecture software or going all-in on Backstage.

Simple: Write a bit of metadata description (a manifest file) for every service/software in a standardized format and send it to a central service, making it available to read through an API. With no more that that, we can mitigate the lack of visibility and nomenclature around how we express the attributes of our software or services.

When the manifest reaches the actual database/persistence layer, it is called a record while it's there, laying dormant.

An implementer will interact with Catalogist in one of two typical ways:

  • Custom software (e.g. your own microservices): Create a manifest file in the root of the application, and make a POST request to the Catalogist service during the CI stage. This is ideal since it enforces an always-up-to-date version of the solution's manifest.
  • Manually, for example for non-custom (e.g. commercial-off-the-shelf") software: The Catalogist service can be called manually or as an integrated part of a "dashboard" that you build yourself. An operations team could also do infrequent updates based on a ticketing system.

As it stands currently, Catalogist is implemented in an AWS-slanted direction. This should be fairly easy to modify so it works with other cloud platforms and with other persistence technologies. If there is sufficient demand, I might add extended support. Or you do it! Just make a PR and I'll see how we can proceed.

On the surface Catalogist is a relatively simple Node.js-based serverless application that exposes an API Gateway with three microservices behind it: an optional authorizer, one for creating a record, and the last one for getting records. Records are persisted in DynamoDB. When deployed, the standard implementationβ€”as providedβ€”results in a complete solution with an (optional) authorizer function, the backend functions, and all required infrastructure resources.

Catalogist diagram

Please see the generated documentation site for more detailed information.


  • Amazon Web Services (AWS) account with sufficient permissions so that you can deploy infrastructure. A naive but simple policy would be full rights for CloudWatch, Lambda, API Gateway, DynamoDB, X-Ray, and S3.

Clone or fork the repo as you normally would. Run npm install --force.

The below commands are the most critical ones. See package.json for more commands!

  • npm start: Runs Serverless Framework in offline mode
  • npm test: Tests code
  • npm run deploy: Deploys code with Serverless Framework
  • npm run build: Package and build the code with Serverless Framework
  • npm run teardown: Removes the deployed stack
  1. You will need to configure your own AWS account number in serverless.yml.
  2. You should enter a self-defined API key in serverless.yml under custom.config.apiKey; otherwise a default key will be used (the value is seen in the mentioned location).

Run npm start.

First make sure that you have a fallback value for your AWS account number in serverless.yml, for example: awsAccountNumber: ${opt:awsAccountNumber, '123412341234'} or that you set the deployment script to use the flag, for example npx sls deploy --awsAccountNumber 123412341234.

Then you can deploy with npm run deploy.

In your CI tool, just call the API, passing in your manifest file and your (self-defined) API key:

curl -X POST ${ENDPOINT}/record -d "@manifest.json" -H "Authorization: ${API_KEY}"

The manifest file is a simple JSON file or a JSON payload that describes your solution, system, or service.

The below gives an overview of what data can be described. See the example and interface specification further down.

Fundamental information about your solution. Note that only the repo, name, and description fields are required, all other properties are optional.

Named relations this component has to other components.

Support information for the component.

Array of SLO items. An SLO item represents Service Level Objective (SLO) information. Max 20 items allowed.

Array of API items. An API item represents the name of any API connected to this solution. The value should ideally point to a (local or remote) schema or definition. Max 20 items allowed.

Any optional metadata. Accepts custom-defined keys with string values.

Array of Link items. A Link item represents a link to external resources. Max 20 items allowed.

The below gives you an idea of how a "full-scale" manifest might look.

{
"spec": {
"repo": "someorg/somerepo",
"name": "my-api",
"description": "My API",
"kind": "api",
"lifecycleStage": "production",
"version": "1.0.0",
"responsible": "Someguy Someguyson",
"team": "ThatAwesomeTeam",
"system": "some-system",
"domain": "some-domain",
"dataSensitivity": "public",
"tags": ["typescript", "backend"]
},
"relations": ["my-other-service"],
"support": {
"resolverGroup": "ThatAwesomeTeam"
},
"slo": [
{
"description": "Max latency must be 350ms for the 90th percentile",
"type": "latency",
"implementation": "(sum:trace.aws.lambda.hits.by_http_status{http.status_class:2xx AND service IN (demoservice-user,demoservice-greet)} by {service}.as_count() - sum:trace.aws.lambda.errors.by_http_status{http.status_class:5xx AND service IN (demoservice-user,demoservice-greet)} by {service}.as_count()) / (sum:trace.aws.lambda.hits{service IN (demoservice-user,demoservice-greet)} by {service}.as_count())",
"target": "350ms",
"period": 30
}
],
"api": [
{
"name": "My API",
"schemaPath": "./api/schema.yml"
}
],
"metadata": {},
"links": [
{
"url": "https://my-confluence.atlassian.net/wiki/spaces/DEV/pages/123456789/",
"title": "Confluence documentation",
"icon": "documentation"
}
]
}

Please find a more exact description in src/interfaces/Manifest.ts.

Input data is processed when Catalogist attempts to form input data into a Manifest internally. During that step we coerce the input into a new object (stringify, then parse as a new object), drop unknown keys, check the size of the remaining object, and also check for missing information. See src/domain/valueObjects/Manifest.ts.

Because there is a bit of customization allowed, Catalogist will only drop unknown keys from the root object and from within the spec object.

  • All POST request input is sanitized.
  • Regex matching will be used to discard a wide degree of less common characters. Less aggressive replacement is enforced on spec.description and slo[].implementation fields.
  • Custom key names (in the support and/or metadata fields) may be 50 characters long.
  • Custom values (in the support and/or metadata fields) may be 500 characters long.
  • The maximum ingoing payload size must be less than 20000 characters when stringified.
  • You are allowed to use a maximum of 20 items in the api, slo and links arrays.
  • You are allowed to use a maximum of 50 items in the relations array.

Note that GET requests will always return an array, even if the result set is empty.

This is the most minimal, valid example you can create a record with.

POST {{BASE_URL}}/record

{
"spec": {
"repo": "someorg/somerepo",
"name": "my-api",
"description": "My API"
}
}

204 No Content

To get an exact record yoy will need to call the API with the repo and service parameters.

GET {{BASE_URL}}/record?repo=someorg/somerepo&service=my-api

[
{
"spec": {
"repo": "someorg/somerepo",
"name": "my-api",
"description": "My API",
"kind": "api",
"lifecycleStage": "somelifecycle",
"version": "1.0.0",
"responsible": "Someguy Someguyson",
"team": "ThatAwesomeTeam",
"system": "some-system",
"domain": "some-domain",
"dataSensitivity": "public",
"tags": ["typescript", "backend"]
},
"relations": ["my-other-service"],
"support": {
"resolverGroup": "ThatAwesomeTeam"
},
"api": [
{
"name": "My API",
"schemaPath": "./api/schema.yml"
}
],
"slo": [
{
"description": "Max latency must be 350ms for the 90th percentile",
"type": "latency",
"implementation": "(sum:trace.aws.lambda.hits.by_http_status{http.status_class:2xx AND service IN (demoservice-user,demoservice-greet)} by {service}.as_count() - sum:trace.aws.lambda.errors.by_http_status{http.status_class:5xx AND service IN (demoservice-user,demoservice-greet)} by {service}.as_count()) / (sum:trace.aws.lambda.hits{service IN (demoservice-user,demoservice-greet)} by {service}.as_count())",
"target": "350ms",
"period": 30
}
],
"links": [
{
"url": "https://my-confluence.atlassian.net/wiki/spaces/DEV/pages/123456789/",
"title": "Confluence documentation",
"icon": "documentation"
}
],
"timestamp": 1679155957000
}
]

Note that it's possible to get multiple responses in a GET call if you are only providing the repo parameter, as you might have several services referring to the same Git repository.

GET {{BASE_URL}}/record?repo=someorg/somerepo

[
{
"spec": {
"repo": "someorg/somerepo",
"name": "my-api",
"description": "My API",
"kind": "api",
"lifecycleStage": "somelifecycle",
"version": "1.0.0",
"responsible": "Someguy Someguyson",
"team": "ThatAwesomeTeam",
"system": "some-system",
"domain": "some-domain",
"dataSensitivity": "public",
"tags": ["typescript", "backend"]
},
"relations": ["my-other-service"],
"support": {
"resolverGroup": "ThatAwesomeTeam"
},
"api": [
{
"name": "My API",
"schemaPath": "./api/schema.yml"
}
],
"slo": [
{
"description": "Max latency must be 350ms for the 90th percentile",
"type": "latency",
"implementation": "(sum:trace.aws.lambda.hits.by_http_status{http.status_class:2xx AND service IN (demoservice-user,demoservice-greet)} by {service}.as_count() - sum:trace.aws.lambda.errors.by_http_status{http.status_class:5xx AND service IN (demoservice-user,demoservice-greet)} by {service}.as_count()) / (sum:trace.aws.lambda.hits{service IN (demoservice-user,demoservice-greet)} by {service}.as_count())",
"target": "350ms",
"period": 30
}
],
"links": [
{
"url": "https://my-confluence.atlassian.net/wiki/spaces/DEV/pages/123456789/",
"title": "Confluence documentation",
"icon": "documentation"
}
],
"timestamp": 1679155957000
},
{
"spec": {
"repo": "someorg/somerepo",
"name": "my-other-api",
"description": "My Other API"
},
"timestamp": 1679155958000
}
]