I21 IPORM the IPFS ORM

I21 IPORM

Background

This document provides an idea in response to R24 IPFS ORM. We seek to define an MVP spec that serves the requests set out there.

Certain of the requests might not be serviceable in earlier passes of the work, in which case further intentions and directions will be discussed at the end of this document.

Scope

For the purposes of this iteration, we take the following request items as critical to the resulting deliverable being of any value.

Developer interface: providing an interface for managing CID's and managing iterations on the relationships between the data they represent, whilst hiding the details of this management.
IPFS interface: using an abstracted interface for the underlying machinery so that a mock can be used to test correctness in a unit test.
IPLD schema checking: Optionally associating each instance with an IPLD schema, and when provided, the instance should have tooling to check data against that schema.
Encryption: Encrypting items stored into IPFS before hashing them.
Schema change management: Suppressing explicit object relationship management allows developers to focus on data structure more and data management lifecycle less, which should result in easier schema changes to the core code.
Upstream change management: Decoupling the IPFS libraries from their consumption to allow them to change freely without requiring the consuming codebase to change

The rest of the requests will be handled or designed for based on their ease of implementation.

Motivation

There are a number of reasons to explode data into components for standalone storage on IPFS:

Different parties are interested in different parts of the data
Different parts of the data have different churn rates and resource requirements
Data structures are big enough that they need to be managed by reference
Resiliency, ie getting data to replicate as quickly as possible
Data type reuse in multiple contexts

Broadly, these reasons are similar to the reasons you might choose to add a new foreign key to a database, so surveying prior art should help understand design decisions.

ORMs typically have two layers:

The modeling language, which allows a developer to
1. Specify the shape of the data the model is handling
2. Specify a schema that can check incoming or outgoing data for adherence to expected shape (can be folded in to (1) if the ability to tag attributes is sufficiently well developed)
3. Specify any HAS A relationships to other data items that are also specified using the ORM
4. Declare any functions or methods meant to process the data the class handles
The processor/client, which allows a developer to
1. Instance objects from a serialized format, including any nested objects, providing hydrated versions
2. Serialize objects from live instances, including any nested objects
3. Check during serialization/deserialization for adherence to a schema (sometimes automatic)
4. Provide lifecycle hooks during serialization/deserialization

Examples

Prisma user schema (model layer)

model User {
  id      Int      @id @default(autoincrement())
  email   String   @unique
  name    String?
  role    Role     @default(USER)
  posts   Post[]
  profile Profile?
}

Key notes:

Every property has a declared type
Some properties have types that are defined elsewhere in the ORM models
Properties have standard symbols that equip the processor with extra behaviour such as “?” for optional and “[]” for many
There’s an additional attribute language so that properties can be equipped with even more user defined behaviour during processing
The model is parsed from the schema, which means that schema checking can be automatic and constant
This approach requires crafting an additional parser
There is no way in this schema language to declare methods that should come with the live object instances

Prisma client (processor layer)

const user = await prisma.user.findUnique({
  where: {
    email: 'emma@prisma.io',
  },
  select: {
    email: true,
    posts: {
      select: {
        likes: true,
      },
    },
  },
})

const deletePosts = prisma.post.deleteMany({
  where: {
    authorId: 7,
  },
})

Key Notes:

Prisma only provides interface level functionality on the client library/processor, eg CRUD
Any extra verbs you would wish to add to types (eg “tagPosts(tag)”) would need to exist in a higher level of the program that loads and writes objects instead of just being on the newly hydrated objects
There’s still quite a heavy syntax load inside the CRUD function arguments that don’t entirely escape from thinking like an RDBMS
Relations can also be queried from root nodes in the tree

Objection User Model

class Person extends Model {
  static get tableName() {
    return 'persons'
  }

  static get idColumn() {
    return 'id'
  }

  fullName() {
    return this.firstName + ' ' + this.lastName
  }

  static get jsonSchema() {
    return {
      type: 'object',
      required: ['firstName', 'lastName'],

      properties: {
        id: { type: 'integer' },
        parentId: { type: ['integer', 'null'] },
        firstName: { type: 'string', maxLength: 255 },
        lastName: { type: 'string', maxLength: 255 },
        age: { type: 'number' },
        address: {
          type: 'object',
          properties: {
            street: { type: 'string' },
            city: { type: 'string' },
            zipCode: { type: 'string' },
          },
        },
      },
    }
  }

  static get relationMappings() {
    const Animal = require('./Animal')
    const Movie = require('./Movie')

    return {
      pets: {
        relation: Model.HasManyRelation,
        modelClass: Animal,
        join: {
          from: 'persons.id',
          to: 'animals.ownerId',
        },
      },

      movies: {
        relation: Model.ManyToManyRelation,
        modelClass: Movie,
        join: {
          from: 'persons.id',
          through: {
            from: 'persons_movies.personId',
            to: 'persons_movies.movieId',
          },
          to: 'movies.id',
        },
      },
    }
  }
}

Key notes:

Although much less terse, this model contains many of the same elements: schema checking, automatic de/serialization of children, decorating of certain properties with extra attributes (eg the relation property on relationMapping keys)
In fact, this class definition could easily be a compile target for the prisma schema language
Defining things in terms of class inheritance gives the developer the opportunity to add extra methods to the hydrated object
Harder to learn

Objection queries (processing layer)

const people = await Person.query().withGraphFetched({
  pets: true,
  children: {
    pets: true,
    children: true,
  },
})

for (const person of people) {
  console.log(person.fullName())
}

Key notes:

Has instance methods on the results objects
Can query for a whole graph at once
Treats objects as transient and composable by forming them with a query, which will give a different answer at a different point in time
Does not have a 1:1 mapping between hydrated and dehydrated items, ie: a hydrated query result cannot be dehydrated

Proposal

Principles

With the background properly in place, we are ready to propose an intended interface. We adhere to the following rules in this proposal

Use built in language features for MVP efforts: this strongly favours a modeling language like objection, which can eventually be a compile target for something fancier like graphql or prisma
Keep it simple: Don’t try to mix schemas and properties and relations into the some unwieldy format
Colocate everything that is known statically about a model: Make it very easy to understand all the facts about a class you are looking at without jumping to a bunch of files
Provide an extensible annotation system for properties: provision for the ability to add tags/functions to properties whose code is managed orthogonally. This will make it easier to add cross cutting behaviours in the future.
Assume the underlying layer is injected at runtime: so we can always test quickly with stubs
Assume the basic use case is crushing and uncrushing data from a root object: the main thing this needs to do properly is move things on and off the wire

API Example

/** @jsdoc
  Here is where documentation about the Dmz would go, so that jsdoc can automatically
  generate documentation for the models.
**/
class Dmz extends Model {
  fullName() {
    return this.firstName + ' ' + this.lastName
  }
  static get ipldSchema() {
    return `
type Dmz struct {
    config &Config
    timestamp Timestamp          # changes every block
    network Network              # block implies net change
    state &State
    pending optional &Pending
    approot optional PulseLink   # The latest known approot
    binary optional Binary
    covenant Covenant
}
`
  }
  postUncrush() {
    // lifecycle hook
    assert(this.fullName().length < 36)
  }
  preCrush() {
    delete this.transientVar
  }
}

const iporm = await IPORM.open(getFn, putFn, classes, {
  checkSchema: false,
})
await iporm.uncrush.DMZ(cid, decrypter)
await iporm.crush(dmzInstance, encrypter)

Hints

Timestamp class needs dependency injection
No need for partial graph hydration
State can’t have any nested keys crushed
Resolver can throw, but must eventually backed by a stateful object
Binary can’t be hydrated
Keys can be changed on the fly
Keys need to be stored on the pulsechain
Changing keys is out of scope due to problems knowing what key to use for a historical Block referred to by the current Pulse

Tests

This Idea is considered implemented once these tests pass:

Output is a standlone npm package

The folder for this module is in dreamcatcher-stack/pkg/iporm and is the only output. This should be published to npm and consumable as such.

Replace existing IPLD schema specification

Currently the IPLD specification for Interpulse is in subfolder w006-schemas. This folder should be able to be deleted once the Interpulse codebase is shifted to using the iporm package, with no loss of fidelity in the descriptional information currently present.

Replace existing IPLD ORM

Currently the ORM used by Interpulse is in subfolder w008-ipld. This folder should be able to provide the same functionality it currently does, but by depending entirely on iporm for its IPFS ORM functions instead of its current adhoc ORM.

All 6 scope elements are met

The scope defines 6 crucial elements that must be judged to have been met by the implementation in a form at least free of any critical to function bugs or deficiencies.

I21 IPORM the IPFS ORM

Background​

Scope​

Motivation​

Examples​

Prisma user schema (model layer)​

Prisma client (processor layer)​

Objection User Model​

Objection queries (processing layer)​

Proposal​

Principles​

API Example​

Hints​

Tests​

Output is a standlone npm package​

Replace existing IPLD schema specification​

Replace existing IPLD ORM​

All 6 scope elements are met​

Background

Scope

Motivation

Examples

Prisma user schema (model layer)

Prisma client (processor layer)

Objection User Model

Objection queries (processing layer)

Proposal

Principles

API Example

Hints

Tests

Output is a standlone npm package

Replace existing IPLD schema specification

Replace existing IPLD ORM

All 6 scope elements are met