Archive Service

The Archive Service allows you to make an archive of transactions and attachments from the Corda vault which can no longer be part of an ongoing or new transaction flow. This can reduce pressure on your node’s database, and declutter your vault.

You can use Archive Service commands to mark archivable items in your vault, archive them, and restore transactions from the archive when necessary.

The Archive Service consists of the following:

It also makes use of the Application Entity Manager, which allows CorDapps to access off-ledger databases using JPA APIs.

The Archive Service archives Ledger Recovery distribution records associated with the archived transactions. (The tables node_sender_distribution_records and node_receiver_distribution_records are included in the archiving process.)

The Archive Service has commands you can use to identify which transactions can be archived in your vault. A fully consumed transaction or attachment will be marked as archivable when:

  • There are no unconsumed transactions in the same LedgerGraph component. A LedgerGraph component is a connected group of transactions, represented as a Directed Acyclic Graph (DAG) in the LedgerGraph service.
  • The transaction is not also referenced by another LedgerGraph component that contains unconsumed transactions.
  • The attachment itself is not a contract attachment.
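The first condition above can be illustrated with a minimal sketch. It is not the LedgerGraph implementation: the function name, the edge representation, and the set of unconsumed transactions are assumptions made for illustration, and the sketch ignores reference links between components and the attachment rule.

```python
def archivable_transactions(transactions, edges, unconsumed):
    """Return ids of transactions whose whole component is fully consumed.

    transactions: iterable of all transaction ids (hypothetical input)
    edges: (tx_id, input_tx_id) pairs linking a transaction to one it consumes;
           both ids must appear in `transactions`
    unconsumed: set of ids of transactions that still have unconsumed outputs
    """
    transactions = list(transactions)
    neighbours = {tx: set() for tx in transactions}
    for child, parent in edges:
        # Treat links as undirected so a component is a connected group.
        neighbours[child].add(parent)
        neighbours[parent].add(child)

    archivable, seen = set(), set()
    for start in transactions:
        if start in seen:
            continue
        # Depth-first walk to collect the component containing `start`.
        component, stack = set(), [start]
        while stack:
            tx = stack.pop()
            if tx in component:
                continue
            component.add(tx)
            stack.extend(neighbours[tx] - component)
        seen |= component
        # A single unconsumed transaction blocks the whole component.
        if component.isdisjoint(unconsumed):
            archivable |= component
    return archivable


# A -> B -> C is fully consumed (C is assumed to have no outputs);
# D -> E still has unconsumed outputs in E.
txs = ["A", "B", "C", "D", "E"]
links = [("B", "A"), ("C", "B"), ("E", "D")]
print(sorted(archivable_transactions(txs, links, {"E"})))  # ['A', 'B', 'C']
```

Note that D is not archivable in this example even though it is itself consumed, because it shares a component with E.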

Figure: Archivable and non-archivable LedgerGraph components.

Once the Archive Service has marked a transaction or attachment as archivable, you can safely archive it anytime without risk to any other member of your network. You do not need to inform other members of the network, and your archiving action will not affect their ledger.

The Collaborative Recovery CorDapp LedgerSync V1.2.1 has been introduced for compatibility with the Archive Service. If you or anyone on your network uses Collaborative Recovery to recover data after a disaster scenario, archived transactions on any node in the network are marked as archived. This means that they do not appear to be ‘lost’ or ‘missing’ data and will not be recovered automatically during the recovery process.

If you are using the Archive Service with LedgerGraph V1.2.1, you can reduce heap memory usage by setting the onDemand option to true in your LedgerGraph configuration settings. LedgerGraph is then only triggered when the Archive Service requires it, for example when you make a create-snapshot request.

The more transactions a LedgerGraph component contains, the longer it may take for a related transaction to become archivable. If you want to create CorDapps that regularly produce archivable transactions, there are some steps you can take in your design process to help.

Some characteristics of a good ‘archive-friendly’ CorDapp are:

  • It uses short transaction chains that are consumed in their entirety.
  • It consumes and redeems ‘irrelevant states’. For example, if you store evolvable data, it should be consumed even if it is no longer directly queried or used in a transaction.
  • It avoids consuming the outputs of one transaction via multiple transactions. This could mean ensuring that fungible assets are distributed as narrowly as possible, rather than from multiple cash supplies.

The Archive Service CorDapp enables you to use the Archive Service to identify and perform archiving tasks. Your configuration of the Archive Service CorDapp depends on your node database.

As part of the configuration process, you can choose to create a backup schema. This is a temporary snapshot image of archivable transactions that can be used to restore the vault if the archiving process fails.

The Archive Service requires:

  • Node minimum platform version 6.
  • Corda Enterprise minimum version 4.4.
  • LedgerGraph V1.2.
  • Collaborative Recovery V1.2 (if you use Collaborative Recovery).

Copy the Archive Service CorDapp JAR file to the node’s cordapps directory. The LedgerGraph CorDapp JAR file must also be copied to the same directory.

corda@CrimsonSolo:/opt/corda/node$ ls -l cordapps/
drwxr-xr-x 2 corda corda   4096 Aug 26 06:43 config
-rw-r--r-- 1 corda corda 504538 Aug 26 06:35 archive-service-1.0.1.jar
-rw-r--r-- 1 corda corda 161260 Jul 30 05:04 ledger-graph-1.2.1.jar

The Archive Service CorDapp is configured using a HOCON configuration file located in the config sub-directory of the node’s cordapps directory. The configuration file must have the same name and version as the CorDapp JAR file, but with the .jar suffix changed to .conf.

corda@CrimsonSolo:/opt/corda/node$ ls -l cordapps/config
total 12
-rw-r--r-- 1 corda corda 469 Aug 26 06:43 archive-service-1.0.1.conf
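The naming rule described above can be sketched as a small helper. This is only an illustration for scripting a deployment; the function name config_path_for is hypothetical and is not part of Corda or the Archive Service.

```python
from pathlib import Path


def config_path_for(jar_path):
    """Derive the expected configuration file path for a CorDapp JAR.

    The file keeps the JAR's name and version, swaps .jar for .conf,
    and lives in the config sub-directory next to the JAR.
    """
    jar = Path(jar_path)
    return jar.parent / "config" / jar.with_suffix(".conf").name


print(config_path_for("cordapps/archive-service-1.0.1.jar").as_posix())
# cordapps/config/archive-service-1.0.1.conf
```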

The Archive Service configuration file provides the database connection details used by the service to record a temporary snapshot of the vault data.

The following are keys for configuring the Archive Service:

  • generator - SQL generator, defaults to vault’s database type.
  • driver - JDBC driver, defaults to vault’s database driver.
  • source.schema - Vault schema name, defaults to vault database schema.
  • target.schema - Backup schema name, optional, indicates that a backup schema should be created.
  • target.url - Backup schema archive URL, required if a backup schema is used.
  • target.user - Backup schema archive database user, required if a backup schema is used.
  • target.password - Backup schema archive database password, required if a backup schema is used.

Passwords can be obfuscated using Corda’s Config Obfuscator tool.

The following is a sample configuration file:

generator: PostgresGenerator
driver: "org.postgresql.Driver"

target: {
    url: "jdbc:postgresql:postgres"
    user: "archive"
    password: "<{HNtZpbrOGM6GhYA6foh5PCCBanUtaebCjauKL8ur9EE=:PolqmEJ7JOM+Sqj3ZNAE+Ew9bqG1wVE=}>"
    schema: "archive"
}

You can configure the archiving process to create a temporary snapshot image of the archivable transactions and attachments from your Corda vault on a backup schema within the same database. This snapshot can then be used to restore the vault should the database fail during the archiving operation.

For the backup schema to work, the Corda vault schema and the archive schema must reside on the same database but be managed by different schema owners.

The Archive Service uses a separate JPA entity manager factory to manage the archive schema and copy data from the Corda schema to the archive schema.

To create a copy of the vault on the backup schema, the backup schema owner must have SELECT rights to the vault schema.

To restore the vault from a copy on the backup schema, the backup schema owner must either have INSERT rights to the vault schema, or the restore-snapshot command must be executed with the --record option.

The --record option captures the SQL to a file so that it can be executed by a database administrator who has sufficient rights.

The following sections contain sample SQL statements needed for creating a backup schema user. In these examples the backup schema is called archive and the vault schema is corda.

On PostgreSQL, the following DDL statements can be used to create a backup schema user named archive:

create user archive password 'archive';
create schema archive AUTHORIZATION archive;
grant usage on schema corda to archive;
grant select on all tables in schema corda to archive;
grant insert on all tables in schema corda to archive;

On Oracle, you can use the following DDL statements to create a backup schema user named archive:

create user archive identified by archive2 DEFAULT TABLESPACE users QUOTA unlimited ON users;
grant CONNECT, RESOURCE to archive;

Oracle does not have a single grant option to provide rights to all the tables in a schema. This means that rights have to be granted on a per-table basis. For example:

grant select, insert on corda.NODE_TRANSACTIONS to archive;

The following tables should be archived:

  • ARCHIVABLE_TX
  • ARCHIVABLE_ATT
  • NODE_ATTACHMENTS
  • NODE_TRANSACTIONS

There are also other archivable tables that contain data linked to NODE_ATTACHMENTS and NODE_TRANSACTIONS. An archivable table contains one of the following:

  • A column named transaction_id (meaning it is linked to NODE_TRANSACTIONS).
  • A column named att_id (meaning it is linked to NODE_ATTACHMENTS).

Corda provides multiple tables of this type and they vary between different Corda versions. You can also add your own tables. You must grant access to any tables that you want to include in the archiving process.
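Because Oracle rights are granted per table, the statements are repetitive and easy to generate. The following is a minimal sketch that assumes you already have the list of tables to include; the function name oracle_grants and its default arguments are hypothetical, and the table list must come from your own vault schema since it varies between Corda versions.

```python
def oracle_grants(tables, vault_schema="corda", backup_user="archive",
                  rights=("select", "insert")):
    """Build one GRANT statement per table for the backup schema user."""
    privileges = ", ".join(rights)
    return [
        f"grant {privileges} on {vault_schema}.{table} to {backup_user};"
        for table in tables
    ]


# Example with two of the tables named above; extend the list with any
# other tables that carry a transaction_id or att_id column.
for statement in oracle_grants(["NODE_TRANSACTIONS", "NODE_ATTACHMENTS"]):
    print(statement)
```

The generated statements have the same shape as the single grant shown earlier and can be reviewed by a database administrator before they are run.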

On SQL Server, the following DDL statements can be used to create a backup schema user named archive:

create login archive with password = 'Archive123';
create schema archive;
create user archive for login archive with default_schema = archive;
grant SELECT, INSERT, UPDATE, DELETE, VIEW DEFINITION, ALTER, REFERENCES on schema::archive TO archive;
grant CREATE TABLE to archive;
grant CREATE VIEW to archive;
grant SELECT on schema::corda to archive;
grant INSERT on schema::corda to archive;

It is sometimes necessary to restart the node when carrying out an archiving job.

You need to restart the node in the following circumstances:

  • Before running import-snapshot, having run delete-snapshot (the archive schema has been deleted, and the vault is now to be restored from a file archive).
  • After delete-vault has been run using the --record option.
  • After restore-snapshot has been run using the --record option.

The command-line tool is a ‘fat-jar’ that can be executed directly using the java -jar option.

$ java -jar corda-tools-archive-service-1.0.1.jar --help
archive-service [--config-obfuscation-passphrase[=<cliPassphrase>]]
                [--config-obfuscation-seed[=<cliSeed>]]
                [--rpc-password[=<rpcPassword>]]
                [--rpc-url[=<rpcUrl>]]
                [--rpc-user=<rpcUser>]
                [--tracker]
                [-b=<baseDirectory>]
                [-f=<configurationFile>] [COMMAND]

Description:

Command line tool for performing archive operations on the Corda vault.

Options:

  -b, --base-directory=<baseDirectory>
                             Path to the base directory, default current directory
  -f, --config-file=<configurationFile>
                             Path to the configuration file, default node.conf
  -t, --tracker              Display progress tracking
      --config-obfuscation-passphrase[=<cliPassphrase>]
                             The passphrase used in the key derivation function when generating an AES key
      --config-obfuscation-seed[=<cliSeed>]
                             The seed used in the key derivation function to create a salt
      --rpc-user=<rpcUser>   Set RPC user
      --rpc-password[=<rpcPassword>]
                             Set RPC user password
      --rpc-url[=<rpcUrl>]   Set RPC connection URL
  -h, --help                 Show this help message and exit.
  -V, --version              Print version information and exit.

Commands:

  list-jobs                 display status of archiving jobs
  list-items                list transactions/attachments for archiving
  create-snapshot           marks transactions/attachments for archiving
  delete-vault              delete archived items from the vault
  export-snapshot           export snapshot to offline storage
  delete-snapshot           delete the snapshot from backup schema
  import-snapshot           import an archive to the vault
  restore-snapshot          restore items from backup schema to the vault
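If you drive the tool from a script, it can help to assemble the argument list in one place. The sketch below only builds the list and does not run anything. It uses the options and commands from the help output above; the helper name archive_command is hypothetical, and the RPC password is deliberately left out so that it is never placed on the command line by the script.

```python
COMMANDS = {
    "list-jobs", "list-items", "create-snapshot", "delete-vault",
    "export-snapshot", "delete-snapshot", "import-snapshot", "restore-snapshot",
}


def archive_command(command, rpc_user=None, rpc_url=None, config_file=None,
                    tracker=False, jar="corda-tools-archive-service-1.0.1.jar"):
    """Return an argv list: java -jar <jar> [options] <command>."""
    if command not in COMMANDS:
        raise ValueError(f"unknown archive-service command: {command}")
    argv = ["java", "-jar", jar]
    if rpc_user:
        argv.append(f"--rpc-user={rpc_user}")
    if rpc_url:
        argv.append(f"--rpc-url={rpc_url}")
    if config_file:
        argv.append(f"--config-file={config_file}")
    if tracker:
        argv.append("--tracker")
    # Options come first and the command last, as in the usage line.
    argv.append(command)
    return argv


print(" ".join(archive_command("list-items", rpc_user="archiver", tracker=True)))
```

The resulting list can be passed to subprocess.run once you have confirmed the connection details for your node.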

Filters can be used to limit which transactions and attachments are available for archiving. They are applied on the command line of the list-items or create-snapshot commands or alternatively recorded in the CorDapp configuration file.

Each filter has its own configuration requirements, which it takes either from the HOCON file given on the command line or from the CorDapp configuration file.

Custom filters can be implemented by using the Archive Service Library. For more details see the Archive Service Library documentation.

The following is a sample HOCON configuration file that can be used to configure the standard TransactionId filter.

filter: {
    // A list of filters to be applied
    filters: [
        "TransactionIdFilter"
    ]

    // Configuration for the TransactionIdFilter filter
    transactionIdFilter: {
        transactions: [
            "1234567812345678123456781234567812345678123456781234567812345678"
        ]
    }
}
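When the list of transaction IDs comes from another system, the filter block can be generated instead of written by hand. The following sketch relies on HOCON being a superset of JSON, so JSON output is also valid HOCON. The function name is hypothetical, and the check that each ID is 64 hexadecimal characters is an assumption based on the sample ID above.

```python
import json
import re

# Assumption: transaction IDs are 64 hexadecimal characters, as in the sample.
TX_ID = re.compile(r"[0-9A-Fa-f]{64}")


def transaction_id_filter_config(transaction_ids):
    """Render a TransactionIdFilter configuration as JSON (valid HOCON)."""
    ids = list(transaction_ids)
    for tx_id in ids:
        if not TX_ID.fullmatch(tx_id):
            raise ValueError(f"not a 64-character hex transaction id: {tx_id!r}")
    config = {
        "filter": {
            "filters": ["TransactionIdFilter"],
            "transactionIdFilter": {"transactions": ids},
        }
    }
    return json.dumps(config, indent=4)


print(transaction_id_filter_config(["12345678" * 8]))
```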

The following HOCON example can be used to configure the standard Party filter:

filter: {
    // A list of filters to be applied
    filters: [
        "PartyFilter"
    ]

    // Configuration for the PartyFilter filter
    partyFilter.parties: [
        "O=PartyA,L=London,C=GB"
    ]
}

Exporters are used to copy the archive snapshot from the vault to a permanent archive. The exporters to be applied can be given on the command line to the export-snapshot command or recorded in the CorDapp configuration file.

exporter: {
    exporters: [
        "ZippedFileExporter",
        "QueryableStateFileExporter"
    ]
}

By default no exporters are applied.

Each exporter has its own configuration requirements, which it takes either from the HOCON file given on the command line or from the CorDapp configuration file.

Custom exporters can be implemented for individual archive solutions. For more details see the Archive Service Library documentation.

Queryable state tables can be exported to CSV format by listing the tables in the configuration file under the property queryableTables.

queryableTables: [
    "LOAN_STATES"
]

The property can be added to the Archive Service CorDapp configuration file, or passed within the create-snapshot command configuration.

A suitable exporter, such as QueryableStateFileExporter, must also be listed on the command line to export-snapshot.

The Archive Service automatically detects transaction and attachment tables that use the columns TRANSACTION_ID or ATT_ID within the vault schema and includes them in the archive process.

Additional transaction and attachment tables which use different column names can be registered using the properties additionalTransactionTables and additionalAttachmentTables with the following format:

additionalTransactionTables: [
    "ACCOUNT_STATE:TX_ID",
]
additionalAttachmentTables: [
    "ATTACHMENT_INFO:ATTACHMENT_ID"
]

Data from these tables will be recorded as part of the snapshot process and later deleted from the vault, but will not be exported to permanent archive.
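Each entry in additionalTransactionTables and additionalAttachmentTables pairs a table name with the column that holds the transaction or attachment ID, separated by a colon. The following is a minimal sketch for checking such entries before they go into the configuration file. The function name is hypothetical and this is not how the Archive Service itself parses them.

```python
def parse_table_entries(entries):
    """Split "TABLE:COLUMN" entries into (table, column) pairs."""
    parsed = []
    for entry in entries:
        table, separator, column = entry.partition(":")
        table, column = table.strip(), column.strip()
        if not separator or not table or not column:
            raise ValueError(f"expected TABLE:COLUMN, got {entry!r}")
        parsed.append((table, column))
    return parsed


print(parse_table_entries(["ACCOUNT_STATE:TX_ID", "ATTACHMENT_INFO:ATTACHMENT_ID"]))
# [('ACCOUNT_STATE', 'TX_ID'), ('ATTACHMENT_INFO', 'ATTACHMENT_ID')]
```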

Tables that should be excluded from the archive process can be registered using the properties excludeTransactionTables and excludeAttachmentTables.
