aiida.backends.general.migrations package

Submodules

Data structures for mapping legacy JobCalculation data to new process attributes.

class aiida.backends.general.migrations.calc_state.StateMapping(state, process_state, exit_status, process_status)

Bases: tuple

__getnewargs__()

Return self as a plain tuple. Used by copy and pickle.

__module__ = 'aiida.backends.general.migrations.calc_state'
static __new__(_cls, state, process_state, exit_status, process_status)

Create new instance of StateMapping(state, process_state, exit_status, process_status)

__repr__()

Return a nicely formatted representation string

__slots__ = ()
_asdict()

Return a new dict which maps field names to their values.

_field_defaults = {}
_fields = ('state', 'process_state', 'exit_status', 'process_status')
_fields_defaults = {}
classmethod _make(iterable)

Make a new StateMapping object from a sequence or iterable

_replace(**kwds)

Return a new StateMapping object replacing specified fields with new values

exit_status

Alias for field number 2

process_state

Alias for field number 1

process_status

Alias for field number 3

state

Alias for field number 0
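
The members listed above are the ones that `collections.namedtuple` generates. The following is a minimal sketch that rebuilds an equivalent type locally so that it runs without AiiDA installed. It assumes the real class is created the same way, and the field values are illustrative placeholders, not entries from the actual state mapping.

```python
from collections import namedtuple

# Stand-in with the same field names as the documented class (assumption:
# the real class is produced by collections.namedtuple in the same way).
StateMapping = namedtuple('StateMapping', ['state', 'process_state', 'exit_status', 'process_status'])

# Placeholder values for illustration only; they are not taken from AiiDA.
mapping = StateMapping(state='FINISHED', process_state='finished', exit_status=0, process_status=None)

# Fields are reachable by name or by position (exit_status is field number 2).
same_value = mapping.exit_status == mapping[2]

# _replace returns a new tuple; the original is left unchanged.
failed = mapping._replace(exit_status=100)

# _asdict maps field names to values, in field order.
as_dict = mapping._asdict()

# _make builds an instance from any iterable of the right length.
from_list = StateMapping._make(['a', 'b', 1, None])
```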

SQL statements to detect invalid or unrecognized links for the provenance redesign migration.

Various utils that should be used during migrations and migrations tests because the AiiDA ORM cannot be used.

class aiida.backends.general.migrations.utils.LazyFile(name: str = '', file_type: aiida.repository.common.FileType = <FileType.DIRECTORY: 0>, key: Union[str, None, disk_objectstore.utils.LazyOpener] = None, objects: Dict[str, File] = None)[source]

Bases: aiida.repository.common.File

Subclass of File where key also allows LazyOpener in addition to a string.

This subclass is necessary because the migration will be storing instances of LazyOpener as the key which should normally only be a string. This subclass updates the key type check to allow this.

__init__(name: str = '', file_type: aiida.repository.common.FileType = <FileType.DIRECTORY: 0>, key: Union[str, None, disk_objectstore.utils.LazyOpener] = None, objects: Dict[str, File] = None)[source]

Construct a new instance.

Parameters
  • name – The final element of the file path

  • file_type – Identifies whether the File is a file or a directory

  • key – A key to map the file to its contents in the backend repository (file only)

  • objects – Mapping of child names to child Files (directory only)

Raises

ValueError – If a key is defined for a directory, or objects are defined for a file

__module__ = 'aiida.backends.general.migrations.utils'
class aiida.backends.general.migrations.utils.MigrationRepository(backend: aiida.repository.backend.abstract.AbstractRepositoryBackend = None)[source]

Bases: aiida.repository.repository.Repository

Subclass of Repository that uses LazyFile instead of File as its file class.

__module__ = 'aiida.backends.general.migrations.utils'
_file_cls

alias of LazyFile

class aiida.backends.general.migrations.utils.NoopRepositoryBackend[source]

Bases: aiida.repository.backend.abstract.AbstractRepositoryBackend

Implementation of the AbstractRepositoryBackend where all write operations are no-ops.

This repository backend makes it possible to use the Repository interface to build the repository metadata without actually writing the content of the current repository to disk elsewhere: instead, it simply opens a lazy file opener for each file. In a subsequent step, all these streams are passed to the new Disk Object Store, which writes their content directly to pack files for optimal efficiency.
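
The idea of deferring the read until a later step can be sketched as follows. Both `DeferredOpener` and `NoopWriter` are hypothetical stand-ins written for this illustration; they are not the classes from `disk_objectstore` or AiiDA, and the real backend may differ in its details.

```python
import os
import tempfile


class DeferredOpener:
    """Hypothetical lazy opener: remembers a path and opens it only on entry."""

    def __init__(self, path):
        self.path = path
        self._handle = None

    def __enter__(self):
        self._handle = open(self.path, 'rb')
        return self._handle

    def __exit__(self, exc_type, exc_value, traceback):
        self._handle.close()
        self._handle = None


class NoopWriter:
    """Hypothetical no-op writer: returns an opener as the key and writes nothing."""

    def put(self, path):
        return DeferredOpener(path)


# A temporary file stands in for content of the legacy repository.
with tempfile.TemporaryDirectory() as tmpdir:
    filepath = os.path.join(tmpdir, 'input.txt')
    with open(filepath, 'wb') as handle:
        handle.write(b'legacy content')

    # No bytes are read or copied at this point; only the path is recorded.
    key = NoopWriter().put(filepath)

    # Later step: the consumer opens the stream and reads the bytes.
    with key as stream:
        content = stream.read()
```

Because the key is an opener rather than a string, the file class that holds it has to accept that type, which is the role described for LazyFile.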

__abstractmethods__ = frozenset({})
__module__ = 'aiida.backends.general.migrations.utils'
_abc_impl = <_abc_data object>
_put_object_from_filelike(handle: io.BufferedIOBase) → str[source]

Store the byte contents of a file in the repository.

Parameters

handle – filelike object with the byte content to be stored.

Returns

the generated fully qualified identifier for the object within the repository.

Raises

TypeError – if the handle is not a byte stream.

erase()[source]

Delete the repository itself and all its contents.

Note

This should not merely delete the contents of the repository but any resources it created. For example, if the repository is essentially a folder on disk, the folder itself should also be deleted, not just its contents.

has_object(key: str) → bool[source]

Return whether the repository has an object with the given key.

Parameters

key – fully qualified identifier for the object within the repository.

Returns

True if the object exists, False otherwise.

initialise(**kwargs) → None[source]

Initialise the repository if it hasn’t already been initialised.

Parameters

kwargs – parameters for the initialisation.

property is_initialised

Return whether the repository has been initialised.

property uuid

Return the unique identifier of the repository.

Note

A sandbox folder does not have the concept of a unique identifier and so always returns None.

aiida.backends.general.migrations.utils.apply_new_uuid_mapping(table, mapping)[source]

Take a mapping of pks to UUIDs and apply it to the given table.

Parameters
  • table – database table with uuid column, e.g. ‘db_dbnode’

  • mapping – dictionary that maps each pk onto the UUID to assign to that row
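
A minimal sketch of applying such a mapping, using an in-memory SQLite database so that it is self-contained. The helper name `apply_uuid_mapping`, the explicit `connection` argument and the `id` column name are assumptions made for this illustration; the function documented here runs against the AiiDA database instead.

```python
import sqlite3
import uuid


def apply_uuid_mapping(connection, table, mapping):
    """Set the uuid column of each row whose pk is a key of ``mapping``."""
    # Table names cannot be bound as SQL parameters, so accept known names only.
    if table not in {'db_dbnode'}:
        raise ValueError(f'unexpected table: {table}')
    for pk, new_uuid in mapping.items():
        connection.execute(f'UPDATE {table} SET uuid = ? WHERE id = ?', (new_uuid, pk))


# Toy table with two rows that share the same UUID.
connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE db_dbnode (id INTEGER PRIMARY KEY, uuid TEXT)')
connection.executemany('INSERT INTO db_dbnode (id, uuid) VALUES (?, ?)', [(1, 'same'), (2, 'same')])

# Give the row with pk 2 a fresh UUID; the row with pk 1 is left untouched.
new_uuid = str(uuid.uuid4())
apply_uuid_mapping(connection, 'db_dbnode', {2: new_uuid})
rows = connection.execute('SELECT id, uuid FROM db_dbnode ORDER BY id').fetchall()
```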

aiida.backends.general.migrations.utils.deduplicate_uuids(table=None)[source]

Detect and resolve entities with duplicate UUIDs in a given database table.

Before aiida-core v1.0.0, there was no uniqueness constraint on the UUID column of the node table, nor on that of a few other tables. This made it possible to store multiple entities with identical UUIDs in the same table without the database complaining. The bug was fixed in aiida-core v1.0.0 by adding an explicit uniqueness constraint on UUIDs at the database level. However, databases created before that fix could still contain duplicate UUIDs and would be left in an inconsistent state. This function analyses the given table to detect duplicate UUIDs and resolves them by generating new UUIDs. Note that it does not delete or merge any rows.

Returns

list of strings denoting the performed operations

Raises

ValueError – if the specified table is invalid
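
The resolution step can be sketched in plain Python as follows. The helper `plan_new_uuids` is hypothetical, and the choice to keep the original UUID on the row with the lowest pk is an assumption of this sketch, not a documented property of the function above.

```python
import uuid
from collections import defaultdict


def plan_new_uuids(rows):
    """Return {pk: new_uuid} for all but one row of every duplicated UUID."""
    by_uuid = defaultdict(list)
    for pk, value in rows:
        by_uuid[value].append(pk)

    mapping = {}
    for pks in by_uuid.values():
        # Assumption: the row with the lowest pk keeps its UUID, the rest get new ones.
        for pk in sorted(pks)[1:]:
            mapping[pk] = str(uuid.uuid4())
    return mapping


# Rows as (pk, uuid) pairs; 'aaa' occurs three times, 'bbb' once.
rows = [(1, 'aaa'), (2, 'bbb'), (3, 'aaa'), (4, 'aaa')]
mapping = plan_new_uuids(rows)
```

No row is deleted or merged: the result only assigns new UUIDs, and it has the pk-to-UUID shape that apply_new_uuid_mapping takes.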

aiida.backends.general.migrations.utils.delete_numpy_array_from_repository(uuid, name)[source]

Delete the numpy array with a given name from the repository corresponding to a node with a given uuid.

Parameters
  • uuid – the UUID of the node

  • name – the name of the numpy array

aiida.backends.general.migrations.utils.dumps_json(dictionary)[source]

Transform all datetime objects into ISO format and return the resulting JSON string.

aiida.backends.general.migrations.utils.ensure_repository_folder_created(uuid)[source]

Make sure that the repository sub folder for the node with the given UUID exists or create it.

Parameters

uuid – UUID of the node

aiida.backends.general.migrations.utils.get_duplicate_uuids(table)[source]

Retrieve rows with duplicate UUIDs.

Parameters

table – database table with uuid column, e.g. ‘db_dbnode’

Returns

list of tuples of (id, uuid) of rows with duplicate UUIDs
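
One way to express this lookup in SQL is a `GROUP BY ... HAVING` subquery. The sketch below runs it against an in-memory SQLite table; the query text and the `id` column name are assumptions for illustration and are not necessarily what the function executes.

```python
import sqlite3

# Toy table in which the UUID 'aaa' occurs twice.
connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE db_dbnode (id INTEGER PRIMARY KEY, uuid TEXT)')
connection.executemany(
    'INSERT INTO db_dbnode (id, uuid) VALUES (?, ?)',
    [(1, 'aaa'), (2, 'bbb'), (3, 'aaa')],
)

# Select every row whose UUID appears more than once in the table.
query = """
    SELECT id, uuid FROM db_dbnode
    WHERE uuid IN (SELECT uuid FROM db_dbnode GROUP BY uuid HAVING COUNT(*) > 1)
    ORDER BY uuid, id
"""
duplicates = connection.execute(query).fetchall()
```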

aiida.backends.general.migrations.utils.get_node_repository_dirpaths(basepath, shard=None)[source]

Return a mapping of node UUIDs onto the path to their current repository folder in the old repository.

Parameters
  • basepath – the absolute path of the base folder of the old file repository.

  • shard – optional shard to define which first shard level to check. If None, all shard levels are checked.

Returns

dictionary mapping each node UUID onto the absolute filepath of its repository folder, and a list of node repository folders that are missing one of the two known sub folders, path or raw_input, which is unexpected.

Raises

DatabaseMigrationError – if the repository contains node folders that contain both the path and raw_input subdirectories, which should never happen.

aiida.backends.general.migrations.utils.get_node_repository_sub_folder(uuid, subfolder='path')[source]

Return the absolute path to the sub folder path within the repository of the node with the given UUID.

Parameters

uuid – UUID of the node

Returns

absolute path to node repository folder, e.g. /some/path/repository/node/12/ab/c123134-a123/path
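
The example path suggests that the UUID is split into two two-character shard levels followed by the remainder. A minimal sketch of that layout, in which the helper name and the explicit `repository_root` argument are hypothetical (the documented function takes only the UUID and the sub folder name):

```python
import os


def node_sub_folder(repository_root, uuid, subfolder='path'):
    """Build <root>/node/<uuid[:2]>/<uuid[2:4]>/<uuid[4:]>/<subfolder>."""
    return os.path.join(repository_root, 'node', uuid[:2], uuid[2:4], uuid[4:], subfolder)


# Shortened placeholder UUID matching the example path in the text.
example = node_sub_folder('/some/path/repository', '12abc123134-a123')
raw_input_folder = node_sub_folder('/some/path/repository', '12abc123134-a123', 'raw_input')
```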

aiida.backends.general.migrations.utils.get_numpy_array_absolute_path(uuid, name)[source]

Return the absolute path of a numpy array with the given name in the repository of the node with the given uuid.

Parameters
  • uuid – the UUID of the node

  • name – the name of the numpy array

Returns

the absolute path of the numpy array file

aiida.backends.general.migrations.utils.get_object_from_repository(uuid, name)[source]

Return the content of a file with the given name in the repository sub folder of the given node.

Parameters
  • uuid – UUID of the node

  • name – name of the file to read

aiida.backends.general.migrations.utils.get_repository_object(hashkey)[source]

Return the content of an object stored in the disk object store repository for the given hashkey.

aiida.backends.general.migrations.utils.load_numpy_array_from_repository(uuid, name)[source]

Load and return a numpy array from the repository folder of a node.

Parameters
  • uuid – the node UUID

  • name – the name under which the array is stored

Returns

the numpy array

aiida.backends.general.migrations.utils.migrate_legacy_repository(shard=None)[source]

Migrate the legacy file repository to the new disk object store and return mapping of repository metadata.

Warning

This method assumes that the new disk object store container has been initialized.

The return value is a dictionary whose keys are the UUIDs of the nodes whose repository folder contents have been migrated to the disk object store. The values are the repository metadata, which contain the keys of the generated files with which the files can be retrieved from the disk object store. The format of the repository metadata is exactly the one normally generated by the ORM.

This implementation deliberately uses the Repository interface so that the logic that builds the nested repository metadata from the contents of a folder on disk does not have to be rewritten. The advantage is that the generated repository metadata is guaranteed to be exactly the same as it would be during normal operation. However, if the Repository interface or its implementation ever changes, this solution may have to be adapted and the significant parts of the implementation copied here.

Returns

mapping of node UUIDs onto the new repository metadata.

aiida.backends.general.migrations.utils.put_object_from_string(uuid, name, content)[source]

Write a file with the given content in the repository sub folder of the given node.

Parameters
  • uuid – UUID of the node

  • name – name to use for the file

  • content – the content to write to the file

aiida.backends.general.migrations.utils.recursive_datetime_to_isoformat(value)[source]

Convert all datetime objects in the given value to string representations in ISO format.

Parameters

value – a mapping, sequence or single value optionally containing datetime objects
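
A minimal sketch of such a recursive conversion, assuming mappings become dictionaries and sequences become lists. The helper name `to_isoformat` is hypothetical and the actual implementation may treat container types differently.

```python
import json
from collections.abc import Mapping, Sequence
from datetime import datetime


def to_isoformat(value):
    """Recursively replace datetime objects by their ISO format string."""
    if isinstance(value, datetime):
        return value.isoformat()
    if isinstance(value, Mapping):
        return {key: to_isoformat(item) for key, item in value.items()}
    # str and bytes are sequences too, but must be returned unchanged.
    if isinstance(value, Sequence) and not isinstance(value, (str, bytes)):
        return [to_isoformat(item) for item in value]
    return value


data = {'ctime': datetime(2020, 1, 2, 3, 4, 5), 'tags': ['a', datetime(2021, 6, 7)], 'n': 1}
converted = to_isoformat(data)

# After conversion the structure contains only JSON-serializable values.
encoded = json.dumps(converted)
```

Converting first and serializing second is the order described for dumps_json, since `json.dumps` raises a `TypeError` when it meets a raw datetime object.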

aiida.backends.general.migrations.utils.serialize_repository(repository: aiida.repository.repository.Repository) → dict[source]

Serialize the metadata into a JSON-serializable format.

Note

the serialization format is optimized to reduce the size in bytes.

Returns

dictionary with the content metadata.

aiida.backends.general.migrations.utils.store_numpy_array_in_repository(uuid, name, array)[source]

Store a numpy array in the repository folder of a node.

Parameters
  • uuid – the node UUID

  • name – the name under which to store the array

  • array – the numpy array to store

aiida.backends.general.migrations.utils.verify_uuid_uniqueness(table)[source]

Check whether the database table contains rows with duplicate UUIDs.

Parameters

table – Database table with uuid column, e.g. ‘db_dbnode’

Raises

IntegrityError – if the table contains rows with duplicate UUIDs.