scm.exlib

Overview

scm.exlib is a general framework for writing exheres that check out code from a source code management system. It is designed to make the interface and feature set as uniform as reasonably possible across different SCMs, to share as much logic as possible between different backend implementations, and to easily support multiple checkouts in a single exheres.

Usage

Basics

An exheres using scm.exlib sets one or more variables describing the code that should be checked out, and then requires the exlib. While it is possible to require scm directly, it is more common to require the backend-specific exlib, named scm-backend (for example, scm-git); this is a shortcut that avoids the need to define the TYPE variable explicitly. Examples:
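A minimal sketch of both forms, assuming a hypothetical repository URL (example.org is a placeholder). The variable names follow the SCM_variable-name convention described under "Per-repository exheres-defined variables".

```bash
# Common form: require the backend exlib, which selects the backend itself.
SCM_REPOSITORY="git://example.org/foo.git"
require scm-git
```

```bash
# Alternative form: require scm directly and name the backend explicitly.
SCM_REPOSITORY="git://example.org/foo.git"
SCM_TYPE="git"
require scm
```

Both fragments only make sense inside an exheres, where require is provided by the package manager.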

Multiple repositories

Most commonly, an exheres will only perform a single checkout. However, sometimes it is necessary to fetch code from multiple sources. In this case, one repository is designated the “primary” repository (this term is also used for the only repository in the one-checkout case) and the rest are called “secondary” and each given their own name, listed in ${SCM_SECONDARY_REPOSITORIES}. Names must be non-empty and consist only of letters, digits and underscores. For example:

SCM_REPOSITORY="git://anongit.freedesktop.org/git/xorg/xserver"
SCM_REVISION="3336ff91de2aa35277178f39b8d025e324ae5122"
SCM_BRANCH="xgl-0-0-1"
SCM_CHECKOUT_TO="xorg-server"

SCM_mesa_REPOSITORY="git://anongit.freedesktop.org/git/mesa/mesa"
SCM_mesa_REVISION="ad6351a994fd14af9d07da4f06837a7f9b9d0de4"
SCM_mesa_BRANCH="master"

SCM_SECONDARY_REPOSITORIES="mesa"
require scm-git

Here, REPOSITORY is set for both repositories, as it must be, and REVISION, because we want to build a specific version rather than whatever happens to be the latest. BRANCH is not required here, as the default for the git backend is to fetch all branches, but it is more efficient to fetch only the one that is needed. CHECKOUT_TO is set for the primary repository so that the local git clone can be shared with the xorg-server package.
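The naming rule for secondary repositories (non-empty, letters, digits and underscores only) can be checked mechanically. The following is a sketch; is_valid_scm_repo_name is a hypothetical helper written for this illustration and is not part of scm.exlib.

```bash
# Hypothetical helper: succeeds if $1 is a valid secondary repository name,
# i.e. non-empty and made up only of letters, digits and underscores.
is_valid_scm_repo_name() {
    case $1 in
        ''|*[!A-Za-z0-9_]*) return 1 ;;
    esac
    return 0
}

for name in mesa lib_plasma2 "" my-repo; do
    if is_valid_scm_repo_name "${name}"; then
        echo "'${name}': ok"
    else
        echo "'${name}': invalid"
    fi
done
```

The restriction is what allows a repository name to be embedded in a bash variable name such as SCM_mesa_REPOSITORY.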

More elaborate multiple repository usage

Occasionally it is useful to set the variables after requiring the exlib. For example, when defining many similar repositories in a loop, it is convenient to use the scm_set_var function, but this is not available until scm.exlib is loaded. In this case, either ensure that ${SCM_REPOSITORY} is not set when require scm is executed, or set ${SCM_NO_AUTOMATIC_FINALISE} to a non-empty value before the require; in both cases, call scm_finalise after setting all the variables. Example:

require amarok scm-svn

SCM_REPOSITORY="svn://anonsvn.kde.org/home/kde/"
SCM_SUBPATH="extragear/multimedia/${PN}"

SCM_SECONDARY_REPOSITORIES="libplasma animators popupdropper"
for SCM_THIS in ${SCM_SECONDARY_REPOSITORIES}; do
    scm_set_var REPOSITORY "${SCM_REPOSITORY}"
    scm_set_var REVISION   _
    SCM_SVN_EXTERNALS+=" src/context/${SCM_THIS/lib}:${SCM_THIS}"
done

SCM_libplasma_SUBPATH="KDE/kdebase/workspace/libs/plasma"
SCM_animators_SUBPATH="KDE/kdebase/workspace/plasma/animators"
SCM_popupdropper_SUBPATH="playground/libs/popupdropper/popupdropper"

scm_finalise

This example also demonstrates the use of SUBPATH to fetch a single project from a large SVN repository, and SVN_EXTERNALS to deal with the svn:externals feature of SVN (externals are not fetched automatically, because all SCM checkouts should be controlled by scm.exlib).

Exheres-defined variables

Global exheres-defined variables

Variables defined in this section are defined at most once per exheres. All are simple bash variables. Unless otherwise specified, variables default to being unset/empty.

SCM_SECONDARY_REPOSITORIES
A space-separated list of the names of the secondary repositories used by the exheres. May contain option? ( ) conditionals, similar to those in dependencies.
SCM_NO_PRIMARY_REPOSITORY
If non-empty, disables the use of the primary repository (inheritance of the TYPE variable into secondary repositories still functions). Intended for cases where all checkouts should be subject to options.
SCM_NO_AUTOMATIC_FINALISE
If non-empty, prevents scm_finalise from being called automatically by scm.exlib, even if ${SCM_REPOSITORY} is non-empty at require-time. Intended for cases where the SCM_* variables are defined in multiple places (exheres/exlibs), and it would be difficult to ensure that ${SCM_REPOSITORY} is not set earlier than desired.
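As a sketch of how these variables might be combined when every checkout depends on an option: the repository names (docs_src, bindings), option names (doc, python) and URLs below are all made up for illustration, and the require line and any scm_finalise call are omitted.

```bash
# Hypothetical exheres fragment: no primary repository, two optional
# secondary repositories guarded by option conditionals.
SCM_NO_PRIMARY_REPOSITORY=1
SCM_TYPE="git"   # still inherited by the secondary repositories
SCM_SECONDARY_REPOSITORIES="doc? ( docs_src ) python? ( bindings )"
SCM_docs_src_REPOSITORY="git://example.org/docs.git"
SCM_bindings_REPOSITORY="git://example.org/bindings.git"
```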

Per-repository exheres-defined variables

Variables described in this section can be defined separately for each repository. Unless otherwise specified, variables default to being unset/empty, and secondary repositories do not automatically inherit values from the primary repository. After requiring scm.exlib, some of these may automatically be sanitised or assigned default values.

To define a variable for the primary repository, set the bash variable SCM_variable-name. Examples:
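For instance, a sketch with a placeholder URL and branch name:

```bash
# Primary repository: the variable REPOSITORY becomes SCM_REPOSITORY,
# BRANCH becomes SCM_BRANCH, and so on.
SCM_REPOSITORY="git://example.org/foo.git"
SCM_BRANCH="stable"
```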

To define a variable for a secondary repository, set the bash variable SCM_repository-name_variable-name. Examples:
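For instance, a sketch assuming a secondary repository with the made-up name docs and a placeholder URL:

```bash
# Secondary repository "docs": the variable REPOSITORY becomes
# SCM_docs_REPOSITORY, BRANCH becomes SCM_docs_BRANCH, and so on.
SCM_SECONDARY_REPOSITORIES="docs"
SCM_docs_REPOSITORY="git://example.org/foo-docs.git"
SCM_docs_BRANCH="master"
```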

Key: each variable below has a general definition, which may be followed by backend-specific notes for bzr, cvs, darcs, git, hg or svn.

Generic variables

TYPE

The name of the backend. For secondary repositories, defaults to that set for the primary repository, which is itself usually set by requiring the backend directly.

CHECKOUT_TO

The directory into which the remote repository will be cloned/checked out/etc.

Defaults to the package name for the primary repository, or the repository name for secondary repositories. If the backend doesn’t support multiple branches in a single checkout and ${SLOT} is not equal to 0, then the default value will also have :${SLOT} at the end. Be sure to set this variable manually if the default behaviour isn’t sufficient to prevent different branches from “fighting” over the space.

Should always be set to a relative path in the exheres; scm.exlib will automatically prepend ${SCM_HOME}/ to the value.

bzr: A shared repository is created in this directory, then the branch is cloned to a subdirectory named by the BRANCH variable; therefore, multiple branches can be stored simultaneously.
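The default described above can be sketched as follows. default_checkout_to is a hypothetical helper written for this illustration, not a function provided by scm.exlib, and the values of PN, SLOT and SCM_HOME are examples only.

```bash
# Sketch only: mirrors the documented default for CHECKOUT_TO.
#   $1: repository name, empty for the primary repository
#   $2: "yes" if the backend can hold several branches in one checkout
default_checkout_to() {
    local name=${1:-${PN}} multi_branch=$2 dir
    dir=${name}
    # Single-branch backends get a :SLOT suffix unless the slot is 0.
    if [[ ${multi_branch} != yes && ${SLOT:-0} != 0 ]]; then
        dir+=":${SLOT}"
    fi
    # scm.exlib prepends ${SCM_HOME}/ to the relative path.
    echo "${SCM_HOME}/${dir}"
}

PN="foo" SLOT="2" SCM_HOME="/var/cache/fetched/scm"   # illustrative values
default_checkout_to ""     no    # /var/cache/fetched/scm/foo:2
default_checkout_to "mesa" yes   # /var/cache/fetched/scm/mesa
```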

UNPACK_TO

The directory under ${WORKBASE} to which the repository contents will be copied at the start of the build. Defaults to ${WORKBASE}/${PNV} for the primary repository, and ${WORKBASE}/name for secondary repositories.

Note that, although this should always be a subdirectory of ${WORKBASE}, the ${WORKBASE}/ prefix will not be added automatically, unlike the ${SCM_HOME}/ prefix for CHECKOUT_TO.

Can be overridden automatically if the repository is mentioned in SVN_EXTERNALS for some other repository.

REPOSITORY

The address (often a URI) of the repository, in the appropriate backend-specific syntax. Must be set.

bzr: This should specify the portion of the URI common to all branches in the “same” repository (it doesn’t matter whether or not a shared repository is actually present on the remote end); the remainder should be placed in the BRANCH variable.

bzr: URIs using the lp: scheme are not supported, because they require network access to do anything useful with them, including to tell whether or not network access is required. Since they are just aliases for the “real” URI, you can use HOME=/var/empty bzr info lp:whatever to find out what to specify instead.

cvs: This should be set to the appropriate value of ${CVSROOT}, aka the -d global option to cvs.

svn: This should be the Subversion URI up to but not including the /trunk/, /branches/ or /tags/ path component. The remainder is specified by, or inferred from the presence or absence of, the BRANCH, TAG and SUBPATH variables. See SVN_RAW_URI if this is not appropriate.

svn: Do not include peg revisions here; use REVISION instead.
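One plausible reading of the Subversion rule is sketched below. svn_uri_sketch is a hypothetical function, not the actual backend code. The precedence of TAG over BRANCH and the fallback to /trunk when both are empty are assumptions, although the trunk fallback is consistent with the amarok example earlier in this document.

```bash
# Sketch only: composes a checkout URI from the per-repository variables.
#   $1 REPOSITORY  $2 BRANCH  $3 TAG  $4 SUBPATH  $5 SVN_RAW_URI
svn_uri_sketch() {
    local uri=${1%/} branch=$2 tag=$3 subpath=$4 raw=$5
    if [[ -n ${raw} ]]; then
        echo "$1"   # used as-is; BRANCH, TAG and SUBPATH must be empty
        return 0
    fi
    if [[ -n ${tag} ]]; then
        uri+="/tags/${tag}"
    elif [[ -n ${branch} ]]; then
        uri+="/branches/${branch}"
    else
        uri+="/trunk"
    fi
    if [[ -n ${subpath} ]]; then
        uri+="/${subpath}"
    fi
    echo "${uri}"
}

svn_uri_sketch "svn://anonsvn.kde.org/home/kde/" "" "" "extragear/multimedia/amarok"
# svn://anonsvn.kde.org/home/kde/trunk/extragear/multimedia/amarok
```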

BRANCH

The branch within the repository that should be fetched and copied to the build directory.

bzr: Defaults to trunk; use . to specify that the branch lives in the root of the repository.

darcs: Not supported, as darcs uses a “repository = branch” model.

git: Specifies which branch to fetch, and if neither TAG nor REVISION is set, also which branch head to copy to the build directory.

git: Defaults to master if neither TAG nor REVISION is set, otherwise defaults to empty (in which case all branches will be fetched).

hg: Specifies the branch within a particular repository. With Mercurial, it is common to use the “repository = branch” model instead, which should be handled using REPOSITORY, and CHECKOUT_TO if multiple branches should be allowed to coexist.

hg: Defaults to default if neither TAG nor REVISION is set.

TAG

The symbolic name of the revision that should be copied to the build directory.

It is generally assumed that tags are fixed at creation, even if the SCM doesn’t enforce this. Please throw things at your upstream if this is not true.

REVISION

The identifier of the revision that should be copied to the build directory, using the appropriate backend-specific syntax.

In general, this must be a literal identifier, not (for SCMs that support such things) an expression that evaluates to an identifier and may even change meaning over time.

bzr: Must be either a revision identifier or a non-dotted revision number (the latter is only supported to allow things like ${PV#*_p}; in all other cases, the globally unique identifier is preferred).

cvs: A timestamp of the form YYYY.MM.DD.hh.mm.ss.

darcs: Not supported, as darcs does not have revision identifiers. See DARCS_CONTEXT_FILE for an alternative.

git: Must be lower-case and unabbreviated.

hg: Must be lower-case and unabbreviated.

svn: May be _ to allow it to be filled in from a svn:externals definition; see SVN_EXTERNALS.
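The ${PV#*_p} expansion mentioned in the note on revision numbers derives the revision from a package version that carries it as a suffix. A small sketch, assuming a made-up version string (in a real exheres, PV is set by the package manager):

```bash
# Illustrative value only: a snapshot version ending in _p<revision>.
PV="0.5_p1234"
# Strip the shortest prefix ending in "_p", leaving the revision number.
SCM_REVISION="${PV#*_p}"
echo "${SCM_REVISION}"   # 1234
```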

SUBPATH

The subdirectory of the repository that should be checked out and copied to the work directory.

bzr: Not supported.

cvs: The CVS module that should be checked out. Defaults to the package name for the primary repository, and the repository name for secondary repositories.

darcs: Not supported.

git: Not supported.

hg: Not supported.

Backend-specific variables

DARCS_CONTEXT_FILE

A file containing the output of darcs changes --context specifying the repository state that should be copied to the build directory. Usually stored in ${FILES}.

GIT_TAG_SIGNING_KEYS

An array listing the names of files containing public keys that should be used to verify signed tags. If specified, the tag must be signed by one of these keys. Usually created with gpg --export --armor email-address and stored in ${FILES}.
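A sketch of how this might look in an exheres. The tag and key file name are placeholders, and the bash variable name SCM_GIT_TAG_SIGNING_KEYS is assumed by analogy with SCM_SVN_EXTERNALS in the amarok example.

```bash
# Hypothetical exheres fragment: build a signed tag and require that it
# is signed by the key stored alongside the exheres.
SCM_TAG="v1.2.3"
SCM_GIT_TAG_SIGNING_KEYS=(
    "${FILES}/upstream-release-key.asc"
)
```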

SVN_EXTERNALS

A space-separated list of terms of the form subdirectory:repository, indicating that the specified repository matches the svn:externals definition for subdirectory. repository may also be blank, to specify that this externals definition should be ignored. The repository’s REVISION variable may be set to _, in which case it will be automatically reset to the value specified by the externals definition.

It is an error if the repository does not match the externals definition (as defined by the values of REPOSITORY, BRANCH, TAG, SVN_RAW_URI, REVISION and UNPACK_TO; however, UNPACK_TO will be set to the correct value automatically if it is unset), or if there is any externals definition that is not mentioned in this variable. Externals will not be fetched automatically, so that all SCM checkouts will be accounted for by the scm.exlib framework.
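To make the subdirectory:repository format concrete, the sketch below takes such a list apart. describe_externals is a hypothetical helper, not part of scm.exlib; the first two terms match what the amarok example earlier in this document generates, and the last term is a made-up ignored entry.

```bash
# Sketch only: split each term at its last colon and report the mapping.
describe_externals() {
    local term subdir repo
    for term in $1; do
        subdir=${term%:*}    # everything before the last colon
        repo=${term##*:}     # everything after it; blank means "ignore"
        if [[ -n ${repo} ]]; then
            echo "${subdir} -> secondary repository '${repo}'"
        else
            echo "${subdir} -> ignored"
        fi
    done
}

SCM_SVN_EXTERNALS=" src/context/plasma:libplasma src/context/animators:animators src/unused:"
describe_externals "${SCM_SVN_EXTERNALS}"
```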

SVN_PASSWORD

The password to pass to SVN. If empty, an explicit empty password is used. Only meaningful if SVN_USERNAME is set.

SVN_RAW_URI

If non-empty, the URI specified in REPOSITORY will be used as-is, without assuming the standard trunk/branches/tags layout; BRANCH, TAG and SUBPATH must all be empty.

SVN_USERNAME

The username to pass to SVN. See also SVN_PASSWORD.

scm.exlib-defined variables

scm.exlib defines more variables than are listed here, but the remainder are for internal use only.

Global scm.exlib-defined variables

These are simple bash variables, and are defined after requiring scm.exlib, either directly or via a backend.

SCM_HOME
Equal to ${FETCHEDDIR}/scm. The default directory for storing local checkouts. Exheres will rarely need to refer to this directly.

Per-repository scm.exlib-defined variables

These are mapped to bash variables in the same way as those defined by the exheres. They are generally not defined immediately; see the description of each variable to find out when it becomes available.

Key: each variable below has a general definition, which may be followed by backend-specific notes for bzr, cvs, darcs, git, hg or svn.

Generic variables

ACTUAL_REVISION

After a repository has been checked out and copied into ${WORK}, identifies the specific revision that is present, in whatever format is appropriate for the SCM in question.

bzr: This is always a globally unique revision identifier, not a revision number.

cvs: Not supported, as CVS does not have repository-wide revision identifiers (and timestamps are not suitable due to non-atomicity). See CVS_ACTUAL_REVISION_LIST for an alternative.

darcs: Not supported, as darcs does not have revision identifiers. See DARCS_ACTUAL_CONTEXT for an alternative.

Backend-specific variables

CVS_ACTUAL_REVISION_LIST

After the repository is checked out and copied to ${WORK}, lists the name and CVS revision of each file that is present. The format is one file per line, where each line contains the filename relative to the checkout root, followed by a : and a space, followed by the revision. The lines are sorted according to the C locale.
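The format lends itself to simple line-based processing. As a sketch, the hypothetical helper below (not part of scm.exlib) looks up one file's revision in text of this form; the sample data is made up.

```bash
# Sketch only: print the revision recorded for file $1 in list $2,
# where each line of the list reads "filename: revision".
cvs_revision_of() {
    local wanted=$1 list=$2 line
    while IFS= read -r line; do
        if [[ ${line%: *} == "${wanted}" ]]; then
            echo "${line##*: }"
            return 0
        fi
    done <<< "${list}"
    return 1
}

sample_list="Makefile: 1.12
src/main.c: 1.4.2.1"

cvs_revision_of "src/main.c" "${sample_list}"   # 1.4.2.1
```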

DARCS_ACTUAL_CONTEXT

After the repository is checked out and copied to ${WORK}, contains the darcs context identifying the files present, as generated by darcs changes --context.

Note: currently empty if TAG is set.

User-defined variables

These variables may be set by the user in the package manager configuration. Exheres should not modify them, and will rarely need to use them.

SCM_OFFLINE
If non-empty, disables network access, using the existing checkout(s). If they are missing or insufficient, the build is aborted.
SCM_MIN_UPDATE_DELAY
If non-empty, must be a positive integer that specifies the minimum number of hours between updating any specific checkout. Ignored if the existing checkout is insufficient.
SCM_SVN_CONFIG_DIR
If non-empty, the path to an SVN user configuration directory, as is usually stored in ~/.subversion.

Miscellaneous variables

SCM_THIS
Defines the currently “active” repository. If unset or empty, refers to the primary repository, otherwise refers to the named secondary repository. Exheres may set this when calling the various functions that access per-repository variables.
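The effect of SCM_THIS on variable lookup can be sketched with simplified stand-ins. The demo_* functions below are written for this illustration only and are not the real scm_var_name, scm_get_var and scm_set_var; they implement just the documented name mapping.

```bash
# Stand-ins for illustration; NOT the real scm.exlib functions.
demo_var_name() {
    # SCM_<name> for the primary repository, SCM_<repo>_<name> otherwise.
    echo "SCM_${SCM_THIS:+${SCM_THIS}_}${1}"
}
demo_get_var() {
    local var
    var=$(demo_var_name "$1")
    echo "${!var}"
}
demo_set_var() {
    printf -v "$(demo_var_name "$1")" '%s' "$2"
}

SCM_THIS=""     demo_set_var BRANCH "master"     # primary repository
SCM_THIS="mesa" demo_set_var BRANCH "mesa_7_4"   # secondary repository "mesa"

echo "${SCM_BRANCH}"                    # master
echo "${SCM_mesa_BRANCH}"               # mesa_7_4
SCM_THIS="mesa" demo_get_var BRANCH     # mesa_7_4
```

Setting SCM_THIS as a prefix assignment limits the change to that single call, which avoids leaving a secondary repository active by accident.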

scm.exlib-defined functions

scm.exlib defines more functions than are listed here, but the remainder are for internal use only.

Exported phase functions

As is usual with exlib-defined phase functions, these need only be called if the exheres or another exlib needs to define its own version of the phase function.

scm_unpack
Checks out or updates all the repositories used by the exheres, then copies the code to ${WORK}.
pkg_info
For built packages (installed or binary packages), may display detailed information about the exact version of the code that was built, if the SCM does not support short unambiguous global revision identifiers. For unbuilt packages, does nothing.

Other functions

Note: most of these functions are rarely needed in exheres. In particular, scm_get_var and scm_set_var need only be used if the name of the variable or repository is not fixed at authoring time; otherwise, it is acceptable and usually preferable to reference the underlying bash variable directly. See the example under “More elaborate multiple repository usage” for an exception.

scm_finalise
Performs various global-scope operations. Usually called automatically when scm.exlib is loaded. See the example under “More elaborate multiple repository usage” for a situation where it needs to be called manually.
scm_for_each
Takes one or more arguments denoting a command, and runs the command once for each repository (the primary and any secondaries) with SCM_THIS set appropriately.
scm_var_name
Takes a single argument naming a per-repository variable, and outputs the name of the bash variable corresponding to the specified variable for the active repository.
scm_get_var
Takes a single argument naming a per-repository variable, and outputs the value of the specified variable for the active repository.
scm_set_var
Takes two arguments, the first naming a per-repository variable and the second specifying a value, and sets the specified variable to the specified value for the active repository.
scm_modify_var
Takes at least two arguments, the first naming a per-repository variable and the remainder denoting a command. Runs the command with the current value of the specified variable for the active repository as an extra, final argument, and sets the specified variable to the output of the command.
scm_get_array
Takes two arguments, one naming a per-repository variable and one naming a bash array in the caller’s scope, and sets the contents of the array to the value of the specified variable for the active repository.
scm_set_array
Takes at least one argument, the first naming a per-repository variable and the rest denoting an array, and sets the specified variable to the specified array value for the active repository.
scm_trim_slashes
Takes zero or more of -scheme, -leading and -trailing, followed by exactly one string argument. Outputs its string argument with sequences of duplicate / characters replaced by single /s. If -scheme is specified, any text up to and including the first occurrence of :// will be unchanged. If -leading is specified, also removes any / characters at the start of the string. If -trailing is specified, also removes any / characters at the end of the string.
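The behaviour described for scm_trim_slashes can be illustrated with a stand-in written from that description. demo_trim_slashes below is not the real function, and the real implementation may differ in details such as error handling.

```bash
# Stand-in for illustration; NOT the real scm_trim_slashes.
demo_trim_slashes() {
    local scheme= leading= trailing= prefix= str
    while [[ $# -gt 1 ]]; do
        case $1 in
            -scheme)   scheme=1 ;;
            -leading)  leading=1 ;;
            -trailing) trailing=1 ;;
            *) echo "unknown option: $1" >&2; return 1 ;;
        esac
        shift
    done
    str=$1
    if [[ -n ${scheme} && ${str} == *://* ]]; then
        prefix=${str%%://*}://   # keep the scheme and its :// untouched
        str=${str#*://}
    fi
    # Collapse runs of slashes into single slashes.
    while [[ ${str} == *//* ]]; do
        str=${str//\/\//\/}
    done
    if [[ -n ${leading} ]]; then str=${str#/}; fi
    if [[ -n ${trailing} ]]; then str=${str%/}; fi
    echo "${prefix}${str}"
}

demo_trim_slashes -scheme -trailing "svn://anonsvn.kde.org/home//kde/"
# svn://anonsvn.kde.org/home/kde
demo_trim_slashes -leading "//extragear/multimedia/"
# extragear/multimedia/
```

Without -scheme, the :// in a URI is treated like any other run of slashes, so svn://host becomes svn:/host.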

Writing backends

TODO


Copyright 2009 David Leverton

This work is licensed under the Creative Commons Attribution Share Alike 3.0 License. To view a copy of this license, (a) visit http://creativecommons.org/licenses/by-sa/3.0/; or (b) send a letter to Creative Commons, 171 2nd Street, Suite 300, San Francisco, California, 94105, USA.