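Developing applications against the Subversion library APIs is fairly straightforward. All of the public header files (.h) live in the subversion/include directory of the source tree. When you build and install Subversion from source, these headers are copied into your system locations (for example, /usr/local/include). These headers cover all of the functions and types that are meant to be accessible to users of the Subversion libraries. The Subversion developer community is meticulous about ensuring that the public API is well documented; refer directly to the header files for that documentation.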
When examining the public header files, the first thing you might notice is that Subversion's datatypes and functions are namespace-protected. That is, every public Subversion symbol name begins with svn_, followed by a short code for the library in which the symbol is defined (such as wc, client, fs, etc.), followed by a single underscore (_), and then the rest of the symbol name. Semipublic functions (used among source files of a given library but not by code outside that library, and found inside the library directories themselves) differ from this naming scheme in that instead of a single underscore after the library code, they use a double underscore (__). Functions that are private to a given source file have no special prefixing and are declared static. Of course, a compiler isn't interested in these naming conventions, but they help to clarify the scope of a given function or datatype.
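To make the naming scheme concrete, here is a minimal Python sketch that guesses a symbol's scope purely from its spelling. The classify_symbol() helper and the svn_fs__example_helper name are hypothetical, invented for illustration; nothing like them ships with Subversion, and real symbols such as svn_error_t do not always carry a library code, so treat the result as a hint only.

```python
import re

# "svn_", then a library code, then one or two underscores, then the rest.
# This pattern is an assumption drawn from the convention described above.
_SVN_SYMBOL = re.compile(r"^svn_([a-z0-9]+)(__|_)(\w+)$")


def classify_symbol(name):
    """Return (scope, library_code) for NAME, judged by spelling alone.

    Hypothetical helper, not part of Subversion.
    """
    match = _SVN_SYMBOL.match(name)
    if match is None:
        # No svn_ prefix: by convention a file-private (static) function,
        # or a symbol that does not belong to Subversion at all.
        return ("private-or-foreign", None)
    library, separator, _rest = match.groups()
    scope = "semipublic" if separator == "__" else "public"
    return (scope, library)


print(classify_symbol("svn_fs_youngest_rev"))     # ('public', 'fs')
print(classify_symbol("svn_fs__example_helper"))  # ('semipublic', 'fs')
print(classify_symbol("make_new_directory"))      # ('private-or-foreign', None)
```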
Another good source of information about programming against the Subversion APIs is the project's own hacking guidelines, which you can find at http://subversion.apache.org/docs/community-guide/. This document contains useful information, which, while aimed at developers and would-be developers of Subversion itself, is equally applicable to folks developing against Subversion as a set of third-party libraries.[58]
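Along with Subversion's own datatypes, you will see many references to datatypes that begin with apr_. These are symbols that come from the Apache Portable Runtime (APR) library. APR is Apache's portability library, originally carved out of its server code in an attempt to separate the operating-system-specific code from the operating-system-independent code. The result is a library that provides a generic API for performing operations that differ mildly, or wildly, from one operating system to another. While the Apache HTTP Server was obviously the first user of APR, the Subversion developers immediately recognized the value of using it as well. This means that there is practically no operating-system-specific code in Subversion itself. It also means that the Subversion client compiles and runs anywhere the Apache HTTP Server does. Currently, this list includes all flavors of Unix, Win32, BeOS, OS/2, and Mac OS X.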
In addition to providing consistent implementations of system calls that differ across operating systems,[59] APR gives Subversion immediate access to many custom datatypes, such as dynamic arrays and hash tables. Subversion uses these types extensively. But perhaps the most pervasive APR datatype, found in nearly every Subversion API prototype, is the apr_pool_t, the APR memory pool. Subversion uses pools internally for all its memory allocation needs (unless an external library requires a different memory management mechanism for data passed through its API),[60] and while a person coding against the Subversion APIs is not required to do the same, she is required to provide pools to the API functions that need them. This means that users of the Subversion API must also link against APR, must call apr_initialize() to initialize the APR subsystem, and then must create and manage pools for use with Subversion API calls, typically by using svn_pool_create(), svn_pool_clear(), and svn_pool_destroy().
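If pools are new to you, the following toy model in Python may help with the create, clear, and destroy life cycle. It is only a sketch of the idea: the ToyPool class is hypothetical, it tracks Python objects rather than raw memory, and it is not how APR is implemented. The one behavior it borrows from APR is that clearing or destroying a pool also destroys its subpools.

```python
class ToyPool:
    """A toy stand-in for a memory pool; not the APR implementation."""

    def __init__(self, parent=None):
        self.items = []       # objects whose lifetime is tied to this pool
        self.children = []    # subpools, destroyed along with this pool
        self.parent = parent
        self.destroyed = False
        if parent is not None:
            parent.children.append(self)

    def alloc(self, obj):
        """Tie OBJ's lifetime to this pool and hand it back."""
        if self.destroyed:
            raise RuntimeError("pool has been destroyed")
        self.items.append(obj)
        return obj

    def clear(self):
        """Release everything allocated here, but keep the pool usable."""
        for child in list(self.children):
            child.destroy()
        self.items.clear()

    def destroy(self):
        """Clear the pool, detach it from its parent, and retire it."""
        self.clear()
        if self.parent is not None:
            self.parent.children.remove(self)
            self.parent = None
        self.destroyed = True


pool = ToyPool()              # long-lived pool for the whole operation
scratch = ToyPool(pool)       # short-lived subpool for per-item work
for path in ["trunk", "branches", "tags"]:
    scratch.clear()           # drop the previous item's allocations
    scratch.alloc(path.upper())
pool.destroy()                # also destroys the subpool
```

Reusing one subpool and clearing it at the top of each loop iteration keeps memory use flat no matter how many items are processed.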
To facilitate “streamy” (asynchronous) behavior and provide consumers of the Subversion C API with hooks for handling information in customizable ways, many functions in the API accept pairs of parameters: a pointer to a callback function, and a pointer to a blob of memory called a baton that carries context information for that callback function. Batons are typically C structures with additional information that the callback function needs but which is not given directly to the callback function by the driving API function.
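The shape of the callback and baton pairing can be sketched in a few lines of Python. Everything here is hypothetical: drive_entries() stands in for a driving API function, and the dictionary plays the role that a C structure would play as the baton.

```python
def drive_entries(entries, callback, baton):
    """Hypothetical driver: call CALLBACK once per entry.

    The driver never looks inside BATON; it only passes it through.
    """
    for entry in entries:
        callback(baton, entry)


def record_entry(baton, entry):
    """Callback that finds everything it needs in the baton."""
    baton["count"] += 1
    baton["lines"].append(baton["prefix"] + entry)


# The baton carries context the driver knows nothing about.
baton = {"prefix": "seen: ", "count": 0, "lines": []}
drive_entries(["trunk", "tags"], record_entry, baton)
print(baton["count"], baton["lines"])  # 2 ['seen: trunk', 'seen: tags']
```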
With remote version control operation as the whole point of Subversion's existence, it makes sense that some attention has been paid to internationalization (i18n) support. After all, while “remote” might mean “across the office,” it could just as well mean “across the globe.” To facilitate this, all of Subversion's public interfaces that accept path arguments expect those paths to be canonicalized (which is most easily accomplished by passing them through the svn_path_canonicalize() function) and encoded in UTF-8. This means, for example, that any new client binary that drives the libsvn_client interface needs to first convert paths from the locale-specific encoding to UTF-8 before passing those paths to the Subversion libraries, and then reconvert any resultant output paths from Subversion back into the locale's encoding before using those paths for non-Subversion purposes. Fortunately, Subversion provides a suite of functions (see subversion/include/svn_utf.h) that any program can use to do these conversions.
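The round trip described above looks roughly like this in Python. This is a sketch that uses only Python's standard codecs, not the functions declared in svn_utf.h; the helper names native_to_utf8() and utf8_to_native() are made up for the example, and Latin-1 is passed explicitly so that the result does not depend on your locale.

```python
import locale


def native_to_utf8(raw_path, encoding=None):
    """Convert path bytes from the locale's encoding to UTF-8 bytes."""
    encoding = encoding or locale.getpreferredencoding(False)
    return raw_path.decode(encoding).encode("utf-8")


def utf8_to_native(utf8_path, encoding=None):
    """Convert UTF-8 path bytes back into the locale's encoding."""
    encoding = encoding or locale.getpreferredencoding(False)
    return utf8_path.decode("utf-8").encode(encoding)


# A path as it might arrive from a Latin-1 locale.
latin1_path = "caf\u00e9/notes.txt".encode("latin-1")

utf8_path = native_to_utf8(latin1_path, "latin-1")   # hand this to the library
round_trip = utf8_to_native(utf8_path, "latin-1")    # convert results back

print(utf8_path)                  # b'caf\xc3\xa9/notes.txt'
print(round_trip == latin1_path)  # True
```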
Also, Subversion APIs require all URL parameters to be properly URI-encoded. So, instead of passing file:///home/username/My File.txt as the URL of a file named My File.txt, you need to pass file:///home/username/My%20File.txt. Again, Subversion supplies helper functions that your application can use, svn_path_uri_encode() and svn_path_uri_decode(), for URI encoding and decoding, respectively.
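To see the encoding at work without the Subversion libraries at hand, the sketch below uses Python's standard urllib.parse module as a stand-in. It is not svn_path_uri_encode(), and the two may differ on less common characters; the uri_encode() and uri_decode() wrappers are names chosen for this example.

```python
from urllib.parse import quote, unquote


def uri_encode(url):
    """Percent-encode URL, leaving ':' and '/' alone so the scheme and
    path separators survive."""
    return quote(url, safe="/:")


def uri_decode(url):
    """Reverse the percent-encoding."""
    return unquote(url)


encoded = uri_encode("file:///home/username/My File.txt")
print(encoded)              # file:///home/username/My%20File.txt
print(uri_decode(encoded))  # file:///home/username/My File.txt
```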
If you are interested in using the Subversion libraries in conjunction with something other than a C program (say, a Python or Perl script), Subversion has some support for this via the Simplified Wrapper and Interface Generator (SWIG). The SWIG bindings for Subversion are located in subversion/bindings/swig. They are still maturing, but they are usable. These bindings allow you to call Subversion API functions indirectly, using wrappers that translate the datatypes native to your scripting language into the datatypes needed by Subversion's C libraries.
Significant efforts have been made toward creating functional SWIG-generated bindings for Python, Perl, and Ruby. To some extent, the work done preparing the SWIG interface files for these languages is reusable in efforts to generate bindings for other languages supported by SWIG (which include versions of C#, Guile, Java, MzScheme, OCaml, PHP, and Tcl, among others). However, some extra programming is required to compensate for complex APIs that SWIG needs some help translating between languages. For more information on SWIG itself, see the project's web site at http://www.swig.org/.
Subversion also has language bindings for Java. The javahl bindings (located in subversion/bindings/java in the Subversion source tree) aren't SWIG-based, but are instead a mixture of Java and hand-coded JNI. Javahl covers most Subversion client-side APIs and is specifically targeted at implementors of Java-based Subversion clients and IDE integrations.
Subversion's language bindings tend to lack the level of developer attention given to the core Subversion modules, but can generally be trusted as production-ready. A number of scripts and applications, alternative Subversion GUI clients, and other third-party tools are successfully using Subversion's language bindings today to accomplish their Subversion integrations.
It's worth noting here that there are other options for interfacing with Subversion using other languages: alternative bindings for Subversion that aren't provided by the Subversion development community at all. There are a couple of popular ones we feel are especially noteworthy. First, Barry Scott's PySVN bindings (http://pysvn.tigris.org/) are a popular option for binding with Python. PySVN boasts of a more Pythonic interface than the more C-like APIs provided by Subversion's own Python bindings. And if you're looking for a pure Java implementation of Subversion, check out SVNKit (http://svnkit.com/), which is Subversion rewritten from the ground up in Java.
Example 8.1, “Using the repository layer” contains a code segment (written in C) that illustrates some of the concepts we've been discussing. It uses both the repository and filesystem interfaces (as can be determined by the prefixes svn_repos_ and svn_fs_ of the function names, respectively) to create a new revision in which a directory is added. You can see the use of an APR pool, which is passed around for memory allocation purposes. Also, the code reveals a somewhat obscure fact about Subversion error handling: all Subversion errors must be explicitly handled to avoid memory leakage (and in some cases, application failure).
Example 8.1. Using the repository layer

#include <stdio.h>

#include <apr_hash.h>
#include <apr_pools.h>

#include "svn_error.h"
#include "svn_fs.h"
#include "svn_repos.h"

/* Convert a Subversion error into a simple boolean error code.
 *
 * NOTE:  Subversion errors must be cleared (using svn_error_clear())
 *        because they are allocated from the global pool, else memory
 *        leaking occurs.
 *
 * On error, the macro returns 1 from the enclosing function.  On
 * success, it falls through so that the statements after it still run.
 */
#define INT_ERR(expr)                           \
  do {                                          \
    svn_error_t *__temperr = (expr);            \
    if (__temperr)                              \
      {                                         \
        svn_error_clear(__temperr);             \
        return 1;                               \
      }                                         \
  } while (0)

/* Create a new directory at the path NEW_DIRECTORY in the Subversion
 * repository located at REPOS_PATH.  Perform all memory allocation in
 * POOL.  This function will create a new revision for the addition of
 * NEW_DIRECTORY.  Return zero if the operation completes
 * successfully, nonzero otherwise.
 */
static int
make_new_directory(const char *repos_path,
                   const char *new_directory,
                   apr_pool_t *pool)
{
  svn_error_t *err;
  svn_repos_t *repos;
  svn_fs_t *fs;
  svn_revnum_t youngest_rev;
  svn_fs_txn_t *txn;
  svn_fs_root_t *txn_root;
  const char *conflict_str;

  /* Open the repository located at REPOS_PATH. */
  INT_ERR(svn_repos_open(&repos, repos_path, pool));

  /* Get a pointer to the filesystem object that is stored in REPOS. */
  fs = svn_repos_fs(repos);

  /* Ask the filesystem to tell us the youngest revision that
   * currently exists.
   */
  INT_ERR(svn_fs_youngest_rev(&youngest_rev, fs, pool));

  /* Begin a new transaction that is based on YOUNGEST_REV.  We are
   * less likely to have our later commit rejected as conflicting if we
   * always try to make our changes against a copy of the latest snapshot
   * of the filesystem tree.
   */
  INT_ERR(svn_repos_fs_begin_txn_for_commit2(&txn, repos, youngest_rev,
                                             apr_hash_make(pool), pool));

  /* Now that we have started a new Subversion transaction, get a root
   * object that represents that transaction.
   */
  INT_ERR(svn_fs_txn_root(&txn_root, txn, pool));

  /* Create our new directory under the transaction root, at the path
   * NEW_DIRECTORY.
   */
  INT_ERR(svn_fs_make_dir(txn_root, new_directory, pool));

  /* Commit the transaction, creating a new revision of the filesystem
   * which includes our added directory path.
   */
  err = svn_repos_fs_commit_txn(&conflict_str, repos,
                                &youngest_rev, txn, pool);
  if (! err)
    {
      /* No error?  Excellent!  Print a brief report of our success. */
      printf("Directory '%s' was successfully added as new revision "
             "'%ld'.\n", new_directory, youngest_rev);
    }
  else if (err->apr_err == SVN_ERR_FS_CONFLICT)
    {
      /* Uh-oh.  Our commit failed as the result of a conflict
       * (someone else seems to have made changes to the same area
       * of the filesystem that we tried to modify).  Print an error
       * message.
       */
      printf("A conflict occurred at path '%s' while attempting "
             "to add directory '%s' to the repository at '%s'.\n",
             conflict_str, new_directory, repos_path);
    }
  else
    {
      /* Some other error has occurred.  Print an error message. */
      printf("An error occurred while attempting to add directory '%s' "
             "to the repository at '%s'.\n",
             new_directory, repos_path);
    }

  /* Clear any pending error and report failure; otherwise report success. */
  INT_ERR(err);
  return 0;
}
Note that in Example 8.1, “Using the repository layer”, the code could just as easily have committed the transaction using svn_fs_commit_txn(). But the filesystem API knows nothing about the repository library's hook mechanism. If you want your Subversion repository to automatically perform some set of non-Subversion tasks every time you commit a transaction (e.g., sending an email that describes all the changes made in that transaction to your developer mailing list), you need to use the libsvn_repos-wrapped version of that function, which adds the hook triggering functionality: in this case, svn_repos_fs_commit_txn(). (For more information regarding Subversion's repository hooks, see Section 3.2, “Implementing Repository Hooks”.)
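The layering can be pictured with a small Python sketch. Both functions are hypothetical stand-ins that only mimic the relationship between the two layers: fs_commit() knows nothing about hooks, and repos_commit() wraps it and triggers a hook afterward. The dictionary used as a transaction and the hook signature are assumptions made for the example.

```python
def fs_commit(txn):
    """Hypothetical filesystem-level commit: creates the next revision
    and knows nothing about hooks."""
    return txn["base_rev"] + 1


def repos_commit(txn, post_commit_hook):
    """Hypothetical repository-level commit: delegates to the filesystem
    layer, then triggers the hook with the new revision number."""
    new_rev = fs_commit(txn)
    post_commit_hook(new_rev)
    return new_rev


# Stand-in for "send an email to the developer mailing list".
notifications = []

new_rev = repos_commit({"base_rev": 41}, notifications.append)
print(new_rev, notifications)  # 42 [42]
```

Calling fs_commit() directly would still produce revision 42, but nothing would be appended to notifications, which is the trade-off the paragraph above describes.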
Now let's switch languages. Example 8.2, “Using the repository layer with Python” is a sample program that uses Subversion's SWIG Python bindings to recursively crawl the youngest repository revision, and to print the various paths reached during the crawl.
Example 8.2. Using the repository layer with Python

#!/usr/bin/python

"""Crawl a repository, printing versioned object path names."""

import sys
import os.path
import svn.fs, svn.core, svn.repos

def crawl_filesystem_dir(root, directory):
    """Recursively crawl DIRECTORY under ROOT in the filesystem, and print
    all the paths at or below DIRECTORY."""

    # Print the name of this path.
    print(directory + "/")

    # Get the directory entries for DIRECTORY.
    entries = svn.fs.svn_fs_dir_entries(root, directory)

    # Loop over the entries.
    names = entries.keys()
    for name in names:
        # Calculate the entry's full path.
        full_path = directory + '/' + name

        # If the entry is a directory, recurse.  The recursion prints
        # the entry and all of its children.
        if svn.fs.svn_fs_is_dir(root, full_path):
            crawl_filesystem_dir(root, full_path)
        else:
            # Else it's a file, so print its path here.
            print(full_path)

def crawl_youngest(repos_path):
    """Open the repository at REPOS_PATH, and recursively crawl its
    youngest revision."""

    # Open the repository at REPOS_PATH, and get a reference to its
    # versioning filesystem.
    repos_obj = svn.repos.svn_repos_open(repos_path)
    fs_obj = svn.repos.svn_repos_fs(repos_obj)

    # Query the current youngest revision.
    youngest_rev = svn.fs.svn_fs_youngest_rev(fs_obj)

    # Open a root object representing the youngest (HEAD) revision.
    root_obj = svn.fs.svn_fs_revision_root(fs_obj, youngest_rev)

    # Do the recursive crawl.
    crawl_filesystem_dir(root_obj, "")

if __name__ == "__main__":
    # Check for sane usage.
    if len(sys.argv) != 2:
        sys.stderr.write("Usage: %s REPOS_PATH\n"
                         % (os.path.basename(sys.argv[0])))
        sys.exit(1)

    # Canonicalize the repository path.
    repos_path = svn.core.svn_path_canonicalize(sys.argv[1])

    # Do the real work.
    crawl_youngest(repos_path)
This same program in C would need to deal with APR's memory pool system. But Python handles memory usage automatically, and Subversion's Python bindings adhere to that convention. In C, you'd be working with custom datatypes (such as those provided by the APR library) for representing the hash of entries and the list of paths, but Python has hashes (called “dictionaries”) and lists as built-in datatypes, and it provides a rich collection of functions for operating on those types. So SWIG (with the help of some customizations in Subversion's language bindings layer) takes care of mapping those custom datatypes into the native datatypes of the target language. This provides a more intuitive interface for users of that language.
The Subversion Python bindings can be used for working copy operations, too. In the previous section of this chapter, we mentioned the libsvn_client interface and how it exists for the sole purpose of simplifying the process of writing a Subversion client. Example 8.3, “A Python status crawler” is a brief example of how that library can be accessed via the SWIG Python bindings to re-create a scaled-down version of the svn status command.
Example 8.3. A Python status crawler

#!/usr/bin/env python

"""Crawl a working copy directory, printing status information."""

import sys
import os.path
import getopt
import svn.core, svn.client, svn.wc

def generate_status_code(status):
    """Translate a status value into a single-character status code,
    using the same logic as the Subversion command-line client."""
    code_map = { svn.wc.svn_wc_status_none        : ' ',
                 svn.wc.svn_wc_status_normal      : ' ',
                 svn.wc.svn_wc_status_added       : 'A',
                 svn.wc.svn_wc_status_missing     : '!',
                 svn.wc.svn_wc_status_incomplete  : '!',
                 svn.wc.svn_wc_status_deleted     : 'D',
                 svn.wc.svn_wc_status_replaced    : 'R',
                 svn.wc.svn_wc_status_modified    : 'M',
                 svn.wc.svn_wc_status_conflicted  : 'C',
                 svn.wc.svn_wc_status_obstructed  : '~',
                 svn.wc.svn_wc_status_ignored     : 'I',
                 svn.wc.svn_wc_status_external    : 'X',
                 svn.wc.svn_wc_status_unversioned : '?',
               }
    return code_map.get(status, '?')

def do_status(wc_path, verbose, prefix):
    # Build a client context baton.
    ctx = svn.client.svn_client_create_context()

    def _status_callback(path, status):
        """A callback function for svn_client_status."""

        # Print the optional prefix, the text and property status
        # codes, and the path.
        text_status = generate_status_code(status.text_status)
        prop_status = generate_status_code(status.prop_status)
        prefix_text = ''
        if prefix is not None:
            prefix_text = prefix + " "
        print('%s%s%s %s' % (prefix_text, text_status, prop_status, path))

    # Do the status crawl, using _status_callback() as our callback function.
    revision = svn.core.svn_opt_revision_t()
    revision.type = svn.core.svn_opt_revision_head
    svn.client.svn_client_status2(wc_path, revision, _status_callback,
                                  svn.core.svn_depth_infinity, verbose,
                                  0, 0, 1, ctx)

def usage_and_exit(errorcode):
    """Print usage message, and exit with ERRORCODE."""
    stream = errorcode and sys.stderr or sys.stdout
    stream.write("""Usage: %s OPTIONS WC-PATH

Print working copy status, optionally with a bit of prefix text.

Options:
  --help, -h    : Show this usage message
  --prefix ARG  : Print ARG, followed by a space, before each line of output
  --verbose, -v : Show all statuses, even uninteresting ones
""" % (os.path.basename(sys.argv[0])))
    sys.exit(errorcode)

if __name__ == '__main__':
    # Parse command-line options.
    try:
        opts, args = getopt.getopt(sys.argv[1:], "hv",
                                   ["help", "prefix=", "verbose"])
    except getopt.GetoptError:
        usage_and_exit(1)
    verbose = 0
    prefix = None
    for opt, arg in opts:
        if opt in ("-h", "--help"):
            usage_and_exit(0)
        # Compare with ==; ("--prefix") is a plain string, so "in" would
        # perform a substring test.
        if opt == "--prefix":
            prefix = arg
        if opt in ("-v", "--verbose"):
            verbose = 1
    if len(args) != 1:
        usage_and_exit(2)

    # Canonicalize the working copy path.
    wc_path = svn.core.svn_path_canonicalize(args[0])

    # Do the real work.
    try:
        do_status(wc_path, verbose, prefix)
    except svn.core.SubversionException as e:
        sys.stderr.write("Error (%d): %s\n" % (e.apr_err, e.message))
        sys.exit(1)
As was the case in Example 8.2, “Using the repository layer with Python”, this program is pool-free and uses, for the most part, normal Python datatypes.
Warning: Run user-provided paths through svn_path_canonicalize() before passing them to other API functions.
Of particular interest to users of the Python flavor of Subversion's API is the implementation of callback functions. As previously mentioned, Subversion's C API makes liberal use of the callback function/baton paradigm. API functions which in C accept a function and baton pair only accept a callback function parameter in Python. How, then, does the caller pass arbitrary context information to the callback function? In Python, this is done by taking advantage of Python's scoping rules and default argument values. You can see this in action in Example 8.3, “A Python status crawler”. The svn_client_status2() function is given a callback function (_status_callback()) but no baton; _status_callback() gets access to the user-provided prefix string because that variable falls into the scope of the function automatically.
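Both techniques, scoping and default argument values, fit in one short sketch that runs without the Subversion bindings. The run_status_crawl() driver is hypothetical; it stands in for an API function that takes a callback but no baton, and it reports a fixed 'M' status for every path.

```python
def run_status_crawl(paths, callback):
    """Hypothetical driver that accepts only a callback, with no baton."""
    for path in paths:
        callback(path, "M")


def report(paths, prefix):
    lines = []

    # Technique 1: a nested function reads PREFIX and LINES from the
    # enclosing scope.
    def closure_callback(path, status):
        lines.append("%s %s %s" % (prefix, status, path))

    # Technique 2: a default argument value binds the context at the
    # moment the function is defined.
    def default_arg_callback(path, status, context=(prefix, lines)):
        context[1].append("%s %s %s" % (context[0], status, path))

    run_status_crawl(paths, closure_callback)
    run_status_crawl(paths, default_arg_callback)
    return lines


print(report(["a.txt"], "wc:"))  # ['wc: M a.txt', 'wc: M a.txt']
```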
[58] After all, Subversion uses Subversion's APIs, too.
[59] Subversion uses ANSI system calls and datatypes as much as possible.
[60] Neon and Berkeley DB are examples of such libraries.
[61] Redistributions in any form must be accompanied by information on how to obtain complete source code for the software that uses SVNKit and any accompanying software that uses the software that uses SVNKit. See http://svnkit.com/license.html for details.