This text is a work in progress—highly subject to change—and may not accurately describe any released version of the Apache™ Subversion® software. Bookmarking or otherwise referring others to this page is probably not such a smart idea. Please visit http://www.svnbook.com/ for stable versions of this book.

Getting Data into Your Repository

You can get new files into your Subversion repository in two ways: svn import and svn add. We discuss svn import here and cover svn add later in this chapter, when we review a typical day with Subversion.

Importing Files and Directories

The svn import command is a quick way to copy an unversioned tree of files into a repository, creating intermediate directories as necessary. svn import doesn't require a working copy, and your files are immediately committed to the repository. You typically use this when you have an existing tree of files that you want to begin tracking in your Subversion repository. For example:

$ svn import /path/to/mytree \
             http://svn.example.com/svn/repo/some/project \
             -m "Initial import"
Adding         mytree/foo.c
Adding         mytree/bar.c
Adding         mytree/subdir
Adding         mytree/subdir/quux.h

Committed revision 1.
$

The previous example copied the contents of the local directory mytree into the directory some/project in the repository. Note that you didn't have to create that new directory first—svn import does that for you. Immediately after the commit, you can see your data in the repository:

$ svn list http://svn.example.com/svn/repo/some/project
bar.c
foo.c
subdir/
$

Note that after the import is finished, the original local directory is not converted into a working copy. To begin working on that data in a versioned fashion, you still need to create a fresh working copy of that tree with svn checkout.
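
The import-then-checkout sequence can be sketched as a small shell function. This is only an illustration: import_then_checkout is a hypothetical helper name, not a Subversion command, and the log message is simply the one used in the example above.

```shell
# Hypothetical helper: import an unversioned tree, then check out a
# separate working copy of the imported data.
#   $1 = local unversioned tree
#   $2 = repository URL to import into
#   $3 = path of the new working copy to create
import_then_checkout() {
  # The checkout runs only if the import succeeded.
  svn import "$1" "$2" -m "Initial import" &&
    svn checkout "$2" "$3"
}
```

With the placeholder URL from the example above, a call might look like import_then_checkout /path/to/mytree http://svn.example.com/svn/repo/some/project mytree-wc, where mytree-wc is an assumed name for the new working copy. The original mytree directory is left unversioned.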

Recommended Repository Layout

Subversion gives you great flexibility in how you arrange your data. Because it simply versions directories and files, and because it ascribes no particular meaning to any of those objects, you may arrange the data in your repository in any way that you choose. Unfortunately, this flexibility also means that it's easy to get lost without a roadmap as you navigate different Subversion repositories, each of which may arrange its data in a completely different and unpredictable way.

To counteract this confusion, we recommend that you follow a repository layout convention (established long ago, in the nascency of the Subversion project itself) in which a handful of strategically named Subversion repository directories convey valuable meaning about the data they hold. Most projects have a recognizable main line, or trunk, of development; some branches, which are divergent copies of development lines; and some tags, which are named, stable snapshots of a particular line of development. So we first recommend that each project have a recognizable project root in the repository, a directory under which all of the versioned information for that project—and only that project—lives. Secondly, we suggest that each project root contain a trunk subdirectory for the main development line, a branches subdirectory in which specific branches (or collections of branches) will be created, and a tags subdirectory in which specific tags (or collections of tags) will be created. Of course, if a repository houses only a single project, the root of the repository can serve as the project root, too.

Here are some examples:

$ svn list file:///var/svn/single-project-repo
branches/
tags/
trunk/
$ svn list file:///var/svn/multi-project-repo
project-A/
project-B/
$ svn list file:///var/svn/multi-project-repo/project-A
branches/
tags/
trunk/
$

We talk much more about tags and branches in Chapter 4, Branching and Merging. For details and some advice on how to set up repositories when you have multiple projects, see the section called “Repository Layout”. Finally, we discuss project roots more in the section called “Planning Your Repository Organization”.
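
One possible way to set up this layout for a new project is to build an empty skeleton locally and import it. The sketch below assumes a POSIX shell with mktemp available; make_layout_skeleton and import_layout are hypothetical helper names, and the log message is an arbitrary choice.

```shell
# Create the conventional trunk/branches/tags skeleton under directory $1.
make_layout_skeleton() {
  mkdir -p -- "$1/trunk" "$1/branches" "$1/tags"
}

# Hypothetical helper: import an empty skeleton into the project root
# URL given as $1 (for example, a project-A directory in a shared repository).
import_layout() {
  skeleton=$(mktemp -d) || return 1
  make_layout_skeleton "$skeleton" &&
    svn import "$skeleton" "$1" -m "Create trunk, branches, and tags"
  rc=$?
  # Remove the temporary skeleton whether or not the import succeeded.
  rm -rf -- "$skeleton"
  return "$rc"
}
```

For instance, import_layout file:///var/svn/multi-project-repo/project-A would create the three directories under project-A in a single commit.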

What's In a Name?

Subversion tries hard not to limit the type of data you can place under version control. The contents of files and property values are stored and transmitted as binary data, and the section called “File Content Type” tells you how to give Subversion a hint that textual operations don't make sense for a particular file. There are a few places, however, where Subversion places restrictions on information it stores.

Subversion internally handles certain bits of data—for example, property names, pathnames, and log messages—as UTF-8-encoded Unicode. This is not to say that all your interactions with Subversion must involve UTF-8, though. As a general rule, Subversion clients will gracefully and transparently handle conversions between UTF-8 and the encoding system in use on your computer, if such a conversion can meaningfully be done (which is the case for most common encodings in use today).

In WebDAV exchanges and older versions of some of Subversion's administrative files, paths are used as XML attribute values, and property names in XML tag names. This means that pathnames can contain only legal XML (1.0) characters, and property names are further limited to ASCII characters. Subversion also prohibits TAB, CR, and LF characters in pathnames to prevent paths from being broken up in diffs or in the output of commands such as svn log or svn status.
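
If you want to catch such names before importing a tree, you could scan it first. The following is a minimal sketch rather than anything Subversion provides: check_control_chars is a hypothetical helper name, and it assumes a POSIX shell and find. It checks only for the three characters named above.

```shell
# Hypothetical helper: succeed only if no file or directory name under $1
# contains a TAB, CR, or LF character.
check_control_chars() {
  tab=$(printf '\t')
  cr=$(printf '\r')
  # A literal newline; command substitution would strip it.
  nl='
'
  bad=$(find "$1" \( -name "*${tab}*" -o -name "*${cr}*" -o -name "*${nl}*" \) -print)
  if [ -n "$bad" ]; then
    printf 'Names containing TAB, CR, or LF:\n%s\n' "$bad" >&2
    return 1
  fi
}
```

Running check_control_chars /path/to/mytree before svn import reports any offending names on standard error and returns a nonzero status.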

While it may seem like a lot to remember, in practice these limitations are rarely a problem. As long as your locale settings are compatible with UTF-8 and you don't use control characters in pathnames, you should have no trouble communicating with Subversion. The command-line client also helps: it automatically escapes illegal path characters in the URLs you type, as needed, to create legally correct versions for internal use.

Warning

Of course, when it comes to choosing valid pathnames, Subversion isn't the only limiting factor. Teams using multiple operating systems need to consider the limitations placed on pathnames by those operating systems, too. For example, while Windows disallows the use of colon characters in file names, a user on a Linux system can very easily add such a file to version control, resulting in a dataset that can no longer be checked out on Windows. Likewise, adding to a single directory multiple files whose names differ only in letter case will cause problems for users checking out working copies onto case-insensitive filesystems. Some broad awareness of the limitations introduced by different operating systems and filesystems is therefore recommended.
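
The two problems mentioned in this warning can be checked for with a short script. This is a rough sketch under stated assumptions: find_case_collisions and find_colon_names are hypothetical helper names, both read one path per line from standard input, and the case folding relies on tr, so it may not handle every non-ASCII letter.

```shell
# Read one path per line; print (in lowercase) each path that two or more
# input lines map to once letter case is ignored.
find_case_collisions() {
  tr '[:upper:]' '[:lower:]' | sort | uniq -d
}

# Read one path per line; print the paths that contain a colon.
find_colon_names() {
  grep ':'
}
```

For example, find mytree -print | find_case_collisions lists the paths that would collide on a case-insensitive filesystem, and find mytree -print | find_colon_names lists the names that Windows would reject because of a colon.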