RFP - git-svn server
~~~~~~~~~~~~~~~~~~~~
ok, so here's the basic plan.

All of the surrogate information that SVN stores (eg, path, revision
number) is cached in a relational database accessed via DBI or
equivalent.  All of the mined changesets will be stored in git.
Database dumps of the metadata can be stored in a specially named git
branch.

There are two parts to this server - the ripper, and the actual
server.

Once the repository is managed in git, there is no need for anyone
except the people accessing via Subversion to use SVN revision URLs &
numbers.  However all of these revisions will be numbered ad nauseam
by the git-svnserver daemon to support clients and developers used to
the SVN API.

The Ripper
~~~~~~~~~~
The ripper should be able to import from multiple sources: an SVN
repository itself, a git-svn clone, or its own output.

It indexes the surrogate Subversion information, and builds tables for
it.  Extra information will take the form of:

  - 'copied-from' properties that trace history back
  - 'svn:mergeinfo' properties that record cherry picking and merging
  - 'svk:merge' properties that record merging
  - revision properties such as the log message, author, etc.
  - other file and directory metadata (ie crap)
  - author details (e-mail address, full name, etc)

The basic philosophy is to read all metadata possible, and store it
using a reasonable set of database tables.

The ripper should send all file revisions it can compute to
git-fast-import, as well as built commits, as soon as it has them.

The ripper should be restartable.

If the ripper requires a local repository, this is fine - however it
should be able to understand svnsync or SVN::Mirror attributes and
allow the mirror to be treated as the original.

ripper milestones
-----------------
The idea is that completion of these milestones, as demonstrated by
reviewed test cases, is required for payment of the sponsorship
grants.

For each milestone, the test repository should be tested three times -
once using a direct SVN import, once on a git-svn clone of that
repository, and once on a ripper export.  Ideally, re-ripping a ripper
export should yield the same commit IDs; ie it should be able to
re-parse all of the breadcrumbs it leaves behind.

The ripper should be able to checkpoint its database to a separate
branch as an optional action, and load from that checkpoint.  Any
interactivity must save its results to the database, so that the
answers to the interactive questions are checkpointable.

 1. loading 'plain' history - a basic repository with no funny stuff,
    and supporting all of the checkpointing/loading described.
    Support for an "authors" map.

 2. determining the branch structure based on which paths files were
    added to.  Information in the database should override this,
    allowing the automatic detection to be trumped.

 3. representing the 'copy from' metadata in tables, and using this to
    work out which parent branch each branch was copied from, and to
    create branches accordingly.

 4. recording *all* attributes and properties in the database.
    Translation of 'easy' attributes like 'svn:executable' and
    'svn:ignore'.

 5. representing the 'svn:mergeinfo' and 'svk:merge' attributes as
    metadata, and using this to write merge commits (or add
    git-svn-merge: headers to the commit message), where the other
    UUID:path:revision is present in the database.

The Server
~~~~~~~~~~
The server stores information about what it is serving.  When it sees
that there are new git revisions, it fabricates the surrogate
information using sequences and the like.

The server should be fully capable of running on a repository which
has no surrogate information in it at all, though it must store all
arbitrary decisions it makes, and commit them, before issuing
responses.

The goal is to have the git-svnserver as a "drop-in" replacement for
an SVN server.

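As a rough illustration of "fabricating the surrogate information
using sequences", the sketch below hands out SVN revision numbers to
git commits and stores each assignment before returning it.  It is
only a sketch under assumptions: the RFP names DBI, and Python's
sqlite3 module is swapped in here for brevity; the table name
'revmap' and the functions open_revmap() and revision_for() are
hypothetical and not part of any existing git-svnserver code.

```python
import sqlite3


def open_revmap(path=":memory:"):
    """Open (or create) the hypothetical git-commit -> SVN-revision table."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS revmap ("
        " svn_rev INTEGER PRIMARY KEY,"        # next free number on insert
        " git_commit TEXT NOT NULL UNIQUE)"    # one revision per commit
    )
    db.commit()
    return db


def revision_for(db, commit_id):
    """Return the SVN revision number for a git commit.

    A commit that has been seen before keeps its stored number.  A new
    commit gets the next number, and that decision is committed to the
    database before the number is returned to the caller.
    """
    row = db.execute(
        "SELECT svn_rev FROM revmap WHERE git_commit = ?", (commit_id,)
    ).fetchone()
    if row is not None:
        return row[0]
    with db:  # commits on success, rolls back on error
        cur = db.execute(
            "INSERT INTO revmap (git_commit) VALUES (?)", (commit_id,)
        )
    return cur.lastrowid


if __name__ == "__main__":
    db = open_revmap()
    # Placeholder commit IDs, oldest first.
    for commit in ("a" * 40, "b" * 40, "a" * 40):
        print(commit[:8], "-> r%d" % revision_for(db, commit))
```

Asking twice for the same commit returns the same number, which is
the property a client holding an old checkout relies on.  A real
server would also need to number commits in a stable order per branch
and serialise concurrent writers; neither is attempted here.
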
git-svnserver Interface
-----------------------
There are two SVN protocols - the ra_svn protocol and the HTTP/WebDAV
protocol.

The SVN WebDAV protocol is mostly custom reports layered in the
"anything whores^Wgoes" section of the WebDAV protocol.  So, there
would be a task to collect a sample WebDAV session, and build a test
suite that demonstrates that the new git-svnserver serves the same
data as the real SVN.

Speaking the ra_svn protocol directly would probably give higher
performance, but is potentially less desirable because it is insecure
for commits.

It might even be possible to have a svn+ssh:// interface using a
suitably crafted dummy "svn" program.

Merge attribute compatibility with SVN 1.5 and SVK is considered
desirable.

The server should re-use logic from the ripper, so that incoming SVN
commits are automatically "ripped" into good git commits.

server milestones
-----------------
 1. [optional] triage and report on suitability of the three listed
    interfaces.

For each of the following milestones, the test case should use the
real SVN client on one of the above ripped repositories.  Tests should
include those where a checkout is switched using
'svn switch --relocate' from the original SVN checkout to the
ripped+served one.

 2. support 'svn ls' and 'svn pl' via the interface

 4. support 'svn co' and 'svn pg'

 5. support 'svn diff'

 6. support 'svn update' and/or 'svn switch'

 7. support 'svn commit', including committing properties

 8. test that all other svn operations (eg, ps --revprop) are refused
    gracefully

Features out of scope
~~~~~~~~~~~~~~~~~~~~~
For now svn:externals will NOT be represented using git-submodule.