Gw Requirements
The requirements (user stories) for the 2005 project are maintained in JIRA at
The following pages contain a few lines to explain the basic idea:
Below are the requirements for the 2004 precursor project; these have been partially implemented.
Storage Management
Advisor/Customer:
Eelco Dolstra
Most current Wikis are based on RCS or CVS. This has some
limitations, the main one being that files (Wiki pages), not
directories, are versioned. This means that:
- It is not possible to rename or move files, refactor the directory structure, etc.
- Wikis generally do not support a recursive directory structure.
- Commits are per-file. So changes to sets of files are logically distinct, i.e., not atomic. So if you want to make related changes to a bunch of pages, the Wiki may be in an inconsistent state while you're making the changes, and it's hard to undo (back out) the changes.
Since Subversion solves all these problems, we will use it as the
storage back-end of GW. An entire Wiki will be stored in a single
Subversion repository.
The goal of the storage team is to implement the Subversion storage
back-end. I envision the following milestone releases (subject to
change):
Release 0 - initial GW
On startup GW does a checkout of the repository. Edits happen on the
working copy. However, there are no commits, so all changes to the
Wiki are lost when the server is restarted. There is no locking of
any kind.
Release 1 - persistent storage
Save operations should cause a commit to happen. When a user edits a new
page, the page should be added to the repository first, of course. Still
no locking, though.
R2 - multi file edits ("transactions"):
When a user starts editing a file, create a new working copy.
Edits happen on this per-session working copy and are not visible to
other users. So a "Save" causes the per-session working copy to be
modified, but nothing is committed.
There should now be a "Commit" operation that causes the entire
per-session working copy to be committed. After this the per-session
working copy can be deleted, and the global working copy should be
updated. This makes the changes globally visible.
Merge conflicts are ignored for now.
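The intended flow of per-session edits can be sketched with a small in-memory model. This is only an illustration under stated assumptions: the `Wiki` class and its method names are hypothetical, plain dictionaries stand in for Subversion working copies, and merge conflicts are ignored, as in this release.

```python
class Wiki:
    """Toy model: dicts stand in for Subversion working copies (assumption)."""

    def __init__(self, pages):
        self.global_copy = dict(pages)   # what every visitor sees
        self.sessions = {}               # session id -> private working copy

    def begin_edit(self, session_id):
        # Starting an edit clones the global working copy for this session.
        self.sessions.setdefault(session_id, dict(self.global_copy))

    def save(self, session_id, path, text):
        # "Save" only touches the per-session copy; nothing is committed.
        self.sessions[session_id][path] = text

    def view(self, path, session_id=None):
        # A user in a session sees their own copy; others see the global one.
        copy = self.sessions.get(session_id, self.global_copy)
        return copy[path]

    def commit(self, session_id):
        # "Commit" publishes the whole session copy at once, then discards it.
        # Conflicts with other sessions are not detected in this sketch.
        self.global_copy = self.sessions.pop(session_id)


wiki = Wiki({"/Home": "welcome"})
wiki.begin_edit("alice")
wiki.save("alice", "/Home", "welcome, all")
wiki.save("alice", "/News", "first post")
before = wiki.view("/Home")   # other users still see the old text
wiki.commit("alice")
after = wiki.view("/Home")    # the whole set of changes is now visible
```

Because both saved pages become visible in a single step, readers never see /Home updated without /News, which is the point of the "transaction".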
R3 - merge conflicts
If on commit a merge conflict occurs, the per-session working copy
should be retained, the user should be presented with a list of
conflicting pages (showing the conflicts in those pages) and be
allowed to edit those pages. Most of the work here should be done by
the versioning UI team, but the storage team has to provide the
supporting infrastructure.
R4 - use RA layer
Using working copies is inefficient. For instance, to start an edit
session, we have to clone the entire working copy. This doesn't scale
well. So instead of using Subversion's working copy (WC) layer, we
should use the remote access (RA) layer, which allows us to fetch and
edit just those files that are involved in an operation.
It's possible that the high-level Subversion bindings only support WC
operations, not RA operations. So additional C/Java bindings might
have to be created.
R5 - caching for RA operations
Using the RA layer is scalable, but it's also slow. For instance, to
view a page, we have to fetch it from the Subversion server every time
(while before R4, we could just get it from the working copy). So RA
fetches should be cached.
Of course, the cache should be properly invalidated on edit
operations.
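A minimal sketch of such a cache, assuming the RA layer can be reduced to a fetch function and a store function. Both are hypothetical stand-ins here (a dictionary plays the repository); no real Subversion binding is used.

```python
class CachedStore:
    """Wraps a fetch function with a cache that is invalidated on edits."""

    def __init__(self, fetch, store):
        self._fetch = fetch      # hypothetical stand-in for an RA-layer fetch
        self._store = store      # hypothetical stand-in for an RA-layer commit
        self._cache = {}
        self.fetch_count = 0     # counts real fetches, for illustration

    def read(self, path):
        if path not in self._cache:
            self.fetch_count += 1
            self._cache[path] = self._fetch(path)
        return self._cache[path]

    def write(self, path, text):
        self._store(path, text)
        self._cache.pop(path, None)   # invalidate so the next read refetches


backend = {"/Home": "v1"}        # a dict plays the role of the repository
store = CachedStore(backend.__getitem__, backend.__setitem__)
first = store.read("/Home")
second = store.read("/Home")     # served from the cache
store.write("/Home", "v2")
third = store.read("/Home")      # refetched after invalidation
```

This only covers edits that pass through the same server process; changes committed to the repository by other clients would need a separate invalidation mechanism.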
Additional complications
The storage layer is quite fundamental. The entire Wiki depends on
it. The storage team should design a simple but sufficient interface
to the storage layer that other teams can develop against. In
particular, the work of the versioning UI team is closely related to that
of the storage team. For instance, for merge support it is necessary
that the storage layer offers to the upper layers notification that
there is a merge conflict, a way to query what the conflicts are, and
a way to clear the conflict situation. Close communication and
frequent syncing with the versioning UI team is probably required.
Big bang integration is not an option.
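As a starting point for discussion between the two teams, the three merge-related operations (notification, query, clear) could look roughly like this. All names are hypothetical, and the in-memory implementation detects conflicts per page rather than per line, which is coarser than what Subversion does.

```python
class MergeConflict(Exception):
    """Raised by commit() to notify upper layers of a conflict."""


class InMemoryStorage:
    def __init__(self, pages):
        self.head = dict(pages)   # committed page texts
        self.sessions = {}        # session id -> {"base": ..., "edits": ...}

    def begin(self, session_id):
        self.sessions[session_id] = {"base": dict(self.head), "edits": {}}

    def save(self, session_id, path, text):
        self.sessions[session_id]["edits"][path] = text

    def conflicts(self, session_id):
        # A page conflicts if it was edited in this session and also
        # changed in head since the session began.
        s = self.sessions[session_id]
        return sorted(p for p in s["edits"]
                      if self.head.get(p) != s["base"].get(p))

    def resolve(self, session_id, path, text):
        # Clear the conflict: accept head as the new base for this page
        # and record the user's resolved text.
        s = self.sessions[session_id]
        s["base"][path] = self.head.get(path)
        s["edits"][path] = text

    def commit(self, session_id):
        if self.conflicts(session_id):
            raise MergeConflict(session_id)   # the session is retained
        self.head.update(self.sessions.pop(session_id)["edits"])


storage = InMemoryStorage({"/Home": "v1"})
storage.begin("alice")
storage.begin("bob")
storage.save("alice", "/Home", "alice's text")
storage.save("bob", "/Home", "bob's text")
storage.commit("alice")
try:
    storage.commit("bob")
    notified = False
except MergeConflict:
    notified = True
pending = storage.conflicts("bob")
storage.resolve("bob", "/Home", "merged text")
storage.commit("bob")
```

The versioning UI would catch the exception, show the pages returned by `conflicts`, and call `resolve` once the user has edited them.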
Maybe R4/R5 aren't such a good idea since replicating a lot of the
functionality in the WC layer (such as support for moving files) is a
lot of work. However, something should be done about the scalability
problem in R3. Alternatives might be to clone working copies using
hard or symbolic links, lazily cloning the working copy, and so on.
Versioning User Interface
Advisor/Customer:
Martin Bravenboer
The corner-stone of a Wiki is the ability to edit the data and
structure of a Wiki online in a web browser from any workstation. The
web browser is an extremely distributed and concurrent interface for
editing the data of the Wiki. Hence, the generalized Wiki requires an
appropriate user interface to the underlying version control system
(Subversion). The Versioning User Interface project will develop a
web-based user interface for viewing and editing the Subversion
repository of the Generalized Wiki.
Editing Files
Files contain the data of a Wiki. Files can be edited, moved and
copied like ordinary files in a file system. Files can be organized in
directories. GW supports binary files, so there is no need for a
separate notion of attachments as in many existing Wiki implementations.
Binary files are under version control, since they are just files in the Wiki.
- Edit, preview and commit in a single atomic action.
- Editing a file is only supported for the head revision. This restriction will not be repeated in the description of all features.
- Allow a commit message for a commit.
- Edit transactions:
- Use a session or author specific working copy
- In cooperation with storage group
- See also "Status Overview".
- Move a file
- Change all references
- In cooperation with query and search
- Revert the changes of a file to a certain previous revision.
- Show working copy status if the current user is in an editing transaction.
- Action for updating a file with the latest server revision (only in editing transaction).
- Resolving conflicts: three buttons: cancel edit, save, save and resolve.
- Create and remove a symbolic link to a file.
Editing Directories
GW has no notions of webs or subwebs: it is entirely hierarchic by
allowing directories. Hence, it allows arbitrary levels of
nesting. Furthermore, directories can be moved to different levels and
are under version control.
- Implement a directory listing. A URL that ends with a / should show the listing of that directory (if the user has the rights for this). The / URLs should not forward to a default file (such as index).
- Optional: show access rights for files and subdirectories.
- Directories are versioned as well. Implement support for viewing the listing of a specific revision of a directory. Don't show edit actions in that view.
- Create a subdirectory (head revision only)
- Create a file in this directory.
- Show working copy status if the current user is in an editing transaction.
- Action for updating a directory with the latest server revision (only in editing transaction).
Version Information of Files and Directories
The user-interface must show versioning information of an item.
- Show revision, last change date and last author at item page itself.
- Action to view the history of a file (log)
- Commit log: author, date and time, commit message.
- Provide link to the actual commit to see the files that have been changed in the same commit.
- Show the copy/move history in the Wiki file hierarchy.
- Action to view the blame (praise) of a Wiki topic: for each line show the author responsible for it.
- Action to view differences between revisions of a file and directory. The diffs might be visualized in several ways.
Status Overview
During an editing transaction the user has their own working copy. The
versioning user interface must implement tools for viewing the current
status of this working copy. This status overview makes clear what has
been changed and supports some additional actions for preparing a
commit.
- Show 'first column' information of SVN status:
- Possible changes:
- Modified
- Added
- Deleted
- Conflicted
- Merged
- ... (see SVN documentation)
- Use attractive icon set
- Show information on properties
- Conflicted
- Modified
- Use custom visualizations for known properties (access properties)
- Optional, status overview with -u:
- Show if newer revision exists on server
- Show working and last-committed revision.
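A minimal sketch of how the status overview might group items by their first-column code. The input format (pairs of code and path) and the function name are assumptions; the label table covers only a subset of the codes, so the SVN documentation remains the reference for the full list.

```python
# Subset of the first-column codes of `svn status`; see the SVN
# documentation for the complete list.
STATUS_LABELS = {
    "A": "Added",
    "C": "Conflicted",
    "D": "Deleted",
    "M": "Modified",
    "?": "Unversioned",
    "!": "Missing",
}


def summarize(entries):
    """Group (code, path) pairs by status label; unknown codes go to 'Other'."""
    overview = {}
    for code, path in entries:
        label = STATUS_LABELS.get(code, "Other")
        overview.setdefault(label, []).append(path)
    return {label: sorted(paths) for label, paths in overview.items()}


# Hypothetical status of a per-session working copy.
entries = [("M", "/Home"), ("A", "/News"), ("C", "/Storage"), ("M", "/About")]
overview = summarize(entries)
```

Each label can then be mapped to an icon and to the actions that make sense for it, for example "edit to resolve" for conflicted files.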
Actions from status overview:
- Update a file from the server.
- Edit a file in order to resolve the conflict of a file.
- Revert a file to the state at the server.
Integrate user authentication
The user authentication group will implement methods for
authenticating users and determining their rights. This work is highly
related to the versioning user interface, since the versioning user
interface involves many actions that require user authentication. The
integration of user authentication is a cross-cutting feature and
therefore it is currently not described in detail.
Global information and statistics
- Recent changes
- For specific subdirectories (webs) only
- HTML Commit log at the Wiki
- RSS and Atom feeds for feed aggregators
- Support Live Bookmarks of Firefox
- In cooperation with query and search group.
- Statistics:
- Author statistics (all-time and per month)
- File views (based on logs, database, or existing statistics system)
- HotSpots: frequently modified files
Local information and statistics
- Show referred-by
- HotReferrers
- In cooperation with query and search team.
- Who 'owns' this file: author statistics.
User Management
Advisor/Customer:
Eelco Dolstra
GW should support the creation of users for the following reasons:
- To provide access control - not everybody might be allowed to edit or view some pages.
- To provide per-user customisation: preferences, style sheets, etc. (user management should be restricted to provide pointers to the preferred resources; stylesheets are in the rendering and templates package)
- To provide edit traceability - who edited what?
Requirements
Make it possible to create and edit users. The information maintained
for each user should be flexible (so that arbitrary properties can be
added later on), but should include at least the full name and e-mail
address.
Make it possible to create and edit ACLs (access control lists) that
define who has the right to do what.
The most important access rights that must be implemented are read and
write access. A more interesting access right is who has the ability
to modify the access rights (for instance, in the Unix filesystem only
the owner of a file can change the access rights, but this is often
very limiting). (So maybe the concept of owner should be adapted as
well; not just the person who last edited the file. -- EelcoVisser)
ACLs should be per-page; they should not be global to the Wiki. I.e.,
different parts of the Wiki can have different access controls.
It should be possible to define recursive ACLs, e.g., if we make /A
readable to user Foo, then user Foo should also be able to read /A/B.
Make it possible to create and edit groups (sets of users that can be
referred to in ACLs).
It must not be necessary to login to access/edit the Wiki if allowed
by the ACLs (so there should be an "anonymous" user). And login should
be prompted only when trying to access a restricted resource.
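The access check implied by these requirements (per-page ACLs, recursion to subpages, groups, and an anonymous user) could be sketched as follows. The data layout and all names are assumptions for illustration, and the model is purely additive: a right granted on /A cannot be withdrawn on /A/B.

```python
ANONYMOUS = "anonymous"


def ancestors(path):
    """Yield path, then each parent directory up to the root: /A/B, /A, /."""
    path = path.rstrip("/") or "/"
    while True:
        yield path
        if path == "/":
            return
        path = path.rsplit("/", 1)[0] or "/"


def allowed(user, right, path, acls, groups):
    """True if `user` holds `right` on `path` or, recursively, on an ancestor.

    acls:   path -> right -> set of user or group names (hypothetical layout)
    groups: group name -> set of user names
    """
    # Everybody also holds whatever has been granted to the anonymous user.
    names = {user, ANONYMOUS}
    names |= {g for g, members in groups.items() if user in members}
    return any(names & acls.get(p, {}).get(right, set())
               for p in ancestors(path))


acls = {
    "/A": {"read": {"Foo"}},
    "/Public": {"read": {ANONYMOUS}, "write": {"editors"}},
}
groups = {"editors": {"Bar"}}

foo_reads_subpage = allowed("Foo", "read", "/A/B", acls, groups)
anon_reads_private = allowed(ANONYMOUS, "read", "/A/B", acls, groups)
anon_reads_public = allowed(ANONYMOUS, "read", "/Public/Page", acls, groups)
bar_writes_public = allowed("Bar", "write", "/Public/Page", acls, groups)
```

A login prompt would only be needed when the check fails for the anonymous user, which matches the requirement that login is requested only for restricted resources.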
It would be interesting to be able to form subwikis for which user management
is done locally in that subwiki. --EelcoVisser
(Maybe:) Account creation should use an e-mail confirmation scheme to
verify that the user supplied a valid e-mail address.
I envision the following milestone releases (subject to change):
Release 1 - user creation
Allow users to be created. User account information should probably
just be stored in the Wiki and edited through the normal Wiki
mechanisms, for instance, through a Wiki page /Users/Foo for a user
named Foo. This makes it unnecessary to create special forms for this
purpose.
Release 2 - basic access control
Allow simple, non-recursive ACLs to be defined per-page. Probably
setting the ACLs for some page /X should happen by editing /X/ACL.
(Note that it may be problematic to let /X be a page and a directory
at the same time, unless we establish some kind of implicit extension
scheme. But we (might) also want to use GW to edit arbitrary
Subversion repositories. -- EelcoVisser)
Use the ACLs to check access to pages.
Release 3 - groups
Allow groups to be created and used in ACLs.
Release 4 - recursive ACLs
Add recursive ACLs.
Release 5 - efficiency?
Perhaps loading and parsing the ACLs for each page access is too slow,
so it may be necessary to add caching.
Difficult issues
It's not quite clear how ACLs and versioning should interact. For
instance, it will be possible to view old revisions of pages. But
should we then use the ACLs in the old revision of the repository, or
the current ACLs? The latter is necessary to withdraw or grant access
to old revisions of pages. On the other hand, it is not clear how
current ACLs should be applied to pages that have been deleted in the
current revision.
Rendering and Template
Advisor/Customer:
Eelco Visser
An essential component of any wiki is the rendering of text in a
simple wiki markup language to full blown HTML files. Following the
tradition of the original c2 wiki, this rendering phase is
regular-expression based. As is known from programming languages, this
leads to badly designed languages when adding new features.
- line by line processing makes it awkward to write well structured markup
- rendering operations interfere with each other, e.g., a wiki word in a forced link
The TWiki clone of wiki extends the basic wiki markup scheme with a
server-side template mechanism, which allows configuration of page
layout. The template mechanism is 'proprietary' and ad-hoc. Being
server-side it cannot be maintained via the wiki itself.
GW should support
- A wiki markup language based on context-free parsing
- Extensibility to new markup languages through an XML-based intermediate format, which separates parsing from rendering
- Stylesheets for rendering of wiki markup and composition of pages
The following ingredients are necessary to achieve this.
GWML: XML Schema for the GW Intermediate Representation
Design an XML schema for structured representation of wiki markup. The
first version should support all the markup in the TWiki
TextFormattingRules. The second version should also incorporate TWiki
variables.
WikiML would be a logical name, but is already claimed. See also
http://wiki.wikiml.org/ for inspiration.
Parsing TWiki Markup
Parse the TWiki markup. Since we want to migrate several existing
wikis (ST Wiki, program-transformation.org, stratego-language.org) to
the new wiki, the TWiki format should be supported. The parser should
produce an XML document in DOM format to be passed on to an XSLT
transformer.
Rendering: GWML representation to HTML in XSLT
When viewing a page, its GWML representation is converted to HTML by
an XSLT stylesheet (or wiki formatting template). The stylesheet is
obtained from the wiki.
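The separation between parsing and rendering can be illustrated with a small sketch. Everything here is an assumption made for illustration: the element names (`page`, `heading`, `list`, `item`, `para`) are not a proposed GWML schema, the parser handles only TWiki headings and bullets and is line-based purely to keep the example short, and a plain Python function stands in for the XSLT stylesheet.

```python
import re
import xml.etree.ElementTree as ET

HEADING = re.compile(r"^---(\++) (.*)$")   # TWiki heading, e.g. "---++ Title"
BULLET = re.compile(r"^   \* (.*)$")       # TWiki bullet: three spaces, "* "


def parse(text):
    """Parse a small TWiki subset into a hypothetical GWML element tree."""
    page = ET.Element("page")
    current_list = None
    for line in text.splitlines():
        heading, bullet = HEADING.match(line), BULLET.match(line)
        if bullet:
            if current_list is None:
                current_list = ET.SubElement(page, "list")
            ET.SubElement(current_list, "item").text = bullet.group(1)
            continue
        current_list = None   # any other line ends the current list
        if heading:
            level = str(len(heading.group(1)))
            ET.SubElement(page, "heading", level=level).text = heading.group(2)
        elif line.strip():
            ET.SubElement(page, "para").text = line.strip()
    return page


def render(page):
    """Render the tree to HTML; stands in for the XSLT stylesheet."""
    body = ET.Element("div")
    for node in page:
        if node.tag == "heading":
            ET.SubElement(body, "h" + node.get("level")).text = node.text
        elif node.tag == "list":
            ul = ET.SubElement(body, "ul")
            for item in node:
                ET.SubElement(ul, "li").text = item.text
        else:
            ET.SubElement(body, "p").text = node.text
    return ET.tostring(body, encoding="unicode")


source = "---+ Storage\nUses <Subversion>.\n   * atomic\n   * versioned"
html = render(parse(source))
```

Because the renderer only sees the tree, a parser for another markup language can reuse it unchanged as long as it produces the same elements.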
Cascading Stylesheet: pretty layout
The amount of formatting in the XSLT stylesheet can be limited by
separating logical layout and graphical layout. The latter should be
achieved using cascading stylesheets. The CSS for viewing pages should
also be part of the wiki. A good set of conventions for element
classes and nice initial graphical layout should be designed.
Templates provide context for generated HTML
In addition to rendering the contents of a page to HTML, templates can
add context information such as navigation bars, headers, footers, and
menus. Define templates for viewing, editing, previewing, and saving
pages. See the FlexibleSkin templates for inspiration. For example,
in the ST wiki the WebContents page is included in the
navigation bar on the left.
Finding Preferences
Rather than just providing a single template, subtrees or specific
pages of the wiki may adopt a different presentation style. For that
purpose it should be possible to override the default templates with
new templates. There should be a scheme for finding the applicable
templates. Also it should be possible to reuse (inherit) as much code
as necessary from existing templates.
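One possible lookup scheme is to search from the page's own directory upwards and take the nearest template. The sketch below assumes a hypothetical convention in which templates live in a `Templates` subdirectory; neither that name nor the function is part of the requirements.

```python
def find_template(page_path, name, templates):
    """Return the nearest `name` template at or above the page's directory.

    templates: set of template paths that exist in the wiki (assumption).
    """
    directory = page_path.rsplit("/", 1)[0]
    while True:
        candidate = directory + "/Templates/" + name
        if candidate in templates:
            return candidate
        if directory == "":
            return None   # reached the root without finding a template
        directory = directory.rsplit("/", 1)[0]


templates = {"/Templates/view", "/Templates/edit", "/Projects/Templates/view"}
local = find_template("/Projects/GW/Storage", "view", templates)
inherited = find_template("/Projects/GW/Storage", "edit", templates)
missing = find_template("/Home", "print", templates)
```

The /Projects subtree overrides only the view template here and inherits the edit template from the root, which is the kind of reuse the scheme is meant to allow.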
Type-Based Rendering
A wiki hierarchy does not necessarily only consist of proper wiki
files. Rather it can also contain pure HTML files, files in other
document formats such as PDF or PostScript, files in data formats such
as XML, BibTeX, plain text, CSS. Each of these file types should be
'viewed' and 'edited' using the appropriate wiki operations. This
requires a rendering function for each type and thus the rendering
engine should be extensible with new rendering functions. There
should be a standard interface for such rendering functions.
(Note that 'rendering' a pdf file may simply mean providing a link to
the file, or to copy it to the browser untouched.)
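A standard interface for rendering functions could be as simple as a registry keyed by file type. The sketch below is an assumption-laden illustration: it uses the file extension as the type, and the decorator, the registry, and the two renderers are all hypothetical.

```python
import html

RENDERERS = {}   # file extension -> rendering function


def renderer(*extensions):
    """Register a rendering function for the given file extensions."""
    def register(func):
        for ext in extensions:
            RENDERERS[ext] = func
        return func
    return register


@renderer("txt", "css", "bib")
def render_plain(path, data):
    # Plain-text types are shown verbatim, with HTML characters escaped.
    return "<pre>" + html.escape(data.decode("utf-8")) + "</pre>"


@renderer("pdf", "ps")
def render_link(path, data):
    # "Rendering" a PDF just means providing a link to the file.
    return '<a href="%s">%s</a>' % (html.escape(path), html.escape(path))


def render(path, data):
    """Dispatch on the extension; unknown types fall back to a link."""
    ext = path.rsplit(".", 1)[-1].lower() if "." in path else ""
    return RENDERERS.get(ext, render_link)(path, data)


text_view = render("/notes.txt", b"a < b")
pdf_view = render("/paper.pdf", b"%PDF")
```

New file types are supported by registering one more function, without touching the dispatch code.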
See also relation with Form rendering in Forms project
Context Variables
Wiki pages may use a series of variables to refer to their
context. That way the page may be easier to port (move to another
location), to include dynamic information (current time, user logged
in), or search results. Also it allows the creation of generic files
for inclusion in other files. See TWikiVariables for inspiration
and legacy requirements.
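Variable expansion in the TWiki style (`%NAME%`) can be sketched in a few lines. The set of variables in the context dictionary is illustrative only, and leaving unknown variables untouched is a design assumption, not a stated requirement.

```python
import re

VARIABLE = re.compile(r"%([A-Z]+)%")


def expand(text, context):
    """Replace %NAME% with context["NAME"]; unknown variables stay as-is."""
    return VARIABLE.sub(
        lambda m: str(context.get(m.group(1), m.group(0))), text)


# Hypothetical context for one page view.
context = {"TOPIC": "Storage", "WEB": "GW", "USER": "Foo"}
result = expand("%WEB%.%TOPIC% viewed by %USER% (%UNKNOWN%)", context)
```

Since the page text refers only to variable names, the same page can be moved or included elsewhere and will pick up the new context.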
Access Control
Integrate with access control; don't show a page if no access to it is
granted. What happens if a component of a page is not accessible
(e.g., reviews for a paper are not accessible to its authors until
notification)?
Dependencies
When creation of page views becomes involved, the efficiency of
presenting pages may suffer. Therefore, it may become necessary to
maintain a cache of page renderings. In order to make sure that pages
in the cache are invalidated when a page used for their creation changes,
it is necessary to infer their dependencies, i.e., all files
accessed for rendering a page and all authentication needed to obtain
them. This should be achievable by asking user and storage management
for a trace.
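The idea of obtaining a trace can be sketched by handing the renderer a read function that records every file it touches. The class and its parameters are hypothetical, and only file dependencies are modeled; the authentication part of the trace is left out.

```python
class RenderCache:
    """Caches renderings together with the set of files each one read."""

    def __init__(self, read_file, render):
        self._read_file = read_file   # hypothetical storage-layer read
        self._render = render         # render(path, read) -> html
        self._entries = {}            # page path -> (html, files read)

    def get(self, path):
        if path not in self._entries:
            trace = set()

            def traced_read(p):
                trace.add(p)          # record every file the rendering uses
                return self._read_file(p)

            self._entries[path] = (self._render(path, traced_read), trace)
        return self._entries[path][0]

    def invalidate(self, changed):
        # Drop every cached rendering that depended on the changed file.
        stale = [p for p, (_, deps) in self._entries.items() if changed in deps]
        for page in stale:
            del self._entries[page]


files = {"/Home": "hello", "/Templates/view": "[%s]"}


def render(path, read):
    # A page rendering reads both the template and the page itself.
    return read("/Templates/view") % read(path)


cache = RenderCache(files.__getitem__, render)
first = cache.get("/Home")
files["/Templates/view"] = "<%s>"
stale = cache.get("/Home")            # still the old rendering
cache.invalidate("/Templates/view")
fresh = cache.get("/Home")            # re-rendered with the new template
```

Editing the template invalidates the page even though the page itself did not change, which is exactly the case a cache keyed only on the page would miss.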
Integration with Forms
A wiki page is a form with a textarea field, which can also have other
fields such as parents and access control. There may be different
access controls for different fields.
Risks
The prime risk seems to be finding a good parsing technology, which is key to
(1) obtaining a working parser for TWiki markup and (2) keeping that parser
maintainable.
Wiki Form
Advisor/Customer:
Martin Bravenboer and
Eelco Visser
Introduction
Wiki webs are used to maintain and develop a web of information with a
group of people. In current Wiki implementations this information is
restricted to text documents. The text documents are written in a
simple markup language and are presented in a browser. The goal of the
Forms project is to experiment with collaborative maintenance of
structured information. This information could be structured in XML,
but it could also be stored in ad-hoc notations that are part of the
text documents. Examples of this are the WebNotify, Preferences, and
Category features of TWiki.
Structured information can be edited at the source level, i.e. an XML
document could simply be edited as an ordinary plain text file, but
this will not make the web application accessible to users that are
not deeply involved in the structure of the web application. This
approach would be comparable to diving into a relational database and
executing SQL queries and updates.
To make the manipulation of structured data more user-friendly, we
need to be able to attach forms to edit topics. These forms are an
alternative to editing the structured source of a Wiki topic. The
ultimate goal is to have
If GW Forms are to be compared with the Model View Controller pattern,
then the most striking difference is that GW Forms will probably not
have a controller: GW Forms are restricted to editing the structured
data of a single file and will not be used to implement web applications
with complex control flow.
This project will have a more experimental nature than the other
projects. Hence, the requirements are not very specific and might be
changed, based on the experiences of the team working on the Forms
project. If one of the approaches appears to be particularly useful,
then it might be used for the development of an entire web
application.
Overview
A GW Form should do several things:
- Describe the structure and presentation of a form.
- Create an instance of a form, based on existing data.
- Map the changes performed by the user back to the data.
- Verify that the data is correct and send feedback if it is not.
We will mostly focus on data stored as XML documents.
Server-side Validation
The first step towards structured information in a Wiki is to ensure
that the information stays structured. For information stored in XML
this means that the Wiki files should remain well-formed and valid.
More concrete requirements are:
- Verify the well-formedness of XML documents, or rather, files that are supposed to be XML documents.
- Verify the validity of XML documents against a schema or multiple schemas. Schema languages that should be supported: Schematron, RELAX NG, W3C XML Schema.
- Arbitrary validators implemented in a scripting language (Groovy or Jython).
- Arbitrary validators implemented in Java.
The validators that are to be applied must be defined in meta files.
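The first and the last two requirements can be sketched with the Python standard library. This is an illustration under assumptions: the function names and the convention that a validator returns `None` or an error message are hypothetical, and schema validation (Schematron, RELAX NG, W3C XML Schema) is not covered because it needs a library outside the standard one.

```python
import xml.etree.ElementTree as ET


def check_well_formed(text):
    """Return None if `text` is well-formed XML, else an error message."""
    try:
        ET.fromstring(text)
        return None
    except ET.ParseError as err:
        return "not well-formed: %s" % err


def require_root(tag):
    """Build a hypothetical custom validator that checks the root element."""
    def validate_root(text):
        root = ET.fromstring(text)
        return None if root.tag == tag else "root element must be <%s>" % tag
    return validate_root


def validate(text, validators):
    """Check well-formedness first, then run the custom validators.

    Custom validators are skipped for malformed input, since they
    need parseable XML. Returns a list of error messages.
    """
    error = check_well_formed(text)
    if error:
        return [error]
    return [msg for msg in (v(text) for v in validators) if msg]


validators = [require_root("user")]
ok = validate("<user><name>Foo</name></user>", validators)
malformed = validate("<user><name>Foo</user>", validators)
wrong_root = validate("<group/>", validators)
```

On a save, a non-empty result would be sent back to the user as feedback instead of committing the file.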
Some form techniques that we will experiment with in this project also
allow client-side validation. This is only suitable for improving the
user experience, since a client cannot be trusted.
HTML Forms
The only advantage of HTML Forms is that they are available in every
browser. However, they are not suitable for developing Forms without
any scripting or other server-side code. Still, it is in practice the
only option for creating web forms.
- Develop a method for defining GW Forms as HTML Forms. Use server-side code that can be edited online to bind the form to the structured data of a GW file. The server-side code might for example be written in Java, Groovy, etc.
Client-side XForms
Client-side XForms improve the user experience by allowing client-side
validation, offering more widgets, etc. Although most browsers
currently do not support XForms, this is probably the future of forms in
web applications. Mozilla will include XForms in future builds and
plugins are available for Internet Explorer. XForms client-side and
server-side libraries exist for platforms like Java.
- Develop a method for defining XForms to edit the XML data of GW files.
Server-side XForms
The lack of browser support for XForms is a serious problem in any
website that is going to be used in practice. This could be solved by
using server-side XForms libraries, or client-side plugins (maybe
Java).
- Develop an alternative implementation for browsers that do not support XForms. The form definitions in GW should remain the same.
XUL Forms
XUL is the user-interface technique used by the Mozilla platform. The
ultimate goal of XUL is to provide a platform for the development of
platform-independent graphical user interface
applications. Most applications rely heavily on JavaScript, which
makes the Form less declarative. Another limitation is that XUL-based
web applications will probably only work in Mozilla browsers: there
are currently no efforts to make XUL a true web standard.
- Develop examples of XUL based web applications that are hosted and maintained in a Wiki.
Use Cases
Some applications that can be used to test the various approaches are:
- GW Configuration (variables)
- User preferences
- List of users in web notification service
- Category editing
- Form for BibTeX entry editing
Full web applications:
- Conference management (paper submission)
Search and Query
Advisor/Customer:
Merijn de Jonge
Search
- topic search
- keyword search
- full text search
- category search
- search in search
- constrained search
- multi-web search
- presentation of search results
Query
- querying web structure
- dead pages
- live pages
- new pages
- most recently changed pages
- page readers/visitors
- page writers
- changes
- most visited pages
- web site activity browser (i.e., graphical presentation of activities of all parts of the web site are
- web counters/statistics
General
- search/query expression language
- (caching of results. this may conflict with access control mechanism)