`classes2dot`

This script focuses on class and object diagrams, but it can also handle component and deployment diagrams.

General notes
- Node and port IDs
- Stereotypes and property strings
Elements
Unsupported notation

General notes

Node and port IDs

In many cases, you will need to refer to nodes, such as when describing an association or attaching a note. Sometimes, you will want to refer to individual parts of a node — what Graphviz calls ports.

Most elements that map to Graphviz nodes support explicit ID specification, in the form of an @ sign followed by an ID string. TextUML does not place any restrictions on the format of IDs except that they cannot contain @ signs; however, Graphviz does, and if your IDs do not adhere to Graphviz’s rules, you will need to enclose them in double quotes when referring to them.

In the syntax explanations below, @id indicates the position where an ID can be specified. All IDs are optional; when an ID is omitted and the element does not support name detection, a sequential ID is generated automatically.

When practical, TextUML automatically extracts the element name and uses it in place of the ID if no ID is explicitly specified. These cases will be specifically mentioned in the documentation below.

Attributes and operations in class definitions can also have IDs. When addressing an attribute or operation, write the class name or ID first, then a colon, and then the name or ID of the attribute or operation. Either of the two IDs can be quoted, but the colon must stay outside the quotes; otherwise Graphviz will interpret it as part of the node ID.
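The quoting rule above can be sketched in Python. This is only an illustration, not part of TextUML: the helper names `quote_id` and `port_ref` are hypothetical, and the check covers the ASCII subset of the DOT language's rules for unquoted IDs (an identifier that does not start with a digit, or a numeral), plus the DOT keywords, which also need quoting.

```python
import re
from typing import Optional

# Unquoted DOT IDs: an identifier not starting with a digit, or a numeral.
# (DOT also allows some non-ASCII bytes unquoted; this sketch ignores that.)
_BARE_ID = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|-?(?:\.[0-9]+|[0-9]+(?:\.[0-9]*)?)")
_KEYWORDS = {"node", "edge", "graph", "digraph", "subgraph", "strict"}

def quote_id(ident: str) -> str:
    """Return ident as-is if Graphviz accepts it bare, else double-quote it."""
    if _BARE_ID.fullmatch(ident) and ident.lower() not in _KEYWORDS:
        return ident
    return '"' + ident.replace('"', '\\"') + '"'

def port_ref(node: str, port: Optional[str] = None) -> str:
    """Build a node or node:port reference; the colon stays outside the quotes."""
    if port is None:
        return quote_id(node)
    return quote_id(node) + ":" + quote_id(port)
```

For example, `port_ref("Order Item", "total")` yields `"Order Item":total`, with only the node ID quoted.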

Stereotypes and property strings

UML allows stereotypes on most elements, and suggests that they be written above or to the left of the element names. This means TextUML must know to skip them when detecting names.

TextUML recognizes two forms of stereotype specification — the proper guillemets encoded in UTF-8 («like this») and their poor man’s replacements (<<like this>>).

Although UML suggests that property strings be written after or below the element names, TextUML knows to skip them too if they appear before names. A property string is enclosed in braces ({like this}).

Elements

The following elements are supported.

Note

Two syntaxes for notes are supported. One, for attaching a note to a single target:

target *-- [note text]

The other, for attaching to multiple targets:

[note text] --* {targets}

The note text may contain newlines. In the second syntax, targets are separated with newlines. Targets can be indented; the indentation is ignored. Empty lines are ignored. (In particular, this allows a newline immediately after the opening brace or immediately preceding the closing brace.)
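As a rough illustration of the multi-target rules above, the following Python sketch splits such a note into its text and targets. The function name and the regular expression are assumptions made for this example; they do not show how the script itself parses notes, and they assume that neither the note text nor a target contains a closing bracket or brace.

```python
import re
from typing import List, Tuple

# "[note text] --* {targets}"; both parts may span several lines.
_MULTI_NOTE = re.compile(
    r"\[(?P<text>.*?)\]\s*--\*\s*\{(?P<targets>.*?)\}", re.DOTALL
)

def parse_multi_target_note(source: str) -> Tuple[str, List[str]]:
    """Return the note text and the list of targets it attaches to."""
    match = _MULTI_NOTE.search(source)
    if match is None:
        raise ValueError("no multi-target note found")
    # One target per line; indentation and empty lines are ignored.
    targets = [
        line.strip()
        for line in match.group("targets").splitlines()
        if line.strip()
    ]
    return match.group("text"), targets
```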

Notes do not have IDs or names, because in UML notes do not participate in any relationships except for attachment.

Package, Subsystem, Model

__[label @id]____

The text icon for a package represents the upper edge of the package folder icon, with the tab. The number of underscore characters can be arbitrary, provided that there is at least one on each side.

If no @id is specified, the first word (delimited by whitespace or any of « » < > { }) of the label that is not a stereotype or a property string is taken to be the package name.

Subsystems and Models are notated as Packages with stereotypes of «subsystem» and «model», respectively.
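The name detection rule can be approximated with the Python sketch below. It is a reading of the rule as stated here, not the script's actual code; `default_name` is a hypothetical name, and the sketch assumes that stereotypes and property strings are not nested.

```python
import re

# Stereotypes in either spelling, and property strings in braces.
_SKIPPED = re.compile(r"«[^»]*»|<<.*?>>|\{[^}]*\}", re.DOTALL)
# Word delimiters named in the text: whitespace and any of « » < > { }.
_DELIMITERS = re.compile(r"[\s«»<>{}]+")

def default_name(label):
    """Return the first word that is not a stereotype or property string."""
    stripped = _SKIPPED.sub(" ", label)
    words = [word for word in _DELIMITERS.split(stripped) if word]
    return words[0] if words else None
```

With this sketch, the label `«subsystem» Billing {version=2}` gives `Billing`, and a label with several words gives only the first one, which is why an explicit @id is useful for multi-word names.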

Class/Object, Interface, Utility, Metaclass, Enumeration, Stereotype, Powertype, Actor

<template params> {label @id
--
attribute @id
…
--
    compartment name
operation @id
    continuation line
…}

A class may have an optional list of template parameters, which can span multiple lines.

If no explicit ID is specified for the class, the first word of the label (delimited by whitespace or any of « » < > { }) that is not a stereotype or a property string is considered to be the class name.

A class may have zero or more compartments (in addition to the name compartment, which is always present). Compartments are separated by -- on a separate line. Each compartment may have a compartment name, specified by an indented line immediately following the separator. The remaining lines of a compartment specify the class’s features (attributes, operations and possibly other entities).

A feature may be specified on one or more lines. Only the first line is considered when looking for the @id and name. If an @id is not specified explicitly, the name is extracted using the following rules:

1. If the first character is one of + # - ~, it is skipped. (Visibility)
2. If the next character is $, it is skipped. (Class-scope member)
3. If the next character is /, it is skipped. (Derived attribute)
4. The longest possible string of characters not containing spaces, nor any of : ( ) « » < > { }, is taken to be the name.
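The rules above translate almost directly into code. The Python sketch below follows them literally, so it does not skip leading whitespace or stereotypes; the function name `feature_name` is made up for this example and says nothing about how the script is actually implemented.

```python
import re

# Last rule: the name runs until a space or one of : ( ) « » < > { }.
_NAME = re.compile(r"[^ :()«»<>{}]*")

def feature_name(first_line):
    """Extract the default name from the first line of a feature."""
    pos = 0
    # Skip at most one visibility marker, then "$", then "/", in that order.
    for skippable in ("+#-~", "$", "/"):
        if pos < len(first_line) and first_line[pos] in skippable:
            pos += 1
    # The pattern can match an empty string, so match() never returns None.
    return _NAME.match(first_line, pos).group(0)
```

For instance, `+$count: int` yields `count` and `-/total(): Money` yields `total`.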

Indented lines are considered to be continuation lines. The indentation is preserved. Note that Graphviz will preserve spaces but skip tabs.

Utilities, Metaclasses, Enumerations, Stereotypes, Powertypes and Actors are notated as Classes with stereotypes of «utility», «metaclass», «enumeration», «stereotype», «powertype» and «actor» respectively.

Objects are notated the same way as Classes, except that the label contains an object name followed by a colon and a class name. The object name (the part of the label before the colon, not including any stereotypes or property strings) is used as the node ID by default.

Interface (suppressed)

() name @id

An interface can alternatively be specified like this. In this form, attributes and operations are suppressed. In fact, there is no syntax to specify them!

Binary Association/Link, Composition, Generalization, Dependency, InstanceOf

Given any two nodes or ports, you can draw any kind of edge supported by Graphviz using this syntax:

<source> (<label>) [<qualifier>]<arrowtail><style><arrowhead>[<qualifier>] [<label>] (<label>) <target>

Here, <source> and <target> can be any addressable element (nodes or node:port combinations, quoted as necessary). Labels in parentheses relate to the edge ends, and the label in brackets relates to the edge itself. Qualifiers also relate to the ends. Labels and qualifiers are optional; the delimiting parentheses and brackets should be omitted along with their content.

The arrowheads and line style are specified graphically. The following arrowheads are available (shown here as the right end of a solid edge):

| Notation | Arrowhead |
| --- | --- |
| `-----#>` | filled triangle |
| `-----<#` | filled inverted triangle |
| `----(#)` or `------*` | filled dot |
| `--(#)<#` or `----*<#` | filled dot and inverted triangle |
| `-----()` or `------o` | empty dot |
| `---()<#` or `----o<#` | empty dot and filled inverted triangle |
| `-------` | none |
| `------\|` | tee |
| `-----\|>` | empty triangle |
| `-----<\|` | empty inverted triangle |
| `----<#>` | filled diamond |
| `-----<>` | empty diamond |
| `------<` | crow |
| `----[#]` | filled box |
| `-----[]` | empty box |
| `------>` | open |
| `------\` | half-open |

The following line styles are supported:

| Notation | Line style |
| --- | --- |
| `-----` | solid |
| `- - -` | dashed |
| `. . .` | dotted |
| `=====` | bold |
| `%%%%%` | invisible |

Not all of these are used in UML. I decided to provide them anyway, as an extension.
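The arrowhead and line style descriptions line up with Graphviz's `arrowhead`/`arrowtail` arrow types and `style` values. The Python sketch below shows one plausible translation into a DOT edge statement. The mapping, the function names, and the way a line style is recognized are assumptions made for illustration; the script's actual output may differ.

```python
# Arrowhead drawings (as seen at the right end) -> Graphviz arrow types.
ARROWHEADS = {
    "#>": "normal", "<#": "inv",
    "(#)": "dot", "*": "dot",
    "(#)<#": "invdot", "*<#": "invdot",
    "()": "odot", "o": "odot",
    "()<#": "invodot", "o<#": "invodot",
    "": "none", "|": "tee",
    "|>": "empty", "<|": "invempty",
    "<#>": "diamond", "<>": "odiamond",
    "<": "crow", "[#]": "box", "[]": "obox",
    ">": "open", "\\": "halfopen",
}

def line_style(drawing):
    """Guess the Graphviz style from the characters used to draw the line."""
    chars = set(drawing)
    if chars == {"-"}:
        return "solid"
    if chars == {"-", " "}:
        return "dashed"
    if "." in chars and chars <= {".", " "}:
        return "dotted"
    if chars == {"="}:
        return "bold"
    if chars == {"%"}:
        return "invis"
    raise ValueError("unknown line style: %r" % drawing)

def dot_edge(source, target, line="-----", head="", tail=""):
    """Build a DOT edge; head and tail are both given in right-end form."""
    attrs = [
        "dir=both",  # needed so that Graphviz draws the tail arrow too
        "arrowtail=" + ARROWHEADS[tail],
        "arrowhead=" + ARROWHEADS[head],
        "style=" + line_style(line),
    ]
    return f"{source} -> {target} [{', '.join(attrs)}];"
```

Under these assumptions, a realization drawn as `- - -|>` becomes an edge with `arrowhead=empty` and `style=dashed`.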

Here are examples for specifying various types of edges:

| Edge type | Examples |
| --- | --- |
| Association | `Class1 ----- Class2`; `Class1 ----> Class2` (one-way navigation); `Class1 <>--- Class2` (aggregation); `Class1 [Key]----> (0..1 -class2) Class2` (with qualifier, multiplicity, visibility and role) |
| Composition | `Whole <#>--- Part` (both ends navigable); `Whole <#>--> Part` (one end navigable) |
| Generalization | `Derived ---\|> Base`; `Implementation - - -\|> Interface` (realization) |
| Dependency | `Dependent - - -> Dependency` |
| InstanceOf | `Instance - - -> [«instanceOf»] Class` |

Association Class

Since in Graphviz edges can only connect nodes, an association class can only be attached to the n-ary form of association:

{
SomeClass
}
{
SomeOtherClass
}
{
AssociationClass
}
<> {
    SomeClass ----- <>
    <> ----- SomeOtherClass
    <> - - - AssociationClass
}

N-ary Association/Link

<> @id {
branches
}

branches lists the edges connecting other elements with the diamond representing the association. One end of each branch must be the diamond, <>; it can be either the source or the target. Edges are specified the same way as usual, except that the end connected to the diamond cannot have any arrowhead.

The other end may have any supported arrowhead, but note that it might not make sense in UML.
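Because the diamond is itself a node, every branch is an ordinary Graphviz edge. The Python sketch below shows one way such a construct could be rendered in DOT. It is a guess at the general shape of the output, not the script's actual behavior; `nary_association_dot` and its argument layout are hypothetical, and arrowheads on the far ends are left out for brevity.

```python
def nary_association_dot(assoc_id, branches):
    """Render an n-ary association as DOT statements.

    branches is a list of (source, style, target) tuples in which exactly
    one of source and target is the diamond placeholder "<>".
    """
    lines = [f'{assoc_id} [shape=diamond, label=""];']
    for source, style, target in branches:
        if (source == "<>") == (target == "<>"):
            raise ValueError("exactly one end of a branch must be the diamond")
        src = assoc_id if source == "<>" else source
        dst = assoc_id if target == "<>" else target
        # dir=none draws the edge without arrowheads at either end.
        lines.append(f"{src} -> {dst} [dir=none, style={style}];")
    return "\n".join(lines)
```

Calling it with the three branches from the association class example produces one diamond node and three edges, the last of them dashed.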

Use Case

(_label @id_)

Note: Because use case labels are typically phrases rather than identifiers, relying on name autodetection is not recommended. Autodetection works the same way for use cases as for most other items: the first word of the label, skipping any stereotypes and property strings, becomes the default ID.

Node

/_____/
label @id
______/

The number of underscore characters can be arbitrary.

The first word of the label that is not a stereotype or a property string becomes the default ID if no explicit ID is specified.

Component

__-__-__
label @id
________

The number of underscores and hyphens in the opening line can be arbitrary, except that: there must be at least one underscore between any two hyphens; there must be at least two hyphens; and there must be at least one underscore at the start and end of the line.

The closing line can consist of an arbitrary number of underscore characters.
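The constraints on the opening line fit in a single regular expression. The Python sketch below is one way to check them; the function names are hypothetical, and it assumes that the closing line needs at least one underscore.

```python
import re

# Leading underscores, then two or more hyphens, each followed by one or
# more underscores. This also guarantees trailing underscores and at
# least one underscore between any two hyphens.
_OPENING = re.compile(r"_+(?:-_+){2,}")
_CLOSING = re.compile(r"_+")

def is_component_opening(line):
    """True if line is a valid opening line of a component icon."""
    return _OPENING.fullmatch(line) is not None

def is_component_closing(line):
    """True if line consists only of underscores (assumed: at least one)."""
    return _CLOSING.fullmatch(line) is not None
```

Under this pattern, `__-__-__` and `_-_-_` are accepted, while `__-__` (only one hyphen) and `_--_` (adjacent hyphens) are rejected.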

The usual rule for name autodetection applies.

Unsupported notation

- UML defines several stereotype icons that can be used instead of writing the stereotype in guillemets. Graphviz does not make it easy to add icons, so we don’t bother.
- Nesting. Graphviz graphs cannot be nested.
- Boldface class names, italic abstract class names and abstract methods, underlined object names and class-scope attributes and operations. Graphviz accepts HTML <b>, <i> and <u> tags but does not act on them.
- Any kind of lines attached to other lines. To attach an association class to an association, or to model an xor-association, use the n-ary association form.
- Qualifiers. Since there is no way to make Graphviz attach a qualifier box to the class box at the end of an association, qualifiers are displayed in brackets in the association end’s label.
