classes2dot
This script focuses on class/object diagrams, but it can also handle component/deployment diagrams.
In many cases, you will need to refer to nodes, such as when describing an association or attaching a note. Sometimes, you will want to refer to individual parts of a node — what Graphviz calls ports.
Most elements that map to Graphviz nodes support explicit ID specification, in the form of an @ sign followed by an ID string. TextUML does not place any restrictions on the format of IDs except that they cannot contain @ signs; however, Graphviz does, and if your IDs do not adhere to Graphviz’s rules, you will need to enclose them in double quotes when referring to them.
In the syntax explanations below, @id indicates the position where an ID can be specified. All IDs are optional; if no ID is given and name detection is not supported for the element, a sequential ID is generated automatically.
When practical, TextUML automatically extracts the element name and uses it in place of the ID if no ID is explicitly specified. These cases will be specifically mentioned in the documentation below.
Additionally, attributes and operations in class definitions can also have IDs. When addressing an attribute or operation, write the class name or ID first, then a colon, and then the name or ID of the attribute or operation. Either of the two IDs can be quoted, but the colon must stay outside the quotes; otherwise Graphviz will interpret it as part of the node ID.
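As an illustration of the quoting rule, here is a minimal Python sketch that builds a node or node:port reference and quotes each part only when Graphviz would not accept it bare. The function names quote_id and port_ref are hypothetical and not part of TextUML. The check covers only ASCII identifiers, numerals and the DOT keywords, so it errs on the side of quoting.

```python
import re
from typing import Optional

# Unquoted Graphviz IDs: an identifier not starting with a digit, or a numeral.
# (ASCII only here; this sketch simply quotes anything else.)
_PLAIN_ID = re.compile(r"(?:[A-Za-z_][A-Za-z0-9_]*|-?(?:\.\d+|\d+(?:\.\d*)?))")

# DOT keywords are case-insensitive and must be quoted when used as IDs.
_KEYWORDS = {"node", "edge", "graph", "digraph", "subgraph", "strict"}


def quote_id(raw: str) -> str:
    """Return raw unchanged if Graphviz accepts it bare, else double-quote it."""
    if _PLAIN_ID.fullmatch(raw) and raw.lower() not in _KEYWORDS:
        return raw
    return '"' + raw.replace('"', '\\"') + '"'


def port_ref(node: str, feature: Optional[str] = None) -> str:
    """Build a node or node:port reference; the colon itself is never quoted."""
    if feature is None:
        return quote_id(node)
    return quote_id(node) + ":" + quote_id(feature)


print(port_ref("Bank Account", "balance"))
```

Quoting each side separately keeps the colon visible to Graphviz as the node/port separator.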
UML allows stereotypes on most elements, and suggests that they be written above or to the left of the element names. This means TextUML must know to skip them when detecting names.
TextUML recognizes two forms of stereotype specification: the proper guillemets encoded in UTF-8 («like this») and their poor man’s replacements (<<like this>>).
Although UML suggests that property strings be written after or below the element names, TextUML knows to skip them too if they appear before names. A property string is enclosed in braces ({like this}).
The following elements are supported.
Two syntaxes for notes are supported. One, for attaching a note to a single target:
target *-- [note text]
The other, for attaching to multiple targets:
[note text] --* {targets}
The note text may contain newlines. In the second syntax, targets are separated with newlines. Targets can be indented; the indentation is ignored. Empty lines are ignored. (In particular, this allows a newline immediately after the opening brace or immediately preceding the closing brace.)
Notes do not have IDs or names, because in UML notes do not participate in any relationships except for attachment.
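The target-list rules for the second note syntax (newline-separated targets, indentation ignored, empty lines ignored) can be sketched in Python as follows. This is only an illustration of the stated rules, not the script's actual parser; the function name parse_multi_target_note is hypothetical, and the sketch assumes the note text contains no closing bracket.

```python
import re

# [note text] --* {targets}; both parts may span several lines.
_NOTE_MULTI = re.compile(
    r"\[(?P<text>.*?)\]\s*--\*\s*\{(?P<targets>.*?)\}", re.DOTALL
)


def parse_multi_target_note(source):
    """Return (note_text, [targets]) or None if the syntax does not match."""
    match = _NOTE_MULTI.search(source)
    if match is None:
        return None
    # One target per line; indentation and empty lines are ignored.
    targets = [
        line.strip()
        for line in match.group("targets").splitlines()
        if line.strip()
    ]
    return match.group("text"), targets


example = "[Shared by both\nclasses] --* {\n    Class1\n\n    Class2\n}"
print(parse_multi_target_note(example))
```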
__[label @id]____
The text icon for a package represents the upper edge of the package folder icon, with the tab. The number of underscore characters can be arbitrary, provided that there is at least one on each side.
If no @id is specified, the first word (delimited by whitespace or any of « » < > { }) of the label that is not a stereotype or a property string is taken to be the package name.
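The first-word rule described above can be illustrated with a short Python sketch. It is one possible reading of the rule, not the script's real implementation: the name detect_name is hypothetical, and the sketch assumes that any explicit @id has already been removed from the label.

```python
import re

# Stereotypes («…» or <<…>>) and property strings ({…}) are skipped.
_SKIPPED = re.compile(r"«[^»]*»|<<.*?>>|\{[^}]*\}", re.DOTALL)

# Words are delimited by whitespace or any of « » < > { }.
_DELIMITERS = re.compile(r"[\s«»<>{}]+")


def detect_name(label):
    """Return the first word that is not a stereotype or property string."""
    remainder = _SKIPPED.sub(" ", label)
    for word in _DELIMITERS.split(remainder):
        if word:
            return word
    return None  # nothing usable; a sequential ID would be needed instead


print(detect_name("«subsystem» Billing {abstract}"))
```

Because only the first word is used, a label such as "Order entry" yields Order, which is one reason to give an explicit @id to elements with multi-word labels.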
Subsystems and Models are notated as Packages with stereotypes of «subsystem» and «model», respectively.
<template params> {label @id
--
attribute @id
…
--
    compartment name
operation @id
    continuation line
…}
A class may have an optional list of template parameters, which can span multiple lines.
If no explicit ID is specified for the class, the first word of the label (delimited by whitespace or any of « » < > { }) that is not a stereotype or a property string is considered to be the class name.
A class may have zero or more compartments (in addition to the name compartment, which is always present). Compartments are separated by -- on a separate line. Each compartment may have a compartment name, specified by an indented line immediately following the separator. The remaining lines of a compartment specify the class’s features (attributes, operations and possibly other entities).
A feature may be specified on one or more lines. Only the first line is considered when looking for the @id and name. If an @id is not specified explicitly, the name is extracted using the following rules:
1. If the line starts with one of + # - ~, it is skipped. (Visibility)
2. If the next character is $, it is skipped. (Class-scope member)
3. If the next character is /, it is skipped. (Derived attribute)
4. The next word, delimited by whitespace or any of : ( ) « » < > { }, is taken to be the name.
Indented lines are considered to be continuation lines. The indentation is preserved. Note that Graphviz will preserve spaces but skip tabs.
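The name-extraction rules above can be read as the following Python sketch. It is an interpretation rather than the script's actual code: the function name feature_name is hypothetical, and treating whitespace as a delimiter alongside : ( ) « » < > { } is an assumption.

```python
import re

# Assumed delimiters: whitespace plus : ( ) « » < > { }
_FEATURE_DELIMITERS = re.compile(r"[\s:()«»<>{}]+")


def feature_name(first_line):
    """Extract the default name from the first line of a feature."""
    rest = first_line.lstrip()
    if rest[:1] in ("+", "#", "-", "~"):  # visibility marker
        rest = rest[1:].lstrip()
    if rest[:1] == "$":  # class-scope member
        rest = rest[1:].lstrip()
    if rest[:1] == "/":  # derived attribute
        rest = rest[1:].lstrip()
    for word in _FEATURE_DELIMITERS.split(rest):
        if word:
            return word
    return None


print(feature_name("+ balance: Money = 0"))
```

Only the first line of a feature is passed in; continuation lines play no part in name detection.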
Utilities, Metaclasses, Enumerations, Stereotypes, Powertypes and Actors are notated as Classes with stereotypes of «utility», «metaclass», «enumeration», «stereotype», «powertype» and «actor» respectively.
Objects are notated the same way as Classes except that the label will contain an object name followed by a colon and a class name. The object name (part of label before the colon, not including any stereotypes or property strings) will be used for the node ID by default.
() name @id
An interface can alternatively be specified like this. In this form, attributes and operations are suppressed. In fact, there is no syntax to specify them!
Given any two nodes or ports, you can draw any kind of edge supported by Graphviz using this syntax:
<source> (<label>) [<qualifier>]<arrowtail><style><arrowhead>[<qualifier>] [<label>] (<label>) <target>
Here, <source> and <target> can be any addressable element (nodes or node:port combinations, quoted as necessary). Labels in parentheses relate to the edge ends, and the label in brackets relates to the edge itself. Qualifiers also relate to the ends. Labels and qualifiers are optional; the delimiting parentheses and brackets should be omitted along with their content.
The arrowheads and line style are specified graphically. The following arrowheads are available (shown here as the right end of a solid edge):
-----#> | filled triangle |
-----<# | filled inverted triangle |
----(#) or ------* | filled dot |
--(#)<# or ----*<# | filled dot and inverted triangle |
-----() or ------o | empty dot |
---()<# or ----o<# | empty dot and filled inverted triangle |
------- | none |
------| | tee |
-----|> | empty triangle |
-----<| | empty inverted triangle |
----<#> | filled diamond |
-----<> | empty diamond |
------< | crow |
----[#] | filled box |
-----[] | empty box |
------> | open |
------\ | half-open |
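The descriptions in this table line up with Graphviz's classic arrowhead names. The Python sketch below records that correspondence as a lookup table. The pairing is an assumption made for illustration and is not confirmed by this documentation; the names ARROWHEADS and right_arrowhead are hypothetical, and the lookup only handles the right end of a solid edge, as shown in the table.

```python
# Assumed correspondence between the text arrowheads and Graphviz
# arrowhead attribute values (classic arrowType names).
ARROWHEADS = {
    "#>": "normal",      # filled triangle
    "<#": "inv",         # filled inverted triangle
    "(#)": "dot",        # filled dot
    "*": "dot",
    "(#)<#": "invdot",   # filled dot and inverted triangle
    "*<#": "invdot",
    "()": "odot",        # empty dot
    "o": "odot",
    "()<#": "invodot",   # empty dot and filled inverted triangle
    "o<#": "invodot",
    "": "none",
    "|": "tee",
    "|>": "empty",       # empty triangle
    "<|": "invempty",    # empty inverted triangle
    "<#>": "diamond",    # filled diamond
    "<>": "odiamond",    # empty diamond
    "<": "crow",
    "[#]": "box",        # filled box
    "[]": "obox",        # empty box
    ">": "open",
    "\\": "halfopen",
}


def right_arrowhead(solid_edge):
    """Look up the arrowhead drawn at the right end of a solid edge."""
    return ARROWHEADS[solid_edge.lstrip("-")]


print(right_arrowhead("-----|>"))
```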
The following line styles are supported:
----- | solid |
- - - | dashed |
. . . | dotted |
===== | bold |
%%%%% | invisible |
Not all of these are used in UML. I decided to provide them anyway, as an extension.
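The five line styles have natural counterparts among the Graphviz edge style values (solid, dashed, dotted, bold, invis). The Python sketch below classifies a drawn line accordingly. Both the mapping and the function name line_style are assumptions for illustration, not a description of how the script works internally.

```python
def line_style(run):
    """Map the drawn part of an edge to an assumed Graphviz style value."""
    chars = set(run.strip())
    if chars == {"-"}:
        return "solid"
    if chars == {"-", " "}:
        return "dashed"
    if chars in ({"."}, {".", " "}):
        return "dotted"
    if chars == {"="}:
        return "bold"
    if chars == {"%"}:
        return "invis"
    raise ValueError("unrecognized line style: %r" % run)


print(line_style("- - -"))
```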
Here are examples for specifying various types of edges:
Association | Class1 ----- Class2 |
Association | Class1 ----> Class2 (one-way navigation) |
Association | Class1 <>--- Class2 (aggregation) |
Association | Class1 [Key]----> (0..1 -class2) Class2 (with qualifier, multiplicity, visibility and role) |
Composition | Whole <#>--- Part (both ends navigable) |
Composition | Whole <#>--> Part (one end navigable) |
Generalization | Derived ---|> Base |
Generalization | Implementation - - -|> Interface (realization) |
Dependency | Dependent - - -> Dependency |
InstanceOf | Instance - - -> [«instanceOf»] Class |
Since in Graphviz edges can only connect nodes, an association class can only be attached to the n-ary form of association:
{
SomeClass
}
{
SomeOtherClass
}
{
AssociationClass
}
<> {
SomeClass ----- <>
<> ----- SomeOtherClass
<> - - - AssociationClass
}
<> @id {
branches
}
branches lists the edges connecting other elements with the diamond representing the association. One end of each branch (either one) must be the diamond, <>. Edges are specified the same way as usual, except that the end connected to the diamond cannot have any arrowhead.
The other end may have any supported arrowhead, but note that it might not make sense in UML.
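To show why the diamond makes association classes possible, the Python sketch below generates the kind of Graphviz structure an n-ary association corresponds to: the diamond is an ordinary node, so the association class can be connected to it with a dashed edge. This is an assumed illustration, not the script's actual output; the function name nary_association_dot and the ID assoc1 are hypothetical, and all IDs are assumed to need no quoting.

```python
def nary_association_dot(diamond_id, members, association_class=None):
    """Return DOT text for a diamond node linked to each member."""
    lines = [
        "graph G {",
        '  %s [shape=diamond, label=""];' % diamond_id,
    ]
    for member in members:
        lines.append("  %s -- %s;" % (member, diamond_id))
    if association_class is not None:
        # The association class hangs off the diamond on a dashed edge.
        lines.append(
            "  %s -- %s [style=dashed];" % (diamond_id, association_class)
        )
    lines.append("}")
    return "\n".join(lines)


print(nary_association_dot(
    "assoc1", ["SomeClass", "SomeOtherClass"], "AssociationClass"
))
```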
(_label @id_)
Note: Since use case labels are typically phrases rather than identifiers, it is not recommended to rely on name autodetection. The latter works the same way for use cases as for most other items — the first word of the label, skipping any stereotypes and property strings, becomes the default ID.
/_____/
label @id
______/
The number of underscore characters can be arbitrary.
The first word of the label that is not a stereotype or a property string becomes the default ID if no explicit ID is specified.
__-__-__
label @id
________
The number of underscores and hyphens in the opening line can be arbitrary, except that: there must be at least one underscore between any two hyphens; there must be at least two hyphens; and there must be at least one underscore at the start and end of the line.
The closing line can consist of an arbitrary number of underscore characters.
The usual rule for name autodetection applies.
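The constraints on the opening and closing lines can be expressed as regular expressions. The Python sketch below is one way to check them; the function names is_opening_line and is_closing_line are hypothetical, and requiring at least one underscore in the closing line is an assumption.

```python
import re

# Underscores, then at least two hyphens, each followed by at least one
# underscore. This guarantees underscores at both ends and between hyphens.
_OPENING = re.compile(r"_+(?:-_+){2,}")

# Assumed: the closing line has at least one underscore.
_CLOSING = re.compile(r"_+")


def is_opening_line(line):
    return _OPENING.fullmatch(line.strip()) is not None


def is_closing_line(line):
    return _CLOSING.fullmatch(line.strip()) is not None


print(is_opening_line("__-__-__"))
```

For example, __-__-__ and _-_-_ are accepted as opening lines, while __--__ (adjacent hyphens) and __-__ (only one hyphen) are rejected.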
<b>, <i> and <u> tags but does not act on them.