Linguistic segmentation
Spacing
Segmentation, i.e. the separation of text into "words", can be purely subjective, as this notion only really makes sense for printed matter. To simplify decision-making, we MAY follow the meaning of the text, separating semantic words with a typographic space (" " [U+0020]) whenever possible.
Word segmentation MAY be modernized to ensure homogeneity.
Where modern usage would mark an elision but the source agglutinates the words, the agglutination MAY be transcribed.
When in doubt, word segmentation MAY follow the usage of the contemporary language of the manuscript.
For cases which may be difficult to decide (such as verbs like "enchargier", "en fuir", "en partir", or certain locutions), we MAY stay as close as possible to the source and follow either the dictionary entry or modern usage. For French, such a dictionary MAY be the DMF.
Agglutinations and dragging strokes in cursive writing MUST NOT be imitated in the transcription. Therefore, we MUST transcribe "et en effet" and not "eteneffet" or "et_en_effet", even when the quill was not lifted from the paper.
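The spacing rules above lend themselves to a simple automated check. The following is a minimal sketch, not part of these guidelines: the function names `normalize_spaces` and `find_underscore_joins` are hypothetical, and the sketch assumes that transcriptions are handled as plain Unicode strings, one line at a time. It replaces any run of whitespace with a single U+0020 and reports tokens in which an underscore was used to imitate a dragging stroke. It cannot detect agglutinations such as "eteneffet", which would require a lexicon.

```python
import re


def normalize_spaces(line):
    """Replace every run of whitespace (including no-break spaces)
    with a single typographic space U+0020 and trim the line."""
    return re.sub(r"\s+", " ", line).strip()


def find_underscore_joins(line):
    """Return the tokens that contain "_", i.e. probable imitations
    of a pen stroke linking several words."""
    return re.findall(r"\S*_\S*", line)


if __name__ == "__main__":
    print(normalize_spaces("et\u00a0en  effet"))
    print(find_underscore_joins("et_en_effet il dit"))
```

Flagged tokens still need a human decision: the check only points at places where the transcription probably imitates the layout of the source.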
Hyphenation
Hyphenation refers to the marking of a word that is cut off at the end of a line. It is uncommon in medieval manuscripts but frequent in modern and contemporary sources.
Hyphenation MUST be transcribed whenever it exists in the source.
The transcription MUST NOT add a hyphenation symbol to signal a word break if no hyphenation mark is present in the source.
The character "-" [U+002D] MUST be used to transcribe the hyphenation symbol, whichever symbol is traced on the source.
When the hyphenation symbol is repeated at the beginning of the next line, it SHOULD also be transcribed with "-" [U+002D].
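The mapping of hyphenation marks to "-" [U+002D] can be illustrated with a short normalization pass. This is only a sketch under stated assumptions: the set `HYPHEN_LIKE` is a hypothetical inventory of characters a transcriber might have typed for the mark, and the function name `normalize_hyphenation` is illustrative. Only marks at the very end or very beginning of a line are touched, and surrounding whitespace is trimmed.

```python
# Hypothetical inventory of hyphen-like characters; adapt it to the
# characters actually found in the project's transcriptions.
HYPHEN_LIKE = {
    "\u2010",  # HYPHEN
    "\u2011",  # NON-BREAKING HYPHEN
    "\u2E17",  # DOUBLE OBLIQUE HYPHEN
    "\u2E40",  # DOUBLE HYPHEN
    "\u00AC",  # NOT SIGN, sometimes typed for a line-end mark
}


def normalize_hyphenation(lines):
    """Map hyphen-like marks at line edges to "-" (U+002D)."""
    result = []
    for line in lines:
        text = line.strip()
        # Mark at the end of the line.
        if text and text[-1] in HYPHEN_LIKE:
            text = text[:-1] + "-"
        # Mark repeated at the beginning of the next line.
        if text and text[0] in HYPHEN_LIKE:
            text = "-" + text[1:]
        result.append(text)
    return result


if __name__ == "__main__":
    print(normalize_hyphenation(["trans\u2E17", "\u2E17cription"]))
```

The pass never inserts a mark where the line has none, in keeping with the rule that hyphenation absent from the source is not added.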
Diastoles
Sometimes, diastoles (vertical or oblique pen strokes) are drawn between two contiguous letters to indicate that they belong to different words.
Word-separating diastoles MAY be transcribed with the sign "/" [U+002F].
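For downstream processing, a transcribed diastole can be read as a word boundary. The sketch below is an illustration rather than a prescribed method: the function name `split_on_diastoles` is hypothetical, and it assumes that a "/" standing directly between two word characters is a diastole, while a "/" surrounded by spaces has some other role and is left alone.

```python
import re

# "/" preceded and followed by a word character, with no spaces.
DIASTOLE = re.compile(r"(?<=\w)/(?=\w)")


def split_on_diastoles(line):
    """Replace word-separating diastoles with a space (U+0020)."""
    return DIASTOLE.sub(" ", line)


if __name__ == "__main__":
    print(split_on_diastoles("laterre/est ronde"))
```

Keeping the "/" in the transcription itself and resolving it only in a derived version preserves the evidence that the scribe marked the boundary.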