# Workflow for documentation production in OSM

## Introduction

The process is based on:

- Text-based editing with Markdown
- Further conversion to multiple formats and templates with Pandoc
- Version control with Git, using ETSI's Gitlab

In the editing process there are two roles:

- **Contributors.** Anyone contributing content to a given document, with no special privileges. They cannot push directly to the `master` branch, so they contribute by pushing branches and issuing a _merge request_ (i.e. requesting their integration).
- **Editor/Reviewer.** The person (or people) in charge of integrating (_merging_) contributions into the main document. Consequently, they can push changes directly to the `master` branch. They should also decide how the document is split into files and provide the means to build and convert the document to a given set of target formats and templates.

## Software requirements

### Contributor

Basic software (available on Windows, Mac and Linux):

- [Git](https://git-scm.com/)
- [Pandoc](https://pandoc.org/)
  - On Windows, you may also want to install [MiKTeX](https://miktex.org/) to support conversion to PDF with Pandoc.
- Markdown editor(s). Recommended (both):
  - Integrated: [Visual Studio Code](https://code.visualstudio.com/)
    - The integrated editor supports version control operations graphically, so that common operations do not necessarily require CLI commands.
  - WYSIWYG: [Typora](https://typora.io/)

### Some recommended extensions for Visual Studio Code

- `bat67.markdown-extension-pack`
- `yzhang.markdown-all-in-one`
- `DavidAnson.vscode-markdownlint`
- `satokaz.vscode-markdown-header-coloring`
- `wayou.vscode-todo-highlight`
- `streetsidesoftware.code-spell-checker`
- `bierner.github-markdown-preview`
- `shd101wyy.markdown-preview-enhanced`
- `geeklearningio.graphviz-markdown-preview`
- `eamodio.gitlens`
- `donjayamanne.githistory`
- `bierner.markdown-checkbox`
- `bierner.markdown-emoji`
- `bierner.markdown-mermaid`
- `bierner.markdown-preview-github-styles`
- `bierner.markdown-yaml-preamble`
- `mrmlnc.vscode-remark`
- `csholmq.excel-to-markdown-table`
- `DougFinke.vscode-pandoc`
- `fcrespo82.markdown-table-formatter`
- `yzane.markdown-pdf`

To install extensions in VSCode you can follow any of the procedures described in the [VS Code guide for extensions](https://code.visualstudio.com/docs/editor/extension-gallery).

This document describes all the Git-based procedures as CLI commands to minimize any potential ambiguity. However, all the actions required of contributors, and the most common actions for the editor, can be conveniently performed from the integrated editor, following the same steps in a visual manner. Once you have read this document, it is highly recommended to check this [visual guide](https://code.visualstudio.com/docs/editor/versioncontrol), which shows how to perform all the operations using the editor GUI.

Note that the steps required for each of the procedures described throughout this document are exactly the same, and should happen in the same order, for both the integrated editor and the CLI. Furthermore, both means are fully interchangeable, so you can combine steps performed from the editor with steps performed from the CLI at your discretion, with no risk of confusing Git (at the end of the day, the editor also invokes the `git` command behind the scenes).

## Markdown conversion basics

The basic syntax of Pandoc is rather simple. In the simplest cases, it just requires an input file and an output file name, and it will determine the formats based on the extensions.

For instance, this command converts from Markdown to MS-Word format:

```bash
pandoc MyDocument.md -o MyDocument.docx
```

It is also possible to merge several input files into one output file:

```bash
pandoc MyDocument1.md MyDocument2.md -o MyDocument.docx
```

For some target formats, such as HTML, you would need to specify whether you want to produce a fragment (default) or a complete standalone document (using the `-s` switch):

```bash
pandoc -s MyDocument.md -o MyDocument.html
```

In the case of MS-Word target format, a highly useful feature is specifying an existing document (usually based on a specific format template) as reference, so that Pandoc can produce the output document based on the same template:

```bash
pandoc MyDocument.md --reference-doc=referenceDocument.docx -o MyDocument.docx
```

Please note that the reference must be a valid standalone DOCX document, not an MS-Word template (i.e. DOT/DOTX files are not valid references for Pandoc).

## Git settings

Reference: https://osm.etsi.org/gitlab/osm_doc/test#repo-command-line-instructions

```bash
git config --global user.name "myEOLusername"
git config --global user.email "myusername@mycompany.com"
```

To check that the Git configuration was applied successfully, you can run:

```bash
git config --list
```
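
Since `git config --list` prints every setting, a narrower check can be handy. The following is a minimal sketch, assuming a helper named `check_git_identity` (a hypothetical name, not part of Git), that reports only the two identity values set above:

```bash
# Hypothetical helper (not a Git command): report whether the global
# identity used for commits is configured.
check_git_identity() {
  status=0
  for key in user.name user.email; do
    # 'git config --get' exits with a non-zero code when the key is unset
    if value=$(git config --global --get "$key"); then
      echo "$key = $value"
    else
      echo "$key is not set" >&2
      status=1
    fi
  done
  return "$status"
}

check_git_identity || echo "Run the 'git config --global' commands above first."
```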

| NOTE  |
|---|
| Your email address will be visible on commits to Git. |
| If you want to keep your email address private, you can mask your `user.email`, but it is highly advisable to preserve your company name. E.g. instead of `myuser@company.com`, you can set it to `hidden@company.com`. |

It is also highly advisable to add an SSH key to [your profile keys](https://osm.etsi.org/gitlab/profile/keys) so that your password is no longer requested.

## Guide for contributors

### Step 0: Setup the repository in your local environment for the first time

#### If the repository already exists in ETSI's Gitlab but not locally

```bash
git clone ssh://git@osm.etsi.org:29419/osm-doc/test.git
```

In some cases (e.g. when a firewall blocks the SSH port), you can fall back to HTTPS access:

```bash
git clone https://osm.etsi.org/gitlab/osm-doc/test.git
```

#### If the repository already exists in ETSI's Gitlab and the local copy is not up-to-date

```bash
git pull origin master
```

### Step 1 (**IMPORTANT**): Create a local branch

The branch should be based on an up-to-date version of the content in the repo's `master`. If you are unsure, you should run:

```bash
git pull origin master
```

Then create a branch and move to it in order to continue. In this example, we will create a branch called `BranchNewSection` to store a new section that you are working on:

```bash
git checkout -b BranchNewSection
```
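
Before you start editing, it may be worth confirming that you are no longer on `master`. A minimal sketch, assuming a helper called `on_master` (a hypothetical name introduced here for illustration):

```bash
# Hypothetical helper: succeeds only when the current branch is 'master'.
# Outside a Git repository it simply reports "not on master".
on_master() {
  [ "$(git symbolic-ref --short HEAD 2>/dev/null)" = "master" ]
}

if on_master; then
  echo "WARNING: you are on 'master'; create or switch to your branch first." >&2
else
  echo "OK: not on 'master'."
fi
```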

#### NOTE: What to do if you committed changes to `master` by mistake

If you forgot to create your branch (or to switch to it) and made a number of commits directly to your `master` branch by mistake, you will need to move them to a branch, or you will be unable to work with your remote origin.

You can still follow some steps to fix your Git history and move the commits to the right place.

Note that **this procedure needs to be followed exactly as it is described**, or it might lead to unintended loss of commits. In particular, you should understand that:

- You are **rewriting** your Git history and, therefore, might delete wrong entries if you make any mistake. Please read these steps carefully and follow them accurately.
- Any changes not committed before starting this procedure will be lost. If they are relevant for your work, you should commit them (to `master`) as well before starting this procedure.

As said, you can follow this procedure (assuming that you were initially in your `master` branch):

```bash
git branch newbranch              # Create a new branch, saving the desired commits
git reset --hard origin/master    # Move master back to the latest commit common to the remote
git checkout newbranch            # Go to the new branch that still has the desired commits
```
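
If you would rather rehearse the procedure before applying it to real work, the following sketch reproduces the situation in a throwaway sandbox. Everything in it (the temporary directory, the bare repository acting as remote, the file `chapter.md` and the commit messages) is an assumption made for the demonstration and has no relation to the actual OSM repositories:

```bash
set -e
demo=$(mktemp -d)

# A bare repository plays the role of the remote (sandbox assumption)
git init -q --bare "$demo/remote.git"
git clone -q "$demo/remote.git" "$demo/work" 2>/dev/null
cd "$demo/work"
git symbolic-ref HEAD refs/heads/master   # make sure the branch is 'master'
git config user.name "Demo Contributor"
git config user.email "hidden@company.com"

# One commit that legitimately belongs to the remote 'master'
echo "# Chapter" > chapter.md
git add chapter.md
git commit -q -m "Initial commit"
git push -q -u origin master

# The mistake: a commit made directly on the local 'master'
echo "New section" >> chapter.md
git commit -q -am "New section (committed to master by mistake)"

# The recovery procedure described above
git branch newbranch
git reset -q --hard origin/master
git checkout -q newbranch

echo "Commits in master:    $(git rev-list --count master)"
echo "Commits in newbranch: $(git rev-list --count newbranch)"
```

At the end, `master` holds only the commit shared with the remote, while `newbranch` keeps both commits and is the current branch.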

### Step 2: Local edition and commits

In this phase, you can edit your files as you would normally do.

From time to time, you might want to "save" snapshots of your changes. This process has two steps:

- Putting the modified files in your _staging area_.
- Once you have all you need in your _staging area_, creating a local _commit_ out of this set of changes.

You can add files to your _staging area_ as needed. For instance, this would add a file called `MyFile.md`:

```bash
git add MyFile.md
```

If you want to add all the files modified since the last commit, you can run:

```bash
git add .
```

Once you are ready, you can create a commit (which serves as a kind of local "snapshot" of your changes):

```bash
git commit -m "My message to remember later on what was included in this commit"
```

Then you can continue editing until you are ready to share a contribution.

### Step 3: Contribute your branch and make a merge request

When you are ready, you can push your branch to the remote repo:

```bash
git push origin BranchNewSection
```

Once you have pushed your branch to the remote repo, you should **inform the editor by making a _merge request_**. This can be easily done in the GitLab web interface, in the _Merge Requests_ area (on the left sidebar). Your recently pushed branch will be available there to be selected for your new merge request.

When creating the merge request, **select the _Delete source branch when merge request accepted_ option** so that the source branch is deleted when the merge request is merged.

### Step 4: Result of the review

Following your _merge request_, the editor will review the changes proposed in your branch and decide whether to merge them or to comment on them.

If your contribution was **commented** by the editor, you should try to address the comments in your branch and push them again if appropriate (in case the contribution has not been entirely rejected by the editor).

If your contribution was integrated (**merged**) by the editor, the result will be available in the main editing branch (i.e. `master`) as a new commit. Therefore, you should update your local repository (`pull`) before attempting further edits:

```bash
git checkout master
git pull origin master
```
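
If the source branch was deleted on the remote when the merge request was accepted, your local copy of that branch is no longer needed either. The following is a minimal sketch of a clean-up helper; `finish_contribution` is a hypothetical name, and the sketch assumes the contribution was merged as regular commits (with a squashed merge, `git branch -d` would refuse to delete the branch):

```bash
# Hypothetical helper: refresh 'master' and remove a branch that has
# already been merged. The branch name is passed as the first argument.
finish_contribution() {
  branch="$1"
  git checkout master &&
  git pull origin master &&
  git branch -d "$branch" &&   # -d refuses to delete unmerged work
  git fetch --prune origin     # drop references to branches deleted remotely
}
```

For the branch used in this guide, the call would be `finish_contribution BranchNewSection`.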

## Guide for editors

### Initial setup of the repository

#### If you start from a local folder and the repo in ETSI's Gitlab is still empty (initial load of info)

```bash
cd existing_folder
git init
git remote add origin ssh://git@osm.etsi.org:29419/osm-doc/test.git
git add .
git commit -m "Initial commit"
git push -u origin master
```

#### If the repository in ETSI's Gitlab already exists and may have valuable information

Check the corresponding sections of the [Guide for contributors](#step-0-setup-the-repository-in-your-local-environment-for-the-first-time)

### Edition and processing of open merge requests

The editor can view all the pending merge requests within a project by navigating to **Project > Merge Requests** in the GitLab web interface of the document's project:

![merge-requests-in-gitlab](assets/merge-requests-in-gitlab.png)

Each merge request is associated with a new branch, which can also be checked by navigating to **Repository > Branches**:

![branches-in-gitlab](assets/branches-in-gitlab.png)

For instance, for a group of committers (Group) called `osm-doc` and a project called `documentation-how-to`:

- The list of pending Merge Requests would be at `https://osm.etsi.org/gitlab/osm-doc/documentation-how-to/merge_requests`
- The list of branches would be available at `https://osm.etsi.org/gitlab/osm-doc/documentation-how-to/branches`

First, review the contribution online and, if it needs to be rejected, include comments where applicable.

If the contribution is acceptable for a merge, the merge process can be conducted either on the website (particularly if it is a simple review) or in the editor's local repo. The following steps describe the most generic case of using the local repo, although the process would be analogous in the web interface.

First, update your local repository to retrieve the branch(es) with the pending _merge request(s)_:

```bash
git pull --all
```

Then, launch the merge operation:

```bash
git checkout master
git merge origin/branchwithcontribution   # merge the remote-tracking branch retrieved above
```

If completed successfully, mark the merge request as approved on the website (**Project > Merge Requests**) and remove the branch if asked (this will clean up the Git tree in GitLab by removing unnecessary temporary branches).

If applicable, update the local repository with a `git pull`.

If the merge does not conflict with the `master` branch, Git will resolve it smoothly (as a _fast forward_ or as a _recursive_ merge) and integrate it properly into `master`.

However, sometimes there might be colliding text lines between `master` and the contribution in the branch. In those cases, Git will request some manual intervention from the editor to proceed to a more careful merge.

#### How to resolve merge conflicts

When there is a _merge conflict_, Git leaves a special edited version of the conflicting file in the local working directory, which will include both conflicting proposals for a given line, one after the other, leaving the final result at editor's discretion.
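
To see what such a file looks like, the following sketch provokes a conflict in a throwaway repository. The directory, the file `chapter.md`, the identity and the text lines are all assumptions made for the demonstration:

```bash
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git symbolic-ref HEAD refs/heads/master   # make sure the branch is 'master'
git config user.name "Demo Editor"
git config user.email "hidden@company.com"

echo "Original line" > chapter.md
git add chapter.md
git commit -q -m "Initial commit"

# The contributor changes the line in a branch...
git checkout -q -b branchwithcontribution
echo "Line proposed by the contributor" > chapter.md
git commit -q -am "Contribution"

# ...while the same line is also changed in 'master'
git checkout -q master
echo "Line changed in master" > chapter.md
git commit -q -am "Change in master"

# The merge stops with a conflict; '|| true' lets the script continue
git merge branchwithcontribution || true

git status --short   # lists chapter.md as unmerged
cat chapter.md       # both versions, delimited by conflict markers
```

In the resulting file, the version from `master` appears between the `<<<<<<<` and `=======` markers, and the version from the branch between `=======` and `>>>>>>>`. Resolving the conflict means editing the file so that only the desired text remains, with no markers left.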

Once solved by the editor (manually), the file can be added, committed and pushed again as normal:

```bash
git add conflictingfile.md
git commit
git push -u origin master
```

However, if you end up unsure about your attempt to resolve the conflicts in the editor, it is always possible to revert the merge process (so that it can be restarted later) with:

```bash
git merge --abort   # only possible while the merge is still in progress (not yet committed)
```

### Some recommendations for editors

#### Markdown style

Although there is broad convergence around Markdown, some of the most advanced features of the language are not fully standardized and may differ slightly among Markdown flavours.

In order to minimize potential issues due to ambiguity of format among contributors, in case of doubt, **GitHub Flavored Markdown (GFM)** is highly recommended, given its prevalent popularity and support by most major editors, renderers and viewers.

#### File structure and document build

It is highly advisable that the editor provides a base file structure for the document. There are some best practices that apply:

- Different chapters (i.e. level-1 titles, with a single hash, **`#`**) of the document should be in different Markdown files. Likewise, no more than one level-1 title should be placed in the same file.
  - Use **`TODO:`** as a tag (including the colon, **`:`**) to indicate (in free text) the content expected for a specific section or any pending refinements to be added at later stages. This text will be visible to readers and will be rendered in the output formats, so editors should use it wisely to provide sufficient context about the missing content without adding excessive noise for the reader.
  - It is highly recommended that all Markdown documents end with an empty line, so that the build process is trivial for `pandoc`.
- Images should be placed in a dedicated folder. The recommendation is to use the `./assets` folder.
- Likewise, DOCX templates, auxiliary files, output files, etc. should be placed in dedicated folders.
- File and folder names should not contain spaces or any other kind of special characters, to maximize portability and minimize rendering or reading errors in various auxiliary tools. To improve readability, a useful convention is using dashes (`-`) to replace spaces.
- The editor should add to the base folder a simple **script to build the document**, so that there is no ambiguity when testing the results.
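
As an illustration of the last recommendation, a build script could be as small as the sketch below. It is not the script of any existing OSM document: the function name `build_document` and the environment variables `REFERENCE_DOC` and `DRY_RUN` are assumptions introduced here, and only Pandoc options already shown in this guide are used.

```bash
#!/bin/sh
# build.sh (hypothetical name): sketch of a document build script.
# Usage: build_document OUTPUT_FILE CHAPTER.md [CHAPTER.md ...]
# Optional environment variables (assumptions of this sketch):
#   REFERENCE_DOC  DOCX file passed to --reference-doc
#   DRY_RUN=1      print the pandoc command instead of running it
build_document() {
  output="$1"
  shift
  if [ -n "${REFERENCE_DOC:-}" ]; then
    set -- "$@" --reference-doc="$REFERENCE_DOC"
  fi
  # -s produces a standalone document; chapters are merged in the given order
  set -- pandoc -s "$@" -o "$output"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

With chapter files named, for example, `01-introduction.md` and `02-architecture.md` (hypothetical names), the script could end with a call such as `build_document output/MyDocument.docx 01-introduction.md 02-architecture.md`. Keeping the list of chapters in one place makes the merge order explicit for every contributor.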

TODO: Create a sample template with all recommendations above, so that it can be easily reused by new editors. Meanwhile, the [EUAG whitepaper on OSM scope](https://osm.etsi.org/images/OSM_EUAG_White_Paper_OSM_Scope_and_Functionality.pdf) can be used as reference and adapted without much effort. Source files can be obtained by cloning the repository:

```bash
git clone ssh://git@osm.etsi.org:29419/osm-euag/osm-scope-white-paper.git
```

Or, alternatively, from regular OSM repos:

```bash
git clone https://osm.etsi.org/gerrit/osm/DOC.git
```

If you plan to adapt it over the same directory, **remember to change the remote URL to point to the appropriate repo** to avoid pushing commits to the wrong location. In our example:

```bash
git remote set-url origin ssh://git@osm.etsi.org:29419/osm-doc/test.git
```

#### Line length limitation and minimization of conflicts

A useful peculiarity of Markdown is that any paragraph of the final document can be expressed using multiple Markdown lines, since a single carriage return does not create a new paragraph (2 carriage returns are needed to separate paragraphs). This property, among other reasons, is intended to improve readability in some environments.

Therefore, while the most natural and simplest way to produce and edit Markdown files in a Git context may be to write the text as with any other text editor (leading to a single Markdown line per paragraph), this singularity of Markdown syntax can be used in the editor's favour for files, sections or paragraphs that might be subject to multiple edits and revisions. Since Git detects conflicts line by line, changes made by different contributors to different lines of the same paragraph can be merged automatically. Thus, in those cases, it would be highly convenient for the editor to split critical paragraphs into lines no longer than a maximum length (typically, **79 characters**), so that merge conflicts between contributors are minimized.

If a whole file is in that situation, `pandoc` offers the editor a shortcut for automatic conversion:

```bash
pandoc busychapter.md -t gfm --columns=79 -o busychapter-wrapped.md
```

And then, replace the original file by the new one, so that new contributors can take it as a reference:

```bash
cp busychapter-wrapped.md busychapter.md
rm busychapter-wrapped.md
git add busychapter.md
git commit -m "File busychapter.md wrapped to 79 characters per line"
git push origin master
```
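
To verify the result, or to spot other files that still contain long lines, a small check can help. This is a sketch assuming a helper called `long_lines` and an optional `MAX_COLUMNS` variable (both hypothetical names). Some lines may legitimately remain longer than the limit even after wrapping, such as long URLs, tables or code blocks:

```bash
# Hypothetical helper: list lines longer than MAX_COLUMNS (default 79)
# in the given files, as "file:line: N characters".
long_lines() {
  awk -v max="${MAX_COLUMNS:-79}" \
    'length($0) > max { printf "%s:%d: %d characters\n", FILENAME, FNR, length($0) }' "$@"
}
```

For instance, `long_lines busychapter.md` prints nothing when every line of the file fits within the limit.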

Whenever the editor considers that the document is stable again, or prefers editing with one line per paragraph, the previous rewriting can be easily reverted with Pandoc:

```bash
pandoc busychapter.md -t gfm --wrap=none -o busychapter-nowrap.md
```

And then:

```bash
cp busychapter-nowrap.md busychapter.md
rm busychapter-nowrap.md
git add busychapter.md
git commit -m "File busychapter.md now allows unlimited line lengths"
git push origin master
```

**WARNING:** Please make sure that **any of these commits with massive changes in the line wrapping convention is properly announced to contributors**, so that they do not create new contributions based on the former line convention (in branches derived from older commits), or they would be really challenging to handle as merge conflicts (**all the converted lines would be different!**).