09 Mar 2017
Github Repository Structures
In the interest of external engagement, and to keep the management of updates
through pull requests fairly simple, the project will use a process similar to
that of many mature Python packages developed on GitHub, such as IPython or
pandas.
The basic concept is that each repository will have just two branches as
follows:
- Master: This is the default branch of GitHub repositories and will carry
the development version of each package. This branch should be stable and
should work with upstream packages, but this is not guaranteed.
- Release(s): This branch will hold the latest stable version of the code.
The branch itself will be labelled with the “major.minor” version number
of the package, and tags will then be used to indicate micro version changes.
Multiple release branches may need to be maintained while upstream
packages “catch up” to the release, but the older releases should have no
further commits made to them.
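The branch layout described above can be sketched in a throwaway repository.
The version numbers, the “v” prefix on the tags and the commit messages are
illustrative assumptions, not conventions confirmed by the project.

```shell
# Sketch only: runs in a temporary repository, not a real DTOcean package.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is "master"
git config user.name "Example Developer"   # placeholder identity for the demo
git config user.email "dev@example.com"

# Development happens on master.
git commit -q --allow-empty -m "Development work for version 1.0"

# Cut a release branch named after the major.minor version (hypothetical: 1.0).
git checkout -q -b 1.0
git tag v1.0.0                             # first micro version of the release

# A micro version bug fix lands on the release branch and gets a new tag.
git commit -q --allow-empty -m "Fix small bug without API changes"
git tag v1.0.1

git branch --list                          # shows the 1.0 and master branches
git tag --list                             # shows the two micro version tags
```

Naming the tags differently from the branch avoids any ambiguity when a name
such as “1.0” is passed to a git command.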
Package Version Numbers
The version numbering of the DTOcean packages generally relates to the amount of
effort required in upstream repositories as a result of changes made
downstream. Some typical examples of changes for the three-level version
hierarchy are as follows:
- Micro version: Small bug fixes that make no API changes
- Minor version: Minor API changes that require small changes upstream;
More significant code changes for new features, performance or accuracy
improvements but which do not significantly alter the results
- Major version: Major API changes that significantly affect other components
of DTOcean or the data model; Changes that significantly alter the results of
the code from new features or changes to existing features.
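As a minimal sketch of the three-level hierarchy, the shell function below
increments one level of a “major.minor.micro” version string. The function
name “bump_version” is hypothetical, and resetting the lower levels to zero is
a common convention assumed here rather than a stated DTOcean rule.

```shell
# Sketch only: bump one level of a "major.minor.micro" version string.
# The function name and argument order are assumptions for illustration.
bump_version() {
    version=$1
    level=$2
    # Split the version on dots into its three components.
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%.*}
    micro=${rest#*.}
    case $level in
        major) echo "$((major + 1)).0.0" ;;
        minor) echo "$major.$((minor + 1)).0" ;;
        micro) echo "$major.$minor.$((micro + 1))" ;;
        *) echo "unknown level: $level" >&2; return 1 ;;
    esac
}

bump_version 1.2.3 micro   # small bug fix, no API change
bump_version 1.2.3 minor   # minor API change or new feature
bump_version 1.2.3 major   # major API change or altered results
```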
Some judgement may be necessary to decide the appropriate level to
increment for a particular change, but the idea is that the hierarchy
reflects the level of impact on upstream packages. Indeed, beyond micro version
changes (which should require no action from upstream), the developers of
upstream packages should be consulted to discuss the likely level of impact.
The updating of a version number in a package should “automatically” trigger
an update of the matching hierarchy version in upstream packages. Although this
cannot really happen automatically for changes other than micro versions,
the process should be instigated once the upstream package developer has been
made aware of the change through the DTOcean Mailing List.
Development Stack
In order to allow developers of modules to evaluate the impact of their changes,
outside of any automated tests, it is necessary to provide a development stack
that can be used to create a “local” version of DTOcean.
To facilitate this, the stable (release) versions of all the DTOcean packages
will be made available through Anaconda Cloud, from which they can be installed
using the “conda” package management system.
Details of how to install and develop the complete DTOcean system can be
found on the dtocean-app GitHub repository page.
Updating Process
All updates to the DTOcean repositories shall be handled through pull requests
and, apart from micro version bug fixes, they should be made against the
master branch. Micro version changes should instead be submitted to
the current release branch.
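The rule for choosing the base branch of a pull request can be written down as
a small shell function. This is only an illustration: the name “target_branch”
and the release branch name “1.0” are hypothetical.

```shell
# Sketch only: choose the base branch for a pull request.
# "target_branch" and the release branch name "1.0" are hypothetical.
target_branch() {
    level=$1            # micro, minor or major
    release_branch=$2   # name of the current release branch
    case $level in
        micro) echo "$release_branch" ;;   # bug fixes go to the release branch
        minor|major) echo "master" ;;      # everything else goes to master
        *) echo "unknown level: $level" >&2; return 1 ;;
    esac
}

target_branch micro 1.0   # prints the release branch name
target_branch minor 1.0   # prints master
```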
In order to cleanly maintain the two public branches, other branches must not
be part of any pull request. Private branches are not discouraged, but when
submitting to the public server they must either be “fast-forward” merged or
“rebased”. This way all changes are maintained in the master branch.
The same process should be followed for changes submitted to the release branch
but those changes may or may not be incorporated into the master branch
depending on its level of development.
This process is detailed in the following subsections, which are adapted from
the descriptions in Nicola Paolucci’s blog post.
Creating a Fork and Cloning
The developer should use a personal fork of the DTOcean repository, rather
than working directly on the official repositories themselves. The process
is started as follows:
- “Fork” the DTOcean repository into a personal repository on Github. This
can be done using the “Fork” button in the top right of the Github page.
- “Clone” the personal fork onto your computer to produce a local working
copy.
Syncing the Fork
It is desirable to keep the fork in sync with the official repository, both
for submitting work and reusing the fork in the future (rather than deleting
and cloning again).
A summary of the syncing process is as follows:
- Add the DTOcean repository as the “upstream” remote.
- “Fetch” changes from upstream.
- Apply those changes to your local master branch.
- Push the updates to GitHub.
In code (the “*” stands for the name of the package being developed):
git remote add upstream https://github.com/DTOcean/dtocean-*.git
git fetch upstream
git checkout master
git rebase upstream/master
git push origin master
Steps 2, 3 & 4 can be repeated as required. Note that we never merge branches
in this approach to repository management.
Creating a branch and working on it
In your personal fork, create a new branch in which to write and test your new
code. DO NOT WORK ON THE MASTER OR RELEASE BRANCHES. If you submit a pull request
from either of these branches your fork will become irrecoverably out of sync
with the DTOcean organisation version.
Do your work on the new branch in your personal fork. You can commit and push
this branch without worry. When the code is ready and tested the branch can
then be prepared for a pull request.
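A minimal sketch of this step is given below. A throwaway repository stands in
for the clone of your fork, and the branch name “feature-x”, the file name and
the commit messages are placeholders.

```shell
# Sketch only: a throwaway repository stands in for a clone of your fork.
work=$(mktemp -d)
cd "$work" || exit 1
git init -q
git symbolic-ref HEAD refs/heads/master    # make sure the first branch is "master"
git config user.name "Example Developer"   # placeholder identity for the demo
git config user.email "dev@example.com"
git commit -q --allow-empty -m "Existing history on master"

# Create the feature branch from master and switch to it.
git checkout -q -b feature-x

# Commit work on the feature branch only; master is left untouched.
echo "new code" > feature.txt
git add feature.txt
git commit -q -m "Add feature x"

git rev-parse --abbrev-ref HEAD            # prints the current branch name

# In a real clone of your fork, publish the branch with:
#   git push -u origin feature-x
```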
Rebase the branch
Before submitting your branch as a pull request, it is useful to rebase
the branch onto the current state of the upstream repository. Assuming that the
feature is being developed on branch “feature-x”, the rebasing process is as
follows:
git checkout feature-x
git fetch upstream
git rebase upstream/master
git push origin feature-x
Assuming there are no conflicting changes on the official repository this
should happen smoothly. Otherwise you may need to resolve a merge conflict; see
“Resolving merge conflicts after a Git rebase” for a description of how to
resolve these. It’s possible you may also need
the “-f” flag for the push, if the rebase has changed the history of your
personal fork.
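The need for the “-f” flag can be reproduced locally. In the sketch below, two
bare repositories stand in for the official repository (“upstream”) and your
fork (“origin”). All paths, file names and commit messages are made up for the
demonstration.

```shell
# Sketch only: local bare repositories stand in for the official
# repository ("upstream") and your personal GitHub fork ("origin").
demo=$(mktemp -d)
cd "$demo" || exit 1
export GIT_AUTHOR_NAME="Example Developer" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"
git init -q --bare upstream.git
git init -q --bare origin.git

# Local working copy with both remotes and a shared first commit.
git init -q work
cd work || exit 1
git symbolic-ref HEAD refs/heads/master
git remote add origin "$demo/origin.git"
git remote add upstream "$demo/upstream.git"
git commit -q --allow-empty -m "Initial commit"
git push -q upstream master
git push -q origin master

# Develop on a feature branch and publish it to the fork.
git checkout -q -b feature-x
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature x"
git push -q origin feature-x

# Meanwhile, a maintainer adds a commit to the official master branch.
git clone -q -b master "$demo/upstream.git" "$demo/maintainer"
(
    cd "$demo/maintainer" || exit 1
    echo "upstream" > upstream.txt
    git add upstream.txt
    git commit -q -m "Upstream change"
    git push -q origin master
)

# Rebase the feature branch onto the updated official master.
git fetch -q upstream
git rebase -q upstream/master

# The rebase rewrote the branch history, so a plain push is rejected...
git push -q origin feature-x 2>/dev/null || echo "plain push rejected"
# ...and the "-f" flag is needed to overwrite the branch on the fork.
git push -q -f origin feature-x
```

Forcing a push is reasonable here only because the feature branch belongs to
you. A more cautious variant is “git push --force-with-lease”, which refuses
to overwrite the remote branch if someone else has pushed to it since your last
fetch.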
Once the rebase is done, a pull request can be made on the GitHub website.
Pull Request
Use the GitHub.com interface to make a pull request for your branch. Switch to the
feature branch and press the “New pull request” button. Make sure that the base
fork points to the official DTOcean repository and that the base branch is
either “master” or a release branch.
Add a short message about the purpose of the pull request and ensure that the
“allow edits from maintainers” box is ticked so that the official repository
maintainer can help. Then click “Create pull request”.
Modification, Acceptance and Tidy-Up
Once the pull request has been submitted, the repository maintainer will review
it and may suggest how it can be improved. They may also work
on the branch itself. If you add commits to the branch, these will automatically
be included in the pull request.
Once the maintainer is happy, the request will be accepted and the branch will
be rebased into the requested official branch. The feature branch will then no longer
exist in the history of the official repository and should be deleted in your
personal copy. This can be started by pressing the “Delete branch” button at
the bottom of the pull request.
Following the branch delete in your personal GitHub repository, the local
repository should be synced to the updated official repository so that you
can see your changes and continue working. The process is as follows:
git checkout master
git fetch upstream
git rebase upstream/master
git fetch origin --prune
git branch -D feature-x
git push origin
This process will align your master (or release) branch with the official
repository and then prune and delete your feature branch. Note that “feature-x”
should be replaced with the name of your feature branch.
Returning to your personal Github repository of the package should now show
the message:
This branch is even with DTOcean:master.
Impact on other DTOcean Packages
The act of making a pull request allows time for review of both the new
code itself and its impact on upstream packages. If it is deemed in the
pull request that some work is required upstream, this work can be scheduled.
In terms of minor changes, and assuming the code passes review, scheduling
upstream changes may be sufficient to allow the pull request to be merged.
However, for major changes, the pull request may be delayed until the upstream
packages are ready or (in extreme circumstances) may be moved to another branch
or repository.
Collaborative Working
The above process might seem to suggest that the only options available are
working alone to create stable code or submitting unstable code in order to
facilitate collaboration. This is not the case: collaboration on a fix or
feature should simply be carried out on a forked repository rather than the
official repository itself. In the forked repositories any number of
collaborators and branches can be utilised.
Changelogs
The project will also endeavour to maintain useful changelogs, as described at
keepachangelog.com, in order to provide a user-friendly description of changes
between versions. An official approach for
maintaining these changelogs is yet to be decided, however.
Further Reading
Further reading about the mechanism for this “Trunk” style updating process can
be found in the following references:
Conclusions
The techniques presented in this blog post represent a new paradigm for the
development of DTOcean. Previously it followed a multi-branch approach with each
team maintaining their own branch before integration. Some mistakes will certainly be
made along the way while we learn to effectively implement these new processes,
but aligning with common practice in open source development will allow the
developer base to grow beyond the original project development team.
23 Feb 2017
A Gentle Renaming
Before commencing a new wave of development I want to describe the purpose of
each of the packages in the DTOcean project. As part of this process of
description it has become clear that some of the names of both the repositories
and the packages themselves could be improved.
One of the main areas of confusion is which of the packages is the principal
one. Although many people think that it is dtocean-core, in fact it’s
dtocean-gui which provides the final user interface. One reason that this
confusion could have occurred is that dtocean-core was in development for
significantly longer than dtocean-gui as it provides the control aspects that
dtocean-gui operates. Indeed DTOcean can be run with dtocean-core alone, but
there will be no graphical aspect other than output graphs.
Thus, to reduce confusion it has been decided to rename dtocean-gui as
dtocean-app. Hopefully, this will make the package appear more important
in the hierarchy than dtocean-core. This and all the other changes to package
names are as follows:
- dtocean-gui to dtocean-app: As explained above.
- pandas-qt to dtocean-qt: This is a fork of the pandas-qt package that
has been modified to work with the DTOcean GUI. To avoid confusion with the
original package (although it is deprecated) the name will be changed.
- dtocean-operations to dtocean-maintenance: The naming here is misleading
as this package deals only with maintenance operations and not, for example,
installation operations, which are dealt with by dtocean-installation.
- dtocean-environmental to dtocean-environment: This is to match the naming
used within the package itself.
In addition to the above changes to the packages, the GitHub repository names
will also be modified to match.
Package Descriptions
For development of the DTOcean tool, it is necessary to understand the
purpose of the 14 packages that make up the project. They can be divided into
4 groups:
- Support: These packages provide basic functionality that may be shared
among many packages.
- Core: These packages control the execution of the design and assessment
modules, control the data flows between the database, user and modules and
provide interactive access for the user.
- Modules: The design modules execute the scientific algorithms.
- Assessments: The assessment modules provide metrics for the outputs of
individual modules or the entire design.
The packages in each of these groups will now be discussed in turn. Further
details can be found in the DTOcean technical manual.
Support Packages
polite
The polite package contains a number of shared functions for working with
configuration files and the Python logging system. It makes deploying and
utilising a logging configuration file extremely “polite”. The package is
utilised by almost all the other DTOcean packages.
dtocean-qt
A fork of the now defunct pandas-qt package, this has been modified to work
with the DTOcean (Anaconda) Qt system and provides some additional functionality
for cherry-picking how pandas tables can be edited, such as allowing new rows
but not columns.
Core Packages
aneris
The package aneris is the underlying data coupling and action scheduling
framework. It provides all the pieces required for the logical structure of
dtocean-core, but it could also be used for other applications in which
generic interfaces share data. It also
allows for extensions through the use of plugins.
dtocean-core
The dtocean-core package applies the functionality of aneris to a stricter
structure. It contains the concept of executing a number of modules in order,
each of which is then immediately assessed by a number of thematic assessments,
both for the module itself and for the global, cumulative state of the data. It
also provides interfaces to the database and to the user in the form of data
input and output and plots. It
uses the plugin architecture of aneris to allow for easy extensions, including
defining advanced execution strategies for optimisation.
dtocean-app
This package provides the Qt4 GUI for interacting with the dtocean-core.
Although it does not contain any additional logic, visualising the data
provides significantly greater insight over using dtocean-core on its own and
increases productivity.
Modules
dtocean-hydrodynamics
The hydrodynamics module for DTOcean. It encapsulates 4 Python modules,
dtocean-wave, dtocean-tidal, dtocean-hydro and dtocean-wec. The first three
provide the solvers and optimisation routines for designing a wave or tidal
array and dtocean-wec provides a tool for creating a complex input for the wave
solver by utilising the open source code NEMOH.
dtocean-electrical
This module develops the electrical network for the array up to and including the
onshore landing point. It also selects an electrically appropriate
umbilical cable should the selected devices be floating.
dtocean-moorings
The moorings and foundations module will design foundations for all devices
and, furthermore, design moorings for floating devices. If an umbilical cable has
been chosen (either by dtocean-electrical or the user) then the module will
adjust the mooring and foundation system so that it is compliant with the
cable.
dtocean-logistics
This module is not exposed to the user but provides all the logistics
logic (such as selections of vessels and ports, scheduling and duration
information) for the installation and maintenance modules.
dtocean-installation
The module generates an installation solution for the device array layout,
electrical and moorings and foundations networks. It also indicates when the
maintenance phase of operations can commence.
dtocean-maintenance
This module generates a maintenance strategy for the array layout and electrical
and moorings and foundations networks. This strategy can be tuned by the user
depending on whether they prefer a calendar based, condition monitoring or
unplanned corrective strategy. The module not only requires dtocean-logistics
but is also dependent on dtocean-reliability to provide the likelihood of
failure of the components and device subsystems. Failure events are estimated
in the time domain and a maintenance schedule and adjusted power production are
produced for the lifetime of the array.
Assessments
dtocean-economics
The first assessment module is used to calculate the levelised cost of energy
for any given input. A lot of additional work is done within the dtocean-core
interface to this theme to create additional outputs for the user.
dtocean-reliability
Generates mean time to failure and other metrics for the networks
produced by the electrical and moorings and foundations modules, or can be
used with custom networks, as is the case with the maintenance module.
dtocean-environment
This assessment generates two numerical environmental impact scores based on
the array design, operations details and inputs from the user. The
first score considers any negative impacts from the array whilst the second
score considers any potential positive impacts.
Interrelationships

Now that the purpose of each package has been briefly discussed, it is
important to understand the relationships between them. Of particular
interest is visualising the impact that changes in one package will have on
those packages that depend on it. A new release of a
package at a low level in the chain could necessitate at least a new release of
packages higher up and potentially require changes in response.
As can be seen from the graph above, a change in any of the design modules may
necessitate a change in both the dtocean-core and dtocean-app packages. As
dtocean-logistics is a dependency for two other modules, changes to it may
require changes in the 4 packages which are upstream of it. Also note that a
change in dtocean-reliability affects 4 packages as it may trigger changes to
dtocean-maintenance.
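The knock-on effect of a change can be traced mechanically by following the
dependency edges. The sketch below does this for the dtocean-logistics chain
only. The edge list is reconstructed from the text rather than taken from the
project, and the function name “affected_by” is hypothetical.

```shell
# Sketch only: each line reads "dependency dependent". The list is partial
# and covers just the dtocean-logistics chain described in the text.
edges="
dtocean-logistics dtocean-installation
dtocean-logistics dtocean-maintenance
dtocean-installation dtocean-core
dtocean-maintenance dtocean-core
dtocean-core dtocean-app
"

# List every package that depends, directly or indirectly, on package $1.
affected_by() {
    printf '%s\n' "$edges" | awk -v start="$1" '
        NF == 2 { n++; dep[n] = $1; user[n] = $2 }
        END {
            seen[start] = 1
            # Repeat passes over the edge list until no new package is found.
            do {
                changed = 0
                for (i = 1; i <= n; i++) {
                    if ((dep[i] in seen) && !(user[i] in seen)) {
                        seen[user[i]] = 1
                        changed = 1
                        print user[i]
                    }
                }
            } while (changed)
        }' | sort
}

affected_by dtocean-logistics   # the four packages upstream of logistics
```

Adding the remaining packages is a matter of extending the edge list; the
traversal itself does not change.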
Missing from the above diagram is polite. As polite is a dependency for almost
all of the other packages, it was too untidy to include it in the graph.
Also, it is more mature than the other packages and therefore less likely to
be changed significantly and require changes from upstream. However, there
remains a risk that a significant change in polite could require new releases
for numerous other packages in the project.
Managing Change
From the interrelationships shown above, it is clear that managing changes in
the DTOcean project is a challenging task. This topic
will be discussed in another blog post where the procedures for developing
the modules will be detailed. The important lesson from this post is that
communication between the developers working on each module is key to
ensuring that changes are beneficial to the entire project.