Changes

Jonathan Hanks · 7ae0049f
--- a/Code-Guidelines-and-Standards.md
+++ b/Code-Guidelines-and-Standards.md
+# Abstract #
+This document outlines some guidelines for software in CDS.  It is an aspirational document that is at least partially implemented at this time.  It is targeted towards the full-time software development effort.
+
+This document is certainly not complete and is also certainly not structured correctly.
+
+# Code Formatting #
+Code should be formatted consistently.  Code should be mechanically formatted.
+This helps other readers jump into a code base without having to sort through various styles of brace placement and indentation; it removes one level of noise.
+
+## C/C++ Formatting ##
+Formatting is done via clang-format.  The canonical definition of the style is found in the advligorts source code in the .clang-format file.
+
+_Note:_ Not all of the CDS code has been, or can be, reformatted at this time.  Some code with complex #ifdefs will need to be checked after reformatting and will require better test suites.
+
+_Note:_ This format definition comes out of work with LDAS in the framecpp & nds code bases, and is an effort to become more consistent across the LIGO lab.
+
+## Python Formatting ##
+Python code should be formatted according to PEP 8.
+
+## Go Formatting ##
+Go code should be formatted according to go fmt.
+
+_Note:_ Go prefers tabs to spaces.  That is OK; it is the idiomatic way for this language.
+
+## Perl/TCL/... ##
+TBD.  Replace with Python if feasible, given the knowledge and experience base in the lab.
+
+# Naming Conventions #
+It would be good to standardize on something here.  This is up for discussion and agreement.
+
+Some notes:
+ * Snake case for Python, C, C++
+ * Camel case for Go.
+   * Language idiom
+   * Must respect language rules for exported symbols
+
+# Build Systems #
+## C/C++ use CMake by default ##
+CMake is the current industry standard.  This should be the preferred system to build C/C++ code.  It provides the ability to build across Windows, OSX, and Linux systems.
+
+Newer/modern CMake idioms should be used.  This means target-based commands instead of variable-based declarations.
+
+There are places, such as building Linux kernel modules or the RT models, where CMake does not make sense and other build systems are used.
+
+# Version Control #
+Put code in version control.
+
+Prefer git and hosting on git.ligo.org.
+
+For development use topic and issue branches.
+Do not be afraid to commit WIP (work in progress) code to your branches.  It is better to save items and make sure we have a history than to have things perfect.
+
+Keep the master/head/main branch clean and working.
+
+# Code Review #
+After a project leaves an initial proof of concept stage we should be reviewing updates to the code.
+
+Reviews force us to write better code as we know that someone else will look at the code.  Reviews also act as a way to train other team members on a code base.
+
+By using a git workflow with merge requests we can formalize a review.
+
+# Software Licensing #
+The lab policy on software licensing is at https://dcc.ligo.org/M1500244/.
+
+# Packaging CDS Software #
+
+We have four primary packaging targets for CDS software:
+ 1. Native Debian packages
+ 1. Native Scientific Linux 7
+ 1. Native Rocky Linux 8
+ 1. Conda
+
+_Note:_ Conda is expected to replace the SL7 & Rocky 8 packaging.
+
+## Software that should be natively packaged ##
+We should put all software required to set up a basic test stand or IFO into Debian packages.  This includes (but is not limited to):
+
+ * Custom Linux kernels
+ * DKMS packages
+ * RCG
+ * System services
+   * DAQD
+ * Required interconnect software (Dolphin, cps_xmit/recv, local_dc)
+ * Basic data tooling
+   * DTT, Ndscope
+ * Supporting libraries and software from the LSC
+   * LDASTools/FrameCPP
+   * NDS client
+   * Root
+ * Core control and monitoring software
+   * EPICS, medm/caqtdm, StripTool, Guardian
+
+The basic idea is that a site can point to the Debian OS repositories + the CDS repositories and get a basic working system.
+
+## Software that should be packaged in Conda ##
+Additional software should probably be put into Conda.  Note there is overlap with native packages; Conda will have a core of workstation software as well.
+ * EPICS
+ * medm/caqtdm
+ * StripTool
+ * DTT
+ * Ndscope
+ * Gwpy
+ * …
+
+_Note:_ this needs to be fully done and integrated at LHO.
+
+# Units of measurement #
+
+## Code must track units by name ##
+
+A variable or constant that represents a value with a unit of measurement must specify the units at the end of the name, e.g. 'time_s', and optionally the units can be specified in the name of the type of the value.
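
A minimal Python sketch of this naming rule follows. The type names `Seconds` and `Nanoseconds`, the constants, and the function `remaining_s` are hypothetical and chosen only for illustration. `typing.NewType` is one possible way to put the unit in the type name as well; it is enforced by static type checkers, not at run time.

```python
from typing import NewType

# Hypothetical unit-carrying type names.  NewType is checked by static
# type checkers (for example mypy); at run time the values are plain numbers.
Seconds = NewType("Seconds", float)
Nanoseconds = NewType("Nanoseconds", int)

# The unit suffix at the end of each name tells the reader what the number means.
TIMEOUT_S = Seconds(2.5)
MAX_LATENCY_NS = Nanoseconds(500_000)


def remaining_s(deadline_s: Seconds, now_s: Seconds) -> Seconds:
    """Return the time left before the deadline, in seconds."""
    return Seconds(deadline_s - now_s)
```

With the suffixes in place, an expression such as `TIMEOUT_S + MAX_LATENCY_NS` looks wrong at a glance, before any tool is run.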
+
+## Code must consistently use the same units. ##
+
+Internally, all values of the same dimension used for the same purpose should use the same units.  Ideally, all values across the code base should use the same units.  Practically, different units will sometimes be used for different purposes.  It's reasonable to use 'time_s' and 'latency_ns' in the same code base.  When different units are used for the same dimension, they must be from the same measurement system.  
+
+## Use SI units ##
+
+Values should be in some SI unit, including Celsius or Kelvin for temperature, seconds, meters, coulombs, kg etc, their standard combinations, and picking appropriate scaling at the designer's discretion.
+
+Units should be used in a broader sense than physical units of measurements. For example, 'counts' for ADC or DAC counts is a useful unit for differentiating from 'volts' values.
+
+## Converting to other units ##
+
+Code must convert to other units from internal units only during I/O  for user convenience, file format or similar reasons.
+
+Inputs must be converted to internal units before any other processing is done on the values.
+
+Conversions between two units should be handled by a single function or macro (allowing for overloading for different types).  Ad hoc conversions must be avoided.
+
+In C++, “strong” types can be used to represent units and prevent unintentional or incorrect conversions.
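
The conversion rules in this section can be sketched in Python as follows. This is an illustration under assumptions, not existing CDS code: the internal unit is taken to be seconds, the external format is taken to carry integer nanoseconds, and every function name here is hypothetical.

```python
NS_PER_S = 1_000_000_000  # exact, by definition of the SI prefix "nano"


def ns_to_s(value_ns: int) -> float:
    """The single place where nanoseconds become seconds."""
    return value_ns / NS_PER_S


def s_to_ns(value_s: float) -> int:
    """The single place where seconds become nanoseconds."""
    return round(value_s * NS_PER_S)


def parse_latency_field(raw_text: str) -> float:
    """Read a latency given in ns (hypothetical input format) and convert it
    to the internal unit, seconds, before any other processing is done."""
    latency_ns = int(raw_text)
    return ns_to_s(latency_ns)


def format_latency_field(latency_s: float) -> str:
    """Convert back to ns only when writing output."""
    return str(s_to_ns(latency_s))
```

All other code works in seconds only; the factor `NS_PER_S` appears in exactly two functions, so there is one place to review and test.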
+
+# Architecture #
+## Architect code to be tested ##
+Our code must be tested.
+
+Old code without test suites should receive integration tests, plus tests for any new code added to it.
+
+New code should be designed and submitted with tests.  The ability to test your code will impact the design and layout of code.
+
+## General Traits of Tests ##
+
+Tests should provide good code coverage.
+
+Tests that are easy to set up and run are run more often.
+
+Tests that run quickly are run more often.
+
+In the end the only test that matters is the test that is run.
+
+## Speed in Tests ##
+Design your tests to run quickly.  As noted above, the faster and easier tests are to run, the more they are used.
+
+This includes doing things like mocking/stubbing out timers and input sources.
+
+## Use simulation as needed to set up tests ##
+Use simulation to replace sources that are hard to set up, expensive, or rarely available.
+
+As an example, the daqd system and transport modules use simulated FE data to do end-to-end tests that track the integrity of the data and daqd functions from a model's shared memory buffer to frame or nds output.  This allows the daqd to be tested without having to have a real time system installed.
+
+Some things really need full hardware setups, but much can be done without a full test stand setup to exact requirements.
+
+## Unit tests can be done in complex situations with mocks and simulation ##
+Complex situations can be tested both in integration tests, and unit tests with some creativity.
+
+In C++, templates or object hierarchies can be used to replace network streams, timers, ... with functionally equivalent inputs.
+
+In Python, functions can be extended to take clocks, sockets, … with default parameters that resolve to the regular ‘physical’ objects used in production and are then overridden with mocks in testing.
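
A minimal sketch of the Python approach, assuming a polling helper that needs a clock and a sleep function. The names `wait_until_ready` and `FakeClock` are hypothetical. The defaults are the real `time.monotonic` and `time.sleep`; a test passes in a fake so that no real time elapses.

```python
import time
from typing import Callable


def wait_until_ready(
    is_ready: Callable[[], bool],
    timeout_s: float,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
    poll_interval_s: float = 0.1,
) -> bool:
    """Poll is_ready() until it returns True or timeout_s has elapsed."""
    deadline_s = clock() + timeout_s
    while clock() < deadline_s:
        if is_ready():
            return True
        sleep(poll_interval_s)
    return False


class FakeClock:
    """Test double: time advances only when sleep() is called."""

    def __init__(self) -> None:
        self.now_s = 0.0

    def __call__(self) -> float:
        return self.now_s

    def sleep(self, duration_s: float) -> None:
        self.now_s += duration_s
```

In a test, `fake = FakeClock()` followed by `wait_until_ready(check, 5.0, clock=fake, sleep=fake.sleep)` exercises the full timeout path without waiting five real seconds. Production callers omit the last arguments and get the real clock.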
+
+## Testing Tools ##
+CMake integrates testing via the ctest system.  Any tests integrated here can be run with the build system.
+
+### C/C++ Catch/Boost Test ###
+Catch is a simple/lightweight unit testing framework.  It is used in parts of the daqd and other software.
+
+Boost.Test is used by LDASTools.
+
+### Python ###
+Use the Python testing framework.
+
+### Go ###
+Go has a built-in test/benchmark (and soon fuzzing) framework.  It should be used for unit tests.
+
+In addition GoConvey is a useful tool/runner for reloading and running your tests as you save files.
+
+# API Design #
+Design APIs to keep objects in known good states.
+
+Design APIs so they are easy to use right and hard to use wrong.
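
One way to read these two rules, sketched in Python with a hypothetical `SampleRate` class: validate in the constructor and make the object immutable, so that an instance in a bad state cannot exist.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleRate:
    """A sample rate that is validated once and cannot change afterwards."""

    rate_hz: int

    def __post_init__(self) -> None:
        # Reject bad values at construction so no invalid instance can exist.
        if self.rate_hz <= 0:
            raise ValueError(f"rate_hz must be positive, got {self.rate_hz}")

    def period_s(self) -> float:
        # Safe: the constructor guarantees rate_hz > 0.
        return 1.0 / self.rate_hz
```

Callers cannot forget to validate, and methods such as `period_s` do not need defensive checks, because the only way to obtain a `SampleRate` is through the checked constructor.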
+
+# Documentation #
+## Doxygen for C/C++ code ##
+
+# Static types to let the compiler/IDE help #
+Use static types when possible to help prevent issues.
+
+In C++ this includes enum classes and wrapping types in “strong” classes as needed.
+
+In Python this includes:
+ * Named parameters
+ * Parameter type annotations
+ * Creating classes or named collections instead of passing dictionaries around
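
The three Python points above can be sketched together. The class `ChannelConfig`, its fields, and the function `describe_channel` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelConfig:
    """Named collection used instead of a free-form dictionary."""

    name: str
    rate_hz: int
    units: str = "counts"


def describe_channel(config: ChannelConfig, *, verbose: bool = False) -> str:
    """Describe a channel; verbose is keyword-only, so call sites must name it."""
    if verbose:
        return f"{config.name} @ {config.rate_hz} Hz [{config.units}]"
    return config.name
```

A misspelled field such as `config.rate_hx` is flagged by an IDE or type checker, whereas `config["rate_hx"]` on a dictionary fails only at run time. The call `describe_channel(cfg, True)` is rejected, which keeps bare boolean arguments out of call sites.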
+
+# When not to use external libraries #
+Generally we want to take advantage of external libraries to provide functionality.  There are times when we may want to do an internal re-implementation of a concept.
+
+If we are creating a base library that will be consumed by outside applications and libraries, we may want to keep our dependency list short so as not to inflict our library choices on downstream users.  For example, this is why the nds2 client does not make use of the Boost C++ libraries: doing so could constrain downstream users to the same version of Boost that it was built with.
+
+# Interfacing with multiple languages #
+
+## SWIG ##
+SWIG is useful when interfacing C/C++ code with many different languages.
+
+## PyBind11 ##
+When a C++ library need only be exported to Python (and not to Java/Matlab/…), PyBind11 may be a better choice, as it provides good integration and keeps the bindings in C++ instead of bringing another language (the SWIG system) into the mix.
+
+## ROOT ##
+Some older applications use a C++ interpreter built into ROOT to expose C++ classes to Python.  **Future use of ROOT is discouraged**.
+
+# Language Specific Notes #
+## C++ ##
+Prefer the std library.  There are cases when it will not provide a good implementation, but when it can be used it provides common vocabulary and generally well tested implementations.
+
+Prefer ranges and container aware systems if possible.  Iterators are powerful, but can be mixed up.  This includes using boost::ranges or std::ranges when it is available in our compilers.  Note this is largely not done in our code.
+
+We have to be cognizant of the C++ version we use.  Because some of our code is natively packaged, for now we need to track Debian GCC releases.
+
+Use RAII and other resource managers (smart pointers) to manage anything that needs to have cleanup.  We should see initialization code but not cleanup code as it is done in our destructors.
+
+New/delete, malloc/free, open/close are signs that we need to wrap something.
+
+# Preprocessor in C/C++ #
+Use typed constants instead of #defines where possible.
+
+Try to minimize the use of the preprocessor.  We have too much #ifdef soup.
+
+When #ifdefs are needed, try to keep them to a single chain at the top of a file, with the typedefs and wrapper functions defined there, so that the rest of the file can use those typedefs and wrapper functions without further #ifdefs.
+
+# Languages #
+Historically we have had a lot of C/C++ application development + a mix of shell and scripting languages.
+
+For scripting languages we have settled on Python.
+
+We should not be constrained to do everything else in C++.
+
+For integrating with existing C++ code, C++ may be the best choice.
+Python may be a good fit depending on application scope and library use.
+
+Other languages may be a good fit.  For example, we have started using Go for areas with high concurrency and network loads.
+
+Kernel code right now is C.