man/style.3

   1 .\" Copyright (c) 2017 Apple Inc. All rights reserved.
   2 .Dd 12 January, 2018
   3 .Dt style 3
   4 .Os Darwin
   5 .Sh NAME
   6 .Nm style
   7 .Nd C language style guide for Darwin low-level userspace projects
   8 .Sh DESCRIPTION
   9 This style's primary objective is to be as friendly to the code review process
  10 as possible. Therefore, the style aims to ensure that code changes to the
  11 project produce diffs that are
  12 .Pp
  13 .Bl -bullet -compact -offset indent
  14 .It
  15 small
  16 .It
  17 unambiguous
  18 .It
  19 viewable in side-by-side comparison tools
  20 .El
  21 .Pp
  22 As a secondary objective, this style also aims to make code as clear as possible
  23 for an uninitiated programmer reading it. "Clever" syntactic shortcuts are
  24 actively discouraged in favor of code that is easy to understand, even if it is
  25 less concise. Coincidentally, this practice also tends to lend itself to
  26 generating more readable diffs.
  27 .Pp
  28 Like any style, consistent adherence across a project is a virtue in and of
  29 itself, resulting in less distraction for the reader. However, these guidelines
  30 should be taken as exactly that: guidelines. No style can be completely adhered
  31 to all the time. When you have convinced yourself that a deviation from the
  32 style is called for, just make sure it is for the greater good and maintains the
  33 style's aims of minimizing diffs and code complexity.
  34 .Sh GENERAL PRINCIPLES
  35 .Ss Vertical space is a commodity
  36 Scrolling vertically has always been easier than scrolling horizontally.
  37 Computer mouse manufacturers went so far as to dedicate hardware to the task of
  38 scrolling vertically when they came up with scroll wheels. Even on modern
  39 trackpads, while it is possible to scroll horizontally, it is far easier to
  40 scroll vertically. You just flick upwards. Do not be afraid to introduce extra
  41 lines of code if it will result in clearer, more human-parseable diffs when
  42 those lines are changed.
  43 .Ss Horizontal space is precious
  44 Scrolling horizontally is typically awkward, imprecise, and does not lend itself
  45 well toward reading on computers or even in print. (Academic journals frequently
  46 publish with two narrow columns per page to make reading easier, for example.)
  47 Lines should be wrapped consciously; this should not be left to the editor. A
  48 soft-wrapping scheme that looks good in your editor may not look good in someone
  49 else's editor (or with a different configuration for the same editor).
  50 .Pp
  51 Just as natural language comments are difficult to read in one, long line,
  52 so too are lines of code. Both natural languages and programming languages
  53 deserve conscious, deliberate wrapping to improve readability.
  54 .Pp
  55 Wrap at a column width narrow enough to accommodate side-by-side patch
  56 review. 80 is more likely to accommodate this, but 120 might be fine too. Pick a
  57 reasonable column width and stick to it. Think about the lines you are wrapping.
  58 If you have to wrap a line, do so in a way that is clear, and be willing to make
  59 changes to accommodate that (e.g. don't be afraid to declare a variable
  60 separately if having the declaration and assignment on the same line causes it
  61 to wrap in an unclear way).
  62 .Ss Indentation is for scope indication and nothing else
  63 Indentation's sole purpose is to indicate scope. You should not use indentation
  64 to align characters on two lines of source code (beyond, of course, aligning
  65 the first characters of each line if they are both in the same scope).
  66 .Pp
  67 Given this aspect of the style, it does not particularly matter whether the
  68 author chooses spaces or tabs for indentation, and therefore the style makes no
  69 prescription (other than to pick one and stick with it).
  70 .Pp
  71 This style also has another implication: tabs and spaces should never appear
  72 in sequence. Each line will be a series of tabs followed by the first character
  73 of code. Tabs will never appear after the initial indentation of the line.
  74 .Ss Don't require leaving the source to understand it
  75 Always think of future maintainers and document your thought process for them.
  76 Remember, a "future maintainer" might be you after you've forgotten why you did
  77 something. For non-trivial changes, you should not rely on linking to a
  78 ticket-tracking system to provide context and explanation for a piece of code.
  79 You should strive to ensure the reader of your code does not have to
  80 context-switch out of reading it in order to understand it.
  81 .Pp
  82 This is not to say that linking to external resources in code is bad, but
  83 if a change's purpose can be reasonably expressed without interrupting how the
  84 code reads and flows, just do it. You don't need to publish a whitepaper in
  85 comments, but don't just give a link or ticket number with no context.
  86 .Ss Each line of code should minimize entropy
  87 It is actually very difficult to construct a hash comparison scheme that humans
  88 can use without error consistently, and there have been successful social
  89 engineering attacks on getting humans to read two hashes that are "close enough"
  90 as identical. This means that humans need a lot of help telling the difference
  91 between two lines of text.
  92 .Pp
  93 For any expression, divide it up into fundamental atoms (variable declarations,
  94 conditionals, etc.) and then assign each of those atoms to its own line of code.
  95 If you do this, when you change a single atom, it is immediately obvious that
  96 .Em only
  97 that atom changed and nothing else did. The more atoms share lines of code, the
  98 more likely it is that changes to them will generate complex diffs that humans
  99 will have difficulty understanding.
 100 .Ss Don't assume a specific editor
 101 Assume people will be reading your code in a tool that you do not control and
 102 cannot influence. Try viewing your code in such a tool and make sure that it is
 103 understandable. If you follow the guidelines of this style, your code may appear
 104 different in another viewer (in terms of how many columns are required to
 105 display a single line), but its structure will appear identical.
 106 .Sh SPECIFIC GUIDELINES
 107 .Ss Column Width and Line Wrap
 108 80 columns opens the door for a three-way, side-by-side comparison, but it could
 109 be impractical for a number of reasons. 120 columns should provide a nice
 110 balance, but really all that matters is that you pick a width and stick to it.
 111 .Pp
 112 When indenting a continuation line, indent over by two additional tabs. This
 113 visually separates the indented line from the next line, which may itself be
 114 indented. If there is an operator in the middle of the line, the operator should
 115 .Em not
 116 be wrapped to the continuation line.
 117 .Pp
 118 .Em Good
 119 .Bd -literal -offset indent
 120 if (some_condition && some_other_condition &&
 121         yet_another_condition) {
 122     exit(0);
 123 }
 124 .Ed
 125 .Pp
 126 .Em Bad
 127 .Bd -literal -offset indent
 128 if (some_condition && some_other_condition &&
 129     yet_another_condition) {
 130     exit(0);
 131 }
 132
 133 if (some_condition && some_other_condition
 134     && yet_another_condition) {
 135     exit(0);
 136 }
 137 .Ed
 138 .Pp
 139 Notice on the good example that the
 140 .Ic exit(0)
 141 line is made obviously distinct from the indented conditions above it. It's very
 142 clear on quick visual inspection that it's not a part of the conditional checks.
 143 The
 144 .Ic &&
 145 is left on the first line because, when reviewing a patch to this area, it will
 146 be immediately clear to the reviewer that that line continues to the next one.
 147 .Pp
 148 .Ss Avoid prettifying alignment
 149 Indentation is used only for indicating scope, so no consideration is given to
 150 visual alignment of equal signs, colons, etc. across multiple lines.
 151 .Pp
 152 .Em Good
 153 .Bd -literal -offset indent
 154 typedef enum {
 155     THING0 = 0,
 156     THING1 = 1,
 157     THING_THAT_IS_REALLY_LONG = 2,
 158 } thing_t;
 159 .Ed
 160 .Pp
 161 .Em Bad
 162 .Bd -literal -offset indent
 163 enum {
 164     THING0                    = 0,
 165     THING1                    = 1,
 166     THING_THAT_IS_REALLY_LONG = 2,
 167 };
 168 .Ed
 169 .Pp
 170 This creates bloated diffs. If you have to re-align a ton of lines after you've
 171 added something longer, you get a bunch of whitespace diffs. So for variable
 172 declarations, enumerations, assignments, etc. just keep every line independent.
 173 .Pp
 174 There is one exception to this rule, and that is if you choose to define a
 175 flagset in terms of its raw hexadecimal values and wish to align them. In this
 176 case, it is a significant benefit to have these values aligned, and you may do
 177 so with spaces.
 178 .Pp
 179 .Em Example
 180 .Bd -literal -offset indent
 181 typedef enum {
 182         F_INIT   = 0x00,
 183         F_FOO    = 0x01,
 184         F_BARBAZ = 0x02,
 185         F_CAD    = 0x04,
 186         F_FAD    = 0x08,
 187         F_FUD    = 0x10,
 188         F_FLAME  = 0x20,
 189         F_FOOD   = 0x40,
 190 } flag_t;
 191 .Ed
 192 .Ss Only one blank line at a time
 193 Use blank lines to separate logical chunks of code. Do not use more than one.
 194 .Ss Initialization
 195 C99 has named initializers for structures. Prefer those to initializing members
 196 one-by-one later on. Both structures and arrays should be initialized in the
 197 same style, with each element of the initializer being on its own line. This is
 198 so that when an element is added to or removed from the initialization list,
 199 that change gets its own line of diff.
 200 .Pp
 201 The exception to this is the string literal.
 202 .Pp
 203 .Em Good
 204 .Bd -literal -offset indent
 205 struct my_struct baz = {
 206     .ms_foo = 1,
 207     .ms_bar = NULL,
 208 };
 209
 210 char *strings[] = {
 211     "string",
 212     "other string",
 213 };
 214 .Ed
 215 .Em Bad
 216 .Bd -literal -offset indent
 217 struct my_struct baz = { 1, NULL };
 218
 219 struct my_struct baz = {
 220     1,
 221     NULL
 222 };
 223
 224 struct my_struct baz = { .ms_foo = 1, .ms_bar = NULL, };
 225 .Ed
 226 .Pp
 227 The last element of an initializer list should be followed by a comma. This is
 228 so that when you add a new element to that list, it's a one-line diff rather
 229 rather than a two-line diff (one line of diff to add the
 230 .Ic ,
 231 to the previous-last element, and another line of diff to add the new-last
 232 element).
 233 .Pp
 234 .Em Good
 235 .Bd -literal -offset indent
 236 enum {
 237     THING0,
 238     THING1,
 239 };
 240
 241 struct my_point p = {
 242     .x = 1,
 243     .y = 0,
 244     .z = 1,
 245 };
 246 .Ed
 247 .Pp
 248 .Em Bad
 249 .Bd -literal -offset indent
 250 enum {
 251     THING0, THING1,
 252 };
 253
 254 enum {
 255     THING0,
 256     THING1
 257 };
 258
 259 struct my_point p = { .x = 1, .y = 0, .z = 1 };
 260 .Ed
 261 .Pp
 262 Note that, if your project requires ANSI C compliance, you should disregard this
 263 guideline, as it will not work under C89.
 264 .Ss Avoid function name overloading
 265 The
 266 .Xr clang 1
 267 compiler supports extensions to the C language which allow for function name
 268 overloading. Name overloading generally leads to code which is difficult to
 269 read and introspect and should be avoided.
 270 .Ss Prefix `struct` members
 271 Any
 272 .Ic struct
 273 which is shared or exported should have a common prefix for each member. This
 274 helps avoid collisions with preprocessor macros.
 275 .Pp
 276 .Em Good
 277 .Bd -literal -offset indent
 278 struct foobar {
 279         int64_t fb_baz;
 280         char *fb_string;
 281 };
 282 .Ed
 283 .Pp
 284 .Em Bad
 285 .Bd -literal -offset indent
 286 struct foobar {
 287         int64_t baz;
 288         char *string;
 289 };
 290 .Ed
 291 .Pp
 292 .Ss Types end with `_t`
 293 A type is indicated with
 294 .Ic _t
 295 at the end of the
 296 .Ic typedef ,
 297 whether the type refers to a
 298 .Ic struct ,
 299 .Ic union ,
 300 .Ic enum ,
 301 etc. All types are indicated this way. Types are in all lower-case letters.
 302 .Pp
 303 .Em Good
 304 .Bd -literal -offset indent
 305 typedef uint64_t handle_t;
 306 typedef enum foo foo_t;
 307 typedef union bar bar_t;
 308 .Ed
 309 .Pp
 310 .Em Bad
 311 .Bd -literal -offset indent
 312 typedef uint64_t Handle;
 313 typedef enum foo foo_e;
 314 typedef union bar bar_u;
 315 .Ed
 316 .Ss Use explicitly-sized integer types
 317 Avoid integer types whose names do not indicate size, such as
 318 .Ic int
 319 or
 320 .Ic long .
 321 Instead, use the types from
 322 .Ic stdint.h
 323 (e.g.
 324 .Ic int64_t ,
 325 .Ic uint32_t ,
 326 etc.), which explicitly indicate size. You may use size-ambiguous integer types
 327 if an API requires it.
 328 .Ss Use `sizeof()` on variables rather than types where appropriate
 329 The
 330 .Ic sizeof()
 331 operator can take both types and variables as arguments. Where possible and
 332 relevant, always pass a variable. This ensures that if the variable's type
 333 changes, the proper size is used automatically.
 334 .Pp
 335 .Em Good
 336 .Bd -literal -offset indent
 337 uuid_t ident;
 338 memcpy(ident, buff, sizeof(ident));
 339 .Ed
 340 .Pp
 341 .Em Bad
 342 .Bd -literal -offset indent
 343 uuid_t ident;
 344 memcpy(ident, buff, sizeof(uuid_t));
 345 .Ed
 346 .Pp
 347 .Em IMPORTANT :
 348 When applied to a
 349 .Ic char * ,
 350 .Ic sizeof()
 351 will return the width of a pointer,
 352 .Em not
 353 the size of the string literal it points to, so take care to only use
 354 .Xr strlen 3
 355 for such cases.
 356 .Pp
 357 Relatedly, when applied to an array variable that is a parameter in a function's
 358 parameter list,
 359 .Ic sizeof()
 360 will return the width of a pointer,
 361 .Em not
 362 the size of the type.
 363 .Pp
 364 .Em Good
 365 .Bd -literal -offset indent
 366 char *string = "the quick brown fox";
 367 size_t len = strlen(string);
 368
 369 void
 370 foo(uuid_t u)
 371 {
 372         uuid_t u2;
 373         memcpy(u2, u, sizeof(uuid_t));
 374 }
 375 .Ed
 376 .Pp
 377 .Em Bad
 378 .Bd -literal -offset indent
 379 char *string = "the quick brown fox";
 380 size_t len = sizeof(string) - 1;
 381
 382 void
 383 foo(uuid_t u)
 384 {
 385         uuid_t u2;
 386
 387         // sizeof(u) == sizeof(void *) in this context.
 388         memcpy(u2, u, sizeof(u));
 389 }
 390 .Ed
 391 .Ss Functions which take no parameters have a parameter list of `void`
 392 In C, an empty function parameter list means that
 393 .Em any
 394 set of parameters is acceptable. In virtually all cases where you do this, you
 395 mean to have a parameter list of
 396 .Ic void .
 397 .Pp
 398 .Em Good
 399 .Bd -literal -offset indent
 400 void
 401 foo(void)
 402 {
 403     do_a_thing_without_arguments();
 404 }
 405 .Ed
 406 .Pp
 407 .Em Bad
 408 .Bd -literal -offset indent
 409 void
 410 foo()
 411 {
 412     do_a_thing_without_arguments();
 413 }
 414 .Ed
 415 .Ss Preprocessor macros
 416 Preprocessor definitions are written in all-caps. Macros which are function-like
 417 may be lower-case provided they do not double-evaluate their arguments.
 418 Function-like macros that do double-evaluate their arguments should be in
 419 all-caps.
 420 .Pp
 421 .Em Good
 422 .Bd -literal -offset indent
 423 #define FOO 1
 424 #define halt() abort()
 425
 426 // Does not double-evaluate _a and _b such that max(i++, j) is safe.
 427 #define max(_a, _b) ({ \\
 428     typeof(_a) a1 = (_a); \\
 429     typeof(_b) b1 = (_b); \\
 430     (a1 < b1 ? b1 : a1); \\
 431 })
 432
 433 // Double-evaluates _a and _b, so MAX(i++, j) is not safe.
 434 #define MAX(_a, _b) ((_a) < (_b) ? (_b) : (_a))
 435 .Ed
 436 .Pp
 437 .Em Bad
 438 .Bd -literal -offset indent
 439 #define kFoo 1
 440
 441 // Double-evaluates _a and _b.
 442 #define max(_a, _b) ((_a) < (_b) ? (_b) : (_a))
 443 .Ed
 444 .Pp
 445 Where possible, you should prefer inline functions to preprocessor macros, or
 446 split a macro into a preprocessor piece and an inline function piece.
 447 .Pp
 448 .Em Example
 449 .Bd -literal -offset indent
 450 static inline void
 451 _debug_uint64_impl(const char *name, uint64_t val)
 452 {
 453     fprintf(stderr, "%s = %llu\\n", name, val);
 454 }
 455
 456 #define debug_uint64(_v) do { \\
 457         _debug_uint64_impl(#_v, _v); \\
 458 } while (0)
 459 .Ed
 460 .Pp
 461 In this example, the preprocessor is used to do something that only the
 462 preprocessor can do: stringify the input variable's name. But once that work is
 463 done, the actual logging of the value is done by an inline function. This keeps
 464 the code much more readable.
 465 .Ss Preprocessor macro parameters should be distinguished
 466 Preprocessor macro expansion can run afoul of naming collisions with other
 467 variables that are in the same scope as the macro being expanded. To help avoid
 468 such collisions, parameters to preprocessor macros should have a prefix, suffix
 469 or both. Another good option is to use a
 470 .Ic _[A-Z]
 471 prefix, since it is reserved by the C standard and will not collide with
 472 preprocessor evaluation.
 473 .Pp
 474 .Em Example
 475 .Bd -literal -offset indent
 476 #define foo2(_x_) ((_x_) * 2)
 477 #define foo4(_x) ((_x) * 4)
 478 #define foo8(_X) ((_X) * 8)
 479 .Ed
 480 .Ss Preprocessor macro parameters should always be evaluated
 481 When passing a parameter to a preprocessor macro, it should always be referred
 482 to within parentheses to force evaluation. The exception is for parameters
 483 intended to be string literals.
 484 .Pp
 485 .Em Good
 486 .Bd -literal -offset indent
 487 #define add2(__x) ((__x) + 2)
 488 #define println(__fmt, ...) printf(__fmt "\\n", ## __VA_ARGS__)
 489 .Ed
 490 .Pp
 491 .Em Bad
 492 .Bd -literal -offset indent
 493 #define add2(__x) x + 2
 494 .Ed
 495 .Ss Preprocessor directives always start at column 0
 496 Preprocessor directives do not have scope, and therefore they always start at
 497 column zero.
 498 .Pp
 499 .Em Good
 500 .Bd -literal -offset indent
 501 if (do_loop) {
 502         for (i = 0; i < 10; i++) {
 503 #if CONFIG_FOOBAR
 504                 foobar(i);
 505 #else
 506                 foobaz(i);
 507 #endif
 508         }
 509 }
 510 .Ed
 511 .Pp
 512 .Em Bad
 513 .Bd -literal -offset indent
 514 if (do_loop) {
 515         for (i = 0; i < 10; i++) {
 516         #if CONFIG_FOOBAR
 517                 foobar(i);
 518         #else
 519                 foobaz(i);
 520         #endif
 521         }
 522 }
 523 .Ed
 524 .Ss Always reference string length directly
 525 Do not hard-code the size of a string. Use either
 526 .Ic sizeof(str) - 1
 527 or
 528 .Ic strlen(str) .
 529 In the latter case,
 530 .Xr clang 1
 531 is smart enough to recognize when a constant string is being passed to
 532 .Xr strlen(3)
 533 and replace the function call with the string's actual length.
 534 .Pp
 535 .Em Good
 536 .Bd -literal -offset indent
 537 char str[] = "string";
 538 frob_string(str, sizeof(str) - 1);
 539 frob_string(str, strlen(str));
 540 .Ed
 541 .Pp
 542 .Em Bad
 543 .Bd -literal -offset indent
 544 char str[] = "string";
 545 frob_string(str, 6);
 546 .Ed
 547 .Ss Don't pointlessly validate inputs
 548 If you control all call sites for a function, then there is no point to
 549 validating the inputs to that function. If none of your call sites pass
 550 .Ic NULL ,
 551 to a pointer parameter, for example, then the a
 552 .Ic NULL
 553 input indicates that there is state corruption in your address space. You may
 554 think that it's good to catch such corruption, but
 555 .Ic NULL
 556 is just
 557 .Em one
 558 possible invalid pointer value. What if the invalid input is
 559 .Ic 0x1 ?
 560 What if it is
 561 .Ic 0x2 ?
 562 Should you also check for those?
 563 .Pp
 564 This kind of input checking complicates code. Because it indicates state
 565 corruption, the only sensible thing to do in that situation would be to abort.
 566 But the operating system has mechanisms in place to detect the reference of an
 567 invalid resource, such as virtual memory and use-after-free detection. There is
 568 no point to you duplicating these mechanisms.
 569 .Pp
 570 Of course, you should always validate inputs
 571 .Em when they come from untrusted external sources
 572 (such as a file or IPC message), but if the inputs only ever comes from your
 573 program, you should trust them.
 574 .Pp
 575 .Em Good
 576 .Bd -literal -offset indent
 577 static foo_t *
 578 get_item(foo_t *arr, size_t idx)
 579 {
 580         return &arr[idx];
 581 }
 582
 583 int
 584 only_call_site(foo_t *f)
 585 {
 586         foo_t *arr = calloc(10, sizeof(arr[0]));
 587         if (!arr) {
 588                 return errno;
 589         }
 590
 591         *f = get_item(arr, 0);
 592         return 0;
 593 }
 594 .Ed
 595 .Pp
 596 .Em Bad
 597 .Bd -literal -offset indent
 598 static foo_t *
 599 get_item(foo_t *arr, size_t idx)
 600 {
 601         if (!arr) {
 602                 // No point to this check since we'll abort immediately below when we
 603                 // try to dereference `arr`. The crash report will have more than enough
 604                 // information to diagnose the NULL pointer dereference if it ever
 605                 // happens.
 606                 abort();
 607         }
 608         return &arr[idx];
 609 }
 610
 611 int
 612 only_call_site(foo_t *f)
 613 {
 614         foo_t *arr = calloc(10, sizeof(arr[0]));
 615         if (!arr) {
 616                 return errno;
 617         }
 618
 619         *f = get_item(arr, 0);
 620         return 0;
 621 }
 622 .Ed
 623 .Ss Abort on bad API inputs
 624 The C language provides precious few compile-time validation mechanisms, and so
 625 in many cases it is not possible to fully describe to the compiler the range of
 626 expected inputs for an API. So your API should validate input from its caller
 627 and abort on invalid input. Returning an error in such a case is pointless,
 628 since the caller probably isn't checking the return code anyway. The only sure
 629 way to get the programmer's attention is to abort the calling process with a
 630 helpful message. The
 631 .Ic os_crash
 632 routine allows you to supply such a message that the crash reporter on Darwin
 633 will display in its crash report.
 634 .Pp
 635 .Em Good
 636 .Bd -literal -offset indent
 637 uint8_t
 638 foo_a_bar(uint8_t number)
 639 {
 640         if (number > (UINT8_MAX / 2)) {
 641                 os_crash("number given to foo_a_bar() too large");
 642         }
 643         return (number * 2);
 644 }
 645 .Ed
 646 .Pp
 647 .Em Bad
 648 .Bd -literal -offset indent
 649 int
 650 foo_a_bar(uint8_t number, uint8_t *new_number)
 651 {
 652         if (number > (UINT8_MAX / 2)) {
 653                 return EINVAL;
 654         }
 655         *new_number = (number * 2);
 656         return 0;
 657 }
 658 .Ed
 659 .Ss Don't mingle POSIX return codes and errors
 660 Some POSIX routines have return values that indicate whether you should check
 661 .Ic errno ,
 662 and others just return the error directly. While POSIX generally documents what
 663 does what pretty well, there are lots of SPIs scattered around the system that
 664 use both conventions and aren't documented at all, leaving you to spelunk
 665 through the implementation to find out what's what.
 666 .Pp
 667 To avoid confusion, do not re-use the same variable for the return codes from
 668 these functions. If an API returns a code indicating that you should check
 669 .Ic errno ,
 670 name it
 671 .Ic ret
 672 or similar. If it returns the error directly, name it
 673 .Ic error
 674 or similar and make it of type
 675 .Ic errno_t .
 676 This makes it very clear to the person reading the code that you did the work to
 677 find out how the API worked. By naming the variable you store the return value
 678 in appropriately, a reader of your code (possibly Future You) can immediately
 679 know what's going on.
 680 .Pp
 681 If you are making new API or SPI that returns an error code, make it return
 682 .Ic errno_t
 683 and do not use the global
 684 .Ic errno
 685 for communicating error information.
 686 .Pp
 687 .Em Good
 688 .Bd -literal -offset indent
 689 #include <sys/types.h>
 690
 691 errno_t error = posix_spawn(NULL, "ls", NULL, NULL, argv, envp);
 692 switch (error) {
 693 case 0:
 694     // Handle success.
 695     break;
 696 case EACCES:
 697     // Handle "permission denied".
 698     break;
 699 }
 700
 701 int ret = reboot(RB_AUTOBOOT);
 702 if (ret == -1) {
 703     switch (errno) {
 704     case EPERM:
 705         // Handle "permission denied".
 706         break;
 707     case EBUSY:
 708         // Handle "reboot already in progress".
 709         break;
 710     }
 711 }
 712 .Ed
 713 .Pp
 714 .Em Bad
 715 .Bd -literal -offset indent
 716 int ret = posix_spawn(NULL, "ls", NULL, NULL, argv, envp);
 717 switch (error) {
 718 case 0:
 719     // Handle success.
 720     break;
 721 case EACCES:
 722     // Handle "permission denied".
 723     break;
 724 }
 725
 726 int error = reboot(RB_AUTOBOOT);
 727 if (error == -1) {
 728     switch (errno) {
 729     case EPERM:
 730         // Handle "permission denied".
 731         break;
 732     case EBUSY:
 733         // Handle "reboot already in progress".
 734         break;
 735     }
 736 }
 737 .Ed
 738 .Ss Avoid complex `if` statements and return distinct error codes
 739 Breaking up a single complex
 740 .Ic if
 741 statement
 742 into multiple distinct checks is both more readable and makes it possible to be
 743 more granular about handling failure cases. It also leads to smaller diffs if
 744 one of those conditions turns out to require special handling.
 745 .Pp
 746 Complex
 747 .Ic if
 748 statements are often associated with input validation and just returning an
 749 error code (usually
 750 .Ic EINVAL )
 751 if any input is invalid. While deciding which error to return in which case is
 752 more of an art than a science, that doesn't mean you should just give up and
 753 return a single error every time there isn't an immediately obvious fit to the
 754 case you've encountered.
 755 .Pp
 756 Ideally, every case where your routine may fail should be represented by a
 757 distinct error code, but this is often not practical. Still, you should attempt
 758 to distinguish each
 759 .Em likely
 760 failure case with its own error code. The POSIX error space is fairly rich, and
 761 error descriptions are brief enough that they can be liberally interpreted. For
 762 example,
 763 .Ic ESRCH
 764 can be used to apply to any situation where a resource could not be located, not
 765 just conditions where there is literally "No such process".
 766 .Pp
 767 This isn't to say that you should never have compound conditions in an
 768 .Ic if
 769 statement, but the groupings should almost always be small, and the grouped
 770 checks should be highly likely to require change as a group when change is
 771 needed.
 772 .Pp
 773 .Em Good
 774 .Bd -literal -offset indent
 775 if (foo->f_int > 10 || foo->f_int < 5)
 776         return ERANGE;
 777 }
 778
 779 if (!foo->f_uaddr) {
 780         return EFAULT;
 781 }
 782
 783 if (foo->f_has_idx && foo->f_idx > 100) {
 784         return ERANGE;
 785 }
 786
 787 if (foo->f_state != FS_INITIALIZED) {
 788         return EBUSY;
 789 }
 790 .Ed
 791 .Pp
 792 .Em Bad
 793 .Bd -literal -offset indent
 794 if (foo->f_int > 10 || foo->f_int < 5 || !foo->f_uaddr || (foo->f_has_idx && foo->f_idx > 100) ||
 795                 foo->f_state != FS_INITIALIZED) {
 796         return EINVAL;
 797 }
 798 .Ed
 799 .Pp
 800 See
 801 .Xr intro 2 ,
 802 .Ic <sys/errno.h> ,
 803 and
 804 .Ic <os/error.h>
 805 for the error codes supported on Darwin.
 806 .Ss Don't NULL-check when calling `free(3)`
 807 .Ic NULL
 808 is valid input to
 809 .Xr free 3 .
 810 It's part of the API contract. Armed with this knowledge, you can do things like
 811 avoid conditional memory calls, which are always weird.
 812 .Pp
 813 .Em Good
 814 .Bd -literal -offset indent
 815 char buff[1024];
 816 char *ptr = buff;
 817 char *what2free = NULL;
 818
 819 if (condition) {
 820     ptr = malloc(8);
 821     what2free = ptr;
 822 }
 823
 824 free(what2free);
 825 .Ed
 826 .Pp
 827 .Em Bad
 828 .Bd -literal -offset indent
 829 char buff[1024];
 830 char *ptr = buff;
 831 bool did_malloc = false;
 832
 833 if (condition) {
 834     ptr = malloc(8);
 835     did_malloc = true;
 836 }
 837
 838 if (did_malloc) {
 839     free(ptr);
 840 }
 841 .Ed
 842 .Ss Distinguish exported and non-exported symbols
 843 Any non-exported symbols should be prefixed with a
 844 .Ic _ .
 845 Thus any
 846 .Ic static
 847 functions, project-local interfaces, etc. should have this prefix. Exported
 848 symbols (API or SPI) should
 849 .Em not
 850 have such a prefix.
 851 .Pp
 852 .Em Good
 853 .Bd -literal -offset indent
 854 static const char *_thing = "thing";
 855 static void _foo(void);
 856
 857 void
 858 _project_local_interface(void);
 859 .Ed
 860 .Em Bad
 861 .Bd -literal -offset indent
 862 static const char *thing = "thing";
 863 static void foo(void);
 864
 865 void
 866 project_local_interface(void);
 867 .Ed
 868 .Pp
 869 Global variables should have a sensible prefix, preferably related to the
 870 project name -- e.g. globals in the
 871 .Xr libxpc 3
 872 project are prefixed with
 873 .Ic xpc_ .
 874 .Pp
 875 You may also consider declaring a global structure which contains all of your
 876 project's shared, unexported global state. This makes it very clear when code is
 877 referencing that state. Also, if your project is a library at the libSystem
 878 layer, this is required if you are ever to adopt
 879 .Xr os_alloc_once 3 .
 880 .Pp
 881 .Em Example
 882 .Bd -literal -offset indent
 883 typedef struct _foobar_globals {
 884         uint64_t fg_global_int;
 885         char *fg_global_string;
 886 } foobar_globals_t;
 887
 888 foobar_globals_t _g;
 889 foobar_globals_t *g = &_g;
 890 .Ed
 891 .Ss Distinguish SPIs meant for one caller
 892 Sometimes projects must create bespoke SPIs for one particular caller, and these
 893 SPIs are not considered suitable for general use. Append a suffix to these SPIs
 894 to indicate their bespokeness and the intended caller with
 895 .Ic _4caller .
 896 For example, if you add an SPI specifically for IOKit, your suffix would likely
 897 be
 898 .Ic _4IOKit .
 899 .Ss Use `#if` instead of `#ifdef` where appropriate
 900 .Ic #ifdef
 901 is to check if a token is
 902 .Em defined at all to anything.
 903 .Ic #if
 904 is to check the token's value. The C standard specifies that when a token is
 905 undefined,
 906 .Ic #if
 907 will evaluate it as
 908 .Ic 0 .
 909 When checking for features, it's almost always more appropriate to use
 910 .Ic #if
 911 since the lack of a feature could still be communicated by setting the token's
 912 value to
 913 .Ic 0 ,
 914 which would pass the
 915 .Ic #ifdef
 916 check.
 917 .Ss Use Function Attributes from `<os/base.h>`
 918 If you're on Darwin,
 919 .Ic libplatform
 920 defines a lot of nice macros for compiler attributes. Use them to decorate your
 921 functions. This gives the compiler lots more information so it can do fancy
 922 optimizations. Things like
 923 .Ic OS_NONNULL
 924 let the compiler know that a parameter should never be
 925 .Ic NULL .
 926 .Ic OS_WARN_RESULT
 927 is great for enforcing that a caller always check the return value of a
 928 function.
 929 .Pp
 930 .Ic OS_MALLOC
 931 lets the compiler know that the function returns a heap allocation, and
 932 .Ic OS_OBJECT_RETURNS_RETAINED
 933 lets ARC know that the function returns an object with a reference that the
 934 caller is responsible for releasing.
 935 .Pp
 936 You can avoid having to decorate all your pointer parameters by using
 937 .Ic OS_ASSUME_NONNULL_BEGIN
 938 and
 939 .Ic OS_ASSUME_NONNULL_END
 940 and specifically annotating variables which
 941 .Em can
 942 be
 943 .Ic NULL
 944 with the
 945 .Ic _Nullable
 946 keyword. Either approach is acceptable.
 947 .Pp
 948 Generally, use these attributes on functions which will have callers who cannot
 949 view the implementation. Putting many of these attributes on (for example) an
 950 inline function is harmless, but the compiler can reason about things like
 951 .Ic OS_NONNULL
 952 and infer it when it can view the implementation at all call sites.
 953 .Pp
 954 So as a rule of thumb, if it's in a header, decorate it appropriately. These
 955 attributes can also serve as nice implicit documentation around API and SPI. For
 956 example, if you have a decoration of
 957 .Ic OS_NONNULL1 ,
 958 you don't have to spell out in the HeaderDoc that you can't pass
 959 .Ic NULL
 960 for that parameter; it'll be right there in the declaration, and the compiler
 961 will catch attempts to do so.
 962 .Ss Distinguish C function definitions from declarations
 963 In C, make the definition of a function findable and distinguishable from its
 964 declaration (if any) through regular expressions. This way, you can find the
 965 implementation of
 966 .Ic foo
 967 by doing a regex search for
 968 .Ic ^foo ,
 969 and you won't get the declaration as a result.
 970 .Pp
 971 .Em Good
 972 .Bd -literal -offset indent
 973 static int foo(int bar);
 974
 975 int
 976 foo(int bar)
 977 {
 978     return bar;
 979 }
 980 .Ed
 981 .Pp
 982 .Em Bad
 983 .Bd -literal -offset indent
 984 static int foo(int bar);
 985
 986 int foo(int bar)
 987 {
 988     return bar;
 989 }
 990 .Ed
 991 .Pp
 992 This has the additional benefit of allowing you to change the name/parameter
 993 list of a function independently of the return type. A diff of either will not
 994 be confused with the rest of the function signature.
 995 .Ss Use HeaderDoc for API declarations
 996 Make them look nice. Include the appropriate decorations (including an explicit
 997 export attribute such as
 998 .Ic OS_EXPORT
 999 so it's very, very clear that it's intended to be API), availability attributes,
1000 and HeaderDoc. Put this stuff before the function.
1001 .Pp
1002 .Em Example
1003 .Bd -literal -offset indent
1004 /*!
1005  * @function foo
1006  * Returns `bar` and ignores another parameter.
1007  *
1008  * @param bar
1009  * The value to return.
1010  *
1011  * @param baz
1012  * The value to ignore.
1013  *
1014  * @result
1015  * The value of `bar`. This routine cannot fail.
1016  */
1017 __API_AVAILABLE(macos(10.14), ios(12.0), tvos(12.0), watchos(5.0))
1018 OS_EXPORT OS_WARN_RESULT OS_NONNULL2
1019 int
1020 foo(int bar, char *baz);
1021 .Ed
1022 .Ss Comments
1023 In general, use C++/C99-style comments. But there may be good reasons to use the
1024 classic C-style comments, such as for HeaderDoc, which requires them, e.g.
1025 .Bd -literal -offset indent
1026 /*!
1027  * Documentation
1028  */
1029 .Ed
1030 .Pp
1031 Long, top-level comments may also use classic C-style comments.
1032 .Pp
1033 C++/C99-style comments may directly follow code on the same line only if they
1034 are extremely brief. Otherwise, in general, comments and code should not share
1035 a line.
1036 .Pp
1037 Also, do not get cute with
1038 .Ic /* */
1039 comments and embed them within code.
1040 .Pp
1041 .Em Good
1042 .Bd -literal -offset indent
1043 // Comment on what the loop does.
1044 for (i = 0; i < cnt; i++) {
1045     // some code...
1046 }
1047
1048 /*
1049  * A top-level or very long comment.
1050  */
1051
1052 int ret = esoteric_spi(); // returns -1 on failure, does not set errno
1053 .Ed
1054 .Pp
1055 .Em Bad
1056 .Bd -literal -offset indent
1057 //Comment
1058
1059 int ret = esoteric_spi(); // This SPI returns -1 on failure but does not set
1060     // errno, so here is a comment explaining that that really should be above
1061     // the line of code rather than immediately following it.
1062
1063 foo(arg1, /* first argument */, arg2 /* second argument */);
1064 .Ed
1065 .Ss `case` and `switch` are indented at the same level
1066 .Ic case
1067 and
1068 .Ic switch
1069 belong at the same column indent because indentation indicates scope, and due to
1070 case fall-through, all cases are in the same scope -- one lower than the
1071 previous. (Unless you scope them explicitly with braces, but you should avoid
1072 doing that if at all possible.)
1073 .Pp
1074 .Em Good
1075 .Bd -literal -offset indent
1076 switch (thing) {
1077 case THING1:
1078     exit(0);
1079     break;
1080 case THING2:
1081     exit(1);
1082     break;
1083 default:
1084     __builtin_unreachable();
1085 }
1086 .Ed
1087 .Pp
1088 .Em Bad
1089 .Bd -literal -offset indent
1090 switch (thing) {
1091 case THING1: {
1092     exit(0);
1093     break;
1094 }
1095 case THING2: {
1096     exit(1);
1097     break;
1098 }
1099 default:
1100     __builtin_unreachable();
1101 }
1102
1103 switch (thing) {
1104     case THING1:
1105         exit(0);
1106         break;
1107     case THING2:
1108         exit(1);
1109         break;
1110     default: {
1111         __builtin_unreachable();
1112     }
1113 }
1114 .Ed
1115 .Ss Use typed `enum`s
1116 If you're declaring an
1117 .Ic enum ,
1118 you should
1119 .Ic typedef
1120 it so the compiler can reason about valid values and know the width of the
1121 .Ic enum
1122 type if possible. The
1123 .Ic OS_ENUM
1124 macro provides the correct behavior for C, C++, and Objective-C.
1125 .Ss Initialize all variables and fail closed
1126 If you pre-declare a variable before using it, initialize it to a sane value. If
1127 this value is something like the return value of the function, initialize it to
1128 a value which indicates failure of the operation. You should
1129 .Em always
1130 do this even if there are no code paths which fail to initialize the variable
1131 later. It's just good practice, and it gives the person reading your code an
1132 indication of what ranges of values the variable is expected to hold.
1133 .Pp
1134 .Em Good
1135 .Bd -literal -offset indent
1136 int result = -1;
1137
1138 if (success) {
1139     result = 0;
1140 }
1141 .Ed
1142 .Pp
1143 .Em Bad
1144 .Bd -literal -offset indent
1145 int result;
1146
1147 if (success) {
1148     result = 0;
1149 }
1150 .Ed
1151 .Pp
1152 Any error variable should always be initialized to a non-success condition. In
1153 general, consider success as something that your code must
1154 .Em explicitly declare
1155 and that the absence of such a declaration indicates failure.
1156 .Pp
1157 .Em Good
1158 .Bd -literal -offset indent
1159 int error = -1;
1160
1161 if (is_root()) {
1162     error = 0;
1163 } else {
1164     error = EPERM;
1165 }
1166 .Ed
1167 .Pp
1168 .Em Bad
1169 .Bd -literal -offset indent
1170 int error = 0;
1171
1172 if (!is_root()) {
1173     error = EPERM;
1174 }
1175 .Ed
1176 .Pp
1177 Note that you may omit an initializer for a complex
1178 .Ic struct
1179 type (such as the
1180 .Xr stat 2
1181 .Ic struct )
1182 but then it is incumbent upon you to ensure that that variable is not used
1183 uninitialized except to populate it. For many
1184 .Ic struct
1185 types, you can initialize them with
1186 .Ic {0} .
1187 This will not work for structures with nested structures though. For those you
1188 can use
1189 .Xr bzero 3
1190 or similar.
1191 .Ss Using `goto` is fine
1192 .Ic goto
1193 has gotten a bad rap, but it's probably the best way in C to do lots of
1194 sequential error handling. You don't
1195 .Em have
1196 to use
1197 .Ic goto
1198 if you don't want to, but if you do, just keep a a couple things in mind.
1199 .Pp
1200 .Bl -bullet -compact -offset indent
1201 .It
1202 Compile with
1203 .Ic -Wsometimes-uninitialized .
1204 With this warning,
1205 .Xr clang 1
1206 will catch cases where a variable may be used uninitialized because a
1207 .Ic goto
1208 skipped the initialization.
1209 .It
1210 Never use
1211 .Ic goto
1212 as a looping construct. The C language has a few different control statements
1213 for looping and iteration. Use one of those; it's not the 70's anymore.
1214 .El
1215 .Pp
1216 These guidelines make it simple to use
1217 .Ic goto
1218 effectively while avoiding the
1219 most common pitfalls.
1220 .Ss Avoid magic Booleans
1221 Sometimes you have to pass a parameter to a function to trigger some sort of
1222 behavior. Avoid using a magic Boolean for these cases. Instead, use an invariant
1223 that describes the behavior you are triggering.
1224 .Pp
1225 .Em Good
1226 .Bd -literal -offset indent
1227 replace_spaces(string, REPLACE_TABS_TOO);
1228 replace_spaces(string, REPLACE_ONLY_SPACES);
1229 .Ed
1230 .Pp
1231 .Em Bad
1232 .Bd -literal -offset indent
1233 replace_spaces(string, true);
1234 replace_spaces(string, false);
1235 .Ed
1236 .Pp
1237 If you find yourself creating many such Boolean values for function parameters,
1238 you should seriously considering defining a set of flags and passing that as one
1239 parameter instead.
1240 .Ss Spaces around binary operators
1241 In general, avoid code that looks crunched together, especially around
1242 operators. Specifically:
1243 .Bl -bullet -compact -offset indent
1244 .It
1245 Unary operators should
1246 .Em not
1247 have spaces around them.
1248 .It
1249 Binary operators
1250 .Em should
1251 have spaces around them.
1252 .It
1253 The ternary operator
1254 .Em should
1255 have spacing around it.
1256 .El
1257 .Pp
1258 .Em Good
1259 .Bd -literal -offset indent
1260 i++;
1261 j = i + k;
1262 k += condition ? i : j;
1263 .Ed
1264 .Pp
1265 .Em Bad
1266 .Bd -literal -offset indent
1267 i ++;
1268 j=i+k
1269 k+=condition?i:j;
1270 .Ed
1271 .Ss Reserve the ternary operator for trivial cases
1272 Don't use the ternary operator to choose between complex or long expressions.
1273 Reserve it for very trivial cases that are highly unlikely to change. In general
1274 if you've found yourself putting the expressions in your usage of ternary
1275 operator on multiple lines, you should just be using an
1276 .Ic if
1277 statement.
1278 .Pp
1279 .Em Good
1280 .Bd -literal -offset indent
1281 i += condition ? j : k;
1282 .Ed
1283 .Pp
1284 .Em Bad
1285 .Bd -literal -offset indent
1286 i += (i < j && j > k || i == j) ? foo(bar, baz, 0, NULL) : frob(bar, 0, NULL, baz);
1287 .Ed
1288 .Ss Spaces around parentheses
1289 .Bl -bullet -compact -offset indent
1290 .It
1291 Put a space between the control statement and the parenthesis indicating its
1292 condition.
1293 .It
1294 Do
1295 .Em not
1296 put a space between the end of a function name and the parenthesis
1297 indicating its argument list.
1298 .It
1299 Do
1300 .Em not
1301 put spaces between any parenthesis and its following content.
1302 .El
1303 .Pp
1304 .Em Good
1305 .Bd -literal -offset indent
1306 if (condition) {
1307     do_thing();
1308 }
1309 .Ed
1310 .Pp
1311 .Em Bad
1312 .Bd -literal -offset indent
1313 if(condition) {
1314     do_thing ();
1315 }
1316
1317 if ( condition ) {
1318     do_thing ( argument );
1319 }
1320 .Ed
1321 .Pp
1322 .Em Worse
1323 .Bd -literal -offset indent
1324 while( condition) {
1325     do_thing( );
1326 }
1327 .Ed
1328 .Ss Braces and statements
1329 Always, always, always use braces for your control statements. Lack of braces
1330 can and has led to serious security issues that were missed during code review,
1331 and putting the braces there from the start means that adding new statements to
1332 that clause does not require you to also add the braces.
1333 .Pp
1334 The clause should be indented on the next line with no blank lines in between.
1335 .Pp
1336 .Em Good
1337 .Bd -literal -offset indent
1338 if (condition) {
1339     do_thing();
1340 }
1341
1342 while (condition) {
1343     do_thing();
1344 }
1345 .Ed
1346 .Pp
1347 .Em Bad
1348 .Bd -literal -offset indent
1349 if (condition) do_thing();
1350
1351 if (condition)
1352     do_thing();
1353
1354 while (condition) do_thing();
1355
1356 while (condition) {
1357
1358     do_thing();
1359 }
1360 .Ed
1361 .Pp
1362 Even trivial uses of braceless
1363 .Ic if
1364 statements are problematic. Consider the following:
1365 .Pp
1366 .Em Bad
1367 .Bd -literal -offset indent
1368 if (error) i++,
1369 i++;
1370 .Ed
1371 .Pp
1372 This is admittedly contrived, but it would be likely to escape code review
1373 because it's very easy to miss that the first line ends with a
1374 .Ic ,
1375 rather than a
1376 .Ic ; .
1377 Braces in
1378 .Ic if
1379 statements are sensitive enough to security that the best policy is to simply
1380 always use them, without exception.
1381 .Pp
1382 Specific rules for braces:
1383 .Bl -bullet -compact -offset indent
1384 .It
1385 .Ic else
1386 goes between two braces on the same line.
1387 .It
1388 The brace which indicates the expression associated with a control flow
1389 statement goes on the same line as that statement or the same line as the last
1390 continuation line of the statement.
1391 .It
1392 The brace which begins the definition of a
1393 .Ic struct ,
1394 .Ic union ,
1395 .Ic enum ,
1396 etc. goes on the same line as the declaration.
1397 .It
1398 The brace concluding the expression associated with a control flow statement
1399 is aligned with the same column as that control flow statement.
1400 .It
1401 The opening brace of a function definition goes on its own line and is
1402 immediately followed by a new line.
1403 .It
1404 Control statements with empty bodies should have empty braces.
1405 .El
1406 .Pp
1407 .Em Good
1408 .Bd -literal -offset indent
1409 if (condition) {
1410     do_thing();
1411 } else {
1412     do_other_thing();
1413 }
1414
1415 void
1416 function(void)
1417 {
1418     return;
1419 }
1420
1421 struct my_struct {
1422     uint32_t thing;
1423 };
1424
1425 for (cur; cur; cur = cur->next) { }
1426 .Ed
1427 .Pp
1428 .Em Bad
1429 .Bd -literal -offset indent
1430 if (condition)
1431 {
1432     do_thing();
1433 }
1434 else
1435 {
1436     do_other_thing();
1437 }
1438
1439 if (condition)
1440 {
1441     do_thing();
1442 }
1443 else
1444     do_other_thing();
1445
1446 void
1447 function(void) {
1448     return;
1449 }
1450
1451 struct my_struct
1452 {
1453     uint32_t thing;
1454 };
1455
1456 for (cur; cur; cur = cur->next)
1457 .Ed
1458 .Pp
1459 .Em Worse
1460 .Bd -literal -offset indent
1461 if (condition)
1462     {
1463     do_thing();
1464     }
1465
1466 void
1467 function(void)
1468 { return;
1469 }
1470 .Ed
1471 .Sh SEE ALSO
1472 .Xr style 9 ,
1473 .Xr intro 2 ,
1474 .Xr errno 3 ,
1475 .Xr types 5
1476 .Sh HISTORY
1477 This style was largely derived from the style that evolved through the
1478 .Xr launchd 8 ,
1479 .Xr libdispatch 3 ,
1480 and
1481 .Xr libxpc 3
1482 projects.