Diffstat (limited to 'content/notes')
-rw-r--r-- content/notes/containerd-to-firecracker.md 682
-rw-r--r-- content/notes/making-sense-intel-amd-cpus.md 191
-rw-r--r-- content/notes/stuff-about-pcie.md 242
-rw-r--r-- content/notes/working-with-go.md 286
-rw-r--r-- content/notes/working-with-nix.md 46
5 files changed, 1447 insertions, 0 deletions
diff --git a/content/notes/containerd-to-firecracker.md b/content/notes/containerd-to-firecracker.md
new file mode 100644
index 0000000..b64586b
--- /dev/null
+++ b/content/notes/containerd-to-firecracker.md
@@ -0,0 +1,682 @@
+---
+title: containerd to firecracker
+date: 2021-05-15
+tags:
+  - linux
+  - firecracker
+  - containerd
+  - go
+---
+
+fly.io had an [interesting
+article](https://fly.io/blog/docker-without-docker/) about how they use
+docker images to create VMs for `firecracker`.
+
+They describe the process as follows:
+
+1.  Pull a container from a registry
+2.  Create a loop device to store the container's filesystem on
+3.  Unpack the container into the mounted loop device
+4.  Create a second block device and inject init, kernel, configuration
+    and other stuff
+5.  Attach persistent volumes (if any)
+6.  Create a TAP device and configure it
+7.  Hand it off to Firecracker and boot that thing
+
+That's pretty detailed, and I'm curious how difficult it is to implement
+this. I've been meaning to look into Firecracker and containerd's API
+for a while, so this is a perfect opportunity to get started. The
+code is available [here](https://github.com/fcuny/containerd-to-vm).
+
+# #1 Pull a container from a registry with `containerd`
+
+`containerd` has pretty [detailed
+documentation](https://pkg.go.dev/github.com/containerd/containerd).
+The main page shows the following example to create a client.
+
+``` go
+package main
+
+import (
+    "log"
+
+    "github.com/containerd/containerd"
+)
+
+func main() {
+    // Connect to the containerd daemon over its default socket.
+    client, err := containerd.New("/run/containerd/containerd.sock")
+    if err != nil {
+        log.Fatal(err)
+    }
+    defer client.Close()
+}
+```
+
+And pulling an image is also pretty straightforward:
+
+``` go
+image, err := client.Pull(context, "docker.io/library/redis:latest")
+```
+
+The `Pull` method returns an
+[`Image`](https://pkg.go.dev/github.com/containerd/containerd@v1.4.4/images#Image)
+and there are a few methods associated with it.
+
+As `containerd` has namespaces, it's possible to specify the namespace
+we want to use when working with the API:
+
+``` go
+ctx := namespaces.WithNamespace(context.Background(), "c2vm")
+image, err := client.Pull(ctx, "docker.io/library/redis:latest")
+```
+
+The image will now be stored in the `c2vm` namespace. We can verify this
+with:
+
+``` bash
+; sudo ctr -n c2vm images ls -q
+docker.io/library/redis:latest
+```
+
+# #2 Create a loop device to store the container's filesystem on
+
+This is going to be pretty straightforward. To create a loop device we
+need to:
+
+1.  pre-allocate space to a file
+2.  convert that file to some format
+3.  mount it to some destination
+
+There are two common ways to pre-allocate space to a file: `dd` and
+`fallocate` (there are likely more ways to do this). I'll go with
+`fallocate` for this example.
+
+First, to be safe, we create a temporary file, and use `renameio` to
+handle the renaming (I recommend reading the doc of the module).
+
+``` go
+f, err := renameio.TempFile("", rawFile)
+if err != nil {
+    return err
+}
+defer f.Cleanup()
+```
+
+Now to do the pre-allocation (we're assuming here that 2GB is enough;
+we could check the size of the container image before doing
+this):
+
+``` go
+command := exec.Command("fallocate", "-l", "2G", f.Name())
+if err := command.Run(); err != nil {
+    return fmt.Errorf("fallocate error: %s", err)
+}
+```
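+
+Instead of hard-coding 2GB, the size could be estimated from the layers.
+The sketch below is only a rough heuristic: it assumes we have the
+compressed size of each layer (for example the `Size` field of each
+descriptor in the manifest), and the `allocSize` helper, the expansion
+factor and the headroom are all guesses of mine, not something
+`containerd` provides.
+
+``` go
+package main
+
+import "fmt"
+
+// allocSize estimates how many bytes to pre-allocate for the root
+// filesystem. layerSizes are compressed layer sizes; expansion and
+// headroom are assumptions, not values provided by containerd.
+func allocSize(layerSizes []int64, expansion, headroom int64) int64 {
+    var total int64
+    for _, s := range layerSizes {
+        total += s
+    }
+    size := total*expansion + headroom
+    // Round up to a whole MiB.
+    const mib = 1 << 20
+    return (size + mib - 1) / mib * mib
+}
+
+func main() {
+    // Hypothetical layer sizes, in bytes.
+    layers := []int64{27 << 20, 1 << 20, 9 << 20}
+    size := allocSize(layers, 4, 64<<20)
+    // fallocate accepts a length in bytes.
+    fmt.Printf("fallocate -l %d\n", size)
+}
+```
+
+Since we shrink the image with `resize2fs` later anyway, overestimating
+here is cheap, while underestimating makes the extraction fail.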
+
+We can now convert that file to ext4:
+
+``` go
+command = exec.Command("mkfs.ext4", "-F", f.Name())
+if err := command.Run(); err != nil {
+    return fmt.Errorf("mkfs.ext4 error: %s", err)
+}
+```
+
+Now we can safely rename the temporary file to its final name:
+
+``` go
+f.CloseAtomicallyReplace()
+```
+
+And to mount that file:
+
+``` go
+command = exec.Command("mount", "-o", "loop", rawFile, mntDir)
+if err := command.Run(); err != nil {
+    return fmt.Errorf("mount error: %s", err)
+}
+```
+
+# #3 Unpack the container into the mounted loop device
+
+Extracting the container using `containerd` is pretty simple. Here's the
+function that I use:
+
+``` go
+func extract(ctx context.Context, client *containerd.Client, image containerd.Image, mntDir string) error {
+    manifest, err := images.Manifest(ctx, client.ContentStore(), image.Target(), platform)
+    if err != nil {
+        return fmt.Errorf("failed to get the manifest: %w", err)
+    }
+
+    for _, desc := range manifest.Layers {
+        log.Printf("extracting layer %s\n", desc.Digest.String())
+        layer, err := client.ContentStore().ReaderAt(ctx, desc)
+        if err != nil {
+            return err
+        }
+        if err := archive.Untar(content.NewReader(layer), mntDir, &archive.TarOptions{NoLchown: true}); err != nil {
+            return err
+        }
+    }
+
+    return nil
+}
+```
+
+Calling `images.Manifest` returns the
+[manifest](https://github.com/opencontainers/image-spec/blob/master/manifest.md)
+from the image. What we care about here is the list of layers. I'm
+making a number of assumptions regarding their type (we should be
+checking the media type first). We read the layers and extract them to
+the mounted path.
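+
+Checking the media type could look something like the following sketch.
+The media type strings come from the OCI image spec and Docker's image
+manifest format; the `isTarLayer` helper is a name I made up for
+illustration. In `extract` it would be called with `desc.MediaType`
+before untarring each layer.
+
+``` go
+package main
+
+import "fmt"
+
+// Layer media types that are (possibly compressed) tar archives.
+var tarLayerTypes = map[string]bool{
+    "application/vnd.oci.image.layer.v1.tar":            true,
+    "application/vnd.oci.image.layer.v1.tar+gzip":       true,
+    "application/vnd.oci.image.layer.v1.tar+zstd":       true,
+    "application/vnd.docker.image.rootfs.diff.tar.gzip": true,
+}
+
+// isTarLayer reports whether a layer with this media type is one we
+// can expect to unpack from the content store.
+func isTarLayer(mediaType string) bool {
+    return tarLayerTypes[mediaType]
+}
+
+func main() {
+    for _, mt := range []string{
+        "application/vnd.docker.image.rootfs.diff.tar.gzip",
+        "application/vnd.docker.image.rootfs.foreign.diff.tar.gzip",
+    } {
+        fmt.Printf("%s extractable=%v\n", mt, isTarLayer(mt))
+    }
+}
+```
+
+Foreign layers are deliberately left out of the list: they are meant to
+be fetched from an external URL, so they may not be in the content
+store at all.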
+
+# #4 Create a second block device and inject other stuff
+
+Here I'm going to deviate a bit. I will not create a second loop device,
+and I will not inject a kernel. In their article, they provided a link
+to a snapshot of their `init` process
+(<https://github.com/superfly/init-snapshot>). In order to keep this
+simple, our init is going to be a shell script composed of the content
+of the entry point of the container. We're also going to add a few extra
+files to the container (`/etc/hosts` and `/etc/resolv.conf`).
+
+Finally, since we've pre-allocated 2GB for that container, and we likely
+don't need that much, we're also going to resize the image.
+
+## Add init
+
+Let's refer to the [specification for the
+config](https://github.com/opencontainers/image-spec/blob/master/config.md).
+The elements that are of interest to me are:
+
+-   `Env`, which is an array of strings. They contain the environment
+    variables that we likely need to run the program
+-   `Cmd`, which is also an array of strings. If there's no entry point
+    provided, this is what is used.
+
+At this point, for this experiment, I'm going to ignore exposed ports,
+working directory, and the user.
+
+First we need to read the config from the container. This is easily
+done:
+
+``` go
+config, err := images.Config(ctx, client.ContentStore(), image.Target(), platform)
+if err != nil {
+    return err
+}
+```
+
+This needs to be read and decoded:
+
+``` go
+configBlob, err := content.ReadBlob(ctx, client.ContentStore(), config)
+var imageSpec ocispec.Image
+json.Unmarshal(configBlob, &imageSpec)
+```
+
+`init` is the first process started by Linux during boot. On a regular
+Linux desktop you likely have a symbolic link from `/usr/bin/init` to
+`/usr/lib/systemd/systemd`, since most distributions have switched to
+`systemd`. For my use case however, I want to run a single process, and
+I want it to be the one from the container. For this we can create a
+simple shell script inside the container (the location does not matter
+for now) with the environment variables and the command.
+
+Naively, this can be done like this:
+
+``` go
+initPath := filepath.Join(mntDir, "init.sh")
+f, err := renameio.TempFile("", initPath)
+if err != nil {
+    return err
+}
+defer f.Cleanup()
+
+writer := bufio.NewWriter(f)
+fmt.Fprintf(writer, "#!/bin/sh\n")
+for _, env := range initEnvs {
+    fmt.Fprintf(writer, "export %s\n", env)
+}
+fmt.Fprintf(writer, "%s\n", initCmd)
+writer.Flush()
+
+f.CloseAtomicallyReplace()
+
+// Make the script executable so the kernel can run it as init.
+if err := os.Chmod(initPath, 0755); err != nil {
+    return err
+}
+```
+
+We're once again creating a temporary file with `renameio`, and we're
+writing our shell script, one line at a time. We only need to make sure
+it is executable.
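+
+The naive version breaks as soon as an environment variable contains a
+space or a quote. A slightly more careful sketch is below; it assumes
+Go 1.18+ (for `strings.Cut`), and the `shellQuote` and `renderInit`
+helpers are names of my own. It also follows the image spec in treating
+`Cmd` as the default arguments to `Entrypoint` when both are set. The
+values passed in `main` are only illustrative.
+
+``` go
+package main
+
+import (
+    "fmt"
+    "strings"
+)
+
+// shellQuote wraps s in single quotes so /bin/sh treats it as one word.
+func shellQuote(s string) string {
+    return "'" + strings.ReplaceAll(s, "'", `'\''`) + "'"
+}
+
+// renderInit builds the init script from the image config: env holds
+// "KEY=value" strings, and the command is Entrypoint followed by Cmd.
+func renderInit(env, entrypoint, cmd []string) string {
+    var b strings.Builder
+    b.WriteString("#!/bin/sh\n")
+    for _, e := range env {
+        key, value, ok := strings.Cut(e, "=")
+        if !ok {
+            continue // skip malformed entries
+        }
+        fmt.Fprintf(&b, "export %s=%s\n", key, shellQuote(value))
+    }
+    args := append(append([]string{}, entrypoint...), cmd...)
+    quoted := make([]string, len(args))
+    for i, a := range args {
+        quoted[i] = shellQuote(a)
+    }
+    // exec replaces the shell, so the program itself becomes PID 1.
+    fmt.Fprintf(&b, "exec %s\n", strings.Join(quoted, " "))
+    return b.String()
+}
+
+func main() {
+    fmt.Print(renderInit(
+        []string{"PATH=/usr/local/bin:/usr/bin:/bin", "GREETING=hello world"},
+        []string{"docker-entrypoint.sh"},
+        []string{"redis-server"},
+    ))
+}
+```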
+
+## Extra files
+
+Once we have our init file, I also want to add a few extra files:
+`/etc/hosts` and `/etc/resolv.conf`. These files are not always present,
+since they can be injected by other systems. I also want to make sure
+that DNS resolutions are done using my own DNS server.
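+
+A minimal sketch of writing these two files, using only the standard
+library. The `writeEtcFiles` helper, the hostname and the DNS server
+address are placeholders of mine; in the real program `root` would be
+the directory where the image is mounted.
+
+``` go
+package main
+
+import (
+    "fmt"
+    "os"
+    "path/filepath"
+)
+
+// writeEtcFiles writes minimal /etc/hosts and /etc/resolv.conf files
+// under root, overwriting whatever the image shipped.
+func writeEtcFiles(root, hostname, nameserver string) error {
+    etc := filepath.Join(root, "etc")
+    if err := os.MkdirAll(etc, 0o755); err != nil {
+        return err
+    }
+    hosts := fmt.Sprintf("127.0.0.1\tlocalhost\n127.0.1.1\t%s\n", hostname)
+    if err := os.WriteFile(filepath.Join(etc, "hosts"), []byte(hosts), 0o644); err != nil {
+        return err
+    }
+    resolv := fmt.Sprintf("nameserver %s\n", nameserver)
+    return os.WriteFile(filepath.Join(etc, "resolv.conf"), []byte(resolv), 0o644)
+}
+
+func main() {
+    // A temporary directory stands in for the mounted image.
+    root, err := os.MkdirTemp("", "c2vm")
+    if err != nil {
+        panic(err)
+    }
+    defer os.RemoveAll(root)
+
+    // Placeholder hostname and DNS server address.
+    if err := writeEtcFiles(root, "c2vm", "192.168.0.1"); err != nil {
+        panic(err)
+    }
+    data, err := os.ReadFile(filepath.Join(root, "etc", "resolv.conf"))
+    if err != nil {
+        panic(err)
+    }
+    fmt.Print(string(data))
+}
+```
+
+One caveat: some images ship `/etc/resolv.conf` as a symbolic link, and
+`os.WriteFile` follows links, so a real implementation would probably
+want to remove the existing entry first.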
+
+## Resize the image
+
+We've pre-allocated 2GB for the image, and it's likely we don't need
+that much space. We can shrink it by running `e2fsck` and `resize2fs`
+once we're done manipulating the image.
+
+Within a function, we can do the following:
+
+``` go
+command := exec.Command("/usr/bin/e2fsck", "-p", "-f", rawFile)
+if err := command.Run(); err != nil {
+    return fmt.Errorf("e2fsck error: %s", err)
+}
+
+command = exec.Command("resize2fs", "-M", rawFile)
+if err := command.Run(); err != nil {
+    return fmt.Errorf("resize2fs error: %s", err)
+}
+```
+
+I'm using `docker.io/library/redis:latest` for my test, and I end up
+with the following size for the image:
+
+``` bash
+-rw------- 1 root root 216M Apr 22 14:50 /tmp/fcuny.img
+```
+
+## Kernel
+
+We're going to need a kernel to run that VM. In my case I've decided to
+go with version 5.8, and build a custom kernel. If you are not familiar
+with the process, the firecracker team has [documented how to do
+this](https://github.com/firecracker-microvm/firecracker/blob/main/docs/rootfs-and-kernel-setup.md#creating-a-kernel-image).
+In my case all I had to do was:
+
+``` bash
+git clone https://github.com/torvalds/linux.git linux.git
+cd linux.git
+git checkout v5.8
+# fetch the raw file (the /blob/ URL returns an HTML page, not the config)
+curl -L -o .config -s https://github.com/firecracker-microvm/firecracker/raw/main/resources/microvm-kernel-x86_64.config
+make menuconfig
+make vmlinux -j8
+```
+
+Note that they also have pretty [good documentation for
+production](https://github.com/firecracker-microvm/firecracker/blob/main/docs/prod-host-setup.md).
+
+# #5 Attach persistent volumes (if any)
+
+I'm going to skip that step for now.
+
+# #6 Create a TAP device and configure it
+
+We're going to need a network for that VM (otherwise it might be a bit
+boring). There are a couple of options:
+
+1.  create the TAP device ourselves
+2.  delegate all that work to a
+    [CNI](https://github.com/containernetworking/cni)
+
+I've decided to use the CNI approach [documented in the Go
+SDK](https://github.com/firecracker-microvm/firecracker-go-sdk#cni). For
+this to work we need to install the `tc-redirect-tap` CNI plugin
+(available at <https://github.com/awslabs/tc-redirect-tap>).
+
+Based on that documentation, I'll start with the following configuration
+in `/etc/cni/conf.d/50-c2vm.conflist`:
+
+``` json
+{
+  "name": "c2vm",
+  "cniVersion": "0.4.0",
+  "plugins": [
+    {
+      "type": "bridge",
+      "bridge": "c2vm-br",
+      "isDefaultGateway": true,
+      "forceAddress": false,
+      "ipMasq": true,
+      "hairpinMode": true,
+      "mtu": 1500,
+      "ipam": {
+        "type": "host-local",
+        "subnet": "192.168.128.0/24",
+        "resolvConf": "/etc/resolv.conf"
+      }
+    },
+    {
+      "type": "firewall"
+    },
+    {
+      "type": "tc-redirect-tap"
+    }
+  ]
+}
+```
+
+# #7 Hand it off to Firecracker and boot that thing
+
+Now that we have all the components, we need to boot that VM. Since I've
+been working with Go so far, I'll also use the [Go
+SDK](https://github.com/firecracker-microvm/firecracker-go-sdk) to
+manage and start the VM.
+
+For this we need the firecracker binary, which we can [find on
+GitHub](https://github.com/firecracker-microvm/firecracker/releases).
+
+The first thing is to configure the list of devices. In our case we will
+have a single device, the boot drive that we've created in the previous
+step.
+
+``` go
+devices := make([]models.Drive, 1)
+devices[0] = models.Drive{
+    DriveID:      firecracker.String("1"),
+    PathOnHost:   &rawImage,
+    IsRootDevice: firecracker.Bool(true),
+    IsReadOnly:   firecracker.Bool(false),
+}
+```
+
+The next step is to configure the VM:
+
+``` go
+fcCfg := firecracker.Config{
+    LogLevel:        "debug",
+    SocketPath:      firecrackerSock,
+    KernelImagePath: linuxKernel,
+    KernelArgs:      "console=ttyS0 reboot=k panic=1 acpi=off pci=off i8042.noaux i8042.nomux i8042.nopnp i8042.dumbkbd init=/init.sh random.trust_cpu=on",
+    Drives:          devices,
+    MachineCfg: models.MachineConfiguration{
+        VcpuCount:   firecracker.Int64(1),
+        CPUTemplate: models.CPUTemplate("C3"),
+        HtEnabled:   firecracker.Bool(true),
+        MemSizeMib:  firecracker.Int64(512),
+    },
+    NetworkInterfaces: []firecracker.NetworkInterface{
+        {
+            CNIConfiguration: &firecracker.CNIConfiguration{
+                NetworkName: "c2vm",
+                IfName:      "eth0",
+            },
+        },
+    },
+}
+```
+
+Finally we can create the command to start and run the VM:
+
+``` go
+command := firecracker.VMCommandBuilder{}.
+    WithBin(firecrackerBinary).
+    WithSocketPath(fcCfg.SocketPath).
+    WithStdin(os.Stdin).
+    WithStdout(os.Stdout).
+    WithStderr(os.Stderr).
+    Build(ctx)
+machineOpts = append(machineOpts, firecracker.WithProcessRunner(command))
+m, err := firecracker.NewMachine(vmmCtx, fcCfg, machineOpts...)
+if err != nil {
+    panic(err)
+}
+
+if err := m.Start(vmmCtx); err != nil {
+    panic(err)
+}
+defer m.StopVMM()
+
+if err := m.Wait(vmmCtx); err != nil {
+    panic(err)
+}
+```
+
+The end result:
+
+    ; sudo ./c2vm -container docker.io/library/redis:latest -firecracker-binary ./hack/firecracker/firecracker-v0.24.3-x86_64 -linux-kernel ./hack/linux/my-linux.bin -out /tmp/redis.img
+    2021/05/15 14:12:59 pulled docker.io/library/redis:latest (38690247 bytes)
+    2021/05/15 14:13:00 mounted /tmp/redis.img on /tmp/c2vm026771514
+    2021/05/15 14:13:00 extracting layer sha256:69692152171afee1fd341febc390747cfca2ff302f2881d8b394e786af605696
+    2021/05/15 14:13:00 extracting layer sha256:a4a46f2fd7e06fab84b4e78eb2d1b6d007351017f9b18dbeeef1a9e7cf194e00
+    2021/05/15 14:13:00 extracting layer sha256:bcdf6fddc3bdaab696860eb0f4846895c53a3192c9d7bf8d2275770ea8073532
+    2021/05/15 14:13:01 extracting layer sha256:b7e9b50900cc06838c44e0fc5cbebe5c0b3e7f70c02f32dd754e1aa6326ed566
+    2021/05/15 14:13:01 extracting layer sha256:5f3030c50d85a9d2f70adb610b19b63290c6227c825639b227ddc586f86d1c76
+    2021/05/15 14:13:01 extracting layer sha256:63dae8e0776cdbd63909fbd9c047c1615a01cb21b73efa87ae2feed680d3ffa1
+    2021/05/15 14:13:01 init script created
+    2021/05/15 14:13:01 umount /tmp/c2vm026771514
+    INFO[0003] Called startVMM(), setting up a VMM on firecracker.sock
+    INFO[0003] VMM logging disabled.
+    INFO[0003] VMM metrics disabled.
+    INFO[0003] refreshMachineConfiguration: [GET /machine-config][200] getMachineConfigurationOK  &{CPUTemplate:C3 HtEnabled:0xc0004e6753 MemSizeMib:0xc0004e6748 VcpuCount:0xc0004e6740}
+    INFO[0003] PutGuestBootSource: [PUT /boot-source][204] putGuestBootSourceNoContent
+    INFO[0003] Attaching drive /tmp/redis.img, slot 1, root true.
+    INFO[0003] Attached drive /tmp/redis.img: [PUT /drives/{drive_id}][204] putGuestDriveByIdNoContent
+    INFO[0003] Attaching NIC tap0 (hwaddr 9e:72:c7:04:6b:80) at index 1
+    INFO[0003] startInstance successful: [PUT /actions][204] createSyncActionNoContent
+    [    0.000000] Linux version 5.8.0 (fcuny@nas) (gcc (Debian 8.3.0-6) 8.3.0, GNU ld (GNU Binutils for Debian) 2.31.1) #1 SMP Mon Apr 12 20:07:40 PDT 2021
+    [    0.000000] Command line: i8042.dumbkbd ip=192.168.128.9::192.168.128.1:255.255.255.0:::off::: console=ttyS0 reboot=k panic=1 acpi=off pci=off i8042.noaux i8042.nomux i8042.nopnp init=/init.sh random.trust_cpu=on root=/dev/vda rw virtio_mmio.device=4K@0xd0000000:5 virtio_mmio.device=4K@0xd0001000:6
+    [    0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
+    [    0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
+    [    0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
+    [    0.000000] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
+    [    0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
+    [    0.000000] BIOS-provided physical RAM map:
+    [    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
+    [    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000001fffffff] usable
+    [    0.000000] NX (Execute Disable) protection: active
+    [    0.000000] DMI not present or invalid.
+    [    0.000000] Hypervisor detected: KVM
+    [    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
+    [    0.000000] kvm-clock: cpu 0, msr 2401001, primary cpu clock
+    [    0.000000] kvm-clock: using sched offset of 11918596 cycles
+    [    0.000005] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
+    [    0.000011] tsc: Detected 1190.400 MHz processor
+    [    0.000108] last_pfn = 0x20000 max_arch_pfn = 0x400000000
+    [    0.000151] Disabled
+    [    0.000156] x86/PAT: MTRRs disabled, skipping PAT initialization too.
+    [    0.000166] CPU MTRRs all blank - virtualized system.
+    [    0.000170] x86/PAT: Configuration [0-7]: WB  WT  UC- UC  WB  WT  UC- UC
+    [    0.000201] found SMP MP-table at [mem 0x0009fc00-0x0009fc0f]
+    [    0.000257] check: Scanning 1 areas for low memory corruption
+    [    0.000364] No NUMA configuration found
+    [    0.000365] Faking a node at [mem 0x0000000000000000-0x000000001fffffff]
+    [    0.000370] NODE_DATA(0) allocated [mem 0x1ffde000-0x1fffffff]
+    [    0.000490] Zone ranges:
+    [    0.000493]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
+    [    0.000494]   DMA32    [mem 0x0000000001000000-0x000000001fffffff]
+    [    0.000495]   Normal   empty
+    [    0.000497] Movable zone start for each node
+    [    0.000500] Early memory node ranges
+    [    0.000501]   node   0: [mem 0x0000000000001000-0x000000000009efff]
+    [    0.000502]   node   0: [mem 0x0000000000100000-0x000000001fffffff]
+    [    0.000510] Zeroed struct page in unavailable ranges: 98 pages
+    [    0.000511] Initmem setup node 0 [mem 0x0000000000001000-0x000000001fffffff]
+    [    0.004990] Intel MultiProcessor Specification v1.4
+    [    0.004995] MPTABLE: OEM ID: FC
+    [    0.004995] MPTABLE: Product ID: 000000000000
+    [    0.004996] MPTABLE: APIC at: 0xFEE00000
+    [    0.005007] Processor #0 (Bootup-CPU)
+    [    0.005039] IOAPIC[0]: apic_id 2, version 17, address 0xfec00000, GSI 0-23
+    [    0.005041] Processors: 1
+    [    0.005042] TSC deadline timer available
+    [    0.005044] smpboot: Allowing 1 CPUs, 0 hotplug CPUs
+    [    0.005060] KVM setup pv remote TLB flush
+    [    0.005072] KVM setup pv sched yield
+    [    0.005078] PM: hibernation: Registered nosave memory: [mem 0x00000000-0x00000fff]
+    [    0.005079] PM: hibernation: Registered nosave memory: [mem 0x0009f000-0x000fffff]
+    [    0.005081] [mem 0x20000000-0xffffffff] available for PCI devices
+    [    0.005082] Booting paravirtualized kernel on KVM
+    [    0.005084] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645519600211568 ns
+    [    0.005087] setup_percpu: NR_CPUS:128 nr_cpumask_bits:128 nr_cpu_ids:1 nr_node_ids:1
+    [    0.006381] percpu: Embedded 44 pages/cpu s143360 r8192 d28672 u2097152
+    [    0.006404] KVM setup async PF for cpu 0
+    [    0.006410] kvm-stealtime: cpu 0, msr 1f422080
+    [    0.006420] Built 1 zonelists, mobility grouping on.  Total pages: 128905
+    [    0.006420] Policy zone: DMA32
+    [    0.006422] Kernel command line: i8042.dumbkbd ip=192.168.128.9::192.168.128.1:255.255.255.0:::off::: console=ttyS0 reboot=k panic=1 acpi=off pci=off i8042.noaux i8042.nomux i8042.nopnp init=/init.sh random.trust_cpu=on root=/dev/vda rw virtio_mmio.device=4K@0xd0000000:5 virtio_mmio.device=4K@0xd0001000:6
+    [    0.006858] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes, linear)
+    [    0.007003] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes, linear)
+    [    0.007047] mem auto-init: stack:off, heap alloc:off, heap free:off
+    [    0.007947] Memory: 491940K/523896K available (10243K kernel code, 629K rwdata, 1860K rodata, 1408K init, 6048K bss, 31956K reserved, 0K cma-reserved)
+    [    0.007980] random: get_random_u64 called from __kmem_cache_create+0x3d/0x540 with crng_init=0
+    [    0.008053] SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=1, Nodes=1
+    [    0.008146] rcu: Hierarchical RCU implementation.
+    [    0.008147] rcu:     RCU restricting CPUs from NR_CPUS=128 to nr_cpu_ids=1.
+    [    0.008151] rcu: RCU calculated value of scheduler-enlistment delay is 25 jiffies.
+    [    0.008152] rcu: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
+    [    0.008170] NR_IRQS: 4352, nr_irqs: 48, preallocated irqs: 16
+    [    0.008373] random: crng done (trusting CPU's manufacturer)
+    [    0.008430] Console: colour dummy device 80x25
+    [    0.052276] printk: console [ttyS0] enabled
+    [    0.052685] APIC: Switch to symmetric I/O mode setup
+    [    0.053288] x2apic enabled
+    [    0.053705] Switched APIC routing to physical x2apic.
+    [    0.054213] KVM setup pv IPIs
+    [    0.055559] clocksource: tsc-early: mask: 0xffffffffffffffff max_cycles: 0x1128af0325d, max_idle_ns: 440795261011 ns
+    [    0.056516] Calibrating delay loop (skipped) preset value.. 2380.80 BogoMIPS (lpj=4761600)
+    [    0.057259] pid_max: default: 32768 minimum: 301
+    [    0.057726] LSM: Security Framework initializing
+    [    0.058176] SELinux:  Initializing.
+    [    0.058556] Mount-cache hash table entries: 1024 (order: 1, 8192 bytes, linear)
+    [    0.059221] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes, linear)
+    [    0.060382] x86/cpu: User Mode Instruction Prevention (UMIP) activated
+    [    0.060510] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
+    [    0.060510] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0
+    [    0.060510] Spectre V1 : Mitigation: usercopy/swapgs barriers and __user pointer sanitization
+    [    0.060510] Spectre V2 : Mitigation: Enhanced IBRS
+    [    0.060510] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch
+    [    0.060510] Spectre V2 : mitigation: Enabling conditional Indirect Branch Prediction Barrier
+    [    0.060510] Speculative Store Bypass: Mitigation: Speculative Store Bypass disabled via prctl and seccomp
+    [    0.060510] Freeing SMP alternatives memory: 32K
+    [    0.060510] smpboot: CPU0: Intel(R) Xeon(R) Processor @ 1.20GHz (family: 0x6, model: 0x3e, stepping: 0x4)
+    [    0.060510] Performance Events: unsupported p6 CPU model 62 no PMU driver, software events only.
+    [    0.060510] rcu: Hierarchical SRCU implementation.
+    [    0.060510] smp: Bringing up secondary CPUs ...
+    [    0.060510] smp: Brought up 1 node, 1 CPU
+    [    0.060510] smpboot: Max logical packages: 1
+    [    0.060523] smpboot: Total of 1 processors activated (2380.80 BogoMIPS)
+    [    0.061338] devtmpfs: initialized
+    [    0.061710] x86/mm: Memory block size: 128MB
+    [    0.062341] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
+    [    0.063245] futex hash table entries: 256 (order: 2, 16384 bytes, linear)
+    [    0.063946] thermal_sys: Registered thermal governor 'fair_share'
+    [    0.063946] thermal_sys: Registered thermal governor 'step_wise'
+    [    0.064522] thermal_sys: Registered thermal governor 'user_space'
+    [    0.065313] NET: Registered protocol family 16
+    [    0.066398] DMA: preallocated 128 KiB GFP_KERNEL pool for atomic allocations
+    [    0.067057] DMA: preallocated 128 KiB GFP_KERNEL|GFP_DMA pool for atomic allocations
+    [    0.067778] DMA: preallocated 128 KiB GFP_KERNEL|GFP_DMA32 pool for atomic allocations
+    [    0.068506] audit: initializing netlink subsys (disabled)
+    [    0.068708] cpuidle: using governor ladder
+    [    0.069097] cpuidle: using governor menu
+    [    0.070636] audit: type=2000 audit(1621113181.800:1): state=initialized audit_enabled=0 res=1
+    [    0.076346] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
+    [    0.077007] ACPI: Interpreter disabled.
+    [    0.077445] SCSI subsystem initialized
+    [    0.077812] pps_core: LinuxPPS API ver. 1 registered
+    [    0.078277] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
+    [    0.079206] PTP clock support registered
+    [    0.079741] NetLabel: Initializing
+    [    0.080111] NetLabel:  domain hash size = 128
+    [    0.080529] NetLabel:  protocols = UNLABELED CIPSOv4 CALIPSO
+    [    0.081113] NetLabel:  unlabeled traffic allowed by default
+    [    0.082072] clocksource: Switched to clocksource kvm-clock
+    [    0.082715] VFS: Disk quotas dquot_6.6.0
+    [    0.083123] VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
+    [    0.083855] pnp: PnP ACPI: disabled
+    [    0.084510] NET: Registered protocol family 2
+    [    0.084718] tcp_listen_portaddr_hash hash table entries: 256 (order: 0, 4096 bytes, linear)
+    [    0.085602] TCP established hash table entries: 4096 (order: 3, 32768 bytes, linear)
+    [    0.086365] TCP bind hash table entries: 4096 (order: 4, 65536 bytes, linear)
+    [    0.087025] TCP: Hash tables configured (established 4096 bind 4096)
+    [    0.087749] UDP hash table entries: 256 (order: 1, 8192 bytes, linear)
+    [    0.088481] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes, linear)
+    [    0.089261] NET: Registered protocol family 1
+    [    0.090395] virtio-mmio: Registering device virtio-mmio.0 at 0xd0000000-0xd0000fff, IRQ 5.
+    [    0.091388] virtio-mmio: Registering device virtio-mmio.1 at 0xd0001000-0xd0001fff, IRQ 6.
+    [    0.092222] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x1128af0325d, max_idle_ns: 440795261011 ns
+    [    0.093322] clocksource: Switched to clocksource tsc
+    [    0.093824] platform rtc_cmos: registered platform RTC device (no PNP device found)
+    [    0.094618] check: Scanning for low memory corruption every 60 seconds
+    [    0.095394] Initialise system trusted keyrings
+    [    0.095836] Key type blacklist registered
+    [    0.096427] workingset: timestamp_bits=36 max_order=17 bucket_order=0
+    [    0.097849] squashfs: version 4.0 (2009/01/31) Phillip Lougher
+    [    0.107488] Key type asymmetric registered
+    [    0.107905] Asymmetric key parser 'x509' registered
+    [    0.108409] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 252)
+    [    0.109435] Serial: 8250/16550 driver, 1 ports, IRQ sharing disabled
+    [    0.110116] serial8250: ttyS0 at I/O 0x3f8 (irq = 4, base_baud = 115200) is a 16550A
+    [    0.111877] loop: module loaded
+    [    0.112426] virtio_blk virtio0: [vda] 441152 512-byte logical blocks (226 MB/215 MiB)
+    [    0.113229] vda: detected capacity change from 0 to 225869824
+    [    0.114143] Loading iSCSI transport class v2.0-870.
+    [    0.114753] iscsi: registered transport (tcp)
+    [    0.115162] tun: Universal TUN/TAP device driver, 1.6
+    [    0.115955] i8042: PNP detection disabled
+    [    0.116498] serio: i8042 KBD port at 0x60,0x64 irq 1
+    [    0.117089] input: AT Raw Set 2 keyboard as /devices/platform/i8042/serio0/input/input0
+    [    0.117932] intel_pstate: CPU model not supported
+    [    0.118448] hid: raw HID events driver (C) Jiri Kosina
+    [    0.119090] Initializing XFRM netlink socket
+    [    0.119555] NET: Registered protocol family 10
+    [    0.120285] Segment Routing with IPv6
+    [    0.120812] NET: Registered protocol family 17
+    [    0.121350] Bridge firewalling registered
+    [    0.122026] NET: Registered protocol family 40
+    [    0.122515] IPI shorthand broadcast: enabled
+    [    0.122961] sched_clock: Marking stable (72512224, 48198862)->(137683636, -16972550)
+    [    0.123796] registered taskstats version 1
+    [    0.124203] Loading compiled-in X.509 certificates
+    [    0.125355] Loaded X.509 cert 'Build time autogenerated kernel key: 6203e6adc37b712d3b220a26b38f3d31311d5966'
+    [    0.126355] Key type ._fscrypt registered
+    [    0.126736] Key type .fscrypt registered
+    [    0.127109] Key type fscrypt-provisioning registered
+    [    0.127657] Key type encrypted registered
+    [    0.144629] IP-Config: Complete:
+    [    0.144968]      device=eth0, hwaddr=9e:72:c7:04:6b:80, ipaddr=192.168.128.9, mask=255.255.255.0, gw=192.168.128.1
+    [    0.146044]      host=192.168.128.9, domain=, nis-domain=(none)
+    [    0.146604]      bootserver=255.255.255.255, rootserver=255.255.255.255, rootpath=
+    [    0.148347] EXT4-fs (vda): mounted filesystem with ordered data mode. Opts: (null)
+    [    0.149098] VFS: Mounted root (ext4 filesystem) on device 254:0.
+    [    0.149761] devtmpfs: mounted
+    [    0.150340] Freeing unused decrypted memory: 2040K
+    [    0.151148] Freeing unused kernel image (initmem) memory: 1408K
+    [    0.156621] Write protecting the kernel read-only data: 14336k
+    [    0.158657] Freeing unused kernel image (text/rodata gap) memory: 2044K
+    [    0.159490] Freeing unused kernel image (rodata/data gap) memory: 188K
+    [    0.160150] Run /init.sh as init process
+    462:C 15 May 2021 21:13:01.903 # oO0OoO0OoO0Oo Redis is starting oO0OoO0OoO0Oo
+    462:C 15 May 2021 21:13:01.904 # Redis version=6.2.3, bits=64, commit=00000000, modified=0, pid=462, just started
+    462:C 15 May 2021 21:13:01.905 # Warning: no config file specified, using the default config. In order to specify a config file use redis-server /path/to/redis.conf
+    462:M 15 May 2021 21:13:01.907 * Increased maximum number of open files to 10032 (it was originally set to 1024).
+    462:M 15 May 2021 21:13:01.909 * monotonic clock: POSIX clock_gettime
+                    _._
+               _.-``__ ''-._
+          _.-``    `.  `_.  ''-._           Redis 6.2.3 (00000000/0) 64 bit
+      .-`` .-```.  ```\/    _.,_ ''-._
+     (    '      ,       .-`  | `,    )     Running in standalone mode
+     |`-._`-...-` __...-.``-._|'` _.-'|     Port: 6379
+     |    `-._   `._    /     _.-'    |     PID: 462
+      `-._    `-._  `-./  _.-'    _.-'
+     |`-._`-._    `-.__.-'    _.-'_.-'|
+     |    `-._`-._        _.-'_.-'    |           https://redis.io
+      `-._    `-._`-.__.-'_.-'    _.-'
+     |`-._`-._    `-.__.-'    _.-'_.-'|
+     |    `-._`-._        _.-'_.-'    |
+      `-._    `-._`-.__.-'_.-'    _.-'
+          `-._    `-.__.-'    _.-'
+              `-._        _.-'
+                  `-.__.-'
+
+    462:M 15 May 2021 21:13:01.922 # Server initialized
+    462:M 15 May 2021 21:13:01.923 * Ready to accept connections
+
+We can do a quick test with the following:
+
+``` bash
+; sudo docker run -it --rm redis redis-cli -h 192.168.128.9
+192.168.128.9:6379> get foo
+(nil)
+192.168.128.9:6379> set foo 1
+OK
+192.168.128.9:6379> get foo
+"1"
+192.168.128.9:6379>
+```
diff --git a/content/notes/making-sense-intel-amd-cpus.md b/content/notes/making-sense-intel-amd-cpus.md
new file mode 100644
index 0000000..22633af
--- /dev/null
+++ b/content/notes/making-sense-intel-amd-cpus.md
@@ -0,0 +1,191 @@
+---
+title: Making sense of Intel and AMD CPUs naming
+date: 2021-12-29
+tags:
+  - amd
+  - intel
+  - cpu
+---
+
+# Intel
+
+## Core
+
+The lineup for the Core family is i3, i5, i7 and i9. As of December
+2021, the current generation is Alder Lake (12th generation).
+
+The brand modifiers are:
+
+-   **i3**: laptops/low-end desktop
+-   **i5**: mainstream users
+-   **i7**: high-end users
+-   **i9**: enthusiast users
+
+How to read a SKU? Let's use the
+[i7-12700K](https://ark.intel.com/content/www/us/en/ark/products/134594/intel-core-i712700k-processor-25m-cache-up-to-5-00-ghz.html)
+processor:
+
+-   **i7**: high-end users
+-   **12**: 12th generation
+-   **700**: SKU digits, usually assigned in the order the processors
+    are developed
+-   **K**: unlocked
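+
+The breakdown above can be sketched in code. This is a minimal
+sketch, not an official parser: the `SKU` type, `ParseSKU` and the
+regular expression are hypothetical names, and the pattern assumes the
+last three digits are always the SKU digits, with one or two
+generation digits before them. Mobile names whose suffix contains a
+digit (the `G..` row in the table below) are not handled.
+
+``` go
+package main
+
+import (
+    "fmt"
+    "regexp"
+)
+
+// SKU holds the parts of an Intel Core model name such as "i7-12700K".
+type SKU struct {
+    Brand      string // brand modifier, e.g. "i7"
+    Generation string // generation, e.g. "12"
+    Digits     string // SKU digits, e.g. "700"
+    Suffix     string // suffix, e.g. "K" (may be empty)
+}
+
+// skuRe is an assumption: one or two generation digits, then exactly
+// three SKU digits, then an optional suffix made of capital letters.
+var skuRe = regexp.MustCompile(`^(i[3579])-(\d{1,2})(\d{3})([A-Z]*)$`)
+
+// ParseSKU splits a model name into brand, generation, digits and suffix.
+func ParseSKU(name string) (SKU, error) {
+    m := skuRe.FindStringSubmatch(name)
+    if m == nil {
+        return SKU{}, fmt.Errorf("unrecognized SKU: %q", name)
+    }
+    return SKU{Brand: m[1], Generation: m[2], Digits: m[3], Suffix: m[4]}, nil
+}
+
+func main() {
+    sku, err := ParseSKU("i7-12700K")
+    if err != nil {
+        panic(err)
+    }
+    fmt.Printf("%+v\n", sku)
+}
+```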
+
+List of suffixes:
+
+| suffix | meaning                                |
+|--------|----------------------------------------|
+| G..    | integrated graphics                    |
+| E      | embedded                               |
+| F      | requires discrete graphics card        |
+| H      | high performance for mobile            |
+| HK     | high performance for mobile / unlocked |
+| K      | unlocked                               |
+| S      | special edition                        |
+| T      | power optimized lifestyle              |
+| U      | mobile power efficient                 |
+| Y      | mobile low power                       |
+| X/XE   | unlocked, high end                     |
+
+> **Unlocked,** what does that mean? A processor with the **K** suffix
+> is made with an unlocked clock multiplier. When used with a
+> compatible chipset, it's possible to overclock the processor.
+
+### Sockets/Chipsets
+
+For the Alder Lake generation, the supported socket is the
+[LGA 1700](https://en.wikipedia.org/wiki/LGA_1700).
+
+For now, the only supported chipsets for Alder Lake are:
+
+| feature                     | [z690](https://ark.intel.com/content/www/us/en/ark/products/218833/intel-z690-chipset.html) | [h670](https://www.intel.com/content/www/us/en/products/sku/218831/intel-h670-chipset/specifications.html) | [b660](https://ark.intel.com/content/www/us/en/ark/products/218832/intel-b660-chipset.html) | [h610](https://www.intel.com/content/www/us/en/products/sku/218829/intel-h610-chipset/specifications.html) |
+|-----------------------------|---------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------------------|------------------------------------------------------------------------------------------------------------|
+| P and E cores overclocking  | yes                                                                                         | no                                                                                                         | no                                                                                          | no                                                                                                         |
+| memory overclocking         | yes                                                                                         | yes                                                                                                        | yes                                                                                         | no                                                                                                         |
+| DMI 4 lanes                 | 8                                                                                           | 8                                                                                                          | 4                                                                                           | 4                                                                                                          |
+| chipset PCIe 4.0 lanes      | up to 12                                                                                    | up to 12                                                                                                   | up to 6                                                                                     | none                                                                                                       |
+| chipset PCIe 3.0 lanes      | up to 16                                                                                    | up to 12                                                                                                   | up to 8                                                                                     | 8                                                                                                          |
+| SATA 3.0 ports              | up to 8                                                                                     | up to 8                                                                                                    | 4                                                                                           | 4                                                                                                          |
+
+### Alder Lake (12th generation)
+
+| model      | p-cores | e-cores | GHz (base) | GHz (boosted) | TDP  |
+|------------|---------|---------|------------|---------------|------|
+| i9-12900K  | 8 (16)  | 8       | 3.2/2.4    | 5.1/3.9       | 241W |
+| i9-12900KF | 8 (16)  | 8       | 3.2/2.4    | 5.1/3.9       | 241W |
+| i7-12700K  | 8 (16)  | 4       | 3.6/2.7    | 4.9/3.8       | 190W |
+| i7-12700KF | 8 (16)  | 4       | 3.6/2.7    | 4.9/3.8       | 190W |
+| i5-12600K  | 6 (12)  | 4       | 3.7/2.8    | 4.9/3.6       | 150W |
+| i5-12600KF | 6 (12)  | 4       | 3.7/2.8    | 4.9/3.6       | 150W |
+
+-   support DDR4 and DDR5 (up to DDR5-4800)
+-   support PCIe 4.0 and 5.0 (16 PCIe 5.0 and 4 PCIe 4.0)
+
+The socket used is the [LGA
+1700](https://en.wikipedia.org/wiki/LGA_1700).
+
+Alder Lake is a hybrid architecture, featuring both P-cores
+(performance cores) and E-cores (efficient cores). P-cores are based on
+the [Golden Cove](https://en.wikipedia.org/wiki/Golden_Cove)
+architecture, while the E-cores are based on the
+[Gracemont](https://en.wikipedia.org/wiki/Gracemont_(microarchitecture))
+architecture.
+
+This is a [good
+article](https://www.anandtech.com/show/16881/a-deep-dive-into-intels-alder-lake-microarchitectures/2)
+to read about this model. Inside the processor there's a microcontroller
+that monitors what each thread is doing. The OS scheduler can use this
+information as a hint to decide whether a thread should run on a
+performance core or an efficiency core.
+
+As of December 2021 this is not yet properly supported by the Linux
+kernel.
+
+## Xeon
+
+Xeon is the brand of Intel processors designed for non-consumer servers
+and workstations. The most recent generations are:
+
+-   Skylake (2017)
+-   Cascade Lake (2019)
+-   Cooper Lake (2020)
+
+The following brand identifiers are used:
+
+-   platinum
+-   gold
+-   silver
+-   bronze
+
+# AMD
+
+## Ryzen
+
+There are multiple generations for this brand of processors. They are
+based on the [Zen
+microarchitecture](https://en.wikipedia.org/wiki/Zen_(microarchitecture)).
+The current (as of December 2021) generation is Ryzen 5000.
+
+The brand modifiers are:
+
+-   ryzen 3: entry level
+-   ryzen 5: mainstream
+-   ryzen 7: high end performance
+-   ryzen 9: enthusiast
+
+List of suffixes:
+
+| suffix | meaning                                    |
+|--------|--------------------------------------------|
+| X      | high performance                           |
+| G      | integrated graphics                        |
+| T      | power optimized lifecycle                  |
+| S      | low power desktop with integrated graphics |
+| H      | high performance mobile                    |
+| U      | standard mobile                            |
+| M      | low power mobile                           |
+
+## EPYC
+
+EPYC is the AMD brand of processors for the server market, based on the
+zen architecture. They use the
+[SP3](https://en.wikipedia.org/wiki/Socket_SP3) socket. The EPYC
+processor is chipset free.
+
+## Threadripper
+
+The threadripper is for high performance desktops. It uses the
+[TR4](https://en.wikipedia.org/wiki/Socket_TR4) socket. At the moment
+there's only one chipset that supports this processor, the
+[X399](https://en.wikipedia.org/wiki/List_of_AMD_chipsets#TR4_chipsets).
+
+The threadripper based on zen3 architecture is not yet released, but
+it's expected to hit the market in the first half of Q1 2022.
+
+## Sockets/Chipsets
+
+The majority of these processors use the [AM4
+socket](https://en.wikipedia.org/wiki/Socket_AM4). The threadripper line
+uses different sockets.
+
+There are multiple
+[chipsets](https://en.wikipedia.org/wiki/Socket_AM4#Chipsets) for the AM4
+socket. The more advanced ones are the B550 and the X570.
+
+The threadripper processors use the TR4, sTRX4 and sWRX8 sockets.
+
+## Zen 3
+
+Zen 3 was released in November 2020.
+
+| model         | cores   | GHz (base) | GHz (boosted) | PCIe lanes | TDP  |
+|---------------|---------|------------|---------------|------------|------|
+| ryzen 5 5600x | 6 (12)  | 3.7        | 4.6           | 24         | 65W  |
+| ryzen 7 5800  | 8 (16)  | 3.4        | 4.6           | 24         | 65W  |
+| ryzen 7 5800x | 8 (16)  | 3.8        | 4.7           | 24         | 105W |
+| ryzen 9 5900  | 12 (24) | 3.0        | 4.7           | 24         | 65W  |
+| ryzen 9 5900x | 12 (24) | 3.7        | 4.8           | 24         | 105W |
+| ryzen 9 5950x | 16 (32) | 3.4        | 4.9           | 24         | 105W |
+
+-   support PCIe 3.0 and PCIe 4.0 (except for the G series)
+-   only support DDR4 (up to DDR4-3200)
diff --git a/content/notes/stuff-about-pcie.md b/content/notes/stuff-about-pcie.md
new file mode 100644
index 0000000..a3644f1
--- /dev/null
+++ b/content/notes/stuff-about-pcie.md
@@ -0,0 +1,242 @@
+---
+title: Stuff about PCIe
+date: 2022-01-03
+tags:
+  - linux
+  - hardware
+---
+
+# Speed
+
+The most common versions are 3 and 4, while 5 is starting to be
+available with newer Intel processors.
+
+| ver | encoding  | transfer rate | x1         | x2          | x4         | x8         | x16         |
+|-----|-----------|---------------|------------|-------------|------------|------------|-------------|
+| 1   | 8b/10b    | 2.5GT/s       | 250MB/s    | 500MB/s     | 1GB/s      | 2GB/s      | 4GB/s       |
+| 2   | 8b/10b    | 5.0GT/s       | 500MB/s    | 1GB/s       | 2GB/s      | 4GB/s      | 8GB/s       |
+| 3   | 128b/130b | 8.0GT/s       | 984.6 MB/s | 1.969 GB/s  | 3.94 GB/s  | 7.88 GB/s  | 15.75 GB/s  |
+| 4   | 128b/130b | 16.0GT/s      | 1969 MB/s  | 3.938 GB/s  | 7.88 GB/s  | 15.75 GB/s | 31.51 GB/s  |
+| 5   | 128b/130b | 32.0GT/s      | 3938 MB/s  | 7.877 GB/s  | 15.75 GB/s | 31.51 GB/s | 63.02 GB/s  |
+| 6   | 128b/130  | 64.0 GT/s     | 7877 MB/s  | 15.754 GB/s | 31.51 GB/s | 63.02 GB/s | 126.03 GB/s |
+
+This is a
+[useful](https://community.mellanox.com/s/article/understanding-pcie-configuration-for-maximum-performance)
+link to understand the formula:
+
+    Maximum PCIe Bandwidth = SPEED * WIDTH * (1 - ENCODING) - 1Gb/s
+
+We remove 1Gb/s for protocol overhead and error corrections. The main
+difference between the generations, besides the supported speed, is the
+encoding overhead. Generations 1 and 2 use 8b/10b encoding: every 8
+bits of data are sent as 10 bits on the wire, a 20% overhead. This was
+improved in generation 3, where the overhead was reduced to 1.5%
+(2/130) - see
+[8b/10b encoding](https://en.wikipedia.org/wiki/8b/10b_encoding) and
+[128b/130b encoding](https://en.wikipedia.org/wiki/64b/66b_encoding).
+
+If we apply the formula, for a PCIe version 3 device using 4 lanes we
+can expect about 3.7GB/s of data transfer rate:
+
+    8GT/s * 4 lanes * (1 - 2/130) - 1G = 32G * 0.985 - 1G = ~30Gb/s -> 3750MB/s
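+
+The same computation can be written as a small function. This is a
+sketch of the formula above, nothing more: `maxBandwidthGbps` is a
+name made up for this example, and the flat 1Gb/s deduction is taken
+as-is from the formula.
+
+``` go
+package main
+
+import "fmt"
+
+// maxBandwidthGbps applies SPEED * WIDTH * (1 - ENCODING) - 1Gb/s.
+// speedGT is the per-lane transfer rate in GT/s, lanes is the link
+// width, and encodingOverhead is the fraction lost to line encoding
+// (0.2 for 8b/10b, 2.0/130 for 128b/130b).
+func maxBandwidthGbps(speedGT float64, lanes int, encodingOverhead float64) float64 {
+    return speedGT*float64(lanes)*(1-encodingOverhead) - 1
+}
+
+func main() {
+    // gen3 (8GT/s, 128b/130b) on 4 lanes, as in the example above.
+    gbps := maxBandwidthGbps(8, 4, 2.0/130)
+    // Divide by 8 to convert gigabits to gigabytes.
+    fmt.Printf("gen3 x4: %.1f Gb/s (%.2f GB/s)\n", gbps, gbps/8)
+}
+```
+
+Without the intermediate rounding the result is about 30.5Gb/s, so
+slightly above the rounded figure used above.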
+
+# Topology
+
+The easiest way to see the PCIe topology is with `lspci`:
+
+    $ lspci -tv
+    -[0000:00]-+-00.0  Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Root Complex
+               +-01.0  Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
+               +-01.1-[01]----00.0  OCZ Technology Group, Inc. RD400/400A SSD
+               +-01.3-[02-03]----00.0-[03]----00.0  ASPEED Technology, Inc. ASPEED Graphics Family
+               +-01.5-[04]--+-00.0  Intel Corporation I350 Gigabit Network Connection
+               |            +-00.1  Intel Corporation I350 Gigabit Network Connection
+               |            +-00.2  Intel Corporation I350 Gigabit Network Connection
+               |            \-00.3  Intel Corporation I350 Gigabit Network Connection
+               +-02.0  Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
+               +-03.0  Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
+               +-04.0  Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
+               +-07.0  Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
+               +-07.1-[05]--+-00.0  Advanced Micro Devices, Inc. [AMD] Zeppelin/Raven/Raven2 PCIe Dummy Function
+               |            +-00.2  Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Platform Security Processor
+               |            \-00.3  Advanced Micro Devices, Inc. [AMD] Zeppelin USB 3.0 Host controller
+               +-08.0  Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-1fh) PCIe Dummy Host Bridge
+               +-08.1-[06]--+-00.0  Advanced Micro Devices, Inc. [AMD] Zeppelin/Renoir PCIe Dummy Function
+               |            +-00.1  Advanced Micro Devices, Inc. [AMD] Zeppelin Cryptographic Coprocessor NTBCCP
+               |            +-00.2  Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode]
+               |            \-00.3  Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) HD Audio Controller
+               +-14.0  Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller
+               +-14.3  Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge
+               +-18.0  Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 0
+               +-18.1  Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 1
+               +-18.2  Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 2
+               +-18.3  Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 3
+               +-18.4  Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 4
+               +-18.5  Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 5
+               +-18.6  Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 6
+               \-18.7  Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7
+
+# View a single device
+
+    $ lspci -s 0000:01:00.0
+    01:00.0 Non-Volatile memory controller: OCZ Technology Group, Inc. RD400/400A SSD (rev 01)
+
+# Reading `lspci` output
+
+    $ sudo lspci -vvv -s 0000:01:00.0
+    01:00.0 Non-Volatile memory controller: OCZ Technology Group, Inc. RD400/400A SSD (rev 01) (prog-if 02 [NVM Express])
+        Subsystem: OCZ Technology Group, Inc. RD400/400A SSD
+        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
+        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
+        Latency: 0, Cache Line Size: 64 bytes
+        Interrupt: pin A routed to IRQ 41
+        NUMA node: 0
+        Region 0: Memory at ef800000 (64-bit, non-prefetchable) [size=16K]
+        Capabilities: [40] Power Management version 3
+            Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
+            Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
+        Capabilities: [50] MSI: Enable- Count=1/8 Maskable- 64bit+
+            Address: 0000000000000000  Data: 0000
+        Capabilities: [70] Express (v2) Endpoint, MSI 00
+            DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 unlimited
+                ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0.000W
+            DevCtl: CorrErr- NonFatalErr- FatalErr- UnsupReq-
+                RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop- FLReset-
+                MaxPayload 128 bytes, MaxReadReq 512 bytes
+            DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr+ TransPend-
+            LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <4us
+                ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
+            LnkCtl: ASPM L1 Enabled; RCB 64 bytes, Disabled- CommClk+
+                ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
+            LnkSta: Speed 8GT/s (ok), Width x4 (ok)
+                TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
+            DevCap2: Completion Timeout: Range ABCD, TimeoutDis+ NROPrPrP- LTR+
+                 10BitTagComp- 10BitTagReq- OBFF Not Supported, ExtFmt- EETLPPrefix-
+                 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
+                 FRS- TPHComp- ExtTPHComp-
+                 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
+            DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis- LTR- OBFF Disabled,
+                 AtomicOpsCtl: ReqEn-
+            LnkCap2: Supported Link Speeds: 2.5-8GT/s, Crosslink- Retimer- 2Retimers- DRS-
+            LnkCtl2: Target Link Speed: 8GT/s, EnterCompliance- SpeedDis-
+                 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
+                 Compliance De-emphasis: -6dB
+            LnkSta2: Current De-emphasis Level: -3.5dB, EqualizationComplete+ EqualizationPhase1+
+                 EqualizationPhase2+ EqualizationPhase3+ LinkEqualizationRequest-
+                 Retimer- 2Retimers- CrosslinkRes: unsupported
+        Capabilities: [b0] MSI-X: Enable+ Count=8 Masked-
+            Vector table: BAR=0 offset=00002000
+            PBA: BAR=0 offset=00003000
+        Capabilities: [100 v2] Advanced Error Reporting
+            UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol-
+            UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
+            UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
+            CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr+
+            CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- AdvNonFatalErr-
+            AERCap: First Error Pointer: 14, ECRCGenCap+ ECRCGenEn- ECRCChkCap+ ECRCChkEn-
+                MultHdrRecCap- MultHdrRecEn- TLPPfxPres- HdrLogCap-
+            HeaderLog: 05000001 0000010f 02000010 0f86d1a0
+        Capabilities: [178 v1] Secondary PCI Express
+            LnkCtl3: LnkEquIntrruptEn- PerformEqu-
+            LaneErrStat: 0
+        Capabilities: [198 v1] Latency Tolerance Reporting
+            Max snoop latency: 0ns
+            Max no snoop latency: 0ns
+        Capabilities: [1a0 v1] L1 PM Substates
+            L1SubCap: PCI-PM_L1.2+ PCI-PM_L1.1- ASPM_L1.2+ ASPM_L1.1- L1_PM_Substates+
+                  PortCommonModeRestoreTime=255us PortTPowerOnTime=400us
+            L1SubCtl1: PCI-PM_L1.2- PCI-PM_L1.1- ASPM_L1.2- ASPM_L1.1-
+                   T_CommonMode=0us LTR1.2_Threshold=0ns
+            L1SubCtl2: T_PwrOn=10us
+        Kernel driver in use: nvme
+        Kernel modules: nvme
+
+A few things to note from this output:
+
+-   **GT/s** is the number of transfers per second supported by each
+    lane (here, 8 billion transfers / second). This is a gen3
+    controller (gen1 is 2.5 and gen2 is 5)
+-   **LnkCap** is the capabilities which were communicated, and
+    **LnkSta** is the current status. You want them to report the same
+    values. If they don't, you are not using the hardware as it is
+    intended (here I'm assuming the hardware is intended to work as a
+    gen3 controller). In case the device is downgraded, the output will
+    be like this: `LnkSta: Speed 2.5GT/s (downgraded), Width x16 (ok)`
+-   **width** is the number of lanes that can be used by the device
+    (here, we can use 4 lanes)
+-   **MaxPayload** is the maximum payload size of a PCIe packet
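+
+Comparing the advertised and the negotiated link can be automated. The
+sketch below is one possible approach, assuming the `lspci -vvv`
+output format shown above; `parseLinks`, `link` and `linkRe` are
+hypothetical names and the regular expression only covers the `Speed`
+and `Width` fields.
+
+``` go
+package main
+
+import (
+    "errors"
+    "fmt"
+    "regexp"
+    "strings"
+)
+
+// linkRe extracts the speed and width from a LnkCap or LnkSta line.
+var linkRe = regexp.MustCompile(`Speed ([0-9.]+GT/s).*Width (x[0-9]+)`)
+
+// link holds the speed and width reported on one of the two lines.
+type link struct {
+    Speed, Width string
+}
+
+// parseLinks returns the first LnkCap and LnkSta entries found in the
+// given lspci output.
+func parseLinks(output string) (lnkCap, lnkSta link, err error) {
+    var seenCap, seenSta bool
+    for _, line := range strings.Split(output, "\n") {
+        line = strings.TrimSpace(line)
+        m := linkRe.FindStringSubmatch(line)
+        if m == nil {
+            continue
+        }
+        switch {
+        case !seenCap && strings.HasPrefix(line, "LnkCap:"):
+            lnkCap, seenCap = link{Speed: m[1], Width: m[2]}, true
+        case !seenSta && strings.HasPrefix(line, "LnkSta:"):
+            lnkSta, seenSta = link{Speed: m[1], Width: m[2]}, true
+        }
+    }
+    if !seenCap || !seenSta {
+        return lnkCap, lnkSta, errors.New("LnkCap or LnkSta line not found")
+    }
+    return lnkCap, lnkSta, nil
+}
+
+func main() {
+    // The two lines are copied from the lspci output above.
+    out := strings.Join([]string{
+        "LnkCap: Port #0, Speed 8GT/s, Width x4, ASPM L1, Exit Latency L1 <4us",
+        "LnkSta: Speed 8GT/s (ok), Width x4 (ok)",
+    }, "\n")
+    lnkCap, lnkSta, err := parseLinks(out)
+    if err != nil {
+        panic(err)
+    }
+    fmt.Println("link runs at advertised speed and width:", lnkCap == lnkSta)
+}
+```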
+
+# Debugging
+
+PCI configuration registers can be used to debug various PCI bus issues.
+
+The various registers define bits that are either set (indicated with a
+'+') or unset (indicated with a '-'). These bits typically have
+attributes of 'RW1C' meaning you can read and write them and need to
+write a '1' to clear them. Because these are status bits, if you wanted
+to 'count' the occurrences of them you would need to write some software
+that detected the bits getting set, incremented counters, and cleared
+them over time.
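+
+A first step for such software is to turn one register line into
+named bits. The sketch below does only that part, assuming the
+`Name: Flag+ Flag-` layout printed by `lspci`; `parseFlags` is a
+hypothetical helper, and polling, counting and clearing the bits are
+left out.
+
+``` go
+package main
+
+import (
+    "fmt"
+    "strings"
+)
+
+// parseFlags turns a line such as "DevSta: CorrErr+ NonFatalErr-" into
+// the register name and a map from flag name to whether the bit is set.
+// Fields that do not end in '+' or '-' are ignored.
+func parseFlags(line string) (string, map[string]bool) {
+    flags := make(map[string]bool)
+    parts := strings.SplitN(strings.TrimSpace(line), ":", 2)
+    if len(parts) != 2 {
+        return "", flags
+    }
+    for _, f := range strings.Fields(parts[1]) {
+        switch {
+        case strings.HasSuffix(f, "+"):
+            flags[strings.TrimSuffix(f, "+")] = true
+        case strings.HasSuffix(f, "-"):
+            flags[strings.TrimSuffix(f, "-")] = false
+        }
+    }
+    return parts[0], flags
+}
+
+func main() {
+    // DevSta line copied from the lspci output above.
+    name, flags := parseFlags("DevSta: CorrErr+ NonFatalErr- FatalErr- UnsupReq+ AuxPwr+ TransPend-")
+    for _, f := range []string{"CorrErr", "NonFatalErr", "FatalErr", "UnsupReq"} {
+        fmt.Printf("%s.%s set: %v\n", name, f, flags[f])
+    }
+}
+```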
+
+The 'Device Status Register' (DevSta) shows at a high level if there
+have been correctable errors detected (CorrErr), non-fatal errors
+detected (NonFatalErr), fatal errors detected (FatalErr), unsupported
+requests detected (UnsupReq), if auxiliary power was detected
+(AuxPwr), and if there are transactions pending (TransPend: non-posted
+requests that have not been completed).
+
+    10000:01:00.0 Non-Volatile memory controller: Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller] (prog-if 02 [NVM Express])
+    ...
+            Capabilities: [100 v1] Advanced Error Reporting
+                    UESta:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
+                    UEMsk:  DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
+                    UESvrt: DLP+ SDES+ TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
+                    CESta:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
+                    CEMsk:  RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
+                    AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
+
+-   The Uncorrectable Error Status (UESta) reports error status of
+    individual uncorrectable error sources (no bits are set above):
+    -   Data Link Protocol Error (DLP)
+    -   Surprise Down Error (SDES)
+    -   Poisoned TLP (TLP)
+    -   Flow Control Protocol Error (FCP)
+    -   Completion Timeout (CmpltTO)
+    -   Completer Abort (CmpltAbrt)
+    -   Unexpected Completion (UnxCmplt)
+    -   Receiver Overflow (RxOF)
+    -   Malformed TLP (MalfTLP)
+    -   ECRC Error (ECRC)
+    -   Unsupported Request Error (UnsupReq)
+    -   ACS Violation (ACSViol)
+-   The Uncorrectable Error Mask (UEMsk) controls reporting of
+    individual errors by the device to the PCIe root complex. A masked
+    error (bit set) is not recorded or reported (above, no errors are
+    being masked).
+-   The Uncorrectable Error Severity (UESvrt) controls whether an
+    individual error is reported as a Non-fatal (clear) or Fatal error
+    (set).
+-   The Correctable Error Status (CESta) reports error status of
+    individual correctable error sources (no bits are set above):
+    -   Receiver Error (RxErr)
+    -   Bad TLP status (BadTLP)
+    -   Bad DLLP status (BadDLLP)
+    -   Replay Timer Timeout status (Timeout)
+    -   REPLAY NUM Rollover status (Rollover)
+    -   Advisory Non-Fatal Error (NonFatalErr)
+-   The Correctable Error Mask (CEMsk) controls reporting of individual
+    errors by the device to the PCIe root complex. A masked error (bit
+    set) is not reported to the RC. Above shows that Advisory Non-Fatal
+    Errors are being masked - this bit is set by default to enable
+    compatibility with software that does not comprehend Role-Based
+    error reporting.
+-   The Advanced Error Capabilities and Control Register (AERCap)
+    enables various capabilities (the above indicates the device is
+    capable of generating and checking ECRC, but neither is enabled):
+    -   First Error Pointer identifies the bit position of the first
+        error reported in the Uncorrectable Error Status register
+    -   ECRC Generation Capable (GenCap) indicates if set that the
+        function is capable of generating ECRC
+    -   ECRC Generation Enable (GenEn) indicates if ECRC generation is
+        enabled (set)
+    -   ECRC Check Capable (ChkCap) indicates if set that the function
+        is capable of checking ECRC
+    -   ECRC Check Enable (ChkEn) indicates if ECRC checking is enabled
diff --git a/content/notes/working-with-go.md b/content/notes/working-with-go.md
new file mode 100644
index 0000000..b5e690e
--- /dev/null
+++ b/content/notes/working-with-go.md
@@ -0,0 +1,286 @@
+---
+title: Working with Go
+date: 2021-08-05
+tags:
+  - emacs
+  - go
+---
+
+*This document assumes go version \>= 1.16*.
+
+# Go Modules
+
+[Go modules](https://blog.golang.org/using-go-modules) were added
+in 2018 with Go 1.11. A number of changes were introduced with [Go
+1.16](https://blog.golang.org/go116-module-changes). This document is a
+reference for me so that I can find answers to things I keep forgetting.
+
+## Creating a new module
+
+To create a new module, run `go mod init golang.fcuny.net/m`. This will
+create a `go.mod` file. The `go.sum` file is created later, once the
+module has dependencies.
+
+In the `go.mod` file you'll find:
+
+-   the module import path (prefixed with `module`)
+-   the list of dependencies (within `require`)
+-   the version of go to use for the module
+
+## Versioning
+
+To bump the version of a module:
+
+``` bash
+$ git tag v1.2.3
+$ git push --tags
+```
+
+Then as a user:
+
+``` bash
+$ go get -d golang.fcuny.net/m@v1.2.3
+```
+
+## Updating dependencies
+
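+To sync `go.mod` with the code (add missing dependencies, remove unused
+ones), run `go mod tidy`. To upgrade dependencies to newer versions,
+run `go get -u ./...`.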
+
+## Editing a module
+
+If you need to modify a module, you can check out the module in your
+workspace (`git clone <module URL>`).
+
+Edit the `go.mod` file to add
+
+``` go
+replace <module URL> => <path of the local checkout>
+```
+
+Then modify the code of the module and the next time you compile the
+project, the cloned module will be used.
+
+This is particularly useful when trying to debug an issue with an
+external module.
+
+## Vendor-ing modules
+
+It's still possible to vendor modules by running `go mod vendor`. This
+can be useful in the case of a CI setup that does not have access to
+the internet.
+
+## Proxy
+
+As of version 1.13, the variable `GOPROXY` defaults to
+`https://proxy.golang.org,direct` (see
+[here](https://github.com/golang/go/blob/c95464f0ea3f87232b1f3937d1b37da6f335f336/src/cmd/go/internal/cfg/cfg.go#L269)).
+As a result, when running something like
+`go get golang.org/x/tools/gopls@latest`, the request goes through the
+proxy.
+
+There are a number of ways to control this behavior; they are
+documented [here](https://golang.org/ref/mod#private-modules).
+
+There are a few interesting things that can be done when using the
+proxy. There are a few special URLs (better documentation
+[here](https://golang.org/ref/mod#goproxy-protocol)):
+
+| path                  | description                                                                              |
+|-----------------------|------------------------------------------------------------------------------------------|
+| $mod/@v/list          | Returns the list of known versions - there's one version per line and it's in plain text |
+| $mod/@v/$version.info | Returns metadata about a version in JSON format                                          |
+| $mod/@v/$version.mod  | Returns the `go.mod` file for that version                                               |
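+
+Building these URLs is simple string formatting. The helpers below
+are a sketch with made-up names (`listURL`, `infoURL`, `modURL`); they
+assume the default proxy and a module path without uppercase letters,
+since the proxy protocol requires uppercase letters to be
+case-encoded, which this example does not do.
+
+``` go
+package main
+
+import "fmt"
+
+// proxy is the default proxy mentioned above.
+const proxy = "https://proxy.golang.org"
+
+// listURL returns the URL listing the known versions of a module.
+func listURL(mod string) string {
+    return fmt.Sprintf("%s/%s/@v/list", proxy, mod)
+}
+
+// infoURL returns the URL of the JSON metadata for one version.
+func infoURL(mod, version string) string {
+    return fmt.Sprintf("%s/%s/@v/%s.info", proxy, mod, version)
+}
+
+// modURL returns the URL of the go.mod file for one version.
+func modURL(mod, version string) string {
+    return fmt.Sprintf("%s/%s/@v/%s.mod", proxy, mod, version)
+}
+
+func main() {
+    mod := "golang.org/x/tools/gopls"
+    fmt.Println(listURL(mod))
+    fmt.Println(infoURL(mod, "v0.7.1"))
+    fmt.Println(modURL(mod, "v0.7.1"))
+}
+```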
+
+For example, looking at the most recent versions for `gopls`:
+
+``` bash
+; curl -s -L https://proxy.golang.org/golang.org/x/tools/gopls/@v/list|sort -r|head
+v0.7.1-pre.2
+v0.7.1-pre.1
+v0.7.1
+v0.7.0-pre.3
+v0.7.0-pre.2
+v0.7.0-pre.1
+v0.7.0
+v0.6.9-pre.1
+v0.6.9
+v0.6.8-pre.1
+```
+
+Let's check the details for the most recent version (the response is a
+small JSON document with the version and its timestamp):
+
+``` bash
+; curl -s -L https://proxy.golang.org/golang.org/x/tools/gopls/@v/v0.7.1-pre.2.info
+```
+
+And let's look at the content of the `go.mod` for that version too:
+
+``` bash
+; curl -s -L https://proxy.golang.org/golang.org/x/tools/gopls/@v/v0.7.1-pre.2.mod
+module golang.org/x/tools/gopls
+
+go 1.17
+
+require (
+        github.com/BurntSushi/toml v0.3.1 // indirect
+        github.com/google/go-cmp v0.5.5
+        github.com/google/safehtml v0.0.2 // indirect
+        github.com/jba/templatecheck v0.6.0
+        github.com/sanity-io/litter v1.5.0
+        github.com/sergi/go-diff v1.1.0
+        golang.org/x/mod v0.4.2
+        golang.org/x/sync v0.0.0-20210220032951-036812b2e83c // indirect
+        golang.org/x/sys v0.0.0-20210510120138-977fb7262007
+        golang.org/x/text v0.3.6 // indirect
+        golang.org/x/tools v0.1.6-0.20210802203754-9b21a8868e16
+        golang.org/x/xerrors v0.0.0-20200804184101-5ec99f83aff1 // indirect
+        honnef.co/go/tools v0.2.0
+        mvdan.cc/gofumpt v0.1.1
+        mvdan.cc/xurls/v2 v2.2.0
+)
+```
+
+# Tooling
+
+## LSP
+
+`gopls` is the official language server for Go, maintained by the Go
+team. To install the latest version, run
+`go install golang.org/x/tools/gopls@latest`.
+
+## `staticcheck`
+
+[`staticcheck`](https://staticcheck.io/) is a great tool to run against
+your code to find issues. To install the latest version, run
+`go install honnef.co/go/tools/cmd/staticcheck@latest`.
+
+# Emacs integration
+
+## `go-mode`
+
+[This is the mode](https://github.com/dominikh/go-mode.el) to install to
+get syntax highlighting (mostly).
+
+## Integration with LSP
+
+Emacs has a pretty good integration with LSP, and ["Eglot for better
+programming experience in
+Emacs"](https://whatacold.io/blog/2022-01-22-emacs-eglot-lsp/) is a good
+starting point.
+
+### eglot
+
+[This is the main mode to install](https://github.com/joaotavora/eglot).
+
+The configuration is straightforward; this is what I use:
+
+``` elisp
+;; for go's LSP I want to use staticcheck and placeholders for completion
+(customize-set-variable 'eglot-workspace-configuration
+                        '((:gopls .
+                                  ((staticcheck     . t)
+                                   (matcher         . "CaseSensitive")
+                                   (usePlaceholders . t)))))
+
+;; ensure we load eglot for some specific modes
+(dolist (hook '(go-mode-hook nix-mode-hook))
+  (add-hook hook 'eglot-ensure))
+```
+
+`eglot` integrates well with existing modes for Emacs, mainly xref,
+flymake, eldoc.
+
+# Profiling
+
+## pprof
+
+[pprof](https://github.com/google/pprof) is a tool to visualize
+performance data. Let's start with the following benchmark:
+
+``` go
+package main
+
+import (
+    "strings"
+    "testing"
+)
+
+func BenchmarkStringJoin(b *testing.B) {
+    b.ReportAllocs()
+    input := []string{"a", "b"}
+    for i := 0; i < b.N; i++ {
+        r := strings.Join(input, " ")
+        if r != "a b" {
+            b.Errorf("want a b got %s", r)
+        }
+    }
+}
+```
+
+Let's run a benchmark with
+`go test . -bench=. -cpuprofile cpu_profile.out`:
+
+``` go
+goos: linux
+goarch: amd64
+pkg: golang.fcuny.net/m
+cpu: Intel(R) Core(TM) i3-1005G1 CPU @ 1.20GHz
+BenchmarkStringJoin-4           41833486                26.85 ns/op            3 B/op          1 allocs/op
+PASS
+ok      golang.fcuny.net/m      1.327s
+```
+
+And let's take a look at the profile with
+`go tool pprof cpu_profile.out`:
+
+``` bash
+File: m.test
+Type: cpu
+Time: Aug 15, 2021 at 3:01pm (PDT)
+Duration: 1.31s, Total samples = 1.17s (89.61%)
+Entering interactive mode (type "help" for commands, "o" for options)
+(pprof) top
+Showing nodes accounting for 1100ms, 94.02% of 1170ms total
+Showing top 10 nodes out of 41
+      flat  flat%   sum%        cum   cum%
+     240ms 20.51% 20.51%      240ms 20.51%  runtime.memmove
+     220ms 18.80% 39.32%      320ms 27.35%  runtime.mallocgc
+     130ms 11.11% 50.43%      450ms 38.46%  runtime.makeslice
+     110ms  9.40% 59.83%     1150ms 98.29%  golang.fcuny.net/m.BenchmarkStringJoin
+     110ms  9.40% 69.23%      580ms 49.57%  strings.(*Builder).grow (inline)
+     110ms  9.40% 78.63%     1040ms 88.89%  strings.Join
+      70ms  5.98% 84.62%      300ms 25.64%  strings.(*Builder).WriteString
+      50ms  4.27% 88.89%      630ms 53.85%  strings.(*Builder).Grow (inline)
+      40ms  3.42% 92.31%       40ms  3.42%  runtime.nextFreeFast (inline)
+      20ms  1.71% 94.02%       20ms  1.71%  runtime.getMCache (inline)
+```
+
+The `list` command gives a line-by-line breakdown for the functions in our module:
+
+``` bash
+(pprof) list golang.fcuny.net
+Total: 1.17s
+ROUTINE ======================== golang.fcuny.net/m.BenchmarkStringJoin in /home/fcuny/workspace/gobench/app_test.go
+     110ms      1.15s (flat, cum) 98.29% of Total
+         .          .      5:   "testing"
+         .          .      6:)
+         .          .      7:
+         .          .      8:func BenchmarkStringJoin(b *testing.B) {
+         .          .      9:   b.ReportAllocs()
+      10ms       10ms     10:   input := []string{"a", "b"}
+         .          .     11:   for i := 0; i <= b.N; i++ {
+      20ms      1.06s     12:           r := strings.Join(input, " ")
+      80ms       80ms     13:           if r != "a b" {
+         .          .     14:                   b.Errorf("want a b got %s", r)
+         .          .     15:           }
+         .          .     16:   }
+         .          .     17:}
+```
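+
+The `-cpuprofile` flag only exists for `go test`. For a regular
+program, a CPU profile can be collected with the standard
+`runtime/pprof` package. The sketch below assumes we want to profile
+the same `strings.Join` call; `profileJoin` and `sink` are hypothetical
+names, and the profile is kept in memory only to keep the example
+short.
+
+``` go
+package main
+
+import (
+	"bytes"
+	"fmt"
+	"runtime/pprof"
+	"strings"
+)
+
+// sink keeps the result alive so the compiler cannot drop the call.
+var sink string
+
+// profileJoin runs strings.Join n times under the CPU profiler and
+// returns the raw profile data.
+func profileJoin(n int) ([]byte, error) {
+	var buf bytes.Buffer
+	if err := pprof.StartCPUProfile(&buf); err != nil {
+		return nil, err
+	}
+	input := []string{"a", "b"}
+	for i := 0; i < n; i++ {
+		sink = strings.Join(input, " ")
+	}
+	// StopCPUProfile flushes the profile to buf before returning.
+	pprof.StopCPUProfile()
+	return buf.Bytes(), nil
+}
+
+func main() {
+	data, err := profileJoin(10_000_000)
+	if err != nil {
+		fmt.Println("could not start the CPU profile:", err)
+		return
+	}
+	fmt.Printf("collected %d bytes of profile data\n", len(data))
+}
+```
+
+Written to a file instead of a buffer, the data can be opened with
+`go tool pprof` in the same way as `cpu_profile.out`.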
diff --git a/content/notes/working-with-nix.md b/content/notes/working-with-nix.md
new file mode 100644
index 0000000..9e697d5
--- /dev/null
+++ b/content/notes/working-with-nix.md
@@ -0,0 +1,46 @@
+---
+title: working with nix
+date: 2022-05-10
+tags:
+  - linux
+  - nix
+---
+
+# the `nix develop` command
+
+The `nix develop` command is for working on a repository. If our
+repository contains a `Makefile`, it will be used by the various
+sub-commands.
+
+`nix develop` supports multiple
+[phases](https://nixos.org/manual/nixpkgs/stable/#sec-stdenv-phases) and
+they map as follows:
+
+| phase          | defaults to    | command                   |
+|----------------|----------------|---------------------------|
+| configurePhase | `./configure`  | `nix develop --configure` |
+| buildPhase     | `make`         | `nix develop --build`     |
+| checkPhase     | `make check`   | `nix develop --check`     |
+| installPhase   | `make install` | `nix develop --install`   |
+
+In the repository, running `nix develop --build` will build the binary
+**using the Makefile**. This is different from running `nix build`.
+
+# the `nix build` and `nix run` commands
+
+## for Go
+
+For Go, there's `buildGoModule`. Looking at the
+[source](https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/go-modules/generic/default.nix)
+we can see that it defines what is done for each phase. As
+a result, we don't have to define them ourselves.
+
+If we run `nix build` in the repository, it will run the default [build
+phase](https://github.com/NixOS/nixpkgs/blob/fb7287e6d2d2684520f756639846ee07f6287caa/pkgs/development/go-modules/generic/default.nix#L171).
+
+# `buildInputs` or `nativeBuildInputs`
+
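+-   `nativeBuildInputs` is intended for dependencies that run on the
+    build platform at build time (compilers, code generators,
+    `pkg-config`)
+-   `buildInputs` is intended for dependencies built for the host
+    platform, such as libraries the output links against
+
+The distinction only matters when cross-compiling, where the build and
+host platforms differ.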