Diffstat (limited to 'content/notes')
-rw-r--r-- | content/notes/containerd-to-firecracker.md | 55
-rw-r--r-- | content/notes/cpu-power-management.md | 27
-rw-r--r-- | content/notes/making-sense-intel-amd-cpus.md | 88
-rw-r--r-- | content/notes/stuff-about-pcie.md | 125
-rw-r--r-- | content/notes/working-with-go.md | 35
-rw-r--r-- | content/notes/working-with-nix.md | 13
6 files changed, 166 insertions, 177 deletions
diff --git a/content/notes/containerd-to-firecracker.md b/content/notes/containerd-to-firecracker.md index 52ab201..9716735 100644 --- a/content/notes/containerd-to-firecracker.md +++ b/content/notes/containerd-to-firecracker.md @@ -1,11 +1,6 @@ --- title: containerd to firecracker date: 2021-05-15 -tags: - - linux - - firecracker - - containerd - - go --- fly.io had an [interesting @@ -34,7 +29,7 @@ code is available [here](https://git.fcuny.net/containerd-to-vm/). documentation](https://pkg.go.dev/github.com/containerd/containerd). From the main page we can see the following example to create a client. -``` go +```go import ( "github.com/containerd/containerd" "github.com/containerd/containerd/cio" @@ -49,7 +44,7 @@ func main() { And pulling an image is also pretty straightforward: -``` go +```go image, err := client.Pull(context, "docker.io/library/redis:latest") ``` @@ -60,7 +55,7 @@ and there's a few methods associated with it. As `containerd` has namespaces, it's possible to specify the namespace we want to use when working with the API: -``` go +```go ctx := namespaces.WithNamespace(context.Background(), "c2vm") image, err := client.Pull(ctx, "docker.io/library/redis:latest") ``` @@ -68,7 +63,7 @@ image, err := client.Pull(ctx, "docker.io/library/redis:latest") The image will now be stored in the `c2vm` namespace. We can verify this with: -``` bash +```bash ; sudo ctr -n c2vm images ls -q docker.io/library/redis:latest ``` @@ -89,7 +84,7 @@ There's two commons ways to pre-allocate space to a file: `dd` and First, to be safe, we create a temporary file, and use `renameio` to handle the renaming (I recommend reading the doc of the module). 
-``` go +```go f, err := renameio.TempFile("", rawFile) if err != nil { return err @@ -101,7 +96,7 @@ Now to do the pre-allocation (we're making an assumption here that 2GB is enough, we can likely check what's the size of the container before doing this): -``` go +```go command := exec.Command("fallocate", "-l", "2G", f.Name()) if err := command.Run(); err != nil { return fmt.Errorf("fallocate error: %s", err) @@ -110,7 +105,7 @@ if err := command.Run(); err != nil { We can now convert that file to ext4: -``` go +```go command = exec.Command("mkfs.ext4", "-F", f.Name()) if err := command.Run(); err != nil { return fmt.Errorf("mkfs.ext4 error: %s", err) @@ -119,13 +114,13 @@ if err := command.Run(); err != nil { Now we can rename safely the temporary file to the proper file we want: -``` go +```go f.CloseAtomicallyReplace() ``` And to mount that file -``` go +```go command = exec.Command("mount", "-o", "loop", rawFile, mntDir) if err := command.Run(); err != nil { return fmt.Errorf("mount error: %s", err) @@ -137,7 +132,7 @@ if err := command.Run(); err != nil { Extracting the container using `containerd` is pretty simple. Here's the function that I use: -``` go +```go func extract(ctx context.Context, client *containerd.Client, image containerd.Image, mntDir string) error { manifest, err := images.Manifest(ctx, client.ContentStore(), image.Target(), platform) if err != nil { @@ -185,10 +180,10 @@ Let's refer to the [specification for the config](https://github.com/opencontainers/image-spec/blob/master/config.md). The elements that are of interest to me are: -- `Env`, which is array of strings. They contain the environment - variables that likely we need to run the program -- `Cmd`, which is also an array of strings. If there's no entry point - provided, this is what is used. +- `Env`, which is array of strings. They contain the environment + variables that likely we need to run the program +- `Cmd`, which is also an array of strings. 
If there's no entry point + provided, this is what is used. At this point, for this experiment, I'm going to ignore exposed ports, working directory, and the user. @@ -196,7 +191,7 @@ working directory, and the user. First we need to read the config from the container. This is easily done: -``` go +```go config, err := images.Config(ctx, client.ContentStore(), image.Target(), platform) if err != nil { return err @@ -205,7 +200,7 @@ if err != nil { This needs to be read and decoded: -``` go +```go configBlob, err := content.ReadBlob(ctx, client.ContentStore(), config) var imageSpec ocispec.Image json.Unmarshal(configBlob, &imageSpec) @@ -221,7 +216,7 @@ for now) with the environment variables and the command. Naively, this can be done like this: -``` go +```go initPath := filepath.Join(mntDir, "init.sh") f, err := renameio.TempFile("", initPath) if err != nil { @@ -262,7 +257,7 @@ we're done manipulating the image. Within a function, we can do the following: -``` go +```go command := exec.Command("/usr/bin/e2fsck", "-p", "-f", rawFile) if err := command.Run(); err != nil { return fmt.Errorf("e2fsck error: %s", err) @@ -277,7 +272,7 @@ if err := command.Run(); err != nil { I'm using `docker.io/library/redis:latest` for my test, and I end up with the following size for the image: -``` bash +```bash -rw------- 1 root root 216M Apr 22 14:50 /tmp/fcuny.img ``` @@ -289,7 +284,7 @@ with the process, the firecracker team has [documented how to do this](https://github.com/firecracker-microvm/firecracker/blob/main/docs/rootfs-and-kernel-setup.md#creating-a-kernel-image). 
In my case all I had to do was: -``` bash +```bash git clone https://github.com/torvalds/linux.git linux.git cd linux.git git checkout v5.8 @@ -322,7 +317,7 @@ this to work we need to install the `tc-redirect-tap` CNI plugin Based on that documentation, I'll start with the following configuration in `etc/cni/conf.d/50-c2vm.conflist`: -``` json +```json { "name": "c2vm", "cniVersion": "0.4.0", @@ -365,7 +360,7 @@ The first thing is to configure the list of devices. In our case we will have a single device, the boot drive that we've created in the previous step. -``` go +```go devices := make([]models.Drive, 1) devices[0] = models.Drive{ DriveID: firecracker.String("1"), @@ -377,7 +372,7 @@ devices[0] = models.Drive{ The next step is to configure the VM: -``` go +```go fcCfg := firecracker.Config{ LogLevel: "debug", SocketPath: firecrackerSock, @@ -403,7 +398,7 @@ fcCfg := firecracker.Config{ Finally we can create the command to start and run the VM: -``` go +```go command := firecracker.VMCommandBuilder{}. WithBin(firecrackerBinary). WithSocketPath(fcCfg.SocketPath). @@ -670,7 +665,7 @@ The end result: We can do a quick test with the following: -``` bash +```bash ; sudo docker run -it --rm redis redis-cli -h 192.168.128.9 192.168.128.9:6379> get foo (nil) diff --git a/content/notes/cpu-power-management.md b/content/notes/cpu-power-management.md index bcb14b7..bbbd2e6 100644 --- a/content/notes/cpu-power-management.md +++ b/content/notes/cpu-power-management.md @@ -1,11 +1,6 @@ --- title: CPU power management date: 2023-01-22 -tags: - - harwdare - - amd - - intel - - cpu --- ## Maximum power consumption of a processor @@ -17,30 +12,34 @@ The Intel CPU has 80 cores while the AMD one has 128 cores. For Intel, this give The TDP is the average value the processor can sustain forever, and this is the power the cooling solution needs to be designed at for reliability. The TDP is measured under worst case load, with all cores running at 1.8Ghz (the base frequency). 
## C-State vs. P-State + We have two ways to control the power consumption: + - disabling a subsystem - decrease the voltage This is done by using -- *C-State* is for optimization of power consumption -- *P-State* is for optimization of the voltage and CPU frequency -*C-State* means that one or more subsystem are executing nothing, one or more subsystem of the CPU is at idle, powered down. +- _C-State_ is for optimization of power consumption +- _P-State_ is for optimization of the voltage and CPU frequency -*P-State* the subsystem is actually running, but it does not require full performance, so the voltage and/or frequency it operates is decreased. +_C-State_ means that one or more subsystem are executing nothing, one or more subsystem of the CPU is at idle, powered down. + +_P-State_ the subsystem is actually running, but it does not require full performance, so the voltage and/or frequency it operates is decreased. The states are numbered starting from 0. The higher the number, the more power is saved. `C0` means no power saving. `P0` means maximum performance (thus maximum frequency, voltage and power used). ### C-state A timeline of power saving using C states is as follow: + 1. normal operation is at c0 2. the clock of idle core is stopped (C1) 3. the local caches (L1/L2) of the core are flushed and the core is powered down (C3) 4. 
when all the cores are powered down, the shared cache of the package (L3/LLC) are flushed and the whole package/CPU can be powered down | state | description | -|-------|-----------------------------------------------------------------------------------------------------------------------------| +| ----- | --------------------------------------------------------------------------------------------------------------------------- | | C0 | operating state | | C1 | a state where the processor is not executing instructions, but can return to an executing state essentially instantaneously | | C2 | a state where the processor maintains all software-visible state, but may take longer to wake up | @@ -65,6 +64,7 @@ Running `cpuid` we can find all the supported C-states for a processor (Intel(R) ``` If I interpret this correctly: + - there's one `C0` - there's two sub C-states for `C1` - there's two sub C-states for `C3` @@ -78,7 +78,7 @@ P-states allow to change the voltage and frequency of the CPU core to decrease t A P-state refers to different frequency-voltage pairs. The highest operating point is the maximum state which is `P0`. | state | description | -|-------|--------------------------------------------| +| ----- | ------------------------------------------ | | P0 | maximum power and frequency | | P1 | less than P0, voltage and frequency scaled | | P2 | less than P1, voltage and frequency scaled | @@ -88,7 +88,7 @@ A P-state refers to different frequency-voltage pairs. 
The highest operating poi The ACPI Specification defines the following four global "Gx" states and six sleep "Sx" states | GX | name | Sx | description | -|------|----------------|------|-----------------------------------------------------------------------------------| +| ---- | -------------- | ---- | --------------------------------------------------------------------------------- | | `G0` | working | `S0` | The computer is running and executing instructions | | `G1` | sleeping | `S1` | Processor caches are flushed and the CPU stop executing instructions | | `G1` | sleeping | `S2` | CPU powered off, dirty caches flushed to RAM | @@ -102,10 +102,12 @@ When we are in any C-states, we are in `G0`. ## Speed Select Technology [Speed Select Technology](https://en.wikichip.org/wiki/intel/speed_select_technology) is a set of power management controls that allows a system administrator to customize per-core performance. By configuring the performance of specific cores and affinitizing workloads to those cores, higher software performance can be achieved. SST supports multiple types of customization: + - Frequency Prioritization (SST-CP) - allows specific cores to clock higher by reducing the frequency of cores running lower-priority software. - Speed Select Base Freq (SST-BF) - allows specific cores to run higher base frequency (P1) by reducing the base frequencies (P1) of other cores. ## Turbo Boost + TDP is the maximum power consumption the CPU can sustain. When the power consumption is low (e.g. many cores are in P1+ states), the CPU frequency can be increased beyond base frequency to take advantage of the headroom, since this condition does not increase the power consumption beyond TDP. Modern CPUs are heavily reliant on "Turbo(Intel)" or "boost (AMD)" ([TBT](https://en.wikichip.org/wiki/intel/turbo_boost_technology) and [TBTM](https://en.wikichip.org/wiki/intel/turbo_boost_max_technology)). 
@@ -113,4 +115,5 @@ Modern CPUs are heavily reliant on "Turbo(Intel)" or "boost (AMD)" ([TBT](https: In our case, the Intel 6122 is rated at 1.8GHz, A.K.A "stamp speed". If we want to run the CPU at a consistent frequency, we'd have to choose 1.8GHz or below, and we'd lose significant performance if we were to disable turbo/boost. ### Turbo boost max + During the manufacturing process, Intel is able to test each die and determine which cores possess the best overclocking capabilities. That information is then stored in the CPU in order from best to worst. diff --git a/content/notes/making-sense-intel-amd-cpus.md b/content/notes/making-sense-intel-amd-cpus.md index 2d7bb8a..75392c6 100644 --- a/content/notes/making-sense-intel-amd-cpus.md +++ b/content/notes/making-sense-intel-amd-cpus.md @@ -1,10 +1,6 @@ --- title: Making sense of Intel and AMD CPUs naming date: 2021-12-29 -tags: - - amd - - intel - - cpu --- ## Intel @@ -14,22 +10,24 @@ tags: The line up for the core family is i3, i5, i7 and i9. As of January 2023, the current generation is [Raptor Lake](https://en.wikipedia.org/wiki/Raptor_Lake) (13th generation). The brand modifiers are: -- **i3**: laptops/low-end desktop -- **i5**: mainstream users -- **i7**: high-end users -- **i9**: enthusiast users + +- **i3**: laptops/low-end desktop +- **i5**: mainstream users +- **i7**: high-end users +- **i9**: enthusiast users How to read a SKU ? 
Let's use the [i7-12700K](https://ark.intel.com/content/www/us/en/ark/products/134594/intel-core-i712700k-processor-25m-cache-up-to-5-00-ghz.html) processor: -- **i7**: high end users -- **12**: 12th generation -- **700**: SKU digits, usually assigned in the order the processors - are developed -- **K**: unlocked + +- **i7**: high end users +- **12**: 12th generation +- **700**: SKU digits, usually assigned in the order the processors + are developed +- **K**: unlocked List of suffixes: | suffix | meaning | -|--------|----------------------------------------| +| ------ | -------------------------------------- | | G.. | integrated graphics | | E | embedded | | F | require discrete graphic card | @@ -48,12 +46,12 @@ List of suffixes: #### Raptor Lake (13th generation) -Raptor lake is an hybrid architecture, featuring both P-cores (performance cores) and E-cores (efficient cores), similar to Alder lake. P-cores are based on the [Raptor cove](https://en.wikipedia.org/wiki/Golden_Cove#Raptor_Cove) architecture, while the E-cores are based on the [Gracemont](https://en.wikipedia.org/wiki/Gracemont_(microarchitecture)) architecture (same as for Alder lake). +Raptor lake is an hybrid architecture, featuring both P-cores (performance cores) and E-cores (efficient cores), similar to Alder lake. P-cores are based on the [Raptor cove](https://en.wikipedia.org/wiki/Golden_Cove#Raptor_Cove) architecture, while the E-cores are based on the [Gracemont](<https://en.wikipedia.org/wiki/Gracemont_(microarchitecture)>) architecture (same as for Alder lake). 
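The SKU breakdown above (brand modifier, generation, SKU digits, suffix) can be sketched in Go. This is only an illustration and not part of the notes: `CoreSKU` and `ParseCoreSKU` are hypothetical names, and the pattern assumes that the last three digits are the SKU digits, that the one or two digits before them are the generation, and that the suffix is made of uppercase letters only (suffixes containing digits are not handled).

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// CoreSKU holds the parts of an Intel Core model name such as "i7-12700K".
// The type is hypothetical and exists only for this example.
type CoreSKU struct {
	Brand      string // brand modifier: i3, i5, i7 or i9
	Generation int    // e.g. 12 for the 12th generation
	Digits     string // SKU digits, e.g. "700"
	Suffix     string // e.g. "K" (unlocked); may be empty
}

// skuPattern assumes the last three digits are the SKU digits and the one or
// two digits before them are the generation. This is a simplification.
var skuPattern = regexp.MustCompile(`^(i[3579])-(\d{1,2})(\d{3})([A-Z]*)$`)

// ParseCoreSKU splits a model name into its parts.
func ParseCoreSKU(name string) (CoreSKU, error) {
	m := skuPattern.FindStringSubmatch(name)
	if m == nil {
		return CoreSKU{}, fmt.Errorf("unrecognized SKU: %q", name)
	}
	gen, err := strconv.Atoi(m[2])
	if err != nil {
		return CoreSKU{}, err
	}
	return CoreSKU{Brand: m[1], Generation: gen, Digits: m[3], Suffix: m[4]}, nil
}

func main() {
	sku, err := ParseCoreSKU("i7-12700K")
	if err != nil {
		fmt.Println(err)
		return
	}
	// Prints the brand, generation, SKU digits and suffix fields.
	fmt.Printf("%+v\n", sku)
}
```

Under the same assumptions, `ParseCoreSKU("i9-13900KS")` yields generation 13, SKU digits `900` and suffix `KS`.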
Available processors: | model | p-cores | e-cores | GHz (base) | GHz (boosted) | TDP | -|------------|---------|---------|------------|---------------|----------| +| ---------- | ------- | ------- | ---------- | ------------- | -------- | | i9-13900KS | 8 (16) | 16 | 3.2/2.4 | 6/4.3 | 150/253W | | i9-13900K | 8 (16) | 16 | 3.0/2.0 | 5.8/4.3 | 125/253W | | i9-13900KF | 8 (16) | 16 | 3.0/2.0 | 5.8/4.3 | 125/253W | @@ -71,19 +69,19 @@ Available processors: For the Raptor Lake generation, as for the Alder lake generation, the supported socket is the [LGA<sub>1700</sub>](https://en.wikipedia.org/wiki/LGA_1700). List of Raptor lake chipsets: -| feature | b760[^7] | h770[^8] | z790[^9] | +| feature | b760[^7] | h770[^8] | z790[^9] | |-----------------------------|----------|----------|----------| -| P and E cores over clocking | no | no | yes | -| memory over clocking | yes | yes | yes | -| DMI 4 lanes | 4 | 8 | 8 | -| chipset PCIe 5.0 lanes | | | | -| chipset PCIe 4.0 lanes | | | | -| chipset PCIe 3.0 lanes | | | | -| SATA 3.0 ports | up to 4 | up to 8 | up to 8 | +| P and E cores over clocking | no | no | yes | +| memory over clocking | yes | yes | yes | +| DMI 4 lanes | 4 | 8 | 8 | +| chipset PCIe 5.0 lanes | | | | +| chipset PCIe 4.0 lanes | | | | +| chipset PCIe 3.0 lanes | | | | +| SATA 3.0 ports | up to 4 | up to 8 | up to 8 | #### Alder Lake (12th generation) -Alder lake is an hybrid architecture, featuring both P-cores (performance cores) and E-cores (efficient cores). P-cores are based on the [Golden Cove](https://en.wikipedia.org/wiki/Golden_Cove) architecture, while the E-cores are based on the [Gracemont](https://en.wikipedia.org/wiki/Gracemont_(microarchitecture)) architecture. +Alder lake is an hybrid architecture, featuring both P-cores (performance cores) and E-cores (efficient cores). 
P-cores are based on the [Golden Cove](https://en.wikipedia.org/wiki/Golden_Cove) architecture, while the E-cores are based on the [Gracemont](<https://en.wikipedia.org/wiki/Gracemont_(microarchitecture)>) architecture. This is a [good article](https://www.anandtech.com/show/16881/a-deep-dive-into-intels-alder-lake-microarchitectures/2) to read about this model. Inside the processor there's a microcontroller that monitors what each thread is doing. This can be used by the OS scheduler to hint on which core a thread should be scheduled on (between performance or efficiency). @@ -92,7 +90,7 @@ As of December 2021 this is not yet properly supported by the Linux kernel. Available processors: | model | p-cores | e-cores | GHz (base) | GHz (boosted) | TDP | -|------------|---------|---------|------------|---------------|------| +| ---------- | ------- | ------- | ---------- | ------------- | ---- | | i9-12900K | 8 (16) | 8 | 3.2/2.4 | 5.1/3.9 | 241W | | i9-12900KF | 8 (16) | 8 | 3.2/2.4 | 5.1/3.9 | 241W | | i7-12700K | 8 (16) | 4 | 3.6/2.7 | 4.9/3.8 | 190W | @@ -100,15 +98,15 @@ Available processors: | i5-12600K | 6 (12) | 4 | 3.7/2.8 | 4.9/3.6 | 150W | | i5-12600KF | 6 (12) | 4 | 3.7/2.8 | 4.9/3.6 | 150W | -- support DDR4 and DDR5 (up to DDR5-4800) -- support PCIe 4.0 and 5.0 (16 PCIe 5.0 and 4 PCIe 4.0) +- support DDR4 and DDR5 (up to DDR5-4800) +- support PCIe 4.0 and 5.0 (16 PCIe 5.0 and 4 PCIe 4.0) For the Alder Lake generation, the supported socket is the [LGA<sub>1700</sub>](https://en.wikipedia.org/wiki/LGA_1700). 
For now only supported chipset for Alder Lake are: | feature | z690[^1] | h670[^2] | b660[^3] | h610[^4] | q670[^6] | w680[^5] | -|-----------------------------|----------|----------|----------|----------|----------|----------| +| --------------------------- | -------- | -------- | -------- | -------- | -------- | -------- | | P and E cores over clocking | yes | no | no | no | no | yes | | memory over clocking | yes | yes | yes | no | - | yes | | DMI 4 lanes | 8 | 8 | 4 | 4 | 8 | 8 | @@ -121,37 +119,38 @@ For now only supported chipset for Alder Lake are: Xeon is the brand of Intel processor designed for non-consumer servers and workstations. The most recent generations are: | name | availability | -|-----------------|--------------| +| --------------- | ------------ | | Skylake | 2015 | | Cascade lake | 2019 | | Cooper lake | 2022 | | Sapphire rapids | 2023 | The following brand identifiers are used: -- platinium -- gold -- silver -- bronze + +- platinium +- gold +- silver +- bronze ## AMD ### Ryzen -There are multiple generation for this brand of processors. They are based on the [zen micro architecture](https://en.wikipedia.org/wiki/Zen_(microarchitecture)). +There are multiple generation for this brand of processors. They are based on the [zen micro architecture](<https://en.wikipedia.org/wiki/Zen_(microarchitecture)>). The current (as of January 2023) generation is Ryzen 7000. 
The brand modifiers are: -- ryzen 3: entry level -- ryzen 5: mainstream -- ryzen 9: high end performance -- ryzen 9: enthusiast +- ryzen 3: entry level +- ryzen 5: mainstream +- ryzen 9: high end performance +- ryzen 9: enthusiast List of suffixes: | suffix | meaning | -|--------|---------------------------------------------------------------------------------| +| ------ | ------------------------------------------------------------------------------- | | X | high performance | | G | integrated graphics | | T | power optimized lifecycle | @@ -184,7 +183,7 @@ The threadripper processors use the TR4, sTRX4 and sWRX8 sockets. Zen 3 was released in November 2020. | model | cores | GHz (base) | GHz (boosted) | PCIe lanes | TDP | -|---------------|---------|------------|---------------|------------|------| +| ------------- | ------- | ---------- | ------------- | ---------- | ---- | | ryzen 5 5600x | 6 (12) | 3.7 | 4.6 | 24 | 65W | | ryzen 7 5800 | 8 (16) | 3.4 | 4.6 | 24 | 65W | | ryzen 7 5800x | 8 (16) | 3.8 | 4.7 | 24 | 105W | @@ -192,8 +191,8 @@ Zen 3 was released in November 2020. | ryzen 9 5900x | 12 (24) | 3.7 | 4.8 | 24 | 105W | | ryzen 9 5950x | 16 (32) | 3.4 | 4.9 | 24 | 105W | -- support PCIe 3.0 and PCIe 4.0 (except for the G series) -- only support DDR4 (up to DDR4-3200) +- support PCIe 3.0 and PCIe 4.0 (except for the G series) +- only support DDR4 (up to DDR4-3200) ### Zen 4 @@ -204,7 +203,7 @@ Zen 4 was released in September 2022. - all desktop processors feature 2 x 4 lane PCIe interfaces (mostly for M.2 storage devices) | model | cores | GHz (base) | GHz (boosted) | TDP | -|-----------------|---------|------------|---------------|------| +| --------------- | ------- | ---------- | ------------- | ---- | | ryzen 5 7600x | 6 (12) | 4.7 | 5.3 | 105W | | ryzen 5 7600 | 6 (12) | 3.8 | 5.1 | 65W | | ryzen 7 7800X3D | 8 (16) | | 5.0 | 120W | @@ -216,7 +215,6 @@ Zen 4 was released in September 2022. 
| ryzen 9 7950X | 16 (32) | 4.5 | 5.7 | 170W | | ryzen 9 7950X3D | 16 (32) | 4.2 | 5.7 | 120W | - [^1]: https://ark.intel.com/content/www/us/en/ark/products/218833/intel-z690-chipset.html [^2]: https://www.intel.com/content/www/us/en/products/sku/218831/intel-h670-chipset/specifications.html diff --git a/content/notes/stuff-about-pcie.md b/content/notes/stuff-about-pcie.md index b783924..b540d24 100644 --- a/content/notes/stuff-about-pcie.md +++ b/content/notes/stuff-about-pcie.md @@ -1,9 +1,6 @@ --- title: Stuff about PCIe date: 2022-01-03 -tags: - - linux - - harwdare --- ## Speed @@ -12,7 +9,7 @@ The most common versions are 3 and 4, while 5 is starting to be available with newer Intel processors. | ver | encoding | transfer rate | x1 | x2 | x4 | x8 | x16 | -|-----|-----------|---------------|------------|-------------|------------|------------|-------------| +| --- | --------- | ------------- | ---------- | ----------- | ---------- | ---------- | ----------- | | 1 | 8b/10b | 2.5GT/s | 250MB/s | 500MB/s | 1GB/s | 2GB/s | 4GB/s | | 2 | 8b/10b | 5.0GT/s | 500MB/s | 1GB/s | 2GB/s | 4GB/s | 8GB/s | | 3 | 128b/130b | 8.0GT/s | 984.6 MB/s | 1.969 GB/s | 3.94 GB/s | 7.88 GB/s | 15.75 GB/s | @@ -76,12 +73,14 @@ An easy way to see the PCIe topology is with `lspci`: \-18.7 Advanced Micro Devices, Inc. [AMD] Family 17h (Models 00h-0fh) Data Fabric: Device 18h; Function 7 Now, how do we read this ? + ``` +-[10000:00]-+-02.0-[01]----00.0 Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller] | \-03.0-[02]----00.0 Intel Corporation NVMe Datacenter SSD [3DNAND, Beta Rock Controller] ``` This is a lot of information, how do we read this ? + - The first part in brackets (`[10000:00]`) is the domain and the bus. 
- The second part (`02.0` is still unclear to me) - The third number (between brackets) is the device on the bus @@ -171,18 +170,18 @@ lspci -v -s 0000:01:00.0 A few things to note from this output: -- **GT/s** is the number of transactions supported (here, 8 billion - transactions / second). This is gen3 controller (gen1 is 2.5 and - gen2 is 5)xs -- **LNKCAP** is the capabilities which were communicated, and - **LNKSTAT** is the current status. You want them to report the same - values. If they don't, you are not using the hardware as it is - intended (here I'm assuming the hardware is intended to work as a - gen3 controller). In case the device is downgraded, the output will - be like this: `LnkSta: Speed 2.5GT/s (downgraded), Width x16 (ok)` -- **width** is the number of lanes that can be used by the device - (here, we can use 4 lanes) -- **MaxPayload** is the maximum size of a PCIe packet +- **GT/s** is the number of transactions supported (here, 8 billion + transactions / second). This is gen3 controller (gen1 is 2.5 and + gen2 is 5)xs +- **LNKCAP** is the capabilities which were communicated, and + **LNKSTAT** is the current status. You want them to report the same + values. If they don't, you are not using the hardware as it is + intended (here I'm assuming the hardware is intended to work as a + gen3 controller). In case the device is downgraded, the output will + be like this: `LnkSta: Speed 2.5GT/s (downgraded), Width x16 (ok)` +- **width** is the number of lanes that can be used by the device + (here, we can use 4 lanes) +- **MaxPayload** is the maximum size of a PCIe packet ## Debugging @@ -213,53 +212,53 @@ that have not been completed). 
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+ AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn- -- The Uncorrectable Error Status (UESta) reports error status of - individual uncorrectable error sources (no bits are set above): - - Data Link Protocol Error (DLP) - - Surprise Down Error (SDES) - - Poisoned TLP (TLP) - - Flow Control Protocol Error (FCP) - - Completion Timeout (CmpltTO) - - Completer Abort (CmpltAbrt) - - Unexpected Completion (UnxCmplt) - - Receiver Overflow (RxOF) - - Malformed TLP (MalfTLP) - - ECRC Error (ECRC) - - Unsupported Request Error (UnsupReq) - - ACS Violation (ACSViol) -- The Uncorrectable Error Mask (UEMsk) controls reporting of - individual errors by the device to the PCIe root complex. A masked - error (bit set) is not recorded or reported. Above shows no errors - are being masked) -- The Uncorrectable Severity controls whether an individual error is - reported as a Non-fatal (clear) or Fatal error (set). -- The Correctable Error Status reports error status of individual - correctable error sources: (no bits are set above) - - Receiver Error (RXErr) - - Bad TLP status (BadTLP) - - Bad DLLP status (BadDLLP) - - Replay Timer Timeout status (Timeout) - - REPLAY NUM Rollover status (Rollover) - - Advisory Non-Fatal Error (NonFatalIErr) -- The Correctable Erro Mask (CEMsk) controls reporting of individual - errors by the device to the PCIe root complex. A masked error (bit - set) is not reported to the RC. Above shows that Advisory Non-Fatal - Errors are being masked - this bit is set by default to enable - compatibility with software that does not comprehend Role-Based - error reporting. 
-- The Advanced Error Capabilities and Control Register (AERCap) - enables various capabilities (The above indicates the device capable - of generating ECRC errors but they are not enabled): - - First Error Pointer identifies the bit position of the first - error reported in the Uncorrectable Error Status register - - ECRC Generation Capable (GenCap) indicates if set that the - function is capable of generating ECRC - - ECRC Generation Enable (GenEn) indicates if ECRC generation is - enabled (set) - - ECRC Check Capable (ChkCap) indicates if set that the function - is capable of checking ECRC - - ECRC Check Enable (ChkEn) indicates if ECRC checking is enabled +- The Uncorrectable Error Status (UESta) reports error status of + individual uncorrectable error sources (no bits are set above): + - Data Link Protocol Error (DLP) + - Surprise Down Error (SDES) + - Poisoned TLP (TLP) + - Flow Control Protocol Error (FCP) + - Completion Timeout (CmpltTO) + - Completer Abort (CmpltAbrt) + - Unexpected Completion (UnxCmplt) + - Receiver Overflow (RxOF) + - Malformed TLP (MalfTLP) + - ECRC Error (ECRC) + - Unsupported Request Error (UnsupReq) + - ACS Violation (ACSViol) +- The Uncorrectable Error Mask (UEMsk) controls reporting of + individual errors by the device to the PCIe root complex. A masked + error (bit set) is not recorded or reported. Above shows no errors + are being masked) +- The Uncorrectable Severity controls whether an individual error is + reported as a Non-fatal (clear) or Fatal error (set). +- The Correctable Error Status reports error status of individual + correctable error sources: (no bits are set above) + - Receiver Error (RXErr) + - Bad TLP status (BadTLP) + - Bad DLLP status (BadDLLP) + - Replay Timer Timeout status (Timeout) + - REPLAY NUM Rollover status (Rollover) + - Advisory Non-Fatal Error (NonFatalIErr) +- The Correctable Erro Mask (CEMsk) controls reporting of individual + errors by the device to the PCIe root complex. 
A masked error (bit + set) is not reported to the RC. Above shows that Advisory Non-Fatal + Errors are being masked - this bit is set by default to enable + compatibility with software that does not comprehend Role-Based + error reporting. +- The Advanced Error Capabilities and Control Register (AERCap) + enables various capabilities (The above indicates the device capable + of generating ECRC errors but they are not enabled): + - First Error Pointer identifies the bit position of the first + error reported in the Uncorrectable Error Status register + - ECRC Generation Capable (GenCap) indicates if set that the + function is capable of generating ECRC + - ECRC Generation Enable (GenEn) indicates if ECRC generation is + enabled (set) + - ECRC Check Capable (ChkCap) indicates if set that the function + is capable of checking ECRC + - ECRC Check Enable (ChkEn) indicates if ECRC checking is enabled ## Compute Express Link (CXL) -[Compute Express Link](https://en.wikipedia.org/wiki/Compute_Express_Link) (CXL) is an open standard for high-speed central processing unit (CPU)-to-device and CPU-to-memory connections, designed for high performance data center computers. The standard is built on top of the PCIe physical interface with protocols for I/O, memory, and cache coherence. +[Compute Express Link](https://en.wikipedia.org/wiki/Compute_Express_Link) (CXL) is an open standard for high-speed central processing unit (CPU)-to-device and CPU-to-memory connections, designed for high performance data center computers. The standard is built on top of the PCIe physical interface with protocols for I/O, memory, and cache coherence. diff --git a/content/notes/working-with-go.md b/content/notes/working-with-go.md index af7bf20..fbfba88 100644 --- a/content/notes/working-with-go.md +++ b/content/notes/working-with-go.md @@ -1,12 +1,9 @@ --- title: Working with Go date: 2021-08-05 -tags: - - emacs - - go --- -*This document assumes go version \>= 1.16*. 
+_This document assumes go version \>= 1.16_. ## Go Modules @@ -22,22 +19,22 @@ create two files: `go.mod` and `go.sum`. In the `go.mod` file you'll find: -- the module import path (prefixed with `module`) -- the list of dependencies (within `require`) -- the version of go to use for the module +- the module import path (prefixed with `module`) +- the list of dependencies (within `require`) +- the version of go to use for the module ### Versioning To bump the version of a module: -``` bash +```bash $ git tag v1.2.3 $ git push --tags ``` Then as a user: -``` bash +```bash $ go get -d golang.fcuny.net/m@v1.2.3 ``` @@ -52,7 +49,7 @@ workspace (`git clone <module URL>`). Edit the `go.mod` file to add -``` go +```go replace <module URL> => <path of the local checkout> ``` @@ -85,14 +82,14 @@ There's a few special URLs (better documentation [here](https://golang.org/ref/mod#goproxy-protocol)): | path | description | -|-----------------------|------------------------------------------------------------------------------------------| +| --------------------- | ---------------------------------------------------------------------------------------- | | $mod/@v/list | Returns the list of known versions - there's one version per line and it's in plain text | | $mod/@v/$version.info | Returns metadata about a version in JSON format | | $mod/@v/$version.mod | Returns the `go.mod` file for that version | For example, looking at the most recent versions for `gopls`: -``` bash +```bash ; curl -s -L https://proxy.golang.org/golang.org/x/tools/gopls/@v/list|sort -r|head v0.7.1-pre.2 v0.7.1-pre.1 @@ -108,7 +105,7 @@ v0.6.8-pre.1 Let's check the details for the most recent version -``` bash +```bash ; curl -s -L https://proxy.golang.org/golang.org/x/tools/gopls/@v/list|sort -r|head v0.7.1-pre.2 v0.7.1-pre.1 @@ -124,7 +121,7 @@ v0.6.8-pre.1 And let's look at the content of the `go.mod` for that version too: -``` bash +```bash ; curl -s -L 
https://proxy.golang.org/golang.org/x/tools/gopls/@v/v0.7.1-pre.2.mod module golang.org/x/tools/gopls @@ -183,7 +180,7 @@ starting point. The configuration is straightforward, this is what I use: -``` elisp +```elisp ;; for go's LSP I want to use staticcheck and placeholders for completion (customize-set-variable 'eglot-workspace-configuration '((:gopls . @@ -206,7 +203,7 @@ flymake, eldoc. [pprof](https://github.com/google/pprof) is a tool to visualize performance data. Let's start with the following test: -``` go +```go package main import ( @@ -228,7 +225,7 @@ func BenchmarkStringJoin(b *testing.B) { Let's run a benchmark with `go test . -bench=. -cpuprofile cpu_profile.out`: -``` go +```go goos: linux goarch: amd64 pkg: golang.fcuny.net/m @@ -241,7 +238,7 @@ ok golang.fcuny.net/m 1.327s And let's take a look at the profile with `go tool pprof cpu_profile.out` -``` bash +```bash File: m.test Type: cpu Time: Aug 15, 2021 at 3:01pm (PDT) @@ -265,7 +262,7 @@ Showing top 10 nodes out of 41 We can get a breakdown of the data for our module: -``` bash +```bash (pprof) list golang.fcuny.net Total: 1.17s ROUTINE ======================== golang.fcuny.net/m.BenchmarkStringJoin in /home/fcuny/workspace/gobench/app_test.go diff --git a/content/notes/working-with-nix.md b/content/notes/working-with-nix.md index 3d208e4..7da8ec7 100644 --- a/content/notes/working-with-nix.md +++ b/content/notes/working-with-nix.md @@ -1,9 +1,6 @@ --- title: working with nix date: 2022-05-10 -tags: - - linux - - nix --- ## the `nix develop` command @@ -17,7 +14,7 @@ sub-commands. 
they map as follow: | phase | default to | command | note | -|----------------|----------------|---------------------------|------| +| -------------- | -------------- | ------------------------- | ---- | | configurePhase | `./configure` | `nix develop --configure` | | | buildPhase | `make` | `nix develop --build` | | | checkPhase | `make check` | `nix develop --check` | | @@ -40,7 +37,7 @@ phase](https://github.com/NixOS/nixpkgs/blob/fb7287e6d2d2684520f756639846ee07f62 ## `buildInputs` or `nativeBuildInputs` -- `nativeBuildInputs` is intended for architecture-dependent - build-time-only dependencies -- `buildInputs` is intended for architecture-independent - build-time-only dependencies +- `nativeBuildInputs` is intended for architecture-dependent + build-time-only dependencies +- `buildInputs` is intended for architecture-independent + build-time-only dependencies |