The Idea Factory

  • Post category:Book

This book is a history of Bell Labs, which AT&T ran for much of the 20th century. These are the labs which produced many of the things I use day to day -- Unix and the C programming language for example -- although this book focuses on other people present at the lab, a bit earlier than the Unix people. Unix: A History and a Memoir, for example, is set in the same location but later in time. One interesting point the book makes early is that the America of the early 20th century wasn't super into scientists; it was much more about engineers. So for example Edison was an engineer whose superpower was systematically grinding through a problem space looking for solutions, but not necessarily understanding the mechanism that caused a solution to work. A really good example, although not one of Edison's, is adding lead to fuel to stop engine knocking and wear -- they literally walked the periodic table until they found an element that worked. I am left wondering how much of this failure to understand the underlying mechanism contributed to the longer-term environmental and health implications…


virtio-vsock: python examples of running the server in the guest

I've been using virtio-serial for communications between Linux hypervisors and guest virtual machines for ages. Lots of other people do it too -- the qemu guest agent for example is implemented like this. In fact, I think that's where I got my original thoughts on the matter from. However, virtio-serial is actually fairly terrible to write against as a programming model, because you're left to do all the multiplexing of various requests down the channel yourself. Surely there's something better? Well... There is! virtio-vsock is basically the same concept, except it uses the socket interface. You can have more than one connection open and the sockets layer handles multiplexing by magic. This massively simplifies the programming model for supporting concurrent users down the channel. So that's actually pretty cool. I should credit Kata Containers with noticing this quality-of-life improvement nearly a decade before I did, but I get there in the end. The virtio-vsock model is only a little bit weird. The "address" for the guest virtual machine is a "CID" (Context ID). The hypervisor process is always at CID 0, CID 1 is reserved and unused, and CID 2 is any process on the host which is not…
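
As a rough illustration of the socket-style programming model described above, here is a minimal sketch of a guest-side server in Python. It assumes a Linux guest with vsock support and Python 3.7 or later (for `socket.AF_VSOCK`). The port number 1234, the upper-casing "protocol", and the function names are made up for illustration and are not taken from the original post.

```python
import socket
import threading

# Hypothetical port number, chosen only for this sketch.
EXAMPLE_PORT = 1234


def handle_connection(conn):
    """Read one request and send it back upper-cased, then close."""
    with conn:
        data = conn.recv(4096)
        if data:
            conn.sendall(data.upper())


def serve(port=EXAMPLE_PORT):
    """Accept vsock connections inside the guest, one thread per client."""
    with socket.socket(socket.AF_VSOCK, socket.SOCK_STREAM) as server:
        # VMADDR_CID_ANY means we do not need to know our own CID to listen.
        server.bind((socket.VMADDR_CID_ANY, port))
        server.listen()
        while True:
            # The peer address is a (CID, port) tuple rather than (IP, port).
            conn, (peer_cid, peer_port) = server.accept()
            # Each connection is independent, so no hand-rolled multiplexing.
            threading.Thread(
                target=handle_connection, args=(conn,), daemon=True
            ).start()

# To run inside a guest: call serve()
```

A client on the host would open a socket of the same `AF_VSOCK` family and connect to `(guest_cid, 1234)`, where `guest_cid` is whatever CID was assigned to the virtual machine when it was started.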


Fugitive Telemetry

  • Post category:Book

This is the fifth Murderbot book and it's a fun read just like the rest. Unfortunately, it's also really short just like most of the others, and I find that the story is therefore just a bit simple and two-dimensional. It is nice that the story isn't just a repeat of previous entries in the series, although I would say that this one is relatively freestanding in that it doesn't progress the overall story arc. That said, no regrets reading this one.


Children of Memory

  • Post category:Book

This is the third book in this series, coming after Children of Time and Children of Ruin. While I really liked the first of the books in the series, the second felt weaker. While this one doesn't review as well as the second, I think it's actually a stronger book. Whilst sometimes a bit repetitive, I think the ideas presented here are novel, and the book does a good job of finding a new way of discussing the tensions that refugees and mass immigration create for societies. This book is also an interesting combination of science fiction and fantasy -- the familiar territory of a failing colonization ship sent out on a hope and a prayer, and then a fantasy story about a little girl trying to save her family and a group of strangers come to town. Overall, I enjoyed this book.


Taming Silicon Valley

  • Post category:Book

The similarities and contrasts between this book and AI Snake Oil are striking. For example, AI Snake Oil describes generative AI as something which largely works but is sometimes wrong, whereas this book is very concerned about how these systems have been rushed out the door in the wake of the unexpected popularity of ChatGPT despite clear issues with hallucinations and unacceptable content generation. Yet the books agree on many things too -- the widespread use of creators' content without permission, the weaponization of generative AI for political misinformation, the dangers of deep fakes, and the lack of any form of factual verification (or understanding of the world at all) in the statistical approaches used to generate the content. Big tech has no answer for these "negative externalities" that they are enabling and would really rather we all pretend they're not a thing. This book pushes much harder on the issue of how unregulated big tech is, and how it is repeatedly allowed to cause harm to society in return for profits. It will be interesting to see if any regulation with teeth is created in this space. I find the assertion made in this book that large language models should not be open…


Network Effect

  • Post category:Book

I'm not really sure why, but I found it harder to get going on this book than the others in the series. It might have been that I was also reading a particularly good non-fiction book at the same time, or it might have been that the premise for these books is starting to wear a bit thin. I'm unsure. That said, while the start of the book covers familiar territory, the overall story rapidly diverges into new things and I found it quite readable once I built up some momentum. In the end, I enjoyed this book and would definitely read it again sometime.


On GitHub merge queues, reliability, and multiplexed serial connections

Assuming anyone was paying attention, which I suspect they are not, they would have noticed that there are a lot of queued up pull requests for Shaken Fist right now. There are a couple of factors leading to that — there are now several bots which do automated pull requests for code hygiene purposes; and a couple of months ago I decided to give GitHub’s new merge queue functionality a go in order to keep the CI resource requirements for Shaken Fist under control. All CI runs on four machines in my home office, and there were periods of time where the testing backlog would be more than 24 hours long. I can’t simply buy more hardware, and I didn’t really want to test things less.

The basic idea of GitHub merge queues is that you have a quick set of initial tests which determine if the pull request smells, and then only run the full test suite on non-stinky code which a human has signed off on. Once the human signs off, the code will only merge if the full suite passes, and GitHub manages a queue of merge attempts to keep that reasonable.
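
The gating described above can be modelled as a toy sketch in Python. This is not how GitHub implements merge queues (for one thing, it ignores how queued changes are tested on top of each other), and every name here, including `PullRequest` and `run_merge_queue`, is hypothetical and exists only for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class PullRequest:
    number: int
    approved: bool = False  # has a human signed off?


def run_merge_queue(pull_requests, quick_tests, full_suite):
    """Return the numbers of the pull requests that end up merged.

    quick_tests and full_suite are callables taking a PullRequest and
    returning True on success. Both are stand-ins for real CI jobs.
    """
    # Only code which passes the cheap smell test AND has human sign-off
    # is allowed to consume expensive CI time.
    queue = deque(
        pr for pr in pull_requests if quick_tests(pr) and pr.approved
    )

    merged = []
    while queue:
        pr = queue.popleft()  # merge attempts are processed in order
        if full_suite(pr):    # merge only if the full suite passes
            merged.append(pr.number)
    return merged
```

The point of the structure is that the expensive `full_suite` call is never reached for pull requests that fail the quick tests or lack approval, which is what keeps the CI backlog under control.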


AI Snake Oil

  • Post category:Book

Nick recommended I read this book, so here it is. The book starts by providing an analogy for how we talk about AI -- imagine that all transport vehicles were grouped under one generic term instead of a variety like "car", "bus", "rocket", and "boat". Imagine how confused a conversation would be if I was talking about boats and you were talking about rockets. This is one of the issues right now with discussions of "AI" -- there are several kinds of AI, but the commentary is all grouped together, conflating the various types. I think this is probably a specific example of what Ben Goldacre talks about in Bad Science -- science reporting by non-scientists is often overly credulous, and misses the subtleties. Next we need to decide what is in fact AI versus being something else which might be like AI, but not really AI. The book poses three questions to help here: Would a human performing this role require training? If so this might be AI. Image generation is a good example. Is the behaviour of the system specified directly in code, or is it learnt from examples or a database search? The latter is…


Understanding the Intel 4004 clock circuit

Noting that the Intel 4004 was normally sold as a chip set called the Intel MCS-4, the standard clock circuit used appears to be the one shown in this PDF (kindly provided by this vendor of MCS-4 test boards), which means I want to work out what this circuit is doing. First off, let's understand these flip flops. I found this really good course on Computer Organization and Design from Intermation that I think is worth more attention than it appears to have received. I especially like how the sequence of videos starts by explaining the precursor memory types including core memory. There is of course a series of relevant Ben Eater videos too, so I've linked to those as well. These concepts directly map to the flip flop usage in the Intel MCS-4 clock circuit. As a bonus, a dude named Brek on YouTube built his own core memory, but the only documentation I can find is a series of YouTube videos from 2015. Certainly the links in the Hackaday article are all 404 errors now. I made a playlist of just those videos for convenience. So, this circuit starts to look like two clock dividers based on watching all…


FastCDC, puzzlefs, and de-duplicating container and VM images

Since about 2017, a group at Cisco has been working on an "OCI native operating system" under the title "project machine", which is a terrible project name. I note that most of the people publicly involved in the project according to GitHub commits no longer work at Cisco, so I cannot vouch for the health of the overall project. That said, they did come up with some interesting ideas along the way and given it's a quiet time of year I figured I could do some reading. Firstly, Docker / OCI images store their layers as tar files. This is quite inefficient, as the tar file format itself is really intended for ancient tape drives, doesn't support concepts such as random seeks, isn't particularly well defined (there are a few competing implementations), and generally wasn't intended for these things. So instead, that team wrote atomfs, which stores the layers as squashfs filesystems. It should be noted that the only container runtime which appears to actually support atomfs is the project machine itself, so it's not a super useful format in the real world. Secondly, the team appears to have fairly rapidly moved on to puzzlefs instead of pursuing full support…

