What *actually* is a Buffer?
Well, hi there! 🙋‍♂️
Many of you probably have (or had) the same question I did at some point in your development journey.
Unlike arrays, lists, sets, dictionaries, or even things like JSON, Buffers aren’t a popular subject in introductory courses and articles, so most likely you came here searching for an explanation yourself. Your problems are over!
A quick disclaimer before we begin: I’ll use Node.js code as examples throughout this “tutorial”, but the most important part (the concepts) is reusable; just look up how to do the same in your language of choice. 😉
What are Buffers?
A buffer is a container for raw bytes. A byte is just eight bits (each one a 0 or a 1), so a byte might look like 00110101, but you probably already know this.
At the lowest level, all data in your computer is represented by bits. Even though modern languages (like JavaScript/Node) let you work with higher-level abstractions like int, float, double, strings, booleans, arrays, and others, keep in mind that these are all abstractions; fundamentally they’re all 0’s and 1’s.
At this point you’re probably thinking: “Wow, a bunch of things I learned on my first day of CS in college, impressive”. But what I want you to notice is that any time your program wants to communicate with something outside itself, it has to do so in bytes. That’s what every single language understands, and what every single language is built on top of (who’s Captain Obvious now?). And that’s the main reason why, whenever you read a file or work with things like sockets, you typically get a Buffer back (which you can roughly think of as an array of bytes for now).
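Here is a minimal sketch of what such a Buffer looks like when printed in Node. The string 'ABCDEFGH' is just a value I picked for illustration (8 ASCII characters, so 8 bytes):

```javascript
const { Buffer } = require('node:buffer');

// 8 ASCII characters become 8 bytes.
const buf = Buffer.from('ABCDEFGH', 'utf8');

console.log(buf);        // <Buffer 41 42 43 44 45 46 47 48>
console.log(buf.length); // 8
```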
If you print an 8-byte Buffer in Node, you’ll see 8 pairs of characters, and you’ll probably notice that they aren’t in binary. So what are they?
That Buffer holds 8 bytes, which is why there are 8 pairs: each pair is one byte, written in hexadecimal purely for readability (the bytes are still stored in binary).
Now that we have an overview of what a Buffer is and how it works, let’s dig deeper (this is where the Node code starts).
Buffers in Node.js
There are a lot of ways to create a buffer, but the simplest is probably the alloc method, which gives you a zero-filled buffer of the size you ask for. If you know you’re going to fill the buffer immediately, you can use allocUnsafe instead, which is faster but doesn’t zero the memory, so the new buffer may still contain old data left over from earlier allocations.
Or you can use the from method, where you’ll most commonly be passing in an array of numbers, a string, or another buffer. If you pass a string, you’ll want to also pass an encoding (or use the default ‘utf8’).
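A quick sketch of the three most common inputs to from. The values and variable names are only illustrative:

```javascript
const { Buffer } = require('node:buffer');

// From an array of numbers: each entry becomes one byte.
const fromArray = Buffer.from([104, 105]);

// From a string, with an explicit encoding.
const fromString = Buffer.from('hi', 'utf8');

// From another buffer: this makes a copy, not a view.
const fromBuffer = Buffer.from(fromString);
fromBuffer[0] = 72; // 'H'; fromString is left untouched

console.log(fromArray);             // <Buffer 68 69>
console.log(fromString.toString()); // hi
console.log(fromBuffer.toString()); // Hi
```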
Note that new Buffer is considered deprecated, so you’ll want to use the alloc, allocUnsafe, or from methods instead.
Now, a little example with the alloc method:
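Something along these lines, as a minimal sketch (the sizes and fill values are arbitrary choices of mine). I’ve included allocUnsafe too, so you can see why it should be filled before you read from it:

```javascript
const { Buffer } = require('node:buffer');

// alloc gives you a zero-filled buffer of the requested size.
const zeroed = Buffer.alloc(4);
console.log(zeroed); // <Buffer 00 00 00 00>

// An optional second argument sets the fill value.
const filled = Buffer.alloc(4, 0xff);
console.log(filled); // <Buffer ff ff ff ff>

// allocUnsafe skips zeroing, so overwrite every byte before reading.
const scratch = Buffer.allocUnsafe(4);
scratch.fill(0x2a);
console.log(scratch); // <Buffer 2a 2a 2a 2a>

// Individual bytes can be written by index.
zeroed[0] = 0x41;
console.log(zeroed); // <Buffer 41 00 00 00>
```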
You can also copy part of another Buffer (or the whole buffer, if you want to):
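For instance, using the copy method. This is a sketch with a made-up source string; the arguments to copy are the target buffer, the offset in the target, and the start and end offsets in the source:

```javascript
const { Buffer } = require('node:buffer');

const source = Buffer.from('Hello, Buffer!', 'utf8');

// Copy only bytes 7..12 ("Buffer") into a fresh 6-byte buffer.
const part = Buffer.alloc(6);
source.copy(part, 0, 7, 13);
console.log(part.toString()); // Buffer

// Copy the whole thing.
const whole = Buffer.alloc(source.length);
source.copy(whole);
console.log(whole.toString()); // Hello, Buffer!

// The copies are independent of the source.
part[0] = 0x62; // 'b'
console.log(part.toString());   // buffer
console.log(source.toString()); // Hello, Buffer!
```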
Things you may ask
(Or at least, things I asked myself.)
1. Then what’s the difference between an array of bytes and a Buffer?
A ByteBuffer is more like a builder. It creates a byte[]. Unlike arrays, it has more convenient helper methods (e.g. the append(byte) method). It's not that straightforward in terms of usage. Just like with arrays, the ByteBuffer has a fixed size. So, when you instantiate it, you already have to specify the size of the buffer.
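So the main difference is that a Buffer is purpose-built for handling raw bytes, while an array is a general-purpose container. In Node specifically, a Buffer is a subclass of Uint8Array: its length is fixed, and every element is an integer from 0 to 255.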
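To make the contrast concrete in Node terms, here is a small sketch comparing a plain JavaScript array with a Buffer. The values are arbitrary, and this shows Node’s Buffer, not the ByteBuffer described above:

```javascript
const { Buffer } = require('node:buffer');

const arr = [1, 2, 3];
const buf = Buffer.from([1, 2, 3]);

// An array can grow and hold anything.
arr.push(300, 'text');
console.log(arr.length); // 5

// A Buffer has a fixed length; writes past the end are ignored.
buf[3] = 9;
console.log(buf.length); // 3

// Each slot holds exactly one byte, so 300 is stored as 300 % 256 = 44.
buf[0] = 300;
console.log(buf[0]); // 44

// In Node, a Buffer is a Uint8Array with extra helpers on top.
console.log(buf instanceof Uint8Array); // true
console.log(buf.toString('hex'));       // 2c0203
```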
2. OK, but how does a Buffer differ from a Stream, then?
I’ll let Marc Gravell’s words answer this for you:
Many data-structures (lists, collections, etc) act as containers — they hold a set of objects. But not a stream; if a list is a bucket, then a stream is a hose. You can pull data from a stream, or push data into a stream — but normally only once and only in one direction (there are exceptions of course). For example, TCP data over a network is a stream; you can send (or receive) chunks of data, but only in connection with the other computer, and usually only once — you can’t rewind the Internet.
Streams can also manipulate data passing through them; compression streams, encryption streams, etc. But again — the underlying metaphor here is a hose of data. A file is also generally accessed (at some level) as a stream; you can access blocks of sequential data. Of course, most file systems also provide random access, so streams do offer things like Seek, Position, Length etc — but not all implementations support such. It has no meaning to seek some streams, or get the length of an open socket.
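In Node, the two ideas meet: a stream usually hands you its data as a series of Buffers. The sketch below fakes a “hose” with Readable.from and three hand-made chunks (the chunk contents and the collect helper are my own invention, standing in for a real file or socket) and pours everything into one “bucket” with Buffer.concat:

```javascript
const { Buffer } = require('node:buffer');
const { Readable } = require('node:stream');

// Collect everything a stream delivers into one Buffer.
async function collect(stream) {
  const chunks = [];
  for await (const chunk of stream) {
    console.log('chunk:', chunk); // each chunk arrives as its own Buffer
    chunks.push(chunk);
  }
  return Buffer.concat(chunks);
}

// Stand-in for a file or socket: three chunks delivered one after another.
const hose = Readable.from([
  Buffer.from('Hel'),
  Buffer.from('lo, '),
  Buffer.from('hose'),
]);

const result = collect(hose); // a Promise for the combined Buffer
result.then((all) => console.log(all.toString())); // Hello, hose
```

Once the loop has consumed a chunk, the stream won’t give it to you again, which is exactly the “you can’t rewind” part of the quote; the combined Buffer, on the other hand, can be read as many times as you like.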
Beyond that, I can only say that I plan on writing another blog post explaining Streams using Dart (because its syntax for them is just delicious). So stay tuned 😜
___________________________________________________________________
I think that’s it. If you have any questions, or think this post needs an accuracy fix, just contact me.
See ya! 🙋‍♂️