Comparing queue implementations in JavaScript
March 01, 2021 • javascript, computer science, asymptotics
While recently reviewing some basic data structures and algorithms, I wondered what the best implementation of a queue in JavaScript is, since there is no built-in queue. I decided to compare three different implementations by running some empirical tests.
I also wanted to get some practice with D3 and making data visualizations in React, so I used this to get some actual data to visualize. Be sure to check out the plots below!
Quick Queue Refresher
A queue is a data structure that stores and retrieves data in a first-in, first-out manner. It supports an enqueue operation to add an item to the queue and a dequeue operation to retrieve the next item in the queue.
While not strictly part of the definition, the expectation is that these enqueue and dequeue operations take a constant amount of time, meaning they take the same amount of time regardless of how many items the queue holds.
Queues can also support other operations, like checking whether the queue is empty, peeking at the next item, and getting the total number of items in the queue.
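For example, here's a minimal sketch of the FIFO contract, using a plain built-in array as a stand-in queue (the actual implementations come next):

// FIFO: items come out in the order they went in
const standInQueue = []
standInQueue.push('first')  // enqueue
standInQueue.push('second') // enqueue
standInQueue.push('third')  // enqueue
console.log(standInQueue.shift()) // 'first'
console.log(standInQueue.shift()) // 'second'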
Implementations tested
I looked at three implementations, based on arrays, plain objects, and (singly) linked lists. I required the implementations to all have the same interface, meaning that an item can be added to a queue with queue.enqueue(item) and the next item can be retrieved with queue.dequeue(). One benefit of this is that the same test code can be used for all the implementations.
Array implementation
The simplest implementation is probably just using a built-in array and the push and shift array methods. Here's the array implementation I used:
const buildArrayQueue = () => ({
  items: [],
  isEmpty() {
    return this.items.length === 0
  },
  enqueue(item) {
    this.items.push(item)
  },
  dequeue() {
    if (!this.isEmpty()) {
      return this.items.shift()
    } else {
      throw new Error('Cannot dequeue empty queue')
    }
  }
})
The function buildArrayQueue returns a queue object that is essentially a wrapper around an array (this.items) and its push and shift methods.
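As a quick sanity check, here's a hypothetical usage sketch (not part of the original test code):

const arrayQueue = buildArrayQueue()
arrayQueue.enqueue(1)
arrayQueue.enqueue(2)
console.log(arrayQueue.dequeue()) // 1, since 1 was added first
console.log(arrayQueue.isEmpty()) // false, since 2 is still queued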
Linked list implementation
The linked list implementation below uses an object to maintain references to the next node (this.head) and the last node (this.tail). Calling queue.enqueue(item) creates a node nextNode containing item as its value and adds it to the tail of the list, handling the edge case where the queue is empty. Calling queue.dequeue() returns the item from the next node in the list, throwing an error if the queue is empty.
const buildLinkedListQueue = () => ({
  head: null,
  tail: null,
  isEmpty() {
    return this.head === null
  },
  enqueue(item) {
    const nextNode = { val: item, next: null }
    if (this.isEmpty()) {
      this.head = nextNode
      this.tail = nextNode
    } else {
      this.tail.next = nextNode
      this.tail = nextNode
    }
  },
  dequeue() {
    if (!this.isEmpty()) {
      const nextItem = this.head.val
      this.head = this.head.next
      return nextItem
    } else {
      throw new Error('Cannot dequeue empty queue')
    }
  }
})
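To illustrate, here's a hypothetical trace of the node references (not from the original tests):

const listQueue = buildLinkedListQueue()
listQueue.enqueue('a')
listQueue.enqueue('b')
// head -> { val: 'a', next: { val: 'b', next: null } }
// tail -> { val: 'b', next: null }
console.log(listQueue.dequeue()) // 'a'
console.log(listQueue.head)      // { val: 'b', next: null }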
Object implementation
The following object implementation is like a mix between the array and linked list implementations. buildObjectQueue returns an object that assigns each item an integer key. It keeps track of the index of the next item in line with this.head, and it keeps track of the index where the next enqueued item will go with this.tail.
const buildObjectQueue = () => ({
  head: 0,
  tail: 0,
  isEmpty() {
    return this.head === this.tail
  },
  enqueue(item) {
    this[this.tail] = item
    this.tail++
  },
  dequeue() {
    if (!this.isEmpty()) {
      const nextItem = this[this.head]
      this.head++
      if (this.head === this.tail) {
        this.head = 0
        this.tail = 0
      }
      return nextItem
    } else {
      throw new Error('Cannot dequeue empty queue')
    }
  },
})
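To illustrate how the integer keys and the head and tail indices evolve, here's a hypothetical trace:

const objectQueue = buildObjectQueue()
objectQueue.enqueue('a') // stored at key 0; head: 0, tail: 1
objectQueue.enqueue('b') // stored at key 1; head: 0, tail: 2
console.log(objectQueue.dequeue()) // 'a'; head moves to 1, but key 0 still holds 'a'
console.log(objectQueue.dequeue()) // 'b'; queue is now empty, so head and tail reset to 0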
Tests
To keep things simple, I measured the time it took a queue to enqueue a single item and to dequeue a single item. To see how these times scale with the size of the queue, I ran the same tests at queue sizes from one hundred items up to ten million items, hitting each power of ten in between, i.e., six orders of magnitude.
Here’s the important part of the code used to test the queue implementations:
const randomInt = (min, max) =>
  Math.floor(Math.random() * (max - min + 1) + min)

function testQueue(queue, queueSize) {
  // initialize queue
  for (let i = 0; i < queueSize; i++) {
    queue.enqueue(randomInt(0, 100000))
  }
  // test enqueue time
  const randInt = randomInt(0, 100000)
  const enqueueStart = process.hrtime.bigint()
  queue.enqueue(randInt)
  const enqueueEnd = process.hrtime.bigint()
  // test dequeue time
  const dequeueStart = process.hrtime.bigint()
  queue.dequeue()
  const dequeueEnd = process.hrtime.bigint()
  return [enqueueEnd - enqueueStart, dequeueEnd - dequeueStart]
}
The function testQueue initializes queue with queueSize random integers between zero and one hundred thousand, and then measures the time immediately before and after a call to queue.enqueue and a call to queue.dequeue using Node's process.hrtime.bigint (the tests were run with Node version 14.15.2). process.hrtime.bigint returns a time relative to an arbitrary point in the past and is thus used to measure differences in time. It returns the time in nanoseconds, i.e., billionths of a second, as a bigint. You can read the docs here.
The actual code running the tests is:

const numIterations = 400
const queueSizes = [100, 1000, 10000, 100000, 1000000, 10000000]

for (let queueSize of queueSizes) {
  const arrayTimes = []
  const linkedListTimes = []
  const objectTimes = []
  for (let i = 0; i < numIterations; i++) {
    objectTimes.push(testQueue(buildObjectQueue(), queueSize))
    arrayTimes.push(testQueue(buildArrayQueue(), queueSize))
    linkedListTimes.push(testQueue(buildLinkedListQueue(), queueSize))
  }
  /*
    code to write data to file
  */
}
At each queueSize, I ran testQueue for each implementation four hundred times, using a new queue object each time.
Predictions
I expected the enqueue times to be mostly independent of the size of the queue for all the implementations, since each enqueue implementation consists only of constant-time operations.
I also expected the dequeue times to be independent of queue size, except for the array implementation, because it uses the array method Array.prototype.shift. The shift method removes the item at index 0 from the array and returns it, but this forces all the elements remaining in the array to be shifted down by one index. This means the whole shift operation takes time proportional to the number of elements in the array. For example, shift should take about ten times longer on an array with one thousand elements than on an array with one hundred elements.
Reindexing array elements should be a fast operation, so this effect should be significant only at large array sizes. On top of that, there is constant overhead time for function calls and whatnot.
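As a rough illustration, here's a hypothetical micro-benchmark of a single shift call (with the usual caveat that engine optimizations and noise can muddy individual measurements):

// time one shift() on an array of n elements
const timeShift = (n) => {
  const arr = Array.from({ length: n }, (_, i) => i)
  const start = process.hrtime.bigint()
  arr.shift()
  const end = process.hrtime.bigint()
  return end - start // nanoseconds as a bigint
}
console.log(timeShift(100000))  // some time t
console.log(timeShift(1000000)) // in principle, roughly ten times t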
Putting this all together, I expect the array dequeue times t to be related to the queue size n by the equation

t = mn + b

for some numbers m and b. The term mn represents the shift effect, and b represents the constant overhead time costs. Hopefully you recognize this as the equation for a straight line: y = mx + b.
Results
Below is an interactive data visualization of the test results. You can choose what data is shown and what kind of plot to show it in by clicking the options below the plot.
Note that I rescaled the units from nanoseconds to microseconds. I've counted as outliers any points more than one and a half times the interquartile range above the 75th percentile or more than one and a half times the interquartile range below the 25th percentile, where the interquartile range is the difference between the 75th and 25th percentiles.
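For reference, here's a sketch of that outlier rule (Tukey's fences with a multiplier of 1.5); it assumes the times have already been converted from bigint nanoseconds to numeric microseconds, and it isn't the exact code used for the plots:

// linearly interpolated percentile of a sorted numeric array
const quantile = (sorted, q) => {
  const idx = (sorted.length - 1) * q
  const lo = Math.floor(idx)
  const frac = idx - lo
  return sorted[lo] + frac * ((sorted[lo + 1] ?? sorted[lo]) - sorted[lo])
}

// anything outside [lower, upper] counts as an outlier
const outlierFences = (times) => {
  const sorted = [...times].sort((a, b) => a - b)
  const iqr = quantile(sorted, 0.75) - quantile(sorted, 0.25)
  return [quantile(sorted, 0.25) - 1.5 * iqr, quantile(sorted, 0.75) + 1.5 * iqr]
}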
To better see how the enqueue and dequeue times scale with queue size, I’ve plotted the median times for each queue size below.
Because the queue sizes span six orders of magnitude, I've actually plotted the logarithm base ten of the enqueue and dequeue times¹ against the logarithm base ten of the queue size to make the plots look nicer. So, for example, log₁₀(queue size) = 7 means the queue size is 10⁷, which is ten million, and log₁₀(time) = 4 means the time is 10⁴ nanoseconds, which is ten thousand. More generally, log₁₀ x = y means x = 10ʸ.
Median enqueue times
The median enqueue times are somewhat constant over the six orders of magnitude of queue sizes, but there is a slight increase (more on this below). They range from about 0.25 microseconds to about 1.5 microseconds, and for each queue size they are roughly the same across all three implementations.
Median dequeue times
The median dequeue times for the linked list and object implementations show a slight increase similar to the median enqueue times. The array implementation's times increase very gradually at first and then start to increase rapidly.
Thoughts
Looking at the data, the results agree with my predictions for the most part. Some quick observations:
- In many of the scatterplots there seems to be a noticeable decrease in times over the first fifty or so iterations. I suspect this might be due to under-the-hood optimizations by the compiler triggered by the many repeated calls to enqueue and dequeue (we say these methods, which are really functions, become "hot"), but I wasn't quite sure how to check this.
- In basically all the datasets we see some large outliers. I think these are likely due to OS interruptions and the garbage collector.
Median array dequeue times
I predicted earlier that the array implementation's median dequeue times t should follow the linear behavior t = mn + b for large queue size n. Looking at the graph, the points don't exactly follow a straight line, but this is because I took the logarithms of t and n, meaning I actually plotted log₁₀ t versus log₁₀ n.
To see that this graph indeed shows the behavior I expected, let's take the logarithm of both sides of the equation for a straight line t = mn + b:

log₁₀ t = log₁₀(mn + b)
Remember that m is the number representing the reindexing effect when using shift, and that this effect is small, so m should be a pretty small number. This means that for relatively small queue sizes n, the product mn is a lot smaller than b, so

log₁₀ t = log₁₀(mn + b) ≈ log₁₀ b,

which is constant. In other words, for small queue sizes, log₁₀ t should be roughly constant.
On the other hand, once n gets large enough that the product mn is much greater than b, then

log₁₀ t ≈ log₁₀(mn) = log₁₀ m + log₁₀ n,

where the last equality follows from the property of logarithms that log(ab) = log(a) + log(b). Since we're plotting log₁₀ t versus log₁₀ n, this approximate equation for log₁₀ t has the same form as the equation for a straight line y = mx + b, where y is log₁₀ t, x is log₁₀ n, the slope is 1, and the intercept is log₁₀ m. This means that for large queue sizes, the graph of log₁₀ t should be approximately linear.
We see both of these behaviors for small and large queue sizes. Fitting by eye, m = 0.34 and b = 489.5 (in nanoseconds) seem to fit the data well (try entering "plot log base 10 (0.34 *10^n + 489.5) between n = 2 and n = 7" on Wolfram Alpha to see for yourself).
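If you'd rather not use Wolfram Alpha, here's a small sketch that evaluates the fitted model at the tested queue sizes (assuming the eyeballed values m = 0.34 and b = 489.5):

const m = 0.34  // ns per element (eyeballed fit)
const b = 489.5 // ns of constant overhead (eyeballed fit)
for (const n of [1e2, 1e3, 1e4, 1e5, 1e6, 1e7]) {
  const t = m * n + b // modeled dequeue time in nanoseconds
  console.log(`log10(n) = ${Math.log10(n)}: log10(t) = ${Math.log10(t).toFixed(2)}`)
}
// nearly flat for small n, then climbs with slope close to 1 for large n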
But I should point out that the array dequeue times don't actually follow the equation t = mn + b exactly. This was an example of asymptotic analysis: what the equation really represents is the small-n and large-n behavior combined in one concise expression.
The enqueue and other dequeue times
Turning to the linked list and object dequeue times, as well as the enqueue times for all three implementations: to be honest, I'm not quite sure why there is a slight increase as the queue size increases. It seems there might be some sort of dependence on queue size that I can't account for.
At first I thought: we're using Node, which runs on Chrome's V8 JavaScript engine, and I remembered reading about how V8 uses an optimization technique known as hidden classes for fast object property accesses, and that adding properties to already-existing objects causes hidden class transitions.
I checked whether the hidden classes were changing as items were enqueued and dequeued using V8's %HaveSameMap function, and no, there were no hidden class transitions.²
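For reference, a check along these lines can be written as follows (a sketch, not necessarily the exact script I ran; %HaveSameMap is a V8 intrinsic, so it must be run with Node's --allow-natives-syntax flag):

// run with: node --allow-natives-syntax check.js
const before = buildObjectQueue()
const after = buildObjectQueue()
after.enqueue(42)
after.dequeue()
// true means enqueueing and dequeueing did not change the hidden class
console.log(%HaveSameMap(before, after))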
I was particularly surprised that the object implementation had no hidden class transitions, since it relies on explicitly adding properties to the queue object. Reading the article on hidden classes linked above, it turns out that integer-keyed properties are treated differently from non-integer properties, so adding integer properties does not cause hidden class transitions.
Then I looked into how the enqueue times changed for a single queue as it grew to one million items, as well as the dequeue times as that same queue was emptied. Since checking the times after enqueueing and dequeueing each item would give me two million data points, far more than I need or want, I recorded the times only after every one thousand items.
Besides some occasional outliers, the dequeue and enqueue times are basically independent of queue size, except of course for the array dequeue times as discussed above. Moreover, the times are slightly faster than the fastest times recorded in the earlier tests.
The code I ran here looks similar to the testQueue code from earlier:
const randomInt = (min, max) =>
  Math.floor(Math.random() * (max - min + 1) + min)

function testSameQueue(queue, queueSize) {
  const enqueueTimes = []
  const dequeueTimes = []
  for (let i = 0; i < queueSize; i++) {
    const randInt = randomInt(0, 100000)
    const enqueueStart = process.hrtime.bigint()
    queue.enqueue(randInt)
    const enqueueEnd = process.hrtime.bigint()
    // record the time only after every one thousand items
    if (i % 1000 === 0) {
      enqueueTimes.push(enqueueEnd - enqueueStart)
    }
  }
  for (let i = 0; i < queueSize; i++) {
    const dequeueStart = process.hrtime.bigint()
    queue.dequeue()
    const dequeueEnd = process.hrtime.bigint()
    if (i % 1000 === 0) {
      dequeueTimes.push(dequeueEnd - dequeueStart)
    }
  }
  return [enqueueTimes, dequeueTimes]
}
const linkedListResults = testSameQueue(buildLinkedListQueue(), 1000000)
const objectResults = testSameQueue(buildObjectQueue(), 1000000)
const arrayResults = testSameQueue(buildArrayQueue(), 1000000)
I’m not sure what the cause of the discrepancy is. It’s probably the effect of some implementation details I am unaware of. If you know of an explanation, please let me know!
Other considerations
From the data we’ve seen above, it seems that the linked list implementation is the preferred implementation of the three I tested. The array implementation is the least preferred, particularly because of how its dequeue times scale linearly with queue size. The object and linked list implementations performed similarly to each other, with the linked list performing slightly better overall.
One additional thing we haven’t discussed yet is memory.
In short, the object implementation does not automatically remove its reference to each item as the item is dequeued. This is because each item is associated with an integer key in the queue object, and dequeuing an item simply updates this.head to the next integer key, i.e., to the next item to be dequeued. The item stored at key 6, for example, is only removed when (1) this.head and this.tail are reset to zero once the queue is completely emptied and then (2) another seven items are enqueued, overwriting keys 0 through 6.
In the case that the queue is never completely emptied and items are continually enqueued and dequeued, the queue will keep collecting references to objects that might not be needed after they are dequeued. This means that those objects won’t be garbage collected, leading to a potential memory leak.
You might say: well, just use the delete operator to remove the reference each time an item is dequeued. You can do this, but it adds an extra step to every dequeue and therefore more time.
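For what it's worth, such a variant might look like the following sketch of a modified dequeue (this would replace the dequeue method inside buildObjectQueue; it was not one of the tested implementations):

// inside the object returned by buildObjectQueue
dequeue() {
  if (!this.isEmpty()) {
    const nextItem = this[this.head]
    delete this[this.head] // drop the reference so the item can be garbage collected
    this.head++
    if (this.head === this.tail) {
      this.head = 0
      this.tail = 0
    }
    return nextItem
  } else {
    throw new Error('Cannot dequeue empty queue')
  }
}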
The linked list implementation does not suffer from this problem, because by design this.head is a direct reference to (a node containing) the next item to be dequeued. When an item is dequeued, this.head is moved to the next node, and the queue holds no other references to the dequeued item, allowing it to be garbage collected (provided there are no remaining references to it elsewhere).
That’s it for this post! Please let me know if you have any comments or suggestions, especially regarding any of the topics I wasn’t sure about.
1. One technical note: I kept the units of the enqueue and dequeue times as nanoseconds here to avoid dealing with negative logarithms, and there aren't really units for the logarithm of a time. Another way to think about this last point is that the plots are basically the same as log-log plots of time vs. queue size.↩
2. In fact, the linked list and object implementations have the same hidden class, because they declare the same properties at instantiation in the same order.↩