Jun 21 2023

List

Jun 21 2023

Singly linked list
Doubly linked list
Basic operations
List VS array
Additional links

A list is a linear collection of data that allows you to efficiently insert and delete elements from any point in the list.

Singly linked list

A list consists of nodes. Each node contains data and a pointer to the next node. If a node has a pointer only to the next node in line, it is a singly linked list. Schematically, it may look like this:

In the singly linked list, the pointer of the last node to the next one is null. This way, the list is just a pointer to the first node.

Here you can find an example of a singly linked list implementation in C, and here is an example for C++.

Doubly linked list

A node can have a pointer to the next node as well as a pointer to the previous one. This kind of list is called doubly linked. Schematically, it looks like this:

In the doubly linked list, the pointer of the first node to the previous one is null.

Here you can find an example of a doubly linked list implementation in C, and here is an example for C++.

Basic operations

List traversal and an arbitrary element access

To traverse the list, store the pointer to the current node. You can traverse to the next node by the pointer stored in the current node.

To access the nth element of the list, you need to sequentially traverse the first n nodes by their pointers to the next node. As a result, an arbitrary element is accessed in O(n) time.

Search

If you need to find an element in the list, traverse it and check each node for the target element. In the worst case, if the item needed is not there, you will traverse the entire list. In the general case, the search is performed in O(n) time.

Insertion

Inserting elements at the list is performed in O(1) time. To insert an element at the list, create a new node and update several pointers' values.

To insert a new node into a singly linked list, follow these steps:

Select a node after which you want to make an insertion.
In the selected node, save the value of the pointer to the next node.
Assign the pointer to the next node to the address of the newly inserted node.
Assign the pointer to the next node of the inserted one to the value saved in step 2.

Schematically, the insertion looks like this:

To insert a new node into a doubly linked list, follow these steps:

Select the node before which you want to make an insertion.
In the selected node, save the value of the pointer to the previous node.
In the node before the selected one, save the value of the pointer to the next node.
Assign the pointer to the previous node of the selected one to the new node's address.
Assign the pointer to the next node of the node before the selected one to the new node's address.
Assign the pointer to the previous node of the inserted one to the value saved in step 2.
Assign the pointer to the next node of the inserted one to the value saved in step 3.

Schematically, it looks like this:

Deletion

Deletion from a list is performed in O(1) time. To remove a node from the list, update several pointers.

To delete a node in a singly linked list, follow these steps:

Select the node after the node you want to delete.
In the node being deleted, save the value of the pointer to the next element.
Assign the pointer to the next node of the selected one to the node address saved in step 2.
Delete the node and free up memory.

Schematically, it looks like this:

To delete a node in a doubly linked list, follow these steps:

Select the node you want to delete.
In the node being deleted, save the value of the pointer to the previous node.
In the node being deleted, save the value of the pointer to the next node.
Assign the pointer to the previous node of the node after the selected one to the node address saved in step 2.
Assign the pointer to the next node of the node before the selected one to the node address saved in step 3.
Delete the node and free up memory.

Schematically, the deletion looks like this:

List VS array

An array is a linear data structure that continuously stores elements in memory and supports accessing an arbitrary element of a sequence. Schematically, an array looks like this:

To remove an element from the middle of an array, you need to move all the elements after the element you want to remove one position forward:

To delete the 100th element in an array of 1000 elements, you need to make 900 moves. Thus, it takes O(n) to remove an element from any position except the end. Removing an element from the end takes O(1).

Inserting an element in the middle of an array is no easier. In this case, all array elements after the insertion point need to be moved one position back:

To insert an element at the 100th position in an array of 1000 elements, 900 moves are needed. Thus, it takes O(n) to insert an element at any position except the end. Inserting an element at the end takes O(1).

As we can see, the only efficient place for inserting and deleting elements is the end of an array. At the same time, updating several pointers (four or less) is enough to insert or delete an element anywhere in the list. If you need to delete and insert elements in arbitrary places of the sequence, then it is better to store them in a list.

Based on the above, we can highlight another advantage of lists: they ensure that references to elements and iterators to nodes remain valid during any insertions and deletions (except for the nodes being deleted). Things are more complicated when it comes to arrays:

When inserting:
- If there was a reallocation to store more elements, then all iterators and references to elements become invalidated.
- Otherwise, they are valid up to the insertion position. Iterators and references become invalidated starting from the insertion position.
When deleting, all iterators and references after the element being deleted are invalidated.

The disadvantage of lists is the slow access to an arbitrary list element. If you need the 901st list element, you will have to traverse all 900 other list elements. In addition, list elements are usually arranged inconsistently in virtual memory. This means that each move to the next node will be accompanied by an arbitrary memory access.

Array elements are located continuously in virtual memory. Therefore, an access to an arbitrary element is a single access to the address obtained using address arithmetic: an offset equal to n * sizeof(Item) is added to the zero element address. Moreover, if small array elements are accessed sequentially, modern processors recognize this and load data in advance into cache (cache prefetching). For example, when accessing an int array element with a size of 4 bytes, if a cache line has a size of 64 bytes, the processor will load 16 elements into its cache at once. In case of lists, cache misses cause the processor to access virtual memory more often. This is much slower than CPU cache accesses.

Therefore, if you need random access to sequence elements, use an array instead of a list.

A brief comparison of an array and a list: