Studio 2: Obtaining better asymptotic performance for a list
- Pre-studio: Review of Linked Lists
- Part A: Probe the Costs of Linked Lists
- Part B: Improve the Behavior of Appending to a List
- Part C: Improve the Behavior of Querying List Length
- Part D: Further Extensions
Pre-studio: Review of Linked Lists
Open Eclipse and do a Team/Pull on your repo to make sure it is up to date.
This studio focuses on the complexity of operations on linked lists.
We have provided you with some linked-list code from CSE 131 in the
studio2.lists
package.
- Take a look at this code, paying particular attention to the
implementation in
LinkedList.java
, before studio, and try to understand how it works. In particular, take a few minutes to watch this video from CSE131, which discusses how to add an element to the end of the list (theaddLast()
operation). You'll be studying and modifying the code for this procedure as part of your studio.
Part A: Probe the Costs of Linked Lists
We'll start by studying the cost of some methods in the LinkedList
class.
- Go to the
studio2
package in your repository, open theAddLastNoTail
class, and run it.
The work performed by AddLastNoTail
is mostly in the run()
method
of its base class AddLastBase
. That method appends $n$ successive
random integers to the end of the list.
- Refresh the
outputs
folder and find the files of the time and ticks counts generated by runningAddLastNoTail
. Plot the running times in these files as a function of $n$.
Question A1: How would you describe the curves for ticks and time as functions of $n$?
You may be wondering where the ticker calls that generated tick counts
for this program are located. They are in the addLast()
method in
LinkedList.java
in the studio2.lists
package, which you looked at
during the pre-studio.
addLast()
is instrumented to increment the tick counter once for each constant-time statement in the function (other than calls to the ticker itself). This level of detail may not be necessary to get an idea of the function's asymptotic complexity, but check to make sure that you know why eachtick()
call in the function has the count it does.
Question A2: Based on the code of addLast()
and what you've learned
from Studio 1 and our class meetings, explain why the ticks and time
curves for AddLastNoTail
have the shapes you observe.
Part B: Improve the Behavior of Appending to a List
- Find and open the
LinkedListWithTail
class in thestudio2.lists
package.
This is the only class you will modify in this exercise. As given to
you, its behavior is identical to AddLastWithTail
.
-
Modify this class so that it includes a data member
tail
that is always a reference to the last element in the list, and so that it usestail
to help add an element to the end of the list. Ask your TA for help if needed.-
You'll need to modify any method that changes the list to correctly maintain
tail
so that it always points to the last element of the list. -
What should
tail
be after the constructor for the list completes? -
How can you ensure that
tail
is set correctly at the end of theaddLast()
function, assuming it was set correctly at the beginning of the function?The property that
tail
always points to the end of the list is an invariant of the class -- it is a property that must be true true any time you look at the list. Once you know the invariant that you are trying to maintain, you can reason separately about what each function should do to maintain it. We'll see more examples of data structure invariants later in the semester.
-
-
Before you go on, modify the ticker calls in
addLast()
so that they again count one tick for every constant-time statement in the function. Your tick calls should match your new code. -
Open the
AddLastWithTail
class and run it. This class calls your modified code to add $n$ random elements to a list. -
Refresh
outputs
again and produce ticks and time plots for your modified code as usual.
Question B1: what shape of running-time curve do you now see for $n$ elements to a list? Explain why the shape of the curve changed as a result of your changes.
Question B2: how much additional storage per list did your change require?
Question B3: is adding a tail pointer a worthwhile extension to a linked list? When might it not be worthwhile?
Part C: Improve the Behavior of Querying List Length
For this part, continue to modify (only) the LinkedListWithTail
class.
Let's consider the cost of determining the size of a list of $n$ elements.
The Size
class adds $n$ elements to a list, then calls the getSize()
function to determine its size. It counts ticks only for the
getSize()
function, not for adding elements.
- Look at the implementation of
getSize()
, and determine how many ticks it should take on a list of size $n$.
Question C1: give an exact expression for the number of ticks used by
getSize()
as a function of $n$, and justify how you got it.
- Design a modification of the list class so that a call to
getSize()
can always run in constant ($\Theta(1)$) time, regardless of $n$.
Question C2: what invariant did you add to your class to make
constant-time getSize()
possible? What do you have to do to
maintain this invariant?
- Measure your modified list's performance by running the
Size
andAddListNoTail
classes and plotting the resulting running times.
Question C3: what evidence do you have that your getSize()
method
is really constant-time? Did you change the asymptotic complexity
of any other list operations in the process?
Part D: Further Extensions
For this part, there is nothing to code, but you should think about how to implement the suggested extensions of a linked list and how to justify their correctness and their impact on running time.
- The midpoint of a list of length $n$ is Element $\lceil n/2 \rceil$, where Element 1 is the first element in the list. For example, in a list $[a, b, c]$, the midpoint is $b$; in a list $[u, v, w, x, y, z]$, the midpoint is $w$.
Question D1: what is the asymptotic complexity of returning the
midpoint of a list of length $n$, assuming no extensions to the
LinkedList
class?
- Design an extension to the linked list data type that lets you access the midpoint of the list in constant time, no matter how many elements have been inserted.
Question D2: what invariant do you need to maintain? How do
you extend the methods in the LinkedList
class to maintain it?
Does their asymptotic running time change?
- Now consider how to extend the
LinkedList
class so that you can always return the maximum element in the list in constant time.
Question D3: describe your extension strategy and its impact on the complexity of the list operations.
Question D4: suppose we augment LinkedList
so that we can delete
an element from the linked list, given a reference to it, in constant
time (this would require our list to be doubly-linked). How does
the complexity of deletion change if you also want to maintain
constant-time access to the maximum element in the list?
It's challenging (indeed, seemingly impossible) to achieve fast insertion, deletion, and maximum operations by extending a linked list! We'll be looking at other data structures later in the course for which all three operations (and more!) can be implemented efficiently.