Extension : Extension 3.5: Data Sorting and Analysis

Authors
Brennan Morell
Jarett Gross

A big part of Computer Science revolves around finding ways to sort data to be organized in a useful manner. There are many different algorithms that computer scientists use to sort information. You can learn about a few popular ones below:

We will implement our sorting using a naive process of repeatedly selecting the smallest value from a data set of type int[] and swapping it to the front of the collection. In computer science, this is called a Selection Sort.

Procedure

  1. Open the Sorting class, found in the arrays package of the extensions folder.

  2. Prompt the user for the size of the collection

    If the user enters a negative number, continue to prompt them with a useful message until they enter a positive number.

  3. Continue to prompt the user to input numbers one at a time, using an array to store the input data.

  4. After processing all of the numbers entered, the data will be sorted using the following naive algorithm:

    • Create an outer loop to limit the number of scans to be equal to the size of the collection.

    • Create an inner loop to iterate over the unsorted portion of the array, each time selecting the minimum value (keep track of this value and its index in the array)

      The unsorted portion will always be the sub-array with indexes sortCount to size inclusive.

    • Swap this minimum value with the value currently at the head of the unsorted portion of the array

  5. After the sorting the data, we will take advantage of its useful organization to compute the following statistics:

    Mean: The simple average of the data.

    Median: The middle value in the ordered dataset (Note: How does this computation vary for even and odd sized datasets?)

    Min: The smallest value in the ordered dataset.

    Max: The largest value in the ordered dataset.

    Range: The difference between the maximum and minimum values of the data set.

  6. Finally, arrange for your output to display the sorted dataset and accompanying statistics in the following manner:

1 2 3 4 5 
Mean: 3.0
Median: 3.0
Min: 1
Max: 5
Range: 4