Radix Sort Float Point Numbers with Memory Map
Implement a C program that sorts a set of 4-byte float point values in ascending order using radix
sort. The values are saved in a file. The program should read/write the file through memory
mapping. When the program finishes, the sorted values should be saved in the same file.
Submission Instructions
1. Submit: 1) your C program, and 2) one or two screenshots in jpg format showing that your program
really works.
2. Name your scripts using the pattern NJITID#.c, and name your screenshots using the pattern
NJITID#_index.jpg. NJITID# is the eight-digit NJIT ID (Not your UCID, Rutgers students also have
NJIT IDs). Index is the index number showing the order of the screenshots (e.g., 1 or 2).
3. Submit individual files. DO NOT SUBMIT A ZIP FILE.
Objectives
• To gain more experience with using bitwise operations
• To understand the binary format of float point numbers
• To gain more understanding on radix-sort
• To learn how to read/write files using memory mapping
Requirements and Instructions
• Your program should take one argument, which is the pathname of the file containing the data
to be sorted. For example, to sort the float point values saved in ./file5k, you can use the
following command:
./radixsort ./file5k
• The number of float-point values saved in the file can be calculated using file size and the size
of each float point value (i.e., 4 bytes). Thus, there is no need to specify the number of values.
• To access the data in the file, your program needs to use memory mapping. Avoid using
conventional calls, e.g., read(), fread(), write(), or fwrite(), to read/write the file. Refer to the
slides on Linux File and Directory Operations, particularly the two examples in the Memory-
mapped files part, for how to read/write files using memory mapping.
• Your program must use radix-sort and work directly on the binary data (i.e., avoid converting
the data into strings or characters). To use binary operations to extract bits from a float-point
number, you need to use a union to include two types, float and int. For example, the following
code is to extract the least significant bit from float point number 0.1.
unsigned int b;
union ufi {
    float f;
    int i;
} u;
u.f = 0.1;
b = u.i & 0x1;   /* least significant bit of the bit pattern of 0.1 */
• Since the program sorts the numbers using binary format, your program only needs two
buckets to help with sorting. The memory space that maps the file can be used to merge the
data. This reduces memory space consumption. At the same time, after the last round of
radix sort, the file automatically holds the sorted values.
• Some of the float point values are negative. Special attention is needed to handle the problem
caused by sign bits. Refer to the slides on Radix Sort for the methods handling signed float
point numbers.
• You can compile gendata.c attached with this assignment and use it to generate random
values and save them into a file. The executable file can also be found in /bin in the virtual
machine. The program also reports the sum of the values. For example, to generate 5000
random values and save them into ./file5kvalues, you can use the following command:
./gendata 5000 ./file5kvalues
• You can compile checkdata.c attached with this assignment and use it to check whether
the float point values have been sorted in ascending order. The executable file can also be
found in /bin in the virtual machine. The tool also calculates a sum of the values in the file.
Thus, you can compare the sum with the sum reported by gendata. The two sums should be
very similar, with minor numerical error caused by limited precision.
./checkdata ./file5kvalues
• Optimize your implementation. For example, to copy a large number of numbers, you can use
memcpy instead of copying the numbers one by one.
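The memory-mapping requirement above can be sketched as follows. This is a minimal sketch, not the example from the course slides: `map_floats` and `unmap_floats` are hypothetical helper names, and the code assumes a POSIX system where `float` is 4 bytes. The number of values is derived from the file size, and the mapping is `MAP_SHARED` so that changes made in memory reach the file.

```c
#define _POSIX_C_SOURCE 200809L  /* expose POSIX declarations under strict -std modes */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Map a file of raw 4-byte floats for reading and writing.
 * On success returns the mapped array and stores the element count and the
 * mapping length; returns NULL on failure. Hypothetical helper, for illustration. */
static float *map_floats(const char *path, size_t *count, size_t *bytes)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return NULL;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return NULL;
    }

    /* MAP_SHARED: stores through the pointer are carried back to the file. */
    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ | PROT_WRITE,
                   MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return NULL;

    *bytes = (size_t)st.st_size;
    *count = *bytes / sizeof(float);  /* number of values = file size / 4 */
    return p;
}

/* Flush the mapping to the file and release it. Returns 0 on success. */
static int unmap_floats(float *data, size_t bytes)
{
    if (msync(data, bytes, MS_SYNC) < 0)
        return -1;
    return munmap(data, bytes);
}
```

A caller would pass `argv[1]` to `map_floats`, sort the returned array in place, and then call `unmap_floats`, so no `read()` or `write()` call touches the data.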
Testing
Test 1. The program can correctly sort 1 million float point values in a file within 1 minute, and
the file can pass the test with the checkdata program (i.e., sorted; the sum of sorted values is close
to the sum of unsorted values given by gendata when the file was created (difference <5%)).
Step 1: Generate a file containing 1 million float point values using gendata, and write
down the sum of these values reported by gendata:
gendata XXXXXXXXXX ./file1mvalues
Step 2: run the program to sort the values:
time ./major_hw3_1 ./file1mvalues
Step 3: check whether the values have been sorted using checkdata, and write down the
sum reported by checkdata:
checkdata ./file1mvalues
Step 4: compare the sum reported by gendata and the sum reported by checkdata.
Test 2. The program can correctly sort 100 million float point values in a file within 2 minutes, and
the file can pass the test with the checkdata program (i.e., sorted; the sum of sorted values is close
to the sum of unsorted values given by gendata when the file was created (difference <5%)).
Step 1: Generate a file containing 100 million float point values using gendata, and write
down the sum of these values reported by gendata:
gendata XXXXXXXXXX ./file100mvalues
Step 2: run the program to sort the values:
time ./major_hw3_1 ./file100mvalues
Step 3: check whether the values have been sorted using checkdata, and write down the
sum reported by checkdata:
checkdata ./file100mvalues
Step 4: compare the sum reported by gendata and the sum reported by checkdata.
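The pass criteria in the two tests (ascending order, and sums that differ by less than 5%) can be sketched in C as below. This is only an illustration of the checks described above, not the source of checkdata.c; the function names `is_ascending`, `sum_values`, and `relative_diff` are hypothetical.

```c
#include <stddef.h>

/* Returns 1 if a[0..n-1] is in ascending order, 0 otherwise. */
static int is_ascending(const float *a, size_t n)
{
    for (size_t i = 1; i < n; i++)
        if (a[i - 1] > a[i])
            return 0;
    return 1;
}

/* Sum in double to limit the rounding error that accumulates in float. */
static double sum_values(const float *a, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += a[i];
    return s;
}

/* Difference between two sums as a fraction of the reference sum. */
static double relative_diff(double reference, double other)
{
    double d = reference - other;
    if (d < 0)
        d = -d;
    double r = reference < 0 ? -reference : reference;
    return r == 0.0 ? d : d / r;
}
```

Under this sketch a file passes when `is_ascending` returns 1 and `relative_diff` of the two reported sums is below 0.05.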

CS288 Intensive Programming in Linux
Sorting
A fundamental application for computers
Done to make finding data (searching) faster
Many different algorithms for sorting
Simple sorting algorithms run in quadratic time O(N^2)
bubble sort
selection sort
insertion sort
Conventional sorting algorithms: https://www.toptal.com/developers/sorting-algorithms
Though the order is “ascending” by default, it is not difficult to figure out how to change the order to “descending”.
Radix and Radix Sort
Radix = “The base of a number system” (Webster’s dictionary)
Radix is another term for “base”: the number of unique digits, including the digit zero, used to represent numbers
Radix of numbers:
Binary numbers have a radix of 2
Decimal numbers have a radix of 10
Hexadecimal numbers have a radix of 16
Radix of texts:
26 if only capital letters are considered
36 if capital letters and decimal digits are considered
62 for capital letters + small letters + decimal digits
Radix and Radix Sort
Radix sort was first used in the 1890 U.S. census by Hollerith
Used to sort numbers or texts
Very efficient when sorting a large number of elements
O(M*N). M: length of each element; N: number of elements
May use more space than other sorting algorithms
E.g., bubble sort is in-place sorting.
Basic idea: Bucket sort on each digit, from least significant digit to most significant digit.
Bucket Sort in Radix Sort
Use a bucket array of size R for a radix of R
Put elements into the correct bucket in the array
R = 5; unique digits (0,1,2,3,4); list = (0,1,3,4,3,2,1,1,0,4,0)
    Buckets
    = 0    0,0,0
    = 1    1,1,1
    = 2    2
    = 3    3,3
    = 4    4,4
Sorted list:
0,0,0,1,1,1,2,3,3,4,4
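The R = 5 example above can be sketched in C as follows. Because each element here is a single digit, the element is its own key, so a bucket only needs to record how many elements it holds; the name `bucket_sort_digits` is a hypothetical choice for this illustration.

```c
#include <stddef.h>

#define R 5  /* radix: digits 0..4, as in the example above */

/* Sort single-digit values (each in 0..R-1) with one bucket per digit. */
static void bucket_sort_digits(int *a, size_t n)
{
    size_t count[R] = {0};

    for (size_t i = 0; i < n; i++)
        count[a[i]]++;              /* put each element into its bucket */

    size_t k = 0;
    for (int d = 0; d < R; d++)     /* read the buckets back in digit order */
        for (size_t c = 0; c < count[d]; c++)
            a[k++] = d;
}
```

Applied to the list (0,1,3,4,3,2,1,1,0,4,0), this yields the sorted list shown above.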
Radix Sort: bucket sort on every digit/bit
For N elements between (L, H), using H-L+1 buckets can sort the elements in one round
Problem: the range (L, H), if there is one, may be too large.
Sorting 4-byte unsigned integers, range is [0, 2^32-1]
Solution (radix sort): apply bucket sort on every digit/bit
Example list (binary = decimal):
    0 1 0 = 2
    0 0 0 = 0
    1 0 1 = 5
    0 0 1 = 1
    1 1 1 = 7
    0 1 1 = 3
    1 0 0 = 4
    1 1 0 = 6
Use two buckets
Radix Sort: bucket sort on every digit/bit
Buckets after the first round (last bit):
    = 0    010, 000, 100, 110
    = 1    101, 001, 111, 011
Merge:
010, 000, 100, 110, 101, 001, 111, 011
Last bits are sorted
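One round of this two-bucket procedure can be sketched in C as below. This is a minimal sketch; `split_by_bit` is a hypothetical name, and the two scratch buckets are assumed to be provided by the caller with room for n elements each.

```c
#include <stddef.h>
#include <string.h>

/* One round of binary bucket sort: stable split of a[0..n-1] on one bit.
 * b0 and b1 are caller-provided buckets with room for n elements each. */
static void split_by_bit(unsigned *a, size_t n, unsigned bit,
                         unsigned *b0, unsigned *b1)
{
    size_t n0 = 0, n1 = 0;

    for (size_t i = 0; i < n; i++) {
        if ((a[i] >> bit) & 1u)
            b1[n1++] = a[i];   /* bit is 1: bucket 1 */
        else
            b0[n0++] = a[i];   /* bit is 0: bucket 0 */
    }

    /* Merge: bucket 0 first, then bucket 1, back into the original array. */
    memcpy(a, b0, n0 * sizeof *a);
    memcpy(a + n0, b1, n1 * sizeof *a);
}
```

Calling it with bit 0 on (2, 0, 5, 1, 7, 3, 4, 6) gives (2, 0, 4, 6, 5, 1, 7, 3), which is the merged list above written in decimal. Repeating the call for bit 1 and then bit 2 completes the sort of these 3-bit values.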
Radix Sort: bucket sort on every digit/bit
List after the first round (last bits sorted): 010, 000, 100, 110, 101, 001, 111, 011
Next round, on the middle bit (in progress on the slide):
    = 0    000, 100, 101, 001
    = 1    010, 110
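To carry the bit-by-bit procedure over to signed 4-byte floats, one common approach (an assumption here, not necessarily the method in the course slides) is to remap each bit pattern to an unsigned key whose unsigned order matches the float order, run 32 two-bucket rounds, and then undo the remapping. The sketch below assumes IEEE 754 single precision and no NaN values. The names `float_to_key`, `key_to_float`, and `radix_sort_floats` are hypothetical, and the buckets are allocated with malloc for simplicity.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

union ufi { float f; uint32_t u; };

/* Negative values: flip every bit, so larger magnitudes get smaller keys.
 * Non-negative values: set the sign bit, so they sort after all negatives. */
static uint32_t float_to_key(uint32_t bits)
{
    return (bits & 0x80000000u) ? ~bits : (bits | 0x80000000u);
}

/* Inverse of float_to_key. */
static uint32_t key_to_float(uint32_t key)
{
    return (key & 0x80000000u) ? (key & 0x7FFFFFFFu) : ~key;
}

/* Sort v[0..n-1] in ascending float order. Returns 0 on success, -1 on error. */
static int radix_sort_floats(union ufi *v, size_t n)
{
    if (n == 0)
        return 0;

    union ufi *b0 = malloc(n * sizeof *v);
    union ufi *b1 = malloc(n * sizeof *v);
    if (!b0 || !b1) {
        free(b0);
        free(b1);
        return -1;
    }

    for (size_t i = 0; i < n; i++)
        v[i].u = float_to_key(v[i].u);

    /* Least significant bit first; each round is a stable two-bucket split. */
    for (unsigned bit = 0; bit < 32; bit++) {
        size_t n0 = 0, n1 = 0;
        for (size_t i = 0; i < n; i++) {
            if ((v[i].u >> bit) & 1u)
                b1[n1++] = v[i];
            else
                b0[n0++] = v[i];
        }
        memcpy(v, b0, n0 * sizeof *v);        /* merge bucket 0 ... */
        memcpy(v + n0, b1, n1 * sizeof *v);   /* ... then bucket 1 */
    }

    for (size_t i = 0; i < n; i++)
        v[i].u = key_to_float(v[i].u);

    free(b0);
    free(b1);
    return 0;
}
```

Sorting on the remapped keys keeps every round identical to the unsigned case, at the cost of two extra linear passes over the data. The two malloc'd buckets together use twice the size of the input, which matters for very large files.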

checkdata-lozjta41.c gendata-v5rlv3mc.c hw3-memory-map-and-radixsort-float-point-numbers-sghiw2qd.pdf radix-sort-53hmmios.pptx linux-file-and-directory-operations-ya3jfmiz.pptx

Answered Same Day Jul 13, 2022

Solution

Priyang Shaileshbhai answered on Jul 14 2022
